Data Science Education

The data science revolution has arrived! Millions of students need to be ready for a future filled with data. We're jumpstarting the new field of data science education and supporting its growing community of researchers and educators.


We're developing the Common Online Data Analysis Platform—free, easy-to-use web-based software that lets students in grades 6 through college visualize, analyze, explore, and learn from data. Whether data come from a map, a game or simulation, or your own experiment, CODAP provides a dynamic, exploratory experience.

Data Fluency

Data fluency connotes a sense of flowing grace and mastery with data. We're creating innovative curriculum resources for use with CODAP to prepare today's learners to be fluent in the medium that will shape our—and their—futures.

We're jumpstarting data science fluency by...

Publishing findings

We've written over 20 articles and white papers on data science education. Learn what data moves are and why experts use them, how to make sense of "messy data," what it means to shift students' relationship to data from data collectors to data producers, and much more.

Researching questions

We’re researching important questions about how students use computational tools to structure and transform data for scientific inquiry, how students’ interpretations of and interactions with data change as a function of the size of the dataset, how data modeling games can support learning, and much more.

We're spearheading this growing field by convening data science education leaders in webinars, meetups, and summits to inspire and learn alongside technology developers, teachers, civic leaders, policy makers, and business experts.

Organizing summits and webinars

We’ve hosted dozens of online webinars to disseminate research findings and display the power of our Common Online Data Analysis Platform, met with educators, administrators, and curriculum developers at national and local conferences, and convened two groundbreaking summits.

We're developing free, powerful data analysis software and student-ready resources, including our Common Online Data Analysis Platform, plus datasets from across the subject spectrum and short activities to engage learners in grades 9-14.

Designing tools

CODAP is our most popular data analysis software, designed for students in grades 6 through college to visualize, analyze, and ultimately learn from data. We’re also developing Dataflow, a platform for programming, data processing, and real-time data graphing that uses an innovative block-like programming style.

Developing resources

We’ve created activities and games with data at their core, so students gain fluency with the data moves necessary for diving into data and build excitement for their ability to work with data. We’ve also curated a set of over 50 free datasets—from earthquakes to FastPlants, hurricanes, compound interest, the digits of pi (infinity), and beyond—to inspire new data explorations.

Building Models
Principal Investigators
Dan Damelin, Joe Krajcik

The SageModeler system modeling tool, designed for students to construct, revise, and use models, is embedded in CODAP, so students can validate model output with external datasets. We’re studying the development of students’ modeling capabilities and the potential of models to provide feedback on students’ understanding of disciplinary core ideas.

CODAP
Principal Investigators
William Finzer, Dan Damelin

CODAP (Common Online Data Analysis Platform) was designed to engage students in meaningful scientific data collection, analysis, visualization, modeling, and interpretation. Project research studies students’ understanding of data representation, including their understanding of data structures in the context of authentic scientific situations.

Data Clubs
Principal Investigator
Andee Rubin

Integrating computer science and math through data science, the project creates activities for teaching and learning data science in afterschool clubs and camps in rural and urban settings with middle school youth. Using a design-based research framework, the project seeks to develop knowledge about how to design learning environments for students to learn data science.

Data Games
Principal Investigator
William Finzer

Data generated by students playing computer games form the raw material for mathematics classroom activities. A teaching experiment methodology investigated research questions about the extent to which students view the data as the result of a production process, what data structures are appropriate to introduce in middle school and high school, and more.

Data Science Games
Principal Investigators
William Finzer, Frieda Reichsman, Tim Erickson, Michelle Wilkerson-Jerde

Data Science Games created a suite of games for high school science classes in which students store their data, organize it, analyze it, and visualize it in CODAP. Research explored students’ conceptions of data structures and data science competencies to contribute principles for the design and use of integrated data science environments.

Data Sketch
Principal Investigator
Michelle Wilkerson

The project is designing software and classroom materials to make exploring and authoring data visualizations accessible for middle grades classrooms. Research explores the knowledge and skills middle school students need to interpret and construct data visualizations, how data visualization can be integrated into the grade 6-8 STEM curriculum, and what role data visualization can play in learning core STEM practices and content.

ESTEEM
Principal Investigators
Hollylynne Lee, William Finzer, Rick Hudson, Stephanie Casey

ESTEEM infuses statistics content and pedagogy into undergraduate mathematics teacher preparation by providing resources, networking, and support. Research studies the nature of preservice secondary mathematics teachers’ understandings of statistics content and pedagogy; how they design CODAP-based statistics activities; and what activities and resources best support their learning to teach statistics.

Exploring the Mathematics of Biological Ecosystems with Data Science
Principal Investigators
Andee Rubin, Gillian Puttick

This project creates and studies "data excursions," structured activities that develop high school students' competencies with data practices and modeling within existing, phenomenon-based science curriculum. Using design-based research, the project investigates how working with real datasets, creating computational representations, and using them to reason about ecosystem phenomena can support students' development of data science literacy.

InquirySpace
Principal Investigators
Chad Dorsey, William Finzer, Robert Tinker, Uri Wilensky

InquirySpace integrates innovative technologies with CODAP into a coherent, web-based environment enabling rich, collaborative scientific inquiry. Project research demonstrates that typical students can learn to use an integrated set of computer-based tools to undertake sophisticated, open-ended investigations similar to the approach and thinking used by real scientists.

InquirySpace 2
Principal Investigators
Chad Dorsey, Daniel Damelin, Hee-Sun Lee, Gey-Hong Gweon, Robert Tinker

InquirySpace integrates innovative technologies with CODAP into a coherent, web-based environment enabling rich, collaborative scientific inquiry. Project research demonstrates that typical students can learn to use an integrated set of computer-based tools to undertake sophisticated, open-ended investigations similar to the approach and thinking used by real scientists.

InSPECT
Principal Investigators
Sherry Hsi, Robert Tinker, Peter Sand, Lisa Hardy, Charles Xie

InSPECT is using inexpensive smart interfaces to create systems that make scientific inquiry more engaging and accessible. The project tests the feasibility of our computational resources to allow students to undertake scientific investigations, examining learning gains in students’ abilities to perform science practices, exercise computational thinking, and understand biology concepts, and identifying supports needed by teachers to implement the curriculum.

IPUMS Terra
Principal Investigators
Steven Manson, Steven Ruggles, Jack DeWaard, Lara Cleveland, Leah Samberg

IPUMS Terra integrates, preserves, and disseminates data describing characteristics of the human population and the environment. By incorporating a rich collection of the world's population and agricultural census data into IPUMS Terra, this project is enhancing scientific understanding of critical policy-related issues such as food and water security, migration, and the effects of social and environmental policy on economic development, well-being, and long-run sustainability.

Linking Complex Systems
Principal Investigators
Carolyn Staudt, Eric Klopfer, Meridith Bruozas, Chad Dorsey

Linking Complex Systems merges agent-based modeling and system diagramming in a new linked-hybrid modeling environment. The project is investigating how this environment can best support learners in examining systems, how systems thinking develops as learners examine “wicked problems,” and how learners gain in complex systems understanding after engaging with system modeling tasks within the environment.

Multilevel Computational Modeling
Principal Investigators
Daniel Damelin, Lynn Stephens

Multilevel Computational Modeling extends the modeling capabilities of SageModeler to support systems dynamics modeling. The project researches student learning and teacher practice with respect to students' abilities to engage in systems thinking and computational thinking through modeling complex systems.

Modeling Ecosystems
Principal Investigators
Leigh Peake, William Finzer, Daniel Damelin, Christine Voyer, Amanda Dickes

This project engages middle school students in building and revising models of variability and change in ecosystems and studies the learning and instruction in classroom contexts. The understandings from project research provide evidence toward scaling the learning experiences to the Gulf of Maine Research Institute’s Ecosystem Investigation Network and replication by programs that aim to engage students in data-rich, field-based ecological investigations.

Principal Investigators
Corey Brady, Tobin White

The project uses design-based research to build and study NetStat, a classroom network system for supporting collaborative activities in data modeling and statistics. The goal is to enhance the genre of classroom-level collaboration, and contribute important new insights into the potential and the critical features of next-generation mathematics classroom technology that integrates and extends existing software environments.

Ocean Tracks
Principal Investigators
Ruth Krumhansl, Jacqueline DeLisi, Josephine Louie, Bryan Wunar, Christine Brown

Ocean Tracks harnesses the promise of emerging cyberinfrastructures to engage high school students in the use of data visualization tools to study the movement patterns and habitat usage of marine animals in relation to oceanographic variables. Ocean Tracks is developing a concrete model for the field of how to bring data into the classroom in productive ways.

Strengthening Data Literacy across the Curriculum
Principal Investigators
Josephine Louie, Beth Chance, Soma Roy

The Strengthening Data Literacy across the Curriculum project promotes statistical understandings and interest in quantitative data analysis among high school students by developing learning materials with social justice themes. The project aims to strengthen existing theories of how to design classroom learning materials to support high school students, particularly those historically underrepresented in STEM fields.

StoryQ
Principal Investigators
Jie Chao, William Finzer, Shiyan Jiang, Carolyn Rose

Integrating CODAP with machine learning and feature extraction, StoryQ is supporting high school students in text mining practices and studying how learning environments can be designed to help students understand core AI concepts, including the structures in unstructured data and the roles of human insight in the development of AI technologies.

Students Discover
Principal Investigators
Robert Dunn, Julie Urban, Angela Duncan, Ashlie Thompson, LaTricia Townsend, Margaret Lowman, Jenifer Corn

This Community Enterprise for STEM Learning Math and Science Partnership entitled Students Discover highlights citizen science through the collection and analysis of data by middle school students for use by professional scientists. Project research examines a process for moving innovations from more ideal classroom settings to a variety of schools and community-based settings where conditions for success may be less favorable.

Principal Investigators
Gey-Hong Gweon, William Finzer

This project is researching and developing a prototype cloud-based pluggable data analytics engine to address the educational game market’s need for real-time assessment for learning. Research examines whether the assessment potential of a key knowledge tracing algorithm can be extended to players with games involving different content domains, in a greater number of classrooms with diverse demographics, and in real time.

WeatherX
Principal Investigators
Josephine Louie, Kevin Waterman, Asli Sezen-Barrie, Brian Fitzgerald

The WeatherX project is developing and researching curriculum units and an interactive experience with weather scientists to promote important scientific data practices and interest in big data science careers among middle-school students in underserved New England rural areas. The project studies the potential of developed resources to promote student interest, engagement, and competence in scientific data analysis and modeling.

Writing Data Stories
Principal Investigators
Michelle Wilkerson, William Finzer, Kris Gutierrez, Hollylynne Lee

Writing Data Stories infuses the analysis of scientific datasets about important socio-scientific issues into middle school science through “data storytelling.” The project studies how students use computational tools to structure and transform data for scientific inquiry; patterns of engagement in scientific practices; and literacy practices that support Dual Language Learners and others in constructing oral and written arguments and explanations using data and visualizations as evidence.

Zoom-In Data (ZiSci)
Principal Investigators
William Tally, Kira Krumhansl, Megan Silander

This project tests the popular Zoom In platform to facilitate data-focused inquiry and skill development among high school science teachers and their students by integrating CODAP into biology and Earth science curriculum units. A small-scale efficacy test assesses aspects of the implementation process, practices, and overall impact of the modules on student learning.

Teaching in a World of Messy Data
By Chad Dorsey
The Science Teacher — May/June 2021

We live in a world transformed by data. This may have never been more apparent than across the past year as we have witnessed a global pandemic, economic shock waves, critical societal reckonings with questions of race and equity, and a national election, each intertwined with questions of understanding data.

Writing Data Stories
By Bill Finzer and Michelle Wilkerson
@Concord — Spring 2020

Today’s students will encounter data in one form or another throughout their lives. The more data fluent they are when they graduate, the better prepared they will be to contribute to society as citizens and to advocate for themselves and their communities.

From Data Collectors to Data Producers: Shifting Students' Relationship to Data
Lisa Hardy, Colin Dixon, and Sherry Hsi
Journal of the Learning Sciences — Nov 2019

This paper contributes a theoretical framework informed by historical, philosophical and ethnographic studies of science practice to argue that data should be considered to be actively produced, rather than passively collected. We further argue that traditional school science laboratory investigations misconstrue the nature of data and overly constrain student agency in their production.

Monday's Lesson: Zoom in! Teaching Science with Data
Bill Finzer and Randy Kochevar
@Concord — Fall 2019

In today's data-driven world nearly all scientific discovery involves delving into data. The Next Generation Science Standards (NGSS) include “analyzing and interpreting data” as one of the key practices and emphasize the need for students to engage firsthand with authentic datasets.

Data Moves
Tim Erickson, Michelle Wilkerson, William Finzer, and Frieda Reichsman
Technology Innovations in Statistics Education — Jul 2019

When experienced analysts explore data in a rich environment, they often transform the dataset. For example, they may choose to group or filter data, calculate new variables and summary measures, or reorganize a dataset by changing its structure or merging it with other information.

Exploring the Essential Elements of Data Science Education
By William Finzer and Frieda Reichsman
@Concord — Fall 2018

She scans the numbers, scrutinizing thousands of cases and dozens of attributes. Something doesn’t look right. Missing values? Incorrect coding? Fixing those will be a start. Cleaning, checking, re-sorting—gradually she cajoles the enormous array into a workable form.

Data-driven Inquiry in the PBL Classroom
William Finzer, Amy Busey, and Randy Kochevar
The Science Teacher — Aug 2018

Data-driven inquiry as a science practice and instructional approach can provide a powerful context for project-based learning (PBL). Students engage in data-driven inquiry when they explore a rich data set and observe patterns, ask questions suggested by the data, and pursue answers.

Modeling as a Core Component of Structuring Data
Cliff Konold, William Finzer, and Kosoom Kreetong
Statistics Education Research Journal — Nov 2017

We gave participants diagrams of traffic on two roads with information about eight attributes, including the type of each vehicle, its speed, direction and the width of the road. Their task was to record and organize the data to assist city planners in its analysis.

The Data Science Education Revolution
By Chad Dorsey and William Finzer
@Concord — Fall 2017

Life is particularly interesting these days, if you have an eye for data. While predictions about entering the “data deluge” began only a decade ago, the current situation makes those warnings seem almost quaint.

Growing the Data Science Education Field
The Concord Consortium
@Concord — Spring 2017


Monday’s Lesson: Exploring Data with the Ramp Game
By Tom Farmer
@Concord — Spring 2017

The Next Generation Science Standards promote learning science by doing science. Our new InquirySpace II project guides students in independent investigation by providing curricular and pedagogical scaffolds to support the exploration of phenomena. A game we are continuing to develop focuses on a classic physics experiment—rolling a car down a ramp.

Under the Hood: Weaving Collaboration into Code
By Doug Martin and Scott Cytacki
@Concord — Spring 2017

While Concord Consortium activities have long encouraged collaboration between students, we’re finding new ways to build student collaboration into our software codebase. One goal of our Common Online Data Analysis Platform (CODAP) is to enable students to work in a dynamic shared CODAP document in which any student’s changes are instantly synchronized to everyone else’s automatically.

Perspective: Preparing Learners for the Future
By Chad Dorsey
@Concord — Spring 2017

We live in an interconnected world of accelerating complexity. As populations expand and people, economies, and nations become inextricably intertwined, even seemingly simple problems reveal intricate subtleties. And the issues of our time are far from simple.

Data Science Games
By Natalya St. Clair
@Concord — Fall 2016

The emerging discipline of data science combines computational thinking, mathematics, statistics, and content knowledge, paving the way for a new genre of educational technology: data science games. Funded by the National Science Foundation, our Data Science Games project is developing games and curriculum materials for middle and high school students to use data while learning science.

Perspective: Data: A Wider View
By Chad Dorsey
@Concord — Fall 2015

As we enter an age where data seems to be everywhere, both educators and education researchers are becoming aware of its power. Yet our current view of data is highly limited.

Building the CODAP Community
By William Finzer and Dan Damelin
@Concord — Fall 2015

Over the past several decades our society has come to rely on data for nearly every aspect of its functioning. Not only is the amount of data generated each day beyond comprehension, no significant problem facing us—from traffic gridlock to climate change—can be solved without the help of people who understand how to work with data.

Under the Hood: Embedding a Simulation in CODAP
By William Finzer and Robert Tinker
@Concord — Spring 2015

In STEM education it’s essential to engage students in undertaking their own projects. But data exploration is often a neglected aspect of student project work.

Analytics and Student Learning: An Example from InquirySpace
By Amy Pallant, Hee-Sun Lee, and Nathan Kimball
@Concord — Spring 2015

In a pioneering new research direction, the InquirySpace project is capturing real-time changes in students’ development of new knowledge. As students engage in simulation-supported games their actions are automatically logged.

Common Online Data Analysis Platform
The Concord Consortium
@Concord — Spring 2014

Think of problems confronting the world: hunger, violence, disease, inequality and more. Now think of questions to be answered about the brain, the origin of the universe, how children learn and human behavior. What do they have in common?

Perspective: Defined by Data
By Chad Dorsey
@Concord — Fall 2013

There is little denying that we live in a world defined by data. When historians view this era, the explosion of data and the ways in which it shapes our lives may turn out to be one of its most distinctive characteristics.

Designing 2030:
Thinking & Doing with Data

The Designing 2030: Thinking & Doing with Data summit, held in January 2019, brought together top data education leaders to further advance the conversation on how to achieve data fluency and support data science education for all.

DSET Conference

The 2017 Data Science Education Technology Conference convened 100 researchers, curriculum developers, and software developers to build a community and identify innovative tools and approaches for integrating data science into the K-12 classroom.

Webinars & Meetups

We host data science webinars led by pioneering data science educators throughout the year. Topics covered include data learning through citizen science, exploring urban mobility, bringing data science to middle school students, examining shifts in students' relationships to sensor data, and much more.


CODAP (Common Online Data Analysis Platform) continues the legacy of the award-winning statistical software packages Fathom and TinkerPlots. In the process, it builds on a decades-long legacy of research into interactive environments that encourage exploration, play, and puzzlement. CODAP is about exploring and learning from data from any content area—from math and science to social studies or physical education!


Dataflow software is a comprehensive platform for programming, data processing, and real-time data graphing. Students produce meaningful data and control their data through its lifecycle, making decisions as data flows from the collection device to a representation on screen. They choose what data to collect, how to modify or transform it, how to use the data to actuate a relay, and how to store and view their data. To program in Dataflow students drag and connect nodes in the workspace.

Awash in Data
This online book walks you through introductory lessons in data science with data from various sources, from public census data to health and public transportation data, with embedded CODAP examples.

CODAP (Common Online Data Analysis Platform) is easy-to-use web-based software for students in grades 6 through college to visualize, analyze, and ultimately learn from data. Whether the source of data is a game, a map, an experiment, or a simulation, CODAP provides an immersive, exploratory experience.

Data Games
Save your data as you play a game, then use math and data skills to help you win. As you play each game, develop winning strategies.

Data Science Games
Win a data science game by visualizing the data to see what’s going on, improve your strategy, and level up. We created a suite of games and explorations for middle and high school science classes, so students can experience the excitement of data science.

Dynamic Data Science
All dynamic data science activities are embedded in CODAP, and follow a similar design: students make context-specific actions, store their data, organize it, analyze it, and visualize it in the surrounding CODAP environment.

Exploring Data with the Ramp Game
Place a virtual car at the right height so it lands close to the center of a target. Use CODAP to analyze your data, determine the relationship between the starting height of the car and the distance it travels, and get better at the game.

Sample Datasets
Browse over 50 free datasets—from earthquakes to FastPlants, hurricanes, compound interest, the digits of pi (infinity), and beyond. Explore to find classroom activities and other downloadable resources.

Zoom in! Teaching Science with Data
This activity on natural selection was developed in collaboration with the ZoomIn! Teaching Science with Data project at EDC. Use CODAP to explore data about lizards and to become familiar with the data structure and attributes.