Data Science Games

Win a data science game by visualizing the data to see what’s going on, improve your strategy, and level up.


The age of data science is upon us! The more experience students have working with data, the better prepared they are to contribute to the data-driven society they are entering. And what better way to learn the basics of this new, essential field than through games? A data science game generates data—lots of data—as you play. And the only way to win is to figure out a good way to visualize the data so you can see what’s going on, improve your strategy, and level up.

We’re working with a game developer from Eeps Media and a learning science researcher from the University of California at Berkeley to create a suite of games for high school science classes, so students can experience the excitement of data science.

Data science is an emerging discipline that partially unites mathematics and statistics, subject matter expertise, and computational thinking. It is rapidly gaining recognition as critical to all fields of endeavor, yet middle and high school students currently get little experience working with data. Furthermore, that experience is best gained while learning subject-specific content.

All data science game scenarios follow a similar design. Students take context-specific actions in the game, then store, organize, analyze, and visualize the resulting data in the surrounding CODAP environment. In a chemistry scenario, for example, the game might involve custom-designing reactions to achieve specific goals, such as buffering a chemical system so it doesn’t explode. Each reaction students observe generates data about bond strength, activation energy, pressure, temperature, and concentration, which they use to solve specific problems or puzzles. A physics game might involve building or altering a structure within certain constraints, where the data consist of the forces on all the structural elements. Each game can serve as the focal point for significant learning about subject matter content, and with each game, students encounter new and challenging data science.

Most educational games are self-contained: everything a player needs to do well is built into the game itself. The data science games we are developing are instead embedded in CODAP, a data analysis environment designed for students to explore, visualize, and analyze data. Though each game is embedded in the environment’s browser page, its implementation (language, domain) is completely independent of the environment, so any developer can create a data science game without involving CODAP developers. The games are interoperable with the CODAP environment. These three characteristics (embeddedness, communication, and interoperability) have only recently become possible in browser-based settings. Our goal is to develop this new genre of educational technology and to explore its educational potential.


We’re exploring students’ conceptions of data structures (e.g., flat, hierarchical, and tree) and data science competencies (cleaning, transforming) to expand research on students’ conceptions of data, identify student competencies unique to contemporary data sciences, and contribute principles for the design and use of integrated data science environments.
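To make the distinction between flat and hierarchical data concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from the project's games or from CODAP. The field names (`game`, `turn`, `score`) and the function `to_hierarchy` are hypothetical. The sketch takes flat records, one row per turn with the game number repeated on every row, and regroups them so that each game becomes a parent entry holding its own turns.

```python
from collections import defaultdict

# Hypothetical flat records: one row per turn, game number repeated on each row.
flat_records = [
    {"game": 1, "turn": 1, "score": 10},
    {"game": 1, "turn": 2, "score": 25},
    {"game": 2, "turn": 1, "score": 15},
]


def to_hierarchy(records):
    """Group flat per-turn rows into one parent entry per game."""
    turns_by_game = defaultdict(list)
    for row in records:
        # Child rows keep only turn-level fields; the game number moves to the parent.
        child = {key: value for key, value in row.items() if key != "game"}
        turns_by_game[row["game"]].append(child)
    # Emit parents in ascending game order.
    return [
        {"game": game, "turns": turns}
        for game, turns in sorted(turns_by_game.items())
    ]


hierarchy = to_hierarchy(flat_records)
```

In the flat form every row stands alone. In the hierarchical form, questions such as "how did this game go overall?" map onto a single parent entry, which is one kind of transformation a student might need to reason about.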


Project Funder
This material is based upon work supported by the National Science Foundation under Grant No. IIS-1530578. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Principal Investigators
William Finzer, Frieda Reichsman, Tim Erickson, Michelle Wilkerson-Jerde
Years Active