In the midst of the data deluge, data seem to be everywhere. With the advent of ubiquitous Internet of Things sensors, data now seem to be a part of the natural world around us, too. And along with this idea that data exist everywhere, we often say that scientists collect data (think: collecting blueberries, in the spirit of the summer picking season).
In our July data science education (DSE) webinar, our own Lisa Hardy described the ways that data are highly unnatural — that we should think of them as produced, not collected — and what this means for both science and data science education.
As data science education emerges as a new field of study, students often work with datasets they did not create, and often with very little information about why or how these datasets were constructed in the first place. Data science education will need to prepare students not only with the skills to work with such datasets, but also with the habits of mind of questioning how a dataset was produced, who produced it, and for what purposes. In particular, students will need to understand the ways that data are not just collected, but actively produced in the material world, by designed technologies, and for human and social purposes. The question is: what sorts of learning experiences can prepare students for understanding data in this way?
Scientific investigations with sensors may be a promising classroom setting in which students can gain familiarity with thinking about the contexts of data production. In our InSPECT research project, students use scientific sensors in inquiry projects in high school biology classes. In this webinar, Lisa presented results from a recent classroom study that illustrate how using sensors in inquiry can give students opportunities to consider the material context of data production, to call into question their own methods of producing data, and to take on their own purposes for producing particular types of data.
A student uses sensors to produce data from an eco-column in the InSPECT project.
Regarding the need for students to contend with “messy” datasets, webinar participants discussed the pedagogical question of whether students should work first with clean data or “unruly” data. Should students use clean data to build skills of interpretation and analysis, even though such data are unlikely ever to be encountered in the field? And is learning with messy data at odds with learning conceptual content?
As with all our data science education webinars, we discussed the larger goal of determining how best to bring about effective learning with and about data.
Learn more about the data science education revolution, and join us!