Google Summer of Code 2016

Work on the Concord Consortium's Open Source Projects During Your Summer Break

The Concord Consortium, a nonprofit educational R&D organization located in Concord, MA, and Emeryville, CA, is seeking talented software developers to create innovative science and math activities. We want your great ideas and your help expanding our activities and systems. We are accepting proposals for the following very cool ideas list. We also welcome blue-sky proposals, so poke around our site and propose something off-the-wall you think we'd like.

We should also add that we're super-duper excited to work with you and that everybody here is really smart and nice. Please look through our ideas list below, and join our developers list to start discussing your ideas with us:
groups.google.com/group/cc-developers

Once you run your ideas by us, then go ahead and apply, already.

Learn about Piotr Janik's experience working with us during the 2012 Google Summer of Code.

Using natural language processing and machine learning to analyze and respond to student scientific arguments

Rationale

There is a strong need to understand how students construct scientific arguments in written natural language contexts. To date, supporting students in learning this skill has generally required instructors to read, grade, and respond to hundreds or thousands of written student responses. The recent advent of natural language processing (NLP) toolkits and machine learning (ML) packages and approaches opens up a new set of opportunities for processing and analyzing student written arguments, a form that has a distinctive written structure and places specific demands and language constraints on students. Because of our extensive work in online curriculum projects, we have an existing database of thousands of open-response student scientific arguments, all of which have already been scored by human raters. This presents a unique opportunity for the application of NLP and ML methods.

Approach

Apply open source NLP toolkits (e.g., Stanford’s CoreNLP suite, the Natural Language Toolkit, Apache OpenNLP, or others) to process student-generated scientific arguments (generally around 50-100 words each) and the related set of human-created scores to generate a machine-predicted argument quality score on a scale of 1-5 that matches the human scores as closely as possible. Apply this method to additional original student responses in a way that makes real-time feedback possible. Opportunities also exist to apply novel machine learning techniques and libraries such as Google’s TensorFlow or Facebook’s recently released open source modules for Torch.
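To make the scoring task concrete, here is a minimal baseline sketch in plain Python. It is not our method and uses none of the toolkits named above: it assumes a simple bag-of-words representation and predicts a 1-5 score as the similarity-weighted average of the most similar human-scored responses. The function names and the three toy responses are hypothetical, invented only for illustration; a real proposal would train and evaluate on the actual scored data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def vectorize(text):
    """Represent a response as a bag-of-words count vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def predict_score(response, scored_examples, k=3):
    """Predict a 1-5 score from the k most similar human-scored examples."""
    vec = vectorize(response)
    ranked = sorted(
        ((cosine(vec, vectorize(text)), score) for text, score in scored_examples),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in ranked)
    if total == 0:
        # No shared words with any neighbor: fall back to the mean human score.
        return round(sum(s for _, s in scored_examples) / len(scored_examples))
    weighted = sum(sim * score for sim, score in ranked) / total
    return max(1, min(5, round(weighted)))


# Hypothetical toy data: (student response, human score on the 1-5 scale).
TOY_EXAMPLES = [
    ("I think it is true", 1),
    ("The temperature rises because the graph goes up", 3),
    ("The data show temperature rising with carbon dioxide, but the model "
     "is uncertain because it covers only fifty years", 5),
]

if __name__ == "__main__":
    print(predict_score("The graph shows the temperature goes up", TOY_EXAMPLES))
```

A baseline like this is mainly useful as a point of comparison: any NLP or ML approach you propose should agree with the human scores more closely than simple word overlap does.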

Challenges

While there are previous examples of the output for this work to pattern after, the set of possible approaches to solving the problem is as wide open as the field itself. Successful projects will require a good degree of focus and willingness to work through a variety of potentially unpredictable challenges in these new software packages and fields.

Degree of difficulty and needed skills

This project will require a significant degree of prior experience with software development, including experience in open source software and collaborative code management with GitHub as well as experience writing packages with open APIs for integration with multiple external systems. Prior experience with machine learning is a prerequisite for candidates proposing to apply machine learning techniques to the problem. Experience with programming languages such as Python, C++, Ruby or other equivalent languages is a requirement. Proficiency in JavaScript/HTML5 is a plus.

Mentors

Noah Paessel, Hee-Sun Lee

Create new interactives for Investigations curriculum units

Rationale

The Investigations project was funded by the National Science Foundation to create exemplar curricular materials that demonstrate what Next Generation Science Standards (NGSS) inspired pedagogy might look like. A core part of our approach uses modeling and simulations to understand how interactions at the atomic level explain many phenomena across physics, chemistry, and biology. We are looking for someone who can develop new interactives that will be embedded in the curricular units. There is high interest in this curriculum, so what gets created has the potential to impact many students and other curriculum developers.

Approach

The interactives will be developed using the Next-Generation Molecular Workbench Lab framework. Examples of interactives can be found here. An interactive is constructed by using a JSON description of the model, inputs, outputs, layout and custom scripts. Check out the underlying code for these interactives here (when viewing an interactive page click on the “interactive editor” checkbox) and learn about the Lab framework that runs these interactives here.

Challenges

Developing interactives using a system that is not well documented (outside of the code itself) and understanding some of the science/expected behaviors of the interactives being developed.

Degree of difficulty and needed skills

This will vary depending on the interactive. Some will be easier, while others will require extensive scripting in JavaScript. In the most difficult cases changes to the codebase of the Lab framework itself may be necessary if needed features don’t exist in our interactive authoring system or various modeling engines.

Mentors

Dan Damelin, Piotr Janik

Move popular materials from Java to HTML5

Rationale

One of our most popular sets of curricular materials was designed before HTML5 made in-browser modeling and simulation possible. Running these Java-based materials has become more and more difficult in schools. Since the original development of these materials we have created a new portal for assigning and reporting, a new web-based authoring system, and an HTML5 framework for designing interactive simulations. There are two obstacles to getting the materials (which are used by tens of thousands of students) into our HTML5 pipeline: the need to create HTML5 versions of the Java-based simulations and the need to extend our framework for creating these simulations with new features.

Approach

The interactives will be developed using the Next-Generation Molecular Workbench Lab framework. Examples of interactives can be found here. An interactive is constructed by using a JSON description of the model, inputs, outputs, layout and custom scripts. Check out the underlying code for these interactives here (when viewing an interactive page click on the “interactive editor” checkbox) and learn about the Lab framework that runs these interactives here.

The Java-based materials can be found at the RI-ITEST portal.

Challenges

Developing interactives using a system that is not well documented (outside of the code itself) and understanding some of the science/expected behaviors of the interactives being developed. Depending on the scientific/mathematical background of the student, more challenging extensions of the Lab framework may be developed. In particular, we would like to extend the modeling engine to handle Gay-Berne particle simulations.

Degree of difficulty and needed skills

This will vary depending on the interactive and what kinds of extensions of the Lab framework are proposed. Some interactives will be easier, while others will require extensive scripting in JavaScript. Extending the Lab framework would be the most challenging.

Mentors

Dan Damelin, Cynthia McIntyre, Piotr Janik

Integrate Google Forms with creative data analytics and visualizations

Rationale

Have you ever wished online survey tools had better data visualization capabilities so you could really figure out what the responses are telling you? You’ll make it happen by linking Google Forms with Concord Consortium's Common Online Data Analysis Platform (CODAP) so that the survey owner can build the data visualization and watch it fill in in real time as surveys get filled out. This could be a real game changer for students and teachers who make surveys and analyze their results!

Approach

Research the inner workings of Google Forms and figure out how to get real-time notification of form submission and real-time access to the data. Then use this knowledge to build a CODAP data interactive that connects to a form and streams the data to CODAP. Come up with some compelling demos of how to use this capability in realistic settings.

Challenges

This project will challenge you to master two APIs and figure out how to mash them together into a single CODAP data interactive. You will have to think both at the low level to get good performance and at the user level to build a simple, elegant UI.

Degree of difficulty and needed skills

Medium difficulty. JavaScript programming to make use of Web protocols.

Mentors

William Finzer, Jonathan Sandoe

Make data personal: Visualize reaction time data with a creative data analytics tool

Rationale

Measuring reaction time is a time-honored tradition made scalable by the Web. But doing it right is a subtle task, and making the data accessible and useful downright challenging. The Concord Consortium has all the tools needed to establish a new benchmark for accuracy, completeness, and utility. Our Lab interactives provide the instrumentation and the Common Online Data Analysis Platform (CODAP) allows universal access to and visualization of the data. Are lefties, on average, quicker than righties? Women than men? Old than young? So many questions!

Approach

There are a lot of ways to approach this project, but, at minimum, you’ll need to figure out how to create a Lab interactive that accurately measures reaction time following current best practices, stream the resulting times and survey information to a database, and then build a CODAP interactive that makes the visualization and analysis possible. That’s a lot of technologies to string together. The possibility of going viral will have to be considered and prepared for.
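As a rough illustration of the analysis end of that pipeline, the sketch below groups reaction-time trials by a survey answer and reports a per-group median. Everything specific in it is an assumption made for the example: the record fields (handedness, rt_ms), the cutoffs used to discard implausibly fast or slow trials, and the sample values. It is written in plain Python rather than as a Lab or CODAP interactive.

```python
from collections import defaultdict
from statistics import median

# Assumed plausibility cutoffs in milliseconds; real values should come
# from the reaction-time literature, not from this sketch.
MIN_MS = 100
MAX_MS = 2000


def summarize(trials, group_key="handedness"):
    """Return trial count and median reaction time for each survey group."""
    groups = defaultdict(list)
    for trial in trials:
        # Skip anticipations and lapses that fall outside the assumed range.
        if MIN_MS <= trial["rt_ms"] <= MAX_MS:
            groups[trial[group_key]].append(trial["rt_ms"])
    return {
        group: {"n": len(times), "median_ms": median(times)}
        for group, times in groups.items()
    }


# Hypothetical sample records, as they might arrive from the database.
SAMPLE_TRIALS = [
    {"handedness": "left", "rt_ms": 250},
    {"handedness": "left", "rt_ms": 270},
    {"handedness": "left", "rt_ms": 50},   # discarded: below MIN_MS
    {"handedness": "right", "rt_ms": 300},
]

if __name__ == "__main__":
    print(summarize(SAMPLE_TRIALS))
```

The median is used here because reaction-time distributions tend to have a long right tail; whether that is the right summary is one of the design questions the project would need to settle.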

Challenges

There’s a lot to learn here, from the science of reaction times to the mashup of disparate technologies to the challenges of potentially big data.

Degree of difficulty and needed skills

Moderate to great difficulty depending on approach and scope. JavaScript and database programming necessary.

Mentors

William Finzer, Dan Damelin

Breeding dragons in the browser: Use JavaScript and React to transform a classic genetic simulation environment into HTML5

Rationale

The Concord Consortium has spearheaded the use of simulations to help STEM teaching and learning for over two decades. One of our longest-standing and most famous products is GenScope, a full-featured, open-ended simulation environment that allows users to breed dragons to learn genetics. This environment was the first of its kind when built over twenty years ago, and pioneered the development of many techniques for educational technology development still in use today. Sadly, technologies have moved on, and GenScope no longer runs on most modern computers. However, it is still remembered and loved as an amazing learning tool by hundreds of thousands around the world—teachers have been known to hold onto classic Macintosh computers just to run it! Fortunately, the underlying simulation engines are still available and in active use in other dragon genetics projects at the Concord Consortium, and we are currently rebuilding these components in React. This project offers the opportunity to revive one of the most classic and requested educational technology packages and make it available to students and teachers around the world.

Approach

The underlying genetics simulation engine for the simulation environment has been ported to JavaScript, and multiple full-featured examples of its use in HTML5 applications built with SproutCore and Ember already exist. The classic GenScope software is also accessible and can be run on older operating systems. Current projects are rebuilding the components necessary to create flexible dragon genetics applications using React and JavaScript/HTML5 programming.

Challenges

Recreating GenScope will involve documenting the original Java-based GenScope application and code to form a functional specification, understanding the new React components, programming any needed components that do not yet exist, and assembling these into an HTML5 application that is as faithful to the original GenScope as possible. While the original code is available, there will be multiple differences between the original features and code and the new React-based components that you will need to navigate to complete this project successfully. Additionally, the React component codebase will be under active development, so you will need to negotiate, and likely add to, this evolving codebase as part of the project.

Degree of difficulty and required skills

This project will require a good deal of fluency with JavaScript, HTML5 and general Web development and design. Either previous experience with React or the ability to quickly come up to speed with new frameworks will also be necessary. Knowledge of Java is not specifically required, but will be very helpful in dissecting the functionality of the original environment. Understanding of biology and genetics is not required, but will help in understanding the project.

Mentors

Paul Horwitz, Chad Dorsey, Sam Fentress
