Crowdsourcing & Data Cleaning (11/6)

Assignment due on November 11 (Wed) by the end of the day: Project Critique 3

We will begin class with a discussion of Project Critique 3 and the Zooniverse Project assignment. Then we will have a data cleaning workshop, in which we will prepare a dataset to use in next week’s in-class project. After the workshop, we will debrief and discuss the readings.

Readings due

To do before class:

Download OpenRefine

Go to the OpenRefine download page and click either “Mac kit” or “Windows kit.” (There is no need for the embedded Java version.) Then open your downloads folder and unzip* the file. Double-click openrefine.exe, or refine.bat if the former does not work.

*If you are on a Windows system, you may need an additional piece of software called WinRAR. You do not have to buy it. You can download it for free here: Click on “Download WinRAR,” then click “Download WinRAR” again in the box that appears. Save the file, then run the installer. Once it finishes, you should be able to right-click on the OpenRefine zip file and click “Extract Here.” Then follow the rest of the instructions above.

Crowdsourcing Activity

  • Go to the Decoding Punch Cards project, which is hosted on Zooniverse.
  • Read a bit about the project by exploring the About page, etc., and then click “Getting Started.”
  • Participate in the project by filling out the info for at least 3 punch cards. (Note: you do not need to create an account to do this.)
  • Take a screenshot of yourself doing the task and write at least 3-4 sentences analyzing the Decoding Punch Cards project in light of Zooniverse’s Best Practices guide. What does the project do well? Does it fall short in any way? Post your screenshot and comments in the #class-discussion channel before class.