Teaching Digital Humanities on a laptop

This summer Graham Sack, a doctoral student in the English department, is teaching an introductory course in Digital Humanities called "Computational Methods for Literary and Cultural Criticism". Graham came to CCNMTL to ask about a cutting-edge approach to teaching programming to novices: a web-based programming environment called IPython Notebook.

IPython Notebook is a tool that runs in your browser and can execute any Python program that runs on the underlying server. Like other recent web-based educational programming environments, such as Codecademy and the specialized tools that run within MOOC platforms like Coursera and edX, IPython Notebook lets users write and run programs in their browser without having to interact with the file system, a text editor, or the command line. IPython Notebook also supports annotations and commentary, so code can be interspersed with blocks of formatted text. It can be configured to display mathematical equations, rich media, and charts inline, so the results of numerical computations appear visually, directly in the browser. All of these elements can be combined in a single, portable document that can be shared and modified by anyone running IPython Notebook. A gallery of interesting notebooks and a video of the notebook in action show what this looks like in practice.

IPython Notebook has become very popular in scientific research communities as a tool for communicating and publishing methods and calculations. It is a powerful way for researchers to keep a research notebook, showing their work and demonstrating to their peers exactly how they arrived at a particular result. High-profile calculation errors such as the Reinhart and Rogoff spreadsheet bug, where the accidental omission of a few rows affected a result that was widely cited in support of austerity policies in Europe, demonstrate the increasing importance of data provenance and of researchers showing their work so that others can reproduce and validate the results. Tools like IPython Notebook point towards a format that this kind of publication might take.

Education is another area where IPython Notebook is becoming popular. Instead of using a fixed slide deck, and perhaps an interactive terminal session for demonstrations, teachers write their lecture notes in IPython Notebook, which lets them modify the examples interactively and rerun them to explain a concept or respond to a question. The notebook documents can also be distributed to students so that they can follow along with the examples in class or review them afterwards. Within their own copy of a notebook, students can modify and tweak the code, poking at the parts they want to explore. Assignments can also be distributed as notebooks: students add their responses and submit them to their teacher, who can in turn provide feedback directly within the context of their code. All of this happens by exchanging small, portable .ipynb files. Graham was very interested in teaching with this tool, but needed advice on the best way to give his students access to it. A startup called trinket.io now offers web-based interactive tutorials as a service, although it does not currently provide access to all of the libraries that Graham wanted to teach with.
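To make the "small, portable .ipynb files" point concrete, here is a minimal sketch of what such a file contains. It assumes the JSON layout of notebook format version 4 (the layout used by later IPython/Jupyter releases; older versions differ), and the exercise text and helper names are hypothetical, chosen only for illustration.

```python
import json


def make_notebook(cells):
    """Build a minimal notebook dict from (cell_type, source) pairs."""
    nb_cells = []
    for cell_type, source in cells:
        cell = {"cell_type": cell_type, "metadata": {}, "source": source}
        if cell_type == "code":
            # Code cells also record their outputs and an execution counter.
            cell["outputs"] = []
            cell["execution_count"] = None
        nb_cells.append(cell)
    # Assumption: notebook format version 4 top-level keys.
    return {"cells": nb_cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 0}


def code_sources(notebook):
    """Return the source text of every code cell, in document order."""
    return [c["source"] for c in notebook["cells"] if c["cell_type"] == "code"]


# Hypothetical assignment: one prompt, one code cell for the student to edit.
assignment = make_notebook([
    ("markdown", "## Exercise 1\nCount the words in the sentence below."),
    ("code", "sentence = 'Call me Ishmael.'\nlen(sentence.split())"),
])

# A notebook round-trips through plain JSON text, which is why it is easy
# to email, upload, or keep under version control.
text = json.dumps(assignment, indent=1)
restored = json.loads(text)
print(code_sources(restored))
```

Because the file is ordinary JSON, a teacher could use a helper like `code_sources` to pull out just the code a student wrote, without opening the notebook in a browser.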

Graham also had other challenges to overcome, more general than providing access to this specific tool, and increasingly common among non-STEM faculty trying to introduce programming to their students. He needed a secure way to provide his students with a programming environment, and one that was fully stocked with all of the libraries and tools that he wanted them to learn about. In Graham's case, he wanted to expose them to a wide range of powerful Python libraries: the Natural Language Toolkit (NLTK), gensim for topic modeling, pandas for data analysis, NumPy for scientific computing, TextBlob for text processing, and matplotlib for graphical plotting. These libraries can be quite difficult to install, and though the experience of setting them up is itself instructive, teaching systems administration skills was not Graham's learning objective. He wanted his students to have a uniform, pre-built environment so they could start using these tools immediately. There are a few pre-built IPython Notebook environments for Windows and Mac, such as Anaconda or Enthought, but, once again, they don't ship with all of the libraries that Graham wanted to teach with, and the installation process was complex and daunting.
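One simple way to confirm that an environment is "fully stocked" is to check, from inside Python, whether each library can be found. The sketch below is not part of the course materials; it assumes Python 3 (for `importlib.util.find_spec`), and the function names are hypothetical. The module names are the import names of the libraries listed above.

```python
import importlib.util

# Import names for the libraries listed above (the name used in an
# `import` statement can differ from the project name).
COURSE_LIBRARIES = ["nltk", "gensim", "pandas", "numpy", "textblob", "matplotlib"]


def check_libraries(names):
    """Map each top-level module name to True if it can be found, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def missing_libraries(names):
    """Return the names that could not be found, preserving input order."""
    status = check_libraries(names)
    return [name for name in names if not status[name]]


if __name__ == "__main__":
    missing = missing_libraries(COURSE_LIBRARIES)
    if missing:
        print("Missing libraries: " + ", ".join(missing))
    else:
        print("All course libraries are available.")
```

Run as the first cell of a notebook, a check like this tells a student right away whether their environment matches the one the instructor expects. It only confirms that a module can be located, not that it is a particular version.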

Currently, neither Columbia's Central IT department nor the Libraries has a standard solution to the problem of setting up secure custom computing environments for instruction. As more and more courses across the social sciences and humanities embrace computational methods, there is a growing demand for services that address this need.

Together with Graham, CCNMTL developed a two-pronged solution for his course.

First, we developed a solution that would let each student run their own environment locally, on their own computer. IPython Notebook runs as a web server, and we needed a cross-platform way to easily distribute and update the complex custom environment that Graham needed. We used Vagrant, a popular tool that open-source communities have been adopting to create uniform development environments, to build a custom Vagrant box containing a fully configured IPython Notebook server along with pre-built versions of all of the libraries that Graham wanted to teach with. This Vagrant box was built on top of a Linux (Ubuntu 14.04) server and set up to run within the VirtualBox virtualization software. This means that once the students install VirtualBox and Vagrant (both of which support Windows, OS X, and Linux), their experience is consistent regardless of platform.

Graham created simple installation instructions (the zip file you need to download is here: DH-METHODS.zip), and we also released all of our work on GitHub: https://github.com/ccnmtl/ipython-notebook.

Second, as a fallback in case some students had difficulty installing the software, we also prepared a server version. Initially, we weren't certain how many students would enroll in the class, and a server version would inevitably be more difficult to administer, in terms of securing access to the server and the filesystem. IPython Notebook is not (yet) a multiuser environment, so we knew we would need one notebook server per student. Given these requirements, Docker's container technology seemed like a good fit. Alongside the Vagrant box, we also built a Docker version of an almost identical environment.
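The "one notebook per student" arrangement can be sketched as a small planning script that gives each student a separate container and host port. This is an illustration only, not the setup we deployed: the image name `dh-methods-notebook`, the base port 9000, and the roster are hypothetical, and the sketch assumes the notebook server inside the container listens on port 8888. It prints `docker run` commands rather than executing them.

```python
def docker_run_command(student, host_port, image="dh-methods-notebook"):
    """Build the argument list for starting one student's notebook container."""
    return [
        "docker", "run",
        "-d",                               # run in the background
        "--name", "notebook-" + student,    # one named container per student
        "-p", "{}:8888".format(host_port),  # host port -> assumed notebook port
        image,                              # hypothetical image name
    ]


def plan_containers(students, base_port=9000):
    """Assign each student a distinct host port, in roster order."""
    return {
        student: docker_run_command(student, base_port + offset)
        for offset, student in enumerate(students)
    }


# Hypothetical roster, for illustration only.
roster = ["student1", "student2", "student3"]
for student, command in plan_containers(roster).items():
    print(" ".join(command))
```

Keeping each student in a separate container isolates their files and processes from one another, which addresses part of the filesystem concern above. Restricting who can reach each port would still need to be handled separately.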

The class met for the first time last week, and after a few installation hiccups almost all of the students have successfully installed the Vagrant box. One or two have older machines that run the virtualized operating system too slowly to be practical, and we are working to set them up with their own Docker-based IPython Notebooks. This is a little less convenient than having all the tools local, but it should get us through the summer. In the future we also hope to work with some of the folks running computer labs to have this software installed on computers available to the students.

Finally, we're hoping to connect with the DH-in-a-box community, who are working in a very similar problem space. With the growing need for secure, easy-to-use computing platforms in teaching and learning, solutions like these are interesting and important to explore.