Topic: Projects

This summer Graham Sack, a doctoral student in the English department, is teaching an introductory course in Digital Humanities called "Computational Methods for Literary and Cultural Criticism". Graham came to CCNMTL inquiring about a cutting-edge approach to teaching programming to novices: a web-based programming environment called IPython Notebook.

DjangoCon '12 is on the East Coast this year, and we submitted a proposal to present on our recent intervention in South Africa. We hope to see you in DC!

Title: Offline and Off-Road: Django, Health and Human Rights

Description: For years, CCNMTL has been using Django to create interactive multimedia health interventions. We'll spotlight our latest NIMH-funded project, where we deployed Django to offline netbooks at South African HIV clinics and developed a sneakernet-based (i.e., USB drives) data synchronization protocol. We'll also present our FOSS CMS for authoring these ebook-like sites.

Abstract: For years, CCNMTL has been using Django to create interactive multimedia health interventions as a part of our Triangle initiative. We have worked closely with the Schools of Social Work and Public Health to explore the possibilities and benefits of incorporating rich, interactive multimedia into these kinds of counseling sessions.

In this talk we will spotlight Masivukeni, our latest NIMH-funded project, where we deployed Django to offline netbooks at South African HIV clinics and developed a sneakernet-based (i.e., USB drives) data synchronization protocol.

As we iterate over projects like these, we have continued to abstract the aspects of these projects that are idiosyncratic to this domain. We'll also present the open-source, lightweight CMS that we have created for authoring these ebook-like sites. The characteristics of these sites are serial content delivery (often with specific business rules, such as preventing access to already-seen or not-yet-seen pages), interspersed with casual learning "games" (e.g., html5/javascript drag and drop activities).

Finally, we will discuss the roadmap for this authoring tool, including the possibility of a networked, collaborative ebook authoring tool, that might export epub3 or SCORM-compliant sites.

At CCNMTL we focus on pedagogical innovation, but we continue to work on projects that involve delivering static educational materials in traditional sequential formats. We work hard to carve out places for study in a world of instruction, but there is plenty of important knowledge that people want to acquire, and training people on skills continues to be an important component of education and often a precondition of concept formation.

In many of our projects, we've explored the boundaries of what we call "Serial Directed Learning Modules". The key properties of these projects include:

  • Nested, hierarchical, rich content with idiosyncratic navigation and access rules
  • Rich interactive activities (quizzes, drag/drop, planning, mapping)
  • Detailed reporting on the learner's performance and completion

In our partnership with the Columbia University Medical Center and our strategic Triangle Initiative we've worked on several multimedia behavioral interventions that conform to this delivery pattern. We've worked on direct interventions relating to HIV couples counseling, childhood diabetes and cavity prevention, and treatment adherence, and we've developed directed learning modules for teaching practitioners about tobacco cessation, child abuse, and more.

While similar in the abstract, these projects vary in their devilish details. Some of these environments are mediated by a service provider, such as a social worker and their patient, while others are self-directed. Some require multiple modes with additional notes available only to the facilitator. A few lessons are completed in a single sitting, while others must preserve state and pick up where the learner left off.

We try to balance the effort of creating unique works of art with churning out boilerplate, cookie-cutter sites. We've explored the use of general-purpose content management solutions (CMS) for these projects and are regularly stymied by the mismatch between these styles of interaction and the sweet spots of the CMS platforms we know well. CMS platforms are great for creating collections of random-access content, and organizing and relating it in a variety of ways. The business rules around the directed learning projects often left us wrestling with CMS environments, wishing we had developed them using a lightweight MVC framework, without as much overhead to introduce the customized workflows these projects demand.

After building a few of these sites à la carte, we began to generalize our approach and developed the PageTree hierarchical menu-creation system for Django. PageTree evolved into an open-source, lightweight, domain-specific content management system, and we introduced a modular architecture for embedding and assembling PageBlocks, which introduce elements like text, media, or custom Javascript activities within pages. The source code for PageTree and a basic set of PageBlocks are available on our 'ccnmtl' GitHub account. We have also released the code and content powering the childhood diabetes intervention, and it is available here.

As the demand for these sites has grown, we've recently created a system for "farming" these PageTree sites -- aptly named "Forest" -- that allows our project managers to very quickly set up their own PageTree sites (called "Stands") in order to get a skeletal site up and running without the bottleneck or overhead of developer intervention. You can see a self-documenting demo of Forest here.

This approach allows us to collect content as early as possible. The features can be developed around the content, instead of vice-versa. If the site requires custom functionality that goes beyond the generic features of the Forest farm, we can spin off an independent Django site from the Forest farm, and begin development at the onset with the site's content already in place.

This system helped us achieve a nice balance between customization and efficiency, and we are pleased with the flexibility this approach has enabled for this class of projects. We're in the process of conceptualizing a roadmap for PageTree sites, and have been imagining a collaborative authoring platform that supports versioning, SCORM authoring and publishing, BasicLTI compliance, and more.

While working on a Javascript interactive for the diaBeaters project, we stumbled across an interesting problem with jQuery UI draggables. To wit: if you have draggable items inside a div with overflow:hidden, they're stuck. You can't drag them out of the container -- the div just scrolls out to infinity. (Try it sometime, it's awful.)

Here's the original drag-and-drop setup. The game involves dragging magnets from a menu on the left-hand side onto a refrigerator on the right.

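// Each magnet can be dragged; it snaps back if it is not dropped on a valid target.
jQuery(".magnet").draggable({
  revert: 'invalid'
});

// The fridge accepts drops and marks itself once a magnet lands on it.
jQuery("#fridge").droppable({
  drop: function(event, ui) {
    jQuery(this).addClass('dropped');
  }
});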
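
One commonly used workaround for draggables trapped in an overflow:hidden container is to drag a clone that is attached to the page body instead of the clipped div. The sketch below is an assumption about how that could be applied here, not the fix the project actually shipped; the helper name escapingDraggableOptions is made up for illustration, while helper, appendTo, scroll, and revert are standard jQuery UI draggable options.

```javascript
// Hypothetical helper (not from the original post): builds draggable options
// that let an item leave a container styled with overflow:hidden.
function escapingDraggableOptions(overrides) {
  var defaults = {
    revert: 'invalid',  // same snap-back behavior as the original setup
    helper: 'clone',    // drag a copy rather than the element itself
    appendTo: 'body',   // attach the copy to <body>, outside the clipping div
    scroll: false       // do not auto-scroll the container while dragging
  };
  return Object.assign({}, defaults, overrides || {});
}

// Only wire things up when jQuery UI is actually loaded on the page.
if (typeof jQuery !== 'undefined' && jQuery.fn && jQuery.fn.draggable) {
  jQuery('.magnet').draggable(escapingDraggableOptions());
}
```

Because the clone is what moves, the original magnet stays in the menu; the droppable's drop handler would then need to place or copy the magnet onto the fridge itself.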

This semester we upgraded our WordPress Multi-User (MU) installation to WordPress3. WordPress runs our EdBlogs course blogging platform, a system we support that is designed for multi-user course blogs. WordPress3 brings the WordPress MU fork back into the fold of the core WordPress distribution and continues the gradual improvement of its technical architecture and design.

We concentrated on revamping our standard themes and worked harder than I anticipated to make sure that the default experience within a newly created blog makes educational sense.

MediaThread is a media analysis communication platform we announced back in January.  At the moment it sports a number of central features:

  1. annotating images
    large images on any web page, Flickr, and some specific collections like ArtStor (for subscribers)
  2. clipping video into an annotation
    YouTube, quicktime, flv, flv pseudo-streaming, realmedia, h264, and preliminary ogg (when the browser supports it)
  3. embedding your image and video annotations into a multimedia essay
  4. discussing collected images and video (we call them assets) in a space where you can also embed annotations

We find this kind of communication affords and encourages deep analysis and "brings the laser pointer to the essay." Instead of just referencing a video and describing the scene, you can embed the exact moment and let the reader view the evidence directly and immediately.

We would love for this platform to grow beyond the walls of Columbia.  Fostering a community for a new open-source project is always a bit of a challenge, so please contribute with questions, suggestions, code, experience or insight.  The MediaThread forum will not just be for developers, so if you are using MediaThread, then tell us about your experience.

One of the primary tenets of agile development is test first, test often. After working in a small XP shop doing mobile development, I came to believe strongly that quality code hinges on a test-driven approach.

Coders, impatient with paper specs and endless product meetings, often rush to their keyboards and push out half-baked, poorly implemented solutions that don't meet anyone's needs. Writing tests -- especially in a test-first approach -- provides time for thoughtful inquiry into an application's overall design and specific functionality. The coder can express herself in her own comfortable environment and language. The resulting tests become permanent artifacts, able to verify functionality as the application is enhanced and refactored.

And, in less altruistic, more self-serving terms: good tests mean good code, and good code makes the coder look good. Why wouldn't you want to write tests?

Still, I was a little apprehensive when asked to set up a test infrastructure for the Mondrian JavaScript components. (Mondrian is our snazzy new web-based, multimedia annotation environment.) I've tackled many server-side testing tasks, but have managed to circumvent the swampy land of JavaScript. JavaScript generally does not lend itself to testing. Most JavaScript code I've seen is poorly organized, fragmentary and tightly bound to the browser. I've often lamented the lack of good JavaScript testing tools, but was also loath to tackle the seemingly messy, difficult task.

We are very excited to announce the release of our latest iteration on a web-based, multimedia annotation environment - code named: Mondrian Mediathread (source code). Mediathread builds on the strengths and experiences of our long history of annotation projects here at CCNMTL.

Mediathread is a collaborative multimedia analysis environment that supports deep critical exploration of primary multimedia source material, i.e. participatory education, research, democracy, and culture. The Mediathread platform supports a robust access control model with multiple analysis spaces and a variety of workflows (solo projects, collaborative projects, versioning, private projects, public projects, etc). The community portal also organizes streams of activity notifications to help the participants track each other's (net)work.

Participants in the analysis space collect multimedia assets from around the web, clip/annotate these assets, organize their clips, and create a multimedia composition where their clips are directly embedded inline in their analysis/argument. The upcoming release supports video clipping (quicktime, flowplayer, and youtube), and drawing on images (using the fabulous OpenLayers viewer).

For thousands of years critical and scholarly discourse around text has revolved around citation and reference. What might this kind of discourse look like around multimedia - html text, images, audio, and video?

This question is a central theme in our technical work here at CCNMTL, and a variety of our projects have taken a pass at this question from one angle or another. My colleagues have also taught me the importance of designing these kinds of features in ways that encourage students to critically engage with the source materials they are studying. How can we facilitate the marshaling of multimedia sources as evidence to support an argument or hypothesis?

In March, CCNMTL shipped a laptop to a South African AIDS clinic as a part of a multimedia health-care intervention.

We're not that experienced with desktop application development, so the main discussion was how do we bundle a web application on a stand-alone laptop with no connection to the Internet. The first proposal was to run a virtual machine (Xen or VMware) which would run the web server on the Windows desktop.

I was less sanguine about diagnosing problems with a web server across continents and timezones, and looked for a way to store state information from static web pages. Firefox's DOM Storage was close to an HTML5 standard (now finally implemented in Firefox 3.5), and seemed to work with URLs visited as "file://localhost/C:/...", which made the following process possible:

  1. Put static HTML files on the laptop
  2. All state is stored by the browser (in a file called webappsstore.sqlite)
  3. All application storage is accessed and modified by javascript (see code)
  4. Login state uses sessionStorage which works similarly but disappears after the browser closes (like a session cookie)
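
The steps above might look roughly like the following sketch. It is written against the standardized Web Storage interface (getItem, setItem, removeItem) rather than the Firefox-specific storage the original application relied on, and the names makeStore and memoryStorage, the key prefixes, and the stored values are all made up for illustration.

```javascript
// Hypothetical wrapper around a Web Storage-like object (localStorage,
// sessionStorage, or a stand-in exposing the same three methods).
function makeStore(storage, prefix) {
  return {
    save: function (key, value) {
      storage.setItem(prefix + key, JSON.stringify(value));
    },
    load: function (key, fallback) {
      var raw = storage.getItem(prefix + key);
      return raw === null ? fallback : JSON.parse(raw);
    },
    remove: function (key) {
      storage.removeItem(prefix + key);
    }
  };
}

// In-memory stand-in so the sketch also runs outside a browser.
function memoryStorage() {
  var data = {};
  return {
    getItem: function (k) {
      return Object.prototype.hasOwnProperty.call(data, k) ? data[k] : null;
    },
    setItem: function (k, v) { data[k] = String(v); },
    removeItem: function (k) { delete data[k]; }
  };
}

var inBrowser = typeof window !== 'undefined' && !!window.localStorage;

// Long-lived application data (step 2): survives browser restarts.
var appState = makeStore(inBrowser ? window.localStorage : memoryStorage(), 'app.');
// Login state (step 4): disappears when the browser closes.
var loginState = makeStore(inBrowser ? window.sessionStorage : memoryStorage(), 'login.');

// Example values are invented for illustration.
appState.save('participant-42', { module: 3, page: 7 });
loginState.save('currentUser', 'counselor1');
```

Passing the storage object in, rather than reaching for a global, keeps the page logic independent of which browser storage mechanism happens to be available.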

Instead of supporting a virtualization and web server stack, all that's left to support is the browser--something very familiar to all computer users by now. It's worked out great.

I should note that our application is not secure from a javascript hacker who has access to the computer--they could access and change all account information on our system. Fortunately, that's not an attack vector we're worried about.

OK, there's a dirty secret behind my not posting about this previously--it no longer works! There's a laptop in a South African clinic that's not getting any Firefox updates, security or otherwise, and that's a very good thing. Now, it seems, all browsers remove the 'localhost' from file:// URLs. The new HTML5 standard localStorage does not work for local files, and the deprecated globalStorage[hostname] doesn't work without a hostname!

HTML5 taketh away, but it giveth ath well. Instead of relying on file:// URLs, in the future we can label our site as an offline resource and then use the now standardized and implemented localStorage.

The one issue with this future approach is if we need to update the application while it's in the field. We haven't needed to do that on this project, but it's a comfort to know that if they discovered a critical bug, we could email them a single HTML file to replace, and the computer running the application does not need to be connected (to anything other than the USB key the new file is on). I sent our use cases for localStorage over to the HTML5 mailing list, but there's still work on the standards side and for the browser vendors.

OpenID is an increasingly popular universal sign-on mechanism on the web. Google, Facebook, LiveJournal, even Sears' online store are supporting it. We can, in theory, adapt Columbia logins to be an OpenID provider. This would allow members of the Columbia community to log in to other sites that accept OpenID with their Columbia UNIs.

At Columbia, CUIT provides a single sign-on mechanism for services within the university called Wind, which is based on the more ubiquitous CAS.

One problem with depending on Columbia-only authentication/authorization is that it makes it awkward, if not impossible, for students or faculty to work together with non-Columbia affiliates in the same protected environments. Guest lecturers can't access the course materials. Researchers have difficulty collaborating across institutions.

The solution is to use a broader authentication method. Shibboleth is one that has been baking for a long time within the academic world, but it seems like OpenID provides many similar features and will allow our community to interface on more popular websites, too.

It's not available yet; we still need to talk to the folks at CUIT about why it's a good idea. However, I've written an implementation that should make the following scenario possible:

For any service on the Internet, you should be able to type into an OpenID login 'columbia.edu.' This will shepherd you to a Wind login, and then Columbia will authenticate you with the site. No saved passwords and, if you choose, your name and email will automatically be sent to them.

The system, as implemented, would also let you log in as an anonymous Columbia student or faculty member. Sometimes you don't want a site to know your name or even be able to match your login with other sites you've logged in to the same way. Note, however, that the Columbia IT department could figure out who you are--you'd be trusting their servers after all.

Some tech details:
I used the mostly friendly JanRain php-openid library.
OpenID is pretty complicated, and it took a while before I began to suspect that my confusion came from a bug in the code rather than from my misreading of the spec. Overall, though, the example server was a great place to start.

One gotcha I encountered toward the end of the implementation: I was getting an error when testing at openidenabled.com: "OpenID authentication failed: No matching endpoint found after discovering [my OpenID]." It turned out that the Relying Party (the server you are logging in to), on the last step, queries the user's identity_url to check that it trusts the Provider (the server providing the login). When I checked what my provider was doing, it was serving the wrong information in that case--easily broken, but easily fixed with some better htaccess rules.
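
To make that last step concrete, here is a simplified sketch of the kind of comparison a Relying Party performs. It is not the JanRain library's code: the function name, the shape of the discovered object, and the example.edu URLs are assumptions for illustration, and a real implementation also checks protocol version, signatures, and nonces. The openid.claimed_id, openid.identity, and openid.op_endpoint keys are the field names used in OpenID 2.0 responses.

```javascript
// Hypothetical sketch of the Relying Party's final check: what the Provider
// asserted must match what discovery on the claimed identifier returns.
function matchesDiscoveredEndpoint(assertion, discovered) {
  return assertion['openid.claimed_id'] === discovered.claimedId &&
         assertion['openid.identity'] === discovered.localId &&
         assertion['openid.op_endpoint'] === discovered.opEndpoint;
}

// Invented example data: the Provider's positive assertion...
var assertion = {
  'openid.claimed_id': 'https://openid.example.edu/id/abc123',
  'openid.identity': 'https://openid.example.edu/id/abc123',
  'openid.op_endpoint': 'https://openid.example.edu/server'
};

// ...and what the Relying Party finds when it fetches the identity URL.
var goodDiscovery = {
  claimedId: 'https://openid.example.edu/id/abc123',
  localId: 'https://openid.example.edu/id/abc123',
  opEndpoint: 'https://openid.example.edu/server'
};

// A misconfigured identity URL that advertises the wrong endpoint.
var badDiscovery = {
  claimedId: 'https://openid.example.edu/id/abc123',
  localId: 'https://openid.example.edu/id/abc123',
  opEndpoint: 'https://openid.example.edu/wrong-path'
};

var accepted = matchesDiscoveredEndpoint(assertion, goodDiscovery); // true
var rejected = matchesDiscoveredEndpoint(assertion, badDiscovery);  // false
```

The second case is the situation described above: the login itself succeeds, but the identity URL serves information that does not match, so the Relying Party refuses the assertion.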

All the code is at http://github.com/ccnmtl/openid-wind-bridge

Firefox 3.5 was released today. Download it now!

Especially for our more public projects, we aim to support all of the most popular web browsers. However, our favorite browser is Firefox. When development time is short, or we have a controlled environment, we will default to requiring Firefox.

Why? Many of our projects go like this. We develop an application in a couple of weeks or months. It works perfectly in Firefox. Making it work in Safari is often a small tweak, taking an hour or so at most. But for Internet Explorer, we often need a full extra week or more to solve all the programming and style problems that appear.

For developers, Firefox has some incredible extensions, without which modern web development would be impossible. The most common ones we use are Firebug, LiveHTTPHeaders, and WebDeveloper.

For users, Firefox is also the best browser. In academic life we can use powerful extensions like Zotero. Because it's fast, we can design more demanding applications on the web. Because it supports SVG, Canvas and the video tag, we can make more powerful applications for analyzing images and video.

The Collateral Consequences Calculator is premiering this week at the New York State Bar Association's annual meeting, marking the culmination of a two-year development effort. The motivation and curricular goals driving this project are described in our portfolio. In addition to the formidable educational and logistical challenges, this project also presented some unique technical challenges, which are worth documenting and celebrating.

CCNMTL's primary mission is educational, and our design research is usually focused on pedagogy and improving the user experience, rather than infrastructure or basic research. Typically, we attempt to apply well understood software solutions to educational contexts and improve the experience around a well understood technical problem. While our developers dance on the cutting edge of enterprise solutions, it is difficult and risky to work on problems that computer scientists still consider 'hard'.

The Collateral Consequences Calculator is one of the more technically aggressive projects we have embarked on. The project has gone through various stages of complexity and sophistication, but at its core, we were asked to model the Law within Code. Since the Law is expressed using natural human language, this problem falls within the domain of artificial intelligence.

I've been largely an outside observer to our Country X project. Country X is a simulation used by International Policy students studying what scenarios lead to genocide. In the early stages, the rules of the 'game' converged on:

1. Four players, each in a specified role, starting with a watershed moment in an imaginary country.
2. The four roles are President, First-world Envoy, Regional Representative, and an Opposition Leader. (I immediately abbreviated these in my notes to 'P','E','R','O')
3. The game has four turns. In each turn, each of the four players decides on a context-specific list of three choices for their particular role. This is like playing a 'trick' in Bridge or Hearts.
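
A quick back-of-the-envelope sketch shows why specifying rules "for all of the contexts" was daunting. It assumes each player's choice is independent within a turn and that every combination could lead to a distinct context, which gives an upper bound rather than the real game's size; the variable and function names are invented for illustration.

```javascript
var ROLES = ['P', 'E', 'R', 'O'];  // President, Envoy, Regional rep, Opposition
var CHOICES_PER_ROLE = 3;
var TURNS = 4;

// Enumerate every combination of one choice per role (one "trick").
function enumerateTricks(roles, choicesPerRole) {
  var tricks = [{}];
  roles.forEach(function (role) {
    var next = [];
    tricks.forEach(function (partial) {
      for (var c = 1; c <= choicesPerRole; c++) {
        var trick = Object.assign({}, partial);
        trick[role] = c;  // this role picks choice number c
        next.push(trick);
      }
    });
    tricks = next;
  });
  return tricks;
}

// 3 choices for each of 4 roles: 3 * 3 * 3 * 3 = 81 possible tricks per turn.
var tricksPerTurn = enumerateTricks(ROLES, CHOICES_PER_ROLE).length;

// If every trick led somewhere different, four turns would allow
// 81 * 81 * 81 * 81 = 43,046,721 distinct play-throughs.
var upperBoundOnGames = Math.pow(tricksPerTurn, TURNS);
```

Even if most combinations collapse into a handful of shared states, 81 possible tricks per turn is enough to explain a wall covered in paper.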

When I first heard of the project, Jonah was working with the others on the project to specify the rules, for all of the contexts, in each of the states. He described an elaborate forest of paper that Mark, our client, had on the walls of his office. How were we, the programmers, going to get him to describe the rules?

The light at the end of the tunnel seems to actually be visible now for the Collateral Consequences project. There's still a formidable amount of manual data entry to be done, but we've got a part-timer on that, so it's not my primary concern. The difficult work is now out of the way. If you've been following the development at all, you know that we've basically got a system worked out and running. There were plenty of tricky problems to solve to get there and I've written about some of them here.

The last big "how on earth are we going to do that?" problem that I kept putting off solving was verification. This project involves law and even though we have big disclaimers plastered over everything, none of us really wanted to release this application to be used without being really confident that it's not going to give people wrong results. The problem, as I've explained before, is that the inherent complexity makes it really difficult to verify. You could change a single line of N3 code somewhere and potentially affect all the consequences on all the charges. It just isn't feasible to predict exactly what repercussions a change will have without running it for all the charges and manually inspecting the results.