Updated Thu Feb 13 11:33:34 EST 2025

COS IW01 and IW02: Computer Science Tools and Techniques for Digital Humanities

IW01 Thursday 3:00pm-4:20pm; IW02 Friday 11:00am-12:20pm in CS 402 (for the whole semester)

People

Brian Kernighan, COS
Wouter Haverals, CDH

Updates

Playlist

Warmup exercise
Exercise to do before the third meeting (Feb 13/14)
My approach

Administrivia

IW seminars have a common set of checkpoints and deadlines; keep an eye on them.

An important note on participation: We expect everyone to be present, on time, fully engaged, and actively participating. After the first two or three weeks, each person will present a brief but formal update on their project at each meeting.

If you are sick, quarantined, isolated, or otherwise potentially dangerous to the rest of us, stay home! Let me know, and I'll try to arrange a Zoom connection. Thanks.

We will use Slack for news, weekly progress reports, and the like. Invitations will be sent to all class members; please sign up before the first meeting.

I will try to arrive 10-15 minutes ahead of time for casual chat and will usually be able to hang out afterwards as long as anyone wants to stay. I will also have regular office hours during the week. I will meet with each of you individually every two or three weeks so we can talk about how your project is going; I will post a calendar link for scheduling appointments once we get rolling.

Overview of seminar

"Digital humanities" covers a wide variety of ways in which scholars in the humanities -- literature, languages, history, music, art, religion, and many other disciplines -- collect, curate, analyze and present information about their fields, using digital representations and technology.

Digital humanities data is intrinsically messy, and considerable effort always goes into cleaning it up before study can even begin. Much effort also goes into figuring out how to represent it effectively and make it accessible to others.
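To give a feel for what "cleaning it up" can involve, here is a minimal Python sketch, assuming the data is plain text from OCR or a web scrape. The function name clean_text and the three rules it applies (Unicode normalization, rejoining words hyphenated across line breaks, collapsing whitespace) are assumptions chosen for illustration; real datasets usually need their own rules.

```python
import re
import unicodedata

def clean_text(raw: str) -> str:
    """Apply a few generic cleanup steps to a messy text string."""
    # Normalize Unicode so visually identical characters compare equal
    # (NFKC also folds ligatures such as U+FB01 into plain "fi").
    text = unicodedata.normalize("NFKC", raw)
    # Rejoin words hyphenated across a line break: "manu-\nscript" -> "manuscript".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse runs of whitespace (spaces, tabs, newlines) into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()

# Hypothetical sample with a ligature, a split word, and stray whitespace.
sample = "The  manu-\nscript was \ufb01rst\tcopied in 1450.\n"
print(clean_text(sample))  # The manuscript was first copied in 1450.
```

Note that the hyphen rule is a blunt instrument: it will also merge a genuine compound like "well-known" if the line happens to break at the hyphen, so it is worth checking a sample of the output by eye.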

This seminar is aimed at building tools and developing techniques that will help humanities scholars work more effectively with their data. This might include exploratory data analysis, machine learning, natural language processing, text encodings, APIs, data visualization, data cleaning, and user interface design for making the processes available to scholars just starting out in technology.
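As one small example of exploratory data analysis on text, the sketch below counts word frequencies using only the Python standard library. The function name word_counts and the sample sentence are hypothetical, and the tokenization (runs of letters, lowercased) is a deliberately crude assumption; a real project would likely use a proper tokenizer.

```python
import re
from collections import Counter

def word_counts(text: str) -> Counter:
    """Count occurrences of each lowercased word in a text."""
    # [^\W\d_]+ matches runs of letters in any script (no digits or underscores).
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(words)

# Hypothetical sample text.
counts = word_counts("The whale, the sea, and the ship.")
print(counts.most_common(1))  # [('the', 3)]
```

Even a count this simple can suggest what to look at next, such as which common words to filter out or which names recur.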

A typical project will begin with a dataset of interest to you, or a focus on a CS technique. In the former case, the goal will be to explore the data to learn and present new and interesting things about it. In the latter case, the goal will be to create or improve tools, languages, and interfaces to help scholars in the humanities.

Tools and techniques to start exploring

You're not expected to know about all of these, of course, but you're certain to be using some, and you should at least have an idea of what most of them are and how they might be relevant. We'll talk about some of them in class, but for the most part you're on your own. Once you get underway, you'll pick up what you need.

Things to do now

Potential data sources

Many sites have data for download, or accessible via an API, or both. If you have an idea, start now to determine whether the data you want is really available. Searching for "digital humanities databases" will yield useful results. The following links are not even remotely complete, just places where you could start. Suggestion: click each link and see if anything interests you. Come back from time to time as you learn more.

Potential example code libraries

Other resources