COS 333 Project: Project Ideas

COS 333: Project Ideas

Wed Feb 27 14:17:44 EST 2013
Newer item(s) are at the front.

Overview

The ideas here have come from a variety of friends on campus. I've edited their prose a bit, but have tried to leave their ideas intact. You're welcome to approach the people listed directly, and I would be happy to act as an intermediary if you prefer, or to help you solicit more information. There are also some project ideas from last year at the end of the list that might still be current.

Help the COS Lab TAs, Alex Daifotis (daifotis@princeton.edu)

With the rise in enrollment, the COS Lab TAs have really been stretched to continue providing the help they always have. One problem of particular note is that we are pretty technologically backward in terms of assigning people to shifts, getting feedback from students, and making sure people are being helped ASAP. Right now our entire system is basically email the list for shift switches, and write names on a blackboard as people need help. Help us do better than this! We're open to people making the project into their own vision, but broadly speaking -- we are looking for a way of tracking people (students and TAs) in the lab more efficiently, and keeping track of relevant stats (e.g., how many people are helped in a given night and what assignments they were working on). Contact Alex Daifotis for more information.
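As a starting point for thinking about the data involved, here is a minimal sketch of a help queue that replaces the blackboard and records the stats mentioned above. The class and method names (`HelpQueue`, `request_help`, `next_student`) are invented for illustration; a real system would also need persistence, authentication, and shift scheduling.

```python
from collections import Counter, deque


class HelpQueue:
    """First-come, first-served queue of students waiting for a lab TA."""

    def __init__(self):
        self.waiting = deque()   # (student, assignment) pairs, oldest first
        self.helped = []         # (student, assignment, ta) records

    def request_help(self, student, assignment):
        """A student signs up, as they would by writing on the blackboard."""
        self.waiting.append((student, assignment))

    def next_student(self, ta):
        """A TA takes the longest-waiting student, or None if nobody waits."""
        if not self.waiting:
            return None
        student, assignment = self.waiting.popleft()
        self.helped.append((student, assignment, ta))
        return student

    def stats(self):
        """Summarize how many students were helped, broken down by assignment."""
        return {
            "helped": len(self.helped),
            "by_assignment": Counter(a for _, a, _ in self.helped),
        }
```

The same records could later feed per-night or per-TA reports without changing how students sign up.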

Teen Sex Diaries: Revolutionizing the Social Sciences, Janet Vertesi and Marta Tienda (jvertesi, tienda@princeton.edu), Sociology

Social scientists still don't understand why teens start having sex when they do. To reach teens and get them to tell us their stories from their point of view, we want to develop a mobile diary so they can tell us about their romantic relationships and their social lives. Using mobile apps for data collection is new to sociology, and the first results are already changing the social sciences.

This will require a mobile app, since phones are the primary form of web and social access for teens. Privacy and encryption matter; the sensitive community and topic require high-quality end-to-end security. Front end development is needed both for making teen users feel comfortable on the system, and for presenting the data to social scientists for them to analyze. How do we code, query, access, and visualize this data? We need a modular design because we don't want a one-off. Can you write a platform that can be modified or extended to other projects? We're looking for a fully-functional system in a real world setting.

OIT infrastructure and maintenance reporting systems, Jay Dominick (jdominick@princeton.edu), OIT

1. A system to report maintenance related issues on campus at time of observation. A quick example would be that if you see a lightbulb out, you would snap a photo of it with your phone, add some location information, put the info in a database and then alert a technician. This could also be used to report printer malfunctions, cluster problems, etc.

2. A recommender system to help students navigate the information and service infrastructure of campus. This gets at the problem that "There are so many resources here for students, and the organization of the information is so poor, that students don't know what they could be doing." This is a particular problem for OIT in that we offer lots of tools and technologies for students but can't seem to communicate particularly well about them. However, some students do manage to figure it out and their expertise could be very useful to others. How do you capture what the experts know about campus and make it available (in a time-relevant manner) to other students?

Helping individual political activism, Sam Wang (sswang@princeton.edu), Princeton Election Consortium

Create an app that would be a finder to locate the nearest Congressional districts that are easily swayable by individual activism, i.e., get-out-the-vote (GOTV) activity. Input: ZIP code, most recent election outcomes, Congressional/national polls if available to calculate an offset from the recent election outcomes. Output: Congressional districts within 3 hours' drive where GOTV is most likely to sway a race, i.e., where the race is within 5, 10, or 15 points, depending on how aggressive you want to be. Ideally it would be in a condition to be updated by the authors or by someone else for the next election. It has the potential to be very useful.
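The filtering step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `District` fields, the function name, and the sample numbers are all made up, and a real app would have to compute driving times from the ZIP code and load actual election results and polls.

```python
from dataclasses import dataclass


@dataclass
class District:
    name: str
    last_margin: float   # margin in the most recent election, in points
    drive_hours: float   # assumed precomputed driving time from the user's ZIP


def swayable_districts(districts, poll_offset=0.0, max_margin=5.0,
                       max_drive_hours=3.0):
    """Return names of reachable districts with a close adjusted margin.

    The adjusted margin is the last election margin shifted by a polling
    offset. Results are ordered closest race first.
    """
    hits = []
    for d in districts:
        adjusted = d.last_margin + poll_offset
        if d.drive_hours <= max_drive_hours and abs(adjusted) <= max_margin:
            hits.append((abs(adjusted), d.drive_hours, d.name))
    return [name for _, _, name in sorted(hits)]


# Hypothetical data, for illustration only.
sample = [
    District("District A", 4.0, 1.5),
    District("District B", -12.0, 0.5),
    District("District C", 2.0, 4.0),
    District("District D", -1.0, 2.5),
]
```

Calling `swayable_districts(sample, max_margin=15.0)` corresponds to the most aggressive setting in the description, while the default of 5 points is the most conservative.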

Princeton Prosody Archive, Grant Wythoff (gwythoff@princeton.edu) and Meredith Martin (mm4@princeton.edu), English

The Princeton Prosody Archive, a collection of historical texts on poetry and poetics, is looking for a redesigned interface. The Archive offers a range of historical documents, including manuscripts, manuals, articles, grammar books, and other materials all pertaining to the rhythm, intonation, and measure of language. Not a static repository of historical data, the Princeton Prosody Archive engages scholars to re-think the past and future of navigating, conceptualizing, and historicizing large amounts of data in a format that will be useful for scholars who work with many kinds of digital-text archives. The archive needs a search portal for the full text of 3,000 books, and a browser for the original page images of these books. Also possible will be a collaboration with Princeton computer science researchers on building graphical interfaces to topic models derived from these texts on poetics.

Virtual Campus Tour: Past and Present, Shaun Ellis (shaune@princeton.edu), Library

Get a tour of Princeton's campus using "augmented reality" on your Android/iPhone, including both current geolocated live tweets and historic images and names associated with locations, buildings, and sculpture that you see through your mobile phone's camera while wandering the campus.

How? Using augmented reality for mobile devices, add a Princeton University layer in the Android/iOS augmented reality application Layar, to see two kinds of things: 1) tweets using the #PrincetonUniversity hashtag (if the tweets are geolocated so they can be nailed to a point) and 2) points of interest from a shared Google Map.

Mash up the existing Locations and Places Web Services with Library Special Collections resources (such as the Princeton University Historic Postcard Collection) to provide some historic context, images, and name associations. The Library can work with students to add location data to the existing Digital Library Web Service.

OIT Data Feeds, Sal Rosario (srosario@princeton.edu), OIT

OIT has recently made available a number of interesting data feeds.

Some of these might suggest ideas for projects, perhaps phone apps, that might combine multiple sources of campus data.

From 2012:

Analysis and Visualization of Text Influences, Henry Cowles (hcowles@princeton.edu), History

We want to search one text (A) with a set of other texts (B & C), find places in A that are identical or similar to B or C, tag them accordingly, and have a way to link back accurately to exactly where in B or C the relevant text portion is from. Historians would like to be able to do this to identify previously unrecognized instances of plagiarism or, more often, "self-plagiarism," in which an author has lifted text from previous things she's written (from letters to articles) and re-used it in a later text. Such a search/visualization tool would be of general interest to intellectual historians, especially those dealing with authors/thinkers who have limited corpora, were known to plagiarize themselves, and do so in complex or otherwise misunderstood ways.

Such a project would have two parts: a search algorithm and a visualizer (like a PDF viewer). The former should be able to match strings (not line-based, but sentence-based), and it would be great if there were ways to avoid false negatives especially (e.g., an author might blank out a name in an example paragraph, such that exact strings wouldn't match but the paragraph is still lifted and we'd like to pick that up). The visualizer is almost more important: something clean, into which digital texts could be loaded in a straightforward way and out of which "results" could be drawn (i.e., highlighted text, with scrollover revelations of original provenance in B and C, listed in a menu bar).
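To make the search half concrete, here is one possible sketch of sentence-based fuzzy matching using Python's standard `difflib`. It is not the method the proposer has in mind, just an illustration: the naive sentence splitter, the function names, and the 0.8 similarity threshold are all assumptions, and comparing every sentence pair would be too slow for large corpora.

```python
import re
from difflib import SequenceMatcher


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_reuse(text_a, sources, threshold=0.8):
    """Find sentences in text A that closely resemble sentences in the sources.

    `sources` maps a label such as "B" to that text's contents. Each match is
    (sentence from A, source label, sentence index in source, similarity).
    """
    matches = []
    for sent_a in split_sentences(text_a):
        for label, source_text in sources.items():
            for idx, sent_b in enumerate(split_sentences(source_text)):
                ratio = SequenceMatcher(None, sent_a.lower(),
                                        sent_b.lower()).ratio()
                if ratio >= threshold:
                    matches.append((sent_a, label, idx, round(ratio, 2)))
    return matches
```

Because the comparison is by similarity ratio rather than exact equality, a sentence in which a name has been changed or blanked out can still be picked up, and the returned label and sentence index give the visualizer something to link back to.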

Library dashboard(s), Kevin Reiss (kr2@princeton.edu), Library

The dashboard/visualization project I had in mind would have three components:

1. Select some interesting metrics about the usage and performance of our library websites and library search applications. This data could be culled from the actual application logs or could be harvested from our Library Google Analytics account using the Google Analytics API. Part of this work would involve setting up some processing routines to regularly harvest data from these sources.
2. The creation of useful visualizations based on the data selected for harvesting. This could be an opportunity for the student to explore a visualization tool such as processing.js or work with something like the Google Visualization API.
3. The packaging of these visualizations into a dashboard environment that supports access controls. This portion of the project could be accomplished through the creation of a module for the Drupal CMS the library is transitioning to or with another common web programming framework.

If this is too expansive a project for the course we definitely could focus on just points one and two above.

Digitization of materials, Marvin Bielawski (marvinb@princeton.edu), Library

Is there a way we could get full-text transcriptions of some of the handwritten manuscripts in the digital collections? It would be great to develop some tools that would allow us to "crowd source" this kind of thing. For example, imagine a competitive game among alumni to see which class could transcribe the most Trustees Minutes. From my experience in the Office of Development, they love that type of healthy competition among classes to see who can help out Princeton more. Here's an example of a great game-like interface for crowd sourcing ship logs for better climate change data: http://www.oldweather.org and here's an in-depth explanation.

Marvin is also interested in data analysis and visualization tools for a variety of library systems, in effect a dashboard for the library.

Mail interfaces, Randee Tengi (rit@princeton.edu), Psychology

Like many people, I just treat my email like a filing cabinet, making levels of nested folders, saving emails with attachments, while often the attachment is all I need, and filling up email disk quota. But if I file it on my hard drive, I often can't find it again. I just find it much easier to find things in email folders than on disk, in large part because the directory/folder structures don't match. I am not sure exactly what I'm getting at, but I would love some way to make my email folder structure automatically map to a corresponding folder on my hard drive so that I could store the email in an email folder, but the attachment in a disk file so it's not chewing up mail quota. Or maybe a way to "link" an email (Exchange) folder to a folder on disk so that if I file the mail in an email folder, I could, at a click, save the attachment somewhere on disk where I can easily find it, and remove it from the email.

How about a Thunderbird extension that allows me to right-click on an attachment and decide if I want to save it in a place that I choose or in a folder that corresponds to an email folder structurally, but is on my hard drive? Or, a way to save a message in an email folder, but that gives me the option of where to put the attachment - in the email folder or on my local drive, but with a 'stub' of some sort being kept with the message so it can retrieve the attachment from disk when I view the message.
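The core of the folder mapping could be as small as the sketch below. It is written in Python purely for illustration (an actual Thunderbird extension would be written in JavaScript), and it assumes that mail folder paths use "/" as the separator and that attachments are mirrored under a root directory; the root path and the function name are hypothetical.

```python
from pathlib import PurePosixPath


def attachment_path(mail_folder, filename, root="/home/user/MailAttachments"):
    """Mirror a mail folder such as 'Inbox/Grants/2013' under a local root.

    Empty, '.', and '..' components are dropped, and only the final component
    of the attachment name is kept, so a crafted name cannot escape the root.
    """
    parts = [p for p in mail_folder.split("/") if p not in ("", ".", "..")]
    safe_name = PurePosixPath(filename).name
    return PurePosixPath(root, *parts, safe_name)
```

The 'stub' kept with the message could then simply record this path so the attachment can be found again when the message is viewed.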

College wise calendar, Patrick Caddeau (caddeau@Princeton.edu), Forbes College

The idea is to help students to sync up how they spend their time with their academic goals, important deadlines, and milestones in the progress from freshman to senior, and beyond. Many students struggle with how to wisely allocate their time and find management of their time to be a major source of stress. A college wise calendar would provide a map that connects how students spend time with accomplishing major goals. It would have three main features:

(1) automatically populate with all significant university deadlines (add/drop, pdf, mid-terms, dean's date, deadlines for declaring a major, JP, etc.). For some of these events, there could be an estimation of how far in advance you need to prepare for the event so you can see a bar indicating when you should begin planning (for example, 72 hours before add/drop so you have time to schedule a meeting with your professor or adviser to get an update on your status in the course, a month prior for the JP deadline to make sure you have a working draft, etc.). Users could add, sync, or import additional events from other calendars. If a student selects a particular major, the calendar could populate with a list of departmental requirements that could be dropped into the calendar in the appropriate term -- using features of ICE perhaps? Courses that have prerequisites would prompt users for those courses when they are dropped into the calendar.

(2) a feature that ranks or tags calendar events with relative importance to you and what type of goal it is connected with -- for example: thesis would be tied to the "graduate from Princeton" goal so it would be ranked high, while attending TH night arch sing could be given relatively low importance and tied to "relaxation".

(3) a zoom feature allowing users to see a week, month, term, year, or all four years at Princeton in a single view. Events that are ranked high in importance (for example finish thesis with a 6 month block of time) would be visible even from the highest level while events ranked lower would only be visible when viewed at a higher resolution. This would help students to think about how they spend their time as it relates to their goals by seeing long term goals and deadlines from different perspectives.
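The three features above suggest a simple data model, sketched here under assumptions: the 1 to 5 importance scale, the zoom-level cutoffs, the field names, and the sample dates are all invented for illustration and are not real university deadlines.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed cutoffs: the wider the view, the more important an event must be.
ZOOM_MIN_IMPORTANCE = {"week": 1, "month": 2, "term": 3, "year": 4,
                       "four_years": 5}


@dataclass
class Event:
    title: str
    due: date
    importance: int      # 1 (low) to 5 (high)
    goal: str = ""       # e.g. "graduate from Princeton" or "relaxation"
    lead_days: int = 0   # how far in advance to begin planning

    @property
    def start_planning(self):
        """Date where the 'begin planning' bar for this event would start."""
        return self.due - timedelta(days=self.lead_days)


def visible_events(events, zoom):
    """Return titles of events important enough to show at this zoom level."""
    cutoff = ZOOM_MIN_IMPORTANCE[zoom]
    return [e.title for e in events if e.importance >= cutoff]


# Hypothetical events with made-up dates.
events = [
    Event("Thesis", date(2013, 4, 15), 5, "graduate from Princeton", 180),
    Event("Add/drop", date(2013, 2, 15), 3, "graduate from Princeton", 3),
    Event("Arch sing", date(2013, 2, 14), 1, "relaxation"),
]
```

With this data, the four-year view would show only the thesis, while the week view would show all three events.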

De-duping RECAP, Marvin Bielawski (marvinb@Princeton.edu), Library

Princeton, the New York Public Library, and Columbia run a joint off-site storage facility on the Forrestal campus named "RECAP." One of our long-term dreams is to do something called "de-duping," meaning "de-duplicating," meaning storing only 1 copy of a particular volume rather than 2 or 3 (one from Princeton, one from NYPL, and one from Columbia). There are many obstacles to this dream, some of which are legal (e.g., if there's one copy, who owns it?).

But one of the obstacles is technical. To de-dupe, we would have to identify the duplicates (preferably before they entered RECAP). This can be tricky for at least two reasons. One is that, given the purposes of the research libraries, it will matter (at least sometimes) whether the duplicates are exact or not: the 2nd edition and the 4th edition are not perfect substitutes for one another. The second is that Princeton, NYPL, and Columbia all run differently configured online catalogs, so it becomes a clunky, manual process to compare records.

This should be solvable by a Kayak-like program: if one app can search a bunch of airline websites, why not an app that combs multiple library databases?
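Once records from the three catalogs have been fetched, the comparison step might look like the sketch below. It assumes each record has already been reduced to a dictionary with `id`, `title`, `author`, and `edition` fields, which is a simplification invented for this example; the function names and sample records are likewise hypothetical, and querying the catalogs themselves is not shown.

```python
import re
from collections import defaultdict


def match_key(record):
    """Build a comparison key from normalized title, author, and edition.

    The edition is part of the key so that, for example, a 2nd and a 4th
    edition are never treated as duplicates of one another.
    """
    def norm(text):
        cleaned = re.sub(r"[^a-z0-9 ]", "", text.lower())
        return " ".join(cleaned.split())

    return (norm(record["title"]), norm(record["author"]),
            record.get("edition"))


def find_duplicates(catalogs):
    """Group records by key and keep groups held by more than one library."""
    groups = defaultdict(list)
    for library, records in catalogs.items():
        for rec in records:
            groups[match_key(rec)].append((library, rec["id"]))
    return {key: holders for key, holders in groups.items()
            if len({lib for lib, _ in holders}) > 1}
```

Normalizing case and punctuation before comparing is what lets differently formatted catalog entries for the same volume line up; anything this simple key misses would still need manual review.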