COS 333 Project: Project Ideas

COS 333: Project Ideas from Around Campus

Tue Feb 19 15:32:43 EST 2019

Newer items are at the front.

Overview

There are many individuals and groups on campus who have really interesting problems that could be profitably attacked by folks in COS 333. Here are some of them. They have come from a variety of friends and colleagues on campus. I've edited their prose a bit, but have tried to leave their ideas intact. You're welcome to approach the people listed directly, and I would be happy to act as an intermediary if you prefer, or to help you solicit more information. The first few are new; the others are hold-overs from previous years but still of interest.

Digital Humanities Tools
Meredith Martin, Department of English

There are many digital humanities problems, large and small, that could profit from some CS attention. Here are some ideas.

Large-scale "unsolved problems": (1) full-text search across multiple languages; (2) named-entity recognition within texts that somehow knows not to look at titles; and (3) teaching the computer to detect when the OCR is returning something that is not text, so I can easily detect typographically unique pages rather than doing so by hand.
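
As one crude starting point for the third problem, a page could be scored by how much of its OCR output looks like words. The sketch below is only a heuristic under stated assumptions: the function name text_likeness, the punctuation set it strips, and whatever cutoff a project might choose are all hypothetical, and real pages (tables, verse, numerals) would need something better.

```python
def text_likeness(ocr_output: str) -> float:
    """Return the fraction of whitespace-separated tokens that look like words.

    A token counts as word-like if, after stripping common surrounding
    punctuation, it is at least two characters long and purely alphabetic.
    This is a heuristic assumption, not a validated OCR-quality metric.
    """
    tokens = ocr_output.split()
    if not tokens:
        return 0.0
    wordlike = 0
    for token in tokens:
        core = token.strip(".,;:!?\"'()")
        if len(core) >= 2 and core.isalpha():
            wordlike += 1
    return wordlike / len(tokens)
```

Pages that score low could then be queued for a human to check as possibly typographically unique, instead of every page being inspected by hand.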

Small-scale problems: (1) isolating excerpted texts (a single article) in fully-indexed digitized bound periodicals (which is how HathiTrust gives me data); (2) collapsing multiple reprints so that only the first displays and the user has to "see more editions" for the rest to display; (3) building on the (now non-existent) named-entity recognition I dream of above, so we can visualize a cultural field of reference or discourse across texts (as in: this text is mentioned in / referenced by, like Google Scholar except that these scholars don't use citations in the same way); (4) adding other kinds of visualizations to display the collections in more evocative ways; (5) adding OCR quality to the search result filters.

Tutoring and Advising
Patrick Caddeau, Dean of Forbes College

Two projects that would help undergrads with tutoring and advising:

a system to connect students eligible to tutor courses and students who need tutors.
a system to facilitate students' sharing their academic plans and course enrollment plans with their academic advisers and to get feedback and approval of their plans -- basically a user friendly version of the Academic Planning Form.

Art Museum Services
Stephen Kim, Associate Director for Information and Technology

The Princeton University Art Museum offers a world-class collection of over 100,000 works of art spanning antiquity to the present. While more than 200,000 visitors visit our galleries in a year, we are always eager to develop new ways to engage audiences, especially YOU, our students. Recently, we've built out new data and image services to power potential innovations like:

"Art of the Day": A service that introduces users to a single object from our collection each morning. Relevancy to the user could be built off a variety of potential sources (see next bullet point).
"Tinder . . . for Art": An application that presents a pair of objects relatively devoid of context and asks for a simple thumbs up, thumbs down (or swipe left, swipe right). Then repeat "n" times, resulting in a visual "taste" profile driven by machine learning for the user.
"Oh, that's what that's called?": A geo-driven tool that identifies the nearest pieces of campus art (the over 600 sculptures, portraits, and other objects outside the Museum building around campus) to the user's current location.
"Scavenger Hunt": A service that leads users (could be students of all levels, both University and K-12) via clues to art objects around campus. Could be integrated with augmented reality or machine vision experiences.
"Virtual Museum": A service that allows users to curate their own digital exhibition and share with others. Could include a virtual reality component.

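For the geo-driven idea in the list above, the core computation could be as small as a great-circle distance plus a sort. The following is a minimal sketch, assuming each campus artwork is available as a dictionary with hypothetical "name", "lat", and "lon" keys; the Museum's actual data services may expose something quite different.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_artworks(user_lat, user_lon, artworks, k=3):
    """Return the k artworks closest to the user's location, nearest first.

    `artworks` is assumed to be a list of dicts with hypothetical keys
    "name", "lat", and "lon".
    """
    return sorted(
        artworks,
        key=lambda art: haversine_m(user_lat, user_lon, art["lat"], art["lon"]),
    )[:k]
```

With only about 600 objects, a linear scan like this is likely fast enough; a spatial index would matter only for a much larger collection.
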
Communities of Interest App: letting citizens talk back to redistricters
Sam Wang, Neuroscience

Every 10 years, legislative districts across America must be redrawn after the Census. Redistricters have the task of making sure that diverse communities within a state are fairly represented. But they do not always know where those communities are.

Citizens have opportunities to testify about their communities in public hearings. But that testimony is qualitative, and there is no way to integrate the comments in a unified way. It would be useful to have a graphical application for individuals to (a) draw their communities of interest (COIs) on a state map, (b) store the shapes in a standard GIS format, and (c) annotate the shapes with comments. Then, after citizens have participated, it would be useful to display all of the communities of interest in a single map for inspection.
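
One candidate for the storage step is GeoJSON (RFC 7946), which most GIS tools can read. The sketch below shows how a drawn shape and its comment might be packaged as a GeoJSON Feature; the function names and the "comment" and "author_id" property names are hypothetical, and the choice of GeoJSON itself is an assumption rather than something the project description specifies.

```python
import json


def make_coi_feature(ring, comment, author_id):
    """Package one drawn community of interest as a GeoJSON Feature.

    `ring` is a sequence of (longitude, latitude) pairs tracing the outline.
    GeoJSON requires the ring to be closed (first point == last point) and
    to contain at least four positions, so the ring is closed if needed.
    """
    coords = [[float(lon), float(lat)] for lon, lat in ring]
    if coords and coords[0] != coords[-1]:
        coords.append(list(coords[0]))
    if len(coords) < 4:
        raise ValueError("a polygon needs at least three distinct points")
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [coords]},
        # Property names below are illustrative, not a fixed schema.
        "properties": {"comment": comment, "author_id": author_id},
    }


def make_collection(features):
    """Bundle many COI features so they can be shown on a single map."""
    return {"type": "FeatureCollection", "features": list(features)}


def to_geojson_text(collection):
    """Serialize a collection to text for storage or exchange."""
    return json.dumps(collection)
```

A FeatureCollection built this way can be loaded directly by common web map libraries, which would cover the "display all of the communities in a single map" step.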

An additional feature might be reduction of redundancy by combining highly overlapping communities in a single consensus graphical display object.
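
One way to approach this redundancy reduction is to measure overlap with the Jaccard index and merge any communities whose overlap exceeds a cutoff. The sketch below assumes each community has already been converted to a set of small area identifiers (for example grid cells or census blocks); that representation, the function names, and the 0.8 threshold are all assumptions for illustration.

```python
def jaccard(a, b):
    """Overlap of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_overlapping(cois, threshold=0.8):
    """Group communities whose pairwise Jaccard overlap meets the threshold.

    `cois` is a list of sets of area identifiers. Groups are formed
    transitively with a small union-find structure, and each group is
    returned as the union of its members' areas.
    """
    parent = list(range(len(cois)))

    def find(i):
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(cois)):
        for j in range(i + 1, len(cois)):
            if jaccard(cois[i], cois[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, areas in enumerate(cois):
        groups.setdefault(find(i), set()).update(areas)
    return list(groups.values())
```

Returning the union is only one choice for the consensus shape; the intersection, or the areas named by a majority of the group, would be equally easy to compute from the same sets.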

Dynamic Frist Displays
Abby Klionsky '14, Office of the Executive Vice President

The decor in Frist -- all the quotes painted on the wall, etc. -- is meant to represent a diversity of ideas, and is one of the places on campus that, theoretically, does this quite well. It's theoretical because we don't know how much people actually pay attention to them, nor whether they know anything about the person being quoted.

There is actually documentation of all of this, in a very old-school, circa-2000 website that pairs photos of the quotes with photos, bios, and explanations of the people being quoted: http://princeton.edu/frist/iconography.

This also covers the images in Cafe Viv and some of the Princeton-y flotsam that adorns the halls and walls. It would be GREAT if this could actually be a site that made people interested in looking at it!

Could we build a system that showed these images much more dynamically, perhaps with a rotating sequence of pictures that always showed something interesting? For each one, perhaps there could be a QR code that pointed to more details. Or maybe a touch screen would make it easy to get more details. Would it be possible to add new images and new text very easily without having to be an expert? Are there other things that would make the displays more appealing and encourage people to look at them more carefully?

Co-curricular Opportunities: A Better Understanding
Claire Pinciaro '13, ODUS

Do you ever find yourself overwhelmed by the number of co-curricular opportunities available at Princeton? Do you find yourself wishing that there was an efficient way to find out which groups, teams, and organizations your peers belong to?

Imagine a centralized digital platform in which you and other students can keep track of your co-curricular involvements, search the profiles of other students, and see the membership of student organizations in real time. Think Tigerbook but with a co-curricular section.

We've done a lot of research in this sphere and know that there's real potential for this to be a hit not only at Princeton, but at other schools as well.

Princeton Prison Teaching Initiative
Jill Stockwell, McGraw Center

Two ideas would greatly improve our organization's efficiency and communication. One is a volunteer application management system for our 150+ applicants each semester; another is a carpooling application for each of the seven facilities where we teach.

Managing maps and geospatial data
Wangyal Shawa, Map and Geospatial Information Center

We are planning two projects to manage our scanned maps and create geospatial data. One project is related to creating a batch georeferencing tool that will georeference scanned topographic maps that are the same size and the same scale. There is one system called QUAD-G (open source) to process the United States Geological Survey 1:24,000-scale maps, but this software does not work well if you have a smaller-scale map series. We need to customize the QUAD-G software to work with smaller-scale maps using the same programming language, or redesign it with a different programming language using similar workflows.

Another project is to design an open source software system that will convert georeferenced scanned maps into vector geospatial data.

These projects will benefit many researchers and libraries.

Ideas from 2018

Improve Tiger Energy
Caroline Savage, Office of Sustainability

This project would try to improve Tiger Energy by:

Incorporating weather-normalized data into the residential college energy consumption ranking tool
Incorporating educational components on how to lower personal energy usage
Designing a student feedback tool on building comfort (e.g. temperature, humidity conditions)
Creating a way-finding app to help the campus community quickly identify the nearest water refilling stations to encourage use of Drink Local/reusable water bottles
Developing an app that educates, tracks, and rewards proper recycling/actions through year-long competition
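
For the first item in the list above, one common way to weather-normalize energy data is to divide consumption by degree days, so that a cold month does not look like wasteful behavior. The sketch below is an illustration only: the 65 °F base temperature, the function names, and the idea of ranking by change against a baseline period are assumptions, not a description of how Tiger Energy works.

```python
BASE_TEMP_F = 65.0  # conventional degree-day base; an assumption here


def degree_days(daily_mean_temps_f, base=BASE_TEMP_F):
    """Combined heating and cooling degree days for a period."""
    return sum(abs(temp - base) for temp in daily_mean_temps_f)


def normalized_change(baseline_kwh, baseline_temps, current_kwh, current_temps):
    """Percent change in kWh per degree day from baseline to current period.

    Negative values mean the building used less energy per unit of weather
    load than it did in the baseline period.
    """
    base_dd = degree_days(baseline_temps)
    cur_dd = degree_days(current_temps)
    if base_dd == 0 or cur_dd == 0:
        raise ValueError("degree days must be non-zero to normalize")
    base_intensity = baseline_kwh / base_dd
    cur_intensity = current_kwh / cur_dd
    return 100.0 * (cur_intensity - base_intensity) / base_intensity


def rank_colleges(records):
    """Rank colleges by normalized change, biggest reduction first.

    `records` maps a college name to a tuple of
    (baseline_kwh, baseline_temps, current_kwh, current_temps).
    """
    changes = {name: normalized_change(*rec) for name, rec in records.items()}
    return sorted(changes.items(), key=lambda item: item[1])
```

Because all colleges share the same campus weather, the normalization matters mostly when comparing a period against a different period, which is why the sketch ranks by change from a baseline.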

Academic Task/Assignment Time Estimator
Nik Voge, McGraw Center

Time is in short supply for Princeton students. This makes scheduling and planning of academic tasks and activities such as completing p-sets, assigned reading, papers, and projects difficult. Because assignments can be quite challenging and time consuming and because they can vary considerably not only from course to course, but also from week to week, it is often difficult for students to accurately predict how much time tasks will require. At the same time, most students, with the encouragement of the university, are involved in extra-curricular, career preparation, and social activities, which results in a relatively small margin for error in planning and scheduling.

In many cases students do not budget adequate time to complete their academic work, leading to unmet grade (and learning) goals and feelings of dissatisfaction. Students often lack sufficient information to effectively plan and schedule their academic work and other aspects of their lives.

One recent innovation is Rice University's Course Workload Estimator. While the Course Workload Estimator has been a useful tool for instructors, it can be improved upon. It can be adapted to Princeton's distinctive academic environment, including its instructional materials and evaluation standards. Another improvement is continuously refining the algorithm by which the estimates are made by collecting input from students in specific courses on the amount of time various tasks demand. Additionally, the corpus of data collected can be analyzed to better understand the academic time demands across campus, an endeavor which has never been undertaken in any systematic manner to my knowledge.
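
The continuous refinement described above could start as simply as a running average per task type per course. The sketch below is a hypothetical illustration, not the algorithm used by Rice's estimator: the class name, the notion of a "unit" (a page, a problem), and the choice to treat the starting rate as one prior observation are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class TaskRate:
    """Running estimate of hours needed per unit of one kind of task."""

    hours_per_unit: float  # starting guess, e.g. hours per page of reading
    n_reports: int = 0     # number of student reports folded in so far

    def estimate(self, units: float) -> float:
        """Predicted hours for an assignment of the given size."""
        return self.hours_per_unit * units

    def update(self, units: float, reported_hours: float) -> None:
        """Fold in one student report using an incremental mean.

        The starting guess counts as one observation, so the first report
        moves the estimate halfway toward the reported rate.
        """
        if units <= 0:
            raise ValueError("units must be positive")
        observed = reported_hours / units
        self.n_reports += 1
        self.hours_per_unit += (observed - self.hours_per_unit) / (self.n_reports + 1)
```

For example, a course could start reading at 0.05 hours per page and drift toward what its students actually report; the stored reports would also form the corpus for the campus-wide analysis mentioned above.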

Presenting Cultural-Heritage Data Online
Cliff Wulfman, Library

A very large quantity of the cultural-heritage material that has been digitized is encoded and stored in XML: information about the objects (metadata); information about digital images of the objects (file types; file paths; technical info about the files).

There has been much buzz in the digital cultural-heritage community in recent years about the International Image Interoperability Framework (IIIF). IIIF is a set of specifications for APIs to web services, including an Image API, which deliver images (at various resolutions, orientations, etc.) and a Presentation API, which delivers a structured representation of complex image-based digital objects in JSON-LD (JSON for Linked Data).

Princeton's Digital Library includes a collection of Princeton-area newspapers, including the entire run of The Daily Princetonian from its founding in 1876. The digital representations of these newspapers are encoded in XML – a particular blend of XML schemas called METS/ALTO.

The project would be to create a IIIF-based viewer for The Daily Princetonian historical collection by implementing IIIF APIs:

Image API. Several IIIF Image Servers have already been implemented; you need only choose one and deploy it.
Presentation API. This is the heart of this project: develop a server that can deliver IIIF Manifests (JSON-LD structures) for digitized newspapers encoded in METS/ALTO. Create a back-end that translates METS/ALTO representations into IIIF Manifests. Data conversion is a huge part of real-world software engineering projects.
Search API. Optional; implement the IIIF Search API to support searching of IIIF-based assets.
A front-end web application. There are several open-source projects working on image viewers; JavaScript adepts could have a field day with integrating these viewers into responsive interfaces.
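
To give a feel for the Presentation API step, the sketch below reads page dimensions from an ALTO file and wraps each page in a canvas inside a manifest. It is modeled on the Presentation API 3.0 shape, which is an assumption since no API version is named here; the function names and example.org style identifiers are hypothetical, ALTO WIDTH and HEIGHT are assumed to be in pixels, and a real converter would also need the METS structure (issue dates, page order, article segmentation).

```python
import xml.etree.ElementTree as ET

PRESENTATION_CONTEXT = "http://iiif.io/api/presentation/3/context.json"


def page_size_from_alto(alto_xml):
    """Return (width, height) of the first ALTO Page, ignoring namespaces."""
    root = ET.fromstring(alto_xml)
    for element in root.iter():
        if element.tag.rsplit("}", 1)[-1] == "Page":
            # Assumption: the ALTO measurement unit is pixels.
            return int(float(element.get("WIDTH"))), int(float(element.get("HEIGHT")))
    raise ValueError("no Page element found")


def build_canvas(manifest_id, index, alto_xml, image_service):
    """Build one canvas whose painting annotation points at a page image."""
    width, height = page_size_from_alto(alto_xml)
    canvas_id = f"{manifest_id}/canvas/{index}"
    return {
        "id": canvas_id,
        "type": "Canvas",
        "width": width,
        "height": height,
        "items": [{
            "id": f"{canvas_id}/page",
            "type": "AnnotationPage",
            "items": [{
                "id": f"{canvas_id}/annotation",
                "type": "Annotation",
                "motivation": "painting",
                "body": {
                    # Image API request for the whole image at maximum size.
                    "id": f"{image_service}/full/max/0/default.jpg",
                    "type": "Image",
                    "format": "image/jpeg",
                },
                "target": canvas_id,
            }],
        }],
    }


def build_manifest(manifest_id, label, pages):
    """pages: list of (alto_xml, image_service_url) pairs in reading order."""
    return {
        "@context": PRESENTATION_CONTEXT,
        "id": manifest_id,
        "type": "Manifest",
        "label": {"en": [label]},
        "items": [
            build_canvas(manifest_id, i, alto, service)
            for i, (alto, service) in enumerate(pages, start=1)
        ],
    }
```

Serving the result of build_manifest as JSON from a small web server would be enough for existing IIIF viewers to open an issue, which makes it a reasonable first milestone.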

Data collection and presentation for student outcomes
Jed Marsh, Vice Provost for Institutional Research

There is an increasing interest in student outcomes after the initial placement -- say 10 years post degree. Currently, these data are harvested from a hodge-podge of sources, including scraping sites like LinkedIn. There's a fair amount of staff time spent across campus googling former students, both graduate and undergraduate. We need tools that:
(1) improve data collection from the web. Could there be an API from LinkedIn or job search sites? Could one develop an app to systematically search for and harvest CVs and resumes posted by Princeton alumni?
(2) categorize unstructured employment data (job code, employer, etc.) into standardized occupation (SOC) and industry (NAICS) codes.
(3) store these data in a common repository that could be available for student outcome studies.
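
As a first cut at step (2), free-text job titles could be matched against keyword lists for each SOC major group. The sketch below is deliberately naive: the four major-group codes are real SOC codes, but the keyword lists and the function name are made up for illustration, and a serious version would need the full SOC title index and probably a trained classifier.

```python
import re

# Keyword lists are illustrative assumptions, not part of the SOC standard.
SOC_MAJOR_GROUP_KEYWORDS = {
    "15-0000": {"software", "developer", "programmer", "statistician"},  # computer and mathematical
    "23-0000": {"lawyer", "attorney", "paralegal", "judge"},             # legal
    "25-0000": {"teacher", "professor", "lecturer", "librarian"},        # education and library
    "29-0000": {"physician", "surgeon", "nurse", "pharmacist"},          # healthcare practitioners
}


def classify_title(title, table=SOC_MAJOR_GROUP_KEYWORDS):
    """Return the SOC major-group code with the most keyword hits, or None.

    Ties are broken in favor of whichever group appears first in the table.
    """
    tokens = set(re.findall(r"[a-z]+", title.lower()))
    best_code, best_hits = None, 0
    for code, keywords in table.items():
        hits = len(tokens & keywords)
        if hits > best_hits:
            best_code, best_hits = code, hits
    return best_code
```

Titles that return None could be routed to a person for manual coding, which would also produce labeled examples for improving the keyword table over time.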

Themed historical tours of campus
Abby Klionsky '14, Office of the Executive Vice President

As a breakout group of the Campus Iconography Committee, the Princeton History Working Group is building a series of themed historical tours of Princeton's campus that will highlight lesser-known histories of the university. These will take shape in the form of a mobile app, which will use wayfinding technology to guide users to sites across campus and showcase associated photos, audio, and video to tell these stories. For some of these sites, we'd like to incorporate augmented reality features -- particularly in places where there may no longer be a physical marker or building still standing. The augmented reality component we're envisioning would likely be a statue for "placement" in one of the statue-hold pedestals in East Pyne courtyard or the front of Frist, a moving image to launch over a picture frame or screen that does exist in reality, or overlaying an old image of a campus map/building over what exists today.
