COS 598d, Spring 2000: What's New

Princeton University
Computer Science Dept.

Computer Science 598d
Advanced Topics in CS: Data Visualization

Projects

Spring 2000


About Projects in This Course

Projects can be done individually or in groups. Given the variety of interesting topics, we would
like people to spread out across the available topics rather than all concentrating on one or two.

Data Repository

To explore various ways to visualize data, we are building a data repository consisting of
several kinds of data. We will have at least five kinds of data sets in the collection:

Volumetric data

Visible Woman and Visible Man data
Two plasma physics simulation data sets
An astrophysics simulation data set

Stock market data (currently August, September, and October 1999)

NYSE Trade and Quote (TAQ) Database (user's guide)

Census Data

1990 Population Data (we have the NY and NJ CDROM Sets)
1997 Economic Census (CDROM CD-EC97-1B)

Web access logs (in ASCII)

www.cs.princeton.edu
www.princeton.edu (slightly modified to preserve privacy)

DNA genome data sets

If you have ideas about what data to put in the repository, please let us know.

Suggested Projects

We are actively soliciting additions to the following list. What do you think would make a good project?
Send your ideas to li@cs.princeton.edu.

Parallel VTK for the display wall. The simplest approach would be to run a copy of VTK on each display PC and have a master that interacts with the user and broadcasts new camera information to every copy. As with the parallel walk-through program, it should be possible to have this open-source tool running on the display wall within days. After the initial version is running, there will be interesting performance issues, such as how to deal with huge datasets.
Parallel OpenDX for the display wall. OpenDX is also open source, so we can take the same approach as with parallel VTK. Eventually, we will figure out which one is better for our purposes.
Visualize the stock market data. What kinds of questions would you like to ask about the stock market? What algorithms and implementations would you need to answer these questions and provide insight to people who would like to understand the new economy that no one seems able to explain?
Visualize the population data. Think of ways to visualize and analyze the population data for various purposes. For example, if you ran a startup company making new-generation wireless communication gear, in which regions would you offer services first?
Visualize the web server log data. Think of system design questions you would like to ask about the web server such as access patterns and see how to visualize the results. For example, can you determine which web pages are important from the web data and its server log?
Visualize the genome datasets. This probably requires some understanding of molecular biology, but it would be cool to see whether we can develop interesting algorithms and visual metaphors to learn something non-trivial.
Visualize a file server. How are disks allocated and used? Who is using the resources? What types of data are being stored, and what are the access patterns? Choose an existing disk system, like the one in the lab, and characterize it using visualization.
Collaborative, remote visualization. Some data is generated remotely. For example, volumetric data is typically generated on a high-performance parallel machine in a supercomputing center that doesn't do data visualization. Some big machine probably keeps the real-time stock market data somewhere. It may be impractical to move the data over, visualize it and make decisions. To do data visualization remotely, what are the tradeoffs and what software mechanisms do we need?
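
As a starting point for the parallel VTK and OpenDX projects above, the master/broadcast idea could be sketched as follows. This is only an illustration under stated assumptions, not a prescribed design: the CameraState fields, the JSON wire format, the UDP broadcast transport, and the port number are all hypothetical choices made for this example. Each display PC would decode the message and apply it to its own renderer's camera.

```python
import json
import socket
from dataclasses import asdict, dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class CameraState:
    """Hypothetical camera message shared by the master and display PCs."""
    position: Vec3
    focal_point: Vec3
    view_up: Vec3
    view_angle: float  # vertical field of view, in degrees


def encode_camera(cam: CameraState) -> bytes:
    """Serialize the camera state as UTF-8 JSON (an assumed wire format)."""
    return json.dumps(asdict(cam)).encode("utf-8")


def decode_camera(payload: bytes) -> CameraState:
    """Rebuild the camera state on a display PC from a received message."""
    fields = json.loads(payload.decode("utf-8"))
    return CameraState(
        position=tuple(fields["position"]),
        focal_point=tuple(fields["focal_point"]),
        view_up=tuple(fields["view_up"]),
        view_angle=float(fields["view_angle"]),
    )


def broadcast_camera(cam: CameraState, port: int = 5980,
                     address: str = "255.255.255.255") -> None:
    """Master side: send one camera update to every listener on the subnet.

    The port and broadcast address are placeholders for this sketch.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(encode_camera(cam), (address, port))
```

UDP broadcast keeps the master simple, but updates can be lost or arrive out of order, so a real version would probably add a sequence number or use a reliable transport. Each display PC would also need its own view offset so that the tiles line up into one image on the wall.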

Participants

Stefanos Damianakis   snd@cs
Adam Finkelstein      af@cs
Tom Funkhouser        funk@cs
Anoop Gupta           anoopg@princeton.edu
Scott Klasky          sklasky@pppl.gov
Kai Li                li@cs
Zhiyan Liu            zhiyan@cs
Robert Osada          rosada@cs
S. Jordan Parker      sjparker@princeton.edu
Lena Petrovic         lenap@cs
Yilei Shao            yshao@cs
Ben Shedd             benshedd@cs
Mona Singh            mona@cs
S. Morrow Petigrew    petigrew@princeton.edu
Grant Wallace         gwallace@cs
Curtis Wright         cawright@princeton.edu
