Enabling Large-Scale Data Intensive Computations

Date and Time
Friday, October 22, 2010 - 1:30pm to 3:00pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Host
Edward Felten
This talk describes a set of distributed services developed at Microsoft Research Silicon Valley to enable efficient parallel programming on very large datasets. Parallel programs arise naturally in scientific, data mining, and business applications. Central to our philosophy is the notion that parallel programs do not have to be difficult to write and that the same program must run seamlessly on a laptop, a desktop, a small cluster, or a large data center without the author having to worry about the details of parallelization, synchronization, or fault tolerance. We have built several services (Dryad, DryadLINQ, TidyFS, and Nectar) that embody this belief. Our goal is to enable users, particularly scientists of all disciplines, to treat a computer cluster as a forensic, diagnostic, or analytic tool. The talk will describe the details of our infrastructure and the characteristics of some of the applications that have been run on it.

Chandu Thekkath is a researcher at Microsoft Research Silicon Valley. He received his Ph.D. in Computer Science from the University of Washington in 1994. Since then, except for a sabbatical year at Stanford in 2000, he has been in industrial research labs at DEC, Compaq, and Microsoft. He is a fellow of the ACM.
