This talk describes a set of distributed services developed at
Microsoft Research Silicon Valley to enable efficient parallel
programming on very large datasets. Parallel programs arise naturally
within scientific, data mining, and business applications. Central to
our philosophy is the notion that parallel programs do not have to be
difficult to write and that the same program must seamlessly run on a
laptop, desktop, a small cluster, or on a large data center without the author having to worry about the details of parallelization, synchronization, or fault-tolerance. We have built several services (Dryad, DryadLINQ, TidyFS, and Nectar) that embody this belief. Our goal is to enable users, particularly scientists of all disciplines, to treat a computer cluster as a forensic, diagnostic, or analytic tool. The talk will describe the details of our infrastructure and the characteristics of some of the applications that have been run on it.
Chandu Thekkath is a researcher at Microsoft Research Silicon Valley. He received his Ph.D. in Computer Science from the University of Washington in 1994. Since then, except for a sabbatical year at Stanford in 2000, he has been in industrial research labs at DEC, Compaq, and Microsoft. He is a fellow of the ACM.