CS Department Colloquium Series
Our accelerating computational demand and the rise of multicore hardware have made parallel programs, especially shared-memory multithreaded programs, increasingly pervasive and critical. Yet, these programs remain extremely difﬁcult to write, test, analyze, debug, and verify. Conventional wisdom has attributed these difﬁculties to nondeterminism (i.e., repeated executions of the same program on the same input may show different behaviors), and researchers have recently dedicated much effort to bringing determinism into multithreading. In this talk, I argue that determinism is not as useful as commonly perceived: it is neither sufﬁcient nor necessary for reliability. We present our view on why multithreaded programs are difﬁcult to get right, describe a promising approach we call stable multithreading to dramatically improve reliability, and summarize our last four years’ research on building and applying stable multithreading systems.
Junfeng Yang's research (www.cs.columbia.edu/~junfeng) centers on making reliable and secure systems. He earned his PhD at Stanford, where he created eXplode, a general, lightweight system for effectively finding storage system errors. This work has led to an OSDI '04 best paper, numerous bug fixes to real systems such as the Linux kernel, and a featured article in Linux Weekly news. He worked at Microsoft Research, Silicon Valley from 2007-2008, extending eXplode to check production distributed systems. MoDist, the resultant system, is being transferred to Microsoft product groups. He's now co-directing the Software Systems Lab (ssl.cs.columbia.edu) at Columbia University, where his recent work on making reliable parallel programs---the Tern/Peregrine/Parrot stable multithreading systems---was featured in CACM, ACM Tech News, The Register, and many other sites. He won Sloan and AFOSR YIP both in 2012; and NSF CAREER in 2011.