Colloquium

The Boosting Approach to Machine Learning

Date and Time
Wednesday, February 13, 2002 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Robert Schapire, from AT&T Labs
Host
Bernard Chazelle
Boosting is a general method for producing a very accurate classification rule by combining rough and moderately inaccurate "rules of thumb". While rooted in a theoretical framework of machine learning, boosting has been found to perform quite well empirically. In this talk, I will introduce the boosting algorithm AdaBoost, and explain the underlying theory of boosting, including our explanation of why boosting often does not suffer from overfitting. I will also describe some recent applications and extensions of boosting.
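
As a concrete illustration of the algorithm the talk introduces, here is a minimal Python sketch of AdaBoost over decision stumps (a simplification for illustration, not Schapire's code; X is assumed to be a NumPy feature matrix and y a vector of +1/-1 labels):

```python
import numpy as np

def stump_predict(stump, X):
    """Predict +1/-1 with a single-feature threshold rule."""
    feature, threshold, sign = stump
    return np.where(X[:, feature] > threshold, sign, -sign)

def best_stump(X, y, w):
    """Weak learner: exhaustively pick the stump with lowest weighted error."""
    best, best_err = None, np.inf
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f]):
            for sign in (1, -1):
                pred = np.where(X[:, f] > t, sign, -sign)
                err = np.sum(w[pred != y])
                if err < best_err:
                    best, best_err = (f, t, sign), err
    return best

def adaboost(X, y, rounds=20):
    n = len(y)
    w = np.full(n, 1.0 / n)             # uniform example weights to start
    ensemble = []
    for _ in range(rounds):
        stump = best_stump(X, y, w)
        pred = stump_predict(stump, X)
        eps = np.sum(w[pred != y])      # weighted error of this round's rule
        if eps >= 0.5:                  # no better than random guessing: stop
            break
        alpha = 0.5 * np.log((1 - eps) / max(eps, 1e-12))
        w *= np.exp(-alpha * y * pred)  # up-weight the examples it got wrong
        w /= w.sum()
        ensemble.append((alpha, stump))
    return ensemble

def predict(ensemble, X):
    """Weighted majority vote of all the weak rules of thumb."""
    return np.sign(sum(a * stump_predict(s, X) for a, s in ensemble))
```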

The Role of Simplicity in Learning Theory

Date and Time
Wednesday, February 6, 2002 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
David McAllester, from AT&T Research
Host
Andrew Appel
Science values simplicity. All other things being equal, a simple explanation is preferable to a complex one. Bayesians assign higher prior probability to simple theories. But is it really true that a simple theory is a priori more likely than a complex one? It turns out that one can justify a preference for simplicity independently of Bayesian assumptions. The justification involves only the law of large numbers and the observation that the number of simple theories is limited. This talk will present this justification and go on to describe more general "laws of large numbers" that justify more sophisticated methods of evaluating the accuracy of predictive rules.
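
The flavor of that argument can be made concrete with the standard Occam-style bound, obtained by combining the Hoeffding form of the law of large numbers with a union bound over a finite set of theories. The Python sketch below illustrates the generic bound, not necessarily the speaker's exact statement:

```python
import math

def occam_bound(num_theories, num_samples, delta=0.05):
    """Hoeffding's inequality plus a union bound over a finite class.
    With probability >= 1 - delta, every theory's true error is within
    this margin of its observed error rate on the sample."""
    return math.sqrt((math.log(num_theories) + math.log(1 / delta))
                     / (2 * num_samples))

# Fewer (simpler) theories -> a tighter guarantee from the same data.
print(occam_bound(num_theories=2**10, num_samples=1000))   # ~0.07
print(occam_bound(num_theories=2**100, num_samples=1000))  # ~0.19
```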

The Effects of Wide-Area Conditions on WWW Server Performance

Date and Time
Thursday, December 6, 2001 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Erich Nahum, from IBM T. J. Watson Research Center
Host
Vivek Pai
WWW workload generators are used to evaluate web server performance, and thus have a large impact on which performance optimizations are applied to servers. However, current benchmarks ignore a crucial component: how these servers perform in the wide-area Internet.

This work shows how WAN conditions can affect WWW server performance. We examine these effects using an experimental testbed which emulates WAN characteristics in a live setting, by introducing factors such as delay and packet loss in a controlled and reproducible fashion. We demonstrate that when more realistic wide-area conditions are introduced, servers exhibit performance properties and scaling behaviors which are not exposed by existing benchmarks running on LANs. We find that packet losses can reduce server throughput by as much as 50 percent and increase response time as seen by the client. We show that using TCP SACK does not lower server throughput, and can reduce client response time.
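
For a rough sense of why loss matters so much, the well-known Mathis et al. model puts steady-state TCP throughput at about (MSS/RTT) * (C/sqrt(p)) for loss rate p, with C around 1.22. The Python sketch below is a back-of-the-envelope illustration of the effect the talk measures, not the authors' testbed:

```python
import math

def tcp_throughput_bps(mss_bytes, rtt_s, loss_rate, c=1.22):
    """Approximate steady-state TCP throughput under random packet loss
    (Mathis et al. model)."""
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))

# WAN-like conditions: 1460-byte segments, 100 ms round-trip time.
for p in (0.001, 0.01, 0.05):
    mbps = tcp_throughput_bps(1460, 0.100, p) / 1e6
    print(f"loss={p:.3f}: {mbps:.2f} Mbit/s")   # 4.51, 1.42, 0.64
```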

Scalable peer-to-peer substrates: A new foundation for distributed applications

Date and Time
Friday, November 30, 2001 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Peter Druschel, from Rice University
Host
Larry Peterson
Peer-to-peer (p2p), initially conceived for the purpose of sharing music on the Internet, promises to be a more general paradigm for organizing large-scale distributed applications. We define p2p systems broadly as self-organizing, decentralized, distributed systems where most or all communication is symmetric. The self-organization, decentralization, diversity and redundancy of resources inherent in the approach lend themselves to a large domain of applications beyond file sharing, anonymity and anti-censorship. At the same time, the decentralization and diversity also pose difficult problems, particularly in resource management and security.

Recent work on p2p overlay networks like CAN, Chord, Pastry and Tapestry has made significant strides towards providing a general substrate that simplifies the construction of a wide range of p2p applications. These overlay networks effectively shield applications from the complexities of organizing and maintaining a secure overlay network, tolerating node failures, and distributing and locating resources.

In this talk, I'll present an overview of Pastry, a p2p overlay network that provides self-organization, fault tolerance, and efficient resource location and distribution, and incorporates interesting heuristics that exploit proximity in the underlying Internet. I will also sketch two applications built upon Pastry to date: PAST, an archival, cooperative file storage and distribution facility, and SCRIBE, a highly scalable event notification system. I'll conclude with an outlook on key research problems and future directions.
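
To give a feel for how such overlays locate resources, here is a much-simplified Python sketch of Pastry-style prefix routing (illustrative only; real Pastry uses 128-bit IDs, per-row routing tables, a leaf set, and proximity-aware neighbor selection):

```python
import hashlib

ID_LEN = 8  # 32-bit hex IDs for the demo; Pastry itself uses 128-bit IDs

def node_id(name):
    return hashlib.sha1(name.encode()).hexdigest()[:ID_LEN]

def prefix_len(a, b):
    """Length of the shared hex-digit prefix of two IDs."""
    n = 0
    while n < ID_LEN and a[n] == b[n]:
        n += 1
    return n

def route(key, start, nodes):
    """Greedy prefix routing: each hop moves to a node sharing a strictly
    longer ID prefix with the key, giving O(log N) hops; a final
    'leaf set' step delivers to the numerically closest node."""
    path, cur = [start], start
    while True:
        better = [n for n in nodes
                  if prefix_len(key, n) > prefix_len(key, cur)]
        if not better:
            break
        cur = max(better, key=lambda n: prefix_len(key, n))
        path.append(cur)
    closest = min(nodes, key=lambda n: abs(int(n, 16) - int(key, 16)))
    if closest != cur:
        path.append(closest)
    return path

nodes = [node_id(f"node{i}") for i in range(100)]
key = node_id("colloquium-video.mpg")
print(" -> ".join(route(key, nodes[0], nodes)))
```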

Finding Needles in a 10 TB Haystack, 140M Times/Day

Date and Time
Thursday, November 15, 2001 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Rob Shillner, from Google, Inc.
Host
David Dobkin
Search is one of the most ubiquitous and important applications used on the Internet, but it is also one of the hardest applications to do well. Google is a search engine company that began as a research project at Stanford University, and has evolved into the world's largest and most trafficked search engine in just under three years. Three main characteristics have driven this growth: search quality, index size, and speed. Addressing these issues has required tackling problems in a range of computer science disciplines, including algorithm and data structure design, networking, operating systems, distributed and fault-tolerant computing, information retrieval, and user interface design. In this talk, I'll focus on Google's unique hardware platform of 10,000 commodity PCs running Linux, and some of the challenges and benefits presented by this platform. I'll also describe some of the interesting problems that arise in crawling and indexing more than a billion web pages, and performing 140 million queries per day on this index. Finally, I'll describe some of the challenges facing search engines in the future.
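
As a toy illustration of the data-structure side of the problem, the Python sketch below shows a document-partitioned inverted index with query fan-out and merge (a simplification of the general design, not Google's implementation):

```python
from collections import defaultdict

class Shard:
    """One 'machine': an inverted index over a slice of the documents."""
    def __init__(self):
        self.index = defaultdict(set)          # term -> set of doc ids

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.index[term].add(doc_id)

    def search(self, term):
        return self.index.get(term.lower(), set())

def build_shards(docs, num_shards=4):
    shards = [Shard() for _ in range(num_shards)]
    for doc_id, text in docs.items():
        shards[hash(doc_id) % num_shards].add(doc_id, text)  # partition by doc
    return shards

def search(shards, term):
    hits = set()
    for shard in shards:                       # fan out to every shard...
        hits |= shard.search(term)             # ...then merge the results
    return hits

docs = {1: "boosting machine learning", 2: "peer to peer overlay",
        3: "machine translation", 4: "web search engines"}
shards = build_shards(docs)
print(search(shards, "machine"))               # {1, 3}
```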

Some Perspectives on Computational Complexity

Date and Time
Wednesday, October 3, 2001 - 3:30pm to 5:00pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Andy Yao, from Princeton University
Host
David Dobkin
In past decades, the theory of computational complexity has flourished in terms of both the revelation of its internal structures and the unfolding of its numerous applications. In this talk we discuss several persistent and interwoven themes underlying many of these accomplishments. Chief among them are the interplay between communication and computation, the power of problem reduction, and the increasingly prominent role played by classical mathematics. We will also speculate on a few promising directions for future development of computational complexity.

A Market Simulation with Aglets

Date and Time
Tuesday, November 17, 1998 - 4:00pm to 5:30pm
Location
Friend Center 013
Type
Colloquium
Speaker
Hideyuki Mizuta, from IBM Tokyo Research Laboratory
Host
Kenneth Steiglitz
The behavior of real markets is quite complicated, and analyzing it is very difficult. Computer simulations of the market provide us with powerful tools for understanding the essential elements behind market phenomena such as bubbles. Recently, Steiglitz and Shapiro [1] constructed a market model that includes production and consumption and verified the occurrence of bubbles and negative bubbles.

Now we have implemented this simulation using Aglets [TM] technology [2]. Aglets are Java [TM] objects that can move from one host on the network to another and can communicate with other aglets by message passing. Using aglets, we can construct a market dynamically: agents that participate in the market can be added or deleted at any time, and the market status can be watched by sending various kinds of observer aglets.

In this talk, I will present a demonstration of the simulation using aglets and discuss its results.
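
For a flavor of the market mechanics involved, here is a toy Python sketch of repeated call-auction rounds among noisy traders (a simplification for illustration; the talk's simulation implements the richer Steiglitz-Shapiro model [1], with each agent running as a mobile Java aglet communicating by message passing):

```python
import random

class Trader:
    """Zero-intelligence trader with a noisy valuation near the last price."""
    def __init__(self, rng):
        self.rng = rng

    def quote(self, last_price):
        side = self.rng.choice(("bid", "ask"))
        price = last_price * self.rng.uniform(0.9, 1.1)
        return side, price

def clear(quotes, last_price):
    """One call-auction round: trade at the midpoint of best bid and ask."""
    bids = [p for s, p in quotes if s == "bid"]
    asks = [p for s, p in quotes if s == "ask"]
    if bids and asks and max(bids) >= min(asks):
        return (max(bids) + min(asks)) / 2
    return last_price                    # no crossing orders: price unchanged

rng = random.Random(0)
traders = [Trader(rng) for _ in range(50)]
price = 100.0
for t in range(20):
    price = clear([tr.quote(price) for tr in traders], price)
    print(f"round {t:2d}: price {price:6.2f}")
```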

References

[1] K. Steiglitz and D. Shapiro, "Simulating the Madness of Crowds: Price Bubbles in an Auction-Mediated Robot Market," Computational Economics, vol. 12, pp. 35-59, 1998. (http://www.cs.princeton.edu/~ken/madness.ps)

[2] Aglets Software Development Kit, http://www.trl.ibm.co.jp/aglets/
