Digital Voices

Date and Time
Wednesday, March 6, 2002 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Cristina Lopes, from Xerox PARC
Host
Perry Cook
Aerial acoustic communications are remarkably successful in biological systems, most notably in humans, but they have been ignored or dismissed in computer networks. There are good reasons for this: the data rates are relatively low when compared to other media and the sounds tend to be annoying. But as computers get more ubiquitous, and as more and more devices support a channel for voice or music, sound becomes a cheap, secure and intuitive alternative for transferring small amounts of information among devices that happen to be near each other.

As with any other modem, the design of these "air modems" must account for factors such as the data rate, the error probability, and the computational overhead at the receiver. On top of that, these modems must also account for aesthetic and psychoacoustic factors. I will show how to vary certain parameters in standard modulation techniques such as ASK, FSK and Spread-Spectrum to obtain communication systems in which the messages are musical and other familiar sounds, rather than modem sounds. Some applications of this technology will be discussed. I will also present the basis of a framework for studying low bit rate communication systems, including air modems, bird songs, and human speech.
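
To make the idea concrete, here is a minimal sketch (not the speaker's system; the note choices, symbol rate, and function names are purely illustrative) of an FSK-style scheme in which each two-bit symbol is carried by one of four musical notes, so the transmitted message sounds melodic rather than like a modem:

```python
# Toy FSK sketch: map each 2-bit symbol to a musical note, synthesize the
# tones, and demodulate by picking the note with the most energy per slot.
import numpy as np

SAMPLE_RATE = 44100          # samples per second
SYMBOL_DURATION = 0.12       # seconds per symbol (sets the data rate)
# Illustrative symbol-to-frequency map: C5, D5, E5, G5 (Hz).
NOTE_FREQS = {0b00: 523.25, 0b01: 587.33, 0b10: 659.25, 0b11: 783.99}

def modulate(data: bytes) -> np.ndarray:
    """Turn a byte string into a waveform of pure tones, one note per 2 bits."""
    n = int(SAMPLE_RATE * SYMBOL_DURATION)
    t = np.arange(n) / SAMPLE_RATE
    chunks = []
    for byte in data:
        for shift in (6, 4, 2, 0):                  # big-endian 2-bit symbols
            symbol = (byte >> shift) & 0b11
            tone = np.sin(2 * np.pi * NOTE_FREQS[symbol] * t)
            tone *= np.hanning(n)                   # soften attacks and decays
            chunks.append(tone)
    return np.concatenate(chunks)

def demodulate(signal: np.ndarray) -> bytes:
    """Recover bytes by correlating each symbol slot against the four notes."""
    n = int(SAMPLE_RATE * SYMBOL_DURATION)
    t = np.arange(n) / SAMPLE_RATE
    refs = {sym: np.exp(2j * np.pi * f * t) for sym, f in NOTE_FREQS.items()}
    symbols = []
    for i in range(0, len(signal) - n + 1, n):
        slot = signal[i:i + n]
        symbols.append(max(refs, key=lambda s: abs(np.dot(slot, refs[s]))))
    out = bytearray()
    for i in range(0, len(symbols) - 3, 4):
        out.append((symbols[i] << 6) | (symbols[i + 1] << 4) |
                   (symbols[i + 2] << 2) | symbols[i + 3])
    return bytes(out)

if __name__ == "__main__":
    assert demodulate(modulate(b"hi")) == b"hi"
```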

Sparse Sampling Algorithms for Probabilistic Artificial Intelligence

Date and Time
Wednesday, February 27, 2002 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Michael Kearns, from University of Pennsylvania
Host
Robert Tarjan
A powerful theme from computational learning theory is uniform convergence: the fact that a large or infinite class C of random variables can be simultaneously estimated from a sample whose size is just logarithmic in the size of C, or dependent only on combinatorial parameters such as its VC dimension. The application of this tool in learning theory has primarily been to supervised learning (classification and regression), and has largely been descriptive (for example, to model the generalization ability of existing learning algorithms), rather than prescriptive.
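
For a finite class, the statement above can be made precise with Hoeffding's inequality plus a union bound (a generic textbook bound, not one specific to this talk): for m independent samples of [0,1]-valued random variables,

```latex
\Pr\!\left[\exists X \in C:\ \Big|\tfrac{1}{m}\sum_{i=1}^{m} X_i - \mathbb{E}[X]\Big| > \epsilon\right]
  \;\le\; 2\,|C|\,e^{-2m\epsilon^{2}},
\qquad\text{so}\qquad
m \;=\; O\!\left(\frac{\ln|C| + \ln(1/\delta)}{\epsilon^{2}}\right)
```

samples suffice to estimate every member of C to accuracy epsilon with probability at least 1 - delta.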

Over the past several years, we have been systematically exploring the application of this fundamental notion to a broader set of natural problems in probabilistic artificial intelligence, and have found a number of striking examples where it leads to novel algorithms and analyses for core problems. In particular, by applying uniform convergence arguments to carefully structured but very sparse samples in probabilistic environments, we obtain:

* A new algorithm providing a near-optimal solution to the exploration-exploitation trade-off in Markov decision processes, a basic problem in reinforcement learning;

* A sparse sampling algorithm for planning in Markov decision processes whose running time has no dependence on the number of states;

* Powerful new algorithms for probabilistic inference in Bayesian networks;

* Algorithms for computing approximate Nash equilibria in large-population games.

These are only a few of the many algorithmic applications of learning theory methods in modern AI. This talk will present a self-contained survey of these developments.
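
As one concrete illustration of the second bullet above, here is a minimal sparse-sampling planner whose work depends only on the sampled lookahead tree, never on the size of the state space. The generative-model interface simulate(state, action) -> (next_state, reward) and the depth/width parameters are illustrative assumptions, not the talk's exact algorithm:

```python
# Minimal sparse-sampling planner sketch: estimate Q-values by recursively
# sampling a few successor states from a generative model, so the cost is
# roughly (|A| * width)^depth regardless of how many states the MDP has.
import random

def q_estimate(simulate, actions, state, depth, width, gamma):
    """Estimate Q(state, a) for every action by recursive sparse sampling."""
    if depth == 0:
        return {a: 0.0 for a in actions}
    q = {}
    for a in actions:
        total = 0.0
        for _ in range(width):                      # sample `width` successors
            next_state, reward = simulate(state, a)
            v_next = max(q_estimate(simulate, actions, next_state,
                                    depth - 1, width, gamma).values())
            total += reward + gamma * v_next
        q[a] = total / width
    return q

def choose_action(simulate, actions, state, depth=3, width=4, gamma=0.95):
    q = q_estimate(simulate, actions, state, depth, width, gamma)
    return max(q, key=q.get)

# Purely illustrative generative model: a noisy two-state chain.
def toy_simulate(state, action):
    next_state = (state + action) % 2
    return next_state, random.gauss(1.0 if next_state == 1 else 0.0, 0.1)

if __name__ == "__main__":
    print(choose_action(toy_simulate, actions=(0, 1), state=0))  # expect 1
```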

The Boosting Approach to Machine Learning

Date and Time
Wednesday, February 13, 2002 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Robert Schapire, from AT&T Labs
Host
Bernard Chazelle
Boosting is a general method for producing a very accurate classification rule by combining rough and moderately inaccurate "rules of thumb". While rooted in a theoretical framework of machine learning, boosting has been found to perform quite well empirically. In this talk, I will introduce the boosting algorithm AdaBoost, and explain the underlying theory of boosting, including our explanation of why boosting often does not suffer from overfitting. I will also describe some recent applications and extensions of boosting.
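
To make the mechanics concrete, here is a minimal sketch of AdaBoost over one-feature threshold rules ("decision stumps"); it illustrates the standard algorithm and is not code from the talk:

```python
# AdaBoost sketch: maintain a distribution over examples, repeatedly fit the
# stump with lowest weighted error, and up-weight the examples it gets wrong.
import numpy as np

def best_stump(X, y, w):
    """Return (error, feature, threshold, sign) minimizing weighted error."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] <= thr, 1, -1)
                err = np.sum(w[pred != y])
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def adaboost(X, y, rounds=20):
    w = np.full(len(y), 1.0 / len(y))        # distribution D_t over examples
    ensemble = []
    for _ in range(rounds):
        err, j, thr, sign = best_stump(X, y, w)
        err = max(err, 1e-12)                # guard against a perfect stump
        alpha = 0.5 * np.log((1.0 - err) / err)
        pred = sign * np.where(X[:, j] <= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)       # mistakes get heavier weights
        w /= w.sum()
        ensemble.append((alpha, j, thr, sign))
    return ensemble

def predict(ensemble, X):
    """Weighted-majority vote of the stumps."""
    score = sum(alpha * sign * np.where(X[:, j] <= thr, 1, -1)
                for alpha, j, thr, sign in ensemble)
    return np.sign(score)

if __name__ == "__main__":
    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    y = np.array([-1, -1, 1, 1])
    print(predict(adaboost(X, y, rounds=5), X))   # expect [-1. -1.  1.  1.]
```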

The Role of Simplicity in Learning Theory

Date and Time
Wednesday, February 6, 2002 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
David McAllester, from AT&T Research
Host
Andrew Appel
Science values simplicity. All other things being equal, a simple explanation is preferable to a complex one. Bayesians assign higher prior probability to simple theories. But is it really true that a simple theory is a priori more likely than a complex one? It turns out that one can justify a preference for simplicity independent of Bayesian assumptions. The justification involves only the law of large numbers and the observation that the number of simple theories is limited. This talk will present this justification and go on to describe more general "laws of large numbers" that justify more sophisticated methods of evaluating the accuracy of predictive rules.
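
A standard way to make this justification concrete (stated generically here, not necessarily in the speaker's exact form) is the finite-class Occam bound: applying the law of large numbers (Hoeffding's inequality) to each theory and a union bound over the limited set H of simple theories gives, with probability at least 1 - delta over m samples, simultaneously for every theory h in H,

```latex
\operatorname{err}(h) \;\le\; \widehat{\operatorname{err}}(h)
  \;+\; \sqrt{\frac{\ln|H| + \ln(1/\delta)}{2m}}
```

so a simple theory with small empirical error also has small true error, with no prior over theories required.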

The Effects of Wide-Area Conditions on WWW Server Performance

Date and Time
Thursday, December 6, 2001 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Erich Nahum, from IBM T. J. Watson Research Center
Host
Vivek Pai
WWW workload generators are used to evaluate web server performance, and thus have a large impact on what performance optimizations are applied to servers. However, current benchmarks ignore a crucial component: how these servers perform in the wide-area Internet.

This work shows how WAN conditions can affect WWW server performance. We examine these effects using an experimental testbed which emulates WAN characteristics in a live setting, by introducing factors such as delay and packet loss in a controlled and reproducible fashion. We demonstrate that when more realistic wide-area conditions are introduced, servers exhibit performance properties and scaling behaviors which are not exposed by existing benchmarks running on LANs. We find that packet losses can reduce server throughput by as much as 50 percent and increase response time as seen by the client. We show that using TCP SACK does not lower server throughput, and can reduce client response time.
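
The experimental testbed described above is the authors' own. As a rough, hypothetical analogue of introducing delay and packet loss in a controlled and reproducible fashion, one could use Linux netem; the sketch below is an assumption about tooling, not the setup used in this work:

```python
# Hypothetical sketch: attach a netem qdisc that adds fixed delay and random
# packet loss to an interface (requires root), then remove it afterwards.
import subprocess

def emulate_wan(dev="eth0", delay_ms=100, loss_pct=1.0):
    """Delay every outgoing packet and drop a percentage of them."""
    subprocess.run(
        ["tc", "qdisc", "replace", "dev", dev, "root", "netem",
         "delay", f"{delay_ms}ms", "loss", f"{loss_pct}%"],
        check=True,
    )

def clear_emulation(dev="eth0"):
    """Restore normal (LAN-like) behaviour on the interface."""
    subprocess.run(["tc", "qdisc", "del", "dev", dev, "root"], check=True)

if __name__ == "__main__":
    emulate_wan(delay_ms=100, loss_pct=3.0)   # e.g. 100 ms delay, 3% loss
```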

Scalable peer-to-peer substrates: A new foundation for distributed applications

Date and Time
Friday, November 30, 2001 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Peter Druschel, from Rice University
Host
Larry Peterson
Peer-to-peer (p2p), initially conceived for the purpose of sharing music on the Internet, promises to be a more general paradigm for organizing large-scale distributed applications. We define p2p systems broadly as self-organizing, decentralized, distributed systems where most or all communication is symmetric. The self-organization, decentralization, diversity and redundancy of resources inherent in the approach lend themselves to a large domain of applications beyond file sharing, anonymity and anti-censorship. At the same time, the decentralization and diversity also pose difficult problems, particularly in resource management and security.

Recent work on p2p overlay networks like CAN, Chord, Pastry and Tapestry has made significant strides towards providing a general substrate that simplifies the construction of a wide range of p2p applications. These overlay networks effectively shield applications from the complexities of organizing and maintaining a secure overlay network, tolerating node failures, and distributing and locating resources.

In this talk, I'll present an overview of Pastry, a p2p overlay network that provides self-organization, fault tolerance, and efficient resource location and distribution, and incorporates interesting heuristics that exploit proximity in the underlying Internet. I will also sketch two applications built upon Pastry to date: PAST, an archival, cooperative file storage and distribution facility, and SCRIBE, a highly scalable event notification system. I'll conclude with an outlook on key research problems and future directions.
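
To give a flavor of how such overlays locate resources, here is a toy sketch of prefix-based routing in the spirit of Pastry; it omits the real routing table, leaf set, join protocol, and proximity heuristics, and the IDs and helper names are purely illustrative:

```python
# Toy prefix routing: forward a key toward a node whose hexadecimal ID shares
# a strictly longer prefix with the key, otherwise toward the numerically
# closest ID, so lookups converge in a small number of hops.

def shared_prefix_len(a: str, b: str) -> int:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(current_id: str, key: str, known_ids: list) -> str:
    """Pick the next hop for `key` among the IDs this node happens to know."""
    here = shared_prefix_len(current_id, key)
    better = [n for n in known_ids if shared_prefix_len(n, key) > here]
    if better:
        return max(better, key=lambda n: shared_prefix_len(n, key))
    # No strictly better prefix match: fall back to numeric closeness.
    return min(known_ids + [current_id],
               key=lambda n: abs(int(n, 16) - int(key, 16)))

if __name__ == "__main__":
    table = ["a3f1", "b070", "b1c2", "b1d9"]
    print(next_hop("a000", key="b1d4", known_ids=table))   # -> b1d9
```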

Finding Needles in a 10 TB Haystack, 140M Times/Day

Date and Time
Thursday, November 15, 2001 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Rob Shillner, from Google, Inc
Host
David Dobkin
Search is one of the most ubiquitous and important applications used on the internet, but it is also one of the hardest applications to do well. Google is a search engine company that began as a research project at Stanford University, and has evolved into the world's largest and most trafficked search engine in just under three years. Three main characteristics have driven this growth: search quality, index size, and speed. Addressing these issues has required tackling problems in a range of computer science disciplines, including algorithm and data structure design, networking, operating systems, distributed and fault-tolerant computing, information retrieval, and user interface design. In this talk, I'll focus on Google's unique hardware platform of 10,000 commodity PCs running Linux, and some of the challenges and benefits presented by this platform. I'll also describe some of the interesting problems that arise in crawling and indexing more than a billion web pages, and performing 140 million queries per day on this index. Finally, I'll describe some of the challenges facing search engines in the future.

Some Perspectives on Computational Complexity

Date and Time
Wednesday, October 3, 2001 - 3:30pm to 5:00pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium
Speaker
Andy Yao, from Princeton University
Host
David Dobkin
In past decades, the theory of computational complexity has flourished in terms of both the revelation of its internal structures and the unfolding of its numerous applications. In this paper we discuss several persistent and interwoven themes underlying many of these accomplishments. Chief among them are the interplay between communication and computation, the power of problem reduction, and the increasingly prominent role played by classical mathematics. We will also speculate on a few promising directions for future development of computational complexity.

A Market Simulation with Aglets

Date and Time
Tuesday, November 17, 1998 - 4:00pm to 5:30pm
Location
Friend Center 013
Type
Colloquium
Speaker
Hideyuki Mizuta, from IBM Tokyo Research Laboratory
Host
Kenneth Steiglitz
The behavior of the real market is quite complicated and its analysis is very difficult. Computer simulation of the market provides us with powerful tools for understanding the essential elements behind market phenomena such as bubbles. Recently, Steiglitz and Shapiro [1] constructed a market model including production and consumption and verified the occurrence of bubbles and negative bubbles.

Now we have implemented the simulation using the Aglets [TM] technology [2]. Aglets are Java [TM] objects that can move from one host on the network to another and can communicate with other aglets by message passing. By using aglets, we can construct a market dynamically. That is, we can add or delete agents which participate in a market at any time and can watch the market status by sending various kinds of observer aglets.
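
As a purely illustrative sketch (not the model of [1], and without the aglet machinery itself), the following toy market shows the kind of dynamic construction described above: agents join or leave between rounds, and each round clears at a crude consensus price. In the actual system each agent would be a mobile aglet on some host, exchanging offers by message passing:

```python
# Toy dynamic market: agents with private valuations can be added or removed
# at any time; each round the market "clears" at the median of their offers.
import random

class Agent:
    def __init__(self, name):
        self.name = name
        self.valuation = random.uniform(50, 150)   # private value of the good

    def offer(self, last_price):
        """Quote a price anchored on the valuation, nudged by the last price."""
        return 0.8 * self.valuation + 0.2 * last_price + random.gauss(0, 5)

class Market:
    def __init__(self):
        self.agents = {}
        self.price = 100.0

    def add(self, agent):              # agents may join at any time...
        self.agents[agent.name] = agent

    def remove(self, name):            # ...and leave at any time
        self.agents.pop(name, None)

    def step(self):
        offers = sorted(a.offer(self.price) for a in self.agents.values())
        if offers:
            self.price = offers[len(offers) // 2]
        return self.price

if __name__ == "__main__":
    market = Market()
    for i in range(5):
        market.add(Agent(f"agent{i}"))
    for t in range(10):
        if t == 5:
            market.remove("agent0")    # an agent leaves mid-simulation
        print(f"round {t}: price {market.step():.2f}")
```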

In this talk, I will present a demonstration of the simulation using aglets and discuss its results.

References

[1] Simulating the Madness of Crowds: Price Bubbles in an Auction-Mediated Robot Market, K. Steiglitz and D. Shapiro, Computational Economics, vol. 12, pp. 35-59, 1998. (http://www.cs.princeton.edu/~ken/madness.ps)

[2] Aglets Software Development Kit, http://www.trl.ibm.co.jp/aglets/

Event (no name)

Date and Time
Wednesday, November 4, 1998 - 4:00pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
Colloquium