CS 597C: Scalable Internet Services

Course Description


The goal of this graduate seminar is to explore systems for, and applications of, an interesting model of distributed communication and coordination over the Internet, namely the publish-subscribe model.  The seminar will serve as a vehicle for us to research, understand, and make design decisions for this model at the protocol, system, and application levels.  As such, the seminar will be quite exploratory and research-oriented, with the goal of learning a lot together and having fun doing so.


Publish-subscribe is a model of asynchronous, decoupled communication that holds great promise for a variety of highly distributed applications, ranging from coordinating distributed computations among powerful nodes, through supporting a new model of selective dissemination of information for human users, to coordinating and monitoring extremely large networks of extremely small sensors.  Three pressures motivate the model.  Larger scales and greater distribution require decoupling producers from consumers rather than relying on traditional client-server approaches.  Long latencies, together with the cost of polling in both human attention and bandwidth, demand asynchronous rather than synchronous communication.  And limits on both human attention and network bandwidth demand sophisticated selection and filtering of the information that is desired.  Publish-subscribe has been used within local-area networks for integrating back-end applications, but the scale, popularity and openness of the Internet, as well as the processing limitations of human users, demand whole new approaches.
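
To make the decoupling, asynchrony, and filtering described above concrete, here is a minimal in-process sketch in Python.  It is only an illustration under stated assumptions: the Broker class, its subscribe/publish/dispatch methods, and the event fields (topic, temp) are all hypothetical names invented for this example and do not correspond to any particular publish-subscribe system or to the systems studied in the seminar.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, List, Tuple

Event = Dict[str, Any]
Predicate = Callable[[Event], bool]   # subscriber-supplied content filter
Handler = Callable[[Event], None]     # called for each matching event


class Broker:
    """Toy in-process broker (hypothetical; not a real pub-sub API)."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[Predicate, Handler]] = []
        self._queue: Deque[Event] = deque()

    def subscribe(self, predicate: Predicate, handler: Handler) -> None:
        # Subscribers state what they want; they never name a publisher.
        self._subscriptions.append((predicate, handler))

    def publish(self, event: Event) -> None:
        # Publishers only enqueue and return; they never learn who,
        # if anyone, will receive the event (decoupled, asynchronous).
        self._queue.append(event)

    def dispatch(self) -> int:
        # Deliver queued events to every subscription whose filter matches.
        # Returns the number of deliveries made.
        delivered = 0
        while self._queue:
            event = self._queue.popleft()
            for predicate, handler in self._subscriptions:
                if predicate(event):
                    handler(event)
                    delivered += 1
        return delivered


inbox: List[Event] = []
broker = Broker()

# Assumed example filter: only hot sensor readings are of interest.
broker.subscribe(
    lambda e: e.get("topic") == "sensor" and e.get("temp", 0) > 30,
    inbox.append,
)

broker.publish({"topic": "sensor", "temp": 35})
broker.publish({"topic": "sensor", "temp": 20})
broker.publish({"topic": "news", "headline": "hello"})

delivered = broker.dispatch()
```

In this sketch only the first published event reaches the subscriber; the other two are filtered out by the broker, so the subscriber neither polls nor sees unwanted information.  A real Internet-scale system would distribute the broker and the matching step across many nodes, which is where the design questions of the seminar begin.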

This graduate seminar will explore the publish-subscribe model over the Internet.  Our exploration will assume Internet transport (TCP/IP) and will divide the publish-subscribe stack above it into three layers: the communication layer, the protocol layer, and the application layer.  Within these layers, we will explore in depth the five major topics listed below.


The Communication Layer


     1. Application Protocols


The Protocol Layer


     2. PubSub Interfaces and Protocol Specifications


     3. Distributed PubSub System Design


     4. Matching Algorithms and Systems (Centralized and Distributed)


The Application Layer


     5. PubSub Applications (including use of PubSub in Traditional Applications)


Depending on how many people join the seminar, we plan to divide people into teams, each responsible for one or more of the five topics throughout the course.  That is, all topics will be carried forward as threads throughout the term, instead of being dealt with sequentially, so that we can examine the individual topics in depth and also the synergies and interfaces among them.  Teams will read assigned papers and perform independent research on the Web in order to understand the key issues and approaches in their assigned topic area and its sub-topics.  They will synthesize the information and present it to the class.

Class sessions will include presentations by different teams, both on specific subtopics and oriented toward successively “peeling the onion” for their assigned topics over the semester, driving from research to design: starting with identifying the key issues in the topic, then determining the major contenders among alternative approaches, then analyzing the tradeoffs among some of the top candidates, then determining which one they would choose and why.  Class sessions will also include presentations by guest speakers who are experts in specific areas, presentations by me, and a lot of discussion.

Students will therefore be responsible for deep investigation into their teams’ topics or layers, but will learn a lot about the others as well.  Interested students and teams may also perform small projects in their topic areas, to learn by doing and to help sharpen their understanding.