Quick links

Continuous Distributed Counting for Non-monotonic Streams

Date and Time
Tuesday, February 7, 2012 - 11:00am to 12:00pm
Computer Science 302
Jennifer Rexford
We consider the problem of tracking a sum of values in a distributed environment where the input values arrive from k distributed streams. It is required that an estimate of the sum is maintained by a coordinator node that is within a prescribed relative error tolerance, at any time with high probability. The goal is to achieve such objective with small total communication with the coordinator node, in order to minimize the use of network bandwidth that may be a scarce resource. This is a basic distributed computation problem that is of interest for various applications including data aggregation and iterative solving of various computational problems.

Specifically, we consider a setting where the input values are selected and assigned to sites by an adversary but the arrival order of the input items is random. We show a randomized protocol with a sub-linear total communication with respect to the number of updates and its matching lower bound. This result stands in between the previously known results that for monotonic partial sums (all value updates either positive or negative), the expected total communication is logarithmic in the number of updates, and that for general worst-case inputs, the total required communication is linear in the number of updates. Finally, we also discuss applications of the sum tracking module for tracking the second frequency moment and for solving a Bayesian linear regression.

Zhenmin Liu is a PhD candidate in the theory of computation group at Harvard, working with Michael Mitzenmacher. His research focuses on stochastic processes, optimization, and algorithm design, as well as their applications to distributed computation and big data analytics.

Follow us: Facebook Twitter Linkedin