
Princeton Theory Lunch Schedule, 2006-07.

Spring 2007

Date | Speaker | Title
Feb 9 | Satyen Kale, Princeton | A Combinatorial, Primal-Dual Approach to Semidefinite Programs
Feb 16 | Esther Ezra, Tel Aviv Univ | On Translational Motion Planning in 3-Space
Feb 23 | Risi Kondor, Columbia | A complete set of rotationally and translationally invariant features for images
Mar 2 | Iannis Tourlakis, Toronto | Tight integrality gaps for Vertex Cover SDPs in the Lovasz-Schrijver hierarchy
Mar 9 | Nir Halman, MIT | Fully Polynomial Time Approximation Schemes for Stochastic Dynamic Programming
Mar 16 | Venkat Guruswami, U. Washington | List Decoding with Optimal Data Rate
Mar 23 | No talk | Spring break
Mar 30 | Petros Drineas, RPI | Sampling algorithms and coresets for Lp regression and applications
Apr 6 | Mohit Singh, CMU | Approximating Minimum Bounded Degree Spanning Trees to within one of Optimal
Apr 13 | Amit Agarwal, Princeton | Improved Approximation for Directed Cut Problems
Apr 20 | No talk | New York Area Theory Day at Columbia
Apr 27 | Konstantin Makarychev, Princeton | Tradeoffs between Local and Global Distortions of Metric Spaces
May 4 | Michael Mitzenmacher, Harvard | The Hiring Problem and Lake Wobegon Strategies
May 11 | Assaf Naor, NYU | An approximation algorithm for the cubic Grothendieck problem
May 18 | Koby Crammer, UPenn | Learning from related sources

Fall 2006

Date | Speaker | Title
Sep 22 | Nikhil Bansal, IBM TJ Watson | The Santa Claus Problem
Sep 29 | Martin J. Strauss, Michigan/Princeton | Sublinear-time Heavy Hitters with Universal Guarantees
Oct 6 | Vijay V. Vazirani, Georgia Tech | New Market Models and Algorithms II
Oct 13 | Nick Harvey, MIT | Algebraic Structures and Algorithms for Matching and Matroid Problems
Oct 20 | Eric Allender, Rutgers | On the Complexity of Numerical Analysis (update)
Oct 27 | No talk |
Nov 3 | No talk | Fall break
Nov 10 | | DIMACS mixer at Princeton
Nov 17 | Vitaly Feldman, Harvard | On Computational Hardness of Agnostic Learning
Nov 24 | No talk | Thanksgiving holiday
Dec 1 | Sanjeev Khanna, U Penn | Disjoint Paths in Networks
Thu, Dec 7, 4pm | Mikkel Thorup, AT&T Labs-Research | Compact Oracles for Approximate Distances around Obstacles in the Plane
Jan 19 | Michael Schapira, Hebrew University | Setting Lower Bounds on Truthfulness

Abstracts for the talks.

Sept 22 The Santa Claus Problem

Nikhil Bansal, IBM TJ Watson

Suppose Santa Claus has n presents that he wants to distribute among m kids. Let p_{ij} denote the happiness that kid i receives from present j. Santa's goal is to distribute the presents among the kids in such a way that the least lucky kid is as happy as possible, that is, to maximize min_{i=1,..,m} sum_{j \in S_i} p_{ij}, where S_i is the set of presents received by the i-th kid.
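
For concreteness, here is a minimal Python sketch (with hypothetical names, illustration only) that evaluates this max-min objective by brute force over all m^n assignments; it is feasible only for tiny instances and is not one of the approximation algorithms discussed in the talk.

```python
from itertools import product

def santa_claus_brute_force(p):
    """p[i][j] = happiness kid i gets from present j.
    Returns the best achievable value of min_i sum_{j in S_i} p[i][j]."""
    m, n = len(p), len(p[0])          # m kids, n presents
    best = float("-inf")
    for assignment in product(range(m), repeat=n):   # kid receiving each present
        happiness = [0.0] * m
        for j, i in enumerate(assignment):
            happiness[i] += p[i][j]
        best = max(best, min(happiness))
    return best

# Example: 2 kids, 3 presents; the best allocation gives the least happy kid value 4.
print(santa_claus_brute_force([[3, 1, 2], [1, 4, 1]]))   # 4
```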

The problem arises naturally in the context of fair allocation of goods. However, it remains poorly understood from the point of view of approximability. I will survey the known results, and describe some approximation algorithms for certain special cases of the problem.

Joint work with Maxim Sviridenko.


Sept 29 Sublinear-time Heavy Hitters with Universal Guarantees

Martin J. Strauss, Michigan/Princeton

In the heavy hitters problem, the goal is to track the most frequent items in a changing collection, using little space. Over the last decade, a number of algorithms have been proposed for this problem. We present the first algorithms that simultaneously satisfy three important criteria: (i) the space is optimal, up to log factors; (ii) the reconstruction time is polynomial in the number of heavy hitters and polylogarithmic in the universe size; (iii) a single set of measurements (constructed randomly) works for all collections of items.
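
As background, the following small Python sketch of a Count-Min-style structure illustrates the small-space setting of the heavy hitters problem; it is not the algorithm of this talk, which additionally achieves sublinear reconstruction time and a single universal set of measurements.

```python
import random

class CountMin:
    """A Count-Min sketch: `depth` rows of `width` counters, one hash per row."""
    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width, self.depth = width, depth
        self.tables = [[0] * width for _ in range(depth)]
        self.salts = [rng.randrange(1 << 30) for _ in range(depth)]

    def _bucket(self, item, row):
        return hash((self.salts[row], item)) % self.width

    def update(self, item, count=1):
        for row in range(self.depth):
            self.tables[row][self._bucket(item, row)] += count

    def estimate(self, item):
        # Never undercounts (for nonnegative updates); overcounts only on collisions.
        return min(self.tables[row][self._bucket(item, row)]
                   for row in range(self.depth))

sketch = CountMin(width=200, depth=5)
for x in [1, 2, 2, 3, 3, 3] * 100:
    sketch.update(x)
print(sketch.estimate(3))   # about 300
```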

Joint work with Anna Gilbert, Joel Tropp, and Roman Vershynin

The talk is based on two papers. One is available as http://arxiv.org/abs/cs.DS/0608079 and the other is in progress.


Oct 6 New Market Models and Algorithms II

Vijay V. Vazirani, Georgia Tech

The notion of a ``market'' has undergone a paradigm shift with the Internet -- totally new and highly successful markets have been defined and launched by companies such as Google, Yahoo!, Amazon, MSN and Ebay. Another major change is the availability of massive computational power for running these markets in a centralized or distributed manner.

In view of these new realities, the study of market equilibria, an important, though essentially non-algorithmic, theory within Mathematical Economics, needs to be revived and rejuvenated with an inherently algorithmic approach. Such a theory should not only address traditional market models but also define new models for some of the new markets.

In this two-talk series, I will give a feel for the exciting work going on in this area. Interestingly enough, this work has also contributed handsomely to the theory of algorithms itself. In particular, the highly successful primal-dual schema from exact and approximation algorithms, which had so far been used for combinatorially solving special classes of linear programs, has been extended to solving nonlinear convex programs.

Both talks are self-contained; the first is meant for a general audience and will be the Princeton CS department colloquium on Wednesday, Oct 4. The second will be at the Princeton theory lunch on Friday, Oct 6:

http://www.cs.princeton.edu/theory/index.php/Main/TheoryLunch

1). Resource Allocation Markets

This talk is based on the following three papers:

http://www-static.cc.gatech.edu/fac/Vijay.Vazirani/adwords.pdf

http://www-static.cc.gatech.edu/fac/Vijay.Vazirani/EG.pdf

http://www-static.cc.gatech.edu/fac/Vijay.Vazirani/EG2.pdf

2). Spending Constraint Utilities, with Applications to the Adwords Market

This talk is based on:

http://www-static.cc.gatech.edu/fac/Vijay.Vazirani/spending.pdf


Oct 13 Algebraic Structures and Algorithms for Matching and Matroid Problems

Nick Harvey, MIT

The task of finding a maximum matching in a graph is one of the classical problems in the theory of algorithms, inspiring the definition of the class P. Until recently, the fastest algorithm for this task took O(n^2.5) steps in an n-vertex (dense) graph. But an exciting development due to Mucha and Sankowski (FOCS '04) dropped the running time to O(n^w), where w < 2.38 is the exponent of matrix multiplication. However, their result was quite complicated, relying on certain structural decompositions of graphs and intricate data structures to maintain certain properties dynamically.

In this talk, I will present two new results. The first is a simple randomized algorithm achieving the same running time, allowing for a self-contained presentation. The second is an extension of this algorithm to the task of matroid intersection (for linear matroids over any field).
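
As a hint of how algebra enters, here is a minimal Python sketch of the classical Tutte-matrix randomized test for the existence of a perfect matching (Lovasz); the talk's algorithms build on this kind of algebraic formulation, but the sketch below is not the O(n^w) algorithm itself.

```python
import random

def _det_mod(M, p):
    """Determinant of an integer matrix modulo a prime p, by Gaussian elimination."""
    M = [row[:] for row in M]
    n, det = len(M), 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col]), None)
        if pivot is None:
            return 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det
        det = det * M[col][col] % p
        inv = pow(M[col][col], p - 2, p)        # inverse via Fermat's little theorem
        for r in range(col + 1, n):
            f = M[r][col] * inv % p
            for c in range(col, n):
                M[r][c] = (M[r][c] - f * M[col][c]) % p
    return det % p

def has_perfect_matching(n, edges, prime=(1 << 31) - 1, trials=3):
    """Tutte-matrix test: a True answer is always correct; when a perfect matching
    exists, False is returned only with small probability (Schwartz-Zippel)."""
    for _ in range(trials):
        T = [[0] * n for _ in range(n)]
        for u, v in edges:
            x = random.randrange(1, prime)
            T[u][v], T[v][u] = x, (prime - x) % prime
        if _det_mod(T, prime) != 0:
            return True
    return False

print(has_perfect_matching(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # True
print(has_perfect_matching(3, [(0, 1), (1, 2)]))                  # False
```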


Oct 20 On the Complexity of Numerical Analysis (update)

Eric Allender (Chair, Rutgers Dept. of Computer Science)

The Euclidean Traveling Salesman Problem is a well-known NP-hard problem, but some are surprised to learn that it is not known to lie in NP. This lecture will present a new upper bound on the complexity of this problem: it lies in the "counting hierarchy". The proof falls out of some more general considerations concerning arithmetic circuits, by way of a detour through the field of computation over the Real Numbers in the Blum-Shub-Smale model.

As a recent development, we also discuss some new connections between derandomization of Arithmetic Circuit Identity Testing and some arithmetic circuit questions in the Blum-Shub-Smale model. This is joint work with Peter Buergisser, Johan Kjeldgaard-Pedersen, and Peter Bro Miltersen, and was presented at the 2006 IEEE Conference on Computational Complexity; it is available on my home page. I will also discuss some more recent work by Peter Buergisser.


Nov 17 On Computational Hardness of Agnostic Learning

Vitaly Feldman, Harvard

The agnostic learning framework of Kearns, Schapire and Sellie is an extension of Valiant's PAC learning model, in which examples that are given to the learning algorithm are labelled by an unknown and unrestricted (and hence adversarial) function. An agnostic learning algorithm for a class of functions $C$ is required to produce a hypothesis whose error (with respect to the distribution of examples) is "close" to the optimal possible by a function from $C$. A large number of questions regarding the learnability of even the simplest function classes in this natural model remain open since the introduction of the model in 1992.

In this talk I will show that agnostic learning of parities with respect to the uniform distribution reduces to learning of parities with random classification noise. Learning with random noise is a substantially more tractable model and, in particular, using a noise-tolerant parity learning algorithm by Blum, Kalai and Wasserman we obtain the first non-trivial algorithm for agnostic learning of parities with respect to the uniform distribution. The reduction also shows that learning of juntas and DNF expressions with respect to the uniform distribution can be reduced to learning parities with random classification noise.
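
To make the objective concrete, here is a tiny brute-force Python sketch of agnostic parity learning (find the parity that agrees with the most labels); the point of the reduction above is precisely to avoid such exponential search, so this only illustrates the problem being solved.

```python
from itertools import combinations

def best_parity(examples, n):
    """examples: list of (x, y) with x a 0/1 tuple of length n and y in {0, 1}.
    Returns the subset S whose parity agrees with the largest fraction of labels."""
    best_S, best_agree = (), 0
    for k in range(n + 1):
        for S in combinations(range(n), k):
            agree = sum((sum(x[i] for i in S) % 2) == y for x, y in examples)
            if agree > best_agree:
                best_S, best_agree = S, agree
    return best_S, best_agree / len(examples)

# Labels follow the parity of bits {0, 2}, except for one corrupted example.
data = [((0, 0, 0), 0), ((1, 0, 0), 1), ((1, 0, 1), 0), ((0, 1, 1), 0)]
print(best_parity(data, 3))   # the best parity agrees with 3 of the 4 labels
```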

The second problem we will discuss is the problem of weak agnostic learning of monomials by a proper learning algorithm (that is, an algorithm that produces a monomial as its hypothesis) formulated by Avrim Blum. We solve this open problem by showing that it is NP-hard. In other words, we prove that finding a monomial that is consistent with $1/2+\epsilon$ fraction of examples is NP-hard even when there exists a monomial consistent with $1-\epsilon$ fraction of examples (for any constant $\epsilon > 0$). The result can also be interpreted as an optimal hardness of approximation result for maximizing agreements with a monomial. It substantially improves on previous hardness of approximation results for this problem (due to Ben-David et al., and Bshouty and Burroughs).

The talk will be largely self-contained and both results will also be interpreted outside of the learning context. Parts of this talk are based on joint work with Parikshit Gopalan, Subhash Khot and Ashok Ponnuswami.


Dec 1 Disjoint Paths in Networks

Sanjeev Khanna, U Penn

A fundamental problem in combinatorial optimization is the edge-disjoint paths problem (EDP). We are given a network and a collection of source-destination pairs in the network. The goal is to maximize the number of pairs that can be connected by edge-disjoint paths (i.e., the paths cannot share any edges). Even special cases of EDP correspond to non-trivial optimization problems and the problem becomes NP-hard in very restricted settings. A well-studied approach for designing approximation algorithms for EDP is to use the multicommodity flow relaxation. In this talk, we will describe some recent progress on understanding the integrality gap of the multicommodity flow relaxation as well as present some new hardness of approximation results for EDP and related problems in directed graphs.
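
For illustration, the following Python sketch implements the classical greedy baseline for EDP (route each pair along a shortest path in the remaining graph, then delete the edges used); it is only a simple heuristic, not the multicommodity-flow-based algorithms of the talk.

```python
from collections import deque

def _bfs_path(adj, s, t):
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:                       # reconstruct the path back to s
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    return None

def greedy_edp(n, edges, pairs):
    """Route each (s, t) along a shortest path in the remaining graph, then delete
    the edges used, so all returned paths are edge-disjoint."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    routed = []
    for s, t in pairs:
        path = _bfs_path(adj, s, t)
        if path is None:
            continue
        routed.append((s, t, path))
        for u, v in zip(path, path[1:]):
            adj[u].discard(v)
            adj[v].discard(u)
    return routed

# In the complete graph K4, both pairs get routed along their direct edges.
k4 = {(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)}
print(greedy_edp(4, k4, [(0, 2), (1, 3)]))   # [(0, 2, [0, 2]), (1, 3, [1, 3])]
```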


Dec 7 Compact Oracles for Approximate Distances around Obstacles in the Plane

Mikkel Thorup, AT&T Labs-Research

Consider the Euclidean plane with arbitrary polygonal obstacles with a total of n corners. For any fixed a>0, we construct an oracle in near-linear time and space. Given any pair of points, the oracle approximates the obstacle-avoiding distance within a factor (1+a) in O(log n) time. To get such distances exactly, the best current space bound is O(n^{11}) [Chiang-Mitchell SODA'99].


Jan 19 Setting Lower Bounds on Truthfulness

Michael Schapira, Hebrew University

We present and discuss general techniques for proving inapproximability results for truthful mechanisms. We make use of these techniques to prove lower bounds on the approximability of several non-utilitarian multi-parameter problems.

In particular, we demonstrate the strength of our techniques by exhibiting a lower bound of $2-\frac{1}{m}$ for the scheduling problem with unrelated machines (formulated as a mechanism design problem in the seminal paper of Nisan and Ronen on Algorithmic Mechanism Design). Our lower bound applies to truthful randomized mechanisms (disregarding any computational assumptions on the running time of these mechanisms). Moreover, it holds even for the weaker notion of truthfulness for randomized mechanisms -- i.e., truthfulness in expectation. This lower bound nearly matches the known $\frac{7}{4}$ (randomized) truthful upper bound for the case of two machines (a non-truthful FPTAS exists). No lower bound for truthful randomized mechanisms in multi-parameter settings was previously known.

Based on joint work with Ahuva Mu'alem.


Feb 9 A Combinatorial, Primal-Dual Approach to Semidefinite Programs

Satyen Kale, Princeton University

Algorithms based on convex optimization, especially linear and semidefinite programming, are ubiquitous in Computer Science. While there are polynomial time algorithms known to solve such problems, quite often the running time of these algorithms is very high. Designing simpler and more efficient algorithms is important for practical impact.

In my talk, I will describe applications of a Lagrangian relaxation technique, the Multiplicative Weights Update method, in the design of efficient algorithms for various optimization problems. We generalize the method to the setting of symmetric matrices rather than real numbers. The new algorithm yields the first truly general, combinatorial, primal-dual method for designing efficient algorithms for semidefinite programming. Using these techniques, we obtain significantly faster algorithms for approximating Sparsest Cut and Balanced Separator in both directed and undirected weighted graphs to factors of O(log n) and O(sqrt{log n}), and also for the Min UnCut problem. The algorithm also has applications in quantum computing and derandomization.
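
As background, here is a minimal Python sketch of the scalar Multiplicative Weights Update method on the classical prediction-with-expert-advice problem; the talk's contribution is, roughly, a generalization of this kind of update from nonnegative weights to symmetric matrices, which the sketch does not implement.

```python
import math

def multiplicative_weights(loss_rounds, eta=0.1):
    """loss_rounds: per-round loss vectors, one entry in [0, 1] per expert.
    Returns the final weights and the algorithm's total expected loss."""
    n = len(loss_rounds[0])
    weights = [1.0] * n
    total_loss = 0.0
    for losses in loss_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]            # play experts proportionally
        total_loss += sum(p * l for p, l in zip(probs, losses))
        # Penalize each expert multiplicatively according to its loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    return weights, total_loss

# Two experts: the first errs in half the rounds, the second is always right.
rounds = [[1.0, 0.0], [0.0, 0.0]] * 50
weights, loss = multiplicative_weights(rounds)
print(weights, loss)   # nearly all weight ends up on the second expert
```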

This is joint work with Sanjeev Arora.


Feb 16 On Translational Motion Planning in 3-Space

Esther Ezra, Tel-Aviv University

Constructing the union of geometric objects and bounding its combinatorial complexity is a fundamental problem in computational and combinatorial geometry. A major application of this problem is to translational motion planning, where a polyhedral robot $R$ translates in a fixed polyhedral environment, and one wishes to compute all placements of $R$ (known as free placements) at which it avoids collision with any obstacle.

In this talk, I will give a brief overview of the area, and explain its importance. I will then discuss the union problem, where the input consists of $n$ ``fat'' tetrahedra in 3-space, of arbitrary sizes, and show that in this case the combinatorial complexity of the union is $O(n^{2+\eps})$, for any $\eps > 0$, thus almost settling a conjecture of Pach et al. Our bound is the first (non-trivial) subcubic bound presented for this problem, and is almost tight in the worst case. An immediate consequence of this result is that the combinatorial complexity of the union of $n$ cubes in 3-space, arbitrarily aligned and of arbitrary sizes, is also $O(n^{2+\eps})$, for any $\eps > 0$.

The talk will be self-contained.

Joint work with Micha Sharir.


Feb 23 A complete set of rotationally and translationally invariant features based on a generalization of the bispectrum to non-commutative groups

Risi Kondor, Columbia University

Deriving translation and rotation invariant representations is a fundamental problem in computer vision with a substantial literature. I propose a new set of features which:

(a) are simultaneously invariant to translation and rotation;
(b) are sufficient to reconstruct the original image with no loss (up to a bandwidth limit);
(c) do not involve matching with a template image or any similar discontinuous operation.

The new features are based on Kakarala's generalization of the bispectrum to compact Lie groups and a projection onto the sphere. I validated the method on a handwritten digit recognition dataset with randomly translated and rotated digits.
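
For intuition, the following Python sketch computes the classical bispectrum of a 1-D cyclic signal, which is invariant under cyclic translation; the talk's features come from Kakarala's generalization of this construction to non-commutative groups (rotations), which the sketch does not cover.

```python
import numpy as np

def bispectrum(signal):
    """B[k1, k2] = F[k1] * F[k2] * conj(F[k1 + k2]) for the DFT F of the signal;
    this matrix is unchanged under cyclic shifts of the signal."""
    F = np.fft.fft(signal)
    n = len(F)
    k = np.arange(n)
    return F[:, None] * F[None, :] * np.conj(F[(k[:, None] + k[None, :]) % n])

x = np.random.rand(16)
print(np.allclose(bispectrum(x), bispectrum(np.roll(x, 5))))   # True
```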

paper: http://arxiv.org/abs/cs.CV/0701127


Mar 2 Tight integrality gaps for Vertex Cover SDPs in the Lovasz-Schrijver hierarchy

Iannis Tourlakis, U. Toronto

One of the most powerful techniques used for approximation algorithms is semidefinite programming relaxations. Such relaxations have led to breakthrough approximation algorithms, e.g., for Max Cut and Sparsest Cut. However, they have not had much success with Vertex Cover. A series of papers (Kleinberg-Goemans, Charikar, Hatami-Magen-Markakis) show that several natural SDP relaxations for Vertex Cover have integrality gaps of 2-o(1), giving no improvement over the naive greedy algorithm for Vertex Cover.

Do better SDPs exist for Vertex Cover? A natural class of SDPs to consider are those derived by applying the Lovasz-Schrijver LS+ lift-and-project operator to the standard LP relaxation for Vertex Cover. Repeatedly applying this operator over many rounds results in a sequence of tighter and tighter SDP relaxations where the rth round SDP is computable in time n^O(r) and where the nth SDP solves Vertex Cover exactly on n vertex graphs. Ruling out non-trivial approximations for Vertex Cover in this hierarchy rules out one of the most promising classes of algorithms for Vertex Cover and thus may give evidence on the true approximability of Vertex Cover.
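
To make the notion of an integrality gap concrete, here is a small Python sketch (assuming NumPy and SciPy are available) that solves the standard LP relaxation of Vertex Cover on the complete graph K_n, where the LP value n/2 sits a factor of nearly 2 below the true minimum cover size n-1; it does not construct the LS+ or SDP relaxations studied in the talk.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def vertex_cover_lp(n, edges):
    """Minimize sum_u x_u subject to x_u + x_v >= 1 for each edge and 0 <= x_u <= 1."""
    A = np.zeros((len(edges), n))
    for row, (u, v) in enumerate(edges):
        A[row, u] = A[row, v] = -1.0    # encode x_u + x_v >= 1 as -(x_u + x_v) <= -1
    res = linprog(c=np.ones(n), A_ub=A, b_ub=-np.ones(len(edges)),
                  bounds=[(0, 1)] * n)
    return res.fun

n = 8
edges = list(itertools.combinations(range(n), 2))   # the complete graph K_8
print(vertex_cover_lp(n, edges), n - 1)             # 4.0 (LP) versus 7 (best integral cover)
```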

Schoenebeck, Trevisan and Tulsiani recently showed that the integrality gap for Vertex Cover remains 7/6 even after linearly many rounds of LS+. However, this gap is smaller than the 1.36 inapproximability factor that Dinur and Safra established using PCP-based techniques. Determining the exact strength of the LS+ relaxation for Vertex Cover after even 2 rounds remained open.

We show the first tight integrality gaps for Vertex Cover SDPs in the Lovasz-Schrijver hierarchy. In particular, we prove that the integrality gap remains 2-o(1) even after Omega(sqrt(log n/log log n)) rounds of LS+.

This is joint work with Konstantinos Georgiou, Avner Magen and Toniann Pitassi.


Mar 9 Fully Polynomial Time Approximation Schemes for Stochastic Dynamic Programming

Nir Halman, MIT

We develop a framework for obtaining (deterministic) Fully Polynomial Time Approximation Schemes (FPTAS) for stochastic univariate dynamic programming problems with either convex or monotone single-period cost function.

Using our framework, we give the first FPTAS for several NP-Hard problems in various fields of research.

Joint work with Diego Klabjan, Chung-Lun Li, James Orlin and David Simchi-Levi.


Mar 16 List Decoding with Optimal Data Rate

Venkatesan Guruswami, U. Washington

Suppose you want to communicate a message of k packets on a noisy communication channel. So you judiciously encode it as a redundant collection of n = ck packets and transmit these. What is the smallest number of correct packets one needs to receive in order to have any hope of recovering the message?

Well, clearly one needs at least k correct packets. In this talk, I will describe an encoding scheme that attains this information-theoretic limit: for any desired constant eps > 0, it enables recovery of the message as long as at least k(1+eps) packets are received intact. The location of the correct packets and the errors on the remaining packets can be picked adversarially by the channel. Our codes are simple to describe: they are certain *folded* Reed-Solomon codes, which are just the well known Reed-Solomon codes, but viewed as a code over a larger alphabet via a bundling together of codeword symbols.
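
As a rough illustration of the construction, here is a Python sketch of plain Reed-Solomon encoding over a small prime field together with a simplified folding step that bundles consecutive codeword symbols; the actual folded codes and, in particular, the list-decoding algorithm of the talk are not implemented here.

```python
P = 257   # a small prime, so we work over the field F_257 (chosen only for illustration)

def rs_encode(message, n):
    """Evaluate the polynomial with coefficient list `message` (k symbols) at the
    points 0, 1, ..., n-1 in F_P, giving an [n, k] Reed-Solomon codeword."""
    return [sum(c * pow(a, i, P) for i, c in enumerate(message)) % P
            for a in range(n)]

def fold(codeword, m):
    """Bundle consecutive symbols in groups of m, viewing each group as a single
    symbol over the larger alphabet F_P^m (a simplified version of folding)."""
    return [tuple(codeword[i:i + m]) for i in range(0, len(codeword), m)]

msg = [5, 1, 7]                      # k = 3 message symbols
codeword = rs_encode(msg, n=12)      # n = 12 codeword symbols
print(fold(codeword, m=4))           # 3 folded symbols, each a 4-tuple over F_257
```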

This achieves the optimal trade-off between redundancy and error-resilience for a worst-case noise model where the channel can corrupt the transmitted symbols arbitrarily subject to a bound on the total number of errors. These results are obtained in an error-recovery model called list decoding. The talk will introduce and motivate the problem of list decoding, and then give a peek into the algebraic ideas and constructions that lead to the above result.

Based on joint work with Atri Rudra.


Mar 30 Sampling algorithms and coresets for Lp regression and applications

Petros Drineas, RPI

L2 regression, also known as the least squares approximation problem, and more generally Lp regression, are fundamental problems that have numerous applications in mathematics and statistical data analysis. Recall that the over-constrained Lp regression problem takes as input an n x d matrix A, where n > d, an n-vector b, and a number p, and it returns as output a d-vector x such that ||Ax-b||_p is minimized. Several randomized algorithms for this problem, each of which provides a relative-error approximation to the exact solution, are described. The first algorithm applies novel ``subspace sampling'' probabilities to randomly sample constraints and thus construct a coreset for the L2 regression problem. The second algorithm (devised by T. Sarlos) applies a random projection and uses a ``fast'' version of the Johnson-Lindenstrauss lemma to obtain an efficient approximation to the L2 regression problem. The third algorithm generalizes the concept of a ``well-conditioned'' basis and of ``subspace-preserving sampling'' to obtain efficiently a coreset for Lp regression, for all $p\in[1,\infty)$. Applications of these random projection and random sampling algorithms to data analysis will be briefly discussed.
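
In the spirit of the random-projection approach mentioned above, here is a minimal Python/NumPy sketch of approximate L2 regression using a dense Gaussian sketch; it conveys the idea of solving a much smaller projected problem, but it uses neither the subspace-sampling probabilities nor the fast Johnson-Lindenstrauss transform of the algorithms described.

```python
import numpy as np

def sketched_least_squares(A, b, sketch_rows, rng):
    """Solve min_x ||S A x - S b||_2 for a random Gaussian sketch S; with enough
    sketch rows this approximates min_x ||A x - b||_2 with good probability."""
    n, _ = A.shape
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((5000, 10))
b = A @ rng.standard_normal(10) + 0.1 * rng.standard_normal(5000)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
x_approx = sketched_least_squares(A, b, sketch_rows=400, rng=rng)
print(np.linalg.norm(A @ x_approx - b) / np.linalg.norm(A @ x_exact - b))   # close to 1
```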


April 6 Approximating Minimum Bounded Degree Spanning Trees to within one of Optimal

Mohit Singh, CMU

In the Minimum Bounded Degree Spanning Tree problem, we are given an edge-weighted undirected graph with a degree bound k for each vertex v, and the task is to find a spanning tree of minimum cost which satisfies all the degree bounds. In this talk I will present an efficient algorithm which returns a spanning tree in which the degree of any vertex is at most k+1 and whose cost is at most the cost of the optimal spanning tree of maximum degree k. This settles a conjecture of Goemans affirmatively and generalizes a result of Furer and Raghavachari to weighted graphs.

The algorithm generalizes to the setting where vertices have distinct upper and lower bounds on their degrees, and returns a spanning tree of optimal cost which violates the degree bounds by at most an additive one.

The main technique used is the iterative rounding method introduced by Jain in the design of approximation algorithms. Our major technical contribution is to extend the ideas of this method and apply them effectively to a new domain of problems.

The talk will be self-contained.

This is joint work with Lap Chi Lau.


April 13 Improved Approximation for Directed Cut Problems

Amit Agarwal, Princeton

A fundamental problem associated with graphs is that of finding the cheapest set of edges which cuts the graph into two or more pieces. Graph cut problems have a number of important applications in divide and conquer algorithms. In the minimum multicut problem, the goal is to partition the graph so as to separate a given set of source-sink pairs.

Our main result is an improvement upon the current best O(n^{1/2}) approximation for the multicut problem in directed graphs. Our algorithm uses the natural LP relaxation for the problem. We present a randomized rounding algorithm with a more sophisticated charging scheme and analysis to obtain an O(n^{11/23}) approximation. Our result also implies an improved upper bound on the worst-case ratio between maximum multicommodity flow and minimum multicut in directed graphs.

Perhaps more interesting than the specific bound we obtain are the fact that sqrt{n} is not the right answer for this problem and the techniques we use in establishing this.

This is joint work with Noga Alon and Moses Charikar


April 27 Tradeoffs between Local and Global Distortions of Metric Spaces

Konstantin Makarychev, Princeton

Suppose that every k points in a metric space X are D-distortion embeddable into l1. We give upper and lower bounds on the distortion required to embed the entire space X into l1. This is a natural mathematical question and is also motivated by the study of relaxations obtained by lift-and-project methods for graph partitioning problems. In this setting, we show that X can be embedded into l1 with distortion O(D log(n/k)). Moreover, we give a lower bound showing that this result is tight if D is bounded away from 1. For D = 1 + delta we give a lower bound of Omega(log(n/k) / log(1/delta)); and for D = 1, we give a lower bound of Omega(log n / (log k + log log n)). Our bounds significantly improve on the results of Arora, Lovasz, Newman, Rabani, Rabinovich and Vempala, who initiated a study of these questions.
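
For readers unfamiliar with the terminology, the following Python sketch computes the distortion of a given embedding into l1 (the ratio of the largest to the smallest stretch over all pairs); it only evaluates embeddings and does not construct the ones discussed above.

```python
from itertools import combinations

def distortion(points, embed, dist):
    """Distortion of the map `embed` from the metric `dist` into l1:
    (largest stretch) / (smallest stretch) over all pairs of points."""
    ratios = []
    for p, q in combinations(points, 2):
        l1 = sum(abs(a - b) for a, b in zip(embed(p), embed(q)))
        ratios.append(l1 / dist(p, q))
    return max(ratios) / min(ratios)

# The 4-cycle with shortest-path distances embeds isometrically into the l1 plane.
coords = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
graph_dist = lambda p, q: min((p - q) % 4, (q - p) % 4)
print(distortion(range(4), lambda p: coords[p], graph_dist))   # 1.0
```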

This is joint work with Moses Charikar and Yury Makarychev.


May 4 The Hiring Problem and Lake Wobegon Strategies

Michael Mitzenmacher, Harvard University

We introduce the hiring problem, in which a growing company continuously interviews and decides whether to hire applicants. This problem is similar in spirit to, but quite different from, the well-studied secretary problem. It captures aspects of decision making under uncertainty, and specifically issues that arise in systems where items are bought or sold at negotiated prices in an imperfect information market.

We analyze natural strategies of hiring above the current average, considering both the mean and the median averages; we call these Lake Wobegon strategies. Like the hiring problem itself, our strategies are intuitive, simple to describe, and amenable to mathematically and economically significant modifications. We demonstrate several intriguing behaviors of the two strategies. Specifically, we show dramatic differences between hiring above the mean and above the median. We also show that both strategies are intrinsically connected to the lognormal distribution, leading to only very weak concentration results, and the marked importance of the first few hires on the overall outcome.
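
A quick way to see the difference between the two strategies is to simulate them; the Python sketch below does so under the simplifying assumption that applicant quality scores are i.i.d. uniform on [0, 1], which is an assumption of this sketch rather than a claim about the model in the talk.

```python
import random
import statistics

def simulate(strategy, applicants=20000, seed=0):
    """Hire an applicant iff their score exceeds the current mean (or median)
    of the scores of everyone hired so far."""
    rng = random.Random(seed)
    hired = [rng.random()]                     # the founding employee
    for _ in range(applicants):
        score = rng.random()
        threshold = (statistics.mean(hired) if strategy == "mean"
                     else statistics.median(hired))
        if score > threshold:
            hired.append(score)
    return len(hired), statistics.mean(hired)

for strategy in ("mean", "median"):
    size, avg = simulate(strategy)
    print(strategy, size, round(avg, 4))       # number hired and their average quality
```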

Joint work with Andrei Broder (Yahoo! Research), Adam Kirsch (Harvard), Ravi Kumar (Yahoo! Research), Eli Upfal (Brown), and Sergei Vassilvitskii (Stanford).


May 11 An approximation algorithm for the cubic Grothendieck problem

Assaf Naor, NYU

Given a symmetric 3-tensor of real numbers {a_{ijk} : 1 <= i,j,k <= n}, we consider the problem of approximating in polynomial time the maximum of \sum_{i,j,k} a_{ijk} x_i x_j x_k over the discrete cube {-1,1}^n. We will present both upper and lower bounds for this problem, and show how our new algorithm can be used in combinatorial optimization. Our approach reduces the problem to that of efficiently computing the diameter of convex bodies in the \ell_1 norm, for which we will present an optimal result.
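
To pin down the quantity being approximated, here is a brute-force Python sketch that evaluates the objective over all of {-1,1}^n; it is exponential in n and serves only to make the definition concrete.

```python
from itertools import product

def cubic_max(a):
    """Maximize sum_{i,j,k} a[i][j][k] * x_i * x_j * x_k over x in {-1, 1}^n."""
    n = len(a)
    best = float("-inf")
    for x in product((-1, 1), repeat=n):
        value = sum(a[i][j][k] * x[i] * x[j] * x[k]
                    for i in range(n) for j in range(n) for k in range(n))
        best = max(best, value)
    return best

# A tiny symmetric 2 x 2 x 2 example; the maximum here is 2.
a = [[[0, 1], [1, 0]], [[1, 0], [0, -1]]]
print(cubic_max(a))   # 2
```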

Joint work with Subhash Khot.


May 18 Learning from related sources

Koby Crammer, University of Pennsylvania

We often like to build a model for one scenario based on data from similar or nearby cases. For example, consider the problem of building a model which predicts the sentiment of short book reviews, using reviews and sentiment labels for DVDs. Another example is learning the movie preferences of one viewer from ratings provided by other, similar users. There is a natural tradeoff between using data from more users and using data from only similar users.

In this talk, I will discuss the problem of learning good models using data from multiple related or similar sources. I will present a theoretical approach which extends the standard probably approximately correct (PAC) learning framework, and show how it can be applied in order to determine which sources of data should be used for classification and regression.

I will then proceed to the analysis of the problem where we would like to model a target domain using labeled data from sources with a different distribution. I will briefly discuss the problem of representation and try to answer under what conditions we can adapt a classifier trained on the source domain for use in the target domain.