| |
|
|
| Boosting and hard-core set constructions: a simplified approach |
2007 |
|
| ECCC Report
TR07-131.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
|
Abstract
We revisit the connection between boosting
algorithms and hard-core set constructions discovered by Klivans and Servedio.
We present a boosting algorithm with a certain smoothness property that is
necessary for hard-core set constructions: the distributions it generates do
not put too much weight on any single example. We then use this boosting
algorithm to show the existence of hard-core sets matching the best parameters
of Klivans and Servedio's construction. |
|
| Testing Expansion in Bounded Degree Graphs |
2007 |
|
| With C.
Seshadhri. ECCC Report
TR07-076.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
|
Abstract
We consider the problem of testing graph expansion in the bounded degree model.
We give a property tester that given a graph with degree bound d, an
expansion bound α, and a parameter ε > 0, accepts the graph
with high probability if its expansion is more than α, and rejects
it with high probability if it is ε-far from any graph (with degree
bound 2d) with expansion Ω(α2). The algorithm runs
in time Õ(n0.5 + μd2/εα2) for
any given constant μ > 0. |
|
| Computational Equivalence of Fixed Points and No Regret
Algorithms, and Convergence to Equilibria |
2007 |
|
| With Elad
Hazan. To appear in 23rd
NIPS, 2007.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
Abstract
We study the relation between notions of game-theoretic equilibria which are
based on stability under a set of deviations, and empirical equilibria which
are reached by rational players. Rational players are modeled by players using
no regret algorithms, which guarantee that their payoff in the long run is
close to the maximum they could hope to achieve by consistently deviating
from the algorithm's suggested action.
We show that for a given set of deviations over the strategy set of a player,
it is possible to efficiently approximate fixed points of a given deviation if
and only if there exist efficient no regret algorithms resistant to the
deviations. Further, we show that if all players use a no regret algorithm,
then the empirical distribution of their plays converges to an equilibrium. |
|
| Efficient Algorithms using the Multiplicative Weights Update
Method |
2007 |
|
| Ph.D. thesis. Princeton
Tech Report TR-804-07.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
Abstract
Algorithms based on convex optimization, especially linear and
semidefinite programming, are ubiquitous in Computer Science.
While there are polynomial time algorithms known to solve such
problems, quite often the running time of these algorithms is
very high. Designing simpler and more efficient algorithms is
important for practical impact.
In this thesis, we explore applications of the Multiplicative
Weights method in the design of efficient algorithms for
various optimization problems. This method, which was
repeatedly discovered in quite diverse fields, is an
algorithmic technique which maintains a distribution on a
certain set of interest, and updates it iteratively by
multiplying the probability mass of elements by suitably
chosen factors based on feedback obtained by running another
algorithm on the distribution.
We present a single meta-algorithm which unifies all known
applications of this method in a common framework. Next, we
generalize the method to the setting of symmetric matrices
rather than real numbers. We derive the following applications
of the resulting Matrix Multiplicative Weights algorithm:
- The first truly general, combinatorial, primal-dual method
for algorithms for semidefinite programming. Using these
techniques, we obtain significantly faster algorithms for
obtaining O(√log(n)) approximations to various graph
partitioning problems, such as Sparsest Cut, Balanced
Separator in both directed and undirected weighted graphs, and
constraint satisfaction problems such as Min UnCut and Min
2CNF Deletion.
- An Õ(n3) time derandomization of the Alon-Roichman
construction of expanders using Cayley graphs. The algorithm
yields a set of O(log n) elements which generates an expanding
Cayley graph in any group of n elements.
- An Õ(n3) time deterministic O(log n) approximation
algorithm for the quantum hypergraph covering problem.
- An alternative proof of a result of Aaronson that the
gamma-fat-shattering dimension of quantum states on n qubits
is O(n/gamma2).
Using our framework for the classical Multiplicative Weights
Update method, we derive the following algorithmic
applications:
- Fast algorithms for approximately solving several families
of semidefinite programs which beat interior point methods.
Our algorithms rely on eigenvector computations, which are
very efficient in practice compared to the Cholesky
decompositions needed by interior point methods. We also give
a matrix sparsification algorithm to speed up the eigenvector
computation using the Lanczos iteration.
- O(√log(n)) approximation to the Sparsest Cut and the
Balanced Separator problems in undirected weighted graphs in
Õ(n2) time by embedding expander flows in the graph. This
improves upon the previous Õ(n4.5) time algorithm of
Arora, Rao, and Vazirani, which was based on semidefinite
programming.
|
|
| A Combinatorial, Primal-Dual approach to Semidefinite
Programs |
2007 |
|
| With Sanjeev
Arora. To appear in 39th
STOC, 2007.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
|
Abstract
Semidefinite programs (SDP) have been used in many recent
approximation algorithms. We develop a general primal-dual
approach to solve SDPs using a generalization of the
well-known multiplicative weights update rule to symmetric
matrices. For a number of problems, such as Sparsest Cut
and Balanced Separator in undirected and directed
weighted graphs, and the Min UnCut problem, this yields
combinatorial approximation algorithms that are significantly
more efficient than interior point methods. The design of our
primal-dual algorithms is guided by a robust analysis of
rounding algorithms used to obtain integer solutions from
fractional ones. |
|
| Privacy, Accuracy, and Consistency Too:
A Holistic Solution to Contingency Table Release |
2007 |
|
| With Boaz
Barak, Kamalika
Chaudhuri, Cynthia
Dwork,
Frank
McSherry and
Kunal
Talwar. To appear in 26th PODS,
2007.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
Abstract
The contingency table is a work horse of official statistics,
the format of reported data for the US Census, Bureau of Labor
Statistics, and the Internal Revenue Service. In many settings
such as these privacy is not only ethically mandated, but
frequently legally as well. Consequently there is an extensive
and diverse literature dedicated to the problems of
statistical disclosure control in contingency table release.
However, all current techniques for reporting contingency
tables fall short on at least one of privacy, accuracy, and
consistency (among multiple released tables). We propose a
solution that provides strong guarantees for all three
desiderata simultaneously.
Our approach can be viewed as a special case of a more general
approach for producing synthetic data:
Any privacy-preserving mechanism for contingency table release
begins with raw data and produces a (possibly inconsistent)
privacy-preserving set of marginals. From these tables alone
-- and hence without weakening privacy -- we will find and
output the "nearest" consistent set of marginals.
Interestingly, this set is no farther than the tables of the
raw data, and consequently the additional error introduced by
the imposition of consistency is no more than the error
introduced by the privacy mechanism itself.
The privacy mechanism of Dwork, McSherry, Nissim, Smith '06 gives the strongest
known privacy guarantees, with very little error. Combined
with the techniques of the current paper, we therefore obtain
excellent privacy, accuracy, and consistency among the tables.
Moreover, our techniques are surprisingly efficient.
Our techniques apply equally well to the logical cousin of
the contingency table, the OLAP cube.
|
|
| A Fast Random Sampling Algorithm for Sparsifying Matrices |
2006 |
|
| With Sanjeev
Arora and Elad
Hazan. In APPROX-RANDOM
2006.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
Abstract
We describe a simple random-sampling based procedure
for producing sparse matrix approximations. Our procedure
and analysis are extremely simple: the analysis uses
nothing more than the Chernoff-Hoeffding bounds. Despite
the simplicity, the approximation is comparable and
sometimes better than previous work.
Our algorithm computes the sparse matrix approximation
in a single pass over the data. Further, most of the
entries in the output matrix are quantized, and can be
succinctly represented by a bit vector, thus leading to
much savings in space. |
|
| Efficient Aggregation Algorithms for Probabilistic Data |
2006 |
|
| With T. S.
Jayram and Erik Vee. To appear in SODA 2007.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
Abstract
We study the problem of computing aggregation operators on
probabilistic data in an I/O efficient manner. Algorithms for
aggregation operators such as SUM, COUNT, AVG, and MIN/MAX are
crucial to applications on probabilistic databases. We give a
generalization of the classical data stream model to handle
probabilistic data, called probabilistic streams, in
order to analyze the I/O-requirements of our algorithms.
Whereas the algorithms for SUM and COUNT turn out to be
simple, the problem is harder for both AVG and MIN/MAX.
Although data stream algorithms typically use randomness, all
of the algorithms we present are deterministic.
For MIN and MAX, we obtain efficient one-pass data stream
algorithms for estimating each of these quantities with
relative accuracy (1 + ε), using constant update time
per element and O((log R)/ε) space, where each element
has a value between 1 and R. For AVG, we present a new data
stream algorithm for estimating its value to a relative
accuracy (1 + &epsilon) in O(log n) passes over the data with
O((log2 n)/ε) space and update time O((log
n)/ε) per element. On the other hand, we prove a space
lower bound of Ω(n) for any exact one-pass
deterministic data stream algorithm. Complementing this
result, we also present an O(n(log2n))-time exact
deterministic algorithm which uses O(n) space (thus removing
the data-streaming restriction), improving dramatically on the
previous O(n3)-time algorithm. Our algorithms for
AVG involve a novel technique based on generating functions
and numerical integration, which may be of independent
interest.
Finally, we provide an experimental analysis and show that
our algorithms, coupled with additional heuristics, have
excellent performance over large data sets. |
|
| Logarithmic Regret Algorithms for Online Convex
Optimization |
2006 |
|
| With Elad
Hazan, Adam Kalai,
and Amit
Agarwal. In 19th
COLT, 2006.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
Abstract
In an online convex optimization problem a
decision-maker makes a sequence of decisions, i.e.,
chooses a sequence of points in Euclidean space, from a
fixed feasible set. After each point is chosen, it
encounters a sequence of (possibly unrelated) convex cost
functions. Zinkevich (2003) introduced this framework,
which models many natural repeated decision-making
problems and generalizes many existing problems such as
Prediction from Expert Advice and Cover's Universal
Portfolios. Zinkevich showed that a simple online gradient
descent algorithm achieves additive regret
O(√T), for an arbitrary sequence of T convex cost
functions (of bounded gradients), with respect to the best
single decision in hindsight.
In this paper, we give algorithms that achieve regret
O(log(T)) for an arbitrary sequence of strictly convex
functions (with bounded first and second derivatives).
This mirrors what has been done for the special cases of
prediction from expert advice by Kivinen and Warmuth
(1999), and Universal Portfolios by Cover (1991). We
propose several algorithms achieving logarithmic regret,
which besides being more general are also much more
efficient to implement.
The main new ideas give rise to an efficient algorithm
based on the Newton method for optimization, a new tool in
the field. Our analysis shows a surprising connection to
follow-the-leader method, and builds on the recent work of
Agarwal and Hazan (2005). We also analyze other
algorithms, which tie together several different previous
approaches including follow-the-leader, exponential
weighting, Cover's algorithm and gradient descent. |
|
| Algorithms for Portfolio Management based on the Newton
Method |
2006 |
|
| With Amit
Agarwal,
Elad
Hazan, and
Robert
Schapire. In 23rd
ICML, 2006.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
Abstract
We experimentally study on-line investment algorithms
first proposed by Agarwal and Hazan and extended by Hazan
et al. which achieve almost the same wealth as the best
constant-rebalanced portfolio determined in hindsight.
These algorithms are the first to combine optimal
logarithmic regret bounds with efficient deterministic
computability. They are based on the Newton method for
offline optimization which, unlike previous approaches,
exploits second order information. After analyzing the
algorithm using the potential function introduced by
Agarwal and Hazan, we present extensive experiments on
actual financial data. These experiments confirm the
theoretical advantage of our algorithms, which yield
higher returns and run considerably faster than previous
algorithms with optimal regret. Additionally, we perform
financial analysis using mean-variance calculations and
the Sharpe ratio. |
|
| A Variation on SVD Based Image Compression |
2006 |
|
| With Abhiram
Ranade and Srikanth S. M. In
WCVGIP, 2006. Journal paper in Image
and Vision Computing 25(6): 771-777 (2007).
|
![[ps]](images/pdf2.gif) |
Abstract
We present a variation to the well studied SVD based image
compression technique. Our variation can be viewed as a
preprocessing step in which the input image is permuted as
per a fixed, data independent permutation, after which it
is fed to the standard SVD algorithm. Likewise, our
decompression algorithm can be viewed as the standard SVD
algorithm followed by a postprocessing step which applies
the inverse permutation.
On experimenting with standard images we show that our
method performs substantially better than the standard
method. Typically, for any given compression quality, our
method needs about 30% fewer singular values and vectors to
be retained. We also present a bit allocation scheme and
show that our method also performs better than the more
familiar Discrete Cosine Transform (DCT).
We show that the original SVD algorithm as well as our
variation, can be viewed as instances of the Karhunen-
Loeve transform (KLT). In fact, we observe that there is a
whole family of variations possible by choosing different
parameter values while applying the KLT. We present
heuristic arguments to show that our variation is likely
to yield the best compression of all these. We also
present experimental evidence which appears to justify our
analysis. |
|
| Fast Algorithms for Approximate Semidefinite Programming
using the Multiplicative Weights Update method |
2005 |
|
| With Sanjeev
Arora and Elad
Hazan. In
Proc. 46th FOCS, 339-348, 2005.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
Abstract
Semidefinite programming (SDP) relaxations appear in
many recent approximation algorithms but the only general
technique for solving such SDP relaxations is via interior
point methods. We use a Lagrangian-relaxation based
technique (modified from the papers of Plotkin, Shmoys,
and Tardos (PST), and Klein and Lu) to derive faster
algorithms for approximately solving several families of
SDP relaxations. The algorithms are based upon some
improvements to the PST ideas --which lead to new results
even for their framework-- as well as improvements in
approximate eigenvalue computations by using random
sampling. |
|
| The Multiplicative Weights Update Method: A
Meta-Algorithm and Applications |
2005 |
|
| With Sanjeev
Arora and Elad
Hazan. Survey paper draft.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
Abstract
Algorithms in varied fields use the idea of maintaining a
distribution over a certain set and use the
multiplicative update rule to iteratively change these
weights. Their analysis are usually very similar and rely
on an exponential potential function.
We present a simple meta algorithm that unifies these
disparate algorithms and derives them as simple
instantiations of the meta algorithm. |
|
| Analysis and Algorithms for Content-based Event Matching |
2005 |
|
| With
Elad Hazan,
Fengyun Cao and
Jaswinder
Pal Singh. In 4th
DEBS, 2005.
|
![[ps]](images/pdf2.gif) |
Abstract
Content-based event matching is an important
problem in large-scale event-based publish/subscribe
systems. However, open questions remain in analysis
of its difficulty and evaluation of its solutions. This
paper makes a few contributions toward analysis,
evaluation and development of matching algorithms.
First, based on a simplified yet generic model, we give
a formal proof of hardness of matching problem by
showing its equivalence to the notoriously hard Partial
Match problem. Second, we compare two major
existing matching approaches and show that countingbased
algorithms are likely to be more computationally
expensive than tree-based algorithms in this model.
Third, we observe an important, prevalent
characteristic of real-world publish/subscribe events,
and develop a new matching algorithm called
RAPIDMatch to exploit it. Finally, we propose a new
metric for evaluation of matching algorithms. We
analyze and evaluate RAPIDMatch using both the
traditional and new metrics proposed. Results show
that RAPIDMatch achieves large performance
improvement over the tree-based algorithm under
certain publish/subscribe scenarios. |
|
| O(√log n) approximation to SPARSEST CUT
in Õ(n²) time |
2004 |
|
| With Sanjeev
Arora and Elad
Hazan. In Proc.
45th FOCS, 238-247, 2004.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
Abstract
We show how to compute O(√log
n)-approximations to SPARSEST CUT and BALANCED
SEPARATOR problems in Õ(n²) time,
thus improving upon the recent algorithm of Arora, Rao and
Vazirani (2004). Their algorithm uses semidefinite programming
and required Õ(n4.5) time. Our
algorithm relies on efficiently finding expander flows in the
graph and does not solve semidefinite programs. The existence
of expander flows was also established by Arora, Rao, and
Vazirani. |
|
| Approximating Quadratic Programs with Positive Semidefinite
Constraints |
2004 |
|
| With Elad
Hazan.
Princeton Tech Report TR-746-06.
|
![[ps]](images/ps2.gif) ![[ps]](images/pdf2.gif) |
Abstract
We describe a polynomial time approximation algorithm to the
problem of maximizing a quadratic form subject to quadratic
constraints specified by PSD matrices. A special case, that has
applications for clustering, is optimizing quadratic forms over
the unit cube.
Approximation algorithms with similar guarantees are known, and
there is evidence that this factor is optimal. The following
analysis is particularly simple. |
|
|
|