|
-
Self-improving Algorithms for Convex Hulls
           
abstract   bibtex
With K. Clarkson and
W. Mulzer, SODA 2010
We describe an algorithm for computing
planar convex hulls in the self-improving
model: given a sequence $I_1, I_2, \ldots$ of planar $n$-point sets,
the upper convex hull $conv(I)$ of each set $I$ is desired.
We assume that there exists a probability distribution $D$ on $n$-point
sets, such that the inputs $I_j$ are drawn independently
according to $D$. Furthermore, $D$ is such that the individual points
are distributed independently of each other. In other words, the $i$'th point
is distributed according to $D_i$. The $D_i$'s can be arbitrary
but are independent of each other.
The distribution
$D$ is not known to the algorithm in advance.
After a learning phase of $n^\eps$ rounds,
the expected time to compute $conv(I)$ is
$O(n + H(conv(I)))$. Here, $H(conv(I))$ is
the entropy of the output, which is
a lower bound for the expected running time
of \emph{any} algebraic computation tree that computes
the convex hull. (More precisely, $H(conv(I))$ is
the minimum entropy
of any random variable that maps $I$ to a description of
$conv(I)$ and to a labeling scheme that proves nonextremality
for every point in $I$ not on the hull.)
Our algorithm is thus asymptotically optimal
for $D$.
@inproceedings{ClaMS10,
author = {K. Clarkson and W. Mulzer and C. Seshadhri},
title = {Self-improving Algorithms for Convex Hulls},
booktitle = {Proceedings of the 21st ACM-SIAM Symposium on Discrete Algorithms (SODA)},
year = {2010},
pages = {}
}
-
Efficient learning algorithms for changing environments
        abstract     bibtex
   
full (ECCC TR07-088)
With E. Hazan, ICML 2009
We study online learning in an oblivious changing environment. The standard measure of regret bounds the difference between the cost of the online learner and the best decision in hindsight. Hence, regret minimizing algorithms tend to converge to the static best optimum, clearly a suboptimal behavior in changing environments. On the other hand, various metrics proposed to strengthen regret and allow for more dynamic algorithms produce inefficient algorithms.
We propose a different performance metric which strengthens the standard metric of regret and measures performance with respect to a changing comparator. We then describe a series of data-streaming-based reductions which transform algorithms for minimizing (standard) regret into adaptive algorithms albeit incurring only poly-logarithmic computational overhead.
Using this reduction, we obtain {\it efficient} low adaptive-regret algorithms for the problem of online convex optimization.
This can be applied to various learning scenarios, i.e. online portfolio selection, for which we describe experimental results showing the advantage of
adaptivity.
@inproceedings{HazS09,
author = {E. Hazan and
C. Seshadhri},
title = {Efficient learning algorithms for changing environments},
booktitle = {Proceedings of the 26th Annual International Conference on Machine Learning (ICML)},
year = {2009},
pages = {50},
}
-
An Almost Optimal Rank Bound for Depth-3 Identities
        abstract     bibtex
   
full (ECCC TR08-108)
With N. Saxena, CCC 2009
We show that the rank of a depth-$3$ circuit (over any field) that is simple,
minimal and zero is at most $O(k^3\log d)$. The previous best rank bound known was
$2^{O(k^2)}(\log d)^{k-2}$ by Dvir and Shpilka (STOC 2005).
This almost resolves the rank question first posed by Dvir and Shpilka
(as we also provide a simple and minimal identity of rank $\Omega(k\log d)$).
Our rank bound significantly improves (dependence on $k$ exponentially reduced)
the best known deterministic black-box identity tests for depth-$3$ circuits
by Karnin and Shpilka (CCC 2008). Our techniques also shed light
on the factorization pattern of nonzero depth-$3$ circuits, most
strikingly: the rank of linear factors of a simple, minimal and nonzero depth-$3$ circuit
(over any field) is at most $O(k^3\log d)$.
The novel feature of this work is a new notion of maps between sets
of linear forms, called \emph{ideal
matchings}, used to study depth-$3$ circuits. We prove interesting structural results about
depth-$3$ identities using these techniques. We believe that these can lead
to the goal of a deterministic polynomial time identity test for these circuits.
@inproceedings{SaxS09,
author = {N. Saxena and C. Seshadhri},
title = {An Almost Optimal Rank Bound for Depth-3 Identities},
booktitle = {Proceedings of the 24th Annual Conference on Computational Complexity (CCC)},
year = {2009},
pages = {137--148}
}
-
Noise Tolerance of Expanders and Sublinear Expander Reconstruction
        abstract     bibtex
    full
With S. Kale and
Y. Peres, FOCS 2008
We consider the problem of \emph{online sublinear expander reconstruction} and
its relation to random walks in ``noisy" expanders. Given access to an
adjacency list representation of a bounded-degree graph $G$, we want to convert
this graph into a bounded-degree expander $G'$ changing $G$ as little as
possible. The graph $G'$ will be output by a \emph{distributed filter}: this is
a sublinear time procedure that given a query vertex, outputs all its neighbors
in $G'$, and can do so even in a distributed manner, ensuring consistency in
all the answers.
One of the main tools in our analysis is a result on the behavior of random
walks in graph that are almost expanders: graphs that are formed by arbitrarily
connecting a small unknown graph (the noise) to a large expander. We show that
a random walk from almost any vertex in the expander part will have fast mixing
properties, in the general setting of irreducible finite Markov chains. We also
design sublinear time procedures to distinguish vertices of the expander part
from those in the noise part, and use this procedure in the reconstruction
algorithm.
@inproceedings{KalPS08,
author = {S. Kale and
Y. Peres and
C. Seshadhri},
title = {Noise Tolerance of Expanders and Sublinear Expander Reconstruction},
booktitle = {Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS)},
year = {2008},
pages = {719-728},
}
-
Testing Expansion in Bounded Degree Graphs
        abstract     bibtex
    full (ECCC TR07-076)
With S. Kale, ICALP 2008
We consider the problem of testing graph expansion (either vertex or edge) in
the bounded degree model of Goldreich and Ron. We give a property tester that given a
graph with degree bound $d$, an expansion bound $\alpha$, and a parameter
$\epsilon > 0$, accepts the graph with high probability if its expansion is
more than $\alpha$, and rejects it with high probability if it is
$\epsilon$-far from any graph with expansion $\alpha'$ with degree bound $d$,
where $\alpha' < \alpha$ is a function of $\alpha$. For edge expansion, we
obtain $\alpha' = \Omega(\frac{\alpha^2}{d})$, and for vertex expansion, we
obtain $\alpha' = \Omega(\frac{\alpha^2}{d^2})$. In either case, the algorithm
runs in time $\tilde{O}(\frac{n^{(1+\mu)/2}d^2}{\epsilon\alpha^2})$ for any
given constant $\mu > 0$.
@inproceedings{KalS08,
author = {S. Kale and C. Seshadhri},
title = {Testing Expansion in Bounded Degree Graphs},
booktitle = {In proceedings of the 35th International Colloquium on Automata, Languages and Programming (ICALP(I))},
year = {2008}
pages = {527--538}
}
-
Self-Improving Algorithms for Delaunay Triangulations
        abstract     bibtex
    full (arXiv:0907.0884)
With K. Clarkson, SOCG
2008
(In the version submitted to a journal, this paper has been merged
with "Self-improving algorithms".)
We study the problem of two-dimensional Delaunay triangulation
in the self-improving algorithms model. We assume that
the $n$ points of the input each come from an independent, unknown, and arbitrary
distribution. The first phase of our algorithm builds data structures that
store relevant information about the input distribution. The second phase uses
these data structures to efficiently compute the Delaunay triangulation
of the input. The running time of our algorithm matches the information-theoretic
lower bound for the given input distribution, implying that if the input
distribution has low entropy, then our algorithm beats the standard $\Omega(n\log n)$ bound
for computing Delaunay triangulations.
Our algorithm uses a host of techniques: $\epsilon$-net construction for disks, optimal
expected-case point-location structures, ideas from randomized incremental construction,
linear-time Delaunay splitting algorithms, and information-theoretic arguments.
@inproceedings{ClaS08,
author = {K. L. Clarkson and
C. Seshadhri},
title = {Self-improving algorithms for delaunay triangulations},
booktitle = {Proceedings of the 24th Annual Symposium on Computational Geometry (SOCG)},
year = {2008},
pages = {148-155},
}
-
Parallel Monotonicity Reconstruction
        abstract     bibtex
    full
With M. Saks, SODA 2008
(In the version submitted to a journal, we
have renamed the paper as "Local Monotonicity Reconstruction".)
We investigate the problem of monotonicity reconstruction, as defined
by Ailon et al (ISAAC 2004), in a distributed setting. We have oracle access
to a nonnegative real-valued
function $f$ defined on domain $[n]^d=\{1,\ldots,n\}^d$
($d$ is considered to be a constant).
We would like to closely approximate $f$
by a monotone function $g$.
This should be done by a procedure (a {\em filter}) that given as input
a point $x \in [n]^d$ outputs the value of $g(x)$, and runs in
time that is highly sublinear in $n$. The procedure
can (indeed must) be randomized, but we require that all of the
randomness be specified in advance by a single short random seed.
We construct such an implementation where the
time and space per query is $(\log n)^{O(1)}$ and the size of
the seed is polynomial in $\log n$ and $d$.
Furthermore the distance of the approximating function $g$ from $f$
is at most a constant (depending on $d$) multiple of the minimum distance of
any monotone function from $f$.
This
allows for a distributed implementation: one can initialize many copies
of the filter with the same short random seed, and they
can autonomously handle queries, while producing outputs that
are consistent with the same approximating function $g$.
@inproceedings{SakS08,
author = {M. E. Saks and
C. Seshadhri},
title = {Parallel monotonicity reconstruction},
booktitle = {Proceedings of the 19th Annual Symposium on Discrete Algorithms (SODA)},
year = {2008},
pages = {962-971},
}
-
Online Geometric Reconstruction
        abstract     bibtex
    full
With B. Chazelle, SOCG 2006
We investigate a new class of geometric problems based on the idea
of online error correction.
Suppose one is given access to a large geometric dataset
though a query mechanism; for example, the dataset could
be a terrain and a query might ask for the coordinates
of a particular vertex or for the edges incident to it.
Suppose, in addition, that the dataset satisfies some known structural property
${\cal P}$ (eg, monotonicity or convexity) but that, because of errors and noise,
the queries occasionally provide answers that violate ${\cal P}$.
Can one design a filter that modifies the query's answers so that
(i) the output satisfies ${\cal P}$; (ii) the amount of data
modification is minimized?
We provide upper and lower bounds on the complexity
of online reconstruction for convexity in 2D and 3D.
@inproceedings{ChaS06,
author = {B. Chazelle and
C. Seshadhri},
title = {Online geometric reconstruction},
booktitle = {Proceedings of the 22nd ACM Symposium on Computational Geometry (SOCG)},
year = {2006},
pages = {386-394},
}
-
Self-Improving Algorithms
        abstract     bibtex
With N. Ailon,
B. Chazelle,
and D. Liu, SODA 2006
We investigate ways in which an algorithm can improve
its expected performance by fine-tuning itself
automatically with respect to an {\em arbitrary, unknown} input distribution.
We give such {\em self-improving} algorithms
for sorting and clustering. The highlights of this work:
(i) a sorting algorithm with optimal expected limiting running time;
and (ii) a k-median algorithm over the Hamming cube with
linear expected limiting running time.
In all cases, the algorithm begins with a learning phase
during which it adjusts itself to the input distribution (typically
in a logarithmic number of rounds), followed
by a stationary regime in which the algorithm settles to its
optimized incarnation.
@inproceedings{DBLP:conf/soda/AilonCCL06,
author = {N. Ailon and
B. Chazelle and
S. Comandur and
D. Liu},
title = {Self-improving algorithms},
booktitle = {Proceedings of the 17th Annual Symposium on Discrete Algorithms (SODA)},
year = {2006},
pages = {261-270},
}
-
RAM Simulation of
BGS model of abstract state machines
        abstract     bibtex
With A. Seth and
S. Biswas, ASM 2005
We show in this paper that the BGS model of abstract state
machines can be simulated by random access machines with at most a
polynomial time overhead. This result is already stated in [BGS99] with
a very brief proof sketch. The present paper gives a detailed proof of the
result. We represent hereditarily finite sets, which are the typical BGS
ASM objects, by membership graphs of the transitive closure of the sets.
Testing for equality between BGS objects can be done in linear time in
our representation.
@inproceedings{ComSB05,
author = {S. Comandur and
A. Seth and
S. Biswas},
title = {RAM Simulation of BGS Model of abstract State Machines},
booktitle = {Proceedings of the 12th International Workshop on abstract State Machines (ASM)},
year = {2005},
pages = {377-386},
}
-
Property-Preserving Data Reconstruction
        abstract     bibtex (conf)
  bibtex (journal)
With N. Ailon,
B. Chazelle,
and D. Liu, ISAAC 2004
(Journal version in Algorithmica 2008)
We initiate a new line of investigation into
{\em online property-preserving data reconstruction.}
Consider a dataset which is assumed to
satisfy various (known) structural properties; eg, it may consist
of sorted numbers, or points on a manifold, or vectors in a polyhedral cone,
or codewords from an error-correcting code. Because of noise and errors,
however, an (unknown) fraction of
the data is deemed {\em unsound}, ie, in violation with
the expected structural properties.
Can one still query into the dataset in an online fashion
and be provided data that is {\em always} sound?
In other words, can one design a filter which,
when given a query to any item $I$ in the dataset,
returns a {\em sound} item $J$ that, although not necessarily
in the dataset, differs from $I$ as infrequently as possible.
No preprocessing should be allowed and queries
should be answered online.
We consider the case of a monotone function.
Specifically, the dataset encodes
a function $f:\{1,\ldots, n\}\,\mapsto {\bf R}$
that is at (unknown) distance $\eps$ from monotone,
meaning that $f$ can---and must---be modified
at $\eps n$ places to become monotone.
Our main result is a randomized filter
that can answer any query in $O(\log^2 n\log\log n)$ time
while modifying the function $f$ at only $O(\eps n)$ places.
The amortized time over $n$ function evaluations is $O(\log n)$.
%With probability arbitrarily close to 1,
%the filter never errs at any query.
The filter works as stated with probability arbitrarily close
to $1$.
We provide an alternative filter with $O(\log n)$ worst case
query time and $O(\eps n \log n)$ function modifications. For reconstructing
$d$-dimensional monotone functions of the form $f:\{1,\ldots, n\}^d\,\mapsto {\bf R}$,
we present a filter that takes $(2^{O(d)}(\log n)^{4d-2}\log\log n)$
time per query and modifies at most $O(\eps n^d)$ function values (for constant $d$).
@inproceedings{AilCCL042,
author = {N. Ailon and
B. Chazelle and
S. Comandur and
D. Liu},
title = {Property-Preserving Data Reconstruction},
booktitle = {Proceedings of the 15th International Symposium on Algorithms and Computation (ISAAC)},
year = {2004},
pages = {16-27},
}
@article{AilCCL042,
author = {N. Ailon and
B. Chazelle and
S. Comandur and
D. Liu},
title = {Property-Preserving Data Reconstruction},
journal = {Algorithmica},
volume = {51},
number = {2},
year = {2008},
pages = {160-182},
}
-
Estimating the Distance to a Monotone Function
        abstract   bibtex (conf)
  bibtex (journal)
With N. Ailon,
B. Chazelle,
and D. Liu, RANDOM 2004
  (Journal version in Random Structures and Algorithms 2007)
In standard property testing, the task is to
distinguish between objects that have a property $\mathcal{P}$
and those that are $\eps$-far from $\mathcal{P}$, for some
$\eps > 0$. In this setting, it is perfectly acceptable
for the tester to provide a negative answer for every input object that
does not satisfy $\mathcal{P}$. This implies that
property testing in and of itself cannot be expected to
yield any information whatsoever about the distance
from the object to the property.
We address this problem in this paper, restricting our attention
to monotonicity testing.
A function $f:\{1,\ldots, n\}\,\mapsto {\bf R}$ is at distance
$\eps_f$ from being monotone if it can (and must) be modified
at $\eps_f n$ places to become monotone.
For any fixed $\delta>0$, we compute, with
probability at least $2/3$,
an interval $[(1/2 -\delta)\eps, \eps]$ that
encloses $\eps_f$.
The running time of our algorithm is $O( \eps_f^{-1}\log\log \eps_f^{-1}
\log n )$, which
is optimal within a factor of $\log\log \eps_f^{-1}$ and
represents a substantial improvement over previous work.
We give a second algorithm with an expected running time of
$O( \eps_f^{-1}\log n \log\log\log n)$. Finally, we extend
our results to multivariate functions.
@inproceedings{AilCCL041,
author = {N. Ailon and
B. Chazelle and
S. Comandur and
D. Liu},
title = {Estimating the Distance to a Monotone Function},
booktitle = {APPROX-RANDOM},
year = {2004},
pages = {229-236},
}
@article{AilCCL041,
author = {N. Ailon and
B. Chazelle and
S. Comandur and
D. Liu},
title = {Estimating the distance to a monotone function},
journal = {Random Struct. Algorithms},
volume = {31},
number = {3},
year = {2007},
pages = {371-383},
}
|