# Colloquium

## Stochastic Models on Networks: Reconstruction and Optimization

## Democratizing content distribution

## Evolutionary Escape on Fitness Landscapes

## Object Recognition and Feedforward models of the Ventral Stream in Visual Cortex: What is Next?

Understanding the processing of information in our cortex is a significant part of understanding how the brain works and of understanding intelligence itself, arguably one of the greatest problems in science today. In particular, our visual abilities are computationally amazing and we are still far from imitating them with computers. Thus, visual cortex may well be a good proxy for the rest of the cortex and indeed for intelligence itself. But despite enormous progress in the physiology and anatomy of the visual cortex, our understanding of the underlying computations remains fragmentary.

I will briefly review hierarchical feedforward quantitative models of the ventral stream which, heavily constrained by physiology and biophysics, are surprisingly successful in explaining a range of physiological data and psychophysical results in rapid scene categorization. I will then focus on the limitations of such models for object recognition, suggesting specific questions about the computational role of attention and about recognition tasks beyond scene classification.
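The core motif of such hierarchical feedforward models is an alternation of template-matching ("tuning") and max-pooling ("invariance") stages. The following is a minimal illustrative sketch of that motif only, not the speaker's model; the layer names, toy templates, and pooling size are all assumptions made for the example.

```python
import numpy as np

def s_layer(image, templates):
    """Tuning stage: correlate each template at every valid position."""
    th, tw = templates.shape[1:]
    H = image.shape[0] - th + 1
    W = image.shape[1] - tw + 1
    out = np.empty((len(templates), H, W))
    for k, t in enumerate(templates):
        for i in range(H):
            for j in range(W):
                patch = image[i:i+th, j:j+tw]
                out[k, i, j] = np.sum(patch * t)   # simple linear tuning
    return out

def c_layer(responses, pool=2):
    """Invariance stage: max-pool each channel over local position."""
    K, H, W = responses.shape
    out = np.empty((K, H // pool, W // pool))
    for k in range(K):
        for i in range(H // pool):
            for j in range(W // pool):
                out[k, i, j] = responses[k, i*pool:(i+1)*pool,
                                            j*pool:(j+1)*pool].max()
    return out

# Toy input: a vertical bar, probed with vertical- and horizontal-edge templates.
img = np.zeros((8, 8)); img[:, 4] = 1.0
templates = np.stack([
    np.array([[-1., 1.], [-1., 1.]]),   # vertical edge detector
    np.array([[-1., -1.], [1., 1.]]),   # horizontal edge detector
])
features = c_layer(s_layer(img, templates))
print(features.shape)   # (2, 3, 3): position-tolerant responses per template
```

The max operation is what buys tolerance to small shifts of the bar: the vertical-edge channel responds strongly wherever the bar lands within a pooling window, while the horizontal-edge channel stays silent.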

## The Limits of Quantum Computers

First, I'll show that any quantum algorithm to decide whether a function f:[n]->[n] is one-to-one or two-to-one needs to query the function at least n^{1/5} times. This provides strong evidence that collision-resistant hash functions, and hence secure electronic commerce, would still be possible in a world with quantum computers.
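For contrast, the classical complexity of this collision problem is Θ(√n) queries via the birthday paradox. Below is a hedged sketch of that classical birthday-style test (purely illustrative; the sample-count constant and the toy functions are assumptions, not from the talk).

```python
import random

def is_two_to_one(f, n, trials=None):
    """Heuristic birthday test: sample random inputs and look for a collision.

    If f is two-to-one, about sqrt(n) samples find a collision with high
    probability; if f is one-to-one, no collision can ever appear.
    """
    if trials is None:
        trials = 10 * int(n ** 0.5) + 10
    seen = {}
    for _ in range(trials):
        x = random.randrange(n)
        y = f(x)                      # one query to the black box
        if y in seen and seen[y] != x:
            return True               # collision witnessed: two-to-one
        seen[y] = x
    return False                      # no collision: presumably one-to-one

random.seed(0)
n = 1024
one_to_one = lambda x: (7 * x + 3) % n   # a permutation of [n]
two_to_one = lambda x: x // 2            # every output value is hit twice
print(is_two_to_one(one_to_one, n))      # False
print(is_two_to_one(two_to_one, n))      # True (with overwhelming probability)
```

The talk's lower bound says a quantum algorithm cannot do radically better than this kind of search: it still needs n^{1/5} queries.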

Second, I'll show that in the "black-box" or "oracle" model that we know how to analyze, quantum computers could not solve NP-complete problems in polynomial time, even with the help of nonuniform "quantum advice states."

Third, I'll show that quantum computers need exponential time to find local optima -- and surprisingly, that the ideas used to prove this result also yield new classical lower bounds for the same problem.

Finally, I'll show how to do "pretty-good quantum state tomography" using a number of measurements that increases only linearly, not exponentially, with the number of qubits. This illustrates how one can sometimes turn the limitations of quantum computers on their head, and use them to develop new techniques for experimentalists.

No quantum computing background will be assumed.

## Learning Predictive Models of Gene Regulation

Instead of adopting the structure learning viewpoint, our focus is to build predictive models of gene regulation that allow us both to make accurate quantitative predictions on new or held-out experiments (test data) and to capture mechanistic information about transcriptional regulation. Our algorithm, called MEDUSA, integrates promoter sequence, mRNA expression, and transcription factor occupancy data to learn gene regulatory programs that predict the differential expression of target genes. Instead of using clustering or correlation of expression profiles to infer regulatory relationships, the algorithm learns to predict up/down expression of target genes by identifying condition-specific regulators and discovering regulatory motifs that may mediate their regulation of targets.

We use boosting, a technique from statistical learning, to help avoid overfitting as the algorithm searches through the high dimensional space of potential regulators and sequence motifs. We will report computational results on the yeast environmental stress response, where MEDUSA achieves high prediction accuracy on held-out experiments and retrieves key stress-related transcriptional regulators, signal transducers, and transcription factor binding sites.

We will also describe recent results on hypoxia in yeast, where we used MEDUSA to propose the first global model of the oxygen sensing and regulatory network, including new putative context-specific regulators. Through our experimental collaborator on this project, the Zhang Lab at Columbia University, we are in the process of validating our computational predictions with wet lab experiments, with encouraging preliminary results.
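To make the boosting idea concrete, here is a generic AdaBoost-style sketch with one-feature decision stumps predicting up/down labels from binary features. It is not MEDUSA; the data, the feature semantics (e.g. "motif present"), and all parameters are illustrative assumptions.

```python
import numpy as np

def fit_stumps(X, y, rounds=5):
    """AdaBoost with one-feature decision stumps.

    X: (examples, features) in {0,1}, e.g. motif presence / regulator state.
    y: labels in {-1,+1}, e.g. target gene down/up in a condition.
    """
    n, d = X.shape
    w = np.full(n, 1.0 / n)            # example weights
    ensemble = []                      # (feature, polarity, alpha) triples
    for _ in range(rounds):
        best = None
        for j in range(d):
            for pol in (1, -1):
                pred = pol * (2 * X[:, j] - 1)    # map feature to {-1,+1}
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, pol)
        err, j, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = pol * (2 * X[:, j] - 1)
        w *= np.exp(-alpha * y * pred)            # reweight toward mistakes
        w /= w.sum()
        ensemble.append((j, pol, alpha))
    return ensemble

def predict(ensemble, X):
    score = sum(a * pol * (2 * X[:, j] - 1) for j, pol, a in ensemble)
    return np.sign(score)

# Toy data: feature 0 is a noisy "motif" that drives up-regulation.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 5))
y = np.where(X[:, 0] == 1, 1, -1)
flip = rng.random(200) < 0.1
y[flip] *= -1                                     # 10% label noise
model = fit_stumps(X, y)
acc = (predict(model, X) == y).mean()
print(round(acc, 2))
```

The reweighting step is the overfitting control mentioned above: each round concentrates on the examples the current ensemble still gets wrong, rather than fitting all features at once.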

## Looking at People

We describe methods to determine what people are doing in a video sequence.

Tracking is hard because it is important to track relatively small structures that can move relatively fast, for example, lower arms. I will describe research into kinematic tracking (tracking that reports the kinematic configuration of the body) that has resulted in a fairly accurate, fully automatic tracker that can keep track of multiple people.

Once one has tracked the body, one must interpret the results. One way to do so is to have a motion synthesis system that takes the track and produces a motion that is (a) like a human motion and (b) close to the track. Our work has produced a high-quality motion synthesis system that can produce motions that look very much like human activities. I will describe work that couples that system with a tracker to produce a description of the activities, entirely automatically.

It is often very difficult to obtain a rich set of examples of activities, because activities appear to compose in complex ways. I will describe methods to extract a representation of activity that can be searched with a text-like query.

I will speculate on some of the many open problems: What should one report? How do nearby objects affect one's interpretation of activities? How can one interpret patterns of behaviour?
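Trackers of the kind described above typically combine a motion model with noisy per-frame measurements. As background only (this is not the speaker's tracker), here is a minimal 1-D constant-velocity Kalman filter sketch; all noise parameters are illustrative assumptions.

```python
import numpy as np

def kalman_track(measurements, dt=1.0, q=1e-3, r=0.25):
    """Track position from noisy 1-D measurements, constant-velocity model."""
    F = np.array([[1.0, dt], [0.0, 1.0]])     # state transition: (pos, vel)
    H = np.array([[1.0, 0.0]])                # we observe position only
    Q = q * np.eye(2)                         # process noise covariance
    R = np.array([[r]])                       # measurement noise covariance
    x = np.array([[measurements[0]], [0.0]])  # initial state estimate
    P = np.eye(2)
    estimates = []
    for z in measurements:
        x = F @ x                             # predict
        P = F @ P @ F.T + Q
        S = H @ P @ H.T + R                   # update with the measurement
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (np.array([[z]]) - H @ x)
        P = (np.eye(2) - K @ H) @ P
        estimates.append(float(x[0, 0]))
    return estimates

# A limb joint moving at constant velocity, observed with noise.
rng = np.random.default_rng(1)
true_pos = np.arange(50) * 0.5
noisy = true_pos + rng.normal(0, 0.5, size=50)
est = kalman_track(noisy)
err_est = np.mean(np.abs(np.array(est[25:]) - true_pos[25:]))
err_raw = np.mean(np.abs(noisy[25:] - true_pos[25:]))
print(err_est < err_raw)    # the motion model smooths out measurement noise
```

The "small, fast structures" problem is precisely where such models are stressed: a lower arm can violate the constant-velocity assumption between frames, which is why full kinematic trackers are much more elaborate than this sketch.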

## Reach for A*: Shortest Paths on Road Networks

Third, we add shortcuts to the graph to reduce the number of highway (i.e., non-local) edges. Our best query algorithm is thousands of times faster than Dijkstra's algorithm on the road networks of the United States and Western Europe, each with roughly 20 million vertices. This is joint work with Andrew Goldberg, Chris Harrelson, and Haim Kaplan.
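The baseline these speedups are measured against is plain Dijkstra's algorithm. A minimal sketch of that baseline follows (the toy graph is an assumption for illustration); reach pruning, A* with lower bounds, and shortcut edges all aim to settle far fewer vertices than this search does.

```python
import heapq

def dijkstra(graph, source, target):
    """Plain Dijkstra on an adjacency list {u: [(v, weight), ...]}.

    Hierarchical techniques (reach, A*, shortcuts) prune this search;
    this baseline explores vertices in order of distance from the source.
    """
    dist = {source: 0.0}
    pq = [(0.0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue                          # stale queue entry
        for v, w in graph.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")

# Tiny road-like graph: a direct "highway" edge a->d competes with local roads.
graph = {
    "a": [("b", 2.0), ("d", 7.0)],
    "b": [("c", 2.0)],
    "c": [("d", 2.0)],
}
print(dijkstra(graph, "a", "d"))   # 6.0: a->b->c->d beats the direct 7.0 edge
```

On a 20-million-vertex road network this baseline settles millions of vertices per query, which is what makes thousandfold speedups possible for the pruned searches.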

## Models and Algorithms for Genomic Sequences, Proteins, and Networks of Protein Interactions

Algorithms for biological sequence analysis. One of the most fruitful developments in bioinformatics in the past decade was the wide adoption of Hidden Markov Models (HMMs) and related graphical models in an array of applications such as gene finding, sequence alignment, and non-coding RNA folding. Conditional Random Fields (CRFs) are a recent alternative to HMMs, and provide two main advantages:
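As background for the HMM applications mentioned above, here is a minimal Viterbi decoding sketch on a toy gene-finding-flavored model. The two states, transition probabilities, and GC-biased emissions are illustrative assumptions, not a real gene finder.

```python
import numpy as np

def viterbi(obs, states, start, trans, emit):
    """Most likely hidden state path for an observation sequence (log space)."""
    V = [{s: np.log(start[s]) + np.log(emit[s][obs[0]]) for s in states}]
    back = []
    for o in obs[1:]:
        col, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda p: V[-1][p] + np.log(trans[p][s]))
            col[s] = V[-1][prev] + np.log(trans[prev][s]) + np.log(emit[s][o])
            ptr[s] = prev
        V.append(col)
        back.append(ptr)
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for ptr in reversed(back):       # trace back pointers to recover the path
        path.append(ptr[path[-1]])
    return list(reversed(path))

# Toy model: GC-rich emissions hint at "coding" regions.
states = ("coding", "intergenic")
start = {"coding": 0.5, "intergenic": 0.5}
trans = {"coding": {"coding": 0.9, "intergenic": 0.1},
         "intergenic": {"coding": 0.1, "intergenic": 0.9}}
emit = {"coding": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
        "intergenic": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}}
seq = list("ATATGCGCGCATAT")
path = viterbi(seq, states, start, trans, emit)
print("".join("C" if s == "coding" else "I" for s in path))
```

The sticky self-transitions (0.9) are what keep the decoded labels in contiguous segments, flagging the GC-rich middle of the sequence as a coding stretch.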

Networks of protein interactions. Graphs that summarize pairwise interactions between all proteins of an organism have emerged as canonical data sets that can be constructed using multiple sources of functional genomic data. We construct protein interaction networks for all sequenced microbes by integrating information extracted from genomic sequences as well as microarrays and other predictors of pairwise interactions. We then align these networks in multiple species using Graemlin, a tool that we developed for that purpose, and search for modules (subgraphs) of proteins that exhibit homology as well as conservation of pairwise interactions among many organisms. Graemlin provides substantial speed and sensitivity gains compared to previous network alignment methods; it can be used to compare microbial networks at http://graemlin.stanford.edu.
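To illustrate the notion of conserved pairwise interactions that the alignment searches for (this sketch is not Graemlin; the toy networks and one-to-one orthology map are assumptions):

```python
def conserved_edges(net1, net2, ortholog):
    """Edges of net1 whose endpoint orthologs also interact in net2.

    net1, net2: sets of frozenset edges {u, v}; ortholog: map net1 -> net2.
    """
    out = set()
    for edge in net1:
        u, v = tuple(edge)
        if u in ortholog and v in ortholog:
            if frozenset((ortholog[u], ortholog[v])) in net2:
                out.add(edge)     # interaction conserved across species
    return out

# Two toy microbial networks sharing a conserved two-edge module.
net1 = {frozenset(p) for p in [("a1", "b1"), ("b1", "c1"), ("c1", "d1")]}
net2 = {frozenset(p) for p in [("a2", "b2"), ("b2", "c2")]}
ortholog = {"a1": "a2", "b1": "b2", "c1": "c2"}   # d1 has no ortholog
mod = conserved_edges(net1, net2, ortholog)
print(sorted(sorted(e) for e in mod))   # [['a1', 'b1'], ['b1', 'c1']]
```

Real network alignment is much harder than this pairwise check: homology is many-to-many and noisy, and modules must be scored across many species at once, which is where Graemlin's speed and sensitivity gains matter.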

## Leaving the Span

However, we discuss a simple sparse linear problem that is hard to learn with any algorithm that uses a linear combination of the embedded training instances as its weight vector, no matter what embedding is used. We show that these algorithms are inherently limited by the fact that after seeing *k* instances, only a weight space of dimension *k* can be spanned.
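The span property is easy to see for a concrete instance of this algorithm family. Below is an illustrative check (my example, not from the talk) that perceptron updates, which only ever add multiples of training instances, leave the weight vector inside the span of the *k* instances seen.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 10, 4                         # dimension 10, but only 4 examples seen
X = rng.normal(size=(k, d))
y = rng.choice([-1.0, 1.0], size=k)

# Perceptron: every update adds a multiple of a training instance,
# so w always stays in span(x_1, ..., x_k).
w = np.zeros(d)
for _ in range(20):
    for x, label in zip(X, y):
        if label * (w @ x) <= 0:
            w += label * x

# Verify: w is (numerically) a linear combination of the rows of X.
coeffs, residual, rank, _ = np.linalg.lstsq(X.T, w, rcond=None)
print(np.allclose(X.T @ coeffs, w))  # w lies in the k-dimensional span
```

The hardness result says this is not an artifact of the perceptron: any span-based algorithm, with any embedding, inherits the same k-dimensional bottleneck.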

Surprisingly, the same problem can be efficiently learned using the exponentiated gradient (EG) algorithm: now the component-wise logarithms of the weights are essentially a linear combination of the training instances. This algorithm enforces "additional constraints" on the weights (all must be non-negative and sum to one), and in some cases these constraints alone force the rank of the weight space to grow as fast as *2^k*.
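A minimal EG sketch makes the multiplicative structure visible: each step multiplies the weights by an exponentiated gradient and renormalizes onto the simplex. The toy regression target and learning rate below are my assumptions for illustration.

```python
import numpy as np

def eg_step(w, x, y, eta=0.1):
    """Exponentiated gradient step for squared loss, w kept on the simplex."""
    y_hat = w @ x
    grad = 2.0 * (y_hat - y) * x        # gradient of (w.x - y)^2 w.r.t. w
    # Note: grad is a scalar multiple of x, so log(w) accumulates a linear
    # combination of the instances (up to the normalization term).
    w = w * np.exp(-eta * grad)         # multiplicative update
    return w / w.sum()                  # non-negative, sums to one

# Sparse target: y = x[0], i.e. the weight vector (1, 0, ..., 0).
rng = np.random.default_rng(2)
d = 20
w = np.full(d, 1.0 / d)                 # uniform start on the simplex
for _ in range(1000):
    x = rng.normal(size=d)
    w = eg_step(w, x, x[0])
print(round(float(w[0]), 2))            # weight concentrates on coordinate 0
```

Sparse targets like this are exactly where EG's simplex constraints pay off relative to span-based additive updates.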

(Joint work with S.V.N. Vishwanathan.)