Service-centric networking with Serval
Survey-based GWAS. Genome-wide association studies (GWAS) identify genetic variants that are associated with the occurrence of a complex phenotype or disease in a set of individuals. Many phenotypes are difficult to quantify with a single measure. I am building methods for conducting GWAS using survey data as the phenotype. Standard dimensionality reduction techniques are not effective for scaling down the size of the data because the resulting phenotype summaries were not interpretable. In prior work, we applied SFA and found that the sparse solution had phenotypic interpretations for all of the factors, and genetic associatons for a number of phenotypes. Our current work goes well beyond this model for greater robustness and inference of the number of factors from the underlyng data.
Prevalent computer architecture modeling methodologies are prone to error, make design-space exploration slow, and create barriers to collaboration. The Structural Modeling Project addresses these issues by providing viable structural modeling methodologies to the community. The Liberty Simulation Environment showcases this approach and serves as the core of a new international standardization effort called Fraternité.
We present a method that combines hand-drawn artwork with fluid simulations to produce animated fluids in the visual style of the artwork. Given a fluid simulation and a set of keyframes rendered by the artist in any medium, our system produces a set of in-betweens that visually matches the style of the keyframes and roughly follows the motion from the underlying simulation. Our method leverages recent advances in patch-based regenerative morphing and image melding to produce temporally coherent sequences with visual fidelity to the target medium. Because direct application of these methods results in motion that is generally not fluid-like, we adapt them to produce motion closely matching that of the underlying simulation. The resulting animation is visually and temporally coherent, stylistically consistent with the given keyframes, and approximately matches the motion from the simulation. We demonstrate the method with animations in a variety of visual styles.
In many modern optimization problems, particularly those arising in machine learning, the amount data is too large to apply standard convex optimization methods. We develope new optimization algorithms that make use of randomization to prune the data produce a correct solution albeit running in time which is smaller than the data representation, i.e. sublinear running time. Such sublinear-time algorithms are applied to linear clasiffication, training support vector machines, semidefinite optimization and other problems. These new algorithms are based on a primal-dual approach, and use a combination of novel sampling techniques and the randomized implementation of online learning algorithms. Results are many times accompanied by information-theoretic lower bounds that show our running times to be nearly best possible in the unit-cost RAM model.
Algorithmic verification techniques have made tremendous progress by leveraging advancements in decision procedures based on SAT/SMT solvers. The project aims to develop techniques that improve their scalability for program verification and synthesis, by combining deductive learning with learning on data and examples.
As chip densities and clock rates increase, processors are becoming more susceptible to error-inducing transient faults. In contrast to existing techniques, the THRIFT Project advocates adaptive approaches that match the changing reliability and performance demands of a system to improve reliability at lower cost. This project introduced the concept of software-controlled fault tolerance.
Understanding how eQTLs work by looking across eQTL studies, cell types, and regulatory element data
As part of the GTEx consortium, and in collaboration with Casey Brown, we have conducted large-scale replication studies across eleven studies in seven tissue types. We have overlaid these results onto regulatory element data to enable a much more profound mechanistic understanding of eQTL data by looking at where the eQTLs and also the cell type specific eQTLs are co-located with specific cis-regulatory elements.
We are currently developing statistical models for understanding eQTLs and variants that influence mRNA isoform levels in RNA-seq data. We are also working on predictive models for eQTLs across tissue types and models that consider replication in trans-eQTLs.
Untrusted cloud storage and social networks
The VELOCITY Compiler Project aims to address computer architecture problems with a new approach to compiler organization. This compiler organization, embodied in the VELOCITY Compiler (and derivative run-time optimizers), enables true whole-program scope, practical iterative compilation, and smarter memory analysis. These properties make VELOCITY better at extracting threads, improving reliability, and enhancing security.
The software toolchain includes static analyzers to check assertions about your program; optimizing compilers to translate your program to machine language; operating systems and libraries to supply context for your program. The Verified Software Toolchain project assures with machine-checked proofs that the assertions claimed at the top of the toolchain really hold in the machine-language program, running in the operating-system context.
We study vision science ‒ the computational principles underlying computer vision, robot vision, and human vision. We are interested in building computer systems that automatically understand visual scenes, both inferring the semantics and extracting 3D structure for a large variety of environments. Our research is also closely related to computer graphics, perception and cognition, cognitive neuroscience, machine learning, HCI, NLP and AI in general.
At the moment, we focus on leveraging Big 3D Data for Visual Scene Understanding (e.g. RGB-D sensors, CAD models, depth, multiple viewpoints, panoramic fields of view), to look for the right representations of visual scenes that realistically describe the world. We believe that it is critical to consider the role of a computer as an active explorer in a 3D world, and learn from rich 3D data that is close to the natural input that humans have.
WordNet is a resource used by researchers attempting to get computers to understand English (and any other language for which a WordNet exists).
Viewing a human language as a very large graph provides a theoretical framework for creating algorithms for understanding the meaning of words
in a text (e.g. determining if "fly" is an insect or a ball hit into left field) and translating documents between languages. We are working on both making WordNet more effective for these tasks and creating new approaches that use WordNet.