Currently, I am working on algorithms for the large-scale, query-supervised integration of biological datasets, specifically gene expression datasets that are deposited in GEO.
We want to leverage the massive human expression compendium to investigate functional relationships between genes that participate in the same pathway or share a common function.
I am also interested in developing software and web tools for fast visualization of expression datasets, online clustering of datasets to identify up- and down-regulated conditions, and integration of expression data with text metadata to enable hypothesis testing about human diseases and disease subtypes.
My colleagues and I have developed SEEK, a web-based search system for the heterogeneous human expression compendium.
SEEK enables biologists to search for genes that are co-expressed with a set of genes of interest. Using this system, biologists can identify candidate pathway genes and pathway regulators, assign new functions to genes, and discover new interactions between genes.
SEEK also provides online visualizations of multiple expression datasets.
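
To make the idea of a co-expression search concrete, here is a minimal sketch in Python. It is not SEEK's actual algorithm: it assumes a single toy dataset and ranks genes by their average Pearson correlation with the query genes. The function names (`pearson`, `rank_coexpressed`) and the toy gene names and values are hypothetical, chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def rank_coexpressed(expression, query):
    """Rank non-query genes by mean correlation with the query genes.

    expression: dict mapping gene name -> list of values across samples
                (one dataset; an assumption made for this sketch).
    query:      list of gene names of interest.
    """
    query_present = [g for g in query if g in expression]
    if not query_present:
        return []
    scores = {}
    for gene, profile in expression.items():
        if gene in query_present:
            continue  # do not rank the query genes themselves
        total = sum(pearson(profile, expression[q]) for q in query_present)
        scores[gene] = total / len(query_present)
    # Highest average correlation first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: made-up values, four samples per gene
toy = {
    "Q1": [1, 2, 3, 4],
    "Q2": [2, 4, 6, 8],
    "A": [1.1, 2.0, 2.9, 4.2],  # tracks the query genes closely
    "B": [4, 3, 2, 1],          # moves opposite to the query genes
    "C": [1, 3, 1, 3],          # only weakly related
}
ranked = rank_coexpressed(toy, ["Q1", "Q2"])
for gene, score in ranked:
    print(f"{gene}\t{score:.3f}")
```

A search over a heterogeneous compendium would also have to decide how much each dataset contributes for a given query; this sketch leaves that out entirely.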