Computation, Machine Learning, and Biomarker Discovery in High-Throughput Proteomics

Mass spectrometry (MS) is a simple, 1930s-era technology for separating molecules by mass. This old technology has been extended to exciting new
uses in separating and identifying components of biological mixtures. The latest protocols promise to be able to detect and quantify thousands of
proteins, peptides, metabolites, and other small molecules in a single experiment. The data volume produced by such experiments is enormous and
poses many computational challenges. In the first part of my talk, I will provide an overview of the current protocols for MS proteomics and highlight
the computational challenges. I will focus on the details of one small step of MS analysis, feature detection and quantification, and describe our
current model-based approach.

The first part of the talk will establish that MS has clear applications in preclinical biomarker discovery. In the second part, I will discuss why the
direct use of MS for clinical diagnostics is more controversial. Preliminary studies have demonstrated that machine learning approaches coupled with
mass spectrometry can detect cancer with uncanny accuracy from simple blood samples, but thus far the FDA has refused to approve this use of the
technology. Armed with the background presented in the first part of the talk, we will examine both sides of the controversy, and audience members
can decide for themselves whether they would trust MS to perform a critical diagnostic test.