Visualization and analysis of genomic data.
Computer Science 597F
AdTopCS: Visualization & Analysis of large-scale genomic data sets
Course Summary
The goal of this course is to introduce students to computational issues
involved in the analysis and display of large-scale biological data sets. Algorithms
covered will include clustering and machine learning techniques for gene
expression and proteomics data analysis, biological networks, joint learning
from multiple data sources, and visualization issues for large-scale biological
data sets. No prior knowledge of biology or bioinformatics is required, and an
introduction to the field of bioinformatics and the nature of biological data
will be provided. In-depth knowledge of computer science is not required, but
some understanding of programming and computation will be helpful. The course
will be taught in a mixed lecture and seminar format and will involve
completing a project.
The course is open to graduate and advanced undergraduate students from all
departments.
Administrative Information
SIGNING UP: You should be able to sign up for
this course through SCORE using 25078 as the class number (the course is COS 597F).
Let Melissa Lawson know if this doesn't work.
Lectures: MW 1330-1450, Room: 301
Professor: Olga Troyanskaya - 204 CS Building - 258-1749
ogt@cs.princeton.edu (e-mail is the best way to contact)
Graduate Coordinator: Melissa Lawson - 310 CS Building - 258-5387
mml@cs.princeton.edu
Course Format & Grading
This course will cover the following topics: microarray analysis, data
integration, biological networks, and visualization of large-scale biological data.
The class will consist of a mixture of lectures, student presentations of
papers from the current literature, and discussions of these papers.
Students will also complete a team or individual project. The project
must have significant content related to the course, but it can
contribute to the student's current research and reflect the student's
computational background. For example, if you have a computational
background, you could implement and evaluate a machine learning method
for microarray data.
Grades will depend on class participation in discussions of assigned reading
(20%), presentations (35%), and the project (45%).
Books
There is no required book for this class. Readings will be based on
current literature. However, here are a few book recommendations for the
curious.
If you need to catch up on molecular biology and genetics:
DOE primer on human genetics
R. Brent. Genomic Biology. Cell 100:169-183, 2000.
L. Hunter. Molecular Biology for Computer Scientists. In Artificial Intelligence and Molecular Biology, L. Hunter editor, 1993, AAAI Press.
Introduction to bioinformatics:
P.L. Elkin. Primer on Medical Genomics Part V: Bioinformatics. In Mayo Clinic Proceedings.
NCBI bioinformatics primer
NCBI primer on microarray analysis
Approximate Schedule
Note: This schedule is approximate and may change.
Week (Sun-Sat): topic
Sep 14-20: introduction to biology, bioinformatics, data; first class
Sep 21-27: microarray analysis, types of experiments, databases
Sep 28 - Oct 4: microarray analysis
Oct 5-11: microarray analysis
Oct 12-18: proteomics
Oct 19-25: data integration
Oct 26 - Nov 1: fall break
Nov 2-8: data integration
Nov 9-15: biological networks
Nov 16-22: biological networks
Nov 23-29: Thanksgiving
Nov 30 - Dec 6: visualization
Dec 7-13: visualization; last class
Dec 14-20: winter break
Dec 21 - Jan 31: no entries
Slides
9/15 - Course details, molecular biology 101, challenges in functional genomics, intro to microarrays
9/17 - A (very) brief overview of database issues, data filtering, normalization, and clustering
Kai Li's guest lecture about visualization
Readings
NOTE: readings are listed for the date when they are DUE
(not the date on which they are assigned)
Each student presentation will be UNDER 30 minutes (INCLUDING questions), and there will be 2 student presentations
per class, followed by a 20-minute discussion. It is perfectly fine to prepare a presentation that takes 20 minutes; with questions,
you will probably take around 25-30 minutes anyway. Aim at a mixed audience, but explain methods in detail.
|
CLASS |
PAPERS |
PRESENTERS |
|
9/15 |
DOE "Genomics
and its impact on Science and Society"
R. Brent. "Genomic
Biology" |
lecture
|
|
9/17 |
Lockhart et al "Genomics,
gene expression, and DNA microarrays" (general microarray)
Kaminski N et al "Practical
approaches to analyzing results of microarray experiments" (review) |
lecture
|
|
9/22 |
CLASS CANCELLED |
|
|
9/24 |
Troyanskaya et al "Missing
value estimation for DNA microarrays" (low-level processing)
Yang et al "Normalization
for cDNA Microarray Data" (normalization/statistics-don't need to go
into very fine detail on statistical methods) |
Elena Nabieva
Tony Wirth |
|
9/29 |
Eisen et al "Cluster
analysis and display of genome-wide expression patterns"
(Clustering/biology)
Cheng, Y et al "Biclustering
of expression data" (Clustering) |
Jessica Fong
Jie Chen |
|
10/1 |
Brown, MPS et al "Knowledge-based
analysis of microarray gene expression data by using support vector machines"
(Data organization)
Raychaudhuri et al "The
computational analysis of scientific literature to define and recognize gene
expression clusters" (Data organization/some biology) |
Joseph Berillari
Kristina Rogale |
|
10/6 |
McShane LM et al "Methods
for assessing reproducibility of clustering patterns observed in analyses of
microarray data" (Evaluation)
Alter et al "Generalized
singular value decomposition for comparative analysis of genome-scale
expression data sets of two different organisms" (combining expression
data sets) |
Nathaniel Dirksen
Matthew Hibbs |
|
10/8 |
Dettling et al "Boosting
for tumor classification with gene expression data" (Classification)
Troyanskaya et al "Nonparametric
methods for identifying differentially expressed genes in microarray data"
(marker selection/evaluation) (optional)
Discussion of class projects. |
Andre Cavalcanti
Olga Troyanskaya |
|
10/15 |
Liu, X. et al "Bioprospector:
Discovering conserved DNA motifs in upstream regulatory regions of
co-expressed genes" (regulatory regions discovery)
Segal, E. et al "Module
networks: identifying regulatory modules and their condition-specific
regulators from gene expression data" (gene expression networks) |
Jordan Vance
Robert Osada |
|
10/17 @3pm |
Phizicky et al "Protein analysis
on a proteomic scale"
Eisenberg et al "Protein function in the post-genomic era" |
Nathaniel Dirksen
Joseph Berillari |
|
10/20 |
Edwards et al "Bridging
structural biology and genomics: assessing protein interaction data with
known complexes"
Tong et al "A combined
experimental and computational strategy to define protein interaction
networks for peptide recognition modules" |
K. Rogale
J. Fong |
|
10/22 |
Greenbaum et al "Interrelating
different types of genomic data, from proteome to secretome: 'oming in on
function"
Troyanskaya et al "A
Bayesian framework for combining heterogeneous data sources for gene
function prediction (in S. cerevisiae)" |
J. Vance
M. Hibbs |
|
11/3 |
Class projects proposal
presentations |
project groups |
|
11/5 |
Stuart et al "A
gene coexpression network for global discovery of conserved genetic modules"
Letovsky et al "Predicting
protein function from protein/protein interaction data: a probabilistic
approach" |
Elena Nabieva
Andre Cavalcanti |
|
11/10 |
Smith et al. "Evaluating
functional network inference using simulations of complex biological systems"
Lanckriet et al "Kernel-based
data fusion and its application to protein function prediction in yeast" |
Robert Osada
Kristina Rogale |
|
11/12 |
Ihmels et al "Revealing
modular organization in the yeast transcriptional network"
Bar-Joseph et al "Computational
discovery of gene modules and regulatory networks" |
Jessica Fong
Elena Zaslavsky |
|
11/17 |
outside lecture - JP Singh |
JP Singh |
|
11/19 |
Project progress reports |
Matt & Nathaniel
Jordan Joe
Kristina Andre & Jie |
|
11/24 |
Sharan et al "Click
and Expander: a system for clustering and visualizing gene expression data"
Breitkreutz "Osprey: a network visualization
system" - this is a short paper, so the presenter should also download
the Osprey software and show us visualizations it is capable of, as well as
outline the limitations of the software (based on its use, not just the
paper) |
Jie Chen
Matt Hibbs |
|
11/26 |
Demir et al "Patika:
an integrated visual environment for collaborative construction and analysis
of cellular pathways"
Werner-Washburne et al "Comparative
Analysis of multiple genome-scale data sets" |
Nathaniel Dirksen
Joe Berillari |
|
12/1 |
outside lecture - Kai Li |
Kai Li |
|
12/3 |
Q&A about course and projects |
|
|
12/8 |
Final project presentations |
Kristina Andre & Jie |
|
12/10 |
Final project presentations |
Matt & Nathaniel
Jordan Joe |
|