Combinatorial Code Analysis for Understanding Biological Regulation
An important mechanism for achieving regulatory specificity in diverse
biological processes is the combinatorial interplay between
different regulators, such as amongst transcription factors (TFs) during
transcriptional regulation or between RNA-binding proteins (RBPs) and
microRNAs (miRNAs) during transcript degradation control. To advance our
understanding of combinatorial regulation, we developed a computational
pipeline called CCAT (Combinatorial Code Analysis Tool) for predicting
genome-wide co-binding between biological regulators.
In the first part of this thesis, we applied CCAT to the D. melanogaster
genome to uncover cooperativity amongst TFs during embryo development.
Using publicly available TF binding specificity data and DNaseI
chromatin accessibility data, we first predicted genome-wide binding
sites for 324 TFs across five stages of D. melanogaster embryo
development. We then applied CCAT in each of these developmental stages,
and identified from 20 to 60 pairs of TFs in each stage whose predicted
binding sites are significantly co-localized. Several of the co-binding
pairs we found correspond to TFs that are known to work together.
Further, pairs of binding sites predicted to cooperate were
consistently enriched for evolutionary conservation and for overlap
with regions bound in relevant ChIP experiments.
Finally, we found that TFs tend to be co-localized with other TFs in a
dynamic manner across developmental stages.
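As an illustration of what "significantly co-localized" can mean in practice, the sketch below scores one TF pair with a one-sided hypergeometric test over accessible regions. This is a minimal sketch under stated assumptions, not necessarily the statistic CCAT uses: the choice of test, the use of whole regions as the counting unit, and the names hypergeom_sf and colocalization_pvalue are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric(N items, K marked, n drawn)."""
    upper = min(K, n)
    total = comb(N, n)
    # Sum the upper tail exactly with integer arithmetic, then divide once.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def colocalization_pvalue(regions_a, regions_b, n_regions):
    """Score co-occurrence of two TFs across accessible regions.

    regions_a / regions_b: sets of region IDs containing a predicted site
    for TF A / TF B (hypothetical inputs); n_regions: total regions tested.
    Returns (number of shared regions, one-sided p-value).
    """
    overlap = len(regions_a & regions_b)
    p = hypergeom_sf(overlap, n_regions, len(regions_a), len(regions_b))
    return overlap, p


if __name__ == "__main__":
    shared, p = colocalization_pvalue({1, 2, 3, 4}, {3, 4, 5}, n_regions=10)
    print(shared, p)
```

When many TF pairs are scored per stage, the resulting p-values would also need a multiple-testing correction before a pair is called significant.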
In the second part of this thesis, we applied CCAT to explore whether
RBPs and miRNAs cooperate to promote transcript decay. We concentrated
on five highly conserved RBP motifs in human 3' UTRs. A specific group of
miRNA recognition sites was enriched within 50 nt of the RBP
recognition sites for PUM and UAUUUAU. The presence of both a PUM
recognition site and a recognition site for preferentially co-occurring
miRNAs was associated with faster decay of the associated transcripts.
For PUM and its co-occurring miRNAs, binding of the RBP to its
recognition sites was predicted to release nearby miRNA recognition
sites from RNA secondary structures. Overall, our CCAT analyses suggest
that a specific set of RBPs and miRNAs work together to affect
transcript decay, with the release of miRNA recognition sites via RBP
binding as one possible model of cooperativity.
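The proximity criterion above (miRNA recognition sites within 50 nt of an RBP recognition site) can be sketched as a simple window search along one 3' UTR. This is an illustrative sketch only: it assumes each site is represented by a single integer coordinate and that distance is the absolute difference between coordinates, and the function name mirna_sites_near_rbp is hypothetical rather than part of CCAT.

```python
from bisect import bisect_left, bisect_right


def mirna_sites_near_rbp(rbp_positions, mirna_positions, window=50):
    """Return the miRNA site positions lying within `window` nt of any RBP site.

    Positions are integer coordinates along a single transcript
    (an assumption made for this sketch).
    """
    rbp_sorted = sorted(rbp_positions)
    nearby = []
    for pos in mirna_positions:
        # Index range of RBP sites falling in [pos - window, pos + window].
        lo = bisect_left(rbp_sorted, pos - window)
        hi = bisect_right(rbp_sorted, pos + window)
        if hi > lo:
            nearby.append(pos)
    return nearby


if __name__ == "__main__":
    print(mirna_sites_near_rbp([100, 400], [30, 50, 149, 150, 151, 300, 450]))
```

Testing for enrichment would additionally require comparing these counts against a background expectation, for example from shuffled site positions or control motifs; that step is not shown here.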
Our pipeline provides a general tool for identifying combinatorial
cooperativity in biological regulation. All generated data as well as
source code are available at: http://cat.princeton.edu.