Josh Forman

DNA, the genetic material, encodes the sequence information needed to make all of the proteins within the cell, as well as short functional motifs that mark, for instance, the beginning and end of genes or the binding sites for transcription factors. With the exception of splicing signals, functional DNA sequence motifs have largely been found outside the sequences that code for amino acids. This makes sense: non-coding DNA is significantly more mutable than coding DNA and should therefore be more amenable to evolving regulatory binding sites. Despite this apparent limitation, we developed an algorithm to mine coding regions for unexpectedly high levels of conservation, and found an excess of sequences conserved at the nucleotide level within coding regions of the human genome. We estimate that hundreds of genes contain sequences hidden by this kind of "genetic steganography", in which the gene itself carries an important message hidden within another, much larger and more recognizable message. I will present unpublished details of our findings that demonstrate an important biological role for some of these conserved sequences.
