Laxmi Parida

The problem is motivated by the need to accurately estimate the statistical significance of large gene clusters along a chromosome. Consider the scenario where common gene clusters are extracted from multiple species. For a cluster C, let S be the set of all possible subclusters of C observed across all of its occurrences. One traditional way of computing the cluster probability is to ignore S and simply use the individual gene probabilities in C. The effectiveness of this simple model is unclear when the number of genes is very large and the number of paralogs of each gene is small. In this talk, we address such a scenario with a model that is cognizant of S, which arguably gives a better estimate of the cluster probability. However, solving this problem requires estimating a function, P-arrangement(k), which we introduce in order to understand the combinatorics of the clusters.

The first part of the talk will provide the background and the use of permutations in a biological setting. In the second part, we introduce a combinatorial structure known as the PQ-tree, which is used in conjunction with P-arrangements to compute the cluster probabilities.

Statistical Significance of Large Gene Clusters, Laxmi Parida, Journal of Computational Biology, 14(9), pp 1145-1159, 2007.
Gapped Permutation Pattern Discovery for Gene Order Comparisons, Laxmi Parida, Journal of Computational Biology, 14(1), pp 46-56, 2007.
Using PQ Structures for Genomic Rearrangement Phylogeny, Laxmi Parida, Journal of Computational Biology, 13(10), pp 1685-1700, 2006.
Using Permutation Patterns for Content-Based Phylogeny, Enam Karim, Laxmi Parida, Arun Lakhotia, Pattern Recognition in Bioinformatics, LNBI 4146, pp 115-125, 2006.
A PQ Framework for Reconstructions of Common Ancestors & Phylogeny, Laxmi Parida, Proceedings of RECOMB-CG, Comparative Genomics, LNBI 4205, pp 141-155, 2006.
Gene Proximity Analysis Across Whole Genomes via PQ Trees, G M Landau, L Parida, O Weimann, Journal of Computational Biology, 12(10), pp 1289-1306, 2005.
Malware Phylogeny Generation Using Permutations of Code, M E Karim, A Walenstein, A Lakhotia, L Parida, Journal in Computer Virology, 2005.
Permutation Pattern Discovery in Biosequences, R Eres, G M Landau, L Parida, Journal of Computational Biology, 11(6), pp 1050-1060, 2004.