-
Find the optimal global alignment(s) between CGATTC and GAATTC using
a scoring system which assigns
-
1 for a match
-1 for a mismatch
-2 for a gap
-
1 for a match
-1 for a mismatch
0 for a gap
-
Find the best local alignment between CGCAGCATTCTTT and TCCATGCTTATGCG
using the standard scoring system used in the textbook. Also turn
in the scoring matrix produced by the algorithm.
-
Theoretical Exercise (11.41 from Dan Gusfield;'s book)
The tRNA folding problem:
The following is a crude version of a problem that arises in predicting
the secondary structure of transfer RNA molecules. Let S be a string
of n characters over the RNA alphabet A, C, U, G. We define
a pairing as a set of disjoint pairs of characters in S. A
pairing is called proper if it only contains (A, U) or (C, G) pairs.
This constraint arises because in RNA the (A, U) and (C, G) are pairs of
complementary nucleotides. If we draw S as a circular string, we
define a nested pairing as a proper pairing where each pair in the
pairing is connected by a line inside the circle, and where the lines do
not cross each other. (A figure will be attached). The problem is
to find a nested pairing of largest cardinality. Show how to solve
this version of the tRNA folding problem in O(n^3) using dynamic programming.
We now modify the problem by adding weights to the objective function
so that the weight of an (A, U) pair is different than the weight of a
(C, G) pair. The goal is to find a nested pairing of maximum total
weight. Give an efficient algorithm for this weighted problem.
-
Affine gap penalty functions.
For biology majors:
Follow the link to the Smith-Waterman
Alignment Engine.
Usage Help is
available.
Experiment with various gap penalty functions and try to derive some
relationships between the funcitons used and the resulting alignment.
For computer science majors:
Write a C-function to perform a global alignment between two sequences
using an affine gap penalty function. Print the resulting matrix
as well as all the optimal alignments to stdout.
void align(char *seq1,
char *seq2,
char *matrix_file,
int gap_open_penalty,
int gap_extension_penalty);
-
What changes would you need to make to the local alignment algorithms if
the gaps in s were penalized differently from gaps in t?
Suppose
-
gaps in s are penalized by -2 and in t by -1
-
gaps in t are penalized by -2 and in s by -1
-
Multiple Alignments:
For this problem, use the sum-of-pairs metric and follow the ``once
a
gap, always a gap'' rule. Consider the following three sequences.
(1) ACGTC
(2) TCCT
(3) ACGTCCT
-
Compute all three optimal pairwise alignments assuming a cost of 2 for
each deletion and 3 for each substitution. Give the cost of each alignment.
-
Compute a progressive multiple alignment starting with the pairwise alignment
(1,3). Now use the pairwise alignment (2,3) to merge sequence 2 into the
multiple alignment. Show the resulting alignment and give its cost.
-
Repeat problem 2, but this time use the pairwise alignment (1,2) to merge
sequence 2 into the multiple alignment. Show the resulting alignment and
give its cost. Are the two alignments the same? Which has a lower cost?
-
What is the optimal multiple alignment?
-
Suppose you charge a cost of 1 for each deletion and 1 for each substitution.
What is the optimal alignment? Is it unique?