COS 597c - Problem Set 1

Problem Set 1 - Alignments.

Find the optimal global alignment(s) between CGATTC and GAATTC using a scoring system which assigns

1 for a match

1 for a match
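
As a starting point for checking hand-computed answers, here is a minimal sketch of the global alignment (Needleman-Wunsch) score with a linear gap penalty. The function name global_score and its parameters are illustrative and not part of the assignment; the match, mismatch, and gap values are supplied by the caller and should be taken from the scoring system stated in the problem. It returns only the optimal score, not the alignments themselves.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Largest of three integers. */
static int max3(int a, int b, int c) {
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/* Optimal global alignment score of s and t (hypothetical helper).
   match, mismatch and gap are scores added per column, so penalties
   are passed as negative numbers. Returns INT_MIN if allocation fails. */
int global_score(const char *s, const char *t,
                 int match, int mismatch, int gap) {
    size_t n = strlen(s), m = strlen(t);
    int *prev = malloc((m + 1) * sizeof *prev);
    int *cur = malloc((m + 1) * sizeof *cur);
    if (!prev || !cur) { free(prev); free(cur); return INT_MIN; }

    /* Row 0: the empty prefix of s against prefixes of t (all gaps). */
    for (size_t j = 0; j <= m; j++) prev[j] = (int)j * gap;

    for (size_t i = 1; i <= n; i++) {
        cur[0] = (int)i * gap;  /* prefix of s against the empty prefix of t */
        for (size_t j = 1; j <= m; j++) {
            int diag = prev[j - 1] + (s[i - 1] == t[j - 1] ? match : mismatch);
            int up = prev[j] + gap;        /* s[i-1] aligned with a gap */
            int left = cur[j - 1] + gap;   /* t[j-1] aligned with a gap */
            cur[j] = max3(diag, up, left);
        }
        int *tmp = prev; prev = cur; cur = tmp;  /* reuse the two rows */
    }
    int score = prev[m];
    free(prev);
    free(cur);
    return score;
}
```

Only two rows are kept because the score alone is needed; recovering the optimal alignment(s) requires storing the full matrix and tracing back from the last cell.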

Find the best local alignment between CGCAGCATTCTTT and TCCATGCTTATGCG using the standard scoring system used in the textbook. Also turn in the scoring matrix produced by the algorithm.
Theoretical Exercise (11.41 from Dan Gusfield's book)

The tRNA folding problem

pairing

proper

nested

Affine gap penalty functions.

Smith-Waterman Alignment Engine.


Experiment with various gap penalty functions and try to derive some relationships between the functions used and the resulting alignments.

For computer science majors:
Write a C function to perform a global alignment between two sequences using an affine gap penalty function. Print the resulting matrix as well as all the optimal alignments to stdout.

     void align(char *seq1,
                char *seq2,
                char *matrix_file,
                int gap_open_penalty,
                int gap_extension_penalty);
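
The recurrences behind an affine-gap global alignment can be sketched as follows. This is not the required align function: it computes only the optimal score, replaces the substitution matrix file with simple match and mismatch parameters, and assumes that a gap of length k costs gap_open + (k - 1) * gap_ext with both penalties given as positive numbers. The name affine_global_score and these conventions are assumptions made for illustration.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define NEG_INF (INT_MIN / 2)  /* stands in for minus infinity without overflow */

static int max3(int a, int b, int c) {
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/* Three-state affine-gap recurrence (hypothetical helper).
   M: best score ending with s[i-1] aligned to t[j-1]
   X: best score ending with s[i-1] aligned to a gap
   Y: best score ending with t[j-1] aligned to a gap
   Returns NEG_INF if allocation fails. */
int affine_global_score(const char *s, const char *t, int match,
                        int mismatch, int gap_open, int gap_ext) {
    size_t n = strlen(s), m = strlen(t), w = m + 1;
    int *M = malloc(3 * (n + 1) * w * sizeof *M);
    if (!M) return NEG_INF;
    int *X = M + (n + 1) * w;
    int *Y = X + (n + 1) * w;

    M[0] = 0;
    X[0] = Y[0] = NEG_INF;
    for (size_t i = 1; i <= n; i++) {   /* first column: one long gap in t */
        M[i * w] = Y[i * w] = NEG_INF;
        X[i * w] = -(gap_open + (int)(i - 1) * gap_ext);
    }
    for (size_t j = 1; j <= m; j++) {   /* first row: one long gap in s */
        M[j] = X[j] = NEG_INF;
        Y[j] = -(gap_open + (int)(j - 1) * gap_ext);
    }

    for (size_t i = 1; i <= n; i++) {
        for (size_t j = 1; j <= m; j++) {
            size_t c = i * w + j, up = c - w, left = c - 1, diag = up - 1;
            int sub = (s[i - 1] == t[j - 1]) ? match : mismatch;
            M[c] = sub + max3(M[diag], X[diag], Y[diag]);
            /* Extend an existing gap, or open a new one from another state. */
            X[c] = max3(M[up] - gap_open, X[up] - gap_ext, Y[up] - gap_open);
            Y[c] = max3(M[left] - gap_open, Y[left] - gap_ext, X[left] - gap_open);
        }
    }
    size_t last = n * w + m;
    int best = max3(M[last], X[last], Y[last]);
    free(M);
    return best;
}
```

Keeping all three matrices in full, as done here, is what makes a traceback over every optimal path possible; printing the matrix and enumerating the alignments is left to the exercise.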

What changes would you need to make to the local alignment algorithms if the gaps in s were penalized differently from gaps in t?

gaps in s are penalized by -2 and in t by -1
gaps in t are penalized by -2 and in s by -1
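
One way to explore this question is to give the local alignment recurrence two separate gap scores. The sketch below is a Smith-Waterman score computation in which gap_in_s is added when a character of t is aligned with a gap in s, and gap_in_t when a character of s is aligned with a gap in t. The function name local_score and the match and mismatch parameters are assumptions for illustration, not the textbook's scoring system.

```c
#include <stdlib.h>
#include <string.h>

/* Best local alignment score with separate gap scores for s and t
   (hypothetical helper). All scores are added per column, so penalties
   are negative. Rows index s, columns index t.
   Returns -1 if allocation fails. */
int local_score(const char *s, const char *t, int match, int mismatch,
                int gap_in_s, int gap_in_t) {
    size_t n = strlen(s), m = strlen(t);
    int *prev = calloc(m + 1, sizeof *prev);  /* row 0 is all zeros */
    int *cur = calloc(m + 1, sizeof *cur);
    int best = 0;
    if (!prev || !cur) { free(prev); free(cur); return -1; }

    for (size_t i = 1; i <= n; i++) {
        cur[0] = 0;
        for (size_t j = 1; j <= m; j++) {
            int h = prev[j - 1] + (s[i - 1] == t[j - 1] ? match : mismatch);
            int up = prev[j] + gap_in_t;      /* s[i-1] against a gap in t */
            int left = cur[j - 1] + gap_in_s; /* t[j-1] against a gap in s */
            if (up > h) h = up;
            if (left > h) h = left;
            if (h < 0) h = 0;                 /* a local alignment may restart */
            cur[j] = h;
            if (h > best) best = h;
        }
        int *tmp = prev; prev = cur; cur = tmp;
    }
    free(prev);
    free(cur);
    return best;
}
```

Calling the function twice with the two gap scores swapped, for example local_score(s, t, 2, -3, -2, -1) and local_score(s, t, 2, -3, -1, -2), shows how the asymmetry affects the result for a given pair of sequences.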

Multiple Alignments:

(1) ACGTC
(2) TCCT
(3) ACGTCCT

Compute all three optimal pairwise alignments assuming a cost of 2 for each deletion and 3 for each substitution. Give the cost of each alignment.
Compute a progressive multiple alignment starting with the pairwise alignment (1,3). Now use the pairwise alignment (2,3) to merge sequence 2 into the multiple alignment. Show the resulting alignment and give its cost.
Repeat problem 2, but this time use the pairwise alignment (1,2) to merge sequence 2 into the multiple alignment. Show the resulting alignment and give its cost. Are the two alignments the same? Which has a lower cost?
What is the optimal multiple alignment?
Suppose you charge a cost of 1 for each deletion and 1 for each substitution. What is the optimal alignment? Is it unique?
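
The pairwise costs asked for above can be checked with a small minimum-cost alignment routine. The sketch below assumes that matches cost 0, that every inserted or deleted character costs indel, and that every substitution costs subst; the function name align_cost is illustrative. It covers only the pairwise step and does not attempt the progressive merge or the optimal multiple alignment.

```c
#include <stdlib.h>
#include <string.h>

static int min3(int a, int b, int c) {
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/* Minimum cost of a global pairwise alignment of a and b
   (hypothetical helper). Matches are free; indel and subst are
   non-negative costs. Returns -1 if allocation fails. */
int align_cost(const char *a, const char *b, int indel, int subst) {
    size_t n = strlen(a), m = strlen(b);
    int *prev = malloc((m + 1) * sizeof *prev);
    int *cur = malloc((m + 1) * sizeof *cur);
    if (!prev || !cur) { free(prev); free(cur); return -1; }

    /* Aligning the empty prefix of a with b[0..j) needs j insertions. */
    for (size_t j = 0; j <= m; j++) prev[j] = (int)j * indel;

    for (size_t i = 1; i <= n; i++) {
        cur[0] = (int)i * indel;  /* i deletions */
        for (size_t j = 1; j <= m; j++) {
            int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : subst);
            cur[j] = min3(diag, prev[j] + indel, cur[j - 1] + indel);
        }
        int *tmp = prev; prev = cur; cur = tmp;
    }
    int cost = prev[m];
    free(prev);
    free(cur);
    return cost;
}
```

For example, align_cost("ACGTC", "TCCT", 2, 3) gives the cost for the pair (1,2) under the first cost scheme, and passing 1 and 1 instead corresponds to the last question.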