Lecture 6. Pairwise Local Alignment and Database Search

Csc 487/687 Computing for bioinformatics. Homology Search. Given a sequence q, does there exist a sequence d in a database D such that q and d are homologous?

salim

Presentation Transcript


  1. Lecture 6. Pairwise Local Alignment and Database Search • Csc 487/687 Computing for bioinformatics

  2. Homology Search • Given a sequence q, does there exist a sequence d in a database D such that q and d are homologous? • We could perform a global pairwise alignment between q and each sequence in D, but: • Maybe only a segment of q is highly (beyond random) similar to a segment of a database sequence • Remote homology – only a motif is conserved • Sequence/domain rearrangements – the sequences are not globally homologous, but share a domain • Local alignment (alignment of a segment of q with a segment of d) is therefore desirable

  3. Homology Search – Task • Present all sequences in D that have segments homologous to segments in q • Avoid presenting sequences in D that are not homologous • For each local alignment – estimate the probability that the alignment is "random", i.e., arose by chance rather than from an evolutionary relationship

  4. Definitions • Segment – contiguous subsequence (substring) of q or d • Segment pair – pair of segments, one from q and one from d (need not be of the same length) • Local alignment – alignment of a segment pair

  5. Dot Plot – Visualising Similarity • For sequences q (length m) and d (length n), construct an m × n matrix • Make a dot in cell (i,j) if $q_i = d_j$ • The matrix can be filtered to reduce noise • E.g., use a window of length K – make a dot in (i,j) only if at least C% of the characters match between the K-windows around (i,j)
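The dot-plot construction and the window filter can be sketched in a few lines of Python. This is an illustrative sketch, not the tool used in the lecture: the function names `dot_plot` and `render` are made up here, and the K-window is taken to start at (i,j) rather than being centred on it, which is an assumption since the slide only says "around".

```python
def dot_plot(q, d, k=1, min_pct=100.0):
    """Return the set of 0-based cells (i, j) that receive a dot.

    A dot is placed at (i, j) when at least min_pct percent of the
    positions match between the length-k windows starting at q[i]
    and d[j]. With k=1 this is the unfiltered plot (dot iff q[i] == d[j]).
    """
    dots = set()
    for i in range(len(q) - k + 1):
        for j in range(len(d) - k + 1):
            matches = sum(q[i + t] == d[j + t] for t in range(k))
            if 100.0 * matches >= min_pct * k:
                dots.add((i, j))
    return dots


def render(q, d, dots):
    """Draw the plot as text: rows follow q, columns follow d."""
    lines = ["  " + d]
    for i, ch in enumerate(q):
        row = "".join("*" if (i, j) in dots else "." for j in range(len(d)))
        lines.append(ch + " " + row)
    return "\n".join(lines)
```

Calling `print(render(q, d, dot_plot(q, d, k=8)))` on a sequence compared with itself shows the main diagonal plus off-diagonal runs wherever a repeat occurs. The double loop visits every cell, so the cost grows with m × n × K.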

  6. Dot Plots are Easy to Interpret • Can identify, for instance, repeats • Example: • Human HPRT gene (genomic sequence) • Dot if 8 identical bases • http://www.ansorge-group.embl.de/geneskipper/dotplot.htm

  7. Dynamic Programming for Local Alignment (Smith & Waterman 1981) • Assumptions • The scoring matrix has "negative expectation" • Gaps should decrease the alignment score (as before) • Consequences: • A subalignment with negative score coming first (prefix) or last (suffix) can be removed to improve the alignment score • Gaps should not be included unless the alignments on either side score high enough to make up for the gap penalty
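The "negative expectation" assumption can be checked numerically: the expected score of aligning two residues drawn independently from the background distribution should be below zero. The sketch below assumes uniform DNA base frequencies and simple match/mismatch scores. Both are illustrative choices rather than values from the lecture, and the helper names are hypothetical.

```python
def expected_score(freqs, score):
    """Expected score of aligning two independently drawn residues."""
    return sum(pa * pb * score(a, b)
               for a, pa in freqs.items()
               for b, pb in freqs.items())


def match_mismatch(match, mismatch):
    """Build a simple scoring function: match if equal, else mismatch."""
    return lambda a, b: match if a == b else mismatch


# Assumed background: all four bases equally likely.
uniform_dna = {base: 0.25 for base in "ACGT"}

# +1/-1 has negative expectation: 0.25 * 1 + 0.75 * (-1) = -0.5
ok = expected_score(uniform_dna, match_mismatch(1, -1))

# +5/-1 does not: 0.25 * 5 + 0.75 * (-1) = +0.5
bad = expected_score(uniform_dna, match_mismatch(5, -1))
```

With a positive expectation, as in the second case, random sequences would tend to accumulate score as the alignment grows, so the best "local" alignment would tend to stretch across the whole sequences.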

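  8. Recurrence relation • $H(i,j) = \max\{\, 0,\; H(i-1,j-1) + s(q_i, d_j),\; H(i-1,j) - g,\; H(i,j-1) - g \,\}$, where $s$ is the scoring matrix and $g > 0$ is the gap penalty (a linear gap cost is assumed here) • The four options correspond to: the empty alignment; $q_{1..i-1}$ aligned with $d_{1..j-1}$, followed by $q_i$ over $d_j$; $q_{1..i-1}$ aligned with $d_{1..j}$, followed by $q_i$ over a gap; $q_{1..i}$ aligned with $d_{1..j-1}$, followed by a gap over $d_j$ • The empty-alignment option (0) effectively allows for removal of negatively contributing prefixes.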

  9. Initialization – Removing Initial Gaps • Initial gaps – in either sequence – should be ignored, so they carry no penalty: $H(i,0) = H(0,j) = 0$ for all i and j
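The initialization and the recurrence together determine the whole score matrix. A minimal sketch follows, assuming a linear gap penalty and match/mismatch scores of +2/-1 with gap cost 2. These parameter values and the name `fill_matrix` are illustrative, not taken from the lecture. Each cell takes the maximum of 0 (empty alignment), the diagonal neighbour plus the substitution score, and the upper and left neighbours minus the gap penalty.

```python
def fill_matrix(q, d, match=2, mismatch=-1, gap=2):
    """Fill the local-alignment score matrix H for sequences q and d.

    H has (len(q) + 1) rows and (len(d) + 1) columns. Row 0 and
    column 0 stay at 0, so initial gaps in either sequence are free.
    """
    m, n = len(q), len(d)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if q[i - 1] == d[j - 1] else mismatch
            H[i][j] = max(
                0,                    # empty alignment: drop a negative prefix
                H[i - 1][j - 1] + s,  # q_i aligned with d_j
                H[i - 1][j] - gap,    # q_i aligned with a gap
                H[i][j - 1] - gap,    # d_j aligned with a gap
            )
    return H
```

Because of the 0 option no cell is ever negative, which is what distinguishes this matrix from the global-alignment one.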

  10. The Best Local Alignment • Should ignore negatively contributing suffixes of alignments • Score of the best local alignment – the highest value in the dynamic programming matrix • The alignment is found by tracing back from the maximum value until a cell with value 0 (zero) is reached
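Putting the pieces together, the best local alignment can be recovered by remembering where the maximum occurs and tracing back until a 0 cell is reached. The sketch below is one possible implementation under assumed parameters (linear gap penalty, match +2, mismatch -1, gap 2). The function name is hypothetical, and when several tracebacks are equally good it simply prefers the diagonal move.

```python
def smith_waterman(q, d, match=2, mismatch=-1, gap=2):
    """Return (score, aligned_q, aligned_d) for one best local alignment."""
    m, n = len(q), len(d)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix and track the highest-scoring cell.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if q[i - 1] == d[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,
                          H[i - 1][j] - gap,
                          H[i][j - 1] - gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the maximum until a cell with value 0 is reached.
    i, j = best_pos
    top, bottom = [], []
    while H[i][j] > 0:
        s = match if q[i - 1] == d[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            top.append(q[i - 1])
            bottom.append(d[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] - gap:
            top.append(q[i - 1])
            bottom.append("-")
            i -= 1
        else:  # the value must have come from the left neighbour
            top.append("-")
            bottom.append(d[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))
```

For example, `smith_waterman("TTACGTT", "GGACGGG")` picks out the shared segment `ACG` and ignores the unrelated flanks. Starting at the maximum discards any negatively contributing suffix, and stopping at 0 discards the corresponding prefix.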

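  11. Calculating Best Local Alignment • [Figure: the H matrix, showing which rule is used to fill the first row, which is used to fill the first column, and which is used to fill the rest row by row; a 0 cell, the best alignment, and the score of the best alignment are marked.]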

  12. Time Complexity • Sequences of lengths n and m: O(nm) • Two sequences of length l: O(l^2)
