1 / 32

Bioinformatics 生物信息学理论和实践 唐继军 jtang@cse.sc 13928761660

Bioinformatics 生物信息学理论和实践 唐继军 jtang@cse.sc.edu 13928761660. #!/usr/bin/perl $DNA = 'ACGT'; # Next, we print the DNA onto the screen print $DNA, &quot;<br>&quot;; print '$DNA<br>'; print &quot;$DNA<br>&quot;; exit;. Do the Math (your 2nd Perl program). #!/usr/bin/perl print &quot; 4+5<br> &quot; ; print 4+5 , &quot; <br> &quot; ;

Download Presentation

Bioinformatics 生物信息学理论和实践 唐继军 jtang@cse.sc 13928761660

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Bioinformatics生物信息学理论和实践唐继军jtang@cse.sc.edu13928761660Bioinformatics生物信息学理论和实践唐继军jtang@cse.sc.edu13928761660

  2. #!/usr/bin/perl $DNA = 'ACGT'; # Next, we print the DNA onto the screen print $DNA, "\n"; print '$DNA\n'; print "$DNA\n"; exit;

  3. Do the Math (your 2nd Perl program) #!/usr/bin/perl print "4+5\n"; print 4+5 , "\n"; print "4+5=" , 4+5 , "\n"; [Note: use commas to separate multiple items in a print statement, whitespace is ignored]

  4. String Operations • Strings (text) in variables can be used for some math-like operations • Concatenate (join) use the dot . operator $seq1= "ACTG"; $seq2= "GGCTA"; $seq3= $seq1 . $seq2; print $seq3; ACTGGGCTA

  5. #!/usr/bin/perl –w $DNA = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC'; print "Here is the starting DNA:\n\n"; print "$DNA\n\n"; # Transcribe the DNA to RNA by substituting all T's with U's. $RNA = $DNA; $RNA =~ s/T/U/g; # Print the RNA onto the screen print "Here is the result of transcribing the DNA to RNA:\n\n"; print "$RNA\n"; # Exit the program. exit;

  6. #!/usr/bin/perl -w $DNA = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC'; print "$DNA\n\n"; $revcom = reverse $DNA; # See the text for a discussion of tr/// $revcom =~ tr/ACGTacgt/TGCAtgca/; # Print the reverse complement DNA onto the screen print "Here is the reverse complement DNA:\n\n"; print "$revcom\n"; exit;

  7. #!/usr/bin/perl -w $proteinfilename = 'NM_021964fragment.pep'; open(PROTEINFILE, $proteinfilename); # First line $protein = <PROTEINFILE>; print “\nHere is the first line of the protein file:\n\n”; print $protein; # Second line $protein = <PROTEINFILE>; print “\nHere is the second line of the protein file:\n\n”; print $protein; # Third line $protein = <PROTEINFILE>; print “\nHere is the third line of the protein file:\n\n”; print $protein; close PROTEINFILE; exit;

  8. #!/usr/bin/perl -w # The filename of the file containing the protein sequence data $proteinfilename = 'NM_021964fragment.pep'; # First we have to "open" the file open(PROTEINFILE, $proteinfilename); # Read the protein sequence data from the file, and store it # into the array variable @protein @protein = <PROTEINFILE>; # Print the protein onto the screen print @protein; # Close the file. close PROTEINFILE; exit;

  9. #!/usr/bin/perl -w # "scalar context" and "list context" @bases = ('A', 'C', 'G', 'T'); print "@bases\n"; $a = @bases; print $a, "\n"; ($a) = @bases; print $a, "\n"; exit;

  10. #!/usr/bin/perl -w # array indexing @bases = ('A', 'C', 'G', 'T'); print "@bases\n"; print $bases[0], "\n"; print $bases[1], "\n"; print $bases[2], "\n"; print $bases[3], "\n"; exit;

  11. String functions • Chomp • Length of a string • Substring

  12. #!/usr/bin/perl -w $proteinfilename = 'NM_021964fragment.pep'; unless ( open(PROTEINFILE, $proteinfilename) ) { print "Could not open file $proteinfilename!\n"; exit; } while( $protein = <PROTEINFILE> ) { print " #####Here is the next line of the file:\n"; print $protein; } # Close the file. close PROTEINFILE; exit;

  13. Comparison • String comparison (are they the same, > or <) • eq (equal ) • ne(not equal ) • ge(greater or equal ) • gt (greater than ) • lt(less than ) • le(less or equal )

  14. Conditions • if () {} • elsif() {} • else {}

  15. #!/usr/bin/perl –w $word = 'MNIDDKL'; if($word eq 'QSTVSGE') { print "QSTVSGE\n"; } elsif($word eq 'MRQQDMISHDEL') { print "MRQQDMISHDEL\n"; } elsif ( $word eq 'MNIDDKL' ) { print "MNIDDKL-the magic word!\n"; } else { print "Is \”$word\“ a peptide?\n"; } exit;

  16. More Conditions

  17. $x = 10; $y = -20; if ($x <= 10) { print "1st true\n";} if ($x > 10) {print "2nd true\n";} if ($x <= 10 || $y > -21) {print "3rd true\n";} if ($x > 5 && $y < 0) {print "4th true\n";} if (($x > 5 && $y < 0) || $y > 5) {print "5th true\n";}

  18. But • Use ==, <, <=, >, >=, !=, ||, && for numeric numbers • Use eq, lt, le, gt, ge, ne, or, and for string comparisons

  19. $x = 10; $y = -20; if ($x le 10) { print "1st true\n";} if ($x gt 5) {print "2nd true\n";} if ($x le 10 || $y gt -21) {print "3rd true\n";} if ($x gt 5 && $y lt 0) {print "4th true\n";} if (($x gt 5 && $y lt 0) || $y gt 5) {print "5th true\n";}

  20. #!/usr/bin/perl -w $num = 1234; $str = '1234'; print $num, " ", $str, "\n"; $num_or_str = $num + $str; print $num_or_str, "\n"; $num_or_str = $num . $str; print $num_or_str, "\n"; exit;

  21. More Arithmatics • +, -, *, **, /, % • +=, -=, *=, **=, /=, %= • ++, --

  22. $x = 10; $x = $x*1.5; print $x*=3, "\n"; print $x++, "\n"; print $x, "\n"; print ++$x, "\n"; print $x, "\n"; print $x % 3, "\n"; print $x**2, "\n";

  23. Motif finding • Read a sequence from a fasta file • Ask the user to input a motif • Check if the sequence has the motif

  24. #!/usr/bin/perl –w print "Please type the filename: "; $fname = <STDIN>; chomp $fname; open(PROTEINFILE, $fname); $name = <PROTEINFILE>; @protein = <PROTEINFILE>; close PROTEINFILE; $protein = join( '', @protein); $protein =~ s/\s//g; print "Enter a motif to search for: "; $motif = <STDIN>; chomp $motif; if ( $protein =~ /$motif/ ) { print "I found it!\n\n"; } else { print "I couldn\‘t find it.\n\n"; }

  25. #!/usr/bin/perl -w print "Please type the filename of the DNA sequence data: "; $dna_filename = <STDIN>; chomp $dna_filename; open(DNAFILE, $dna_filename); $name = <DNAFILE>; @DNA = <DNAFILE>; close DNAFILE; $DNA = join('', @DNA); $DNA =~ s/\s//g; $count_of_CG = 0; $position = 0; while ( $position < length $DNA) { $base = substr($DNA, $position, 1); if ( $base eq 'C' or $base eq 'G') { ++$count_of_CG; } $position++; } print "CG content is ", $count_of_CG/(length $DNA)*100, "%\n";

  26. #!/usr/bin/perl –w print "Please type the filename of the DNA sequence data: "; $dna_filename = <STDIN>; chomp $dna_filename; open(DNAFILE, $dna_filename); $name = <DNAFILE>; @DNA = <DNAFILE>; close DNAFILE; $DNA = join('', @DNA); $DNA =~ s/\s//g; $count_of_CG = 0; for ( $position = 0 ; $position < length $DNA ; ++$position ) { $base = substr($DNA, $position, 1); if ( $base eq 'C' or $base eq 'G') { ++$count_of_CG; } } print "CG content is ", $count_of_CG/(length $DNA)*100, "%\n";

  27. #!/usr/bin/perl –w print "Please type the filename of the DNA sequence data: "; $dna_filename = <STDIN>; chomp $dna_filename; open(DNAFILE, $dna_filename); $name = <DNAFILE>; @DNA = <DNAFILE>; close DNAFILE; $DNA = join('', @DNA); $DNA =~ s/\s//g; $count_of_CG = 0; while($DNA =~ /c/ig) {$count_of_CG++;} while($DNA =~ /g/ig) {$count_of_CG++;} print "CG content is ", $count_of_CG/(length $DNA)*100, "%\n";

  28. #!/usr/bin/perl –w print "Please type the filename of the DNA sequence data: "; $dna_filename = <STDIN>; chomp $dna_filename; open(DNAFILE, $dna_filename); $name = <DNAFILE>; @DNA = <DNAFILE>; close DNAFILE; $DNA = join('', @DNA); $DNA =~ s/\s//g; $count_of_CG = 0; while($DNA =~ /c/i) {$count_of_CG++;} while($DNA =~ /g/ig) {$count_of_CG++;} print "CG content is ", $count_of_CG/(length $DNA)*100, "%\n";

  29. Exercise 1 • Ask for a protein file in fasta format • Ask for an amino acid • Count the frequency of that amino acid • TKFHSNAHFYDCWRMLQYQLDMRCMRAISTFSPHCGMEHMPDQTHNQGEMCKPRMWQVSMNQSCNHTPPFRKTYVEWDYMAKALIAPYTLGWLASTCFIW

  30. Exercise 2 • Ask for a DNA file in fasta format • Ask for a codon • Count the frequency of that codon • TCGTACTTAGAAATGAGGGTCCGCTTTTGCCCACGCACCTGATCGCTCCTCGTTTGCTTTTAAGAACCGGACGAACCACAGAGCATAAGGAGAACCTCTAGCTGCTTTACAAAGTACTGGTTCCCTTTCCAGCGGGATGCTTTATCTAAACGCAATGAGAGAGGTATTCCTCAGGCCACATCGCTTCCTAGTTCCGCTGGGATCCATCGTTGGCGGCCGAAGCCGCCATTCCATAGTGAGTTCTTCGTCTGTGTCATTCTGTGCCAGATCGTCTGGCAAATAGCCGATCCAGTTTATCTCTCGAAACTATAGTCGTACAGATCGAAATCTTAAGTCAAATCACGCGACTAGACTCAGCTCTATTTTAGTGGTCATGGGTTTTGGTCCCCCCGAGCGGTGCAACCGATTAGGACCATGTAGAACATTAGTTATAAGTCTTCTTTTAAACACAATCTTCCTGCTCAGTGGTACATGGTTATCGTTATTGCTAGCCAGCCTGATAAGTAACACCACCACTGCGACCCTAATGCGCCCTTTCCACGAACACAGGGCTGTCCGATCCTATATTACGACTCCGGGAAGGGGTTCGCAAGTCGCACCCTAAACGATGTTGAAGGCTCAGGATGTACACGCACTAGTACAATACATACGTGTTCCGGCTCTTATCCTGCATCGGAAGCTCAATCATGCATCGCACCAGCGTGTTCGTGTCATCTAGGAGGGGCGCGTAGGATAAATAATTCAATTAAGATATCGTTATGCTAGTATACGCCTACCCGTCACCGGCCAACAGTGTGCAGATGGCGCCACGAGTTACTGGCCCTGATTTCTCCGCTTCTAATACCGCACACTGGGCAATACGAGCTCAAGCCAGTCTCGCAGTAACGCTCATCAGCTAACGAAAGAGTTAGAGGCTCGCTAAATCGCACTGTCGGGGTCCCTTGGGTATTTTACACTAGCGTCAGGTAGGCTAGCATGTGTCTTTCCTTCCAGGGGTATG

More Related