Bioperl modules


Presentation Transcript


  1. Bioperl modules

  2. Object Oriented Programming in Perl (1)
     • Defining a class
       • A class is simply a package with subroutines that function as methods.

         #!/usr/local/bin/perl
         package Cat;
         sub new { … }
         sub meow { … }

  3. Object Oriented Programming in Perl (2)
     • Perl Object
       • To instantiate an object from a class, call the class's "new" method.

         $new_object = ClassName->new;   # the older form "new ClassName" is equivalent

     • Using Method
       • To use the methods of an object, use the "->" operator.

         $cat->meow();
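
The two slides above leave the method bodies elided. As a minimal sketch of what a complete class and its use might look like, the following fills them in with illustrative code; the `name` attribute and the returned string are assumptions for the example, not part of the original slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Cat;

# Constructor: builds a hash reference and blesses it into the class.
# The "name" attribute is a made-up example field.
sub new {
    my ($class, %args) = @_;
    my $self = { name => $args{name} // 'Unnamed' };
    return bless $self, $class;
}

# Instance method: the object itself arrives as the first argument.
sub meow {
    my ($self) = @_;
    return "$self->{name} says meow";
}

package main;

my $cat = Cat->new(name => 'Tom');   # instantiate an object
print $cat->meow(), "\n";            # call a method with "->"
```

The call `Cat->new(...)` passes the string `Cat` as the first argument, which is why the constructor can bless the object into whichever class it was called on.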

  4. Object Oriented Programming in Perl (3)
     • Inheritance
       • Declare a package array called @ISA.
       • This array stores the name(s) of the parent class(es) of the new class.

         package NorthAmericanCat;
         @NorthAmericanCat::ISA = ("Cat");
         sub new { … }
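
A runnable sketch of the inheritance mechanism might look like the following. The method bodies and the `region` field are illustrative assumptions; only the `@ISA` declaration comes from the slide.

```perl
use strict;
use warnings;

package Cat;

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name} }, $class;
}

sub meow {
    my ($self) = @_;
    return "$self->{name} says meow";
}

package NorthAmericanCat;

# "our @ISA" here is the same array as @NorthAmericanCat::ISA.
our @ISA = ('Cat');

# Override the constructor, reusing the parent's version through SUPER.
sub new {
    my ($class, %args) = @_;
    my $self = $class->SUPER::new(%args);
    $self->{region} = 'North America';   # hypothetical extra field
    return $self;
}

package main;

my $cat = NorthAmericanCat->new(name => 'Bob');
print $cat->meow(), "\n";                        # meow is inherited from Cat
print "is a Cat\n" if $cat->isa('Cat');
```

Because `meow` is not defined in `NorthAmericanCat`, Perl looks it up through `@ISA` and finds it in `Cat`.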

  5. Perl Modules
     • A Perl module is a reusable package defined in a library file whose name is the same as the name of the package.

  6. Names of Perl modules
     • Each Perl module has a unique name.
     • To minimize namespace collisions, Perl provides a hierarchical namespace for modules.
     • Components of a module name are separated by double colons (::).
     • For example:
       • Math::Complex
       • Math::Approx
       • String::BitCount
       • String::Approx

  7. Module files • Each module is contained in a single file. • Module files are stored in a subdirectory hierarchy that parallels the module name hierarchy. • All module files have an extension of .pm.
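
The parallel between module names and file paths can be sketched as below. The helper `module_to_path` is a hypothetical name introduced for illustration; the resulting relative path is what Perl looks for under each library directory.

```perl
use strict;
use warnings;

# Turn a module name into the relative file path Perl searches for:
# "::" becomes a directory separator and ".pm" is appended.
sub module_to_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;
    return "$path.pm";
}

for my $module (qw(Math::Complex String::Approx Bio::SeqIO)) {
    printf "%-15s -> %s\n", $module, module_to_path($module);
}
```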

  8. Module libraries
     • The Perl interpreter has a list of directories in which it searches for modules.
     • This list is held in the global array @INC.

         >perl -V
         @INC:
             /usr/local/lib/perl5/5.00503/sun4-solaris
             /usr/local/lib/perl5/5.00503
             /usr/local/lib/perl5/site-perl/5.005/sun4-solaris
             /usr/local/lib/perl5/site-perl/5.005
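
The search path can also be inspected from inside a script. The sketch below prints `@INC` and then uses the built-in `%INC` hash to show which file a loaded module came from; `File::Basename` is used only because it ships with Perl.

```perl
use strict;
use warnings;
use File::Basename;   # any core module would do for this demonstration

# Directories searched for modules, in order.
print "$_\n" for @INC;

# %INC maps each loaded module's relative path to the file actually used.
print "File::Basename was loaded from $INC{'File/Basename.pm'}\n";
```

The same `%INC` lookup works for any installed module, so after `use Bio::SeqIO;` the key `Bio/SeqIO.pm` would show where Bioperl is stored on a given machine.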

  9. Using Modules
     • A module can be loaded with the use statement.

         use Foo;
         bar( "a" );    # calling the bar function imported from Foo
         blat( "b" );   # calling the blat function imported from Foo
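
Since `Foo` is a placeholder, here is a small runnable sketch of the same idea with two modules that ship with Perl. The numbers and the file path are made-up example data.

```perl
use strict;
use warnings;
use List::Util qw(sum max);   # load the module and import two functions by name
use File::Basename ();        # load the module without importing anything

my @lengths = (120, 87, 301); # made-up sequence lengths

print "total:   ", sum(@lengths), "\n";
print "longest: ", max(@lengths), "\n";

# A function that was not imported can still be called by its full name.
print "file:    ", File::Basename::basename('/data/sequences.gb'), "\n";
```

Listing the wanted functions after the module name keeps the script's namespace small and makes it clear where each function comes from.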

  10. Bioperl toolkit
      • Core package (bioperl-live)
        • The basic package; it is required by all the other packages.
      • Run package (bioperl-run)
        • Provides wrappers for executing some 60 common bioinformatics applications.
      • DB package (bioperl-db)
        • Subproject to store sequence and annotation data in a BioSQL relational database.
      • Network package (bioperl-network)
        • Parses and analyzes protein-protein interaction data.
      • Dev package (bioperl-dev)
        • New and exploratory Bioperl development.

  11. Bioperl Object-Oriented
      • Bioperl takes advantage of OO design to create a consistent, well-documented object model for interacting with biological data in the life sciences.
      • Bioperl name space
        • The Bioperl package installs everything in the Bio:: namespace. (Where are the packages stored?)

  12. Bioperl Objects
      • Sequence handling objects
        • Sequence objects
        • Alignment objects
        • Location objects
      • Other objects: 3D structure objects, tree objects and phylogenetic trees, map objects, bibliographic objects and graphics objects

  13. Sequence handling
      • Typical sequence handling tasks:
        • Access the sequence
        • Format the sequence
        • Sequence alignment and comparison
        • Search for similar sequences
        • Pairwise comparisons
        • Multiple alignment

  14. Sequence Annotation
      • Bio::SeqFeature
        • A Seq object can have multiple sequence feature (SeqFeature) objects (e.g. Gene, Exon, or Promoter objects) associated with it.
      • Bio::Annotation
        • A Seq object can also have an Annotation object (used to store database links, literature references and comments) associated with it.
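
As a rough sketch of how features and annotations attach to a sequence, the following builds a small Seq object in memory. It assumes Bioperl is installed; the sequence, the exon coordinates and the comment text are made-up example data.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;
use Bio::Annotation::Comment;

# A made-up sequence to hang the feature and annotation on.
my $seq = Bio::Seq->new(-id => 'demo1', -seq => 'ATGGCCAAGTTTGGCTAA');

# Attach a sequence feature (here, a hypothetical exon at positions 1..9).
my $exon = Bio::SeqFeature::Generic->new(
    -start       => 1,
    -end         => 9,
    -strand      => 1,
    -primary_tag => 'exon',
);
$seq->add_SeqFeature($exon);

# Attach a free-text comment through the annotation collection.
my $note = Bio::Annotation::Comment->new(-text => 'made-up example record');
$seq->annotation->add_Annotation('comment', $note);

# Read both back.
for my $feat ($seq->get_SeqFeatures) {
    printf "%s %d..%d\n", $feat->primary_tag, $feat->start, $feat->end;
}
for my $comment ($seq->annotation->get_Annotations('comment')) {
    print $comment->text, "\n";
}
```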

  15. Sequence Input/Output
      • The Bio::SeqIO system was designed to make getting and storing sequences to and from the myriad of formats as easy as possible.

  16. Accessing sequence data
      • Bioperl supports accessing remote databases as well as local databases.
      • Bioperl currently supports sequence data retrieval from the GenBank, GenPept, RefSeq, SwissProt, and EMBL databases.

  17. Format the sequences
      • A SeqIO object can read a stream of sequences in one format (Fasta, EMBL, GenBank, Swissprot, PIR, GCG, SCF, phd/phred, Ace, or raw plain sequence) and then write them to another file in another format.
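
A format conversion along these lines can be sketched in a few lines. This is not the presentation's own script; `convert_sequences` is a hypothetical helper, and the demo FASTA records are made up so the sketch runs without any input data. It assumes Bioperl is installed.

```perl
use strict;
use warnings;
use Bio::SeqIO;
use File::Temp qw(tempdir);

# Read every sequence from one file and write it to another file in a new format.
# Returns the number of sequences converted.
sub convert_sequences {
    my ($in_file, $in_format, $out_file, $out_format) = @_;
    my $in  = Bio::SeqIO->new(-file => $in_file,     -format => $in_format);
    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => $out_format);
    my $count = 0;
    while (my $seq = $in->next_seq) {
        $out->write_seq($seq);
        $count++;
    }
    $out->close;
    return $count;
}

# Demo: write a throwaway FASTA file, then convert it to GenBank format.
my $dir   = tempdir(CLEANUP => 1);
my $fasta = "$dir/demo.fasta";
open my $fh, '>', $fasta or die "Cannot write $fasta: $!";
print {$fh} ">seq1 first demo sequence\nATGGCCAAGTTTGGC\n",
            ">seq2 second demo sequence\nTTGACCGGTA\n";
close $fh;

my $n = convert_sequences($fasta, 'fasta', "$dir/demo.gb", 'genbank');
print "Converted $n sequences\n";
```

The loop handles one sequence at a time, so the same pattern works for files that hold a single record or many.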

  18. Manipulating sequence data

        $seqobj->display_id()   # the human readable id of the sequence
        $seqobj->subseq(5,10)   # part of the sequence as a string
        $seqobj->desc()         # a description of the sequence
        $seqobj->trunc(5,10)    # truncation from 5 to 10 as new object
        $seqobj->revcom         # reverse complement of the sequence
        $seqobj->translate      # translation of the sequence
        …
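
To try these methods without a data file, a Seq object can be built directly in memory. The sketch below assumes Bioperl is installed; the id, description and sequence are made-up example values.

```perl
use strict;
use warnings;
use Bio::Seq;

# A made-up DNA sequence: a start codon, four more codons, and a stop codon.
my $seqobj = Bio::Seq->new(
    -display_id => 'demo1',
    -desc       => 'made-up coding sequence',
    -alphabet   => 'dna',
    -seq        => 'ATGGCCAAGTTTGGCTAA',
);

print "id:          ", $seqobj->display_id, "\n";
print "description: ", $seqobj->desc, "\n";
print "length:      ", $seqobj->length, "\n";
print "subseq 5-10: ", $seqobj->subseq(5, 10), "\n";      # plain string

# trunc, revcom and translate return new sequence objects,
# so call ->seq on the result to get the string.
print "trunc 5-10:  ", $seqobj->trunc(5, 10)->seq, "\n";
print "revcom:      ", $seqobj->revcom->seq, "\n";
print "protein:     ", $seqobj->translate->seq, "\n";
```

Note the difference between `subseq`, which returns a string, and `trunc`, which returns an object that can itself be reverse-complemented, translated or written out with SeqIO.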

  19. Search result parsing
      • The Bio::SearchIO system was designed for parsing the results of sequence database searches (BLAST, sim4, waba, FASTA, HMMER, exonerate, etc.).

  20. Manipulating alignment
      • The Bio::AlignIO system was designed for manipulating alignment objects in different formats, including aln, phylip, fasta, etc.

  21. Example: Format the sequences
      • Example: using "seq_formating.pl" to convert "sequences.gb" to another format.

  22. Copy the files to the current directory.
      • Check whether the files are executable.
      • Now, let's look at the GenBank file.

  23. The home directory on a Windows system.
      • If you have Notepad++ installed, click "Edit with Notepad++".
      • If not, try opening "sequences.gb" with Notepad.

  24. uncheck

  25. The format of the input sequences.

  26. The perl script file

  27. If no arguments are supplied, usage information is displayed with instructions.

  28. Screenshot labels: Program name • Format of the input sequences • Format of the output sequences • Input file • Output file • <enter>

  29. Program succeeded! Now it's time to look at the file generated.

  30. Use ‘command prompt’ to run the script

  31. Type:

        cd<space>c:\BioDownload

      to enter the BioDownload folder.

  32. Type:

        dir

      to display the files in the current folder (NOT ls).
      • You should have the following files in the folder (you may have other files, but that's fine):
        • seq_formating.pl
        • sequences.gb.txt

  33. Type:

        perl<space>seq_formating.pl<space>sequences.gb.txt<space>genbank<space>sequences.fasta<space>fasta

  34. Output file

  35. The format of the output sequences.

  36. What’s next: Parsing the BLAST output
