Laccaria gene families and their expression
Yao-Cheng Lin
Bioinformatics & Evolutionary Genomics Division, VIB Department of Plant Systems Biology, UGent
Introduction
• Gene family construction based on JGI Laccaria gene models and six other fungal genomes.
• Detailed analysis of each (large, expanded) gene family:
  • Protein domain structure and a possible phylogenetic tree.
  • Expression profile of the family.
• Coexpression of neighboring genes.
Gene families
• Building the protein phylogenetic profile and dividing the families into two parts:
  • Core genes: genes present as a single copy in each of the seven genomes. The expectation is that they might have a conserved gene structure and a similar expression profile.
  • Genes showing a large expansion in Laccaria or in Basidiomycota.
• Genomes used for gene family construction:
  • 2 Ascomycota: Neurospora crassa (Ncra), Magnaporthe grisea (Mgri)
  • 5 Basidiomycota: Ustilago maydis (Umay), Cryptococcus neoformans (Cneo), Phanerochaete chrysosporium (Pchr), Coprinopsis cinerea (Ccin), Laccaria bicolor (Lbic)
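The split into core genes and expanded families can be sketched as a small classification over per-species gene counts. This is only an illustration: the `expansion_fold` threshold and the function names are hypothetical, and the slides do not state which cutoff was actually used to call an expansion.

```python
from collections import Counter

# Species abbreviations as used on the slide.
SPECIES = ["Ncra", "Mgri", "Umay", "Cneo", "Pchr", "Ccin", "Lbic"]

def phylogenetic_profile(family_members):
    """Count genes per species for one family.

    family_members: list of (species_abbreviation, gene_id) tuples.
    """
    counts = Counter(species for species, _ in family_members)
    return {sp: counts.get(sp, 0) for sp in SPECIES}

def classify(profile, expansion_fold=3.0):
    """Label a family as 'core', 'expanded_in_Lbic' or 'other'.

    expansion_fold is an assumed, illustrative threshold.
    """
    # Core gene: exactly one copy in every genome.
    if all(n == 1 for n in profile.values()):
        return "core"
    # Expansion: Laccaria count well above the mean of the other genomes.
    others = [n for sp, n in profile.items() if sp != "Lbic"]
    mean_others = sum(others) / len(others)
    if profile["Lbic"] >= expansion_fold * max(mean_others, 1.0):
        return "expanded_in_Lbic"
    return "other"
```

A family with one gene in each of the seven genomes is labeled `core`; a family with many more Laccaria copies than the other genomes is labeled `expanded_in_Lbic`.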
Building gene families
[Workflow diagram. Labels: 7 fungal proteomes (fungal proteins); all-against-all homology search (BLAST result); TribeMCL; gene families; Blast2GO (genes with GO terms); Pfam; genome; MySQL; phylogeny profile; core-gene; expansion.]
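The step from an all-against-all BLAST result to clustering input can be sketched as below. It assumes BLAST's standard 12-column tabular output (query, subject, ..., e-value in column 11, bit score in column 12) and writes three-column "label label weight" edges of the kind Markov-clustering tools accept. The e-value cutoff, the cap of 200 and the function names are illustrative assumptions, not settings taken from the slides.

```python
import math

def blast_to_edges(lines, max_evalue=1e-5):
    """Turn BLAST tabular lines into weighted edges {(query, subject): weight}.

    Weight is -log10(e-value), capped at 200 (an assumed cap, also used
    for e-values reported as 0).
    """
    edges = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue = float(fields[10])
        if query == subject or evalue > max_evalue:
            continue  # drop self hits and weak hits
        weight = 200.0 if evalue == 0.0 else min(200.0, -math.log10(evalue))
        key = (query, subject)
        # Keep the best hit when a pair is reported more than once.
        edges[key] = max(edges.get(key, 0.0), weight)
    return edges

def to_abc(edges):
    """Format edges as tab-separated 'label label weight' lines."""
    return "\n".join(f"{q}\t{s}\t{w:.1f}" for (q, s), w in sorted(edges.items()))
```

The resulting text can be written to a file and handed to the clustering step; how the original pipeline weighted its hits is not stated in the slides.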
Core genes used to build the phylogenetic tree
• Phylogenetic tree of 7 fungal species. The 591 gene families that are present exactly once in each genome were aligned and concatenated into an artificial alignment of 464,163 positions. Ambiguously aligned regions were removed, leaving an alignment of 211,358 residues.
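The concatenation step can be sketched as follows: each single-copy family contributes one aligned sequence per species, and the per-family alignments are joined end to end. The column filter shown here simply drops columns containing gaps; it is a stand-in chosen for illustration, since the slides do not say which method was used to remove ambiguously aligned regions. All names are hypothetical.

```python
def concatenate(alignments, species):
    """Join per-family alignments into one supermatrix.

    alignments: list of dicts mapping species -> aligned sequence;
    all sequences within one family must have the same length.
    """
    parts = {sp: [] for sp in species}
    for aln in alignments:
        lengths = {len(aln[sp]) for sp in species}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must have equal length")
        for sp in species:
            parts[sp].append(aln[sp])
    return {sp: "".join(chunks) for sp, chunks in parts.items()}

def drop_gappy_columns(matrix, max_gap_fraction=0.0):
    """Keep columns whose gap fraction is at most max_gap_fraction.

    This gap-based filter is an assumed simplification of
    'removing ambiguously aligned regions'.
    """
    species = list(matrix)
    length = len(matrix[species[0]])
    keep = [
        i for i in range(length)
        if sum(matrix[sp][i] == "-" for sp in species) / len(species) <= max_gap_fraction
    ]
    return {sp: "".join(matrix[sp][i] for i in keep) for sp in species}
```

With 591 families and 7 species, `concatenate` would yield seven sequences of equal length, and the filter would shorten all of them by the same set of columns.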
TPR domain
• The tetratricopeptide repeat (TPR) is a structural motif present in a wide range of proteins. It mediates protein-protein interactions and the assembly of multidomain complexes. It contains TPR-1, TPR-2, TPR-3 and TPR-4, and it is similar to NSF (membrane fusion).
• This family has been found in a wide range of species, from prokaryotes to eukaryotes.
• A Cryptococcus mutant of the TPR-containing gene CCN1 can cause subcutaneous lesions but fails to cause systemic infection.
TPR domain
• Family 3: 2-12 TPR; NACHT / NB-ARC
• Family 22: 3-15 TPR; NSF/Sel1
* After manually removing genes with extra-long (~20-40 kb) introns.
NACHT / NB-ARC domain
• NTPases of the NACHT family may function as key integrators of stress and nutrient availability signals.
TPR (family 22)
• Independent expansion of TPR domain genes
Cytochrome P450
• P450 proteins in Aspergillus catalyze various reactions involved in the biosynthesis of aflatoxins and other secondary metabolites.
• The Phanerochaete genome shows a total of sixteen P450 gene clusters, with up to 11 tandem genes in a cluster. There are up to 10 tandem genes in Laccaria.
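Counting tandem genes of this kind can be sketched as a scan along the gene order of one scaffold, collecting uninterrupted runs of family members. The definition used here (strictly adjacent genes, no intervening non-family genes, at least two members) is an assumption made for illustration; the slides do not define how a cluster was called.

```python
def tandem_clusters(gene_order, family_genes, min_size=2):
    """Find runs of consecutive family members along one scaffold.

    gene_order: gene ids in their order along the scaffold.
    family_genes: set of gene ids belonging to the family (e.g. P450).
    min_size: assumed minimum number of adjacent members in a cluster.
    """
    clusters, current = [], []
    for gene in gene_order:
        if gene in family_genes:
            current.append(gene)
        else:
            # A non-family gene ends the current run.
            if len(current) >= min_size:
                clusters.append(current)
            current = []
    if len(current) >= min_size:
        clusters.append(current)
    return clusters
```

The largest value of `len(cluster)` over all scaffolds would then correspond to a figure such as "up to 10 tandem genes", under the adjacency definition assumed above.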
Cytochrome P450
[Figure: phylogenetic tree of family 4]
How many cytochrome P450s?
• CYP61 is a single-copy P450 gene across the 7 genomes.
• CYP64 shows a large independent expansion in Basidiomycota. The count for Laccaria is still being calculated.
WD-40
• The underlying common function of all WD-repeat proteins is coordinating multi-protein complex assemblies, where the repeating units serve as a rigid scaffold for protein interactions.
• Domain architecture: 13-14 WD-40; NACHT / NB-ARC
AAA+ ATPase
Family  Lbic  Ccin  Pchr  Ppla  Cneo  Umay  Sros  Mgri  Ncra  Domain
14      14    14    14    21    14    13    14    14    14    AAA+ ATPase
Ppla: Postia placenta; Sros: Sporobolomyces roseus
[Figure: the phylogenetic tree, in one clade]
AAA+ ATPase
• AAA family proteins often perform chaperone-like functions that assist in the assembly, operation, or disassembly of protein complexes.
• Expression of CaYLL34, a Candida gene encoding an AAA+ ATPase, is enhanced in hyphae (mycelium). The deletion mutant shows a significant decrease in hypha formation, whereas proper morphology is essential for the ability of Candida to switch between yeast and hyphae.
NimbleGen Expression Array (INRA-Nancy)
• Gene models
  • 22,294 predicted Laccaria gene models
  • 178,352 60-mer probes (8 per gene model)
• Cross-hybridization
  • ~30% cross-hybridization
• Biological material
  • Mycelium (2 replicates), mycorrhiza (4 replicates), fruit bodies (2 replicates), Laccaria in contact with helper bacteria (1 control, 2 replicates)
Cross-hybridization rate
[Figure. Panels: high cross-hybridization; low cross-hybridization]
AAA+ ATPase family members do not show cross-hybridization.
Different groups of genes are up- or down-regulated in different biological materials.
Protein families and their expression
[Figure. Panels: core genes; ankyrin*; TPR-22 domain*; TPR-3 domain*; unknown*. Conditions: mycorrhiza, fruit body, mycelium, with helper cell. Y-axis: log2(mean(expression / 3-fold mean value)).]
* Very high cross-hybridization but low expression value.
Lowly expressed genes
• Either they are expressed only under certain conditions or at stages that are not covered by the current microarray materials
  • → RT-PCR with a broader range of biological material sources.
• Or they are pseudogenes.
• Why does Laccaria need so many TPR domain proteins that are all lowly expressed?
Protein length and average expression value
[Figure. Y-axis: log2(mean(expression / 3-fold mean value)).]
• Only genes without cross-hybridization are included.
* Only calculated for protein length < 500 a.a.
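The Y-axis quantity and the two filters on this slide can be sketched as below. The slide does not define the "3-fold mean value", so the `reference` argument is left to the caller; one possible reading (an assumption, not confirmed by the slides) is three times the mean signal over all genes on the array. Field names such as `cross_hyb` and `protein_length` are hypothetical.

```python
import math

def log2_relative_expression(replicates, reference):
    """Return log2(mean(expression / reference)) over replicate values.

    reference stands for the slide's '3-fold mean value'; its exact
    definition is an assumption supplied by the caller.
    """
    ratios = [value / reference for value in replicates]
    return math.log2(sum(ratios) / len(ratios))

def select_genes(genes, max_length=500):
    """Keep genes without cross-hybridization and with protein length < max_length.

    genes: list of dicts with hypothetical keys 'cross_hyb' (bool)
    and 'protein_length' (int, amino acids).
    """
    return [
        g for g in genes
        if not g["cross_hyb"] and g["protein_length"] < max_length
    ]
```

For example, replicate signals of 600 and 1000 against a reference of 100 give a mean ratio of 8 and therefore a value of 3 on the log2 scale.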
Gene co-expression at the chromatin level?
Topology of gene expression
[Figure. Panels: mycelium, mycorrhiza, fruit body, helper cell. Expression along scaffold_2 compared to the 3-fold normalized mean value (50 kb window / 10 kb sliding step).]
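The 50 kb window with a 10 kb sliding step can be sketched as a simple scan along one scaffold. Assigning each gene to a window by its midpoint and averaging the values inside the window are assumptions made for illustration; the slide does not say how genes were assigned or summarized.

```python
def sliding_window_means(genes, scaffold_length, window=50_000, step=10_000):
    """Mean expression per window along one scaffold.

    genes: list of (midpoint_bp, expression_value) tuples.
    Returns (start, end, mean) tuples; mean is None for empty windows.
    """
    results = []
    start = 0
    while start < scaffold_length:
        end = start + window
        values = [value for pos, value in genes if start <= pos < end]
        mean = sum(values) / len(values) if values else None
        # Clip the reported end to the scaffold length for the last windows.
        results.append((start, min(end, scaffold_length), mean))
        start += step
    return results
```

Running this once per biological material (mycelium, mycorrhiza, fruit body, helper cell) would give one profile per panel for a scaffold such as scaffold_2.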
Future work on the coexpression data
• The data should be normalized after filtering out all cross-hybridizing probes.
• Different window sizes (fixed sequence length or fixed gene number) and sliding step sizes should be tested. Correlation coefficients should then be computed for each window size.
• Are there expression hot spots or silent zones (centromere, telomere)?
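The proposed order of operations (filter first, then correlate) can be sketched for neighboring genes as follows. Pearson correlation is used here as one possible correlation coefficient; the slides do not name a specific one, and the function and argument names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles; None if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return None
    return cov / (sd_x * sd_y)

def neighbor_correlations(profiles, gene_order, filtered_out=frozenset()):
    """Correlate each gene with its next neighbor along the scaffold.

    profiles: gene id -> list of expression values across conditions.
    filtered_out: genes removed beforehand (e.g. cross-hybridizing ones).
    """
    kept = [g for g in gene_order if g not in filtered_out]
    return [(a, b, pearson(profiles[a], profiles[b])) for a, b in zip(kept, kept[1:])]
```

Repeating the calculation over windows of several sizes, rather than only adjacent pairs, would follow the same pattern with a different way of pairing genes.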
Acknowledgement
• Pierre Rouze
• Yves Van de Peer
• Jan Wuyts
• Stephane Rombauts
• Lieven Sterck