Optimising biofuel production: computational characterisation of genes and related promoters and enhancers involved in fatty acid production in algae
Candidate: Antonino Ida’
Advisor: Prof. Giovanni Perini
Supervision: Prof. Ugur Sezerman
Faculty of Mathematical, Physical and Natural Sciences. Master's degree course in Bioinformatics
5 March 2009
Statement of the problem
The energy crisis requires a new renewable source of fuel.
Biofuels:
• Bioethanol (produced by fermenting plant-based raw materials, e.g. sugar)
• Biogas (produced through anaerobic fermentation of biomass)
• Biodiesel (a methyl ester made from raw materials such as plant-based oil)
Biodiesel
Product of the transesterification of triglycerides with methanol
• Consists of long-chain alkyl esters (methyl, propyl, ethyl)
Current Biodiesel Production
• Algae: 2763 dm³
• Hemp: 1535 dm³
• Chinese tallow: 772 dm³
• Palm oil: 780-1490 dm³
• Coconut: 353 dm³
• Rapeseed: 157 dm³
• Soy: 76-161 dm³
• Peanut: 138 dm³
• Sunflower: 126 dm³
Advantages of algae
• Synthesise and accumulate large quantities of neutral lipids/oil (20-50% DCW)
• Grow at a high rate
• Thrive in saline water, brackish water and coastal seawater
• Tolerate marginal lands that are not suitable for conventional agriculture
• Utilise growth nutrients such as nitrogen and phosphorus from a variety of wastewater sources
• Sequester carbon dioxide from flue gases emitted by burning fossil fuels
• Produce value-added co-products or by-products (e.g. biopolymers, proteins, pigments, animal feed, fertilizer)
• Can be grown in suitable culture vessels (photobioreactors) with an annual biomass productivity exceeding that of terrestrial plants approximately tenfold
Algae: a large number of species
The ability to survive over a wide range of environmental conditions reflects a wide range of fatty acid products and the ability to modify lipid metabolism efficiently in response to changes in environmental conditions.
Biosynthesis of fatty acids
• Starting from the acetyl-CoA pool, the following reactions take place:
• Carboxylation
• Condensation
• Reduction
• Dehydration
• Reduction
Regulation of fatty acid synthesis
A major challenge for the future is to discover how the expression level of lipid synthesis genes is controlled.
• Acetyl-CoA carboxylase (ACCase) is the major candidate for regulating this pathway
• Displaced from equilibrium
• Light-dependent step
• Feedback regulation
• Compartmentalization
Aim of the project
• Finding the conserved motifs in the upstream region across strains
• Comparing the differences in the upstream region of ACCase according to the amount of fatty acids
• Identifying the sequence of ACCase in Scenedesmus protuberans
• Available data:
• Experimental characterisation of one gene from Cyclotella cryptica
• Some draft genomic sequences
Genome sequences: NCBI, JGI, EBI
Gene sequences
Refinement and classification
Strategy of selection
• Authors' annotation (literature)
• Keyword search
• BlastN against all draft genome sequences and EST databases
• PSI-BLAST against protein sequence databases
• TBlastN against translated nucleotide sequences
• Refinement with a local ClustalW installation
Phylogenetic footprinting
A single gene was investigated, and its non-coding flanking regions were compared with those of its homologs.
"TFBSs are islands of conservation in a sea of much less conserved DNA" (Giulio Pavesi)
• Exhaustive search is prohibitive because the search space grows exponentially. Heuristic methods are therefore used, as in:
• Bioprospector
• GALF_P
Bioprospector: a Gibbs sampling algorithm
• How Gibbs sampling captures a motif
• Probabilistic matrix of a motif with length w
• The goal of Gibbs sampling is to maximize the ratio between the motif base composition and the background base distribution.
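To make the "probabilistic matrix of a motif with length w" concrete, here is a minimal sketch of how such a matrix could be built from a set of aligned segments. The function name `motif_matrix`, the pseudocount and the example segments are assumptions made for illustration; they are not taken from Bioprospector itself.

```python
from collections import Counter

BASES = "ACGT"

def motif_matrix(segments, pseudocount=1.0):
    """Column-wise base probabilities for equal-length segments.

    Hypothetical helper: a pseudocount is added to every base so that
    no probability is exactly zero (an assumption, not the tool's setting).
    """
    w = len(segments[0])
    total = len(segments) + pseudocount * len(BASES)
    matrix = []
    for i in range(w):
        counts = Counter(seg[i] for seg in segments)
        matrix.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return matrix

# Made-up example segments of width w = 6.
segments = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]
matrix = motif_matrix(segments)
```

Each entry `matrix[i][b]` is the estimated probability of base `b` at motif position `i`; every column sums to 1.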
Bioprospector
• Initialization: randomly initialize the starting motif
• Iterative update: take out one sequence at a time, together with its segment
[Figure: initial motif built from segments a1', a2', a3', a4', ..., ak'; motif without segment a1']
Bioprospector
• Iterative update: score each segment with the current motif
A_x = Q_x / P_x
where:
Q_x = probability of generating segment x from the current motif matrix
P_x = probability of generating segment x from the independent background model
[Figure: motif without segment a1' (segments a2', a3', a4', ..., ak'), with candidate segments of Sequence 1 and their scores]
• Segment (1-6): 1.5
• Segment (3-8): 2.7
• Segment (6-11): 27.1
• Segment (4-9): 9.0
• Segment (5-10): 3.2
• Segment (2-7): 3
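The segment score A_x = Q_x / P_x can be sketched as follows. This is a minimal illustration under stated assumptions: the width-3 motif matrix, the uniform background and the test sequence are invented, and `segment_scores` is a hypothetical helper, not a Bioprospector function.

```python
def segment_scores(sequence, matrix, background):
    """Return {start: A_x} for every window x of the motif width."""
    w = len(matrix)
    scores = {}
    for start in range(len(sequence) - w + 1):
        segment = sequence[start:start + w]
        q = 1.0  # Q_x: probability of the segment under the motif matrix
        p = 1.0  # P_x: probability under the independent background model
        for i, base in enumerate(segment):
            q *= matrix[i][base]
            p *= background[base]
        scores[start] = q / p
    return scores

# Assumed toy motif matrix that favours "TAT", with a uniform background.
matrix = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

scores = segment_scores("GGTATCC", matrix, background)
best_start = max(scores, key=scores.get)
```

Windows that resemble the current motif get a ratio well above 1, while windows that look like background get a ratio below 1.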
Bioprospector
• Score sequence 1 in all possible alignments (new segment a1'')
• Repeat the process until convergence
[Figure: candidate motif built from a1'', a2', a3', a4', ..., ak'; next step: motif without segment a2']
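The whole loop (initialize, take one sequence out, rebuild the motif, rescore, resample, repeat) can be sketched as a small Gibbs sampler. This is an illustrative sketch, not Bioprospector's implementation: it assumes one motif occurrence per sequence, a background estimated from overall base frequencies, a pseudocount of 1, and a fixed number of iterations in place of a real convergence test. The function name and example sequences are made up.

```python
import random

BASES = "ACGT"

def gibbs_motif_search(sequences, w, iterations=200, seed=0):
    """Return one sampled motif start position per sequence."""
    rng = random.Random(seed)
    pooled = "".join(sequences)
    background = {b: (pooled.count(b) + 1) / (len(pooled) + 4) for b in BASES}
    # Initialization: a random segment in every sequence.
    positions = [rng.randrange(len(s) - w + 1) for s in sequences]
    for _ in range(iterations):
        for k, seq in enumerate(sequences):
            # Motif matrix built without the segment of sequence k.
            others = [s[p:p + w]
                      for j, (s, p) in enumerate(zip(sequences, positions))
                      if j != k]
            matrix = [{b: (sum(seg[i] == b for seg in others) + 1)
                          / (len(others) + 4) for b in BASES}
                      for i in range(w)]
            # Score every window of sequence k with A_x = Q_x / P_x.
            weights = []
            for start in range(len(seq) - w + 1):
                ratio = 1.0
                for i, base in enumerate(seq[start:start + w]):
                    ratio *= matrix[i][base] / background[base]
                weights.append(ratio)
            # Sample the new segment in proportion to its score.
            positions[k] = rng.choices(range(len(weights)), weights=weights)[0]
    return positions

# Made-up sequences that all contain the word "TTGACA".
sequences = ["ACGTTGACATGC", "GGTTGACACCAT", "CATGCATTGACA", "TTGACAGGCATC"]
positions = gibbs_motif_search(sequences, w=6)
```

Because the new segment is sampled rather than chosen greedily, the search can escape poor starting alignments; the price is that results vary with the random seed.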
GALF_P: Genetic Algorithm with Local Filtering
• Motif representation:
• Position-led
• Consensus-led
• Where:
• b_i is nucleotide i of the motif instance
• PWM(b_i, i) is the score of b_i at position i in the matrix
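As an illustration of the position-led representation and of the PWM(b_i, i) term, the sketch below scores an individual (one start position per sequence) by summing PWM(b_i, i) over every motif instance. The summation, the log-odds form of the matrix, the pseudocount and all names are assumptions made for this example; the exact GALF_P fitness function is not given in the text above.

```python
import math

BASES = "ACGT"

def instances(sequences, positions, w):
    """Motif instances selected by a position-led individual."""
    return [s[p:p + w] for s, p in zip(sequences, positions)]

def log_odds_pwm(insts, background, pseudocount=1.0):
    """Assumed PWM: log2 of column frequency over background frequency."""
    total = len(insts) + pseudocount * len(BASES)
    return [{b: math.log2((sum(x[i] == b for x in insts) + pseudocount)
                          / total / background[b]) for b in BASES}
            for i in range(len(insts[0]))]

def instance_score(instance, pwm):
    """Sum of PWM(b_i, i) over the positions i of one instance."""
    return sum(pwm[i][b] for i, b in enumerate(instance))

def fitness(sequences, positions, w, background):
    insts = instances(sequences, positions, w)
    pwm = log_odds_pwm(insts, background)
    return sum(instance_score(x, pwm) for x in insts)

# Made-up data: "TTGACA" occurs at positions 2, 2 and 1.
sequences = ["AATTGACAGG", "CCTTGACATT", "GTTGACACCA"]
background = {b: 0.25 for b in BASES}
aligned = fitness(sequences, [2, 2, 1], 6, background)
misaligned = fitness(sequences, [0, 0, 0], 6, background)
```

An individual whose positions line up on a shared word scores higher than one whose positions pick unrelated windows, which is what the search tries to exploit.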
GALF_P: features
• Genetic operators: mutation (single parent), crossing over (double parents)
• Local filtering operator
• Shift operator
[Figure: example individuals illustrating mutation and crossing over]
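The two genetic operators can be sketched on position-led individuals, where an individual is a list with one motif start position per sequence. This is a generic illustration with hypothetical names, a single-position mutation and a one-point crossover; GALF_P's actual operators may differ in detail.

```python
import random

def mutate(individual, seq_lengths, w, rng):
    """Single parent: replace one start position with a random valid one."""
    child = list(individual)
    k = rng.randrange(len(child))
    child[k] = rng.randrange(seq_lengths[k] - w + 1)
    return child

def crossover(parent_a, parent_b, rng):
    """Double parents: one-point crossover producing two children."""
    cut = rng.randrange(1, len(parent_a))
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2

rng = random.Random(1)
seq_lengths = [1000, 1000, 1000, 1000]  # assumed sequence lengths
parent_a = [62, 387, 60, 12]            # illustrative start positions
parent_b = [272, 71, 43, 366]

mutant = mutate(parent_a, seq_lengths, w=10, rng=rng)
child_1, child_2 = crossover(parent_a, parent_b, rng)
```

Mutation explores new positions within one sequence, while crossover recombines position choices that already work well in different individuals.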
GALF_P: flow
• Generate the initial population randomly
• Apply local filtering to each new individual
• Evaluate the fitness and perform replacement
• Converged, or maximal generation reached? Yes: output
• Otherwise, stagnating? Yes: shift operator (triggers local filtering). No: genetic operators
Results
Both programs were run 5 times on the same data set, consisting of 1000 bp of the region upstream of the alpha gene, with different motif widths (8, 10, 12, 14, 16).
• Shared regions were found in the 8 and 10 fragments
• Some overlapping motifs were found between -719 and 793 bp upstream of the coding region of Chlamydomonas reinhardtii
Experimental procedure • Primer design • PCR • Gradient PCR • Digestion
Conclusions
We identified the alpha fragment in Scenedesmus protuberans by sequence analysis. The method used to search for hypothetical TFBSs allowed us to identify a cluster of sequences that can be considered "significantly conserved", and hence likely to possess a regulatory function.
Future perspectives
• Building a descriptor, such as position-specific weight matrices in TRANSFAC format, and searching other species for conserved regions that fit the descriptor.
• The study of the identified sequence shall proceed, in order to capture its flanking regions.