
Yaniv Erlich Hannon Lab



Presentation Transcript


  1. shRNA libraries sequencing using DNA Sudoku. Yaniv Erlich, Hannon Lab. erlich@cshl.edu

  2. Preparing DNA libraries: programmable microarray; cloning into plasmids; transformation; array single colonies.

  3. The problem. Input: 40,000 bacterial colonies. Output: the sequence of the shRNA inserts. (Figure label: insert type.)

  4. Motivation • Filtering the correct fragments • Balanced representation • Subset selection.

  5. Clone-by-clone sequencing: sequence each clone on a capillary platform. Caveat: cost of ~$40,000. Conclusion: use next-generation sequencing.

  6. Naïve next-gen. Diagram labels: Solexa, Pooling, ??. Conclusion: we need to add a source-clone identifier (barcode).

  7. Naïve barcoding. Diagram labels: Solexa, Pooling, Barcoding. Caveats: • Order 40,000 barcodes, each ~95 nt long. • 40,000 PCR reactions. Conclusion: we need fewer barcodes.

  8. Naïve pooling (1). (Figure label: barcode.) Case #1: Which specimen appears in both barcode #5 and barcode #B? Specimen #13!

  9. Naïve pooling (2). (Figure label: barcode.) Case #2: ACGTT is associated with specimens #25 (D,1) and #34 (E,2)! Or maybe ACGTT is associated with specimens #26 (D,2) and #33 (E,1)? Ambiguity. Conclusion: we should deal with shRNA ‘duplicates’.
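The two cases above can be reproduced with a toy row/column pooling. This is only a sketch under stated assumptions: the 8-column layout is inferred from specimen #13 sitting at (B,5), and the helper names (`position`, `specimen_at`, `observed_pools`, `candidates`) are hypothetical, not taken from the talk.

```python
from itertools import product
from string import ascii_uppercase

N_COLS = 8  # assumed layout: 8 columns, so that specimen #13 sits at (B,5)

def position(specimen):
    """Return (row letter, column number) of a 1-based specimen index."""
    row, col = divmod(specimen - 1, N_COLS)
    return ascii_uppercase[row], col + 1

def specimen_at(row, col):
    """Inverse of position(): 1-based specimen index at (row letter, column)."""
    return ascii_uppercase.index(row) * N_COLS + col

def observed_pools(specimens):
    """Row pools and column pools in which a sequence carried by `specimens` shows up."""
    rows = {position(s)[0] for s in specimens}
    cols = {position(s)[1] for s in specimens}
    return rows, cols

def candidates(row_pools, col_pools):
    """All specimens consistent with the observed row and column pools."""
    return sorted(specimen_at(r, c) for r, c in product(row_pools, col_pools))

unique = candidates(*observed_pools([13]))         # one clone carries the sequence
ambiguous = candidates(*observed_pools([25, 34]))  # two clones carry the same sequence
print(unique)     # [13]
print(ambiguous)  # [25, 26, 33, 34]
```

A sequence carried by a single clone decodes to one grid position, while a duplicated sequence yields four candidates, and the pools alone cannot tell the pair {25, 34} from the pair {26, 33}.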

  10. Lessons learned for the desired scheme

  11. Overview of our solution. Diagram labels: ‘Chinese’ pooling, paired-end (PE) sequencing, barcoding, decoding.

  12. The pooling design. Combinatorial pooling using the Chinese Remainder Theorem (CRT). “I have never done anything 'useful'. No discovery of mine has made, or is likely to make, directly or indirectly, for good or ill, the least difference to the amenity of the world” (G. H. Hardy, A Mathematician's Apology, 1940)

  13. Chinese remainder riddle. “An old woman goes to market and a horse steps on her basket and crushes the eggs. The rider offers to pay for the damages and asks her how many eggs she had brought. She does not remember the exact number, but when she had taken them out 2 at a time, there was one egg left. The same happened when she picked them out 3 and 5 at a time, but when she took them 7 at a time they came out even. What is the smallest number of eggs she could have had?” • The Chinese Remainder Theorem says: • There is a one-to-one correspondence between n (0 ≤ n < 2·3·5·7) and its residues. • There is an easy algorithm to solve the equation system. Answer: 91 eggs
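As a rough illustration of the "easy algorithm", the sketch below solves the riddle with the standard Chinese Remainder construction, using the moduli 2, 3, 5, 7 implied by the stated range 0 ≤ n < 2·3·5·7. The function name `crt` is hypothetical, and the code assumes Python 3.8+ for `pow(x, -1, m)` and `math.prod`.

```python
from math import prod

def crt(residues, moduli):
    """Smallest non-negative n with n % m == r for every (r, m) pair.

    Assumes the moduli are pairwise coprime.
    """
    total = prod(moduli)
    n = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        # pow(partial, -1, m) is the modular inverse of partial modulo m
        n += r * partial * pow(partial, -1, m)
    return n % total

# One egg left over when counting by 2, 3 and 5; none left when counting by 7.
eggs = crt([1, 1, 1, 0], [2, 3, 5, 7])
print(eggs)  # 91

# One-to-one correspondence: all 210 values of n give distinct residue tuples.
residue_tuples = {tuple(n % m for m in (2, 3, 5, 7)) for n in range(210)}
print(len(residue_tuples))  # 210
```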

  14. Pooling construction with modular equations. Equation labels: specimen, destination well (different plates), pooling window. One-to-one correspondence…
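One way to read the labels on this slide is that each pooling window x sends a specimen to destination well (specimen mod x) on its own plate. The sketch below follows that reading; the 0-based specimen numbering, the array size of 40, the example windows (5, 8, 9) and the function names are all assumptions made for illustration.

```python
WINDOWS = (5, 8, 9)   # example pooling windows, pairwise coprime
N_SPECIMENS = 40      # hypothetical small source array, specimens numbered from 0

def destination_wells(specimen, windows=WINDOWS):
    """Well index of the specimen on each destination plate: specimen mod window."""
    return tuple(specimen % x for x in windows)

def pool_contents(window, n_specimens=N_SPECIMENS):
    """Map each well of one destination plate to the specimens pooled into it."""
    pools = {well: [] for well in range(window)}
    for specimen in range(n_specimens):
        pools[specimen % window].append(specimen)
    return pools

addresses = {destination_wells(s) for s in range(N_SPECIMENS)}
print(destination_wells(13))           # (3, 5, 4)
print(pool_contents(5)[3])             # [3, 8, 13, 18, 23, 28, 33, 38]
print(len(addresses) == N_SPECIMENS)   # True
```

The last line checks the one-to-one correspondence on this toy array: no two specimens share the same tuple of destination wells.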

  15. Example of Chinese pooling (figure: source array). 03/06/09

  16. Chinese Remainder Pooling Design
     • Inputs: N (number of specimens in the experiment) and W (weight, i.e. pooling effort)
     • Algorithm:
       1. Find W numbers {x1, x2, …, xW} that are bigger than [bound not preserved in this transcript] and pairwise coprime. For instance: {5, 8, 9} but not {5, 6, 9}.
       2. Generate W modular equations: [equations not preserved in this transcript]
       3. Construct the pooling design from the modular equations.
     • Output: pooling design
     The Chinese Remainder Theorem asserts: (1) two specimens will meet in no more than one pool; (2) the number of pools is [formula not preserved in this transcript]. Number of barcodes: [formula not preserved in this transcript]
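The design steps can be sketched as follows. The lower bound on the windows is not preserved in this transcript, so the sketch assumes windows of at least √N; that choice is sufficient for property (1), because two distinct specimens sharing two pools would have to be congruent modulo a product of two windows that exceeds N. The greedy search and all function names are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations
from math import gcd, isqrt

def choose_windows(n_specimens, weight):
    """Greedily pick `weight` pairwise-coprime integers, each >= ceil(sqrt(N)).

    The sqrt(N) bound is an assumption; the slide's own bound is not preserved.
    """
    candidate = isqrt(n_specimens - 1) + 1  # ceil(sqrt(N)) for N >= 1
    windows = []
    while len(windows) < weight:
        if all(gcd(candidate, x) == 1 for x in windows):
            windows.append(candidate)
        candidate += 1
    return windows

def shared_pools(a, b, windows):
    """Number of pools in which specimens a and b end up together."""
    return sum(a % x == b % x for x in windows)

def max_shared_pools(n_specimens, windows):
    """Largest number of pools shared by any pair of distinct specimens."""
    pairs = combinations(range(n_specimens), 2)
    return max(shared_pools(a, b, windows) for a, b in pairs)

small = choose_windows(40, 3)
print(small, sum(small))            # [7, 8, 9] 24
print(max_shared_pools(40, small))  # 1

large = choose_windows(40_000, 5)
print(large, sum(large))            # [200, 201, 203, 209, 211] 1024
```

In this sketch each window x contributes x pools, so the total pool count is the sum of the windows: 24 pools for 40 specimens at weight 3, and 1,024 pools for 40,000 specimens at weight 5.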

  17. How good is our method?

  18. Barcode reduction. IEEE Transactions on Information Theory (1964): proved from purely combinatorial constraints that the theoretical lower bound on the number of barcodes is [formula not preserved in this transcript]. Our method is very close to the theoretical lower bound.

  19. How good is our method?

  20. Dealing with duplicates (simulation). [Plot: probability of correct decoding vs. duplicate size; axis value 0.99.] 40,000 specimens with only 384 barcodes.

  21. How good is our method? • W=5: • 5 lanes of Solexa • A week and a half of robotics

  22. How good is our method?

  23. Real results… • Arabidopsis shRNA library with 17,000 shRNA fragments • Picked 40,320 bacterial colonies • Sequenced 3,000 colonies with capillary sequencing for comparison • Decoded ~20,500 bacterial colonies with correct inserts • 96% of the assignments were correct • ~8,000 unique fragments of the library

  24. Future directions • Developing a more advanced decoder using a machine-learning approach • 2-stage algorithm

  25. Acknowledgements: Greg Hannon, Oron Navon and Roy Ronen, Ken Chang, Michelle Rooks, Assaf Gordon.
