1 / 15

Programming Tips

Programming Tips. GS540 January 10, 2011. Jarrett Egertson 4 th year Genome Sciences Graduate Student MacCoss Lab for Biological Mass Spectrometry Email: jegertso@uw.edu Discussion Section: Thursdays 2-3 Foege S-040 Office Hours: Tuesdays: 2-3 Vista Café and by Appointment

Download Presentation

Programming Tips

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Programming Tips • GS540 • January 10, 2011

  2. Jarrett Egertson • 4th year Genome Sciences Graduate Student • MacCoss Lab for Biological Mass Spectrometry • Email: jegertso@uw.edu • Discussion Section:Thursdays 2-3 Foege S-040 • Office Hours: Tuesdays: 2-3 Vista Café and by Appointment • Programming: C++/C#/Python • Devenvironment: GCC (Linux), Visual Studio (Windows)

  3. Outline • General tips / Coding style advice • Performance • Debugging advice • Numerical issues • C Programming • Pointers • Sorting • Homework 1

  4. General tips • Validate using toy cases with a small amount of data • Figure out minimal cases by hand to verify program • Print intermediate output • Test important functions

  5. Coding style advice • Your audience is people as well as the computer • Break large functions into small, simple functions • Break large files into smaller files containing groups of related functions • Use descriptive names for function arguments and variables with larger scopes (e.g.out_file, exons) • Use short names for iterators, vars of limited scope, and vars that are used many times (e.g.i, j)

  6. Performance • Avoid unnecessary optimization! • Better to write simple code first and improve speed if necessary • Big performance gains often result from changing a few small sections of code (e.g. within loops) • Move unnecessary code out of loops • Avoid frequent memory allocation • Allocate memory in large blocks (e.g. array of structs, not one at a time) • Re-use the same piece of memory • Avoid slow comparison routines when sorting • Use a profiler for tough cases (gprof for C; dprofpp for perl)

  7. Debugging advice • Use assertions • E.g. check probabilities: • should always be >= 0 and <= 1 • often should sum to 1.0 • Write slow but sure code to check optimized code • In difficult cases use a debugger, but avoid overuse • valgrind can help find segfaults, memleaks (compile with -g first)

  8. Numerical issues • Consider using log space to avoid overflow and underflow • Don’t compare floats with equals (use >= or <=, NOT ==) • Beware subtracting large, close numbers • Beware integer casts • 1/2 is 0, but 1.0/2 is 0.5 • To generate random numbers, use random() rather than rand()

  9. Pointers in C • Pointers are memory addresses (they point to other variables) • The address-of operator (&) obtains the memory address of a variable • The dereference operator (*) accesses the value stored at the pointed-to mem location

  10. Pointers in C (cont’d) • Arrays are pointers to blocks of memory • Array indices are just pointer arithmetic and dereferencing combined: • a[12] is the same as *(a+12) • &a[3] is the same as a+3 From The C Programming Language by B. Kernighan & D. Ritchie

  11. Homework 1 • Declare large arrays on heap not stack • Outside main() or as static • Output XML markup directly • Avoid copy-pasting results into an XML template

  12. Pointers in C (cont’d) • Large arrays should be dynamically allocated (on the heap) C C++ From The C Programming Language by B. Kernighan & D. Ritchie

  13. Pointers in C (cont’d) • Attributes of pointed-to structures can be derefenced with “arrow notation”: • a->elem is equivalent to (*a).elem

  14. Sorting (in C)

  15. Words of wisdom • "Everything should be made as simple as possible, but no simpler." -- Albert Einstein • KISS principle: “Keep It Simple, Stupid” • From The Zen of Python by Tim Peters: • Beautiful is better than ugly • Explicit is better than implicit • Simple is better than complex • Complex is better than complicated • Flat is better than nested • Sparse is better than dense • Readability counts

More Related