Optimization: The Art of Computing

Explore the golden principles of optimization, algorithms, implementations, and hardware performance techniques. Learn tricks, such as the prime number algorithm, and key concepts like parallelization and vectorization.


Presentation Transcript


  1. Optimization: The Art of Computing Intel Challenge experience and other tricks … Mathieu Gravey

  2. Golden principle of Optimizing • Algorithm • Implementation • Hardware • Long-term Performance

  3. Example: Prime Number Algorithm
       For i=2 to N
         bool isPrime=true;
         For j=2 to N
           If (mod(i,j)==0 and i != j)
             isPrime=false; break;
           end if
         end for
         if (isPrime) add i to the listOfPrimeNumber
       End for

  4. Example: Prime Number Algorithm
       For i=2 to N
         bool isPrime=true;
         For j=2 to i-1
           If (mod(i,j)==0)
             isPrime=false; break;
           end if
         end for
         if (isPrime) add i to the listOfPrimeNumber
       End for

  5. Example: Prime Number Algorithm
       For i=2 to N
         bool isPrime=true;
         For j=2 to √i
           If (mod(i,j)==0)
             isPrime=false; break;
           end if
         end for
         if (isPrime) add i to the listOfPrimeNumber
       End for
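
The square-root bound of slide 5 can be sketched in C++ as follows. This is a minimal illustration, not the presenter's code: the function name `primes_up_to` is hypothetical, and the test `j * j <= i` is used as an integer stand-in for "j = 2 to √i" to avoid floating-point rounding.

```cpp
#include <vector>

// Trial division with the inner loop bounded by sqrt(i).
// "j * j <= i" is the integer form of "j <= sqrt(i)".
std::vector<int> primes_up_to(int n) {
    std::vector<int> primes;
    for (int i = 2; i <= n; ++i) {
        bool is_prime = true;
        for (int j = 2; static_cast<long long>(j) * j <= i; ++j) {
            if (i % j == 0) {   // found a divisor: i is composite
                is_prime = false;
                break;
            }
        }
        if (is_prime) primes.push_back(i);
    }
    return primes;
}
```

Compared with the loop of slide 3, the inner loop runs at most about √i times instead of N times, which is the kind of algorithmic gain the first golden principle refers to.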

  6. Example: Prime Number Algorithm
       For i=2 to N                    <- parallelize (//) the job
         bool isPrime=true;
         For j=2 to √i
           If (mod(i,j)==0)
             isPrime=false; break;
           end if
         end for
         if (isPrime) add i to the listOfPrimeNumber
       End for

  8. Example: Prime Number Algorithm
       For i=2 to N                    <- parallelize (//) the job
         bool isPrime=true;
         For j=2 to √i                 <- vectorize the job
           isPrime = isPrime && (mod(i,j)!=0);
         end for
         if (isPrime) add i to the listOfPrimeNumber
       End for
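
One possible C++/OpenMP reading of slide 8 is sketched below. Several things are assumptions rather than content from the slides: the names `floor_sqrt` and `primes_branch_free`, the `schedule(dynamic, 1024)` clause, and the use of a flag array so that the outer loop can run in parallel while the final list is still built in order. Whether the compiler actually vectorizes the integer modulo depends on the compiler and target.

```cpp
#include <cmath>
#include <vector>

// Integer floor of sqrt(v), corrected for floating-point rounding.
int floor_sqrt(int v) {
    int r = static_cast<int>(std::sqrt(static_cast<double>(v)));
    while (static_cast<long long>(r) * r > v) --r;
    while (static_cast<long long>(r + 1) * (r + 1) <= v) ++r;
    return r;
}

std::vector<int> primes_branch_free(int n) {
    if (n < 2) return {};
    std::vector<char> flag(n + 1, 0);

    // Outer loop: iterations are independent, so they can be shared
    // between threads (the schedule is an assumption of this sketch).
    #pragma omp parallel for schedule(dynamic, 1024)
    for (int i = 2; i <= n; ++i) {
        const int limit = floor_sqrt(i);
        int is_prime = 1;
        // Inner loop: no break, so it is a plain AND-reduction
        // that the compiler is allowed to vectorize.
        #pragma omp simd reduction(&:is_prime)
        for (int j = 2; j <= limit; ++j)
            is_prime &= (i % j != 0);
        flag[i] = static_cast<char>(is_prime);
    }

    // Collect serially so the list stays sorted.
    std::vector<int> primes;
    for (int i = 2; i <= n; ++i)
        if (flag[i]) primes.push_back(i);
    return primes;
}
```

Removing the `break` means every candidate divisor is always tested, so the loop does more work per number; the bet is that a branch-free, vectorizable loop more than makes up for it. Without OpenMP enabled at compile time, the pragmas are ignored and the code runs serially with the same result.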

  9. Example: Prime Number Algorithm
       add 2 to the listOfPrimeNumber
       For i=3 to N step 2             <- parallelize (//) the job
         bool isPrime=true;
         For j=3 to √i step 2          <- vectorize the job
           isPrime = isPrime && (mod(i,j)!=0);
         end for
         if (isPrime) add i to the listOfPrimeNumber
       End for

  10. Example: Prime Number Algorithm
       add 2 to the listOfPrimeNumber
       For i=3 to N step 2             <- parallelize (//) the job
         bool isPrime=true;
         For j=3 to √i step 2          <- vectorize the job
           isPrime = isPrime && (mod(i,j)!=0);
         end for
         if (isPrime) add i to the listOfPrimeNumber
       End for

  11. Example: Prime Number Algorithm
       For i=2 to N                    <- parallelize (//) the job
         bool isPrime=true;
         For j in listOfPrimeNumber and j<=√i      <- vectorize the job
           isPrime = isPrime && (mod(i,j)!=0);
         end for
         if (isPrime) add i to the listOfPrimeNumber, in order
       End for
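
Dividing only by the primes already found, as on slide 11, could look like the sketch below. The function name `primes_by_known_primes` is hypothetical, and this version is deliberately serial: each iteration reads the primes appended by earlier iterations, so the outer loop is not parallelized here.

```cpp
#include <vector>

// Trial division restricted to previously found primes p with p * p <= i.
std::vector<int> primes_by_known_primes(int n) {
    std::vector<int> primes;
    for (int i = 2; i <= n; ++i) {
        bool is_prime = true;
        for (int p : primes) {
            if (static_cast<long long>(p) * p > i) break;  // past sqrt(i)
            if (i % p == 0) {                              // p divides i
                is_prime = false;
                break;
            }
        }
        if (is_prime) primes.push_back(i);  // list stays in increasing order
    }
    return primes;
}
```

The bound must include primes equal to √i (hence `p * p > i` as the stop condition); otherwise squares of primes such as 4 or 9 would be reported as prime.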

  12. Example: Prime Number Algorithm
       add 2 and 3 to the listOfPrimeNumber
       For i=5 to N with mod(i,6)==1 or mod(i,6)==5      <- parallelize (//) the job
         bool isPrime=true;
         For j in listOfPrimeNumber and j<=√i             <- vectorize the job
           isPrime = isPrime && (mod(i,j)!=0);
         end for
         if (isPrime) add i to the listOfPrimeNumber, in order
       End for
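
The "1 or 5 in base 6" idea of slide 12 relies on the fact that every prime greater than 3 has the form 6k+1 or 6k+5. A minimal serial sketch, with the hypothetical name `primes_wheel6` and with 2 and 3 seeded by hand (an assumption, since the slide does not show that step):

```cpp
#include <cstddef>
#include <vector>

std::vector<int> primes_wheel6(int n) {
    std::vector<int> primes;
    if (n >= 2) primes.push_back(2);
    if (n >= 3) primes.push_back(3);

    // Candidates 5, 7, 11, 13, 17, 19, ...: the step alternates 2, 4,
    // which visits exactly the numbers congruent to 5 or 1 modulo 6.
    for (long long i = 5, step = 2; i <= n; i += step, step = 6 - step) {
        bool is_prime = true;
        // Start at index 2: candidates are never divisible by 2 or 3.
        for (std::size_t k = 2; k < primes.size(); ++k) {
            const long long p = primes[k];
            if (p * p > i) break;
            if (i % p == 0) {
                is_prime = false;
                break;
            }
        }
        if (is_prime) primes.push_back(static_cast<int>(i));
    }
    return primes;
}
```

Only two candidates out of every six integers are tested, so the outer loop does one third of the iterations of the plain "For i=2 to N" version.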

  13. Basic principles • Pareto principle • Structure • Parallelization • Vectorization (image source: inotes4you.files.wordpress.com)

  14. Basic principles • Start with the main issues • Global view → critical issue • Monkey development • Start simple → go to complex • Iterative process • To optimize, start by slowing down • Global picture! (image source: http://bestofpicture.com/)

  15. Rules Guidelines • Be lazy • Don’t reinvent the wheel • Don’t be idle • Design pattern • Global variables are your enemies • Don’t overgeneralize

  16. Rules Guidelines • Trust the compiler • Simple for you = simple for compiler | computer • Share your knowledge • Compiler

  17. Rules Guidelines • Think different, try, change and try again … • Don’t aim for the Best, but something Good and Better

  18. Concrete trick : Memory • Array vs. List • Prefetch | random access
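
To make the "Array vs. List" bullet concrete, the sketch below sums the same values stored in a contiguous `std::vector` and in a node-based `std::list`. The function names are hypothetical, and no timing is claimed here; the point is only the difference in memory layout that the hardware prefetcher sees.

```cpp
#include <list>
#include <numeric>
#include <vector>

// Contiguous storage: elements are adjacent in memory, so a linear scan
// is a predictable access pattern that hardware prefetching handles well.
long long sum_array(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0LL);
}

// Node-based storage: each element lives in its own heap node, so the
// traversal follows pointers to addresses that may be scattered.
long long sum_list(const std::list<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0LL);
}
```

Both functions return the same result; measuring them on a large input is one way to observe the effect of sequential versus scattered access on a given machine.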

  19. Concrete trick : First step Optimization • Compiler optimization • icpc myCodeFile -O3 -xHost -o myCompiledProgram • ⚠ -g • const • No-writes • inline • restrict/__restrict__ • No read updates • Loop-unroll • __builtin_expect((x),(y))
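
A small sketch of three of the hints listed on slide 19: `const`, `__restrict__` and `__builtin_expect`. The function names `axpy` and `sum_non_negative` are hypothetical. `__restrict__` and `__builtin_expect` are compiler extensions available in GCC, Clang and the Intel compiler, not standard C++, so this assumes one of those compilers.

```cpp
#include <cstddef>

// y[i] += a * x[i].
// const:        x is only read, never written.
// __restrict__: the caller promises x and y do not overlap, so the
//               compiler does not have to re-read x after each store to y.
void axpy(std::size_t n, float a,
          const float* __restrict__ x, float* __restrict__ y) {
    for (std::size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

// __builtin_expect(cond, 0) tells the compiler that cond is usually false,
// so the common path (non-negative values) is laid out as the fall-through.
long long sum_non_negative(const int* data, std::size_t n) {
    long long sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (__builtin_expect(data[i] < 0, 0)) continue;  // assumed rare
        sum += data[i];
    }
    return sum;
}
```

These hints only give the optimizer more information; passing overlapping arrays to `axpy` would break the promise made by `__restrict__` and lead to undefined behaviour.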

  20. Concrete trick : OpenMP
      • Vectorization => SIMD
        • #pragma omp simd
        • Multi-operation with one instruction
        • ⚠ non-aligned data
      • Multi-Thread
        • L3 cache-communication
        • Shared memory
      • How to use :
        • #pragma omp parallel for default(none) shared(x,y) firstprivate(array) reduction(max:MaxValue) schedule(static)
        • for (int i = 0; i < 10000; i++) { something … }
        • #pragma omp critical
        • #pragma omp barrier
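
The "How to use" pragma of slide 20 can be tried on a small loop such as the one below. This is a sketch under assumptions: the function name `scale_and_max`, the loop body, and the variable names are illustrative and not taken from the slides. With `default(none)`, every variable used inside the region has to be listed explicitly, which is why the raw pointers and scalars appear in the clauses.

```cpp
#include <limits>
#include <vector>

// Writes y[i] = factor * x[i] and returns the largest value written.
double scale_and_max(const std::vector<double>& x,
                     std::vector<double>& y, double factor) {
    y.resize(x.size());
    int n = static_cast<int>(x.size());
    const double* xp = x.data();
    double* yp = y.data();
    double max_value = -std::numeric_limits<double>::infinity();

    // shared:       all threads use the same two arrays
    // firstprivate: each thread gets its own initialized copy of n, factor
    // reduction:    per-thread maxima are combined at the end of the loop
    // schedule:     iterations are split into equal contiguous chunks
    #pragma omp parallel for default(none) shared(xp, yp) \
        firstprivate(n, factor) reduction(max:max_value) schedule(static)
    for (int i = 0; i < n; ++i) {
        yp[i] = factor * xp[i];
        if (yp[i] > max_value) max_value = yp[i];
    }
    return max_value;
}
```

The `max` reduction requires OpenMP 3.1 or later. Compiled without OpenMP support, the pragma is ignored and the loop runs on one thread with the same result.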

  21. Multi-Chip | Multi-Sockets • NUMA (Non-uniform memory access) • slower than local memory • Position in memory => first touch • Parallelize the initialisation with : schedule(static) • read-only data => copy in each local memory • Thread Affinity
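
The first-touch advice of slide 21 can be sketched as follows, assuming an operating system that places a memory page on the NUMA node of the thread that first writes to it (the default policy on Linux). The function names `first_touch_alloc` and `sum_static` are hypothetical.

```cpp
#include <cstddef>
#include <memory>

// Allocates n doubles and initializes them in parallel. "new double[n]"
// does not write to the elements, so pages are first touched inside the
// parallel loop, by the thread that owns each static chunk.
std::unique_ptr<double[]> first_touch_alloc(std::size_t n, double value) {
    std::unique_ptr<double[]> data(new double[n]);
    double* p = data.get();
    const long long count = static_cast<long long>(n);
    #pragma omp parallel for schedule(static)
    for (long long i = 0; i < count; ++i)
        p[i] = value;
    return data;
}

// Later loops use the same static schedule, so each thread mostly works
// on the chunk it initialized, which should sit in its local memory.
double sum_static(const double* p, std::size_t n) {
    double total = 0.0;
    const long long count = static_cast<long long>(n);
    #pragma omp parallel for schedule(static) reduction(+:total)
    for (long long i = 0; i < count; ++i)
        total += p[i];
    return total;
}
```

The benefit also depends on threads staying on the same cores between the two loops, which is where the thread affinity bullet comes in.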

  22. Questions ?
