Query Reordering for Photon Mapping

Presentation Transcript


  1. Query Reordering for Photon Mapping Rohit Saboo

  2. Photon Mapping. A two-step solution for global illumination: • Step 1: Build the photon map • Step 2: Shoot eye rays and perform a “gather”
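The gather in Step 2 can be sketched as a nearest-photon density estimate. This is a minimal illustration, not the presenter's code: it assumes scalar photon power, a diffuse surface, and a brute-force sort in place of the kd-tree search the talk refers to. The name `gather_irradiance` and the photon layout are hypothetical.

```python
import math

def gather_irradiance(photons, point, k):
    """Estimate irradiance at `point` from the k nearest photons.

    photons: list of ((x, y, z), power) tuples (assumed layout).
    Uses the disc-area density estimate: sum of power / (pi * r^2),
    where r is the distance to the k-th nearest photon.
    """
    def dist2(photon):
        position = photon[0]
        return sum((a - b) ** 2 for a, b in zip(position, point))

    # Brute force stand-in for a kd-tree nearest-neighbour query.
    nearest = sorted(photons, key=dist2)[:k]
    if not nearest:
        return 0.0
    r2 = dist2(nearest[-1])
    if r2 == 0.0:
        return 0.0  # degenerate disc; avoid dividing by zero
    return sum(power for _, power in nearest) / (math.pi * r2)

photons = [((1.0, 0.0, 0.0), 1.0), ((0.0, 2.0, 0.0), 1.0), ((5.0, 5.0, 5.0), 1.0)]
estimate = gather_irradiance(photons, (0.0, 0.0, 0.0), k=2)
print(estimate)
```

Every eye-ray hit point triggers lookups of this kind, which is why the order in which the lookups are issued matters for the cache.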

  3. Gather variants • Approximate • Accurate

  4. Bandwidth estimate • 100 queries for one eye ray • 20 bytes per photon • 512x512 image • Super-sampling. Together these give a bandwidth estimate of 50 GB. Caches fetch data in blocks, and with other factors the bandwidth requirement could go up to 200 GB
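The arithmetic behind this estimate can be checked with a short sketch. The slide does not state the super-sampling rate or how many photons each query reads, so the code only computes the base figure from the stated numbers and the combined factor that the 50 GB estimate would imply. Decimal gigabytes (1e9 bytes) are an assumption.

```python
# Figures taken from the slide.
QUERIES_PER_EYE_RAY = 100
BYTES_PER_PHOTON = 20
WIDTH, HEIGHT = 512, 512

def base_bytes():
    """Bytes read if each query touches one photon and each pixel gets one eye ray."""
    return WIDTH * HEIGHT * QUERIES_PER_EYE_RAY * BYTES_PER_PHOTON

def implied_factor(target_bytes):
    """Combined multiplier (super-sampling, photons per query, ...) needed to reach target_bytes."""
    return target_bytes / base_bytes()

print(base_bytes() / 1e9)        # base traffic in GB, about 0.52
print(implied_factor(50e9))      # factor implied by the 50 GB estimate, about 95
print(200e9 / 50e9)              # extra factor attributed to block fetches and other effects
```

Under these assumptions the stated parameters alone account for about half a gigabyte, so most of the 50 GB comes from the unstated multipliers.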

  5. Reordering Queries. Improve locality of data and queries so that the cache is effective. Two ways: • Generate the queries in some order • Generate the queries, reorder them in some manner, and then run them.
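The second approach (generate, reorder, run) can be sketched with pixel coordinates standing in for queries. This is an illustration under assumptions: the tile size of 8 and the function names are hypothetical, and real queries would be hit points rather than pixels.

```python
def row_order(width, height):
    """Queries in plain scanline order: left to right, top to bottom."""
    return [(x, y) for y in range(height) for x in range(width)]

def tiled_row_order(width, height, tile=8):
    """Reorder scanline queries so each tile x tile block is finished before the next starts."""
    def key(pixel):
        x, y = pixel
        # Tile row, tile column, then row and column inside the tile.
        return (y // tile, x // tile, y % tile, x % tile)
    return sorted(row_order(width, height), key=key)

# A 4x4 image with 2x2 tiles: the first tile is visited completely first.
order = tiled_row_order(4, 4, tile=2)
print(order[:8])
```

Neighbouring queries in the tiled order stay close in both x and y, so they are more likely to touch the same photons than queries a full scanline apart.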

  6. Cache Hierarchy. A very naïve hierarchy. [Figure labels: Processor, I-Cache, D-Cache, L2 Cache, RAM]

  7. Reordering methods • Row ordering • Tiled row ordering • Direction binning • Hashed • Tiled direction-binned hashed • Hilbert curve • Tiled Hilbert curve
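Of the methods listed, the Hilbert curve ordering is the least obvious to implement. The sketch below uses the standard iterative mapping from (x, y) to a Hilbert index on an n x n grid, with n a power of two (512 qualifies). It is a generic version of the technique, not the presenter's implementation, and the function names are hypothetical.

```python
def hilbert_index(n, x, y):
    """Distance along the Hilbert curve of cell (x, y) in an n x n grid (n a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d

def hilbert_order(n):
    """All pixels of an n x n image, sorted along the Hilbert curve."""
    pixels = [(x, y) for y in range(n) for x in range(n)]
    return sorted(pixels, key=lambda p: hilbert_index(n, p[0], p[1]))

print(hilbert_order(2))
```

Consecutive cells along the curve are always grid neighbours, which is the locality property that makes this ordering attractive for cache reuse.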

  8. The Cornell Box. Not one of the results.

  9. Performance monitoring • Intel Pentium M processor, 1.7 GHz (frequency scaling disabled) • FSB 533 MHz • 2 MB L2 cache • Separate I-cache and D-cache, each 32 KB and 8-way set associative • 768 MB RAM • Windows XP (with most services disabled) • pbrt • Intel C++ compiler with all optimizations • VTune performance analysis package

  10. Results. With plain irradiance caching: • Branch mispredictions account for ~25% of the time • The algorithm seems too complicated to be optimized successfully • Bus utilization factor: 0.0024 (number of times the bus was asserted busy vs. clockticks), which is very low • ~10% of the time is lost to cache misses

  11. Results, continued. Naïve reordering: • Bus utilization: 0.0014, again very low • CPU load port: 0.54 loads per clocktick (the maximum I could achieve is 1.07) • ~7% of the time is lost to cache misses

  12. Results, continued. Hilbert curve: • Bus utilization: 0.00074 (about half the naïve reordering figure and about a third of the plain irradiance caching figure) • 0.93 loads per clocktick (almost as high as one can get) • Not much impact from L2 cache misses

  13. Multi-threading • Multithreaded the kd-tree data structure • Simply starts two threads to do the search • Results show very small changes • Maybe some other threading approach would be better? • The cost of threading overshadows any gains

  14. Possible Discrepancies • Pentium M processor vs. desktop processors: results are highly architecture dependent (e.g. if the processor has more than one port connected to the D-cache) • The analysis was not run over the entire duration of the run

  15. Conclusions • L2-to-memory bandwidth is not the bottleneck • The bottleneck lies more in CPU-to-L1 accesses and computation • There is scope for improving performance • But this would need algorithms that have very little overhead, are simple enough to be optimized by the compiler, and at the same time exploit cache coherency

  16. References • Josh Steinhurst, Reordering for Cache Conscious Photon Mapping • Jensen, Realistic Image Synthesis Using Photon Mapping • IA-32 Intel Architecture Software Developer’s Manual, Volume 3: System Programming Guide. ftp://download.intel.com/design/Pentium4/manuals/25366814.pdf • VTune Performance Analyzer. http://www.intel.com/software/products/vtune/vpa/
