Array-of-Structures (AoS) vs. Structure-of-Arrays (SoA)

Presentation Transcript


    1. Array-of-Structures (AoS) vs. Structure-of-Arrays (SoA). Steve Carroll, March 02, 2010

    2. Outline of Talk: What is Stream Processing? What is the problem? Why is Stream Processing relevant to HPC? What are Array of Structures and Structure of Arrays? Why use SoA? What can be done? What is being done?

    3. Stream Processing: Computing paradigm allowing use of multiple functional units. Performance increase in areas such as

    4. Problem Definition: Stream programming requires a new way of thinking about data. This way of thinking is not the traditional one. Programmers must learn a new style.

    5. Example: One use of stream processing is displaying a particle system, e.g. smoke, fire, water spray, dust. The traditional approach is to hold the particles in an array of particle structures (AoS), which is bad for performance. It is better to use the Structure-of-Arrays (SoA) format.

    6. Example (cont.): Array-of-Structures (AoS) Format

    7. Example (cont.)

    8. Example (cont.): Structure-of-Arrays (SoA) Format

    9. Example (cont.)

    10. Example (cont.): AoS access: mParticleArray[i]->position.x; SoA access: mParticleArray->x[i]. SoA requires different thinking than AoS. Why? Stream programming encourages decoupling computation and memory access [4], [5].
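The two access patterns on the slide above can be sketched in C++ roughly as follows. This is an illustrative sketch, not the code from the original slides or from [4]: the type names (`Vec3`, `ParticleAoS`, `ParticlesSoA`), the velocity fields, and the `integrate_*` functions are assumptions made for the example, and the AoS version stores particles by value rather than through pointers as in `mParticleArray[i]->position.x`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// AoS: one struct per particle; all fields of a particle are adjacent in memory.
struct Vec3 { float x, y, z; };
struct ParticleAoS {
    Vec3 position;
    Vec3 velocity;
};

// SoA: one array per field; the same field of neighboring particles is adjacent.
struct ParticlesSoA {
    std::vector<float> x, y, z;
    std::vector<float> vx, vy, vz;

    explicit ParticlesSoA(std::size_t n)
        : x(n), y(n), z(n), vx(n), vy(n), vz(n) {}
};

// AoS update: every iteration walks through one whole particle record.
void integrate_aos(std::vector<ParticleAoS>& p, float dt) {
    for (std::size_t i = 0; i < p.size(); ++i) {
        p[i].position.x += p[i].velocity.x * dt;
        p[i].position.y += p[i].velocity.y * dt;
        p[i].position.z += p[i].velocity.z * dt;
    }
}

// SoA update: each loop streams through dense, unit-stride float arrays.
void integrate_soa(ParticlesSoA& p, float dt) {
    const std::size_t n = p.x.size();
    assert(p.vx.size() == n && p.vy.size() == n && p.vz.size() == n);
    for (std::size_t i = 0; i < n; ++i) p.x[i] += p.vx[i] * dt;
    for (std::size_t i = 0; i < n; ++i) p.y[i] += p.vy[i] * dt;
    for (std::size_t i = 0; i < n; ++i) p.z[i] += p.vz[i] * dt;
}
```

Both functions compute the same result; only the memory layout differs. In the SoA version each loop reads and writes contiguous floats, which is the arrangement that SIMD loads and stores work on directly.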

    11. Stream Processing and HPC: “Can 2000 grad students complete a PhD’s worth of research in 1 day? Most likely not! One must design novel ways to utilize large scale resources in efficient ways.” [6] 3D graphics is a huge workload, consisting of

    12. Stream Processing and HPC (cont.): Assigned paper: Ma, W.-C., Yang, C.-L. Using Intel streaming SIMD extensions for 3D geometry processing. In Advances in Multimedia Information Processing – PCM, 2002. Using SIMD floating point (Intel SSE) alone achieves close to a 3X speedup for graphics. Arranging the vertices in a SIMD-friendly layout raises the speedup to as much as 4X.

    13. Issues: Organizing data in SIMD format has significant overhead. The conventional approach is AoS. Intel suggests the SoA approach.

    14. Issues (cont.)

    15. Issues (cont.): AoS requires more instructions than SoA, because the data must first be rearranged into SIMD format.
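One way to see where the extra instructions come from is to write the same computation for both layouts with SSE intrinsics. The sketch below is an assumption-laden illustration, not the geometry pipeline measured in [3]: the `Vertex` type, the function names `length2_soa` and `length2_aos`, and the choice of squared vector length as the workload are all hypothetical, and the code only builds on x86 targets with SSE.

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics (x86 only)

struct Vertex { float x, y, z, w; };  // hypothetical AoS element, 16 bytes

// SoA: three loads give x0..x3, y0..y3, z0..z3 already in SIMD format.
void length2_soa(const float* x, const float* y, const float* z,
                 float* out, std::size_t n) {
    assert(n % 4 == 0);  // sketch handles whole groups of four only
    for (std::size_t i = 0; i < n; i += 4) {
        __m128 vx = _mm_loadu_ps(x + i);
        __m128 vy = _mm_loadu_ps(y + i);
        __m128 vz = _mm_loadu_ps(z + i);
        __m128 r = _mm_add_ps(_mm_add_ps(_mm_mul_ps(vx, vx), _mm_mul_ps(vy, vy)),
                              _mm_mul_ps(vz, vz));
        _mm_storeu_ps(out + i, r);
    }
}

// AoS: four loads give (x,y,z,w) per vertex, so a 4x4 transpose is needed
// before the same vertical arithmetic can run.
void length2_aos(const Vertex* v, float* out, std::size_t n) {
    assert(n % 4 == 0);
    for (std::size_t i = 0; i < n; i += 4) {
        const float* base = reinterpret_cast<const float*>(v + i);
        __m128 r0 = _mm_loadu_ps(base);       // x0 y0 z0 w0
        __m128 r1 = _mm_loadu_ps(base + 4);   // x1 y1 z1 w1
        __m128 r2 = _mm_loadu_ps(base + 8);   // x2 y2 z2 w2
        __m128 r3 = _mm_loadu_ps(base + 12);  // x3 y3 z3 w3
        _MM_TRANSPOSE4_PS(r0, r1, r2, r3);    // extra shuffles: r0 = x0..x3, r1 = y0..y3, r2 = z0..z3
        __m128 r = _mm_add_ps(_mm_add_ps(_mm_mul_ps(r0, r0), _mm_mul_ps(r1, r1)),
                              _mm_mul_ps(r2, r2));
        _mm_storeu_ps(out + i, r);
    }
}
```

The arithmetic is identical in both functions. The AoS version pays for one additional load and for the shuffle/unpack instructions that `_MM_TRANSPOSE4_PS` expands to, which is the reorganization overhead the slides refer to.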

    16. Current Work: SoA results in better performance.

    17. Current Work (cont.): [2] maps stream programming code to general-purpose CPUs. Thinking in terms of stream programming is believed to be a benefit in itself. [9], [10] studied design issues of SIMD instruction-set extensions (VIS, SSE). [11] evaluated the MMX technology. [12] studied the performance of SIMD on 3D geometry.

    18. Current Work (cont.): Memory management is an important related issue.

    19. Current Work (cont.): Stream processors in current use / research: Imagine (Stanford University, 1996); SSE3 (Intel, 2004); Cell Broadband Engine Architecture (STI, 2005); Storm-1 (Stream Processors, Inc., 2007); System S (IBM, 2007); GPUs (ATI, NVIDIA, present); Merrimac (Stanford University, present).

    20. Paper Conclusion: With SSE, traditional AoS data structures boost performance 2.7X to 3X [3]. SoA data structures boost performance 3.1X to 3.3X [3]. SoA with prefetching boosts performance 3.6X to 3.9X [3]. The layout of data in memory is important for performance.
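The "SoA with prefetching" variant can be sketched as a streaming loop that asks for data a fixed distance ahead of the element it is working on. This is a minimal sketch under stated assumptions, not the scheme evaluated in [3]: the function name `scale_soa_prefetch`, the scaling workload, and the prefetch distance `kAhead` are all hypothetical, and a useful distance depends on the machine and has to be tuned by measurement.

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics, including _mm_prefetch (x86 only)

// Scale one SoA component array in place, prefetching ahead of the loop.
void scale_soa_prefetch(float* x, std::size_t n, float s) {
    assert(n % 4 == 0);             // sketch handles whole groups of four only
    const std::size_t kAhead = 64;  // assumed prefetch distance, in floats
    const __m128 vs = _mm_set1_ps(s);
    for (std::size_t i = 0; i < n; i += 4) {
        // Hint that data kAhead floats ahead will be needed soon.
        if (i + kAhead < n) {
            _mm_prefetch(reinterpret_cast<const char*>(x + i + kAhead), _MM_HINT_T0);
        }
        _mm_storeu_ps(x + i, _mm_mul_ps(_mm_loadu_ps(x + i), vs));
    }
}
```

Because an SoA array is contiguous, the address to prefetch is a simple offset from the current position. The prefetch is only a hint and does not change the result of the loop.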

    21. Conclusion & Comments: Coding programs in a streaming style “can improve performance on today’s machines and smooth the way for significant performance improvements with the deployment of streaming architectures” [2]. Stream programming forces the programmer to think about memory accesses and compute operations separately.
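Thinking about memory accesses and compute operations separately can be illustrated with a gather / kernel / scatter structure. The sketch below is only an illustration of that idea and is not taken from [2] or [5]: the `Particle` record and the functions `gather_x`, `kernel_shift`, and `scatter_x` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Particle { float x, y, z, mass; };  // hypothetical AoS record

// Memory phase: gather one field out of the records into a dense stream.
std::vector<float> gather_x(const std::vector<Particle>& p) {
    std::vector<float> s(p.size());
    for (std::size_t i = 0; i < p.size(); ++i) s[i] = p[i].x;
    return s;
}

// Compute phase: the kernel sees only a contiguous stream of floats.
void kernel_shift(std::vector<float>& s, float dx) {
    for (float& v : s) v += dx;
}

// Memory phase: scatter the results back into the records.
void scatter_x(const std::vector<float>& s, std::vector<Particle>& p) {
    assert(s.size() == p.size());
    for (std::size_t i = 0; i < p.size(); ++i) p[i].x = s[i];
}
```

A caller would run `gather_x`, then `kernel_shift`, then `scatter_x`. If the data is kept in SoA form from the start, the gather and scatter phases disappear and only the kernel remains.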

    22. Stream processors can benefit numerous types of problems if data structures are kept in mind.

    23. References
        [1] Gordon, M. I. et al. 2002. A Stream Compiler for Communication-Exposed Architectures. In Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems, October 05 - 09, 2002.
        [2] Gummaraju, J. and Rosenblum, M. 2005. Stream Programming on General-Purpose Processors. In Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture, November 12 - 16, 2005.
        [3] Ma, W.-C. and Yang, C.-L. 2002. Using Intel Streaming SIMD Extensions for 3D Geometry Processing. In Advances in Multimedia Information Processing, PCM 2002.
        [4] Creating a Particle System with Streaming SIMD Extensions. http://software.intel.com/en-us/articles/creating-a-particle-system-with-streaming-simd-extensions/

    24. References
        [5] Thies, W., Karczmarek, M., and Amarasinghe, S. 2002. StreamIt: A Language for Streaming Applications. In International Conference on Compiler Construction, April 2002.
        [6] Houston, M. 2007. General Purpose Computation on Graphics Processors (GPGPU). Stanford University, Public Talks.
        [7] Folding@Home Official Stats. http://fah-web.stanford.edu/cgi-bin/main.py?qtype=osstats
        [8] Buck, I. 2003. Brook Specification v0.2. merrimac.stanford.edu/brook/brookspec-v0.2.pdf, October 2003.
        [9] Tremblay, M. et al. 1996. VIS Speeds New Media Processing. IEEE Micro, 16(4):10-20.
        [10] Raman, S. K., Pentkovski, V., and Keshava, J. 2000. Implementing Streaming SIMD Extensions on the Pentium III Processor. IEEE Micro, 20(4):47-57.

    25. References
        [11] Bhargava, R. et al. 1998. Evaluating MMX Technology Using DSP and Multimedia Applications. In ACM/IEEE International Symposium on Microarchitecture, 1998.
        [12] Yang, C.-L., Sano, B., and Lebeck, A. R. 1998. Exploiting Instruction Level Parallelism in Geometry Processing for Three Dimensional Graphics Applications. In ACM/IEEE International Symposium on Microarchitecture, 1998.
