1 / 28

Introduction to Graphics Hardware and GPUs

Introduction to Graphics Hardware and GPUs Yannick Francken Tom Mertens Overview Definition Motivation History of Graphics Hardware Graphics Pipeline Vertex and Fragment Shaders Modern Graphics Hardware Stream Programming GPU Stream Programming Languages Exercise More Information

niveditha
Download Presentation

Introduction to Graphics Hardware and GPUs

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Introduction to Graphics Hardware and GPUs Yannick Francken Tom Mertens

  2. Overview • Definition • Motivation • History of Graphics Hardware • Graphics Pipeline • Vertex and Fragment Shaders • Modern Graphics Hardware • Stream Programming • GPU Stream Programming • Languages • Exercise • More Information

  3. Definition Logical Representation of Visual Information Output Signal

  4. Motivation • Real Time: 15 – 60 fps • High Resolution

  5. Motivation • High CPU load • Physics, AI, sound, network, … • Graphics demand: • Fast memory access • Many lookups [ vertices, normal, textures, … ] • High bandwidth usage • A few GB/s needed in regular cases! • Large number of flops • Flops = Floating Point Operations [ ADD, MUL, SUB, … ] • Illustration: matrix-vector products (16 MUL + 12 ADD) x (#vertices + #normals) x fps = (28 Flops) x (6.000.000) x 30 ≈ 5GFlops • Conclusion: Real time graphics needs supporting hardware!

  6. History of Graphics Hardware • … - mid ’90s • SGI mainframes and workstations • PC: only 2D graphics hardware • mid ’90s • Consumer 3D graphics hardware (PC) • 3dfx, nVidia, Matrox, ATI, … • Triangle rasterization (only) • Cheap: pushed by game industry • 1999 • PC-card with TnL [ Transform and Lighting ] • nVIDIA GeForce: Graphics Processing Unit (GPU) • PC-card more powerful than specialized workstations • Modern graphics hardware • Graphics pipeline partly programmable • Leaders: ATI and nVidia • “ATI Radeon X1950” and “nVidia GeForce 8800” • Game consoles similar to GPUs (Xbox)

  7. Graphics Pipeline LOD selection Frustum Culling Portal Culling … Application Modelview/Projection tr. Clipping Lighting Division by w Primitive Assembly Viewport transform Backface culling Geometry Processing Scan Conversion Fragment Shading [Color and Texture interpol.] Frame Buffer Ops [Z-buffer, Alpha Blending,…] Rasterization Output to Device Output

  8. Graphics Pipeline LOD selection Frustum Culling Portal Culling … Application Programmable Clipping Division by w Primitive Assembly Viewport transform Backface culling VERTEX SHADER Geometry Processing Scan Conversion Rasterization FRAGMENT SHADER Output to Device Output

  9. ( x, y, z, w ) ( nx, ny, nz ) ( s, t, r, q ) ( r, g, b, a ) ( x’, y’, z’, w’ ) ( nx’, ny’, nz’ ) ( s’, t’, r’, q’ ) ( r’, g’, b’, a’ ) VERTEX SHADER ( x, y ) ( r’, g’, b’, a’ ) ( depth’ ) ( x, y ) ( r, g, b, a ) ( depth ) FRAGMENT SHADER Vertex and Fragment Shaders

  10. Modern Graphics Hardware • GPU = Graphics Processing Unit • Vector processor • Operates on 4 tuples • Position ( x, y, z, w ) • Color ( red, green, blue, alpha ) • Texture Coordinates ( s, t, r, q ) • 4 tuple ops, 1 clock cycle • SIMD [ Single Instruction Multiple Data ] • ADD, MUL, SUB, DIV, MADD, …

  11. Modern Graphics Hardware • Pipelining • Number of stages • Parallelism • Number of parallel processes • Parallelism + pipelining • Number of parallel pipelines 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3

  12. Modern Graphics Hardware • Parallelism + pipelining: ATI Radeon 9700 4 vertex pipelines 8 pixel pipelines

  13. Modern Graphics Hardware • Features of ATI Radeon X1900 XTX • Core speed 650 Mhz • 48 pixel shader processors • 8 vertex shader processors • 51 GB/s memory bandwidth • 512 MB memory

  14. Modern Graphics Hardware • High Memory Bandwidth Graphics Card High bandwidth 51GB/s GPU 650Mhz Graphics memory ½GB Output AGP bus 2GB/s Parallel Processes Processor Chip Main memory 1GB AGP memory ½GB High bandwidth 77GB/s CPU 3Ghz 3GB/s Cache ½MB

  15. Stream Programming • Input: stream of data records • Output: stream(s) of data records • Kernel: operates sequentially on the data records, accessing one record at a time! • Read-Only Memory: record independent read only memory

  16. GPU Stream Programming • Vertex Shader • Input and output streams • Vertices, normals, colors, texture coordinates • Read only memory • Uniform variables • Uniform = constant per stream • Textures, floats, ints, arrays, … • Fragment Shader • Input and output streams • Pixels • Z-values • Read only memory • See above

  17. Languages • Assembly • Cg [ C for Graphics ] • HLSL [ High Level Shading Language ] • GLSL [ OpenGL Shading Language ] • Sh • BrookGPU

  18. Specialized Instruction Set DP4: 4 tuple dot poduct RSQ: reciprocal square root MAD: multiply and add DPH: homogeneous dot product SCS: sine and cosine LRP: linear interpolate TEX: texture map … Nowadays, “not” used directly anymore Generated by high level language compilers !!ARBvp1.0 ATTRIB pos = vertex.position; PARAM mat[4] = { state.matrix.mvp }; # Transform by concatenation of the # MODELVIEW and PROJECTION # matrices. DP4 result.position.x, mat[0], pos; DP4 result.position.y, mat[1], pos; DP4 result.position.z, mat[2], pos; DP4 result.position.w, mat[3], pos; # Pass the primary color through w/o # lighting. MOV result.color, vertex.color; END Assembly

  19. High level programming language Static conditional jumps if, while, for, … Data dependent conditional jumps SIMD Fragment shader: only efficient in case of coherent program flow! No pointers! struct appdata { float4 position : POSITION; float3 normal : NORMAL; float3 color : DIFFUSE; float3 VertexColor : SPECULAR; }; struct vfconn { float4 HPOS : POSITION; float4 COL0 : COLOR0; }; vfconn main( appdata IN, uniformfloat4 Kd, uniform float4x4 mvp ) { vfconn OUT; OUT.HPOS = mul( mvp, IN.position); OUT.COL0.xyz = Kd.xyz * IN.VertexColor.xyz; OUT.COL0.w = 1.0; return OUT; } Cg / HLSL / GLSL

  20. Shader code embedded in C++ // C++ Code … vsh = SH_BEGIN_VERTEX_PROGRAM { ShInputNormal3f normal; ShInputPosition4f p; ShOutputPoint4f ov; ShOutputNormal3f on; ShOutputVector3f lvv; ShOutputPosition4f opd; opd = Globals::mvp | p; on = normalize( Globals::mv | normal ); ov = -normalize( Globals::mv | p ); lvv = normalize( Globals::lightPos – ( Globals::mv | p) ( 0,1,2 ) ); } SH_END_PROGRAM; fsh = SH_BEGIN_FRAGMENT_PROGRAM { ShInputVector4f v; ShInputNormal3f n; ShInputVector3f lvv; ShInputPosition4f p; ShOutputColor3f out; out( 0,1,2 ) = Globals::color * dot( normalize( n ), normalize( lvv ) ); } SH_END_PROGRAM; Sh

  21. GPGPU Language General Purpose GPU Language Brook: Streaming extension of C BrookGPU: GPU port of Brook No computer graphics knowledge required! kernel void k( float s<>, float3 f, float a[10][10], out float o<> ); float a<100>; float b<100>; float c<10,10>; streamRead( a, data1 ); streamRead( b, data2 ); streamRead( c, data3 ); // Call kernel "k" k( a, 3.2f, c, b ); streamWrite( b, result ) BrookGPU

  22. Screenshots • nVidia Toolkit [ Reflection-Bump Mapping ]

  23. Screenshots • Far Cry [ UBISOFT ]

  24. Screenshots • NPR [ ATI Research Group ]

  25. Exercise • Vertex Shader in Cg • Free Form Deformation • Framework available on website Vertex Shader

  26. More Information • nVidia • http://developer.nvidia.com/ • ATI • http://www.ati.com/developer/ • General Purpose GPU Programming • http://www.gpgpu.org • GPU Programming and Architecture • http://www.cis.upenn.edu/~suvenkat/700/ • Hardware • http://www.beyond3d.com • http://www.tomshardware.com

  27. Questions?

More Related