1 / 33

Programming GPUs CS 446: Real-Time Rendering & Game Technology

Programming GPUs CS 446: Real-Time Rendering & Game Technology. David Luebke University of Virginia. Demo. Today: Jiayuan Meng, ATI demos Mar 14: Sean Arietta, Super Mario Sunshine. Assignments. Assn 3 due tomorrow at midnight Assignment 4 now out on web page

oki
Download Presentation

Programming GPUs CS 446: Real-Time Rendering & Game Technology

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Programming GPUs CS 446: Real-Time Rendering & Game Technology David Luebke University of Virginia

  2. Demo • Today: Jiayuan Meng, ATI demos • Mar 14: Sean Arietta, Super Mario Sunshine Real-Time Rendering

  3. Assignments • Assn 3 due tomorrow at midnight • Assignment 4 now out on web page • Individual assignment w/ group “integration” deadline • Due Mar 21 (individual), Mar 23 (group) • Next 2 assignments will also follow this pattern Real-Time Rendering

  4. Programmable vertex processor! The ModernGraphics Pipeline Graphics State CPU GPU VertexProcessor FragmentProcessor Xformed, Lit Vertices (2D) Screenspace triangles (2D) Fragments (pre-pixels) Final Pixels (Color, Depth) Application Transform& Light AssemblePrimitives Rasterize Shade Vertices (3D) VideoMemory(Textures) Render-to-texture • Programmable pixel processor! Real-Time Rendering

  5. GPU vendor differences • Note: this slide (March 2006) will be dated quickly! • NVIDIA: as described in previous lecture • ATI GPUs (1900XT current high-end part): • No vertex texture fetch (but good render-to-vertex-array) • Far fewer levels of computed texture coordinates • Better at fine-grained (less coherent) dynamic branching • ATI Xenos (Xbox) has some cool design features: • Unified shader model: vertex proc == pixel proc • Scatter support: shaders can write arbitrary memory loc Real-Time Rendering

  6. Cg – “C for Graphics” • Cg is a high-level GPU programming language • Designed by NVIDIA and Microsoft • Competes with the (quite similar) GL Shading Language, a.k.a GLslang Real-Time Rendering

  7. Programming in assembly is painful Assembly Cg …FRC R2.y, C11.w; ADD R3.x, C11.w, -R2.y; MOV H4.y, R2.y; ADD H4.x, -H4.y, C4.w; MUL R3.xy, R3.xyww, C11.xyww; ADD R3.xy, R3.xyww, C11.z; TEX H5, R3, TEX2, 2D; ADD R3.x, R3.x, C11.x; TEX H6, R3, TEX2, 2D;… … L2weight = timeval – floor(timeval); L1weight = 1.0 – L2weight; ocoord1 = floor(timeval)/64.0 + 1.0/128.0; ocoord2 = ocoord1 + 1.0/64.0; L1offset = f2tex2D(tex2, float2(ocoord1, 1.0/128.0)); L2offset = f2tex2D(tex2, float2(ocoord2, 1.0/128.0)); … • Easier to read and modify • Cross-platform • Combine pieces • etc. Real-Time Rendering

  8. Some points in the design space • CPU languages • C – close to the hardware; general purpose • C++, Java, lisp – require memory management • RenderMan – specialized for shading • Real-time shading languages • Stanford shading language • Creative Labs shading language Real-Time Rendering

  9. Design strategy • Start with C (and a bit of C++) • Minimizes number of decisions • Gives you known mistakes instead of unknown ones • Allow subsetting of the language • Add features desired for GPU’s • To support GPU programming model • To enable high performance • Tweak to make it fit together well Real-Time Rendering

  10. How are GPUs different from CPUs? • GPU is a stream processor • Multiple programmable processing units • Connected by data flows VertexProcessor FragmentProcessor FramebufferOperations Assembly &Rasterization Application Framebuffer Textures

  11. Cg separates vertex & fragment programs VertexProcessor FragmentProcessor FramebufferOperations Assembly &Rasterization Application Framebuffer Textures Program Program Real-Time Rendering

  12. Cg programs have two kinds of inputs • Varying inputs (streaming data) • e.g. normal vector – comes with each vertex • This is the default kind of input • Uniform inputs (a.k.a. graphics state) • e.g. modelview matrix • Note: Outputs are always varying vout MyVertexProgram( float4 normal,uniform float4x4 modelview) { …

  13. Binding VP outputs to FP inputs • Let compiler do it • Define a single structure • Use it for vertex-program output • Use it for fragment-program input struct vout { float4 color; float4 texcoord; … };

  14. Binding VP outputs to FP inputs • Do it yourself • Specify register bindings for VP outputs • Specify register bindings for FP inputs • May introduce HW dependence • Necessary for mixing Cg with assembly struct vout { float4 color: TEX3; float4 texcoord: TEX5; … };

  15. Some inputs and outputs are special • E.g. the position output from vert prog • This output drives the rasterizer • It must be marked struct vout { float4 color; float4 texcoord; float4 position : HPOS; };

  16. How are GPUs different from CPUs? • Greater variation in basic capabilities • Most processors don’t yet support branching • Vertex processors don’t support texture mapping • Some processors support additional data types • Compiler can’t hide these differences • Least-common-denominator is too restrictive • Cg exposes differences via language profiles(list of capabilities and data types) • Over time, profiles will converge

  17. How are GPUs different from CPUs? • Optimized for 4-vector arithmetic • Useful for graphics – colors, vectors, texcoords • Easy way to get high performance/cost • C philosophy says: expose these HW data types • Cg has vector data types and operationse.g. float2, float3, float4 • Makes it obvious how to get high performance • Cg also has matrix data typese.g. float3x3, float3x4, float4x4

  18. Some vector operations // // Clamp components of 3-vector to [minval,maxval] range // float3 clamp(float3 a, float minval, float maxval) { a = (a < minval.xxx) ? Minval.xxx : a; a = (a > maxval.xxx) ? Maxval.xxx : a; return a; } ? : is per-component for vectors Swizzle – replicate and/or rearrange components. Comparisons between vectorsare per-component, andproduce vector result

  19. Cg has arrays too • Declared just as in C • But, arrays are distinct frombuilt-in vector types:float4 != float[4] • Language profiles may restrict array usage vout MyVertexProgram(float3 lightcolor[10], …) { …

  20. How are GPUs different from CPUs? • No support for pointers • Arrays are first-class data types in Cg • No integer data type • Cg adds “bool” data type for boolean operations • This change isn’t obvious except when declaring vars Real-Time Rendering

  21. Cg basic data types • All profiles: • float • bool • All profiles with texture lookups: • sampler1D, sampler2D, sampler3D,samplerCUBE • NV_fragment_program profile: • half -- half-precision float • fixed -- fixed point [-2,2) Real-Time Rendering

  22. Other Cg capabilities • Function overloading • Function parameters are value/result • Use “out” modifier to declare return value • “discard” statement – fragment kill void foo (float a, out float b) { b = a; } if (a > b) discard; Real-Time Rendering

  23. Cg Built-in functions • Texture lookups (in fragment profiles) • Math • Dot product • Matrix multiply • Sin/cos/etc. • Normalize • Misc • Partial derivative (when supported) • See spec for more details Real-Time Rendering

  24. Cg Example • The following fragment program implements a (very) simple toon shader • Flat 3-tone shading • Highlight • Base color • Shadow • Black silhouettes Real-Time Rendering

  25. Cg Example – part 1 // In: // eye_space position = TEX7 // eye space T = (TEX4.x, TEX5.x, TEX6.x) denormalized // eye space B = (TEX4.y, TEX5.y, TEX6.y) denormalized // eye space N = (TEX4.z, TEX5.z, TEX6.z) denormalized fragout frag program main(vf30 In) { float m = 30; // power float3 hiCol = float3( 1.0, 0.1, 0.1 ); // lit color float3 lowCol = float3( 0.3, 0.0, 0.0 ); // dark color float3 specCol = float3( 1.0, 1.0, 1.0 ); // specular color // Get eye-space eye vector. float3 e = normalize( -In.TEX7.xyz ); // Get eye-space normal vector. float3 n = normalize(float3(In.TEX4.z, In.TEX5.z, In.TEX6.z));

  26. Cg Example – part 2 float edgeMask = (dot(e, n) > 0.4) ? 1 : 0; float3 lpos = float3(3,3,3); float3 l = normalize(lpos - In.TEX7.xyz); float3 h = normalize(l + e); float specMask = (pow(dot(h, n), m) > 0.5) ? 1 : 0; float hiMask = (dot(l, n) > 0.4) ? 1 : 0; float3 ocol1 = edgeMask * (lerp(lowCol, hiCol, hiMask) + (specMask *specCol)); fragout O; O.COL = float4(ocol1.x, ocol1.y, ocol1.z, 1); return O; }

  27. New vector operators • Swizzle – replicate/rearrange elements a = b.xxyy; • Write mask – selectively over-write a.w = 1.0; • Vector constructor builds vectora = float4(1.0, 0.0, 0.0, 1.0); Real-Time Rendering

  28. Change to constant-typing mechanism • In C, it’s easy to accidentally use high precision half x, y; x = y * 2.0; // Double-precision multiply! • Not in Cg x = y * 2.0; // Half-precision multiply • Unless you want to x = y * 2.0f; // Float-precision multiply Real-Time Rendering

  29. Dot product, matrix multiply • Dot product • dot(v1,v2); // returns a scalar • Matrix multiplications: • matrix-vector: mul(M, v); // returns a vector • vector-matrix: mul(v, M); // returns a vector • matrix-matrix: mul(M, N); // returns a matrix Real-Time Rendering

  30. Demos and Examples Real-Time Rendering

  31. Cg runtime API • Compile a program • Select active programs for rendering • Pass “uniform” parameters to program • Pass “varying” (per-vertex) parameters • Load vertex-program constants • Other housekeeping Real-Time Rendering

  32. Runtime is split into three libraries • API-independent layer – cg.lib • Compilation • Query information about object code • API-dependent layer –cgGL.lib and cgD3D.lib • Bind to compiled program • Specify parameter values • etc. Real-Time Rendering

  33. More Information… • The Cg Toolkit (includes Cg user’s manual): • http://developer.nvidia.com/object/cg_toolkit.html • NVIDIA’s GPU Programming Guide: • http://developer.nvidia.com/object/gpu_programming_guide.html • Cg Tutorial Book • The CgShader class • See http://www.cs.virginia.edu/~rs5ea/tools/ • Hello, Cg (haven’t checked out myself): • http://developer.nvidia.com/object/hello_cg_tutorial.html Real-Time Rendering

More Related