400 likes | 559 Views
CS179: GPU Programming. Lecture 9: Lab 4 Recitation. Today. 3D Textures PBOs Fractals Raytracing Lighting/ Phong Shading Memory Coalescing. 3D Textures. Recall advantages of textures: Not global memory, faster accesses Still available to all threads/blocks Larger in size
E N D
CS179: GPU Programming Lecture 9: Lab 4 Recitation
Today • 3D Textures • PBOs • Fractals • Raytracing • Lighting/Phong Shading • Memory Coalescing
3D Textures • Recall advantages of textures: • Not global memory, faster accesses • Still available to all threads/blocks • Larger in size • Better caching • Filtering, clamping, sampling, etc.
3D Textures • 3D textures store volume data • Could be used for rendering smoke, particles, fractals, etc. • Allocate a 3D cudaArray to make 3D texture • cudaMalloc3DArray gives correct pitch • Declare texture in device • texture<type, 3, mode> tex • Access on device with texture sampling • tex3D(tex, x, y, z)
3D Textures • Some texture properties you can set: • tex.normalized: ints or normalized floats • tex.filterMode: linear or point filtering • tex.addressMode[x]: wrap or clamp (for each dimension) • Bind texture to array • cudaBindTextureToArray • Unbinding is typical, but probably not necessary • All of this is done for you in lab 4!
PBOs • Pixel Buffer Objects (PBOs) • Store volume data • Used to easily render in OpenGL • Recall lab 3 • VBOs stored vertex data • Vertex data remained on GPU -- minimal transfer to/from CPU • Rendered via OpenGL on GPU • Same story here • Pixels instead of verts, but same idea
PBOs • Initialize: • glGenBuffersARB(target, &pbo) • target is the target buffer object • Bind to OpenGL: • glBindBufferARB(target, pbo) • Assign Data: • glBufferDataARB(target, size, data, usage) • data is a pointer to the data, usage tells us how often we read/write • Map to CUDA: • cudaGLMapBufferObject/cudaGLUnmapBufferObject
Fractals • Fractals: infinite complexity given by simple instructions • “Self-similar, recursive” • Difficult for us to process (but nice for a computer!) • Many different kinds (we’ll look at Julia Set) • How to render on CUDA: • Calculate fractal volume or area • Copy into texture • Volume render as PBO • What is a Julia Set?
Mandlebrot Set • “Father” of Julia Set
Mandlebrot Set • Simpler than it looks • Recursively defined by zn+1 = zn2 + c • c is imaginary constant • z0 = 0 • Three possible results based on c: • Converge to 0 (black space) • Stays in finite orbit (boundary • Escapes to infinite (blue area)
Mandlebrot Set • Computed by iteratively computing zn • Assume after some point, it escapes and we can stop checking… • ||zn|| > 2, for example • Coloring is oftentimes based on rate of escape • Don’t need more than a few dozen iterations to see behavior • Demo?
Julia Set • Each pixel on Mandlebrot set has corresponding Julia Set
Julia Set • Idea: instead of starting with z0 = 0, let z0 = c0 • c0 changing will change Julia Set dramatically!
Julia Sets • Why are they useful? • Nothing really practical yet • But they look cool! • Can teach us about chaos, model weather, coastlines, etc. • Parallelizable problem, so good for us!
Julia Sets • Lab 4 is even more exciting than Julia Sets… • 4D Julia Sets!
Julia Sets • 4D: Using quaternions instead of imaginary • Quaternions: 3D extension to imaginary numbers • i2 = j2 = k2 = ijk = -1 • ij = k = -ji, jk = i = -kj, ki = j = -ik • Many uses in graphics • Rotations • Kinematics • Visualizations • Etc. • We give you some nice quaternion functions (sqr_quat, mul_quat, etc.)
Julia Sets • How do we render 4D object? • Projection: taking nD slices of an (n+1)D object • Ex.: MRI Scan - 2D images of a 3D volume • For 4D Julia set, render volume slices of 4D object • Think of it as time evolving object • Slice is one frame in time • Now we have 3 parameters: • z0- starting point for Julia set • c - constant for Mandlebrot set • zp - slicing plane for projection
Julia Sets • How to render: • Transform each coordinate in volume texture to quaternion • q = (pos.x, pos.y, pos.z, dot((pos, 1), plane)) • Implemented for you as pos_to_quat • Store escape speed or convergence in volume texure • Volume render - raytracing
Raytracing • Kind of what it sounds like: tracing rays • Start at some origin ray.o • Step in direction ray.d • If we collide with something, render it! • To check shadows, raytrace back toward light - if object hit, then in shadow • Raytracing used for super high-def images • Can also be used to calculate lighting, volumes, etc.
Raytracing • Might not work great for fractals • Fractals are infinitely thin, so we might skip over many details • Use distance function estimator • Gives lower bound for distance to set from any point in space • Let z’n also be iteratively computed as z’n+1= 2znz’n, z’0= (1,0,0,0)
Raytracing • Rendering this distance function isosurface is okay • Usage: • Iterate zn and z’n until we escape or reach maximum iterations • Return distance of previous slide • Render all pixels “close enough” to set in volume
Raytracing • Better idea: use a bit of raytracing • Load volume data with distances to set • Store in volume texture • Raytrace along a ray through texture • Stop once we see distance is very low, under some epsilon • Each ray handled by one thread, so pretty quick
Raytracing • Better raytracing: • Current model: step along ray by step * ray.d • step = some small constant, e.g. 0.005 • What if we are 0.5 units away? • Don’t need to step by 0.005 • Use adaptive sampling: • step = factor * dist • factor = 0.01-0.5 works well • No need to worry about thread divergence
Raytracing • Calculating ray: • Inverse matrix needed to calculate where we are looking • invViewMatrix given to you, calculated for you • Pass it into constant memory c_invViewMatrix on GPU • ray.o = invViewMat * (0, 0, 0, 1) • ray.d = invViewMat * (u, v, -2.0) • u, v are screen coordinates -- calculate these based on 2D thread index
Lighting • Once we hit fractal, render it! • What color? • Depends on lighting, shading model, material properties… • You get to color based on however you like • Something with some complexity would be good • We suggest phong shading
Phong Shading • 3 Components: Ambient, diffuse, specular
Phong Shading • Ambient: Just a flat color • amb = amb_color;
Phong Shading • Diffuse: Adds soft shadows and highlights based on normal • diff = diff_color * cos(a) • a is angle between light and surface normal • Remember to use normalized vectors! N L a
Phong Shading • Specular: adds in bright highlights • spec = spec_color * dot(R, eye)S • R is L reflected across N • Eye = vector to eye • S = shininess (weird: higher S = less shiny) R N L eye
Phong Shading • Final output color is just sum of components: • out = amb + diff + spec • Main info we need to know: • Light direction (chosen up to you, just hardcode) • Normal (must compute) • Eye vector (this is just -ray.d)
Phong Shading • Calculating Normal via gradient: sample volume texture • For each component (x, y, z): • Sample texture at component + some offset (x + 0.01) • Sample texture at component - some offset (x - 0.01) • Calculate difference per component • Resulting differences are normal components! • We can also directly sample d_juliaDist • This can be pretty slow, but normals will be smoother • Up to you, if you’d like
Coalesced Memory • Recap: coalesced memory gets better access rates • Must be in-order, aligned, and together • Comes into play with thread indexing • index = threadIdx.x + blockDim.x * (blockIdx.x + gridDim.x * blockIdx.y); • index = threadIdx.x + blockDim.y * (blockIdx.y + gridDim.y * blockIdx.x);
Your Task • Some prelab questions will be available • All TODO code is in frac_kernel.cu • Host code: • Copy necessary memory into GPU • Map/Unmap buffers • Run kernels (2 this time, one to compute fractal, one to render) • Use timer events to record time of recalculation • Device code: • d_setfractal: loads volume data with data from d_juliaDist • d_juliaDist: returns estimated distance to Julia set at given point • d_juliaNormal: samples texture to calculate normal • d_render: raytraces to render isosurface, uses shading model to color fractal
Your Task • GPU architecture: • Indexing is made easiest with 1D block, 2D grid • Defined for you, see globally defined consts and dim3s • Space is bounded by region [-2, 2]3 • You’ll need to convert back and forth between this space and texture array indices • Feel free to play with any architecture/setup • In general, feel free to play with anything! • Coloring can be really cool… • Try other functions (z3 + c, for example)
Your Task • Extra Credit: • Raytracing: use raytracing to render shadows (10pts) • Once we hit surface, trace back toward light source • If we hit surface again, the original surface pixel was in shadow, make it slightly darker • Adaptive Detailing: higher detail when we’re zoomed in (5pts) • Allows us to see the infiniteness of the fractal • Essentially, just adjust epsilon based on distance to camera • epsilon: how close we must be to fractal to be considered a “hit”