Synthetic content approach for benchmarking mobile 3D graphics: Work-in-progress Kari J. Kangas, Mika Qvist, Kari Pulli Nokia
Outline • Introduction and motivation • Related work • Synthetic benchmark content approach • Tracing, analyzing, synthesizing • Preliminary results • Future work • Summary
Introduction • Phones already support 2D/3D vector graphics • OpenGL ES, Java M3G, SVG, (OpenVG) • Vector graphics HW is coming • Technologies are adapted from desktop PCs very quickly • Vector graphics performance • Essential part of interactive vector graphics user experience • Very complex issue compared to traditional bitmap graphics; performance is highly content dependent
Why OpenGL ES benchmarking? • Performance optimization • Find performance bugs • Monitor the progress of optimization work • Understand 3D graphics platforms • How various content affects performance • Performance estimates for content developers • Benchmark data is needed as early as possible
Why is benchmarking challenging? • Immature platforms • Fragile SW environment, no GUI • Binary breaks • Source code is needed • Lack of versatile benchmark suites • Content ranges from simple UI controls to M3G to native OpenGL ES • We need speculative content
Synthetic benchmark content approach • Key question: is synthetic content similar enough to the real content from the performance point of view? • Analyze existing OpenGL ES content • Tracer, trace player, analyzer • Create synthetic benchmark content • Synthetic content tool • Ensure the synthetic content matches the original content from the performance point of view • Use synthetic content for benchmarking
Related work • Workload characterization • Dunwoody and Linton [1990], Mitra and Chiueh [1999], and Antochi et al. [2004] • Analyzing workload features • Render-time estimation • Funkhouser and Sequin [1993] • Benchmark suites • SPMark04, 3DMarkMobile06
Tracing OpenGL ES content • OpenGL ES tracer • Store OpenGL ES calls & parameters • OpenGL ES impl. • Render graphics • Applications used as is • No changes, no source code • Real-time tracing • Full trace vs. sample frames
[Diagram: OpenGL ES application → (OpenGL ES) → OpenGL ES tracer → (OpenGL ES) → OpenGL ES implementation; the tracer also outputs the call trace]
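The interposition idea on this slide can be sketched in a few lines. This is only an illustration, not the authors' tracer: `FakeGL` and `Tracer` are hypothetical names, and a real tracer would sit between a native application and the OpenGL ES library rather than wrap a Python object.

```python
from typing import Any, Callable, List, Tuple

# One trace record: (function name, positional arguments).
TraceRecord = Tuple[str, Tuple[Any, ...]]


class FakeGL:
    """Hypothetical stand-in for an OpenGL ES implementation."""

    def glClear(self, mask: int) -> None:
        pass  # a real implementation would clear the selected buffers

    def glDrawArrays(self, mode: int, first: int, count: int) -> None:
        pass  # a real implementation would render the primitives


class Tracer:
    """Stores each call and its parameters, then forwards it unchanged."""

    def __init__(self, impl: Any) -> None:
        self._impl = impl
        self.trace: List[TraceRecord] = []

    def __getattr__(self, name: str) -> Callable[..., Any]:
        # Only reached for names not found on the tracer itself,
        # i.e. the API entry points of the wrapped implementation.
        target = getattr(self._impl, name)

        def wrapper(*args: Any) -> Any:
            self.trace.append((name, args))  # store call and parameters
            return target(*args)             # render as usual

        return wrapper


# The "application" is used as is; it only sees the usual API.
gl = Tracer(FakeGL())
gl.glClear(0x4000)           # GL_COLOR_BUFFER_BIT
gl.glDrawArrays(4, 0, 36)    # GL_TRIANGLES, 36 vertices
```

Because the wrapper forwards every call, the application needs no changes and no source code, which is the property the slide emphasizes.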
Analyzing OpenGL ES trace • OpenGL ES trace player • Replay the graphics in controlled env. • OpenGL ES analyzer • Extract content features • Content features • Condensed representation • Off-line analysis
[Diagram: Call trace → Trace player → Analyzer → Content features]
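As a rough illustration of the analyzer step, the sketch below condenses a call trace into per-frame features. The trace layout (a list of `(name, args)` tuples), the feature names, and the use of `eglSwapBuffers` as the frame boundary are assumptions made for this example; they are not the feature set the authors' analyzer extracts.

```python
from typing import Any, Dict, List, Tuple

GL_TRIANGLES = 0x0004
GL_TRIANGLE_STRIP = 0x0005
GL_TRIANGLE_FAN = 0x0006


def triangles_in(mode: int, count: int) -> int:
    """Number of triangles produced by a draw call with `count` vertices."""
    if mode == GL_TRIANGLES:
        return count // 3
    if mode in (GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN):
        return max(count - 2, 0)
    return 0  # points and lines produce no triangles


def new_frame() -> Dict[str, int]:
    return {"draw_calls": 0, "vertices": 0, "triangles": 0}


def analyze(trace: List[Tuple[str, Tuple[Any, ...]]]) -> List[Dict[str, int]]:
    """Condense a call trace into one small feature record per frame."""
    frames: List[Dict[str, int]] = []
    current = new_frame()
    for name, args in trace:
        if name == "glDrawArrays":
            mode, _first, count = args
            current["draw_calls"] += 1
            current["vertices"] += count
            current["triangles"] += triangles_in(mode, count)
        elif name == "eglSwapBuffers":  # assumed frame boundary
            frames.append(current)
            current = new_frame()
    return frames


example_trace = [
    ("glDrawArrays", (GL_TRIANGLES, 0, 36)),
    ("glDrawArrays", (GL_TRIANGLE_STRIP, 0, 4)),
    ("eglSwapBuffers", ()),
    ("glDrawArrays", (GL_TRIANGLE_FAN, 0, 5)),
    ("eglSwapBuffers", ()),
]
features = analyze(example_trace)
```

Since the analysis works on a stored trace, it can run off-line and be repeated with new feature definitions without re-running the application.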
Content features: an example

FNUM      TXF  TXBW  TXIBW  TXCBW  TEXA  TEXAM  TEXC  TEXCM  TRIR  TRIBFC  TRIAA    TRIRP  PRIC  VERIN  TRIIN   TPP   ODE     VSS
  10  1539632  4.55  0.049  1.366    13   0.71   259  11.68  3797   35.55  404.7   990330  1489   7672   4694  3.15  3.22  184128
  11  1405320  4.09  0.055  1.226    33   1.11   259  11.68  4332   36.91  452.0  1235221  1571   8131   4989  3.18  4.02  195112
  12  1462472  4.32  0.078  1.296    49   1.38   259  11.68  5744   34.89  324.4  1213106  2136  10986   6714  3.14  3.95  263632
  13  1646844  4.99  0.065  1.497    32   1.15   259  11.68  4704   33.72  375.5  1170905  1604   8362   5154  3.21  3.81  200688
  14  1419244  4.23  0.036  1.269     5   0.50   259  11.68  1886   22.75  566.8   825884   855   4260   2550  2.98  2.69  102240
  15  1477088  4.40  0.051  1.320    18   0.80   259  11.68  1868   23.93  642.8   913369   690   3456   2076  3.01  2.97   82944
  16  1513716  4.57  0.066  1.370    19   0.85   259  11.68  1872   25.75  625.0   868731   701   3532   2130  3.04  2.83   84768
  17  1540548  4.63  0.054  1.389    30   0.99   259  11.68  2611   29.41  466.7   860060   917   4706   2872  3.13  2.80  112944
  18  1620824  4.90  0.085  1.471    38   1.18   259  11.68  3894   33.23  349.6   909023  1296   6742   4150  3.20  2.96  161808
  19  1598616  4.85  0.051  1.454    18   0.85   259  11.68  2685   29.09  504.8   961196  1054   5382   3274  3.11  3.13  129168
  20  1495388  4.45  0.058  1.336    19   0.85   259  11.68  3142   31.80  605.3  1297206  1054   5430   3322  3.15  4.22  130288
  21  1712896  5.25  0.038  1.576    17   0.82   259  11.68  1269   27.03  900.8   834102   522   2644   1600  3.07  2.72   63456
  22  1502388  4.44  0.049  1.333    21   0.92   259  11.68  4422   31.21  362.2  1101661  1508   7757   4741  3.14  3.59  186168
  23  1482496  4.41  0.051  1.322    27   0.87   259  11.68  4953   36.87  278.2   870045  1677   8691   5337  3.18  2.83  208584
  24  1501376  4.47  0.048  1.341    35   0.82   259  11.68  4897   37.06  294.9   908773  1586   8403   5231  3.30  2.96  201672
  25  1580848  4.70  0.066  1.409    30   1.15   259  11.68  4608   26.63  400.2  1353148  1556   7899   4787  3.08  4.40  189544
  26  1552052  4.62  0.062  1.386    37   1.09   259  11.68  4883   27.48  274.5   972134  1651   8675   5373  3.25  3.16  208200
  27  2337220  6.96  0.048  2.088    25   0.68   259  11.68  5677   31.37  536.6  2090740  1780   9524   5964  3.35  6.81  228544

Example data from OpenGL analyzer
Synthetic content tool • OpenGL ES application • Win32, WinCE/PocketPC, Symbian • Benchmark • OpenGL ES frame, drawn repeatedly • Benchmark suite • Collection of benchmarks • Extensible framework • Diverse benchmark actions • Action composition • Support for animation
[Diagram: Content features → Synthetic Content Tool → Benchmark suite]
Benchmarks: old SCT

["Quake1 frame 11": FILLRATE]
Surface : WINDOW
FrameBufferFormat : 16/5/6/5/0/-
PrimitiveType : TRIANGLES
TriangleSize : 8*8
TriangleCount : 1008
Overdraw : 3
VertexType : FIXED
ColorType : OFF
TexCoordType : FIXED
InterleavedArrays : LOOSE
ShadeModel : FLAT
Blending : OFF
DepthTest : ON
AlphaTest : OFF
ColorMask : 0xF
DepthMask : ON
LogicOp : OFF
Fog : OFF
PerspectiveCorrectionHint : FASTEST
Transformation : PERSPECTIVE
Texture0 : ON
Texture0Count : 4
Texture0Size : 128*128
Texture0Type : RGB565
Texture0MinFilter : GL_LINEAR
Texture0MagFilter : GL_LINEAR
Texture0EnvMode : GL_REPLACE
Texture0Rotate : 0
Texture0Scale : 1
Texture1 : OFF

["Lighting": TRANSFORMATION]
Surface : WINDOW
FrameBufferFormat : 16/5/6/5/0/-
TriangleSize : 2*1
TriangleCount : 8192
Overdraw : 4
InterleavedArrays : LOOSE
ColorType : OFF
LightCount : 0, 1, 2
TransPerTriangle : 1.0
SharingDistance : 0
VertexType : FIXED
Transformation : COMPLEX
Fog : OFF
Blending : OFF
BackfacingTrianglePercent : 100
Normalization : OFF

Monolithic benchmarks with lots of parameters, no reuse
Benchmark: new SCT

["DispSetup":SETUP@DISPLAY]
Surface : PBUFFER
…

["ClearSetup":CLEAR_SETUP@OPENGL]
ClearColor : {0,0,0,1}
…

["ClearScreen":CLEAR@OPENGL]
ClearColor : ON
ClearDepth : ON

["EndFrame":READ_PIXELS@OPENGL]
Dummy : 1 # ignore

["Camera":CAMERA@OPENGL]
Projection : ORTHO
…

["FlatSetup":SETUP@OPENGL]
ShadeModel : GL_FLAT
…

["Mem":SINGLE@MEMORY]
Operation : FILLZ
…

["Cpu":DHRYSTONE@CPU]
Iterations : 1000

["Mesh":PLANE_MESH@OPENGL]
Iterations : 1
PrimitiveType : GL_TRIANGLE_STRIP
TriangleWidth : 176
TriangleHeight : 208
TriangleCount : 2
VertexType : GL_FIXED
ColorType : GL_UNSIGNED_BYTE
TexCoordType : GL_FIXED
TexCoordUnits : 2
NormalType : GL_FIXED
InterleavedArrays : OFF
UseVBO : OFF

["Max. screen clear rate benchmark":BENCHMARK]
InitActions : DispSetup+ClearSetup+Camera
BenchmarkActions : ClearScreen+EndFrame

["Marketing benchmark":BENCHMARK]
InitActions : DispSetup+FlatSetup+Camera
BenchmarkActions : Mesh+EndFrame

["Composite benchmark":BENCHMARK]
InitActions : DispSetup+FlatSetup+Camera
BenchmarkActions : Mem+Cpu+Mesh+EndFrame

Reusable benchmark actions, support for composition, …
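To make the composition idea concrete, here is a minimal sketch of how such a definition file could be read and a benchmark expanded into its action list. The grammar is inferred from the slide alone (section headers of the form `["Name":TYPE@MODULE]`, `Key : Value` lines, `#` comments, `+` as the composition operator); `parse_sct` and `expand` are hypothetical helpers, not part of the actual tool.

```python
import re
from typing import Dict, List

# Header such as ["ClearScreen":CLEAR@OPENGL] or ["Some benchmark":BENCHMARK].
HEADER = re.compile(
    r'^\["(?P<name>[^"]+)"\s*:\s*(?P<kind>\w+)(?:@(?P<module>\w+))?\]$'
)


def parse_sct(text: str) -> Dict[str, dict]:
    """Parse sections into {name: {"kind", "module", "params"}}."""
    sections: Dict[str, dict] = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            current = {
                "kind": match["kind"],
                "module": match["module"],  # None for BENCHMARK sections
                "params": {},
            }
            sections[match["name"]] = current
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            current["params"][key.strip()] = value.strip()
    return sections


def expand(sections: Dict[str, dict], benchmark: str) -> Dict[str, List[str]]:
    """Split the '+'-composed action lists of one benchmark section."""
    params = sections[benchmark]["params"]
    return {
        phase: params[phase].split("+")
        for phase in ("InitActions", "BenchmarkActions")
    }


SAMPLE = '''
["ClearScreen":CLEAR@OPENGL]
ClearColor : ON
ClearDepth : ON

["EndFrame":READ_PIXELS@OPENGL]
Dummy : 1 # ignore

["Max. screen clear rate benchmark":BENCHMARK]
InitActions : DispSetup+ClearSetup+Camera
BenchmarkActions : ClearScreen+EndFrame
'''

sections = parse_sct(SAMPLE)
plan = expand(sections, "Max. screen clear rate benchmark")
```

With actions kept as named sections, a new benchmark is just another short list of action names, which is the reuse the old monolithic format lacked.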
Comparing real and synthetic content • Rendering time per frame • Trace vs. synthetic content • Different platforms • Compare • Different platforms • Modify SCT to improve the match • New parameters • Improved actions
[Diagram: Call trace → Trace player → Rendering time on OGLES1 and OGLES2; Content features → Synthetic content → Rendering time on OGLES1 and OGLES2]
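One simple way to quantify how well the synthetic content matches the trace is a mean relative error over per-frame rendering times, computed separately for each platform. The slides do not say which metric the authors use, so the metric, the function name, and the numbers below are all assumptions for illustration.

```python
from typing import Sequence


def mean_relative_error(real_ms: Sequence[float],
                        synthetic_ms: Sequence[float]) -> float:
    """Mean of |synthetic - real| / real over corresponding frames."""
    if len(real_ms) != len(synthetic_ms) or not real_ms:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [abs(s - r) / r for r, s in zip(real_ms, synthetic_ms)]
    return sum(errors) / len(errors)


# Hypothetical per-frame rendering times (milliseconds) on one platform.
trace_times = [50.0, 40.0]
synthetic_times = [45.0, 44.0]
error = mean_relative_error(trace_times, synthetic_times)  # about 0.10
```

Running the same comparison on a second platform shows whether a tuned match generalizes or was only fitted to one implementation.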
Example • Early OpenGL SCT proto: trace from Quake 2 demo 2, 600 MHz P3, 256 MB RAM (SW OpenGL)
Example • Early OpenGL SCT proto: trace from Quake 2 demo 2, 3 GHz Xeon, 2 GB RAM, nVidia QuadroFX 500 • SCT proto was VERY primitive (single triangle rendered repeatedly, etc.)
Preliminary results • Real content vs. synthetic content • Real OpenGL ES content analyzed by hand, very rough content features • ~20 FPS real vs. 24 FPS synthetic on mobile 3D hardware • Understanding 3D performance • Understanding content • Creating “realistic” synthetic content
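To put the reported frame rates in perspective, the short calculation below converts them to frame times and a relative gap. It treats the approximate "~20 FPS" figure as exactly 20 for the arithmetic, which is an assumption.

```python
def frame_time_ms(fps: float) -> float:
    """Average time per frame in milliseconds for a given frame rate."""
    return 1000.0 / fps


real_fps = 20.0        # "~20 FPS" on the slide, taken as exactly 20 here
synthetic_fps = 24.0

real_ms = frame_time_ms(real_fps)            # 50 ms per frame
synthetic_ms = frame_time_ms(synthetic_fps)  # about 41.7 ms per frame

# The synthetic content runs about 20% faster in FPS terms,
# i.e. its frames take roughly 17% less time than the real content's.
fps_gap = (synthetic_fps - real_fps) / real_fps
time_gap = (real_ms - synthetic_ms) / real_ms
```

A gap of this size from hand-derived, very rough content features is what motivates the automated tracer and analyzer listed under future work.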
Future work • OpenGL ES tracer, trace player, analyzer • OpenGL ES content analysis • Creating accurate synthetic content • Mapping content features to SCT input • Designing good benchmark actions, action compositions • Different types of workloads • CPU, memory, audio playback, game physics engine
Summary • Outline of the synthetic content approach • Preliminary results indicate that it should be possible to match synthetic content performance to the real content: work-in-progress
Thank you! Questions?