
Functional Verification from a Manager's Perspective: When is 'Good Enough',  Really Good Enough?

Presentation Transcript


  1. Functional Verification from a Manager's Perspective: When is 'Good Enough' Really Good Enough? Ira Chayut, Verification Architect (opinions are my own and do not necessarily represent the opinion of my employer)

  2. Topics • Achieving Balance • Functional Verification Overview • Optimizing Functional Verification • Living with Functional (and Architectural) Errors • Conclusions • Questions

  3. Achieving Balance “Complete” Verification Early Tapeout

  4. Achieving Balance “Complete” Verification Early Tapeout • Critical bugs missed • Respin(s) • Missed Market Window

  5. Achieving Balance “Complete” Verification • Increased resources used • Delayed tapeout • Missed Market Window Early Tapeout

  6. Achieving Balance “Complete” Verification • Missed Market Window Early Tapeout • Missed Market Window

  7. Achieving Balance “Complete” Verification Early Tapeout Balance

  8. Achieving Balance “Complete” Verification Early Tapeout Balance • Diversify to manage risk • Employ multiple verification techniques • The whole is greater than the sum of the parts • Plan to succeed, even in the presence of errors

  9. Functional Verification Overview

  10. What is Functional Verification? • “Functional Verification” is the task of checking that the design implements the specified architecture

  11. A Functional Verification Example Reference models and Devices Under Test (DUTs) can be C models, RTL models, FPGA prototypes, emulation implementations, or silicon.
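The checking scheme in slide 11 can be sketched in a few lines of Python. This is a toy illustration, not the speaker's actual flow: `ref_adder` and `dut_adder` are hypothetical stand-ins for the reference model and the Device Under Test (in practice, a C model and an RTL simulation, FPGA, emulator, or silicon).

```python
# Toy sketch of reference-model checking: drive the same stimulus into a
# reference model and a DUT, then compare their outputs vector by vector.
MASK32 = 0xFFFFFFFF

def ref_adder(a, b):
    """Golden reference: ideal 32-bit wrap-around addition."""
    return (a + b) & MASK32

def dut_adder(a, b):
    """Stand-in for the design under test; a real flow would call into
    an RTL simulator, FPGA prototype, emulator, or silicon here."""
    return (a + b) & MASK32

def check(stimulus):
    """Apply each input vector to both models; collect any mismatches."""
    mismatches = []
    for a, b in stimulus:
        expected = ref_adder(a, b)
        actual = dut_adder(a, b)
        if expected != actual:
            mismatches.append((a, b, expected, actual))
    return mismatches

vectors = [(0, 0), (1, 2), (MASK32, 1), (0x80000000, 0x80000000)]
assert check(vectors) == []   # empty list: DUT agrees with the reference
```

The same comparison loop applies unchanged whichever pair of implementations plays the "reference" and "DUT" roles.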

  12. Good Enough for What? • During different phases of an ASIC development, the meaning of “good enough” changes; the RTL can be good enough to: • Start running simple directed tests • Start running all written directed tests • Synthesize for size estimates • Synthesize for FPGA prototype • Take to emulation • Tape out (the focus of this talk) • Ship silicon to customers

  13. Why Do We Do Functional Verification • The cost of extra spins is enormous (the lost opportunity costs can dwarf the cost of new masks, etc.) • Even one critical bug can cause a respin • We wish to greatly improve the odds that the manufactured silicon is good enough to ship

  15. Common Types of Functional Verif • Test stimulus applied to inputs obtained via: • Manually generated • Generated by a directed test program (either open-loop or closed-loop) • Pseudo-Random generator • Captured from a live system • Output captured and checked against reference design (usually a C model, or earlier silicon) • Assertions (in both RTL and C models) • Formal and Semi-formal techniques • Real applications in a testbench that mimics a full system
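Of the stimulus sources slide 15 lists, pseudo-random generation deserves a note on reproducibility: seeding the generator lets a failing run be replayed exactly for debug. A minimal sketch (the function name and vector shape are illustrative, not from the talk):

```python
import random

def pseudo_random_stimulus(seed, n):
    """Generate n reproducible pseudo-random 32-bit input pairs.
    A fixed seed means a failure found overnight can be replayed
    exactly on an engineer's desk the next morning."""
    rng = random.Random(seed)   # private generator; global state untouched
    return [(rng.getrandbits(32), rng.getrandbits(32)) for _ in range(n)]

# Same seed -> identical stimulus; different seed -> different stimulus.
assert pseudo_random_stimulus(42, 5) == pseudo_random_stimulus(42, 5)
assert pseudo_random_stimulus(42, 5) != pseudo_random_stimulus(43, 5)
```

Logging the seed with every regression run is what makes pseudo-random testing practical alongside directed tests.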

  16. Cost of Incomplete Verification • Extra spins • Lost time-to-market • Scrapped inventory • Good will of customers who were promised delivery dates • Company reputation • Impact on next projects, as engineers are brought back to fix problems found post-silicon

  17. Why Don’t We Just Do More Functional Verification? Is it possible to have “too much of a good thing”?

  18. Why Don’t We Just Do More Functional Verification? Is it possible to have “too much of a good thing”?

  19. Why Don’t We Just Do More Functional Verification? Is it possible to have “too much of a good thing”? Analogy, courtesy of my colleague, David Wyland

  20. Costs of Thorough Verification • Time-to-market • Time edge over competition • If it were (technically and financially) possible to completely test a complex ASIC, we would probably miss the market window • Staffing • Computer time • Software licenses • Emulation time • Opportunity costs

  21. Optimizing Functional Verification

  22. Functional Verification Efforts Must Be Optimized • We need to find ways of maximizing the Functional Verification we can afford to do • No single technique is a total answer; multiple techniques combined will yield the best approach • Thorough unit-level verification testing and loose coupling of units

  23. Loose Coupling • Reduces verification time • Small simulations run faster • Avoids combinatorial explosion of interactions • Well defined interfaces between blocks with assertions and formal verification techniques to reduce inter-block problems

  24. Functional Verification Metrics

  25. Functional Verification Metrics • Exhaustive Testing • Error Injection • New Bug Rate • Number/Size of Check-Ins • Number of Assertions Firing • Line Coverage • Expression Coverage • State Machine Coverage • Coverage Assertions

  26. Exhaustive Testing • 100% coverage • All possible inputs presented when the Device Under Test (DUT) is in each of its possible states • Consider a two-input AND gate • No internal state • Four vectors fully test it • Consider a two-input 32-bit adder • No internal state • 2⁶⁴ vectors needed to fully test it • At 10 billion tests per second, the test would take 58 years
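The slide's arithmetic checks out; worked through explicitly:

```python
# Two 32-bit inputs with no internal state: 2**32 * 2**32 = 2**64 vectors.
vectors = 2 ** 64
rate = 10_000_000_000             # 10 billion tests per second
seconds = vectors / rate          # total run time in seconds
years = seconds / (365 * 24 * 3600)
assert round(years) == 58         # ~58 years, as the slide states
```

Add a single bit of internal state, or a third input, and the run time doubles again; this is the combinatorial explosion that rules out exhaustive testing.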

  27. Exhaustive Testing Not Practical

  28. Error Injection • It is theoretically possible to inject random errors into the design code and see what percentage are caught by the regression test suite • For the size of today’s designs, this is impractical due to the time it takes to run a regression test suite, even with expensive emulation platforms, and the number of runs that are needed to get statistically meaningful results • The errors that are not caught are difficult to analyze to determine if new tests are needed, and difficult to determine what those tests are
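The idea in slide 28 can be shown in miniature. This hypothetical sketch (the functions and the two-vector "suite" are invented for illustration) injects one operator error into a copy of the design and asks whether a weak regression suite notices:

```python
def design(a, b):
    return a + b              # "golden" design

def mutated_design(a, b):
    return a - b              # injected error: '+' replaced by '-'

def regression(dut):
    """A deliberately weak suite: it never exercises a nonzero b,
    so subtraction and addition are indistinguishable on it."""
    tests = {(0, 0): 0, (5, 0): 5}
    return all(dut(a, b) == exp for (a, b), exp in tests.items())

assert regression(design)           # golden design passes, as expected
assert regression(mutated_design)   # the mutant ESCAPES: a coverage hole
```

An escaped mutant points at a missing test (here, any vector with b != 0); the slide's point is that at full-chip scale, finding and interpreting such escapes is impractically expensive.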

  29. Error Injection Not Practical

  30. New Bug Rate • Common metric used as a gate to tape-out • Easy to measure and graph

  31. New Bug Rate, Page 2 • Doesn’t distinguish severity of bug (but can)

  32. New Bug Rate, Page 3 • “Past performance is no guarantee of future results” • Doesn’t predict number of bugs not yet seen

  33. New Bug Rate, Page 4 No major new bugs for at least one week after 100% of the test plan and randoms are running

  34. Number/Size of Check-Ins • Easy to measure and graph • Doesn’t distinguish scope of check-in or the severity of the problem being addressed (but can) • Doesn’t accurately predict the size of check-ins that will be needed to address future problems (especially missing functionality that is found late in the design process)

  35. Number/Size of Check-Ins None; only required fixes allowed

  36. Number of Assertions Firing • If the RTL design code is fully instrumented with functional assertions, then the fewer assertions that fire, the better • BUT, the functional assertions will only fire when an error condition is seen – they do NOT provide any measure of test coverage
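Slide 36's caveat is easy to demonstrate. In this toy sketch (a hypothetical FIFO, not from the talk), embedded assertions guard overflow and underflow, yet a test that never stresses the FIFO raises no assertion at all; zero firings is not evidence of coverage:

```python
class Fifo:
    """Tiny FIFO with embedded functional assertions."""
    def __init__(self, depth):
        self.depth = depth
        self.items = []

    def push(self, x):
        # Fires only on the error condition, never on healthy traffic.
        assert len(self.items) < self.depth, "overflow: push on full FIFO"
        self.items.append(x)

    def pop(self):
        assert self.items, "underflow: pop on empty FIFO"
        return self.items.pop(0)

# A run that barely touches the FIFO triggers no assertion, yet it has
# exercised almost nothing (no full condition, no wrap, no back-pressure).
f = Fifo(depth=2)
f.push(1)
f.push(2)
assert f.pop() == 1
```

This is why assertion counts must be paired with a coverage metric: assertions tell you when tests hit an error, not whether tests hit anything.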

  37. Number of Assertions Firing None

  38. Line or Block Coverage • Using third-party or built-in coverage tools, monitor a full regression to find which lines (or blocks) of design code are run at least once • Slight, though measurable, impact on run-time • Does NOT tell us: • Which values of registers were exercised • How much of each “short-circuit and/or” (&&/||) or ?: line is exercised • Which combinations of conditions and DUT state are exercised • Anything about missing lines of code • Visiting a line is necessary, but not sufficient, for complete testing
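The short-circuit blind spot in slide 38 fits in a few lines. In this illustrative Python snippet (names invented for the example), a single covered line hides an entirely unexercised condition:

```python
calls = []

def b():
    """Right-hand condition; records that it was actually evaluated."""
    calls.append("b")
    return True

def guarded(a):
    return a or b()      # one line, two distinct conditions

assert guarded(True)     # line executed -> 100% line coverage ...
assert calls == []       # ... yet b() was never evaluated (short-circuit)
```

Line coverage reports this function as fully covered after the first call; only expression (or branch) coverage reveals that the `b()` leg has never run.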

  39. Line or Block Coverage > 98% of reachable code

  40. Expression Coverage • Overcomes the limitations of line (or block) coverage w.r.t. how “short-circuit and/or” (&&/||) or ?: lines are exercised • Significant impact on simulation run-time • Does NOT tell us: • Which values of registers were exercised • Which combinations of conditions and DUT state are exercised • Anything about missing expressions or code

  41. Expression Coverage > 95% of reachable code

  42. State Machine Coverage • Can measure if all state machine states and state machine state transitions have been exercised • For most designs, can be labor-intensive, as illegal (or legal) state machine states and state transitions have to be declared • Does NOT tell us: • Which states, transitions, or entire state machines are missing
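The bookkeeping behind slide 42 can be sketched directly: declare the legal states and transitions up front (the labor-intensive part the slide mentions), then record what a test trace actually visits. The three-state machine here is a made-up example:

```python
# Declared-legal sets: this declaration effort is what the slide calls
# labor-intensive for real designs.
LEGAL_STATES = {"IDLE", "BUSY", "DONE"}
LEGAL_TRANSITIONS = {("IDLE", "BUSY"), ("BUSY", "DONE"), ("DONE", "IDLE")}

def run_trace(trace):
    """Extract visited states and transitions from a recorded trace."""
    seen_states = set(trace)
    seen_transitions = set(zip(trace, trace[1:]))  # consecutive pairs
    return seen_states, seen_transitions

states, transitions = run_trace(["IDLE", "BUSY", "DONE", "IDLE"])
assert states == LEGAL_STATES               # every declared state visited
assert transitions == LEGAL_TRANSITIONS     # every declared arc visited
assert not transitions - LEGAL_TRANSITIONS  # and no illegal arcs observed
```

Note the slide's caveat still holds: this tells us nothing about states, transitions, or whole machines that should exist but were never declared.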

  43. State Machine Coverage 100% of functional code

  44. Coverage Assertions • Also known as “Functional Coverage” • Designers and/or verification engineers can declare “interesting” events to be covered by test cases • Events can be simple: • All inputs between 0 and 10 must be seen • or complex: • All combinations of the FSM1 and FSM2 state machine states must be seen • Quality of metric depends upon the size of the effort to declare interesting events and the quality of those declarations
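A functional-coverage point of the kind slide 44 describes reduces to counting hits in declared bins. This Python sketch mimics, loosely, what a SystemVerilog covergroup does; the `CoverPoint` class and its API are invented for illustration:

```python
class CoverPoint:
    """Count how often each declared 'interesting' value is observed."""
    def __init__(self, bins):
        self.bins = {b: 0 for b in bins}

    def sample(self, value):
        if value in self.bins:
            self.bins[value] += 1

    def percent_covered(self):
        hit = sum(1 for count in self.bins.values() if count > 0)
        return 100.0 * hit / len(self.bins)

# The slide's simple event: "all inputs between 0 and 10 must be seen".
cp = CoverPoint(bins=range(0, 11))
for v in [0, 1, 2, 3, 4, 5]:
    cp.sample(v)
assert round(cp.percent_covered()) == 55   # 6 of 11 bins hit so far
```

The slide's caveat maps directly onto this code: the metric is only as good as the bins someone bothered to declare.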

  45. Coverage Assertions 100%

  46. Functional vs. Architectural Errors • To the end-user – no difference • Functional Verif. is not trying to find architectural errors, but the user doesn’t care which end of the boat is sinking – only that their feet are starting to get wet

  47. Living with Functional (and Architectural) Errors

  48. Living With Errors • Successful companies have learned how to ship chips with functional and architectural errors – time-to-market pressures and chip complexity force the delivery of chips that are not perfect (even if perfection were possible). How can this be done better? • For a long while, DRAMs have been made with extra components to allow a less-than-perfect chip to provide full device function and to ship • How to do the same with architectural redundancy? How can a less-than-perfect architecture or hardware implementation provide full device function?

  49. Architectural Redundancy • A programmable abstraction layer between the real hardware and user’s API can hide functional warts • Upper-layer protocols can recover from some functional or architectural errors; though there can be performance penalties when this is used • Soft hardware can allow chip redesign after silicon is frozen (and shipped!)

  50. Make the Silicon Testable • Provide access to interior blocks for live testing • Inclusion of on-chip logic analyzer capability
