
HW/SW Co-design: System Partitioning in HW/SW Co-Design

Bastian Knerr, June 6th, 2008. HW/SW Co-design: System Partitioning in HW/SW Co-Design. Christian Doppler Laboratory for Design Methodology of Signal Processing Algorithms.


Presentation Transcript


  1. Bastian Knerr, June 6th, 2008 • HW/SW Co-design: System Partitioning in HW/SW Co-Design • Christian Doppler Laboratory for Design Methodology of Signal Processing Algorithms

  2. Outline • HW/SW Codesign for Embedded Systems • System Partitioning • Heterogeneous Platforms • Mapping Graphs to Platforms • Heuristic Optimisation Methods for Multiple Objectives • Summary

  3. Embedded System Design • An embedded system is a computing device built for a specific purpose. Its implementation is predominantly determined by that purpose and usually entails complete encapsulation in the environment it serves. • Examples: automotive, phones/PDAs, transceivers (WiFi, WLAN, xDSL, ...)

  4. Embedded System Design Flow

  5. Outline • Embedded System Design • System Partitioning • Heterogeneous Platforms • Mapping Graphs to Platforms • Heuristic Optimisation for Multiple Objectives • Summary

  6. Heterogeneous Platforms • Classical HW/SW codesign platform • Has been around for ~20 years • Served well to get a first grip on partitioning • Has not gained any relevance for industrial design flows

  7. Heterogeneous Platforms • Modern rapid prototyping platforms • Prototyping board for real-time MIMO OFDM • DSP + microcontroller • FPGAs • Busses and bridges • RAM and registers • Interfaces

  8. Heterogeneous Platforms • Modern SoC/embedded platforms • UMTS baseband transceiver chip (2003) • DSP + microcontroller • ASICs • Busses and bridges • RAM and registers • Interfaces

  9. Heterogeneous Platforms • Library for • DSPs: cache/RAM, schedules • FPGAs: RAM/flash, slices/gates • ASICs: registers/gates • Channels: FIFO/direct/bus • Memory: schedules, parallel read/write access

  10. Outline • Embedded System Design • System Partitioning • Heterogeneous Platforms • Mapping Graphs to Platforms • Heuristic Optimisation for Multiple Objectives • Summary

  11. Mapping Graphs to Platforms • System graphs

  12. Mapping Graphs to Platforms

  13. Mapping Graphs to Platforms • NP-hard multi-objective optimisation problem • Proven to be NP-complete by restriction to the classical graph partitioning problem

  14. Outline • Embedded System Design • System Partitioning • Heterogeneous Platforms • Mapping Graphs to Platforms • Heuristic Optimisation for Multiple Objectives • Summary

  15. Heuristic Optimisation • Multi-objective optimisation problem • A mapping of a problem instance I is called valid iff [condition not preserved in the transcript], stated in terms of the objective functions and the constraints. • The mapping relation [symbol not preserved] assigns a vertex i to the j-th implementation alternative A on resource r. • Objective functions: • Area for 'HW' in gates/slices/NAND2 equivalents [formula not preserved], with separate terms for ASICs and for FPGAs • Code size for 'SW' in bytes [formula not preserved], with a term for code size on DSPs • ...

  16. Heuristic Optimisation • Objective function f_T: system delay (makespan) • Multi-core scheduling is NP-hard as well
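
To make the makespan objective concrete, here is a minimal Python sketch that evaluates the system delay of one fixed mapping. It is not the scheduler used in the talk: it assumes a greedy list schedule in topological order, one task at a time per resource, and no communication delay. The names `makespan`, `exec_time`, `edges`, and `mapping` are hypothetical.

```python
from collections import defaultdict


def makespan(exec_time, edges, mapping):
    """System delay of a task graph under a fixed task-to-resource mapping.

    exec_time: {task: duration}
    edges:     [(pred, succ)] precedence constraints (must form a DAG)
    mapping:   {task: resource}
    Assumption: greedy list schedule, no communication cost.
    """
    preds = defaultdict(list)
    succs = defaultdict(list)
    indeg = {t: 0 for t in exec_time}
    for u, v in edges:
        preds[v].append(u)
        succs[u].append(v)
        indeg[v] += 1

    ready = sorted(t for t in exec_time if indeg[t] == 0)
    finish = {}
    free_at = defaultdict(int)  # resource -> time at which it becomes idle
    while ready:
        t = ready.pop(0)
        # A task starts once all predecessors are done and its resource is free.
        start = max([finish[p] for p in preds[t]] + [free_at[mapping[t]]])
        finish[t] = start + exec_time[t]
        free_at[mapping[t]] = finish[t]
        for s in succs[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    return max(finish.values())


exec_time = {"a": 2, "b": 3, "c": 3, "d": 1}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(makespan(exec_time, edges, {t: "dsp" for t in exec_time}))
print(makespan(exec_time, edges, {"a": "dsp", "b": "dsp", "c": "fpga", "d": "dsp"}))
```

Moving task `c` to a second resource lets it overlap with `b`, which is the kind of structural effect the time objective captures. Finding the best schedule, rather than evaluating a greedy one, is the NP-hard part.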

  17. Heuristic Optimisation • Definition: A heuristic is a robust technique for the design of (randomised) algorithms for optimisation problems. It provides (randomised) algorithms for which one cannot guarantee both the efficiency and the quality of the computed feasible solutions at the same time, not even with any bounded constant probability P > 0.

  18. Heuristic Optimisation • Partitioning is not analytically solvable • Use heuristic methods • Simulated Annealing • Tabu Search • Kernighan-Lin min-cut • Genetic Algorithm • Particle Swarm • Custom Heuristics (GCLP, RRES, etc.) • ...

  19. Heuristic Optimisation • Classical Kernighan-Lin min-cut • Modifications • More than two partitions • Unbalanced partitions allowed • Multiple objectives • Omit change list • ...

  20. Summary • Scheduling/Partitioning is a hard optimisation problem • Heuristic methods have to be applied • Highly dependent on platform model and high level estimation techniques • Many questions yet unsolved • Execution time profiles for processes (control flow) • Estimation uncertainties • Automated platform composition • ...

  21. Thank you for your attention

  22. Typical Graphs • Industry design for xDSL transceiver

  23. Graph Properties • Degree of parallelism: γ = |V_CP| / |V| • Density: ρ = |E| / |V| • Rank-locality: r_loc = (1 / |E|) · Σ_{e ∈ E} (rank(v_head) − rank(v_tail))
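
The three graph properties can be computed directly from a vertex and edge list. The sketch below follows the formulas as written on the slide. Two things are assumptions, because the slide does not define them: rank(v) is taken as the longest-path depth from any source, and |V_CP| as the number of vertices on the longest (critical) path. The function name `graph_properties` is hypothetical.

```python
def graph_properties(vertices, edges):
    """Return (gamma, rho, r_loc) for a DAG.

    edges are (tail, head) pairs, i.e. the arrow points from tail to head.
    Assumptions: rank = longest-path depth from a source (sources have rank 0),
    |V_CP| = number of vertices on the longest path.
    """
    preds = {v: [] for v in vertices}
    for tail, head in edges:
        preds[head].append(tail)

    rank = {}

    def rank_of(v):
        # Memoised longest-path depth; fine for small example graphs.
        if v not in rank:
            rank[v] = 1 + max((rank_of(p) for p in preds[v]), default=-1)
        return rank[v]

    for v in vertices:
        rank_of(v)

    n_cp = max(rank.values()) + 1           # vertices on the longest path
    gamma = n_cp / len(vertices)            # degree of parallelism (slide formula)
    rho = len(edges) / len(vertices)        # density
    r_loc = sum(rank[h] - rank[t] for t, h in edges) / len(edges)
    return gamma, rho, r_loc


vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("a", "d")]
print(graph_properties(vertices, edges))
```

The extra edge `a -> d` spans two ranks, so it raises the rank-locality value above 1; a graph whose edges only connect neighbouring ranks has r_loc = 1.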

  24. Restricted Range Exhaustive Search • Create task graph • Create ordered vector of processes • Create initial mapping • Start exhaustive search on subset of processes (window) • Move window along the vector • Finally map process that leaves the window • Strong performance for typical graphs • Degree of parallelism • Density • Locality
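
The window procedure listed above can be sketched in a few lines of Python. This is an illustration of the steps on the slide, not the published RRES implementation: the cost function, the choice of initial mapping (first alternative for every process), and the names `rres`, `order`, `alternatives`, and `cost` are all assumptions.

```python
from itertools import product
from math import inf


def rres(order, alternatives, cost, window=3):
    """Restricted range exhaustive search over a sliding window.

    order:        processes in the order in which the window visits them
    alternatives: implementation alternatives available to every process
    cost:         function {process: alternative} -> number (lower is better)
    """
    # Initial mapping (assumption: everything starts on the first alternative).
    mapping = {p: alternatives[0] for p in order}
    for start in range(len(order)):
        win = order[start:start + window]
        best, best_cost = None, inf
        # Exhaustive search restricted to the processes inside the window.
        for combo in product(alternatives, repeat=len(win)):
            trial = dict(mapping)
            trial.update(zip(win, combo))
            c = cost(trial)
            if c < best_cost:
                best, best_cost = combo, c
        mapping.update(zip(win, best))
        # order[start] now leaves the window, so its assignment is final.
    return mapping


# Toy, separable cost table (hypothetical numbers).
table = {"a": {"sw": 3, "hw": 1}, "b": {"sw": 1, "hw": 4},
         "c": {"sw": 5, "hw": 2}, "d": {"sw": 2, "hw": 6}}
toy_cost = lambda m: sum(table[p][m[p]] for p in m)
print(rres(["a", "b", "c", "d"], ("sw", "hw"), toy_cost, window=2))
```

Each step evaluates |alternatives|^window candidates, so the window length trades search effort against solution quality.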

  25. Results • [Plot labels: Averaged Cost, Averaged Validity, Normalised Relative Cost, κ = f(parallelism, locality), Window Length]

  26. The Genome Coding • Arrange vertices on a string • String elements (alleles) indicate implementation alternative • What about the order of the vertices? Does it matter?

  27. Recombination with chromosomes • 1-point crossover • Multi-point crossover • Uniform crossover • Why does it work? • Fundamental schema theorem and the building block hypothesis • Schema theorem • Short, low-order, above-average schemata (building blocks) proliferate • Below-average schemata die off • What makes schemata fit in system partitioning?
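
The three crossover operators named above can be sketched as follows. The chromosome is assumed to be a list with one allele per vertex, where each allele names an implementation alternative (as on the genome coding slide). The function names and the explicit `rng` argument are illustrative choices, not taken from the talk.

```python
import random


def one_point(p1, p2, rng):
    """Child takes p1 up to a random cut point and p2 after it."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]


def multi_point(p1, p2, k, rng):
    """Child alternates between parents at k distinct random cut points."""
    cuts = sorted(rng.sample(range(1, len(p1)), k))
    parents = (p1, p2)
    child, src, prev = [], 0, 0
    for cut in cuts + [len(p1)]:
        child += parents[src][prev:cut]
        src = 1 - src          # switch to the other parent
        prev = cut
    return child


def uniform(p1, p2, rng):
    """Each gene is taken from either parent with probability 1/2."""
    return [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]


rng = random.Random(0)
mum = ["SW"] * 8
dad = ["HW"] * 8
print(one_point(mum, dad, rng))
print(multi_point(mum, dad, 3, rng))
print(uniform(mum, dad, rng))
```

One-point crossover keeps long contiguous runs intact, while uniform crossover breaks them up. This is why the ordering of vertices on the chromosome matters for which building blocks survive.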

  28. Combinatorial vs. structural fitness • Combinatorial (area, code size, time) • Low resource consumption is ensured for any single vertex • The combination of assignments utilises resources optimally • Structural (time) • Exact graph matching between task and architecture subgraphs • Parallel execution of processes and data transfers • Structural fitness requires a representation in the chromosome • Building blocks are short, low-order, and fit schemata

  29. Coding for structural exploitation • Locality-preserving chromosome coding • Adjacent vertices in the task graph shall be adjacent in the chromosome • Use two schedules • As soon as possible (ASAP) • As late as possible (ALAP) • Arrange vertices v_i in order of increasing average start time: st_avg(v_i) = (st_asap(v_i) + st_alap(v_i)) / 2
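
A locality-preserving vertex order of this kind can be sketched in Python. The code assumes that ALAP start times are computed against a deadline equal to the critical path length and that communication delays are ignored; neither detail is stated on the slide. It ranks vertices by the mean of the ASAP and ALAP start times. The function name `locality_order` is hypothetical.

```python
def locality_order(duration, edges):
    """Order vertices by the mean of their ASAP and ALAP start times.

    duration: {vertex: execution time}
    edges:    [(pred, succ)] precedence constraints (must form a DAG)
    Returns (ordered vertex list, {vertex: mean start time}).
    """
    preds = {v: [] for v in duration}
    succs = {v: [] for v in duration}
    for u, v in edges:
        preds[v].append(u)
        succs[u].append(v)

    asap, alap = {}, {}

    def st_asap(v):
        # Earliest start: all predecessors must have finished.
        if v not in asap:
            asap[v] = max((st_asap(p) + duration[p] for p in preds[v]), default=0)
        return asap[v]

    # Assumption: the ALAP deadline is the critical path length.
    deadline = max(st_asap(v) + duration[v] for v in duration)

    def st_alap(v):
        # Latest start that still lets every successor start on time.
        if v not in alap:
            alap[v] = min((st_alap(s) for s in succs[v]), default=deadline) - duration[v]
        return alap[v]

    avg = {v: (st_asap(v) + st_alap(v)) / 2 for v in duration}
    order = sorted(duration, key=lambda v: (avg[v], v))  # ties broken by name
    return order, avg


duration = {"a": 2, "b": 3, "c": 1, "d": 1}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(locality_order(duration, edges))
```

Vertices on the critical path have identical ASAP and ALAP times, while vertices with slack land between their neighbours, so vertices that are adjacent in the graph tend to end up close together on the chromosome.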

  30. Results • Impact of genome coding • [Plot labels: new, rank, random; axis: Cost]

  31. More results • Structural mutation • 1-gene mutation (M1g) • Swap mutation (Msw) • Multi-swap mutation (Mbb)
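
For reference, the first two mutation operators can be sketched as below, again assuming a list-of-alleles chromosome. The slide gives no definition of the multi-swap mutation (Mbb), so it is deliberately left out rather than guessed. Function names and the `rng` argument are illustrative.

```python
import random


def one_gene_mutation(chrom, alternatives, rng):
    """M1g: reassign one randomly chosen vertex to a different alternative."""
    child = list(chrom)
    i = rng.randrange(len(child))
    child[i] = rng.choice([a for a in alternatives if a != child[i]])
    return child


def swap_mutation(chrom, rng):
    """Msw: exchange the alleles of two randomly chosen positions."""
    child = list(chrom)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


rng = random.Random(0)
chrom = ["SW", "SW", "HW", "DSP", "HW", "SW"]
print(one_gene_mutation(chrom, ("SW", "HW", "DSP"), rng))
print(swap_mutation(chrom, rng))
```

A swap keeps the number of vertices per implementation alternative constant, whereas a one-gene mutation changes it. That difference matters when area or code size constraints are tight.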

  32. More results • Comparison with other heuristics • Penalty reward tabu search (pwTS) • Simulated annealing (SA) • Global criticality/local phase (GCLP) • [Plot labels: Averaged Cost Ω, Averaged Validity Ψ]

  33. Conclusion • A 3-operator GA has been implemented and analysed • Structural problem components (time) have been exposed • Genome coding: locality-preserving ordering • Mutation: multi-swap mutation • Crossover: depends heavily on building block size • Comparison with heuristics from the literature showed superior performance of the GA over pwTS • In contrast to published work

  34. Results • Related to crossover recombination • Uniform • 10-point • 5-point • 1-point • [Plot labels: new, random]

  35. More results • Selection over mutation probability • Binary tournament (BT) • Survival of the fittest (SOTF) • Roulette wheel (RW)
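
The three selection schemes compared here can be sketched as follows for a cost that is to be minimised. Two points are assumptions: "survival of the fittest" is interpreted as truncation selection (keep the k cheapest individuals), and the roulette wheel uses weights 1 / (1 + cost), which requires non-negative costs. The slides do not specify either detail, and the function names are hypothetical.

```python
import random


def binary_tournament(population, cost, rng):
    """BT: draw two distinct individuals, keep the cheaper one."""
    a, b = rng.sample(population, 2)
    return a if cost(a) <= cost(b) else b


def roulette_wheel(population, cost, rng):
    """RW: draw one individual with probability proportional to 1 / (1 + cost)."""
    weights = [1.0 / (1.0 + cost(ind)) for ind in population]
    return rng.choices(population, weights=weights, k=1)[0]


def survival_of_the_fittest(population, cost, k):
    """SOTF (assumed to mean truncation): keep the k cheapest individuals."""
    return sorted(population, key=cost)[:k]


rng = random.Random(0)
population = [["HW", "HW"], ["SW", "HW"], ["SW", "SW"]]
hw_count = lambda chrom: chrom.count("HW")   # toy cost: number of HW vertices
print(binary_tournament(population, hw_count, rng))
print(roulette_wheel(population, hw_count, rng))
print(survival_of_the_fittest(population, hw_count, 2))
```

Tournament selection depends only on the ranking of costs, while the roulette wheel depends on their magnitudes, so the two react differently when cost values are badly scaled.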
