OOO vs. EPIC Yingmin Li Ting Yan Qi Zhao
Outline • “Advantages” of EPIC • Critique • Conclusion
EPIC: Main Idea • “Smart compiler, dumb machine” • Finding parallelism • Processor → compiler • Software/hardware synergy • Processor design • Avoid complexity and difficulty • ILP, SMT & CMP
EPIC: Predication • In OOO: dynamic branch prediction. • Larger basic blocks. • Control dep. → data dep. • Eliminate mispredictions & penalties.
EPIC: Speculation • OOO: dynamic hardware • Data speculation & control speculation • Bigger window • Reduce impact of memory latencies
EPIC: Large Register Set • OOO: register renaming. • Easier to design than register renaming. • “Real” registers benefit some applications. • Encryption algorithms, numerical algorithms. • Avoid loss of invisible registers. • Interruptions in OOO.
EPIC: Unique Features • Register Stack Engine (RSE). • To deal with call/return costs. • Appears to be an unlimited stack of physical registers. • Rotating register file. • Software pipelining. • Multiple loops at the same time.
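The Register Stack Engine bullet can be illustrated with a toy model. This is a minimal sketch under simplifying assumptions: the class `RegisterStack` and its fields are hypothetical names, and it spills whole frames, whereas a real RSE moves individual registers to a memory backing store.

```python
class RegisterStack:
    """Toy register stack: frames live in registers until space runs out."""

    def __init__(self, physical_regs):
        self.capacity = physical_regs
        self.frames = []        # frame sizes currently held in registers
        self.spilled = []       # frame sizes moved to the memory backing store
        self.spill_traffic = 0  # registers written to memory so far
        self.fill_traffic = 0   # registers read back from memory so far

    def call(self, frame_size):
        # Spill the oldest frames until the callee's frame fits.
        while self.frames and sum(self.frames) + frame_size > self.capacity:
            oldest = self.frames.pop(0)
            self.spilled.append(oldest)
            self.spill_traffic += oldest
        self.frames.append(frame_size)

    def ret(self):
        self.frames.pop()
        # If the caller's frame was spilled earlier, fill it back in.
        if not self.frames and self.spilled:
            restored = self.spilled.pop()
            self.frames.append(restored)
            self.fill_traffic += restored
```

Shallow call chains cost no memory traffic in this model; only when the call depth exceeds the physical file does the "unlimited stack" illusion have to be paid for with spills and fills.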
Function Call • Register saving/restoring • Processor? • Compiler? • Register file • Expensive • Always idle
Predication • Computation of the branch condition is on the critical path • Increases ICache footprint • Only half of the functional units do useful work if both “then” and “else” paths are scheduled • Hard to implement out-of-order execution with full predication
Predication To compute if (a) x = t+1:
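As a rough illustration of if-conversion for this statement (a sketch only, not a reproduction of the original slide's figure), the branch can be replaced by a predicate that selects the result. The function names `branchy` and `predicated` are hypothetical, and the arithmetic select assumes integer operands.

```python
def branchy(a, x, t):
    """Original form: a conditional branch guards the assignment."""
    if a:
        x = t + 1
    return x


def predicated(a, x, t):
    """If-converted form: the add always executes, the predicate picks the result."""
    p = int(bool(a))              # compare: produces the predicate (0 or 1)
    tmp = t + 1                   # executes whether or not p is set
    return p * tmp + (1 - p) * x  # (p) x = tmp, otherwise x is left unchanged
```

The final select cannot complete until `p` is known, which is the sense in which the compare sits on the critical path.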
Control Speculation • Why not just use prefetch, which cannot raise an unexpected exception? • Techniques that exploit control speculation, such as superblock formation, increase code length
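The code-length cost can be seen in a small model of a speculative load with a deferred exception. This is a sketch under assumptions: `NAT`, `speculative_load`, and `load_then_use` are hypothetical names, with `NAT` standing in for IA-64's "Not a Thing" token and the reload playing the role of recovery code.

```python
NAT = object()  # stand-in for a deferred-exception token


def speculative_load(memory, addr):
    """Speculative load: never raises, returns NAT for a bad address."""
    return memory.get(addr, NAT)


def load_then_use(memory, addr, cond):
    # The load has been hoisted above the branch on `cond`.
    value = speculative_load(memory, addr)
    if not cond:
        return None           # result unused, so a bad address never faults
    if value is NAT:          # check: speculation failed, run recovery code
        value = memory[addr]  # non-speculative reload, may raise KeyError
    return value + 1
```

Unlike a prefetch, the hoisted load delivers a value early, but it needs the extra check and recovery path, which is where the additional code comes from.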
Data Speculation • Moving a load above a possibly conflicting store • An advanced load and a checking load (IA64) • A run-time predictor
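The advanced load and checking load pair can be sketched with a toy address table. This is a simplified model, not IA-64's actual ALAT: the class `ALAT`, its methods, and the helper `run` are hypothetical names chosen for illustration.

```python
class ALAT:
    """Toy advanced-load address table: remembers addresses of early loads."""

    def __init__(self):
        self.entries = {}  # register name -> address it was loaded from

    def advanced_load(self, mem, reg, addr):
        self.entries[reg] = addr  # load early and record the address
        return mem[addr]

    def store(self, mem, addr, value):
        mem[addr] = value
        # A store to a tracked address invalidates the matching entries.
        self.entries = {r: a for r, a in self.entries.items() if a != addr}

    def check_load(self, mem, reg, addr, value):
        # Keep the early value if the entry survived, otherwise reload.
        if self.entries.get(reg) == addr:
            return value
        return mem[addr]


def run(store_addr):
    mem = {0: 10, 4: 20}
    alat = ALAT()
    r1 = alat.advanced_load(mem, "r1", 4)  # load moved above the store
    alat.store(mem, store_addr, 99)        # possibly conflicting store
    return alat.check_load(mem, "r1", 4, r1)
```

With `run(0)` the store does not alias and the early value is kept; with `run(4)` the entry is invalidated and the checking load re-reads memory.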
Software Pipelining • For high performance technical computing • High trip-count loops • For commercial applications • Low trip-count loops
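The trip-count sensitivity can be seen in a small simulation of a three-stage software pipeline over a rotating register file. This is a sketch under assumptions: `pipelined_scale` is a hypothetical name, the loop body `b[i] = a[i] * 2 + 1` is an invented example, and the `None` checks stand in for the stage predicates that real hardware would use for the prologue and epilogue.

```python
from collections import deque


def pipelined_scale(a):
    """Compute b[i] = a[i] * 2 + 1 with load, multiply, and add stages overlapped."""
    n = len(a)
    regs = deque([None] * 3)  # rotating registers r0..r2
    out = []
    for it in range(n + 2):   # n kernel iterations plus 2 to drain the pipeline
        # Stage 3: finish the element loaded two iterations ago.
        if regs[2] is not None:
            out.append(regs[2] + 1)
        # Stage 2: multiply the element loaded one iteration ago.
        if regs[1] is not None:
            regs[1] = regs[1] * 2
        # Stage 1: load a new element, or nothing once the input is exhausted.
        regs[0] = a[it] if it < n else None
        # Rotate: the value written as r0 is seen as r1 in the next iteration.
        regs.rotate(1)
    return out
```

A loop with `n` elements takes `n + 2` kernel iterations here, so the fill and drain overhead is negligible for high trip counts but dominates for the short loops typical of commercial code.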
EPIC: at least not a breakthrough • Design objective of EPIC: • Move hardware complexity to the compiler
EPIC: at least not a breakthrough • The failure of EPIC: • The compilation techniques used for EPIC apply almost equally well to OOO • The hardware simplification is not large enough to offset EPIC’s overhead • Without dynamic information, the compiler essentially cannot do some things well enough
The tragedy of cycle time • Why no obvious improvement in cycle time? • Mechanisms like the RSE increase die complexity • Compare and dependent branch in one cycle • Predicated execution depends on the existence of many functional units
Dynamic path length: hey, IA64, you wasted too much here • Speculation • Half of the predicated instructions discarded • Restricted bundling • One base register • No sign-extended loads • No integer multiply or divide on general registers
CPI • No dynamic prediction • Larger code (more GRs, predicate registers, template bits, restricted bundling, recovery code) burdens instruction fetch • Recovery code may cause ICache pollution or even a page fault
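The last three slides (cycle time, dynamic path length, CPI) line up with the three factors of the standard processor performance equation. The slides do not state this formula; it is added here as an assumed framing, with symbol names chosen for illustration.

```latex
T_{\text{exec}} = N_{\text{instr}} \times \text{CPI} \times t_{\text{cycle}}
```

Here $N_{\text{instr}}$ is the dynamic path length, $\text{CPI}$ the average cycles per instruction, and $t_{\text{cycle}}$ the cycle time. The critique amounts to arguing that EPIC does not clearly shrink any of the three factors.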