Fault-Tolerant Network-Interface for Spatial Division Multiplexing Based Network-on-Chip
By Anup Das
Content
• NoC Overview
• TDM-Based
• SDM-Based
• Existing NI Architecture
• New Area Optimized Architecture
• Need for Fault-Tolerance
• Fault-Tolerant NI Architectures
• Centralized Approach
• Distributed Approach
• Results
• Conclusion
Network-on-Chip
• Increasing number of IPs/PEs per die
• Communication bottleneck with shared bus
• Need for a scalable alternative
• Use of networking concepts
• NoC proposed by Benini et al.
[Figure: NoC diagram showing switches, network interfaces (NI), and IPs]
Network-on-Chip (contd.)
• Two techniques for communication
  • Time Division Multiplexing
  • Spatial Division Multiplexing
[Figure: two panels, "TDM-based NoC" and "SDM-based NoC", each showing switches, NIs, and IPs, with connections labeled A, B, and C]
Network Interface Architecture
• n-to-1 bit serializers, one for each outgoing wire
• A data distributor sends data from an output queue to one of the serializers
• Each distributor can send data to any of the serializers
• Not all the distributors are loaded all the time
• A single distributor can therefore serve all the serializers
[Figure: Network Interface Architecture. Shows the PE, Queue 1 to 3, Distributor 1 to 3, and n-to-1 serializers driving out[0] to out[7] toward the switch; data paths are 32 bits wide.]
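The n-to-1 serializers can be illustrated with a minimal Python sketch. It is not taken from the slides: the function names `serialize` and `deserialize` are hypothetical, and sending the least significant bit first is an assumption, since the slides do not state a bit order.

```python
def serialize(word, width=32):
    """Yield the bits of `word` one at a time (one bit per cycle on one wire).

    Assumption: least significant bit first; the slides give no bit order.
    """
    for i in range(width):
        yield (word >> i) & 1


def deserialize(bits):
    """Rebuild a word from bits produced by serialize()."""
    word = 0
    for i, bit in enumerate(bits):
        word |= bit << i
    return word


bits = list(serialize(0b110, width=3))  # [0, 1, 1]
restored = deserialize(serialize(0xDEADBEEF))
```

With 8 outgoing wires, 8 such serializers would run side by side, each draining 32-bit words handed to it by a distributor.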
New Area Optimized NI
• Single distributor for all the serializers
• New component, the “requester”, added to interface with the queues
• Two IDs introduced: serializer ID (sID) and queue ID (qID)
• At connection setup time, each serializer is assigned to a queue
• A serializer requests data; the request is forwarded to the corresponding queue
• Data from the queue travels back to the requesting serializer
[Figure: New Area Optimized NI. Shows the PE, Queue 1 to 3, a single Distributor and Requester, and 32-to-1 serializers driving out[0] to out[7] toward the switch; data paths are 32 bits wide.]
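The request path of the area-optimized NI (serializer asks, requester fetches by qID, data returns by sID) can be sketched behaviorally in Python. This is only an illustration under assumptions: the class `AreaOptimizedNI` and its methods are hypothetical names, and the hardware handshake and timing are not modeled.

```python
from collections import deque


class AreaOptimizedNI:
    """Behavioral sketch: one distributor and one requester shared by all serializers."""

    def __init__(self, num_queues=3):
        self.queues = [deque() for _ in range(num_queues)]
        self.assignment = {}  # sID -> qID, filled at connection setup time

    def setup_connection(self, qid, sids):
        """Assign each serializer in `sids` to queue `qid`."""
        for sid in sids:
            self.assignment[sid] = qid

    def request(self, sid):
        """Serializer `sid` asks for a word; returns (sID, word) or None."""
        qid = self.assignment.get(sid)
        if qid is None or not self.queues[qid]:
            return None  # unassigned serializer or empty queue
        word = self.queues[qid].popleft()  # requester fetches from queue qID
        return sid, word  # distributor routes the word back using sID


ni = AreaOptimizedNI()
ni.setup_connection(qid=0, sids=[0, 1])
ni.queues[0].extend([0xA, 0xB])
first = ni.request(1)   # (1, 0xA)
second = ni.request(0)  # (0, 0xB)
```

A request from a serializer that was never assigned to a queue simply returns `None` in this sketch; what the real NI does in that case is not stated in the slides.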
Need for Fault-Tolerance
• Transistor density on the rise
• Shrinking feature size
• Increasing number of faults manifesting post fabrication
• Yield loss
• Need for fault-tolerance
  • IP/PE level
  • Interconnect level
• Idea is to provide graceful degradation of performance in the event of faults
NI Fault-Tolerance - Centralized
• Controller introduced between distributor and IP queues
• Changes data mapping dynamically, with load balancing, when a fault occurs
[Figure: centralized fault-tolerant NI. Shows the PE, a Controller, Queue 1 to 3, Distributor 1 to 3, and n-to-1 serializers driving out[0] to out[7] toward the switch; data paths are 32 bits wide.]
Centralized NI Operation
[Figure: three snapshots of the centralized NI, each showing serializers S1 to S8, distributors D1 to D3, the Controller, and Queue 1 to 3]
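The controller's dynamic remapping can be illustrated with a small Python sketch. The slides do not give the remapping policy, so the one used here (move each affected queue to the currently least-loaded healthy distributor) is an assumption, as are the names `Controller`, `load`, and `mark_faulty`.

```python
class Controller:
    """Sketch of a central controller that maps queues onto distributors."""

    def __init__(self, num_distributors=3, num_queues=3):
        self.healthy = set(range(num_distributors))
        # Initial mapping: spread the queues round-robin over the distributors.
        self.queue_to_dist = {q: q % num_distributors for q in range(num_queues)}

    def load(self, dist):
        """Number of queues currently mapped to distributor `dist`."""
        return sum(1 for d in self.queue_to_dist.values() if d == dist)

    def mark_faulty(self, dist):
        """Remove a faulty distributor and rebalance its queues."""
        self.healthy.discard(dist)
        if not self.healthy:
            raise RuntimeError("no healthy distributor left")
        for q in sorted(self.queue_to_dist):
            if self.queue_to_dist[q] == dist:
                # Assumed policy: pick the least-loaded healthy distributor.
                target = min(sorted(self.healthy), key=self.load)
                self.queue_to_dist[q] = target


ctrl = Controller(num_distributors=3, num_queues=6)
ctrl.mark_faulty(0)  # queues 0 and 3 move to distributors 1 and 2
```

The sketch also makes the single point of failure visible: all remapping state lives in one `Controller` object, so a fault in the controller itself leaves nothing to recover with.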
NI Fault-Tolerance - Distributed
• Multiple distributors and requesters, each capable of fault recovery
• Two more IDs added: dID (distributor ID) and rID (requester ID)
• When forwarding a request to a requester, the distributor forwards dID, sID and qID
  • qID: used by the requester to forward the request to a queue
  • dID: used by the requester to send data from the queue back to the requesting distributor
  • sID: used by the distributor to send data to the requesting serializer
Distributed NI Operation
[Figure: three snapshots of the distributed NI, each showing serializers S1 to S8, distributors D1 and D2, requesters R1 and R2, and Queue 1 to 3]
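The role of the three IDs carried with a request can be traced in a short Python sketch. It is a behavioral illustration only: `Request`, `Response`, `Requester`, `Distributor`, and `fetch` are hypothetical names, and choosing which requester serves a request (the `rid` argument) is left to the caller because the slides do not describe that selection.

```python
from collections import deque, namedtuple

Request = namedtuple("Request", "dID sID qID")
Response = namedtuple("Response", "dID sID data")


class Requester:
    def __init__(self, rid, queues):
        self.rid = rid
        self.queues = queues  # all requesters can reach all queues

    def handle(self, req):
        data = self.queues[req.qID].popleft()    # qID selects the queue
        return Response(req.dID, req.sID, data)  # dID says where to send it back


class Distributor:
    def __init__(self, did):
        self.did = did
        self.delivered = {}  # sID -> words handed to that serializer

    def make_request(self, sid, qid):
        return Request(self.did, sid, qid)

    def deliver(self, resp):
        # sID selects the serializer that originally asked for the data.
        self.delivered.setdefault(resp.sID, []).append(resp.data)


def fetch(distributors, requesters, did, rid, sid, qid):
    """One round trip: distributor -> requester -> queue -> back again."""
    req = distributors[did].make_request(sid, qid)
    resp = requesters[rid].handle(req)
    distributors[resp.dID].deliver(resp)
    return resp.data


queues = [deque([10, 11]), deque([20])]
dists = {1: Distributor(1), 2: Distributor(2)}
reqs = {1: Requester(1, queues), 2: Requester(2, queues)}
a = fetch(dists, reqs, did=1, rid=1, sid=3, qid=0)
b = fetch(dists, reqs, did=2, rid=2, sid=5, qid=0)  # e.g. if R1 were faulty
```

Because every request carries its own dID and sID, either requester can serve either distributor in this sketch, which is one way to read the "no single fault" property claimed for the distributed design.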
Experimental Setup
• NoC considered with 8 links per node
• Data packets of size 32 bits
• Centralized design coded in VHDL
• Distributed design coded in Verilog
• Synopsys Design Compiler for ASIC synthesis
• UMC 65nm standard cells
• Area and power numbers from the synthesis tool
• Area numbers converted to gate count for comparison across technologies
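The slides do not say how area was converted to gate count. A common convention, assumed here, is to divide the synthesized cell area by the area of a reference 2-input NAND gate in the same library. The function name and the numbers below are placeholders, not values from the UMC 65nm library or from the reported results.

```python
def gate_count(cell_area_um2, nand2_area_um2):
    """Convert synthesized area to NAND2-equivalent gates (assumed convention)."""
    if nand2_area_um2 <= 0:
        raise ValueError("reference gate area must be positive")
    return cell_area_um2 / nand2_area_um2


# Placeholder numbers for illustration only.
equivalent_gates = gate_count(cell_area_um2=15000.0, nand2_area_um2=1.5)
```

Normalizing by a reference gate removes most of the dependence on feature size, which is what allows comparison with designs synthesized in other technologies.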
Area Breakup
[Figure: area breakup charts for the Centralized Design and the Distributed Design]
Summary
• Distributed design is more area and power efficient, but the centralized design becomes more efficient with more distributors
• A single fault in the controller of the centralized design will render it useless
• No single fault will affect distributed NI behavior
• Next steps
  • Increase granularity of load balancing
  • Fault-tolerance of the serializer