Outline

Designing Efficient Matrix Transposition on Various Interconnection Networks Using Tensor Product Formulation
Presented by Chin-Yi Tsai

Outline
• Introduction
• Tensor Product Notation
• Matrix Transposition
• Designing Matrix Transposition on Various Interconnection Networks
• Conclusions and Future Work

Introduction
• Matrix transposition is a simple but important computational problem.
• A matrix is a two-dimensional data structure that is stored in one-dimensional computer memory.
• A simple double-loop transposition program performs poorly on modern computer architectures with a memory hierarchy.
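The double-loop program mentioned above can be sketched as follows. This is a minimal illustration, not the presenter's code; the function name `transpose_naive` and the flat-list representation are assumptions made for the example.

```python
def transpose_naive(a, m, n):
    """Transpose an m x n matrix stored in row-major order in the flat list a.

    Returns the n x m transpose, also as a row-major flat list.
    """
    if len(a) != m * n:
        raise ValueError("a must hold exactly m * n elements")
    t = [None] * (m * n)
    for i in range(m):
        for j in range(n):
            # Reads walk through a with stride 1, but writes jump through t
            # with stride m. On real hardware this strided access pattern is
            # what interacts badly with caches.
            t[j * m + i] = a[i * n + j]
    return t


if __name__ == "__main__":
    print(transpose_naive([0, 1, 2, 3, 4, 5], 2, 3))
```

The strided writes are the reason such a loop tends to make poor use of a memory hierarchy once the matrix no longer fits in cache.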

Introduction (cont’d)
• We develop matrix transposition algorithms on various interconnection networks, including omega, baseline, and hypercube networks.
• The tensor product has been used successfully for designing block recursive algorithms, such as the FFT, Strassen’s matrix multiplication, the parallel prefix algorithm, the Hilbert space-filling curve, and Karatsuba’s multiplication.
• Tensor product formulas are also suitable for specifying interconnection networks.

Introduction (cont’d)
• Different interconnection networks have their own architectural characteristics and properties.
• Distributed-memory algorithms and VLSI circuit design.
• A major goal of this study is to provide an effective way to design VLSI circuits for DSP algorithms.

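Tensor Product Notation
• Let $A$ and $B$ be two matrices of size $m \times n$ and $p \times q$, respectively. The tensor product $A \otimes B$ is the $mp \times nq$ block matrix $[a_{ij} B]$.
• Stride permutation: $L^{mn}_{n}$ moves the element at index $in + j$ to index $jm + i$, for $0 \le i < m$ and $0 \le j < n$.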
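The two operators on this slide can be sketched in a few lines of Python. The function names `kron`, `tensor_vec`, and `stride_permutation` are chosen for this illustration, and the symbols m and n are assumptions standing in for the sizes lost from the slide. The stride permutation is written in its usual form, which gathers the elements of a vector of length mn at stride n.

```python
def kron(a, b):
    """Tensor (Kronecker) product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def tensor_vec(x, y):
    """Tensor product of two vectors given as flat lists."""
    return [x_i * y_j for x_i in x for y_j in y]


def stride_permutation(x, n):
    """Apply L^{mn}_n to a flat list x of length mn.

    The output lists x[0], x[n], x[2n], ..., then x[1], x[n+1], ..., and so on.
    """
    if n <= 0 or len(x) % n != 0:
        raise ValueError("len(x) must be a positive multiple of n")
    return [x[i] for start in range(n) for i in range(start, len(x), n)]


if __name__ == "__main__":
    print(kron([[1, 2], [3, 4]], [[0, 1], [1, 0]]))
    print(stride_permutation([0, 1, 2, 3, 4, 5], 3))
```

A useful check is the commutation property: for a vector x of length m and a vector y of length n, applying the stride permutation with stride n to the tensor product of x and y yields the tensor product of y and x.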

Matrix Transposition
• Matrix transposition can be viewed as changing the elements from row-major order to column-major order.
• An $m \times n$ matrix $A$ is stored in computer memory; the index of element $a_{i,j}$ ($0 \le i < m$, $0 \le j < n$) is:
• Row-major order: $i \cdot n + j$
• Column-major order: $j \cdot m + i$
• Various matrix transposition algorithms can be designed by manipulating stride permutations.
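As a small worked example of the row-major to column-major view, consider a 2 × 3 matrix. The notation below ($x_{\mathrm{row}}$, $x_{\mathrm{col}}$, and the stride permutation $L^{6}_{3}$ that gathers elements at stride 3) is assumed for this illustration.

```latex
A = \begin{pmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \end{pmatrix},
\qquad
x_{\mathrm{row}} = (a_{00},\, a_{01},\, a_{02},\, a_{10},\, a_{11},\, a_{12})^{T}

L^{6}_{3}\, x_{\mathrm{row}}
  = (a_{00},\, a_{10},\, a_{01},\, a_{11},\, a_{02},\, a_{12})^{T}
  = x_{\mathrm{col}}
```

The vector $x_{\mathrm{col}}$ is the column-major layout of $A$ and, equivalently, the row-major layout of $A^{T}$, so a single stride permutation performs the transposition.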

Matrix Transposition (cont’d)
Step 1: blocks with qs elements in each block
Step 2: transpose the matrix in each of the pr blocks
Step 3: transpose a block matrix in which each block has qs elements
Step 4: convert the block-structured order, blocks with qs elements in each block, to the row-major order of the transposed matrix
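The four steps can be sketched in Python under some stated assumptions, since the slide's formulas did not survive: the input is taken to be a matrix with pq rows and rs columns in row-major order, partitioned into a p × r grid of q × s blocks. All function names are hypothetical and chosen for this example.

```python
def to_blocks(a, p, q, r, s):
    """Step 1: row-major (pq x rs) matrix -> p*r blocks of q*s elements each."""
    cols = r * s
    return [
        [a[(bi * q + i) * cols + (bj * s + j)] for i in range(q) for j in range(s)]
        for bi in range(p)
        for bj in range(r)
    ]


def transpose_block(block, q, s):
    """Step 2 helper: transpose one q x s block into an s x q block."""
    return [block[i * s + j] for j in range(s) for i in range(q)]


def transpose_block_grid(blocks, p, r):
    """Step 3: transpose the p x r grid, moving whole blocks as units."""
    return [blocks[bi * r + bj] for bj in range(r) for bi in range(p)]


def from_blocks(blocks, grid_rows, grid_cols, h, w):
    """Step 4: grid of h x w blocks -> row-major (grid_rows*h) x (grid_cols*w)."""
    cols = grid_cols * w
    out = [None] * (grid_rows * h * cols)
    for bi in range(grid_rows):
        for bj in range(grid_cols):
            block = blocks[bi * grid_cols + bj]
            for i in range(h):
                for j in range(w):
                    out[(bi * h + i) * cols + (bj * w + j)] = block[i * w + j]
    return out


def transpose_four_step(a, p, q, r, s):
    """Transpose a (pq x rs) row-major matrix into (rs x pq) row-major."""
    if len(a) != p * q * r * s:
        raise ValueError("a must hold exactly p*q*r*s elements")
    blocks = to_blocks(a, p, q, r, s)                     # Step 1
    blocks = [transpose_block(b, q, s) for b in blocks]   # Step 2
    blocks = transpose_block_grid(blocks, p, r)           # Step 3
    return from_blocks(blocks, r, p, s, q)                # Step 4


if __name__ == "__main__":
    print(transpose_four_step(list(range(16)), 2, 2, 2, 2))
```

Steps 2 and 3 each touch only one level of the block structure, which is what makes this decomposition attractive when blocks are sized to fit a cache level or a processor's local memory.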

Designing Matrix Transposition on Various Interconnection Networks
• We consider two kinds of networks:
• multistage interconnection networks,
• direct interconnection networks.
• The basic component of a multistage interconnection network is a switching element.
• A direct interconnection network is a set of processors connected by a set of links.
[Figure: switching element with inputs x0, x1 and outputs y0, y1]

Designing Matrix Transposition on Various Interconnection Networks
• Suppose that $N = 2^n$.
• Omega network
• Baseline network
• Hypercube network
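Each stage of an omega network wires its inputs to a column of switching elements through a perfect shuffle, which rotates the n-bit index of each element left by one bit. The sketch below models only that wiring, under the assumption that every switch passes its inputs straight through; the function names are hypothetical.

```python
def rotate_left(i, n):
    """Cyclically rotate the n-bit index i left by one bit."""
    mask = (1 << n) - 1
    return ((i << 1) & mask) | (i >> (n - 1))


def perfect_shuffle(x):
    """Move the element at index i to index rotate_left(i, n), len(x) == 2**n."""
    size = len(x)
    n = size.bit_length() - 1
    if size < 2 or size != 1 << n:
        raise ValueError("len(x) must be a power of two, at least 2")
    y = [None] * size
    for i, value in enumerate(x):
        y[rotate_left(i, n)] = value
    return y


if __name__ == "__main__":
    data = list(range(16))
    once = perfect_shuffle(data)
    twice = perfect_shuffle(once)
    print(once)
    print(twice)
```

For N = 16, two shuffles rotate the 4-bit index by two positions, which swaps the 2-bit row and column fields. The result of the second shuffle is therefore the row-major layout of the transposed 4 × 4 matrix.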


Derivation of Algorithm on Omega Interconnection Network

[Figure: Omega Interconnection Network, positions of 16 data elements at each stage]

Derivation of Algorithm on Baseline Interconnection Network
• Bit-reversal operation
• Partial bit-reversal operation
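To make the two operations concrete, here is a small sketch of bit reversal on index bits. The slide does not define "partial bit-reversal", so the reading used here, reversing the bits inside each half of the index separately, is an assumption, as are all function names. It is not necessarily the derivation used in the talk.

```python
def reverse_bits(i, width):
    """Reverse the lowest `width` bits of i."""
    r = 0
    for _ in range(width):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def reverse_within_halves(i, n):
    """Assumed 'partial' reversal: reverse each half of an n-bit index (n even)."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    half = n // 2
    low = reverse_bits(i & ((1 << half) - 1), half)
    high = reverse_bits(i >> half, half)
    return (high << half) | low


def transposed_index(i, n):
    """Full reversal followed by reversal within halves swaps the two halves."""
    return reverse_within_halves(reverse_bits(i, n), n)


if __name__ == "__main__":
    print([transposed_index(i, 4) for i in range(16)])
```

With n = 4, the composition swaps the 2-bit row field and the 2-bit column field of each index, which is exactly the index map of a 4 × 4 transposition.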

[Figure: Baseline Interconnection Network, positions of 16 data elements at each stage]

[Figure: Hypercube Interconnection Network]

Derivation of Algorithm on Hypercube Interconnection Network
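In a hypercube with N = 2^n processors, two processors are linked exactly when their n-bit labels differ in one bit. The sketch below lists a node's neighbors and builds a simple dimension-order route between two labels. Placing one matrix element per node and routing in ascending dimension order are assumptions made for illustration, and the function names are hypothetical.

```python
def hypercube_neighbors(node, n):
    """Labels of the n nodes directly linked to `node` in an n-dimensional hypercube."""
    return [node ^ (1 << d) for d in range(n)]


def dimension_order_route(src, dst, n):
    """Path from src to dst that corrects differing bits from dimension 0 upward."""
    path = [src]
    cur = src
    for d in range(n):
        if (cur ^ dst) & (1 << d):
            cur ^= 1 << d  # cross the link along dimension d
            path.append(cur)
    return path


if __name__ == "__main__":
    print(hypercube_neighbors(0, 4))
    # In a 4 x 4 transposition, the element at index 1 must end up at index 4.
    print(dimension_order_route(1, 4, 4))
```

The number of hops on such a route equals the Hamming distance between the source and destination labels, so an element on the diagonal of the matrix does not move at all.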

[Figure: Hypercube Interconnection Network (cont’d)]

Conclusions and Future Work
• We use the tensor product as the framework for designing matrix transposition algorithms on various interconnection networks.
• We manipulate stride permutation operations so that they fit the target networks.
• VLSI circuit design for DSP and image processing algorithms on various interconnection networks.
