Computer Architecture Research at HPCL (www.ele.uri.edu/hpcl)

BUCS: A Bottom-Up Caching Structure for Storage Servers
Ming Zhang and Dr. Ken Qing Yang, HPCL, Dept. of ECE, URI

Storage Volume
• Data storage plays an essential role in today's fast-growing, data-intensive network services.
• Online data storage doubles every 9 months.
• How much data is there?
  1. Read (text): 100 KB/hr, 25 GB/lifetime per person
  2. Hear (speech @ 10 KB/s): 40 MB/hr, 10 TB/lifetime per person
  3. See (TV @ 0.5 MB/s): 2 GB/hr, 500 TB/lifetime per person

Storage Cost in an IT Dept.
[Figure: storage cost as a proportion of total IT spending, compared with server cost (source: Brocade).]

Storage Speed: A Server-to-Storage Bottleneck
Current storage servers:
[Figure: data path of a current storage server (CPUs, host/system RAM, and an HBA disk controller and NIC, each with its own cache); both reads and writes cross the system bus, which sits in the critical path.]

Motivations
• The data bus is becoming a bottleneck:
  - A 1 Gigabit NIC supports 2 Gb/s in full duplex, and 10 Gb/s NICs are on the way; a 10 Gb/s TOE can achieve 7.9 Gb/s.
  - A RAID 0 array of 6 SATA disks can achieve more than 300 MB/s.
  - A single 64-bit/66 MHz PCI bus carries only 66 MHz × 8 B = 533 MB/s, shared by all of the above.
  - PCI-X raises this to 1 GB/s (8 Gb/s); PCI Express and InfiniBand are emerging.
• Embedded systems have become more powerful than ever.

BUCS: Bottom-Up Caching Structure
[Figure: BUCS architecture, labeled with the system bus and the network links into the controllers.]
• A functional marriage between the HBA and the NIC.
• Caching at the controller level: data are placed in the lower-level caches.
• Replacement uses LRU among L1, L2, and disk.
• Only metadata are passed up to the bus and host RAM.
• Most reads and writes from the network complete in the lower-level caches with minimal bus transactions; a sketch of this placement policy follows.
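To make the bottom-up data flow concrete, here is a minimal, single-level sketch of controller-side caching with LRU replacement, written in C. It is a sketch under assumptions, not the poster's controller firmware: the names (bucs_read, notify_host), the fixed-size linear-scan LRU, and the single cache level are all illustrative, and the L2 demotion and dirty write-back steps are only marked in comments.

```c
#include <stdio.h>
#include <string.h>

#define CACHE_BLOCKS 8
#define BLOCK_SIZE   4096

struct cache_entry {
    long block;      /* disk block number, -1 while the slot is empty */
    long last_used;  /* logical clock value for LRU replacement */
    char data[BLOCK_SIZE];
};

static struct cache_entry cache[CACHE_BLOCKS];
static long clock_tick;

static void bucs_init(void)
{
    for (int i = 0; i < CACHE_BLOCKS; i++)
        cache[i].block = -1;
}

/* The host sees only metadata (block number, hit/miss), never the data. */
static void notify_host(long block, int hit)
{
    printf("metadata to host: block %ld (%s)\n", block, hit ? "hit" : "miss");
}

static struct cache_entry *lookup(long block)
{
    for (int i = 0; i < CACHE_BLOCKS; i++)
        if (cache[i].block == block)
            return &cache[i];
    return NULL;
}

/* Pick the least recently used slot; in BUCS the victim would be demoted
 * toward the next level down (L2 cache, then disk). */
static struct cache_entry *evict_lru(void)
{
    struct cache_entry *victim = &cache[0];
    for (int i = 1; i < CACHE_BLOCKS; i++)
        if (cache[i].last_used < victim->last_used)
            victim = &cache[i];
    /* a dirty victim would be written back to disk here (omitted) */
    return victim;
}

/* Serve a network read entirely inside the controller: the payload moves
 * between the NIC and the cache without crossing the host bus. */
void bucs_read(long block, char *out)
{
    struct cache_entry *e = lookup(block);
    int hit = (e != NULL);
    if (!hit) {
        e = evict_lru();
        e->block = block;
        memset(e->data, 0, BLOCK_SIZE);  /* stands in for a disk read */
    }
    e->last_used = ++clock_tick;
    memcpy(out, e->data, BLOCK_SIZE);
    notify_host(block, hit);
}

int main(void)
{
    char buf[BLOCK_SIZE];
    bucs_init();
    bucs_read(7, buf);  /* miss: fetched from "disk" into the controller cache */
    bucs_read(7, buf);  /* hit: served from the controller cache */
    return 0;
}
```

The point of the sketch is the last two calls in bucs_read: the payload moves only between the controller cache and the requester, while the host receives a few bytes of metadata, which is why most requests complete with minimal bus transactions.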
BUCS Controller Prototype
[Figure: the BUCS controller prototype.]

Performance
[Figures: read and write performance with a single client; throughput with four clients; TPC-C trace results; and request response time over 10,000 randomly chosen requests.]

Conclusions
• A new cache hierarchy structure that eliminates the bus bottleneck, reduces response time, and increases system throughput by a factor of three.
• Complies with existing standards and is ready to be used.

HELP: Hardware Environment for Low-overhead Profiling
Ming Zhang and Ken Qing Yang, HPCL, Dept. of ECE, URI

Why Profiling?
• System profiling has long been an important mechanism for observing system activity.
• Profiling-based optimization has become a key technique in computer design.
• Continuous, online optimization is needed because of the dynamic nature of computer systems.
• Traditional approaches impose high overhead on already overloaded systems.

HELP
• Offloads the computational overhead of profiling from the host processors to an embedded processor.
• Continuous feedback loop model (sketched in code after the Conclusion below):
  1. Low-overhead profiling of system events.
  2. Parallel processing of the raw data and setting up new policies.
  3. Applying the new policies to the host.
[Figure: the HELP feedback loop between the host and the HELP card, steps 1-3.]

HELP Architecture
• A low-cost, low-power embedded processor.
• Expandable through a secondary PCI slot.
• Interfaces with the host via a standard PCI slot.

Measured Performance
• We ran PostMark alongside LTT, a popular Linux profiling tool, and measured run time and overhead.
[Table: measured run time and profiling overhead.]
• HELP reduces the overhead of profiling to a negligible level.

Adaptive Caching Policy
[Figure: IOMeter results for the buffer cache under random-write workloads.]
• HELP can help by adaptively setting cache policies (a policy-switch sketch also follows the Conclusion below).

Potential Applications
• Performance:
  - low-overhead profiling;
  - adaptive prefetching and caching policies;
  - online code optimization and recompilation.
• Availability:
  - monitoring system events and reporting failures or faults.
• Security:
  - monitoring abnormal system accesses and high-risk events; intrusion detection.
• ......

Conclusion
• HELP is a low-cost, low-power tool for system profiling and optimization.
• It is a plug-and-play device that can be applied to any computer system with PCI slots.
• Its "offload" feature makes it superior to other existing tools.
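The HELP feedback loop described above (profile, process, apply) can be sketched in a few lines of C. This is a schematic, single-process illustration under assumptions: in the real system the event buffer would live in host memory and step 2 would run on the embedded card across PCI, whereas here everything runs sequentially, and the event format, buffer size, and write-count heuristic are invented for the example.

```c
/* Schematic of HELP's continuous feedback loop in one process.
 * In real HELP the ring buffer lives in host memory and analyze()
 * runs on the embedded PCI card, in parallel with the host. */
#include <stdio.h>

#define BUF_EVENTS 1024
#define EV_WRITE   1

struct event { int type; long arg; };

static struct event ring[BUF_EVENTS];
static int head, tail;  /* overflow handling omitted for brevity */

/* Step 1: low-overhead profiling: the host only appends to a buffer. */
static void log_event(int type, long arg)
{
    ring[head] = (struct event){ type, arg };
    head = (head + 1) % BUF_EVENTS;
}

/* Step 2: drain and analyze the raw events (offloaded in real HELP). */
static int analyze(void)
{
    int writes = 0;
    while (tail != head) {
        if (ring[tail].type == EV_WRITE)
            writes++;
        tail = (tail + 1) % BUF_EVENTS;
    }
    return writes;
}

/* Step 3: feed the result back to the host as a new policy. */
static void apply_policy(int writes)
{
    printf("new policy: %s\n", writes > 50 ? "write-back" : "write-through");
}

int main(void)
{
    for (long i = 0; i < 80; i++)
        log_event(EV_WRITE, i);  /* host workload generates events */
    apply_policy(analyze());     /* the offloaded analysis closes the loop */
    return 0;
}
```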
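Step 2 of the loop is where a policy such as the adaptive caching policy above would be computed. The sketch below classifies a recent window of write addresses as mostly sequential or mostly random and picks a buffer-cache write policy to match; the 0.5 threshold and the write-through/write-back pairing are assumptions for illustration, not the poster's measured policy.

```c
/* Sketch of the adaptive-policy step: classify recent writes as mostly
 * sequential or mostly random and pick a cache policy accordingly. */
#include <stdio.h>
#include <stdlib.h>

enum policy { WRITE_THROUGH, WRITE_BACK };

/* Fraction of writes whose block is NOT adjacent to the previous one. */
static double randomness(const long *blocks, int n)
{
    int jumps = 0;
    for (int i = 1; i < n; i++)
        if (labs(blocks[i] - blocks[i - 1]) > 1)
            jumps++;
    return n > 1 ? (double)jumps / (n - 1) : 0.0;
}

static enum policy choose_policy(const long *blocks, int n)
{
    /* Random writes thrash a write-through cache, so batch them with
     * write-back; mostly sequential streams can go straight through. */
    return randomness(blocks, n) > 0.5 ? WRITE_BACK : WRITE_THROUGH;
}

int main(void)
{
    long seq[]  = { 10, 11, 12, 13, 14 };
    long rnd[]  = { 3, 90, 7, 55, 21 };
    printf("sequential -> %s\n",
           choose_policy(seq, 5) == WRITE_BACK ? "write-back" : "write-through");
    printf("random     -> %s\n",
           choose_policy(rnd, 5) == WRITE_BACK ? "write-back" : "write-through");
    return 0;
}
```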