
Parallel Database Systems Survey: VLDB 95 Presentation

This presentation, given at VLDB 95, surveys parallel database systems. It covers technology push and application pull, a benchmark buyer's guide, and parallel database techniques. It also explores the reasons for putting everything in cyberspace, the challenges of data storage and analysis, and the impact of Moore's Law on storage capacity and cost.


Presentation Transcript


  1. Parallel Database Systems 101. Jim Gray & Gordon Bell, Microsoft Corporation. Presented at VLDB 95, Zurich, Switzerland, Sept 1995. Detailed notes are available from Gray@Microsoft.com; this presentation is 120 of the 174 slides (time limit). Notes are in PowerPoint7 and Word7. Jim Gray & Gordon Bell: VLDB 95 Parallel Database Systems Survey

  2. Outline
     Why Parallelism: technology push, application pull.
     Benchmark Buyer’s Guide: metrics, simple tests.
     Parallel Database Techniques: partitioned data, partitioned and pipelined execution, parallel relational operators.
     Parallel Database Systems: Teradata, Tandem, Oracle, Informix, Sybase, DB2, RedBrick.

  3. Kinds Of Information Processing
     Immediate (Network): point-to-point: conversation, money; broadcast: lecture, concert.
     Time Shifted (Data Base): point-to-point: mail; broadcast: book, newspaper.
     It's ALL going electronic. Immediate is being stored for analysis (so ALL database). Analysis & automatic processing are being added.

  4. Why Put Everything in Cyberspace?
     Low rent: min $/byte. Shrinks time: now or later. Shrinks space: here or there. Automate processing: knowbots.
     Network: Point-to-Point OR Broadcast; Immediate OR Time Delayed.
     Data Base: Locate, Process, Analyze, Summarize.

  5. Databases: Information At Your Fingertips™, Information Network™, Knowledge Navigator™
     All information will be in an online database (somewhere).
     You might record everything you
     read: 10 MB/day, 400 GB/lifetime (two tapes);
     hear: 400 MB/day, 16 TB/lifetime (a tape per decade);
     see: 1 MB/s, 40 GB/day, 1.6 PB/lifetime (maybe someday).
     Data storage, organization, and analysis is a challenge. That is what databases are about. DBs do a good job on “records”. Now working on text, spatial, image, and sound.

  6. Database Store ALL Data Types
     The Old World: millions of objects; 100-byte objects. Example table People (Name, Address): David, NY; Mike, Berk; Won, Austin.
     The New World: billions of objects; big objects (1 MB); objects have behavior (methods). Example table People (Name, Voice, Picture, Address, Papers): David, NY; Mike, Berk; Won, Austin.
     Paperless office. Library of Congress online. All information online: entertainment, publishing, business. Information Network, Knowledge Navigator, Information at your fingertips.

  7. Magnetic Storage Cheaper than Paper
     File Cabinet: cabinet (4 drawer) 250$; paper (24,000 sheets) 250$; space (2x3 @ 10$/ft2) 180$; total 700$, or 3 ¢/sheet.
     Disk: disk (8 GB) 2,000$. ASCII: 4 M pages, 0.05 ¢/sheet (60x cheaper). Image: 200 K pages, 1 ¢/sheet (3x cheaper than paper).
     Store everything on disk.
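The per-sheet costs on this slide follow from simple division. The sketch below reproduces them from the slide's own figures; the constant and function names are illustrative and not part of the original talk.

```python
# Cost per stored sheet, using the figures quoted on the slide.
# All names here are illustrative assumptions.
CABINET_TOTAL_USD = 700          # cabinet + paper + floor space, as rounded on the slide
SHEETS_PER_CABINET = 24_000
DISK_USD = 2_000                 # one 8 GB disk
ASCII_PAGES_PER_DISK = 4_000_000
IMAGE_PAGES_PER_DISK = 200_000


def cents_per_sheet(total_usd, sheets):
    """Return the storage cost of one sheet in cents."""
    return 100.0 * total_usd / sheets


paper = cents_per_sheet(CABINET_TOTAL_USD, SHEETS_PER_CABINET)
ascii_on_disk = cents_per_sheet(DISK_USD, ASCII_PAGES_PER_DISK)
image_on_disk = cents_per_sheet(DISK_USD, IMAGE_PAGES_PER_DISK)

print(f"paper: {paper:.2f} cents/sheet")
print(f"ASCII on disk: {ascii_on_disk:.2f} cents/sheet ({paper / ascii_on_disk:.0f}x cheaper)")
print(f"image on disk: {image_on_disk:.2f} cents/sheet ({paper / image_on_disk:.1f}x cheaper)")
```

The computed ratios come out near 58x and 2.9x, which the slide rounds to 60x and 3x.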

  8. Moore’s Law
     XXX doubles every 18 months (60% increase per year): microprocessor speeds, chip density, magnetic disk density, communications bandwidth (WAN bandwidth approaching LANs).
     Exponential growth: the past does not matter. 10x here, 10x there, soon you're talking REAL change.
     PC costs decline faster than any other platform: volume & learning curves. PCs will be the building bricks of all future systems.
     [Chart: 1-chip memory size (2 MB to 32 MB), 1970 to 2000; memory axis from 8 KB to 1 GB; bits per chip from 1K to 256M.]
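The link between "doubles every 18 months" and "60% increase per year" is a compounding calculation. A minimal sketch, with hypothetical function names, is:

```python
def annual_growth(doubling_months):
    """Fractional growth per year for a quantity that doubles every `doubling_months`."""
    return 2 ** (12 / doubling_months) - 1


def growth_over(years, doubling_months):
    """Total growth factor after `years` at the same doubling period."""
    return 2 ** (12 * years / doubling_months)


print(f"per year: {annual_growth(18):.1%}")
print(f"per decade: {growth_over(10, 18):.0f}x")
```

An 18-month doubling period works out to a little under 60% per year and roughly 100x per decade.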

  9. In The Limit: The Pico Processor
     1 M SPECmarks, 1 TFLOP; 10^6 clocks to bulk RAM; event-horizon on chip; VM reincarnated; multi-program cache; on-chip SMP. Terror Bytes!

  10. What's a Terabyte? (250 K$ of Disk @ .25$/MB)
      1 Terabyte is:
      1,000,000,000 business letters (150 miles of bookshelf);
      100,000,000 book pages (15 miles of bookshelf);
      50,000,000 FAX images (7 miles of bookshelf);
      10,000,000 TV pictures (mpeg) (10 days of video);
      4,000 LandSat images.
      Library of Congress (in ASCII) is 25 TB.
      1980: 200 M$ of disc (10,000 discs); 5 M$ of tape silo (10,000 tapes).
      1995: 250 K$ of magnetic disc (70 discs); 500 K$ of optical disc robot (250 platters); 50 K$ of tape silo (50 tapes).
      Terror Byte !!

  11. Summary (of storage)
      Capacity and cost are improving fast (100x per decade). Accesses are getting larger (MOX, GOX, SCANS). BUT latencies and bandwidth are not improving much (3x per decade).
      How to deal with this? Bandwidth: use partitioned parallel access (disk & tape farms). Latency: pipeline data up the storage hierarchy (next section).

  12. Interesting Storage Ratios
      Disk is back to 100x cheaper than RAM: disk & DRAM look good. Nearline tape is only 10x cheaper than disk, and the gap is closing! Why bother with tape?
      [Chart: ratios RAM $/MB : Disk $/MB and Disk $/MB : Nearline Tape, 1960 to 2000, on a scale from 1:1 to 100:1.]

  13. Performance = Storage Accesses, not Instructions Executed
      In the “old days” we counted instructions and IOs. Now we count memory references. Processors wait most of the time.
      70 MIPS; “real” apps have worse Icache misses, so they run at 60 MIPS if well tuned, 20 MIPS if not.
      [Chart: where the time goes, clock ticks used by AlphaSort components: Sort, OS, Disc Wait, Memory Wait (I-Cache Miss, B-Cache Data Miss, D-Cache Miss).]

  14. Storage Latency: How Far Away is the Data?

  15. Network Speeds
      Network speeds grow 60% / year. WAN speeds are limited by politics: if voice is X$/minute, how much is video?
      Switched 100 Mb Ethernet: 1,000x more bandwidth. ATM is a scaleable net: 1 Gb/s to desktop & wall plug; commodity: same for LAN, WAN. 1 Tb/s fibers in laboratory.
      [Chart: Comm Speedups, processors (i/s) and LANs & WANs (b/s), 1e3 to 1e9, by year 1960 to 2000.]

  16. Network Trends & Challenge
      Bandwidth UP 10^4, price DOWN. Speed-of-light unchanged. Software got worse.
      Standard fast nets: ATM, PCI, Myrinet, Tnet.
      HOPE: commodity net plus good software. Then clusters become a SNAP! Commodity: 10k$/slice.
      [Chart: bandwidth from 10^2 to 10^10 b/s (1 Kb/s, 1 Mb/s, 1 Gb/s), 1965 to 2000, for POTS, WAN, LAN, CAN, PC Bus.]

  17. The Seven Price Tiers
      10$: wrist watch computers
      100$: pocket / palm computers
      1,000$: portable computers
      10,000$: personal computers (desktop)
      100,000$: departmental computers (closet)
      1,000,000$: site computers (glass house)
      10,000,000$: regional computers (glass castle)
      SuperServer: costs more than 100,000$. “Mainframe”: costs more than 1M$. Must be an array of processors, disks, tapes, comm ports.

  18. The New Computer Industry
      Horizontal integration is the new structure. Each layer picks the best from the lower layer. Desktop (C/S) market: 1991: 50%; 1995: 75%.
      Function: Example
      Operation: AT&T
      Integration: EDS
      Applications: SAP
      Middleware: Oracle
      Baseware: Microsoft
      Systems: Compaq
      Silicon & Oxide: Intel & Seagate

  19. Software Economics: Bill’s Law
      Bill Joy’s law (Sun): Don’t write software for less than 100,000 platforms. @ 10M$ engineering expense, 1,000$ price.
      Bill Gates’s law: Don’t write software for less than 1,000,000 platforms. @ 10M$ engineering expense, 100$ price.
      Examples: UNIX vs NT: 3,500$ vs 500$. UNIX-Oracle vs SQL-Server: 100,000$ vs 1,000$. No spreadsheet or presentation pack on UNIX/VMS/...
      Commoditization of base software & hardware.

  20. Thesis: Many Little will Win over Few Big
      [Diagram: Mainframe, Mini, Micro, Nano; price points 1 M$, 100 K$, 10 K$; disk form factors 14", 9", 5.25", 3.5", 2.5", 1.8".]

  21. Year 2000 4B Machine
      The Year 2000 commodity PC (3K$): Billion Instructions/Sec; Billion Bytes RAM; Billion Bits/s Net; 10 B Bytes Disk; Billion Pixel display (3000 x 3000 x 24 pixel).
      [Diagram: 1 Bips Processor; .1 B byte RAM; 10 B byte Disk; 1 B bits/sec LAN/WAN.]

  22. 4 B PC’s: The Bricks of Cyberspace
      Cost 3,000$. Come with: OS (NT, POSIX, ...), DBMS, high speed net, system management, GUI / OOUI, tools. Compatible with everyone else. CyberBricks.

  23. Implications of Hardware Trends
      Large disc farms will be inexpensive (100$/GB). Large RAM databases will be inexpensive (1,000$/GB). Processors will be inexpensive.
      So the building block will be a processor with large RAM and lots of disc: 1k SPECint CPU, 5 GB RAM, 50 GB disc.

  24. Implication of Hardware Trends: Clusters
      Future servers are CLUSTERS of processors and discs (each node: CPU, 5 GB RAM, 50 GB disc). Distributed database techniques make clusters work.

  25. Future SuperServer: 4T Machine
      Array of 1,000 4B machines: processors, disks, tapes, comm lines. A few MegaBucks.
      Challenge: manageability, programmability, security, availability, scaleability, affordability. As easy as a single system.
      [Diagram: 100 nodes = 1 Tips; 1,000 discs = 10 Terrorbytes; 100 tape transports = 1,000 tapes = 1 PetaByte; high speed network (10 Gb/s).]

  26. Great Debate: Shared What?
      Shared Memory (SMP): easy to program; difficult to build; difficult to scaleup. Examples: Sequent, SGI, Sun.
      Shared Disk: examples: VMScluster, Sysplex.
      Shared Nothing (network): hard to program; easy to build; easy to scaleup. Examples: Tandem, Teradata, SP2.
      Winner will be a synthesis of these ideas. Distributed shared memory (DASH, Encore) blurs the distinction between network and bus (locality still important), but gives shared memory message cost.

  27. Scaleables: Uneconomic So Far
      A slice is a processor, memory, and a few disks. Slice price of scaleables so far is a 5x to 10x markup:
      Teradata: 70k$ for an Intel 486 + 32 MB + 4 disk
      Tandem: 100k$ for a MipsCo R4000 + 64 MB + 4 disk
      Intel: 75k$ for an I860 + 32 MB + 2 disk
      TMC: 75k$ for a SPARC 3 + 32 MB + 2 disk
      IBM/SP2: 100k$ for a R6000 + 64 MB + 8 disk
      Compaq slice price is less than 10k$.
      What is the problem? Proprietary interconnect; proprietary packaging; proprietary software (vendorIX).

  28. Summary
      Storage trends force pipeline & partition parallelism: lots of bytes & bandwidth per dollar; lots of latency.
      Processor trends force pipeline & partition: lots of MIPS per dollar; lots of processors.
      Putting it together (Scaleable Networks and Platforms): build clusters of commodity processors & storage. Commodity interconnect is key (S of PMS); traditional interconnects give 100k$/slice. Commodity cluster operating system is key. Fault isolation and tolerance is key. Automatic parallel programming is key.

  29. The Hardware is in Place and Then A Miracle Occurs?
      SNAP: Scaleable Network And Platforms. Commodity distributed OS built on commodity platforms and a commodity network interconnect. Enables parallel applications.

  30. Why Parallel Access To Data?
      Bandwidth: at 10 MB/s, 1.2 days to scan; 1,000x parallel, 1.5 minute SCAN.
      Parallelism: divide a big problem into many smaller ones to be solved in parallel.
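The scan times on this slide can be checked with a one-line model. The sketch below assumes the data set is one terabyte (10^12 bytes), which the transcript does not state but which matches the quoted 1.2 days at 10 MB/s; the function name is hypothetical, and perfect partitioning with no startup or skew cost is assumed.

```python
TERABYTE = 10 ** 12            # assumed data set size, not stated in the transcript
DISK_BYTES_PER_SEC = 10 * 10 ** 6


def scan_seconds(total_bytes, bytes_per_sec_per_disk, disks=1):
    """Elapsed time to scan data spread evenly over `disks` independent disks."""
    return total_bytes / (bytes_per_sec_per_disk * disks)


serial = scan_seconds(TERABYTE, DISK_BYTES_PER_SEC)
parallel = scan_seconds(TERABYTE, DISK_BYTES_PER_SEC, disks=1_000)

print(f"1 disk: {serial / 86_400:.2f} days")
print(f"1,000 disks: {parallel / 60:.2f} minutes")
```

The idealized parallel figure is about 1.7 minutes, in line with the slide's rounded "1.5 minute" scan.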

  31. DataFlow Programming: Prefetch & Postwrite Hide Latency
      Latency: can't wait for the data to arrive (2,000 years!). Need a memory that gets the data in advance (100 MB/s).
      Solution: pipeline from source (tape, disc, RAM, ...) to CPU cache; pipeline results to destination.

  32. Why are Relational Operators So Successful for Parallelism?
      Relational data model: uniform operators on uniform data streams, closed under composition. Each operator consumes 1 or 2 input streams. Each stream is a uniform collection of data. Sequential data in and out: pure dataflow.
      Partitioning some operators (e.g. aggregates, non-equi-join, sort, ...) requires innovation.
      AUTOMATIC PARALLELISM

  33. Database Systems “Hide” Parallelism
      Automate system management via tools: data placement; data organization (indexing); periodic tasks (dump / recover / reorganize).
      Automatic fault tolerance: duplex & failover; transactions.
      Automatic parallelism: among transactions (locking); within a transaction (parallel execution).

  34. Automatic Parallel OR DB
      Select image from landsat where date between 1970 and 1990 and overlaps(location, :Rockies) and snow_cover(image) > .7;
      The three predicates are temporal (date), spatial (location), and image (snow_cover) tests.
      Assign one process per processor/disk: find images with the right date & location; analyze each image and, if 70% snow, return it.
      [Diagram: Landsat table (date, loc, image), with dates from 1/2/72 to 4/8/95 and locations such as 33N 120W and 34N 120W, filtered by date, location, & image tests into the Answer (image).]

  35. Outline
      Why Parallelism: technology push, application pull.
      Benchmark Buyer’s Guide: metrics, simple tests.
      Parallel Database Techniques: partitioned data, partitioned and pipelined execution, parallel relational operators.
      Parallel Database Systems: Teradata, Tandem, Oracle, Informix, Sybase, DB2, RedBrick.

  36. Parallelism: Speedup & Scaleup
      Speedup: same job, more hardware, less time.
      Scaleup: bigger job, more hardware, same time.
      Transaction scaleup: more clients/servers, same response time.
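Speedup and scaleup are usually expressed as ratios of elapsed times. The sketch below uses the common formulation (small-system time divided by big-system time); the function names and the sample timings are made up for illustration and are not measurements from the talk.

```python
def speedup(small_system_secs, big_system_secs):
    """Same job on both systems; linear speedup on N times the hardware is N."""
    return small_system_secs / big_system_secs


def scaleup(small_job_small_system_secs, big_job_big_system_secs):
    """Job and hardware grow together; linear scaleup is 1.0."""
    return small_job_small_system_secs / big_job_big_system_secs


# Hypothetical timings for a system grown 4x.
hardware_factor = 4
s = speedup(100.0, 27.0)
print(f"speedup {s:.2f} of an ideal {hardware_factor}")
print(f"scaleup {scaleup(100.0, 110.0):.2f} of an ideal 1.00")
```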

  37. The New Law of Computing
      Grosch's Law: 2x $ is 4x performance. 1 MIPS costs 1$; 1,000 MIPS costs 32$ (.03$/MIPS).
      Parallel Law: 2x $ is 2x performance. 1 MIPS costs 1$; 1,000 MIPS costs 1,000$. Needs linear speedup and linear scaleup. Not always possible.
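The two price points on this slide follow from the two scaling rules. Under Grosch's Law performance grows with the square of cost, so cost grows with the square root of performance; under the parallel law cost grows linearly. A minimal sketch, with hypothetical function names and the slide's 1 MIPS for 1$ baseline:

```python
import math


def grosch_cost(mips, base_mips=1.0, base_cost=1.0):
    """Cost when 2x the money buys 4x the performance."""
    return base_cost * math.sqrt(mips / base_mips)


def parallel_cost(mips, base_mips=1.0, base_cost=1.0):
    """Cost when 2x the money buys 2x the performance."""
    return base_cost * (mips / base_mips)


for law in (grosch_cost, parallel_cost):
    cost = law(1_000)
    print(f"{law.__name__}: {cost:,.0f}$ for 1,000 MIPS, {cost / 1_000:.3f} $/MIPS")
```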

  38. Parallelism: Performance is the Goal
      Goal is to get 'good' performance.
      Law 1: a parallel system should be faster than a serial system.
      Law 2: a parallel system should give near-linear scaleup or near-linear speedup or both.

  39. The New Performance Metrics
      Transaction Processing Performance Council:
      TPC-A: simple transaction. TPC-B: server only, about 3x lighter than TPC-A. Both obsoleted by TPC-C (no new results after 6/7/95).
      TPC-C (revision 3): Transactions Per Minute (tpm-C). Mix of 5 transactions: query, update, minibatch. Terminal price eliminated. About 5x heavier than tpcA (so 3.5 ktpcA ≈ 20 ktpmC).
      TPC-D, approved in March 1995: Transactions Per Hour. Scaleable database (30 GB, 100 GB, 300 GB, ...). 17 complex SQL queries (no rewrites, no hints without permission). 2 load/purge queries. No official results yet, many “customer” results.

  40. TPC-C Results 12/94. Courtesy of Charles Levine of Tandem (of course).

  41. Success Stories
      Online Transaction Processing: many little jobs. SQL systems support 3,700 tps-A (24 cpu, 240 disk). SQL systems support 21,000 tpm-C (112 cpu, 670 disks).
      Batch (decision support and utility): few big jobs, parallelism inside. Scan data at 100 MB/s. Linear scaleup to 500 processors.
      [Charts: transactions/sec vs. hardware; recs/sec vs. hardware.]

  42. The Perils of Parallelism
      Startup: creating processes, opening files, optimization.
      Interference: device (cpu, disc, bus); logical (lock, hotspot, server, log, ...).
      Skew: if tasks get very small, variance > service time.

  43. Benchmark Buyer's Guide
      Things to ask: When does it stop scaling? Throughput numbers, not ratios.
      Standard benchmarks allow comparison to others and comparison to sequential.
      Ratios and non-standard benchmarks are red flags.

  44. Performance 101: Scan Rate
      Disk is 3 MB/s to 10 MB/s. Record is 100 B to 200 B (TPC-D 110...160, Wisconsin 204). So should be able to read 10 kr/s to 100 kr/s.
      Simple test: time this on a 1M record table (table on one disk, turn off parallelism):
      SELECT count(*) FROM T WHERE x < :infinity;
      Typical problems: disk or controller is an antique; no read-ahead in operating system or DB; small page reads (2 KB); data not clustered on disk; big cpu overhead in record movement.
      Parallelism is not the cure for these problems.
      [Diagram: Scan feeding Agg Count.]
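The scan-rate test can be tried on any SQL engine. The sketch below uses Python's built-in `sqlite3` module with an in-memory table purely as a stand-in, so it measures CPU cost per record rather than disk bandwidth; the table layout, padding column, and function names are assumptions for illustration, not the authors' test harness.

```python
import sqlite3
import time


def expected_records_per_sec(disk_bytes_per_sec, record_bytes):
    """Upper bound on scan rate if the disk is the only bottleneck."""
    return disk_bytes_per_sec / record_bytes


def measure_scan_rate(n_records=100_000, pad_bytes=100):
    """Time SELECT count(*) over a full table scan; return (count, records/s)."""
    con = sqlite3.connect(":memory:")      # in-memory stand-in for a one-disk table
    con.execute("CREATE TABLE T (x INTEGER, pad TEXT)")
    pad = "p" * pad_bytes                   # pads each row to roughly 100 bytes
    con.executemany("INSERT INTO T VALUES (?, ?)",
                    ((i, pad) for i in range(n_records)))
    con.commit()
    infinity = n_records                    # larger than every x, so all rows qualify
    start = time.perf_counter()
    (count,) = con.execute("SELECT count(*) FROM T WHERE x < ?",
                           (infinity,)).fetchone()
    elapsed = max(time.perf_counter() - start, 1e-9)
    con.close()
    return count, count / elapsed


print(expected_records_per_sec(3_000_000, 200), "to",
      expected_records_per_sec(10_000_000, 100), "records/s expected from disk speed")
count, rate = measure_scan_rate()
print(f"scanned {count} records at {rate:,.0f} records/s")
```

The predicate `x < :infinity` matters because it forces the engine to examine every record instead of answering the count from metadata.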

  45. Parallel Scan Rate
      Simplest parallel test: scale up the previous test: 4 disks, 4 controllers, 4 processors, 4 times as many records, partitioned 4 ways. Same query. Should have the same elapsed time. Some systems do.
      [Diagram: four Scan operators, each feeding an Agg Count, all feeding one Agg Sum.]
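The operator tree on this slide (one scan and partial count per partition, then a sum of the partial counts) can be sketched as follows. Threads and in-memory lists are assumptions made so the example runs anywhere; they show the plan shape only and will not show real speedup in CPython.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_count(partition, limit):
    """Scan + Agg Count: count records in one partition with x < limit."""
    return sum(1 for x in partition if x < limit)


def parallel_count(partitions, limit):
    """Run one scan per partition, then Agg Sum over the partial counts."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partial_counts = list(pool.map(lambda p: scan_count(p, limit), partitions))
    return sum(partial_counts)


records = list(range(4_000))
partitions = [records[i::4] for i in range(4)]   # 4-way round-robin partitioning
total = parallel_count(partitions, limit=10 ** 9)
print(total)
```

Counting per partition and summing afterwards works because count is decomposable: only one small number per partition crosses to the final aggregator.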

  46. Parallel Update Rate
      Test: UPDATE T SET x = x + :one;
      Test for a million row T on 1 disk. Test for a four million row T on 4 disks. Look for bottlenecks.
      After each call, execute ROLLBACK WORK. See if UNDO runs at the DO speed. See if UNDO is parallel (scales up).
      [Diagram: UPDATE writing to the Log.]
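A small harness for the DO versus UNDO comparison might look like the sketch below. It again uses Python's `sqlite3` with an in-memory table as a stand-in for the system under test, so the absolute timings mean little; the function name and table size are assumptions.

```python
import sqlite3
import time


def time_update_and_undo(n_rows=100_000):
    """Return (do_secs, undo_secs, sum_after_rollback) for UPDATE then ROLLBACK."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE T (x INTEGER)")
    con.executemany("INSERT INTO T VALUES (?)", ((i,) for i in range(n_rows)))
    con.commit()

    one = 1
    t0 = time.perf_counter()
    con.execute("UPDATE T SET x = x + ?", (one,))   # DO: opens an implicit transaction
    t1 = time.perf_counter()
    con.rollback()                                   # UNDO: ROLLBACK WORK
    t2 = time.perf_counter()

    (total,) = con.execute("SELECT sum(x) FROM T").fetchone()
    con.close()
    return t1 - t0, t2 - t1, total


do_secs, undo_secs, total = time_update_and_undo()
print(f"DO {do_secs:.4f}s, UNDO {undo_secs:.4f}s, sum after rollback {total}")
```

Checking the sum after the rollback confirms that the UNDO actually restored every row before its timing is compared with the DO timing.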

  47. The records/$/second Metric
      Parallel database systems scan data. An interesting metric (100 byte record): Record Scan Rate / System Cost.
      Typical scan rates: 1k records/s to 30k records/s.
      Each scaleable system has a “slice price”. Guesses:
      Gateway: 15k$ (P5 + ATM + 2 disks + NT + SQLserver or Informix or Oracle)
      Teradata: 75k$
      Sequent: 75k$ (P5 + 2 disks + Dynix + Informix)
      Tandem: 100k$
      IBM SP2: 130k$ (RS6000 + 2 disks, AIX, DB2)
      You can compute slice price for systems later in the presentation.
      BAD: 0.1 records/s/$ (there is one of these). GOOD: 0.33 records/s/$ (there is one of these). Super! 1.00 records/s/$ (there is one of these).
      We should aim at 10 records/s/$ with P6.
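The metric itself is a single division. The sketch below shows it with hypothetical pairings of scan rate and slice price; the slide does not say which scan rate belongs to which vendor, so the example inputs are illustrative only.

```python
def records_per_sec_per_dollar(scan_rate, slice_price_usd):
    """Record scan rate divided by system (slice) cost."""
    return scan_rate / slice_price_usd


def scan_rate_needed(target_metric, slice_price_usd):
    """Scan rate a slice must sustain to reach a target records/s/$."""
    return target_metric * slice_price_usd


# Hypothetical pairings, not measurements from the talk.
print(records_per_sec_per_dollar(10_000, 100_000))   # a slow, expensive slice
print(records_per_sec_per_dollar(30_000, 15_000))    # a fast, cheap slice
print(scan_rate_needed(10, 15_000))                   # rate needed to hit the 10 records/s/$ goal
```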

  48. Embarrassing Questions to Ask Your PDB Vendor
      How are constraints checked? Ask about unique secondary indices, deferred constraints, and referential integrity.
      How does parallelism interact with triggers, stored procedures, and OO extensions?
      How can I change my 10 TB database design in an hour (add index, add constraint, reorganize / repartition)?
      These are hard problems.

  49. Outline
      Why Parallelism: technology push, application pull.
      Benchmark Buyer’s Guide: metrics, simple tests.
      Parallel Database Techniques: partitioned data, partitioned and pipelined execution, parallel relational operators.
      Parallel Database Systems: Teradata, Tandem, Oracle, Informix, Sybase, DB2, RedBrick.

  50. Automatic Data Partitioning
      Split a SQL table to a subset of nodes & disks. Partition within the set:
      Range: good for equijoins, range queries, group-by.
      Hash: good for equijoins.
      Round Robin: good to spread load.
      Shared disk and memory are less sensitive to partitioning. Shared nothing benefits from "good" partitioning.
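The three partitioning schemes can each be written as a function from a record to a partition number. This is a minimal sketch; the function names, the boundary list, and the use of CRC32 as the hash are assumptions for illustration, not how any particular product does it.

```python
import bisect
import zlib


def range_partition(key, boundaries):
    """Partition i holds keys below boundaries[i]; the last holds the rest."""
    return bisect.bisect_right(boundaries, key)


def hash_partition(key, n_partitions):
    """Equal keys always land in the same partition, which suits equijoins."""
    return zlib.crc32(str(key).encode("utf-8")) % n_partitions


def round_robin_partition(row_number, n_partitions):
    """Ignore the key and deal rows out in turn to spread load evenly."""
    return row_number % n_partitions


keys = [5, 17, 42, 17, 99]
for row_number, key in enumerate(keys):
    print(key,
          range_partition(key, boundaries=[10, 50]),
          hash_partition(key, 3),
          round_robin_partition(row_number, 3))
```

Range partitioning keeps neighboring keys together, so a range query touches few partitions. Hash partitioning places equal keys together, and round robin balances load but gives no help in locating a key.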
