Distributed Database Systems



  1. Distributed Database Systems Dr. Mohamed Osman Hegazi

  2. Definitions
     • Distributed database (DDB): a collection of multiple logically interrelated databases distributed over a computer network.
     • Distributed database management system (DDBMS): the software that permits the management of the DDB and makes the distribution transparent to the users.
     • Distributed database system (DDBS) = DDB + DDBMS
     • The two important terms in these definitions are:
       • Logically interrelated (the application)
       • Distributed over a network

  3. Motivation for Distributed Databases
     • The development of computer networks promotes decentralization.
     • In a company, the database organization might reflect the organizational structure, which is distributed into units. Each unit maintains its own database.
     • Data sharing can be achieved by developing a distributed database system which:
       • Makes data accessible to all units
       • Stores data close to where it is most frequently used

  4. DDBMS Advantages
     • Data are located near the site of greatest demand
     • Faster data access
     • Faster data processing
     • Growth facilitation
     • Improved communications
     • Reduced operating costs
     • User-friendly interface
     • Less danger of a single-point failure
     • Processor independence

  5. DDBMS Disadvantages
     • Complexity of management and control
     • Security
     • Lack of standards
     • Increased storage requirements
     • Greater difficulty in managing the data environment
     • Increased training cost

  6. The Concept of DDB
     A DDBS is not merely a collection of files stored individually at each node of a computer network. To form a DDBS, the files should not only be logically related; there should also be structure among them, and access should be via a common interface.

  7. Distributed Database Management Systems

  8. An Example
     EMP(ENO, ENAME, TITLE)
     ASG(ENO, PNO, DUR, RESP)
     PROJ(PNO, PNAME, BUDGET)
     PAY(TITLE, SAL)

  9. Distributed Query
     • If these tables are stored in one place, we can, for example, use the following query to get the names and salaries of the employees who have worked on a project for more than 12 months:
         SELECT ENAME, SAL
         FROM EMP, ASG, PAY
         WHERE ASG.DUR > 12
           AND EMP.ENO = ASG.ENO
           AND PAY.TITLE = EMP.TITLE
     • But if these tables are distributed over different sites, executing this query requires a lot of additional processing. The DDBMS does this work and lets the end user feel like the database's only user (transparency).
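
To make the extra processing concrete, here is a minimal sketch that simulates two sites with two in-memory SQLite databases. The placement of tables (EMP and PAY at one site, ASG at the other), the sample rows, and the function name `employees_over` are all assumptions made for illustration; the slides do not specify them, and a real DDBMS may choose a different execution strategy.

```python
import sqlite3

# Two in-memory databases stand in for two sites (an assumption for illustration).
site_a = sqlite3.connect(":memory:")  # holds EMP and PAY
site_b = sqlite3.connect(":memory:")  # holds ASG

# Hypothetical sample rows; only the schema comes from the slides.
site_a.executescript("""
    CREATE TABLE EMP (ENO TEXT, ENAME TEXT, TITLE TEXT);
    CREATE TABLE PAY (TITLE TEXT, SAL INTEGER);
    INSERT INTO EMP VALUES ('E1', 'Ann', 'Engineer'), ('E2', 'Bob', 'Analyst');
    INSERT INTO PAY VALUES ('Engineer', 40000), ('Analyst', 34000);
""")
site_b.executescript("""
    CREATE TABLE ASG (ENO TEXT, PNO TEXT, DUR INTEGER, RESP TEXT);
    INSERT INTO ASG VALUES ('E1', 'P1', 18, 'Manager'), ('E2', 'P1', 6, 'Analyst');
""")


def employees_over(months):
    """Return (ENAME, SAL) for employees with an assignment longer than `months`."""
    # Step 1 (at site B): apply the selection on ASG locally and ship only ENO values.
    # DISTINCT means an employee with several long assignments is reported once,
    # which differs slightly from the plain join in the slide.
    enos = [row[0] for row in site_b.execute(
        "SELECT DISTINCT ENO FROM ASG WHERE DUR > ?", (months,))]
    if not enos:
        return []
    # Step 2 (at site A): join the shipped ENO values with EMP and PAY.
    placeholders = ",".join("?" for _ in enos)
    return site_a.execute(
        f"SELECT ENAME, SAL FROM EMP, PAY "
        f"WHERE EMP.ENO IN ({placeholders}) AND PAY.TITLE = EMP.TITLE "
        f"ORDER BY ENAME", enos).fetchall()


result = employees_over(12)
print(result)
```

The user calls one function and never sees which site holds which table, which is the transparency the slide describes.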

  10. Distributed Database Transparency
      • The concept of a DDB is to fragment the data and store each fragment at its own site.
      • Data may be replicated at different sites (replication).
      • The DDBMS hides these details and makes the distribution transparent to the users.
      Distributed database transparency features:
      • Distribution transparency
      • Transaction transparency
      • Failure transparency
      • Performance transparency
      • Heterogeneity transparency

  11. Distributed DB Design
      Top-down approach:
      • Start from a single database
      • Decide how to split it and allocate the pieces to individual sites
      Two issues in top-down design:
      • Fragmentation
      • Allocation
      Multi-databases (bottom-up approach):
      • Combine existing databases
      • Decide how to deal with heterogeneity and autonomy

  12. Fragmentation
      • Horizontal
        • Primary: depends on local attributes
        • Derived: depends on a foreign relation
      • Vertical

  13. Motivation (example)
      • Employee relation: E(#, name, loc, sal, …)
      • Two sites: Sa, Sb
      • Qa → Sa, Qb → Sb
      • 40% of queries: Qa: select * from E where loc=Sa and …
      • 40% of queries: Qb: select * from E where loc=Sb and ...

  14. Primary horizontal fragmentation
      E
      #   Name   Loc  Sal
      5   Joe    Sa   10
      7   Sally  Sb   25
      8   Tom    Sa   15
      ..  ..

      F = {F1, F2}
      F1 = σ[loc=Sa](E), stored at Sa
      #   Name   Loc  Sal
      5   Joe    Sa   10
      8   Tom    Sa   15
      ..

      F2 = σ[loc=Sb](E), stored at Sb
      #   Name   Loc  Sal
      7   Sally  Sb   25
      ..
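
The fragmentation on this slide can be sketched in a few lines of Python. The three rows are the ones shown on the slide; the helper name `select` and the tuple layout are assumptions made for this illustration.

```python
# Rows of E taken from the slide, laid out as (#, name, loc, sal).
E = [(5, "Joe", "Sa", 10), (7, "Sally", "Sb", 25), (8, "Tom", "Sa", 15)]


def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


F1 = select(E, lambda t: t[2] == "Sa")  # fragment stored at site Sa
F2 = select(E, lambda t: t[2] == "Sb")  # fragment stored at site Sb

# The fragments are disjoint, and their union reconstructs E.
disjoint = not (set(F1) & set(F2))
reconstructed = sorted(F1 + F2)

print(F1)
print(F2)
```

Because each query Qa or Qb filters on `loc`, it can be answered entirely from the fragment stored at its own site.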

  15. Choosing a fragmentation
      • Qa: Select … loc = SA ...
      • Qb: Select … loc = SB ...
      • [Figure: E partitioned by the predicates Loc=SA / Loc=SB and sal < 10 / sal ≥ 10, showing candidate fragmentations F1, F2, and F3]
      • Prefer F2 to F1 and F3

  16. Horizontal Fragmentation: peer-to-peer relationship (brothers)

  17. Vertical Fragmentation
      • Example: E is split into E1 and E2 [figure]
      • R[T] ⇒ R1[T1], R2[T2], …, Rn[Tn], with Ti ⊆ T
      • ⇒ Just like normalization of relations

  18. Vertical Fragmentation Example
      PROJ
      PNO  PNAME              BUDGET  LOC
      P1   Instrumentation    150000  Montreal
      P2   Database Develop.  135000  New York
      P3   CAD/CAM            250000  New York
      P4   Maintenance        310000  Paris
      P5   CAD/CAM            500000  Boston

      PROJ1: information about project budgets
      PNO  BUDGET
      P1   150000
      P2   135000
      P3   250000
      P4   310000
      P5   500000

      PROJ2: information about project names and locations
      PNO  PNAME              LOC
      P1   Instrumentation    Montreal
      P2   Database Develop.  New York
      P3   CAD/CAM            New York
      P4   Maintenance        Paris
      P5   CAD/CAM            Boston
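
As a small sketch of this example, the code below projects PROJ onto the two attribute groups and then rebuilds it by joining on the key PNO. The rows are those on the slide; the helper names `project` and `join_on_key` are assumptions made for illustration.

```python
# PROJ rows as shown on the slide.
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "Database Develop.", "BUDGET": 135000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000, "LOC": "New York"},
    {"PNO": "P4", "PNAME": "Maintenance", "BUDGET": 310000, "LOC": "Paris"},
    {"PNO": "P5", "PNAME": "CAD/CAM", "BUDGET": 500000, "LOC": "Boston"},
]


def project(relation, attrs):
    """Relational projection: keep only the named attributes of each row."""
    return [{a: row[a] for a in attrs} for row in relation]


def join_on_key(left, right, key):
    """Join two fragments on a shared key attribute."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Every vertical fragment repeats the key PNO so the relation can be rebuilt.
PROJ1 = project(PROJ, ["PNO", "BUDGET"])
PROJ2 = project(PROJ, ["PNO", "PNAME", "LOC"])

rebuilt = join_on_key(PROJ1, PROJ2, "PNO")
print(rebuilt == PROJ)
```

Repeating the key in both fragments is what makes the join lossless.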

  19. Grouping Attributes
      • Example: E(#, NM, LOC, SAL)
        • Option 1: E1(#, NM, LOC), E2(#, SAL)
        • Option 2: E1(#, NM), E2(#, LOC), E3(#, SAL)
      • Which is the right vertical fragmentation?

  20. Vertical Fragmentation: branch relationship (parent and son)

  21. Hybrid Fragmentation
      R
      ├─ HF → R1
      │    ├─ VF → R11
      │    └─ VF → R12
      └─ HF → R2
           ├─ VF → R21
           ├─ VF → R22
           └─ VF → R23
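
A minimal sketch of hybrid fragmentation follows: a horizontal split first, then a vertical split of each horizontal fragment. It reuses the employee rows from slide 14 as R, which is an assumption, and it splits R2 into two vertical fragments rather than the three shown in the figure. The helper names `vertical` and `rejoin` are hypothetical.

```python
# R reuses the employee rows from slide 14: (#, name, loc, sal).
R = [(5, "Joe", "Sa", 10), (7, "Sally", "Sb", 25), (8, "Tom", "Sa", 15)]

# Horizontal fragmentation (HF) on the loc attribute.
R1 = [t for t in R if t[2] == "Sa"]
R2 = [t for t in R if t[2] == "Sb"]


def vertical(relation, cols):
    """Vertical fragment: the key (column 0) plus the requested columns."""
    return [(t[0],) + tuple(t[c] for c in cols) for t in relation]


def rejoin(left, right):
    """Rebuild rows by matching the key in column 0 of both fragments."""
    by_key = {t[0]: t[1:] for t in right}
    return [t + by_key[t[0]] for t in left]


# Vertical fragmentation (VF) of each horizontal fragment.
R11, R12 = vertical(R1, [1, 2]), vertical(R1, [3])
R21, R22 = vertical(R2, [1, 2]), vertical(R2, [3])

# Reconstruction: join the vertical fragments, then union the results.
rebuilt = rejoin(R11, R12) + rejoin(R21, R22)
print(sorted(rebuilt) == sorted(R))
```

Reconstruction runs in the reverse order of fragmentation: joins undo the vertical splits, and a union undoes the horizontal split.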

  22. Allocation
      • Example: E ⇒ F1 = σ[loc=Sa](E); F2 = σ[loc=Sb](E)
      • [Figure: E is fragmented into F1 and F2, which are placed at sites a, b, and c; F1 appears at more than one site]
      • Do we replicate fragments?
      • Where do we place each copy of each fragment?

  23. Allocation Alternatives
      • Non-replicated
        • Partitioned: each fragment resides at only one site
      • Replicated
        • Fully replicated: each fragment at each site
        • Partially replicated: each fragment at some of the sites
      • Rule: if (read-only queries / update queries) ≥ 1, replication is advantageous; otherwise replication may cause problems
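
The rule on this slide can be written as a one-line check. The function name `replication_advised` and the handling of a workload with no updates are assumptions made for this sketch, not part of the slides.

```python
def replication_advised(read_only_queries, update_queries):
    """Apply the slide's rule: replicate when reads / updates >= 1."""
    if read_only_queries < 0 or update_queries < 0:
        raise ValueError("query counts must be non-negative")
    if update_queries == 0:
        # Assumption: with no updates there is no cost for keeping copies in sync.
        return True
    return read_only_queries / update_queries >= 1


print(replication_advised(80, 20))  # read-heavy workload
print(replication_advised(10, 40))  # update-heavy workload
```

The rule is only a first filter; the counts would normally come from measuring the workload on each fragment.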

  24. Optimization Problem (a very hard problem)
      • What is the best placement of fragments and/or the best number of copies to:
        • minimize query response time
        • maximize throughput
        • minimize “some cost”
        • ...
      • Subject to constraints:
        • Available storage
        • Available bandwidth, processing power, …
        • Keep 90% of response times below X
        • ...
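
For a toy instance, the problem can be solved by brute force, which also shows why it gets hard: the number of placements grows exponentially with the number of fragments. Everything in this sketch is an assumption made for illustration: the access counts, the capacity of one fragment per site, the restriction to non-replicated placements, and the cost model (each remote access costs 1, each local access costs 0).

```python
from itertools import product

SITES = ["a", "b", "c"]
FRAGMENTS = ["F1", "F2"]

# Hypothetical workload: ACCESSES[site][fragment] = accesses per unit time.
ACCESSES = {
    "a": {"F1": 40, "F2": 5},
    "b": {"F1": 5, "F2": 40},
    "c": {"F1": 10, "F2": 10},
}
# Hypothetical storage constraint: maximum number of fragments per site.
CAPACITY = {"a": 1, "b": 1, "c": 1}


def remote_cost(placement):
    """Count the accesses that must go to another site under this placement."""
    return sum(
        count
        for site, per_fragment in ACCESSES.items()
        for fragment, count in per_fragment.items()
        if placement[fragment] != site
    )


def feasible(placement):
    """Check the storage constraint at every site."""
    return all(
        sum(1 for s in placement.values() if s == site) <= CAPACITY[site]
        for site in SITES
    )


def best_placement():
    """Enumerate every non-replicated placement and keep the cheapest feasible one."""
    candidates = (
        dict(zip(FRAGMENTS, sites))
        for sites in product(SITES, repeat=len(FRAGMENTS))
    )
    return min((p for p in candidates if feasible(p)), key=remote_cost)


best = best_placement()
print(best, remote_cost(best))
```

With 3 sites and 2 fragments there are only 9 placements to check; allowing replication or adding fragments makes exhaustive search impractical, which is why heuristics are used in practice.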

  25. Replication
      Replication stores copies of the same data at more than one location (site); these copies must then be updated consistently, despite their distance from each other.
      Updates to the copies are controlled by one of two techniques:
      • Lazy replication: the other copies are updated after work on one of the copies (the master copy) has completed. The update is therefore propagated outside the boundaries of the transaction.
      • Eager replication: the replicated data are updated within the transaction boundaries, while the transaction is working on one of the copies.
      Where the update is applied:
      • Central update (primary copy): update the primary copy first and then update the secondary copies. This method removes the need to synchronize concurrent updates across sites, which makes consistency easier to control, but the primary copy may become a bottleneck.
      • Update everywhere: any copy may be updated, so all copies have an equal opportunity to accept updates.
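
The difference between the two techniques can be sketched with a toy in-memory model of one data item with a primary copy and some secondary copies. The class `ReplicatedItem` and its method names are hypothetical; the model ignores networking, failures, and concurrency, so it only shows when the secondary copies change relative to the transaction.

```python
class ReplicatedItem:
    """One logical data item with a primary copy and several secondary copies."""

    def __init__(self, value, n_secondaries):
        self.primary = value
        self.secondaries = [value] * n_secondaries
        self.pending = []  # updates committed at the primary but not yet propagated

    def eager_update(self, value):
        # Eager: every copy changes inside the (simulated) transaction boundary.
        self.primary = value
        self.secondaries = [value] * len(self.secondaries)

    def lazy_update(self, value):
        # Lazy: the transaction commits after changing only the primary copy.
        self.primary = value
        self.pending.append(value)

    def propagate(self):
        # Runs after commit, outside the transaction boundary.
        for value in self.pending:
            self.secondaries = [value] * len(self.secondaries)
        self.pending.clear()

    def consistent(self):
        return all(copy == self.primary for copy in self.secondaries)


item = ReplicatedItem(10, n_secondaries=2)

item.lazy_update(20)
stale_after_lazy = not item.consistent()   # secondaries still hold the old value
item.propagate()
fresh_after_propagate = item.consistent()

item.eager_update(30)
fresh_after_eager = item.consistent()

print(stale_after_lazy, fresh_after_propagate, fresh_after_eager)
```

In the lazy case a reader of a secondary copy can see a stale value until propagation runs; in the eager case the transaction pays for updating every copy before it completes.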
