
Tuesday, January 27, 2009



Presentation Transcript


  1. Tuesday, January 27, 2009 “In the confrontation between the stream and the rock, the stream always wins, not through strength but by perseverance.” • H. Jackson Brown

  2. Distributed Computing • Class: BSIT-8 • Instructor: Dr. Raihan Ur Rasool

  3. Lecture Objectives • To understand the practical concepts of • P2P • SOA • Distributed algorithms • Loose coupling and the degree of loose coupling

  4. Outline • Peer-to-Peer Systems: evolution of P2P systems, P2P middleware, routing overlays, case studies (Chord, Pastry, Tapestry) • Service-Oriented Architecture • Vision of the web and evolution of the web • Web Services: Web Services architecture, SOAP, WSDL, UDDI, service description and IDL, directory service for use with Web Services, XML security, coordination of Web Services.

  5. Intro to P2P Systems [reliable resource sharing layer over unreliable] • Demand for services: eliminating separately managed servers • The scope for expanding popular services by adding to the number of computers hosting them is limited when all the hosts must be owned and managed by the service provider • Administration and fault recovery costs • Bandwidth that can be provided to a single server site over the available physical links • Major service providers all face this problem, with varying severity

  6. Intro to P2P Systems • Purpose: • Describe some general techniques • Construction of P2P applications • Scalability, reliability and security • Problem: • Placement of objects, management of workloads • Ensuring scalability without adding overheads • P2P applications exploit resources available at the edges of the Internet • Storage, content, cycles, human presence

  7. Intro to P2P Systems • P2P applications exploit resources available at the edges of the Internet • Storage, content, cycles, human presence • Traditional client-server systems provide access to these, but only on a single machine or on tightly coupled servers • This centralized design required few decisions about the placement and management of resources

  8. Intro to P2P Systems • P2P applications exploit resources available at the edges of the Internet • Storage, content, cycles, human presence • Traditional client-server systems provide access to these, but only on a single machine or on tightly coupled servers • This centralized design required few decisions about the placement and management of resources • In P2P, algorithms for the placement and subsequent retrieval of information objects are a key aspect of the system design. A P2P system is one which • Is fully decentralized and self-organizing • Can dynamically balance the storage and processing loads between all the participating computers as they join and leave

  9. P2P Design Characteristics • Their design ensures that each user contributes resources to the system • Although they may differ in the resources that they contribute, all the nodes in a peer-to-peer system have the same functional capabilities and responsibilities • Their correct operation does not depend on the existence of any centrally administered systems • They can be designed to offer a limited degree of anonymity to the providers and users of resources • A key issue for their efficient operation is the choice of algorithm for placing and retrieving data on many hosts • Balance of load • Availability without much overhead • Participants' availability to the system is unpredictable
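The "algorithm for placing and retrieving data on many hosts" can be illustrated with a small sketch. It assumes a consistent-hashing ring in the style of Chord, where each object is stored on the first node whose identifier follows the object's identifier. The class name `Ring`, the helper `ident` and the use of SHA-1 are illustrative choices, not taken from the slides. Unlike a real overlay, this sketch keeps the full node list in one place, so it shows only the placement rule, not distributed routing.

```python
import bisect
import hashlib


def ident(name: str) -> int:
    """Map a name to a 160-bit identifier (SHA-1 is an assumed choice)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical placement ring: an object lives on the first node whose
    identifier is >= the object's identifier, wrapping around at the end."""

    def __init__(self):
        self._ids = []     # sorted node identifiers
        self._names = {}   # identifier -> node name

    def join(self, node: str) -> None:
        node_id = ident(node)
        if node_id not in self._names:
            bisect.insort(self._ids, node_id)
            self._names[node_id] = node

    def leave(self, node: str) -> None:
        node_id = ident(node)
        if node_id in self._names:
            self._ids.remove(node_id)
            del self._names[node_id]

    def locate(self, obj: str) -> str:
        """Return the name of the node responsible for `obj`."""
        if not self._ids:
            raise LookupError("no nodes in the ring")
        i = bisect.bisect_left(self._ids, ident(obj))
        return self._names[self._ids[i % len(self._ids)]]


# Small demonstration: count how many objects move when one node leaves.
demo_ring = Ring()
for node in ("node-a", "node-b", "node-c", "node-d"):
    demo_ring.join(node)
demo_objects = [f"object-{i}" for i in range(100)]
placed_before = {o: demo_ring.locate(o) for o in demo_objects}
demo_ring.leave("node-b")
moved = sum(1 for o in demo_objects if demo_ring.locate(o) != placed_before[o])
print(f"{moved} of {len(demo_objects)} objects moved after node-b left")
```

With this rule, only the objects held by the departing node are reassigned, which is one way to keep load balancing cheap when participants join and leave unpredictably.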

  10. Evolution of P2P • Volatile resources: a strength? • No guaranteed access to individual resources • Probability of failure can be minimized • Can be grouped into three generations • First generation – Napster music exchange service [OpenNap 2001] • Second generation – file-sharing applications with greater scalability, anonymity and fault tolerance • Gnutella, Kazaa, Freenet • Third generation – developed with the help of middleware layers • Application-independent management of distributed resources on a global scale • E.g. Pastry, Tapestry, CAN, Chord, JXTA • Provide guarantees of delivery for requests in a bounded number of network hops • Place replicas of resources, keeping in mind volatile availability, trustworthiness and locality

  11. P2P Middleware - GUID • Resources are identified by a Globally Unique Identifier (GUID) • Derived as a secure hash of the resource’s state • The hash makes a resource self-certifying • A client receiving the resource can check the hash • This requires that the states of resources are immutable • P2P systems are inherently best suited to the storage of immutable objects – music files, images • Sharing of mutable objects can be managed by a set of trusted servers that manage the sequence of versions, e.g. OceanStore, Ivy – more in section 10.6
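The self-certifying property can be sketched in a few lines. This is a minimal illustration, assuming SHA-256 as the secure hash (the slides do not name a hash function); `make_guid` and `verify` are hypothetical helper names.

```python
import hashlib


def make_guid(content: bytes) -> str:
    """Derive an identifier from the resource's state (assumed: SHA-256, hex-encoded)."""
    return hashlib.sha256(content).hexdigest()


def verify(guid: str, content: bytes) -> bool:
    """A client recomputes the hash and compares it with the GUID it asked for."""
    return make_guid(content) == guid


song = b"immutable music file bytes"
song_guid = make_guid(song)

print(verify(song_guid, song))                 # untampered copy
print(verify(song_guid, song + b" tampered"))  # modified copy
```

Because any change to the content changes the hash, a modified object would need a new GUID. This is why the scheme fits immutable objects and why mutable objects need extra machinery to track versions.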

  12. Overlay routing vs IP routing (shared characteristics)

  13. Distributed Computation • Only a small portion of the CPU cycles of most computers is utilized. Most computers are idle for the greatest portion of the day, and many of the ones in use spend the majority of their time waiting for input or a response. • Loosely coupled –data/computation • A number of projects have attempted to use these idle CPU cycles. The best known is the SETI@home project, but other projects including code breaking have used idle CPU cycles on distributed machines.

  14. How many of you did not shut down your computer and are now here in this room? • Assume we are 15 people running a screensaver without performing real work. The talk lasts one hour. • Opportunity loss for one hour: • Speed: 15 * 0.8 GFlops = 12 GFlops • Computation: 12 GFlops * 1 h = 43,200 billion floating-point operations • Costs for one hour: • Power consumption: 15 * 300 W = 4500 W during one hour = 4.5 kWh • Money: 4.5 kWh at 0.20 CHF per kWh = 0.9 CHF • Oil needed: 0.36 liter (Gasoline: 12.3 kWh/kg) • CO2 emissions: 0.81 kg CO2 (Gasoline: 2.27 kg CO2 / liter)

  15. During one year (15 people)… • Opportunity loss for one year: • Speed: 15 * 0.8 GFlops = 12 GFlops • Computation: 12 GFlops * 1 y = 378,432,000 billion floating-point operations • Costs for one year: • Power consumption: 15 * 300 W = 4500 W during one year = 39.42 MWh • Money: 39.42 MWh => 7,884 CHF (525.6 CHF per head) • Oil needed: 3153.6 liter (Gasoline: 12.3 kWh/kg) • CO2 emissions: 7 t CO2 (Gasoline: 2.27 kg CO2 / liter)
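The arithmetic on the two slides above can be reproduced with a short script. All input figures come from the slides; the function name `idle_cost` is illustrative. Two assumptions are worth flagging: the gasoline energy density is quoted per kg but is applied per liter here, as the slide's results imply, and the slides appear to round the fuel figure to 0.36 per hour before scaling, so the unrounded fuel and CO2 values from this sketch come out slightly higher than those shown.

```python
PEOPLE = 15
GFLOPS_PER_PC = 0.8        # from the slide
WATTS_PER_PC = 300         # from the slide
CHF_PER_KWH = 0.20         # from the slide
KWH_PER_LITER = 12.3       # slide quotes kWh/kg; treated per liter (assumption)
KG_CO2_PER_LITER = 2.27    # from the slide
HOURS_PER_YEAR = 24 * 365


def idle_cost(hours: float) -> dict:
    """Opportunity loss and costs of PEOPLE idle machines over `hours`."""
    kwh = PEOPLE * WATTS_PER_PC * hours / 1000
    liters = kwh / KWH_PER_LITER
    return {
        # 1 GFlops sustained for one second is one billion operations
        "billion_flop": PEOPLE * GFLOPS_PER_PC * hours * 3600,
        "kwh": kwh,
        "chf": kwh * CHF_PER_KWH,
        "chf_per_head": kwh * CHF_PER_KWH / PEOPLE,
        "fuel_liters": liters,
        "kg_co2": liters * KG_CO2_PER_LITER,
    }


for label, hours in (("one hour", 1), ("one year", HOURS_PER_YEAR)):
    print(label)
    for key, value in idle_cost(hours).items():
        print(f"  {key}: {value:,.2f}")
```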

  16. Distributed Computation • Best-known example of usage and exploitation • SETI@home (Search for Extra-Terrestrial Intelligence) • Partitions a stream of digitized radio telescope data into 107-second work units, each about 350 KB, and distributes them to client computers • Each work unit is redundantly distributed to 3-4 users, to guard against errors and bad nodes • Coordination is handled by a single server • 3.91 million PCs had participated by 2002 • In one year they processed 221 million work units, an average of 27.36 teraflops of computing power • Need for Grid – Blue Brain
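The split-and-replicate pattern described above can be sketched as follows. The slide says only that each work unit goes to 3-4 users to guard against errors; how the returned results are compared is not stated, so the quorum rule in `accept_result` is an assumption, and both function names are hypothetical.

```python
from collections import Counter


def split_into_work_units(samples, unit_len):
    """Cut a stream of samples into consecutive fixed-length work units."""
    return [samples[i:i + unit_len] for i in range(0, len(samples), unit_len)]


def accept_result(results, quorum=2):
    """Return the value reported by at least `quorum` clients, else None.

    Assumed validation rule: redundant copies of one work unit should agree,
    so a value is trusted only when enough independent clients report it.
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None


units = split_into_work_units(list(range(10)), unit_len=4)
print(units)

# Three clients processed the same unit; one returned a bad answer.
print(accept_result(["signal-17", "signal-17", "garbage"]))
# No two clients agree, so the coordinator would reissue the unit.
print(accept_result(["a", "b", "c"]))
```

A single coordinating server, as on the slide, would hand out each unit several times and run a check of this kind before recording a result.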

  17. Discussion Question: Computer or Infomachine? • The first computers were used primarily for computations. One early use was calculating ballistic tables for the U.S. Navy during World War II. • Today, computers are used more for sharing information than for computations; perhaps infomachine would be a more accurate name than computer? • Distributed computation may be better suited to Grid and peer-to-peer systems, while information tends to be hierarchical and may be better suited to client/server.

  18. Current Peer-Peer Concerns • Topics listed in the IEEE 9th annual conference:

  19. Dangers and Attacks on P2P • Poisoning (files with contents different to description) • Polluting (inserting bad packets into the files) • Defection (users use the service without sharing) • Insertion of viruses (attached to other files) • Malware (originally attached to the files) • Denial of Service (slow down or stop the network traffic) • Filtering (some networks don’t allow P2P traffic) • Identity attacks (tracking down users and disturbing them) • Spam (sending unsolicited information)

  20. Where are we? • Introduction • Napster and its legacy – self study • Peer-to-Peer middleware – self study • Routing overlays • Overlay case studies: Pastry, Tapestry • Application case studies: Squirrel, OceanStore, Ivy • Summary • Discussion date: 6th January (first four and last two pages only)

  21. Reading Assignment • Reading • Napster and its legacy • Peer-to-Peer middleware • Discussion date: 10th February (first 8 pages, conclusion and future work)
