Challenges in Ubiquitous Data Management

Michael Franklin, UC Berkeley, August 2000


Presentation Transcript


  1. Challenges in Ubiquitous Data Management. Michael Franklin, UC Berkeley, August 2000

  2. Ubiquitous Computing
     • “In ten years, billions of people will be using the Web, but a trillion "gizmos" will also be connected to the Web.” Asilomar Rep. on DB Research, Dec. 1998
     • You’ve heard it before…
     • Smartphones, PDAs, Smartcards, badges, wearables, lightswitches, toasters, …
     • Worldwide sales of Internet-enabled appliances projected to grow from 5.9M units in 1998 to 55.7M units in 2002. IDC via H&Q report
     M. Franklin – Aug. ‘00

  3. [Figure, by way of Randy Katz. Axis labels: Distribution and Personalization, each running from Less to More. Eras: many people per computer; one person per computer; many computers per person. Systems labeled: Batch, RJE, Time Sharing, WS/Server, PC + Network, scaled-down PCs (desktop metaphor), Information Appliances, Ubiquitous Computers.]

  4. Ubiquitous Connectivity
     • Tremendous improvements in Internet backbone bandwidth and reductions in diameter.
     • Broadband connectivity to the home and office (i.e., the “last mile”) is solved.
     • Wireless technologies are enabling anytime-anywhere connectivity.

  5. Ubiquitous Data Access
     • But ubiquitous computing and connectivity aren’t worth much without ubiquitous data access.
     • “Fundamentally, the ability to access all information from anywhere and have ONE unified and synchronized information repository is critical to making appliances useful.” Hambrecht and Quist, iWord, 3/99
     • Ubiquitous data access will put existing data management techniques to the test in all aspects: searching, location, reliability, consistency, …

  6. Ubiquitous Data: Past Accomplishments
     • Database Systems
       • Relational Model and extensions
       • Data Independence (physical and logical)
       • Declarative query processing and cost-based optimization
       • Storage structures/distribution/parallelism/…
       • Transactions – a comprehensive model for concurrency and fault tolerance
     • Information Retrieval
       • More natural query interfaces
       • User interaction/feedback is designed in
       • Tolerance of ambiguity due to natural language and unstructured data

  7. Ubiquitous Data – State of the Art
     • Everyone uses a database system and/or search engine every day, although they may not realize it (the true test of “ubiquity”)!
     • The Internet and WWW have become a ubiquitous means of global data dissemination and exchange.
     • Databases play a crucial but largely invisible role here.
     • XML and related standards are enabling increasingly sophisticated interoperation.
     • Wireless access provides anytime-anywhere access and enables location-centric applications.

  8. Where it’s heading
     • TV/Phone/Internet/etc. convergence.
     • Mobility and user context-sensitive applications.
     • Global utility-oriented infrastructure.
     • Data streams (broadcast and otherwise).
     • Alerters (agents?) and context-sensitive delivery.
     • “In the future, the main bottleneck will be human attention.” But how far in the future?

  9. A True Paradigm Shift
     • Data management research in the ’80s and ’90s was all about “ilities” (or “alities”):
       • functionality
       • scalability
       • serializability
       • optimality
       • interoperability

  10. Paradigm Shift (continued)
     • In the world of ubiquitous data access we need to shift from “ilities” to “ations”!
       • functionality → personalization
       • scalability → globalization
       • serializability → synchronization
       • optimality → flow regulation
       • interoperability → integration

  11. 1) Personalization
     • Filtering the data flood
       • There’s too much information out there.
       • Systems will have to help people find what they need.
       • Systems will actively suggest information and sites based on the user’s interests.
     • Data delivery must be made context-aware
       • Location-centric applications
       • Task- and role-sensitive delivery
     • The key technology is User Profiles

  12. Example: “Data Recharging” Profiles
     • Three main components:
       1) Content-based specifications of user interests (read “queries”)
       2) Specifications of user priorities/requirements: priority ordering, resolution, freshness, dependencies
       3) User context information: where, when, who, what
     • This info is available in the user’s PIM data!
     • Profiles must be both specified explicitly and learned automatically.
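The three profile components listed above can be pictured as a small data structure. The sketch below is only an illustration: the class names, field names, the keyword-matching rule, and the `select_items` helper are all hypothetical and are not taken from the Data Recharging work itself.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class Context:
    # 3) "where, when, who, what" from the slide; field names are hypothetical
    where: str
    when: datetime
    who: str
    what: str


@dataclass
class Profile:
    # 1) content-based interests, reduced here to keyword "queries" (assumption)
    interests: List[str]
    # 2) priorities/requirements: keyword -> rank (lower = more important)
    priorities: Dict[str, int] = field(default_factory=dict)
    max_age_hours: float = 24.0  # freshness requirement (assumed unit: hours)
    context: Optional[Context] = None


def select_items(profile, items, now):
    """Return titles of fresh items matching the profile, most important first.

    `items` is a list of (title, published_datetime) pairs.
    """
    matches = []
    for title, published in items:
        age_hours = (now - published).total_seconds() / 3600
        if age_hours > profile.max_age_hours:
            continue  # too stale for this user's freshness requirement
        hits = [k for k in profile.interests if k.lower() in title.lower()]
        if hits:
            rank = min(profile.priorities.get(k, 99) for k in hits)
            matches.append((rank, title))
    return [title for _, title in sorted(matches)]
```

A profile built this way covers only the explicitly specified half of the slide's last bullet; learning profiles automatically is outside the scope of the sketch.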

  13. 2) Globalization
     • Universal connectivity + cheap storage enables new solutions for availability and durability:
       • Large-scale and dynamic replication
       • Fault/disaster tolerance through RAID-like techniques
       • Security is a fundamental open problem here.
     • Archival storage: capture all data, voice, video, programs
       • How to find anything?
       • Formats become obsolete: how to carry them forward?
       • How to ensure a realistic reproduction? E.g., old video games or web surfing circa 2000
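One of the simplest "RAID-like" techniques is single-parity XOR: store one extra block that is the XOR of the data blocks, and any one lost block can be rebuilt from the rest. The sketch below assumes equal-length blocks spread over hypothetical sites; the slides do not say which redundancy scheme is intended, so this is an illustration rather than the talk's method.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(blocks):
    """Parity block = XOR of all (equal-length) data blocks."""
    return reduce(xor_bytes, blocks)


def recover(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors plus the parity block."""
    return reduce(xor_bytes, surviving_blocks, parity)
```

For example, with three blocks held at three sites and the parity block at a fourth, losing any one site still leaves enough information to call `recover` on the remaining blocks. Tolerating more than one simultaneous loss needs a stronger code than plain parity.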

  14. Example: Berkeley’s OceanStore
     • Based on a global storage “utility” model
     • Think of a safe and principled Gnutella
     [Figure from J. Kubiatowicz. Surviving labels: Canadian OceanStore, Sprint, AT&T, IBM, Pac Bell.]

  15. 3) Synchronization
     • Many different types of data
       • Enterprise – Inventory, ERP, …
       • Web content – Stock quotes, news, weather, other
       • Personal data – calendar, contacts, email, …
     • All have different requirements for consistency.
     • The traditional notion of ACID transaction semantics is not appropriate for most of these.

  16. Synchronization (continued)
     • Other problems with transactional approaches:
       • Scale – # of devices, data size, etc.
       • Varying degrees of connectivity
       • Lack of clear transactional boundaries (e.g., continuous queries)
       • “Closed world” assumption is inappropriate
       • Data spans multiple administrative domains
       • Interactive systems – “User in the loop”
     • Some alternatives:
       • Data Dissemination, Data Recharging, Explicit user-directed synchronization, Epidemic Algorithms

  17. Example: Epidemic Protocol
     [Figure by way of Ugur Cetintemel. The only surviving label is “Conflict?”]
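Since the picture for this slide did not survive, the following sketch shows one common way an epidemic (anti-entropy) exchange can raise the "Conflict?" question: replicas pair up, compare version vectors, and either copy the newer state or detect that both sides changed independently. Version vectors, the `Replica` class, and the `sync_with` method are assumptions made for illustration; the slide does not say which mechanism the original protocol uses.

```python
def compare(vv_a, vv_b):
    """Compare two version vectors: 'equal', 'a_newer', 'b_newer', or 'conflict'."""
    keys = set(vv_a) | set(vv_b)
    a_ahead = any(vv_a.get(k, 0) > vv_b.get(k, 0) for k in keys)
    b_ahead = any(vv_b.get(k, 0) > vv_a.get(k, 0) for k in keys)
    if a_ahead and b_ahead:
        return "conflict"  # concurrent updates: neither state subsumes the other
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None
        self.vv = {}  # version vector: replica name -> number of local updates

    def write(self, value):
        """Local update: change the value and bump this replica's own counter."""
        self.value = value
        self.vv[self.name] = self.vv.get(self.name, 0) + 1

    def sync_with(self, other):
        """One pairwise exchange; returns the comparison outcome."""
        outcome = compare(self.vv, other.vv)
        if outcome == "a_newer":
            other.value, other.vv = self.value, dict(self.vv)
        elif outcome == "b_newer":
            self.value, self.vv = other.value, dict(other.vv)
        return outcome  # on 'conflict' both replicas are left untouched
```

Repeating such exchanges between randomly chosen pairs spreads updates through the system; what to do on `"conflict"` (merge, ask the user, pick a winner) is an application decision that the sketch deliberately leaves open.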

  18. 4) Data Flow Regulation
     • Pervasive network connectivity enables global-scale federated DBMSs.
     • Improvements in heterogeneous DBMS and emerging standards enable Internet query processing.
     • Users and data are increasingly mobile.
     • Continuous data streams from sensors, stock tickers, updates to web sites, etc.

  19. Why Standard DBMS Won’t Work
     • Can’t deal with arbitrary services
     • Can’t adapt while running
       • Need a “continuous” query optimizer
       • Need to handle midstream failover or redirection (reload, alternate sites)
     • Uses the wrong query processing algorithms
       • Can’t produce incremental results, but the data stream never ends!
     • Can’t understand cost/quality tradeoffs
       • Maybe I’d settle for something less if it went faster
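To make the "incremental results" point concrete: a blocking join has to read one whole input before emitting anything, which never happens on an unending stream. A symmetric hash join, the non-blocking pattern that operators such as XJoin build on, emits each match as soon as both halves have arrived. The sketch below is a minimal in-memory version; the function name and the `("L", row)` / `("R", row)` stream encoding are assumptions for illustration.

```python
from collections import defaultdict


def symmetric_hash_join(stream, key):
    """Yield (left_row, right_row) pairs as soon as both sides have arrived.

    `stream` yields ("L", row) or ("R", row) tuples in arrival order.
    """
    tables = {"L": defaultdict(list), "R": defaultdict(list)}
    for side, row in stream:
        k = key(row)
        tables[side][k].append(row)  # build: remember the row on its own side
        other = "R" if side == "L" else "L"
        for match in tables[other].get(k, []):  # probe: look up the other side
            yield (row, match) if side == "L" else (match, row)
```

Because it is a generator, a consumer can take results one at a time while input keeps arriving. The hash tables grow without bound here; a real system would need windows, eviction, or spilling to disk.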

  20. Adaptive Approaches
     • Increased uncertainty argues for increased adaptivity.
     • Wide-area nets and admin domains introduce uncertainty.
     • Pesky users introduce uncertainty.
     • Mobility and streams introduce uncertainty.
     • The Telegraph project at Berkeley is building an adaptive data flow processing infrastructure using radically new techniques (almost, but not quite anarchy…).
     [Figure: a spectrum of adaptivity. Points labeled: static plans, late binding, reopt., continuous opt., anarchy. Systems and techniques labeled: current DBMS; Dynamic, Parametric, Competitive, …; Query Scrambling; XJoin; Eddy; ???]
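As a rough illustration of what "continuous optimization" can mean, the sketch below re-decides the order of filter predicates for every tuple, based on the pass rates observed so far. It is only loosely inspired by the adaptive routing idea named on the slide: the class name, the smoothed pass-rate estimate, and the greedy "most selective first" rule are assumptions, not the Telegraph project's actual routing policy.

```python
class AdaptiveFilterRouter:
    """Apply predicates in an order that adapts to observed pass rates."""

    def __init__(self, predicates):
        # name -> [predicate, tuples seen, tuples passed]
        self.stats = {name: [pred, 0, 0] for name, pred in predicates.items()}

    def _order(self):
        # Try the predicate with the lowest observed pass rate first,
        # so that most tuples are rejected as early as possible.
        def pass_rate(name):
            _, seen, passed = self.stats[name]
            return (passed + 1) / (seen + 2)  # smoothed: unseen predicates start at 0.5
        return sorted(self.stats, key=pass_rate)

    def process(self, row):
        """Return True if the row passes every predicate, updating statistics."""
        for name in self._order():
            entry = self.stats[name]
            entry[1] += 1
            if not entry[0](row):
                return False
            entry[2] += 1
        return True
```

A static plan would fix the predicate order before seeing any data; here the order can change mid-stream if the data's characteristics drift, at the cost of keeping statistics and re-sorting per tuple.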

  21. 5) Interoperation
     • Data and application integration is still a difficult problem.
     • People have realized that there is no silver bullet.
     • The Internet has made the tough work required to do this integration seem more worthwhile.
     • XML and its related standardization efforts provide the basic plumbing for large-scale interoperation.
     • The key is to develop a more flexible and evolving approach.

  22. Conclusions
     • Ubiquitous Data Access (UDA) is real.
     • UDA challenges all aspects of existing data management technology.
     • We need to build systems to protect humans from the data flood, but good old systems performance issues still matter.
     • What is the killer app for Ubiquitous Data Access?
     • Most existing examples are
       • boring (replay TV)
       • silly (business meetings in the park)
       • or irritating (buy milk now!!!)
