1 / 117

Querying Sensor Networks

Querying Sensor Networks. Sam Madden UC Berkeley December 13 th , New England Database Seminar. TinyDB Introduction. What is a sensor network? Programming sensor nets is hard! Declarative queries are easy TinyDB : In-network processing via declarative queries Example:

Download Presentation

Querying Sensor Networks

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Querying Sensor Networks Sam Madden UC Berkeley December 13th, New England Database Seminar

  2. TinyDB Introduction • What is a sensor network? • Programming sensor nets is hard! • Declarative queries are easy • TinyDB: In-network processing via declarative queries • Example: • Vehicle tracking application: 2 weeks for 2 students • Vehicle tracking query: took 2 minutes to write, worked just as well! SELECT nodeid FROM sensors WHERE mag > thresh EPOCH DURATION 64ms

  3. Overview • Sensor Networks • Why Queries in Sensor Nets • TinyDB • Features • Demo • Focus: Acquisitional Query Processing

  4. Overview • Sensor Networks • Why Queries in Sensor Nets • TinyDB • Features • Demo • Focus: Acquisitional Query Processing

  5. Device Capabilities • “Mica Motes” • 8bit, 4Mhz processor • Roughly a PC AT • 40kbit CSMA radio • 4KB RAM, 128K flash, 512K EEPROM • Sensor board expansion slot • Standard board has light & temperature sensors, accelerometer, magnetometer, microphone, & buzzer • Other more powerful platforms exist • E.g. UCLA WINS, Medusa, MIT Cricket, Princeton Zebranet • Trend towards smaller devices • “Smart Dust” – Kris Pister, et al.

  6. Sensor Net Sample Apps Habitat Monitoring. Storm petrels on great duck island, microclimates on James Reserve. Earthquake monitoring in shake-test sites. • Vehicle detection: sensors dropped from UAV along a road, collect data about passing vehicles, relay data back to UAV. • Traditional monitoring apparatus.

  7. Metric: Communication • Lifetime from one pair of AA batteries • 2-3 days at full power • 6 months at 2% duty cycle • Communication dominates cost • < few mS to compute • 30mS to send message • Our metric: communication!

  8. TinyOS • Operating system from David Culler’s group at Berkeley • C-like programming environment • Provides messaging layer, abstractions for major hardware components • Split phase highly asynchronous, interrupt-driven programming model Hill, Szewczyk, Woo, Culler, & Pister. “Systems Architecture Directions for Networked Sensors.” ASPLOS 2000. See http://webs.cs.berkeley.edu/tos

  9. A B C D F E Communication In Sensor Nets • Radio communication has high link-level losses • typically about 20% @ 5m • Ad-hoc neighbor discovery • Tree-based routing

  10. Overview • Sensor Networks • Why Queries in Sensor Nets • TinyDB • Features • Demo • Acquisitional Query Processing

  11. Declarative Queries for Sensor Networks Sensors • Examples: SELECT nodeid, light FROM sensors WHERE light > 400 EPOCH DURATION 1s 1

  12. SELECT roomNo, AVG(sound) • FROM sensors • GROUP BY roomNo • HAVINGAVG(sound) > 200 • EPOCH DURATION 10s 3 • SELECTAVG(sound) • FROM sensors • EPOCH DURATION 10s 2 Rooms w/ sound > 200 Aggregation Queries

  13. Declarative Benefits In Sensor Networks • Reduces Complexity • Locations as predicates • Operations are over groups • Fault Tolerance • Control of when & where • Data independence • Control of representation & storage • Indices, join location, materialization points, RAM vs EEPROM, etc.

  14. Computing In Sensor Nets Is Hard • Limited power • Power-based query optimization • Routing indices • In-network computation • Exploitation of operator semantics (TAG, OSDI 2002) • Lossy, low-bandwidth communication • In-network computation • Caching, retransmission, etc. • Data prioritization • Remote, zero administration, long lived deployments • Ad-hoc (fault-tolerant) networking • Lifetime estimation • Semantically aware routing (TAG) • Limited processing capabilities, storage

  15. Overview • Sensor Networks • Why Queries in Sensor Nets • TinyDB • Features • Demo • Focus: Tiny Aggregation • The Next Step

  16. TinyDB • A distributed query processor for networks of Mica motes • Available today! • Goal: Eliminate the need to write C code for most TinyOS users • Features • Declarative queries • Temporal + spatial operations • Multihop routing • In-network storage

  17. Query {A,B,C,D,E,F} A {B,D,E,F} B C {D,E,F} D F E TinyDB @ 10000 Ft (Almost) All Queries are Continuous and Periodic • Written in SQL-Like Language With Extensions For : • Sample rate • Offline delivery • Temporal Aggregation

  18. TinyDB Demo

  19. Applications + Early Adopters • Some demo apps: • Network monitoring • Vehicle tracking • “Real” future deployments: • Environmental monitoring @ GDI (and James Reserve?) • Generic Sensor Kit • Building Monitoring Demo!

  20. Benefit of TinyDB SELECT COUNT(light) SAMPLE PERIOD 4s • Cost metric = #msgs • 16 nodes • 150 Epochs • In-net loss rates: 5% • Centralized loss: 15% • Network depth: 4

  21. TinyDB Architecture (Per node) SelOperator AggOperator • TupleRouter: • Fetches readings (for ready queries) • Builds tuples • Applies operators • Deliver results (up tree) TupleRouter • AggOperator: • Combines local & neighbor readings Network ~10,000 Lines C Code ~5,000 Lines Java ~3200 Bytes RAM (w/ 768 byte heap) ~58 kB compiled code (3x larger than 2nd largest TinyOS Program) • SelOperator: • Filters readings Radio Stack Schema TinyAllloc • Schema: • “Catalog” of commands & attributes (more later) • TinyAlloc: • Reusable memory allocator!

  22. Catalog & Schema Manager • Attribute & Command IF • Components register attributes and commands they support • Commands implemented via wiring • Attributes fetched via accessor command • Catalog API allows local and remote queries over known attributes / commands. • Sensor specific metadata • Power to access attributes • Time to access attribute

  23. Overview • Sensor Networks • Why Queries in Sensor Nets • TinyDB • Features • Demo • Acquisitional Query Processing

  24. Laptops! Distributed DBs! Moore’s Law! Acquisitional Query Processing • Cynical question: what’s really different about sensor networks? • Low Power? • Lots of Nodes? • Limited Processing Capabilities? Being a little bit facetious, but…

  25. Answer • Long running queries on physically embedded devices that control when and and with what frequency data is collected! • Versus traditional systems where data is provided a priori

  26. ACQP: What’s Different? • How does the user control acquisition? • Rates or lifetimes. • Event-based triggers • How should the query be processed? • Sampling as an operator! • Events as joins • Which nodes have relevant data? • Semantic Routing Tree • Nodes that are queried together route together • Which samples should be transmitted? • Pick most “valuable”?

  27. ACQP • How does the user control acquisition? • Rates or lifetimes. • Event-based triggers • How should the query be processed? • Sampling as an operator! • Events as joins • Which nodes have relevant data? • Semantic Routing Tree • Nodes that are queried together route together • Which samples should be transmitted? • Pick most “valuable”?

  28. Implies not all data is xmitted Lifetime Queries • Lifetime vs. sample rate SELECT … LIFETIME 30 days SELECT … LIFETIME 10 days MIN SAMPLE INTERVAL 1s

  29. Processing Lifetimes • At root • Compute SAMPLE PERIOD that satisfies lifetime • If it exceeds MIN SAMPLE PERIOD (MSP), use MSP and compute transmission rate • At other nodes • Use root’s values or slower • Root = bottleneck • Multiple roots? • Adaptive roots?

  30. Lifetime Based Queries

  31. In-network storage Subject to optimization Event Based Processing • ACQP – want to initiate queries in response to events CREATE BUFFER birds(uint16 cnt) SIZE 1 ON EVENT bird-enter(…) SELECT b.cnt+1 FROM birds AS b OUTPUT INTO b ONCE

  32. More Events ON EVENT bird_detect(loc) AS bd SELECT AVG(s.light), AVG(s.temp) FROM sensors AS s WHERE dist(bd.loc,s.loc) < 10m SAMPLE PERIOD 1s for 10 [Coming soon!]

  33. ACQP • How does the user control acquisition? • Rates or lifetimes. • Event-based triggers • How should the query be processed? • Sampling as an operator! • Events as joins • Which nodes have relevant data? • Semantic Routing Tree • Nodes that are queried together route together • Which samples should be transmitted? • Pick most “valuable”?

  34. Operator Ordering: Interleave Sampling + Selection SELECT light, mag FROM sensors WHERE pred1(mag) AND pred2(light) SAMPLE INTERVAL 1s At 1 sample / sec, total power savings could be as much as 4mW, same as the processor! • E(mag) >> E(light) • 1500 uJ vs. 90 uJ • Possible orderings: • 1. Sample light • Sample mag • Apply pred1 • Apply pred2 • 2. Sample light • Apply pred2 • Sample mag • Apply pred1 • 3. Sample mag • Apply pred1 • Sample light • Apply pred2

  35. Optimizing in ACQP • Sampling = “expensive predicate” • Some subtleties: • Which predicate to “charge”? • Can’t operate without samples • Solution: • Treat sampling as a separate task • Build a partial order • Solve for cheapest schedule using series-parallel scheduling algorithm • Monma & Sidney, 1979, as in Ibaraki & Kameda, TODS, 1984, or Hellerstein, TODS, 1998.

  36. Exemplary Aggregate Pushdown SELECT WINMAX(light,8s,8s) FROM sensors WHERE mag > x SAMPLE INTERVAL 1s Unless > x is very selective, correct ordering is: Sample light Check if it’s the maximum If it is: Sample mag Check predicate If satisfied, update maximum

  37. d t d/2 d Event-Join Duality • Problem: multiple outstanding queries (lots of samples) ON EVENTE(nodeid) SELECT a FROM sensors AS s WHERE s.nodeid = e.nodeid SAMPLE INTERVAL d FOR k • High event frequency → Use Rewrite • Rewrite problem: phase alignment! SELECT s.a FROM sensors AS s, events AS e WHERE s.nodeid = e.nodeid AND e.type = E AND s.time – e.time < k AND s.time > e.time SAMPLE INTERVAL d • Solution: subsample

  38. ACQP • How does the user control acquisition? • Rates or lifetimes. • Event-based triggers • How should the query be processed? • Sampling as an operator! • Events as joins • Which nodes have relevant data? • Semantic Routing Tree • Nodes that are queried together route together • Which samples should be transmitted? • Pick most “valuable”?

  39. Attribute Driven Topology Selection • Observation: internal queries often over local area • Or some other subset of the network • E.g. regions with light value in [10,20] • Idea: build topology for those queries based on values of range-selected attributes • For range queries • Relatively static trees • Maintenance Cost

  40. Attribute Driven Query Propagation SELECT … WHERE a > 5 AND a < 12 Precomputed intervals = Semantic Routing Tree (SRT) 4 [1,10] [20,40] [7,15] 1 2 3

  41. Attribute Driven Parent Selection Even without intervals, expect that sending to parent with closest value will help 1 2 3 [1,10] [20,40] [7,15] [3,6]  [1,10] = [3,6] [3,7]  [7,15] = ø [3,7]  [20,40] = ø 4 [3,6]

  42. Simulation Result ~14% Reduction

  43. ACQP • How does the user control acquisition? • Rates or lifetimes. • Event-based triggers • How should the query be processed? • Sampling as an operator! • Events as joins • Which nodes have relevant data? • Semantic Routing Tree • Nodes that are queried together route together • Which samples should be transmitted? • Pick most “valuable”?

  44. Adaptive = 2x Successful Xmissions Adaptive Rate Control

  45. Delta Encoding • Must pick most valuable data • How? • Domain Dependent • E.g., largest, average, shape preserving, frequency preserving, most samples, etc. • Simple idea for time-series: order biggest-change-first

  46. Choosing Data To Send • Score each item • Send largest score • Out of order -> Priority Queue • Discard / aggregate when full Time Value t=4 t=5 [5,2.5] [1,2] [2,6] [3,15] [4,1] [5,4]

  47. Choosing Data To Send t=1 [1,2]

  48. Choosing Data To Send t=5 [1,2] |2-15| = 13 [2,6] [3,15] [4,1] |2-4| = 2 |2-6| = 4

  49. Choosing Data To Send t=5 [3,15] [1,2] [2,6] [4,1] |15-4| = 11 |2-6| = 4

  50. Choosing Data To Send t=5 [3,15] [1,2] [4,1] [2,6]

More Related