
Data Storage options on Windows Azure




  1. Data Storage options on Windows Azure. Govind Kanshi, MTC

  2. Ways to skin the cat (store) • Hosting options • What you need to worry about • Availability • Performance • Scale ... • Where do I store data

  3. Hosting option • Hosted • Host your own • What you need to worry about • Availability • Performance (more compute/bandwidth/better storage) • Scale (throughput/latency/storage) • Management/Monitoring • Cost

  4. Hosting option Path • Hosted (the "not my headache" option) • No admin (the majority of it: setup/maintenance) • Availability – better and cheaper • Very little planning or spend on machine size and resources • Focus on the application, not on admin/management issues

  5. Hosting Options Path • Hosted • No admin (the majority of it: setup/maintenance) • Availability – better and cheaper • Very little planning or spend on machine size and resources • Focus on the application, not on admin/management issues • Host your own • Flexibility (use jobs, use replication, use broker) • Roll your own availability, performance, upgrades, patching • Plan your scale and spend • Plan for admin – have in-house expertise

  6. Offerings • Relational • Hosted • SQL Azure • Host your own • SQL Server, Oracle, MySQL, Postgres • Non Relational • Hosted • Table Storage – key/value, Blob/Page store • Mongo • Host your own • Cassandra, Mongo, Redis

  7. Availability • Hosted • SQL Azure • Local transparent failover – no direct access to replicas • Replicas – remote? In future (backup and restore) • Replicas – read-only? In future (local vs. across datacenters) • Azure Storage • Local transparent failover – no direct access to replicas • Remote replication (no SLA guarantee, but usually within minutes) • Host your own • Availability sets • Need to set up a Virtual Network • Need to create a sync mechanism • Need to set up a failover mechanism • AlwaysOn for SQL Server; other databases need to get this right the way SQL Server does • Use Azure storage – push backups (log + data) via Azure or your own tooling.

  8. Performance • Hosted • Azure provides various options • SQL Azure Premium vs. regular (removes the noisy-neighbor issue) • Pretty soon other services will distinguish themselves by performance (think H) • SQL Azure Premium provides reserved IOPS • Host your own • Choose better compute • Choose better storage • Soon, good news on more options • At the end of the day you need to build monitoring and fixing, and do the planning

  9. Scale (Up/Out) • Hosted • SQL Azure • Web/Business – storage vs. SQL Premium isolated perf • HDInsight • Scale-out vs. scale-up of nodes (disruptive) • Table Storage/Azure Blob/Queues – Service Bus (slightly different) • Unlimited storage (overall 200TB) – no explicit limit (no scale-up SKU) • Host your own • Need to plan provisioning of storage/compute based on the offering (Redis vs. Cassandra vs. HBase). Monitoring, handling failover, etc. are extra effort.

  10. Management/Monitoring • Hosted • API or Dashboard (mostly) • Everything abstracted – cost/operations matter more than OS/memory etc. • Mostly auto-managed/healed, with the overall backend taking care of many things • No worries about patch management, backup schedules, etc. • Host your own • Roll out your own (time vs. what to expose/use/act upon) – cloud-aware software needed. System Center can do x things • The backend can take care of, say, compute failover or storage, but the rest needs to be built on top.

  11. Cost • Hosted • Generally easy (volume stored, units processed/sent) • For ISVs, billing is still an exercise – should become better • Host your own • Roll your own – basically what you use is what you pay. • Plus licensing • Plus dedicated people (sometimes a hierarchy: one to do day-to-day jobs, another to help business/dev)

  12. What to check for in Host your Own • License portability • Certification • Support • Preferred usage • Dev/Test vs Production

  13. Why a different kind of store • Data is complex – struct of struct of maps • Data is changing its shape • A lot of data is collected – scale of storage • Time series • Sensors • Audit events • Data is schema? • Easy to add new fields, and even completely change the structure of a model. • Need a query model over shape rather than just key/value or pseudo-mapping to the relational world • Low latency, high volume
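
The "struct of struct of maps" and "query over shape" points above can be illustrated with a minimal Python sketch. The record fields (`sensor_id`, `values`, `humidity`, and so on) and the helpers `get_path` and `where` are hypothetical names made up for illustration; they are not from the slides or from any particular store's API.

```python
# A sensor reading shaped as nested maps (hypothetical fields).
reading = {
    "sensor_id": "s-17",
    "location": {"site": "plant-a", "tags": {"floor": "2"}},
    "values": {"temp_c": 21.5},
}

# A later reading adds a new field without any schema migration.
newer = {
    "sensor_id": "s-17",
    "location": {"site": "plant-a", "tags": {"floor": "2"}},
    "values": {"temp_c": 22.0, "humidity": 0.4},
}

docs = [reading, newer]


def get_path(doc, path):
    """Follow a dotted path through nested dicts; None if any part is missing."""
    cur = doc
    for part in path.split("."):
        if not isinstance(cur, dict) or part not in cur:
            return None
        cur = cur[part]
    return cur


def where(docs, path, predicate):
    """Return the documents whose value at `path` exists and satisfies `predicate`."""
    matches = []
    for doc in docs:
        value = get_path(doc, path)
        if value is not None and predicate(value):
            matches.append(doc)
    return matches


humid = where(docs, "values.humidity", lambda v: v > 0.3)
```

Documents that lack the new field are simply skipped by the query, which is the kind of flexibility a key/value lookup or a fixed column layout does not give directly.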

  14. What kind of data • What is my scenario • Caching – Velocity, Memcached, Redis, Riak • Counters/Speed/Write – Velocity, Redis, Cassandra • Transactions – Database, SQL Azure (federation) • Documents/JSON-ified class/shape – MongoDB, RavenDB, Riak * • Write large amounts of data with throughput – Cassandra, Azure Storage • Full Text Search – Solr/ElasticSearch, Sphinx • Store data for scale-out compute – Hadoop • Store data on a specialized appliance – PDW • * Wish we could query shape data rather than fitting it into the relational world of columns/rows
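
The scenario list above reads naturally as a lookup table. A minimal Python sketch follows; the dictionary keys and the `candidates` helper are hypothetical names chosen for illustration, and the values only restate the stores the slide names.

```python
# Scenario -> candidate stores, restating the slide (keys are made-up labels).
STORES_BY_SCENARIO = {
    "caching": ["Velocity", "Memcached", "Redis", "Riak"],
    "counters": ["Velocity", "Redis", "Cassandra"],
    "transactions": ["Database", "SQL Azure (federation)"],
    "documents": ["MongoDB", "RavenDB", "Riak"],
    "high-throughput writes": ["Cassandra", "Azure Storage"],
    "full text search": ["Solr", "ElasticSearch", "Sphinx"],
    "scale-out compute": ["Hadoop"],
    "specialized appliance": ["PDW"],
}


def candidates(scenario):
    """Return the stores listed for a scenario, or an empty list if unknown."""
    return STORES_BY_SCENARIO.get(scenario.strip().lower(), [])
```

For example, `candidates("Caching")` returns the four caching options from the slide; an unlisted scenario returns an empty list rather than a guess.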

  15. Where do I store my data – Location [Diagram; labels only] • Data kinds: Data Lake/Store everything, Ref Data, Tx Data, Session data, Entity data • Storage characteristics: Shared high-throughput storage, Dedicated machine, Shared raw/batch long-term storage, Low-latency local memory, Low-latency shared memory, Shared entity storage • Stores: HDInsight, Azure Table, In-Node Cache, Relational DB, Azure Cache, SQL Azure

  16. Or another way to think • Will I write a lot of data and need to store and query it? • Will I need very low latency? • Can I compromise on consistency? • What are my business needs (how fast are we growing)? Can I afford to take a break and roll in a new store?

  17. How will we get/store the data • Query • SQL, LINQ, ORM (challenge: mapping to every language) or REST • Custom (query format, compression, serialization) • Tunable Consistency • Write: out of 5 nodes, only when 3 respond "yay" is the value considered written • Read: out of 5 nodes, when 2 respond "yay", take that value
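
The tunable-consistency bullets above can be sketched as quorum reads and writes. This is a minimal in-memory Python illustration, not any product's implementation: the `Replica` class, the `up` flag used to simulate unresponsive nodes, the version numbers, and the `quorum_write`/`quorum_read` helpers are all assumptions made for the example.

```python
class Replica:
    """Hypothetical in-memory replica; `up` simulates whether it responds."""

    def __init__(self, up=True):
        self.up = up
        self.data = {}  # key -> (version, value)

    def write(self, key, version, value):
        if not self.up:
            return False
        self.data[key] = (version, value)
        return True

    def read(self, key):
        # Returns (responded, entry); entry is None if the key is unknown.
        if not self.up:
            return False, None
        return True, self.data.get(key)


def quorum_write(replicas, key, version, value, w):
    """Send the write to every replica; it counts as written only with >= w acks."""
    acks = sum(1 for rep in replicas if rep.write(key, version, value))
    return acks >= w


def quorum_read(replicas, key, r):
    """Collect r responses and return the value with the highest version."""
    entries = []
    responses = 0
    for rep in replicas:
        ok, entry = rep.read(key)
        if ok:
            responses += 1
            if entry is not None:
                entries.append(entry)
        if responses == r:
            break
    if responses < r:
        raise RuntimeError("read quorum not reached")
    if not entries:
        return None
    return max(entries, key=lambda e: e[0])[1]
```

With N = 5 nodes, W = 3 and R = 2 as on the slide, R + W equals N, so a read of two nodes is not guaranteed to include a node that took the latest write; choosing R + W > N (for example R = 3) closes that gap at the cost of read latency. Note also that this sketch does not roll back a write that fails to reach its quorum.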

  18. Guidance

  19. End

  20. Compare them – summary (evolving) • Legend: * most of them support; # specific product support; + partial support
