1 / 25

SQL Server Data Quality Services

SQL Server Data Quality Services. Marc Jellinek Principal Consultant – Neudesic marc.jellinek@neudesic.com. About Me. Experience Principal Consultant - Neudesic Assistant Director (SQL Team) – Application Engineering at Ernst & Young IT Manager at MLB Network

brand
Download Presentation

SQL Server Data Quality Services

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SQL Server Data Quality Services Marc Jellinek Principal Consultant – Neudesic marc.jellinek@neudesic.com

  2. About Me Experience Principal Consultant - Neudesic Assistant Director (SQL Team) – Application Engineering at Ernst & Young IT Manager at MLB Network Sr. Technology Specialist at Microsoft Technologies Microsoft SQL Server 6.0, 6.5, 7.0, 2000, 2005, 2008, 2008 R2 and 2012 Relational Engine, Analysis Services, Integration Services and Reporting Services Marc Jellinek – marc.jellinek@neudesic.com

  3. Session Objectives Introduction to SQL Server Data Quality Services (DQS) Understanding the problem Demo Where do we go from here?

  4. Setting the stage Building from the SQL Server Series: Master Data Services in SQL Server 2012, presented by Patrick Gallucci http://www.neudesic.com/media/webcasts/20120501/20120501.wmv (start at 6:26) PASS MDM/DQS Virtual Chapter http://masterdata.sqlpass.org Based on demos from “SQL Server 2012 Developers Update”

  5. The Data Quality Problem Space

  6. What’s The Problem My name is MarcJellinek Marc <> “Mark”, “Marck” or “March” Jellinek <> “Jelinek”, “Jellineck”, “Jelineck”, “Jelliner”, “Jeliner” or “Jellyneck” R r

  7. What’s the Problem

  8. The nightmare Scenario The Customer Dimension Jelineck, Marck Jelineck, March Jelliner, Marc Jelliner, Mark Jelliner, Marck Jelliner, March Jelliner, Marc Jelliner, Mark Jelliner, Marck Jelliner, March Jellyneck, Marc Jellyneck, Mark Jellyneck, Marck Jellyneck, March • Jellinek, Marc • Jellinek, Mark • Jellinek, Marck • Jellinek, March • Jelinek, Marc • Jelinek, Mark • Jelinek, Marck • Jelinek, March • Jellineck, Marc • Jellineck, Mark • Jellineck, Marck • Jellineck, March • Jelineck, Marc • Jelineck, Mark

  9. Analytic Impact Average Revenue per customer Average Profit per customer Number of customers Customers per Geography Customers by Income Customers by Gender Customers by Educational Level Customers by Product Bought

  10. Obligatory Truisms The accuracy of your reporting is determined by the accuracy of your data (Garbage In, Garbage Out) Decisions made based on data will only be as good as the data on which you are basing your decisions. You can’t manage what you can’t measure. Inaccurate measurements lead to interesting management challenges.

  11. The Data Quality Solution Space Amend, remove or enrich data that is incorrect or incomplete. This includes correction, enrichment and standardization. Identifying, linking or merging related entries within or across sets of data. Cleansing Matching Profiling Monitoring Analysis of the data source to provide insight into the quality of the data and help to identify data quality issues. Tracking and monitoring the state of Quality activities and Quality of Data.

  12. SQL Server 2012 Data Quality Services Data Quality Services (DQS) is a Knowledge-Driven data quality solution enabling IT Pros and data stewards to easily improve the quality of their data High quality data is critical to effective business intelligence and to business activities DQS is an on-premise Data Quality product in SQL Server 2012, extendible with knowledge from multiple parties thru Azure DataMarket Richer DQ knowledge and capabilities in the cloud will make it even easier to provide high quality data Included with SQL Server 2012 Enterprise and BI Editions

  13. Key Data Quality Services Concepts • Knowledge-Driven • Based on a Data Quality Knowledge Base (DQKB) • Semantics • Data Domains capture the semantics of your data • Knowledge Discovery Acquires additional knowledge the more you use it • Add user-generated knowledge&3rdparty reference data providers • Open and Extendible • Easy to use • User experience designed for increased productivity

  14. DQS Architecture Azure Market Place DQ Clients DQS UI MS DQ Domains Store Categorized Reference Data Categorized Reference Data Services Knowledge Discovery and Management DQ Active Projects 3rd Party / Internal DQ Server Reference Data API (Browse, Get, Update…) RD Services API (Browse, Set, Validate…) Interactive DQ Projects DQ Engine Reference Data Services Data Exploration Cleansing Data Profiling & Exploration Knowledge Discovery Reference Data Matching Reference Data Sets Knowledge Base Store DQ Projects Store Common Knowledge Store SSIS DQ Component MS Data Domains Local Data Domains Published KBs MDS Excel Add in

  15. Data Quality Services Processes Cloud Services Knowledge Management Reference Data Build Discover / Explore Data EnterpriseData Manage Knowledge Integrated Profiling Progress Status Notifications Knowledge Base Correct & standardize Match & De-dupe Use DQ Projects

  16. Basic Definitions Knowledge Base Stores all the knowledge related to a specific type of data source Container for domains Domain Semantic representation of a type of data in a data field or column Trusted values, invalid values and erroneous data Synonym associations, term relationships, validation and business rules, matching policies Matching Rule Set of rules and conditions that determine a match or duplicate

  17. Data Quality Services Components Data Quality Server Data Quality Client DQS Cleansing Component for SQL Server Integration Data Quality Processes in Master Data Management

  18. Data Quality Services Components Data Quality Server SQL Server Databases DQS_MAIN DQS Stored Procedures, the DQS Engine and published Knowledge Bases DQS_PROJECTS Data required for knowledge base management and DQS project activities DQS_STAGING_AREA Intermediate staging area where source data is copied and processed

  19. Data Quality Services Components Data Quality Client Standalone application Designed for both data stewards and DQS Administrators Perform knowledge management, data quality projects and administration in one user interface Allows for domain management, matching policy creation, data cleansing, matching, profiling, monitoring and server administration. Can be installed on a remote computer

  20. Data Quality Services Components DQS Cleansing Component in SQL Server Integration Services Performs data cleansing as a part of an SSIS package Alternative to running a cleansing project within the Data Quality Services Client application Data Quality Processes within Master Data Services Perform de-duplication on source data and master data within the Microsoft SQL Server Data Services Add-in for Microsoft Excel.

  21. DEMO

  22. Resources DBI207: Using Knowledge to Cleanse Data with Data Quality Services http://channel9.msdn.com/Events/TechEd/NorthAmerica/2011/DBI207 Data Quality Services Blog http://blogs.msdn.com/b/dqs Books Online for SQL Server - Data Quality Services http://technet.microsoft.com/en-us/library/ff877925(SQL.110).aspx Install Data Quality Services http://technet.microsoft.com/en-us/library/gg492277.aspx Used Master Data Services Configuration Manager, set up IIS http://msdn.microsoft.com/library/ee633744%28SQL.110%29.aspx Troubleshoot Installation and Configuration Issues (Master Data Services in SQL Server 2012 http://go.microsoft.com/fwlink/?LinkId=226284 SQL Server 2012 Developer Training Kit Web Installer http://www.microsoft.com/en-us/download/details.aspx?id=27721 SQL Server 2012 Update for Developers Training Workshop http://social.technet.microsoft.com/wiki/contents/articles/6981.sql-server-2012-update-for-developers-training-workshop.aspx SQL Server 2012 Update for Developers Training Kit http://social.technet.microsoft.com/wiki/contents/articles/6982.sql-server-2012-developer-training-kit-bom-en-us.aspx SQL Server 2012 Update for Developers Training Kit Content http://social.technet.microsoft.com/wiki/contents/articles/6976.sql-server-2012-developer-training-kit-content-en-us.aspx MSDN – Data Quality Services http://msdn.microsoft.com/en-us/library/ff877925(v=sql.110).aspx MSDN Discussions – Data Quality Services http://social.msdn.microsoft.com/Forums/en-US/sqldataqualityservices/threads Technet – Data Quality Services http://technet.microsoft.com/en-us/library/ff877925.aspx PASS MDM/DQS Virtual Chapter http://masterdata.sqlpass.org

  23. THANK YOU

More Related