
Metadata Best and Worst Practices

Presentation Transcript


  1. Metadata Best and Worst Practices Ron Daniel, Jr. Joseph Busch

  2. Overview of Talk • Introduction • Who we are • Where we see corporate use of DC going • Best & worst practices, and the current state • Reminders on creating catalog records • Remainder of Talk: • Where we see corporate use of metadata and Dublin Core in 5 years • What’s the current state of corporate metadata and Dublin Core? • How do we get there from here? • What to avoid along the way?

  3. Outline • Introduction • One Potential Future for Corporate Metadata • Basic Vision • Limitations • Current State of Corporate Metadata • Areas for Greatest Improvement • Worst Practices • Conclusion

  4. Enterprise Metadata Layers Source: Todd Stephens, www.rtodd.com

  5. Using DC elements for Integration Metadata • dc:type="recipe", dc:format="text/html", dc:language="en" • Legend: ? = 1 or more; * = 0 or more
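The elements on the slide (dc:type, dc:format, dc:language) can be embedded in an HTML page using the common DC-in-HTML `<meta>` convention. A minimal sketch follows; the element names and values come from the slide, while the `to_meta_tags` helper and the `DC.` prefix spelling are illustrative assumptions.

```python
# Minimal sketch: render the Dublin Core record from the slide as HTML
# <meta> tags. The record values are taken from the slide; the helper
# function and naming convention are illustrative.
record = {
    "DC.type": "recipe",
    "DC.format": "text/html",
    "DC.language": "en",
}

def to_meta_tags(record):
    """Render a flat Dublin Core record as HTML <meta> elements."""
    return "\n".join(
        f'<meta name="{name}" content="{value}">'
        for name, value in record.items()
    )

print(to_meta_tags(record))
```

Embedding the elements this way keeps the integration metadata in the document itself, where crawlers and aggregators can pick it up without a separate metadata store.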

  6. Limitations of Previous View • Set of top-level applications very limited • The ROI of these applications will ultimately drive the development of ‘enterprise’ metadata. • Provides only one perspective on a very complex problem. • Does not show: • Practices used to create and maintain the information • Tools used to create and maintain the information • Growth of the system over time • Cyclic nature of Semantic Metadata also being part of an Asset Collection • … Based on our experiences, an organization’s culture and practices are vastly more important than the specific tools they select.

  7. Outline • Introduction • One Potential Future for Corporate Metadata • Current State of Corporate Metadata • Areas for Greatest Improvement • Worst Practices • Conclusion

  8. Experiences with Corporate Metadata • “Integration Metadata” is an approach from a very sophisticated group within a sophisticated organization • We encounter organizations at different levels of sophistication, which require different solutions. • e.g. Entity Extraction software exists out-of-the-box, but is best applied when the entities are known. Will the organization be able to build and maintain such lists? • To start to assess an organization’s level of metadata sophistication, we have begun to define a Metadata Maturity Model, based on similar ideas from software development.
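The slide's point that entity extraction "is best applied when the entities are known" amounts to list-driven (gazetteer) matching. A minimal sketch, assuming the organization maintains a curated term list; the `PRODUCT_LINES` terms and function names here are illustrative, not from the slides.

```python
import re

# Hypothetical curated entity list an organization would build and
# maintain; the terms are illustrative.
PRODUCT_LINES = {"Widget Pro", "Widget Lite"}

def extract_entities(text, terms):
    """Return the known terms that occur in the text (case-insensitive)."""
    found = set()
    for term in terms:
        if re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE):
            found.add(term)
    return found

hits = extract_entities("Pricing for widget pro ships next week.", PRODUCT_LINES)
# hits == {"Widget Pro"}
```

The hard part, as the slide suggests, is not the matching code but the governance: keeping the term list current as product lines change.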

  9. What’s a Maturity Model? • Began with the very pragmatic question: How can we predict the likelihood of a software development effort succeeding or failing? • This question has led to the development of “Maturity Models”, the assumption being that more “mature” development shops will have a higher likelihood of success. • Best-known is the CMMI (Capability Maturity Model Integration) from the Software Engineering Institute at Carnegie Mellon University. • Categorizes organizations into one of five maturity levels. • Does it work? • “…compared with an average Level 2 program, Level 3 programs have 3.6 times fewer latent defects, Level 4 programs have 14.5 times fewer latent defects, and Level 5 programs have 16.8 times fewer latent defects.” Michael Diaz and Jeff King – “How CMM Impacts Quality, Productivity, Rework, and the Bottom Line” • Model is very complex and expensive to measure. Some accuse it of restraint of trade because of the impact on small software shops.

  10. CMMI Structure • Early levels look at planning; later levels look at metrics. • Maturity Models are collections of Practices. • Main differences among Maturity Models concern: • Degree of Categorization of Practices • Descriptivist or Prescriptivist Purpose • Source: http://chrguibert.free.fr/cmmi

  11. At the Other Extreme, The Joel Test • Developed by Joel Spolsky as a reaction to CMMI complexity • Positives: quick, easy, and inexpensive to use • Negatives: doesn’t scale up well. Not a good way to assure the quality of nuclear reactor software. Not suitable for scaring away liability lawyers. Not a longer-term improvement plan. • The twelve questions: Do you use source control? Can you make a build in one step? Do you make daily builds? Do you have a bug database? Do you fix bugs before writing new code? Do you have an up-to-date schedule? Do you have a spec? Do programmers have quiet working conditions? Do you use the best tools money can buy? Do you have testers? Do new candidates write code during their interview? Do you do hallway usability testing? • Scoring: 1 point for each ‘yes’. Scores below 10 indicate serious trouble. • Source: http://www.joelonsoftware.com/articles/fog0000000043.html
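The scoring rule on the slide (one point per "yes", below 10 means serious trouble) is simple enough to sketch directly; the sample answers below are illustrative.

```python
def joel_score(answers):
    """Score the Joel Test: one point for each 'yes' answer.

    answers: iterable of booleans, one per Joel Test question.
    """
    return sum(bool(a) for a in answers)

# Illustrative shop: 8 yes, 4 no.
answers = [True] * 8 + [False] * 4
score = joel_score(answers)
in_trouble = score < 10  # per the slide, below 10 indicates serious trouble
```

The same shape, a flat checklist of yes/no practices with a threshold, is what the Metadata Maturity Model surveys described later in the talk collect, just organized by Area and Level.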

  12. Aspects of Search and Metadata Maturity • “Limiting” processes are harmful practices that interfere with maturity. • We are collecting business practices around metadata and taxonomy, and categorizing them by Area and Level.

  13. So what’s the state of the art? • Started a survey to find out

  14. Search Practices

  15. Metadata Practices

  16. Taxonomy Practices

  17. Future Surveys • Future surveys will cover other process areas: • Team structure and job roles • Metrics • Categorization and metadata editing tools • To participate, or stay informed of results, see link on our homepage – www.taxonomystrategies.com

  18. Outline • Introduction • One Potential Future for Corporate Metadata • Current State of Corporate Metadata • Areas for Greatest Improvement • Worst Practices • Conclusion

  19. Data Quality Practices • Data quality practices for descriptive metadata trail those from standard data management. • How many of us implement practices like those in the Data Management Scorecard? • For Integration Metadata to take hold, these will become more important. • [Chart: practices rated from Not Adopted to Fully Adopted] • Excerpt from Data Management Scorecard, Baseline Consulting

  20. What could possibly go wrong with a little edit? • ERP (Enterprise Resource Planning) team made a change to the product line data element in the product hierarchy. • They did not know this data was used by downstream applications outside of ERP. • An item data standards council discovered the error. • If the error had not been identified and fixed, the company’s sales force would not have been correctly compensated. • “Lack of the enterprise data standards process in the item subject area has cost us at least 30 person days of just ‘category’ rework.” Source: Danette McGilvray, Granite Falls Consulting, Inc.

  21. Taxonomy Governance Environment • 1: External vocabularies (e.g. ISO 3166-1, CVs, and other controlled items) change on their own schedule • 2: Team decides when to update facets within the taxonomy, working from change requests & responses and other external notifications • 3: Team adds value via mappings, translations, synonyms, training materials, etc., in a vocabulary management system • 4: Updated versions of facets are published to consuming applications (intranet search and navigation, web CMS, archives, ERMS, ERP, DAM, and other internal custodians)
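The governance flow above, where the team enriches an externally sourced facet and then publishes a new version to consuming applications, can be sketched as a small data model. This is a sketch of the idea only: the `Facet` class, its fields, and the sample country terms are illustrative assumptions; only the ISO 3166-1 example and the enrich-then-publish steps come from the slide.

```python
from dataclasses import dataclass, field

@dataclass
class Facet:
    """A published, versioned slice of the taxonomy (illustrative model)."""
    name: str
    version: int
    terms: dict                                   # code -> preferred label
    synonyms: dict = field(default_factory=dict)  # alt label -> code

def publish_update(facet, new_terms, new_synonyms):
    """Step 3-4 of the diagram: add team value, then publish a new version."""
    terms = {**facet.terms, **new_terms}
    synonyms = {**facet.synonyms, **new_synonyms}
    return Facet(facet.name, facet.version + 1, terms, synonyms)

# An ISO 3166-1-style country facet, enriched with a synonym and a new term.
countries = Facet("country", 1, {"GB": "United Kingdom"})
countries_v2 = publish_update(
    countries,
    new_terms={"FR": "France"},
    new_synonyms={"Great Britain": "GB"},
)
```

Returning a new `Facet` rather than mutating in place mirrors the diagram's point that consuming applications receive discrete published versions, on the team's schedule rather than the external vocabulary's.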

  22. Outline • Introduction • One Potential Future for Corporate Metadata • Current State of Corporate Metadata • Areas for Greatest Improvement • Worst Practices • Conclusion

  23. Worst Practices • Building a taxonomy before knowing: • How it will be shown to users, and what that UI will cost • How the data will be tagged, and what that tagging will cost • The benefits it is supposed to achieve • Tools, then Requirements, then Purpose • “Use it or lose it” budgeting • Throwing good money after bad • Sourcing part of the taxonomy from someplace that changes it without warning • No consideration of metrics

  24. How to estimate costs: Tagging • Consider complexity of facet and ambiguity of content to estimate time per value. • Is this field worth the cost? • Estimated cost of tagging one item. This can be reduced with automation, but cannot be eliminated. • Inspired by: Ray Luoma, BAU Solutions
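The per-item estimate the slide describes is a sum over fields of time-per-value times a loaded labor rate. A back-of-the-envelope sketch; the field names, minutes, and hourly rate are illustrative assumptions, not figures from the slides.

```python
# Back-of-the-envelope tagging cost per item:
# sum over fields of minutes-per-value, converted at a loaded hourly rate.
def tagging_cost_per_item(fields, hourly_rate):
    """fields: list of (field_name, minutes_per_value) pairs."""
    minutes = sum(m for _, m in fields)
    return minutes / 60 * hourly_rate

# Illustrative fields: a simple title, an ambiguous subject, an audience.
fields = [("title", 0.5), ("subject", 2.0), ("audience", 1.0)]
cost = tagging_cost_per_item(fields, hourly_rate=60.0)
# 3.5 minutes at $60/hr -> $3.50 per item
```

Putting even rough numbers on each field makes the slide's question concrete: a field that costs $2 per item to tag had better earn that back in findability or reuse.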

  25. Sample ROI Calculations Ongoing cost of tagging due to 15% content growth. Inspired by: Todd Stephens, Dublin Core Global Corporate Circle
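The ongoing-cost idea in the slide can be sketched numerically: with content growing 15% a year, the stream of newly added items, and hence the tagging bill, compounds. The base collection size and per-item cost below are illustrative assumptions; only the 15% growth rate comes from the slide.

```python
def annual_tagging_costs(base_items, growth, cost_per_item, years):
    """Yearly cost of tagging only the NEW items in a growing collection."""
    costs = []
    items = base_items
    for _ in range(years):
        new_items = items * growth           # items added this year
        costs.append(new_items * cost_per_item)
        items += new_items                   # collection compounds
    return costs

# Illustrative: 10,000 items, 15% growth, $3.50 to tag each new item.
costs = annual_tagging_costs(10_000, 0.15, 3.50, years=3)
# year 1: 1,500 new items -> $5,250.00; each later year grows by 15%
```

This is why ROI calculations treat tagging as an ongoing operating cost rather than a one-time project expense.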

  26. Outline • Introduction • One Potential Future for Corporate Metadata • Current State of Corporate Metadata • Areas for Greatest Improvement • Worst Practices • Conclusion

  27. Future of DC within Corporations • Dublin Core is receiving considerable attention within corporations • Implementations are happening, but not universally • What will help cross the chasm? • Tie-ins with standard tools, such as Google Enterprise, SAP, … • Integration metadata, not new metadata! • Continued presence of DCMI: documents, tools, case studies, presence • [Chart: Technology Adoption Curve, # adopters over time: Innovators, Early Adopters, the Chasm, Pragmatists (Early & Late Majority), Traditionalists] Source: Moore, G. A. Crossing the Chasm, 1991

  28. Questions? Thank you!

  29. Controlled item: Communications plan • Stakeholders: Who are they and what do they need to know? • Channels: Methods available to send messages to stakeholders. • Need a mix of narrow vs. broad, formal vs. informal, interactive vs. archival, … • Messages: Communications to be sent at various stages of project. • Bulk of the plan is here
