
Uncorking the Varietals: Social Tagging, Folksonomies & Controlled Vocabularies

Margaret Maurer, Head, Catalog and Metadata, Kent State University Libraries and Media Services.


Presentation Transcript


  1. Uncorking the Varietals: Social Tagging, Folksonomies & Controlled Vocabularies • Margaret Maurer, Head, Catalog and Metadata, Kent State University Libraries and Media Services

  2. In wine making - What is a Varietal? • A wine made from a single, named grape variety. • Cabernet Sauvignon wines are made from cabernet sauvignon grapes • Chardonnay wines are made from chardonnay grapes

  3. In information seeking – on the Web or in the catalog • Access and identification systems may be controlled by librarians–controlled vocabularies • Access and identification systems may be dynamically generated by users–social tagging, folksonomies • These are different varieties of access and identification systems

  4. This presentation • Controlled vocabularies • Social Tagging • Folksonomies • My recommendations. First we’ll talk about the cabernet sauvignons – the controlled vocabs

  5. Purpose of a controlled vocabulary • To create sets of objects • To serve as a bridge between the searcher’s language and the author’s language • To provide consistency • To improve precision and recall
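Precision and recall, named on the slide above, can be made concrete with a small calculation. The sketch below is illustrative only: the record IDs and the two result sets are invented, and it does not claim that authorized headings always outperform keywords.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search, given collections of record IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Precision: what share of the retrieved records are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: what share of the relevant records were retrieved.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: four records are truly about the topic.
relevant = {"r1", "r2", "r3", "r4"}
# A keyword search returns five records, two of them relevant.
keyword_hits = {"r1", "r2", "r7", "r8", "r9"}
# A search on an authorized heading returns four records, three of them relevant.
heading_hits = {"r1", "r2", "r3", "r6"}

print(precision_recall(keyword_hits, relevant))  # (0.4, 0.5)
print(precision_recall(heading_hits, relevant))  # (0.75, 0.75)
```

In this made-up case the heading search is both more precise and more complete, which is the effect a controlled vocabulary is meant to produce.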

  6. Characteristics of a controlled vocabulary • Features a single, authorized form of heading • Often features a syndetic structure of cross-references • Based on the belief that successful use of the catalog depends on the quality of the individual records

  7. The authority record structure • Records the standardized form • Ensures the gathering together of records via that access point • Enables standardized catalog records • Documents decisions taken • Records all other heading forms and provides links from them to the standardized form
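As a rough illustration of the structure just described, an authority record can be modelled as one standardized form plus the other forms that point to it. This is a minimal sketch, not a MARC authority record: the class name, field names, and the note text are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorityRecord:
    """Hypothetical, simplified authority record."""
    authorized: str                               # the single standardized form
    variants: list = field(default_factory=list)  # other heading forms (see-from references)
    note: str = ""                                # documents the decision taken


def build_lookup(records):
    """Map every heading form, case-insensitively, to its authorized form."""
    lookup = {}
    for rec in records:
        for form in [rec.authorized, *rec.variants]:
            lookup[form.casefold()] = rec.authorized
    return lookup


records = [
    AuthorityRecord(
        authorized="Twain, Mark, 1835-1910",
        variants=["Clemens, Samuel Langhorne, 1835-1910"],
        note="Illustrative note: the pseudonym is used as the standardized form.",
    )
]

lookup = build_lookup(records)
# A search under the variant form is routed to the authorized heading.
print(lookup["clemens, samuel langhorne, 1835-1910"])  # Twain, Mark, 1835-1910
```

Because every form resolves to the same heading, records entered under that heading are gathered together no matter which form the searcher used.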

  8. Benefits of controlled vocabularies • Promotes discovery generally • Promotes discovery when the aboutness of something has nothing to do with words in the resource or its representation • Imaginative literature (Genre headings) • Humanities • Promotes pre-coordinated displays that expand access – http://cinema.library.ucla.edu

  9. Benefits when combined with keyword searching • Keywords hook into strings of terms most efficiently • Users can be routed by pre-coordinated strings

  10. Controlled vocabularies support faceted catalogs • Encore • Evergreen • Endeca • WorldCat Local. All provide hyperlinks to authorized headings

  11. Weaknesses of controlled vocabularies • The artificially controlled language is not necessarily natural language—Cookery anyone? • Subject searches are the most problematic for users • It may work better in theory than in practice • It is costly to perform necessary maintenance • Cost is seen to outweigh the benefits by many administrators

  12. Library of Congress Subject Headings - LCSH • Has a long and well-documented history • Commonly used • Is contained in millions of bibliographic records • Strong institutional support from LC

  13. More benefits of LCSH • The rich vocabulary covers most subjects • It imposes synonym and homograph control • There are machine assisted authority control mechanisms • There is pre-coordination with LCC • The music subject heading system is well developed

  14. Weaknesses of LCSH • It is a generalist taxonomy that can’t always provide needed granularity • Its terminology is not always current • It doesn’t allow for post-search coordination (it is pre-coordinated) • It suffers from LC Collection bias

  15. More weaknesses of LCSH • Training needed • Requires some orientation to use effectively • Is not always accurately applied by catalogers • Maintenance • It is difficult to maintain when changes occur

  16. Authority control outside the catalog • Data critical mass → tipping point? • Homogeneity of data in terms of subject matter • Requirements within data community’s users for specificity • Size • Computing power • Wikipedia’s “disambiguation”

  17. ZoomInfo http://www.zoominfo.com/Default.aspx

  18. What if we did open up our authority files to the web? • National Library of Australia’s People Australia Project http://www.nla.gov.au/initiatives/peopleaustralia/ • Wikipedia Persondata-Tool http://www.ifla.org/IV/ifla73/papers/113-Danowski-en.pdf

  19. Is ontology overrated? • Physicality requires ontologies for searching, but systems with hyperlinks do not • Browse versus search may eliminate the need for creating lists of authorized headings

  20. Ontological classification • Works well when the domain to be organized is small, has formal categories, has stable entities, is restricted and has clear edges • Does not work well when the domain to be organized is large, has no formal categories, is unstable, is unrestricted and has no clear edges

  21. Ontological classification • Works well when the participants are expert catalogers, authoritative sources of judgement, coordinated users or expert users • Does not work well when the participants are uncoordinated, amateur, naïve or non-authoritative

  22. Now we talk about the Chardonnays – social tagging and folksonomies

  23. What are tags? • Keywords or terms associated with or assigned to a piece of information • They enable keyword-based classification and search of information

  24. Common Web sites that use tags include • Del.icio.us – Social bookmarking site • Flickr – Image tagging • LibraryThing • Gmail - Webmail • YouTube

  25. Tags, and therefore social tags and folksonomies are • Dynamic categorization systems • Often created on-the-fly • Chosen as relevant to the user – not to the creator, cataloger or researcher • A social activity (more on this later) • Hopefully one small step toward a more interactive and responsive library system

  26. Social tags are • Non-hierarchical • A way to create links between items by the creation of sets of objects • A means of connecting with others interested in the same things

  27. Way baaack in 2003… • Del.icio.us includes identity in its social bookmarking • Flickr includes tags • Lists of tags became a tool for serendipitous discovery (folksonomies)

  28. Why is tagging so popular? • It is easy and enjoyable • It has a low cognitive cost • It is quick to do • It provides self and social feedback immediately

  29. People tag things • To find them again • To get exposure and traffic • To voice their opinions • Incidentally as they perform other tasks • To take advantage of functionality built on top of a folksonomy • To play a game or earn points

  30. Putting the social in tagging • Tags allow for social interaction because when we navigate by tags we are directly connecting with others • People tag for their own benefit

  31. Don’t confuse tags with keywords or full-text searching • Keywords are behind the scenes, tags are often visibly aggregated for use and browsing • Keywords cannot be hyperlinked • Keywords imply searching, tags imply linking • Full-text searching is passive, tagging is active • Tagging is more about connecting items than categorizing them.

  32. What is a Folksonomy? • Folksonomy refers to an “emergent, grassroots taxonomy” • An aggregate collection of tags • A bottom-up categorical structure development • An emergent thesaurus • A term coined by Thomas Vander Wal

  33. How do folksonomies work? • The searcher defines the access, but • The aggregation of the terms has public value • It’s a typically messy democratic approach
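The aggregation step on this slide can be sketched in a few lines: each person tags for their own purposes, and the public value comes from counting those tags together. The users, items, and tags below are invented for illustration, and counting distinct users per tag is only one of several reasonable ways to aggregate.

```python
from collections import Counter

# Hypothetical (user, item, tag) assignments.
assignments = [
    ("ann", "book-1", "wine"),
    ("ann", "book-1", "toread"),
    ("bob", "book-1", "wine"),
    ("bob", "book-2", "wine"),
    ("cat", "book-1", "viticulture"),
    ("cat", "book-2", "travel"),
]


def folksonomy(assignments):
    """Count how many distinct users applied each tag to each item."""
    per_item = {}
    # set() drops exact repeats so one user cannot inflate a tag's count.
    for user, item, tag in set(assignments):
        per_item.setdefault(item, Counter())[tag] += 1
    return per_item


tags = folksonomy(assignments)
print(tags["book-1"].most_common(1))  # [('wine', 2)]
```

Personal tags such as "toread" stay in the counts alongside descriptive ones, which is part of the messiness the slide mentions.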

  34. What makes folksonomies popular? • Their dynamic nature works well with dynamic resources • They’re personal • They lower barriers to cooperation

  35. Tagging and the consequent folksonomies work best when • It’s easy to do • It’s not commercial in nature • Taggers have ownership • Taggers are more likely to tag their own stuff than they are your stuff • It has been shown to work well on the Web

  36. The unexpected development: terminological consensus • Collective action yields common terms • Stabilization may be caused by imitation and shared knowledge • The wisdom of the crowd

  37. Is your tagging influenced by my tagging? • Of course it is! • People are beginning to tag in ways that make it easier for others to find like stuff • Shared meaning consequently evolves for tags • Most used tags become most visible

  38. Strengths of folksonomies • Cost-effective way to organize the Internet • Social benefits • It’s inclusive • For many environments, they work well

  39. Issues with meaning • They do not yield the level of clarity that controlled vocabularies do • Term ambiguity – words with multiple meanings • No synonym control

  40. Issues with specificity • Variable specificity for related terms • Broadness of terms impacts precision – terms are often imprecise • Mixed perspectives

  41. Issues with structure • Singular and plural forms create redundant headings • No guidelines for the use of compound headings, punctuation, word order • No scope notes • No cross references
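The structural issues listed here show up quickly in raw tag counts. The sketch below uses invented tags and a deliberately naive normalization rule (casefold, drop punctuation and spaces, strip one trailing "s"); it is an assumption for illustration, not a recommended stemmer, and it would mangle words such as "series".

```python
from collections import Counter


def normalize(tag):
    """Naive tidy-up: casefold, drop non-alphanumerics, strip one trailing 's'."""
    cleaned = "".join(ch for ch in tag.casefold() if ch.isalnum())
    if len(cleaned) > 3 and cleaned.endswith("s") and not cleaned.endswith("ss"):
        cleaned = cleaned[:-1]
    return cleaned


# Hypothetical raw tags: plural, case, and punctuation variants of two concepts.
raw = ["Wines", "wine", "wine", "sci-fi", "scifi", "Sci Fi", "glass"]

print(len(Counter(raw)))  # 6 distinct raw tags
merged = Counter(normalize(t) for t in raw)
print(len(merged))        # 3 tags after the naive tidy-up
```

Without scope notes or cross references, even this simple merge is a judgment call: nothing in the data says whether "wine" and "Wines" were meant as the same heading.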

  42. Issues with accuracy • Collective ‘wisdom’ of the tagging community • How does wrong information impact retrieval? • Conflicting cultural norms • Sometimes authority counts

  43. “Spagging” and other problems • Opening doors to opinion tags • Tagging wars • “Spagging” → Spam tagging

  44. Tidying up the tags…? • Lists of tagging norms have been developed • Are there programmatic solutions? • Users know they are looking at tags • By tidying, do we destroy the essence of why this works? • Do we realistically have the resources?

  45. Recommendations: Don’t assume that one size fits all • Retain controlled vocabularies in the catalog • Explore ways to use controlled vocabularies to help organize the internet by re-purposing controlled vocabularies that already exist • Invite Folksonomies to the party in the catalog to gain their benefits • Explore ways to combine the two systems

  46. Recommendations: When you invite folksonomies into the catalog, do so strategically and carefully • Don’t put terms in the same index as controlled vocabularies • Find ways to associate terms applied across editions of works • Need for mediation, or at least observation • The crowd is not necessarily the best arbiter of specific terminology
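Two of these recommendations, separate indexes and tags shared across editions, can be sketched as a small data structure. Everything here is hypothetical: the Catalog class, its method names, and the record and work IDs are invented for the example and do not describe any particular ILS.

```python
from collections import defaultdict


class Catalog:
    """Hypothetical catalog that keeps subject headings and user tags apart."""

    def __init__(self):
        self.subject_index = defaultdict(set)  # authorized heading -> record IDs
        self.tag_index = defaultdict(set)      # user tag -> work IDs
        self.work_of = {}                      # record ID -> work ID
        self.editions = defaultdict(set)       # work ID -> record IDs

    def add_edition(self, work_id, record_id):
        self.work_of[record_id] = work_id
        self.editions[work_id].add(record_id)

    def add_subject(self, heading, record_id):
        self.subject_index[heading].add(record_id)

    def add_tag(self, tag, record_id):
        # Store the tag at the work level so it reaches every edition.
        self.tag_index[tag.casefold()].add(self.work_of[record_id])

    def search_subjects(self, heading):
        return sorted(self.subject_index.get(heading, set()))

    def search_tags(self, tag):
        works = self.tag_index.get(tag.casefold(), set())
        return sorted(r for w in works for r in self.editions[w])


catalog = Catalog()
catalog.add_edition("work-1", "rec-1998")
catalog.add_edition("work-1", "rec-2005")
catalog.add_subject("Wine and wine making", "rec-1998")
catalog.add_tag("winemaking", "rec-2005")

print(catalog.search_tags("winemaking"))      # ['rec-1998', 'rec-2005']
print(catalog.search_subjects("winemaking"))  # []
```

The tag applied to one edition surfaces both editions, while the subject index stays untouched by user terms, so a display can label the two kinds of access point differently.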

  47. Recommendations: Always remember why people tag • People tag things because they want to find them, not because they want others to find them • Be aware that this will impact the quality of the terms, and their frequency

  48. Recommendations: Controlled vocabularies could be better utilized than they currently are • Subject structures are underutilized in the ILS • Controlled vocabularies that exist are not being exported to the Web • Well-connected terms foster discovery – let’s connect them. Index those cross references where available

  49. Questions? Margaret Maurer mbmaurer@kent.edu
