A Logical Model for Digital Archives

Draft document 0.1. A Logical Model for Digital Archives. Rathachai Chawuthai, rathachai.chawuthai@live.com. Information Management, CSIM / AIT.


Presentation Transcript


  1. Draft document 0.1 • A Logical Model for Digital Archives • Rathachai Chawuthai • rathachai.chawuthai@live.com • Information Management, CSIM / AIT

  2. Agenda • Introduction • Digital Preservation • Underlying Community Knowledge • Logical Model • Prototype • Related works

  3. Introduction

  4. Motivation • Digital information that we value today may not be accessible, or may not render as it originally did, 100 years from now. • Technological obsolescence • Deterioration of digital storage media • A reader 100 years from now may not understand today's digital information the way its author intended. • Author and reader do not share the same contextual knowledge • Contextual knowledge changes over time • There could be common knowledge somewhere that every body of local knowledge refers to. Yuan Li (2011), Flouris (2007)

  5. Users should be able to access and understand digital information in the future. SDA 2011, Berlin

  6. Objectives • To develop a theory for digital archives • To design an information model representing contextual knowledge • To explore knowledge by linking archives across communities • To develop a prototype system in order to test the theory

  7. Scope • Develop a theory by extending Flouris's existing theory, “Steps towards a theory of information preservation” (Underlying Community Knowledge) • Design a “Context Model” of “Underlying Common Community Knowledge” • Use linked metadata to model contextual knowledge • Refer to the OAIS information model • Integrate with PREMIS metadata • Build an archival system • Follow the OAIS guidelines • Integrate with Fedora-Commons as a back-end service

  8. Digital Preservation

  9. Example in the 22nd Century • What is ? • File is read protected • Error: No program can open file format .doc • Please enter password • Error: DVD unreadable • Garbled, unreadable characters on screen

  10. Overview • Digital preservation is the active management of digital information to ensure its accessibility over time. • Types of digital preservation • Bit preservation: the ability to reproduce a particular sequence of bits from storage media at any time. • Data preservation: the ability to render the reproduced bit stream and produce meaningful output from it at any time. • Information preservation: the ability to understand the rendered digital object at any time. Flouris (2007)
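
The lowest of the three levels, bit preservation, can be illustrated with a fixity check. This is only a sketch: the slides do not name a technique, so the use of a SHA-256 digest and the function names `fixity` and `bits_preserved` are assumptions made for illustration. It says nothing about the data or information levels, which depend on rendering software and on the reader's knowledge.

```python
import hashlib


def fixity(data: bytes) -> str:
    """Return a SHA-256 digest that identifies one exact sequence of bits."""
    return hashlib.sha256(data).hexdigest()


def bits_preserved(stored: bytes, recorded_digest: str) -> bool:
    """Bit preservation holds if the bits read back match the digest recorded at ingest."""
    return fixity(stored) == recorded_digest


original = b"A Logical Model for Digital Archives"
digest_at_ingest = fixity(original)

# Simulate media deterioration by flipping one bit of the first byte.
damaged = bytes([original[0] ^ 0x01]) + original[1:]

print(bits_preserved(original, digest_at_ingest))
print(bits_preserved(damaged, digest_at_ingest))
```

A passing check only shows that the same bits came back; whether any program can still render them is the data-preservation question.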

  11. Recommendation • Preservation policy • Use well-known file formats, such as .pdf, .xml, .tiff, .jpg, and .avi • Preservation strategies • Secure storage systems, software migration, emulation, media refreshment, and disaster planning • Content policy • Track user activities, such as ingest and migration • Peer review before deposit into the repository • Rights and agreements • Because some preservation activities need to duplicate and modify digital content, the rights and agreements attached to each digital object must be recorded. Yuan Li (2011)

  12. OAIS Information Model • Diagram of an archive information package (Package 1): Content Information • Preservation Description Information (PDI) • Packaging Information • Descriptive Information about Package 1. OCLC.org
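
The four labelled parts of the package could be sketched as a simple record type. The class name `InformationPackage`, its field names, and every sample value are hypothetical; OAIS defines the concepts, not this Python structure.

```python
from dataclasses import dataclass, field


@dataclass
class InformationPackage:
    """Hypothetical container mirroring the four labelled parts of the diagram."""
    content_information: bytes        # the digital object itself
    preservation_description: dict    # PDI, e.g. provenance and fixity
    packaging_information: dict       # how the parts are bound together
    descriptive_information: dict = field(default_factory=dict)  # used to find the package


# Sample values are placeholders for illustration only.
package_1 = InformationPackage(
    content_information=b"example content",
    preservation_description={"provenance": "example", "fixity": "example"},
    packaging_information={"container": "example"},
    descriptive_information={"title": "A Logical Model for Digital Archives"},
)

print(package_1.descriptive_information["title"])
```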

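  13. OAIS Workflow • Diagram: a Producer submits a SIP through Ingest • The archive stores and manages it as an AIP • A Consumer queries through Access, which disseminates a DIP • An Administrator manages the archive. OCLC.org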
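
The flow of packages through the archive can be sketched in a few lines. This is a toy model under stated assumptions: the `Package` and `Archive` classes and the identifier `obj-1` are invented here, and a real OAIS ingest would also validate the submission and generate preservation metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    kind: str         # "SIP", "AIP", or "DIP"
    payload: bytes
    metadata: tuple   # (key, value) pairs


class Archive:
    """Toy archive: ingest turns a SIP into a stored AIP, access derives a DIP."""

    def __init__(self):
        self._store = {}

    def ingest(self, sip: Package, identifier: str) -> None:
        """Producer side: accept a SIP and store it as an AIP."""
        if sip.kind != "SIP":
            raise ValueError("only SIPs can be ingested")
        self._store[identifier] = Package("AIP", sip.payload, sip.metadata)

    def disseminate(self, identifier: str) -> Package:
        """Consumer side: derive a DIP from the stored AIP."""
        aip = self._store[identifier]
        return Package("DIP", aip.payload, aip.metadata)


archive = Archive()
archive.ingest(Package("SIP", b"hello", (("title", "demo"),)), "obj-1")
dip = archive.disseminate("obj-1")
print(dip.kind)
```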

  14. Preservation Metadata • Basic features • Provenance • Describes the history of creation, ownership, access, and change • Authenticity • Ensures trustworthiness (does the digital resource render as the original did?) • Preservation activities • Records processes that support preservation, such as migration • Technical environment • Gives the name and version of the hardware, platform, OS, and software required to render digital resources • Rights management • States the intellectual property rights and agreements that must be observed when executing a preservation process, e.g. does the creator allow his/her work to be copied? OCLC.org, usenix.org

  15. PREMIS Overview • Information provided to support preservation management • Technical information (characteristics) • E.g. creator, creation date-time, file format, software/hardware environment, … • Information about actions on a digital object • E.g. ingest, migrate, verify, … • Inhibitors • Passwords, encryption, … needed in order to access digital objects • Digital provenance • Records changes of object format, e.g. .DOC → .PDF • Contains the application, version, environment, … needed to render digital objects • Significant properties (if important) • Object characteristics, e.g. font, formatting, color, … • Look and feel • Rights • E.g. rights and agreement metadata associated with preservation. PREMIS from LOC.gov
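
Digital provenance of the kind listed above can be pictured as an append-only list of event records. The sketch below is a simplification and not the PREMIS schema: the `Event` class, its field names, the helper `record_migration`, and all sample values (object id, agents, dates) are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Event:
    """Simplified, hypothetical event record (not the actual PREMIS schema)."""
    event_type: str   # e.g. "ingest", "migration", "verify"
    object_id: str
    agent: str
    detail: str
    when: datetime


def record_migration(object_id, source_format, target_format, agent, when):
    """Describe a format change so that later readers can trace it."""
    return Event("migration", object_id, agent,
                 f"{source_format} -> {target_format}", when)


# Illustrative values only.
provenance = [
    Event("ingest", "obj-1", "archivist", "deposited as .doc",
          datetime(2011, 1, 1, tzinfo=timezone.utc)),
]
provenance.append(
    record_migration("obj-1", ".doc", ".pdf", "migration-tool",
                     datetime(2011, 6, 1, tzinfo=timezone.utc))
)

for event in provenance:
    print(event.when.date(), event.event_type, event.detail)
```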

  16. PREMIS Entities PREMIS from LOC.gov

  17. Challenge • A scale running from the conceptual level to the physical level: Information Preservation • Data Preservation • Bit Preservation. Flouris (2007)

  18. Underlying Community Knowledge

  19. Designated Community (DC) • A DC is a group of people who • Have common knowledge (concepts) • Have a common background • Have common contextual knowledge • Have the same language • The knowledge of a DC is called Underlying Community Knowledge (UCK). Flouris (2007)

  20. Underlying Community Knowledge (UCK) • UCK covers knowledge, background, context, common sense, semantics, and so on that are understandable by all people in the DC • This means that people in the same DC know the same UCK and understand every concept inside the UCK. Flouris (2007)

  21. Problem • Diagram: a Producer and a Consumer hold different knowledge (UCK 1 and UCK 2) • The Producer writes Name: “Rathachai Chawuthai”, meaning First name = “Rathachai”, Family name = “Chawuthai” • The Consumer reads it as First name = “Chawuthai”, Family name = “Rathachai”. Flouris (2007)

  22. Approach • Diagram: a Delta between UCK 1 and UCK 2 is recorded • The Producer writes Name: “Rathachai Chawuthai”, meaning First name = “Rathachai”, Family name = “Chawuthai” • With the Delta, the Consumer also reads First name = “Rathachai”, Family name = “Chawuthai”. Flouris (2007)
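
The name example can be worked through in code. This sketch reduces each UCK to a single convention, the order of the parts of a name, and assumes purely for illustration that UCK 1 writes the first name first while UCK 2 expects the family name first. The functions `interpret`, `delta`, and `translate` are hypothetical names, and the Delta in the slides is presumably far richer than a reordering.

```python
# Each UCK is reduced here to one convention: the order of the parts of a name.
UCK_1 = ("first_name", "family_name")   # assumed convention of the Producer
UCK_2 = ("family_name", "first_name")   # assumed convention of the Consumer


def interpret(name: str, uck: tuple) -> dict:
    """Read a two-part name according to a community's ordering convention."""
    return dict(zip(uck, name.split()))


def delta(source: tuple, target: tuple) -> list:
    """For each position in the target convention, the position of that part in the source."""
    return [source.index(part) for part in target]


def translate(name: str, source: tuple, target: tuple) -> str:
    """Rewrite a name written under `source` so that it reads correctly under `target`."""
    parts = name.split()
    return " ".join(parts[i] for i in delta(source, target))


written = "Rathachai Chawuthai"

# Problem: the Consumer applies its own convention directly.
naive = interpret(written, UCK_2)

# Approach: apply the delta first, then interpret.
corrected = interpret(translate(written, UCK_1, UCK_2), UCK_2)

print(naive)
print(corrected)
```

Without the delta the two name parts are swapped; with it, the Consumer recovers the meaning the Producer intended.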

  23. Some Preliminary Ideas Towards a Theory of Digital Preservation • Giorgos Flouris TBD Reference

  24. Challenge • ? Name = First name + Last name • Name = Family name + First name ? • UCK A • UCK B

  25. Logical Model

  26. Goal • A model must: • Represent contextual knowledge • Serve as common knowledge that all underlying community knowledge can refer to • Identify associations and differences between common knowledge and community knowledge • Identify associations and differences between bodies of community knowledge • Capture change or evolution of the common knowledge itself • Be able to link concepts among designated communities based on common contextual knowledge

  27. UCCK • Underlying Common Community Knowledge • A common contextual knowledge for all underlying community knowledge

  28. UCCK • C: a set of concepts • R: a set of relations • HC: a hierarchy of concepts (classes) • HR: a hierarchy of relations • IC: a set of instances of C • IR: a set of instances of R • A0: a set of axioms (inference relations of logic). Yildiz (2006)

  29. UCCK • Diagram: UCK1 and UCK2 are each derived from the UCCK (R, C, IR, IC, HC, HR, A0)
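
The seven components can be held in one record to make the structure concrete. This is a sketch, not the author's model: the encoding of hierarchies as (sub, super) pairs, the method `is_a`, and the sample concepts `NamePart`, `FirstName`, and `FamilyName` are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Hypothetical container for the seven components listed on the slide."""
    C: set = field(default_factory=set)    # concepts
    R: set = field(default_factory=set)    # relations
    HC: set = field(default_factory=set)   # (sub_concept, super_concept) pairs
    HR: set = field(default_factory=set)   # (sub_relation, super_relation) pairs
    IC: set = field(default_factory=set)   # (instance, concept) pairs
    IR: set = field(default_factory=set)   # (subject, relation, object) triples
    A0: set = field(default_factory=set)   # axioms, kept here as opaque strings

    def is_a(self, concept: str, ancestor: str) -> bool:
        """True if `concept` equals `ancestor` or reaches it by following HC upward."""
        seen, frontier = set(), {concept}
        while frontier:
            current = frontier.pop()
            if current == ancestor:
                return True
            seen.add(current)
            frontier |= {sup for sub, sup in self.HC if sub == current} - seen
        return False


# Sample content is illustrative only.
ucck = Ontology(
    C={"NamePart", "FirstName", "FamilyName"},
    HC={("FirstName", "NamePart"), ("FamilyName", "NamePart")},
    IC={("Rathachai", "FirstName"), ("Chawuthai", "FamilyName")},
)

print(ucck.is_a("FirstName", "NamePart"))
```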

  30. UCCK UCK2 UCK1

  31. UCCK UCCK UCK2 UCK1

  32. UCCK UCCK UCK2 UCK1

  33. UCCK Future Past

  34. The Event Ontology TBD http://motools.sourceforge.net/event/event.html Raimond (2007)

  35. Prototype

  36. As a Consumer (Archival Information System) • Browse digital objects • Search for relevant digital objects across repositories • Link to other related digital objects under contextual knowledge across systems • Customize one's own designated community • Diagram: Consumers use the Archival Information System, which links to other Archival Information Systems

  37. As an Archivist (Archival Information System) • Ingest digital objects • Define links to other objects • Add metadata according to the digital object's type • Add underlying community knowledge • Add contextual knowledge

  38. As an Administrator (Archival Information System) • Define metadata for each type of digital object • Define underlying common community knowledge • Define underlying community knowledge • Define designated communities

  39. Requirements • Be able to manage a variety of types of digital objects • Be able to link a digital object to others semantically • Be able to provide contextual knowledge by linking digital objects for each designated community • Be able to manage a variety of types of metadata • Be able to perform semantic search • Be able to store knowledge as an ontology
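
Semantic linking and search can be pictured with a tiny in-memory set of subject, predicate, object triples. This is a sketch in plain Python rather than a real RDF store or the prototype's Fedora-Commons back end; every identifier in `triples` and the helper `match` are hypothetical.

```python
# Hypothetical links between digital objects and concepts, as (subject, predicate, object).
triples = {
    ("obj:slides", "dc:creator", "person:rathachai"),
    ("obj:slides", "ex:about", "concept:digital-preservation"),
    ("obj:report", "ex:about", "concept:digital-preservation"),
    ("obj:report", "ex:relatedTo", "obj:slides"),
}


def match(pattern, store):
    """Return the triples that fit the pattern; None in a position matches anything."""
    return sorted(
        triple for triple in store
        if all(want is None or want == got for want, got in zip(pattern, triple))
    )


# Which objects are about digital preservation?
about_preservation = [s for s, _, _ in
                      match((None, "ex:about", "concept:digital-preservation"), triples)]

# What is linked from the report?
report_links = match(("obj:report", None, None), triples)

print(about_preservation)
print(report_links)
```

A production system would express the same queries in SPARQL against a triple store; the pattern-matching idea is the same.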

  40. Fedora-Commons • Repository system • Features • Collects digital objects and their relations • Collects metadata • Collects ontologies • Supports versioning • The only repository system that • Supports semantic search • Provides web services • Works as a back-end service. Duraspace.org

  41. Drupal • Popular CMS • Features • Rich user management • Rich content management • Flexible support for custom modules • The only CMS that • Supports a SPARQL endpoint • Works as a front-end service for end users. Drupal.org

  42. Islandora • A Drupal module • Features • Provides an administration panel • Provides fast search over the Fedora database • Supports many metadata formats • Supports many types of digital objects • The only Drupal module that: • Integrates with Fedora-Commons • Works with the GSearch service (semantic search of Fedora-Commons) • Works as a front-end administration service. Islandora.ca

  43. System Architecture • Diagram components: Consumers, Archivist, Administrator • Drupal with Islandora, other content modules, and administration services • Fedora core service • GSearch (Generic Search) with SOLR • Database

  44. TBD: find an architecture diagram like Hitest’s. Reference

  45. Related works

  46. CASPAR • Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval • An Integrated Project co-financed by the European Union within the Sixth Framework Programme • Adds contextual knowledge to digital objects according to their characteristics and representations • Similarity • Integrates contextual knowledge of digital objects and estimates gaps in designated communities’ knowledge with semantic technology • Advantage of my project • Explores knowledge by linking archives across designated communities with reference to underlying common community knowledge • Emphasizes changes in common community knowledge over time. Casparpreserves.eu

  47. SHAMAN • Sustaining Heritage Access through Multivalent Archiving • An Integrated Project co-financed by the European Union within the Seventh Framework Programme • Represents context as relations between digital objects • Integrates context information from processes such as ingest, access, and reuse, using an ontological representation • Similarity • Represents context information by linking digital objects and other things semantically, based on document processes • Advantage of my project • Explores knowledge by linking to other digital objects and other things semantically, with reference to underlying common community knowledge that captures real-world concepts (rather than document processes). Reference

  48. ?

  49. References • CASPAR: Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval. EU-funded project (FP6-2005-IST-033572). http://www.casparpreserves.eu • http://public.ccsds.org/publications/archive/650x0b1.PDF • http://www.loc.gov/standards/premis/ • http://www.drupal.org • http://www.duraspace.org/ • http://islandora.ca
