1 / 25

Access Across Time: How the NAA Preserves Digital Records

Access Across Time: How the NAA Preserves Digital Records. Andrew Wilson Assistant Director, Preservation. What I will talk about. NAA Context Some Concepts NAA Implementation NAA Process flow Preservation Software Platform (Xena). National Archives of Australia.

Download Presentation

Access Across Time: How the NAA Preserves Digital Records

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Access Across Time: How the NAA Preserves Digital Records Andrew WilsonAssistant Director, Preservation

  2. What I will talk about • NAA Context • Some Concepts • NAA Implementation • NAA Process flow • Preservation Software Platform (Xena)

  3. National Archives of Australia • Established 1946 as part of National Library • Independent since 1960 • Legislation: Archives Act 1983 • Approx. 420 staff in 12 locations • Budget ca. A$65 million. • 350 shelf kilometres of records • Separate preservation funding since 2001

  4. Digital Preservation Project • Started in 2001/2 FY • Cost to date approx. A$2 million (30 months) • Aim: to develop viable approach to the preservation of ‘born digital’ records for long term accessibility and use • Deadline: July 2003

  5. Note • NOT a digital archive but an approach to digital preservation • Well developed archival processes that can be applied to records irrespective of format:- appraisal/selection- transfer- description- retrieval and access • Project purely about preservation

  6. Some Definitions • Records Recorded information created or received and maintained by an organisation in the transaction of business • Digital Records Records in digital form processed by computers • Not: Systems or working applications

  7. The preservation problem • Technological obsolescence • Hardware • Software • Restrictions on the use of technology

  8. Object Researcher Traditionally Researcher directly experiences the record through its source object Preserve the object and you preserve the record

  9. Source Process Performance Researcher But…digital records are performances Researcher experiences the record through a performance Preserve the performance and you preserve the record

  10. A Two Part Solution 1. Keep a master copy of every source we accept into custody - Passive Access • Researcher gets the 'Zeros and Ones', not the performance 2. Active Intervention to recreate the performance - Replace the source and process - Active Access to the 'essence' of the performance - Based on experience with Audiovisual material

  11. The essence of the record • What we want to preserve out of the performance - What aspects are essential to the record's value? - What aspects are incidental to the record's value?

  12. Our preservation approach • Select open and well documented data formats • Migrate records into these formats (‘normalisation’) • Support open source software tools that can read these formats

  13. Preservation System • 3 separate components 1. Quarantine 2. Preservation 3. Storage • All components physically separated from each other and all other NAA networks • Access to hardware restricted to digital preservation staff

  14. Quarantine server Records Transferwritten to server • Checksums verified • Objects undergo virus check Checked objectswritten to transportmedium • Transport medium stored on repository shelf for at least 4 weeks • Objects then re-checked for viruses using new virus definitions Digital Preservation recorder captures information aboutactions on each digital object DPr • For the technically minded:- Dell PowerEdge 2600 server • 2 x 2GHz processors- .7Tb disk store- independent UPS

  15. Preservation server Transport mediumis attached topreservation server Preservation software platform (Xena) processes digital objects Output setswritten to newtransport media Xena outputs two new objects and calculates new checksums for each: Wrapped bitstream ‘Normalised version’ Digital Preservation recorder captures information aboutactions on each digital object DPr • For the technically minded:- Dell PowerEdge 2600 server • 2 x 2GHz processors- .7Tb disk store- independent UPS

  16. Digital Repository Third copy on digital tape which is stored offsite Simple managementapplication to allowaccess to digitalobjects (eg. DSpace) Transport mediaare attached torepository server Copies written to new media for access Digital Preservation recorder captures information aboutactions on each digital object DPr RAID Storage To Access 2 copies on RAID storage - Configured as RAID 10 - Automated, regular, frequent verification of checksums • For the technically minded:- Dell PowerEdge 2600 server • 2 x 2GHz processors- .7Tb disk store- fibre channel between server and RAID- independent UPS RAID Storage

  17. NAA Implementation Follows Open Archival Information System framework Non-proprietary, open source solution Based on the extensible markup language (xml)

  18. xena= xml electronic normalising of archives

  19. xena • File-based • Java/Swing application • Runs in Java 1.3 + • Packaged as an executable .jar file • Modular • Multiple document interface

  20. xena functionality: A core module plus ‘plug-in’ modules which do: • File format guessing • File ‘normalisation’ • XML encapsulation • Process and data verification • File viewing

  21. Core module The core consists of: I. Graphical User Interface components II. Plug-in management components III. Generic validation components

  22. Plug-in modules • Plug-ins are created for identified data types that are to be processed. Each plug-in consists of: • A guesser component • One or more input format type components • A normalised format type • One or more normalisation modules • One or more view components • Sorting functionality • Validation functionality • Printing functionality • GUI interaction methods

  23. DEMONSTRATION OF XENA

  24. Contacts Andrew WilsonProject ManagerAtoR Digital Preservation Project +61 2 6212 3694andreww@naa.gov.au Web: http://www.naa.gov.au/recordkeeping/preservation/summary.html

  25. THANK YOU ANY QUESTIONS?

More Related