Publishing a Corpus

Presentation by Sandra Busch. The Planning Stage: Can others make use of your corpus too? Is it likely to be available and usable in the future? Remove reliance on particular individuals, institutional arrangements, and technologies. Archive your corpus!


Presentation Transcript


  1. Publishing a Corpus. Presentation by Sandra Busch

  2. The Planning Stage
     • Can others make use of your corpus too?
     • Is it likely to be available and usable in the future?
     • Remove reliance on:
       • Particular individuals
       • Institutional arrangements
       • Technologies
     → Archive your corpus!

  3. Organization of Materials
     • Corpus should be “fit for purpose” rather than perfect
     • Storage and organization with future generations in mind
     • Find a convention for naming materials
     • Field materials must be clearly labeled
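
The slide asks for a naming convention without prescribing one. As a minimal sketch, assuming a purely hypothetical pattern of `<project>_<speaker>_<YYYYMMDD>_<kind>.<ext>`, file names could be generated and checked like this. The pattern, the field names, and the helpers `build_name` and `parse_name` are illustrative assumptions, not part of the presentation.

```python
import re
from datetime import date

# Hypothetical convention (an assumption for illustration only):
#   <project>_<speaker>_<YYYYMMDD>_<kind>.<ext>
NAME_PATTERN = re.compile(
    r"(?P<project>[a-z0-9]+)_"
    r"(?P<speaker>[a-z0-9]+)_"
    r"(?P<date>\d{8})_"
    r"(?P<kind>audio|transcript|notes)\."
    r"(?P<ext>[a-z0-9]+)"
)


def build_name(project, speaker, recorded, kind, ext):
    """Build a file name that follows the hypothetical convention."""
    name = (
        f"{project.lower()}_{speaker.lower()}_"
        f"{recorded:%Y%m%d}_{kind}.{ext.lower()}"
    )
    if NAME_PATTERN.fullmatch(name) is None:
        raise ValueError(f"name does not follow the convention: {name!r}")
    return name


def parse_name(name):
    """Return the labeled parts of a file name, or None if it does not conform."""
    match = NAME_PATTERN.fullmatch(name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    good = build_name("Demo", "SP01", date(2007, 11, 6), "transcript", "xml")
    print(good)
    print(parse_name(good))
    print(parse_name("interview final (2).doc"))
```

Encoding the label in the name itself means a file stays identifiable even when it is separated from the person or folder structure that produced it.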

  4. Storage Formats
     • Working format
       • Whichever you find most convenient
     • Archival format
       • Openly accessible to the public
       • Supported by good software tools
       • Best reproduction of the original

  5. Storage Formats
     • XML is appropriate for long-term preservation
     • PDF, RTF, Word are only for working or presentation purposes
     • Professional archival services
       → e.g. AHDS (Arts and Humanities Data Service, www.ahds.ac.uk)
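
To illustrate the move from a working format to an XML archival copy, here is a rough sketch using Python's standard `xml.etree.ElementTree` module. The element names (`corpusText`, `header`, `body`, `u`) and the helper `text_to_xml` are made up for this example; an archive would normally prescribe its own schema, and this is not the format required by AHDS or any other service.

```python
import xml.etree.ElementTree as ET


def text_to_xml(title, creator, lines):
    """Wrap plain-text lines and basic metadata in a simple XML document.

    The element names are hypothetical and chosen only for illustration.
    """
    root = ET.Element("corpusText")

    # Keep descriptive metadata next to the data it describes.
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "title").text = title
    ET.SubElement(header, "creator").text = creator

    # One numbered <u> element per line of the working file.
    body = ET.SubElement(root, "body")
    for number, line in enumerate(lines, start=1):
        unit = ET.SubElement(body, "u", n=str(number))
        unit.text = line

    # Special characters such as < and & are escaped by the serializer.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    xml_text = text_to_xml(
        "Sample interview",
        "Example Researcher",
        ["Hello there.", "Is 3 < 5 & true?"],
    )
    print(xml_text)
```

Because the result is plain, self-describing text, it can still be read when the word processor used for the working copy is no longer available.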

  6. Access to the Corpus
     • Arguments NOT to make the corpus widely available:
       • To avoid legal issues
       • To ensure that the creator has the first chance to use the data
       • To retain the option to release the corpus commercially
       • Concern about unrestricted access and piracy

  7. Access to the Corpus
     • It's too much trouble to administer distribution
     → Weigh up these issues and decide whether you want to publish your corpus in an archive

  8. Bibliography
     • EMELD School of best practice: Archives. http://emeld.org/school/classroom/archives/archive-digital.html (06.11.2007)
     • Martin Wynne (University of Oxford): "Archiving, Distribution and Preservation", Chapter 6 of Developing Linguistic Corpora: a Guide to Good Practice. http://ahds.ac.uk/guides/linguistic-corpora/chapter6.htm (06.11.2007)
