The State Library of Tasmania runs STORS, a digital repository of electronic publications launched in 2002. The priority so far has been ingest; preservation will come later. The system is designed to make contributing material as simple as possible and, in return, to give each contributor a stable URL that points to a page describing the document. The submitter only has to provide a title and tick a box confirming that the copyright holder authorises the upload. Optionally, the submitter can also point to a later or earlier version of the document, or set an expiry date.

The service was initially promoted to the government sector as an easy way for agencies to meet their archival obligations. Organisation champions, usually librarians, take responsibility for integrating document submission into their agency’s work processes. This includes linking to the repository from the agency’s library catalogue or its web site. The vast majority of documents are submitted as PDF, although the preferred and most widely accessible format today is HTML. The Library found that PDF cannot be reliably converted to HTML; for example, table entries can become mismatched, which is unacceptable. The Library wants to adopt PDF/A so that documents are self-describing.

The difference between a “publication”, which goes to STORS, and a private record, which goes to Archives, is not always clear. If in doubt, the Library takes the item: disk space is cheap, and it is better to preserve than to lose. Web documents are also publications, but tend to be more fluid than print documents. The Library’s practice is to take the final version of a page, rather than every version.

The STORS repository is not a document search and discovery service; that role is filled by the library catalogue and Google. An information seeker is always taken first to the document’s context page, rather than to the document itself. This avoids the risk of someone using an out-of-date or superseded version without realising it.

The current implementation uses a learning object repository. When it was chosen, DSpace was new and the Library preferred a Microsoft Windows solution to one that ran under Unix and Linux. However, a learning object repository is not designed for this purpose, particularly where preservation is concerned. The Library wants to use the same class of software as other institutions in this area, but is waiting for solutions to mature. Migration to new software will have to be done carefully, to ensure that each metadata record is matched to the correct file(s). The current software uses proprietary directory and file naming, so this is a point of risk.
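
As an illustration of the kind of check a migration could include, here is a minimal Python sketch that audits a record-to-file mapping before anything is moved. It is not based on the STORS software: the function name `audit_mapping`, the shape of `records` (a record identifier mapped to a list of relative file names) and the three report keys are all assumptions made for the example.

```python
from pathlib import Path


def audit_mapping(records: dict[str, list[str]], root: Path) -> dict[str, list[str]]:
    """Report files that are missing, shared between records, or unclaimed.

    `records` is a hypothetical export: record id -> relative file names.
    `root` is the directory that is supposed to hold those files.
    """
    claimed: dict[str, list[str]] = {}
    missing: set[str] = set()

    for record_id, names in records.items():
        for name in names:
            # Remember every record that claims this file.
            claimed.setdefault(name, []).append(record_id)
            if not (root / name).is_file():
                missing.add(name)

    # A file claimed by more than one record may indicate a mismatch.
    shared = [name for name, owners in claimed.items() if len(owners) > 1]

    # Files on disk that no record points to.
    on_disk = {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}
    orphaned = on_disk - set(claimed)

    return {
        "missing": sorted(missing),
        "shared": sorted(shared),
        "orphaned": sorted(orphaned),
    }
```

An empty report does not prove that every record points to the right file, only that the mapping is complete and unambiguous; a sample of records would still need to be checked by eye.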

The Library’s business processes are still catching up with the system; for example:

  • merging electronic document intake with paper document intake takes time
  • if the cataloguer disagrees with the metadata the publisher has assigned, the cataloguer has to decide whether to ask the publisher to review it
  • the Library catalogues the object, not the metadata
  • the need to support granular (multi-part) items can cause problems; if one part of a multi-part item has a later version, this cannot be recorded
  • new activities emerge, such as running MD5 checksum calculations to detect possible malicious file changes (none found so far)
  • the Library’s process has to be aligned with the submitter’s process; for example, the original should not be deleted until confirmation is received that the document is lodged in STORS
  • there is debate about whether to add more discovery metadata
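
The checksum activity mentioned in the list can be sketched in a few lines of Python. This is an illustration only, not the Library’s actual procedure: the function names and the idea of a `manifest` (relative file name mapped to the MD5 digest recorded at ingest) are assumptions. Only the standard `hashlib` and `pathlib` modules are used.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(manifest: dict[str, str], root: Path) -> list[str]:
    """List manifest entries whose file is missing or no longer matches.

    `manifest` is a hypothetical record of digests taken at ingest time.
    """
    problems = []
    for relative_name, recorded in sorted(manifest.items()):
        candidate = root / relative_name
        if not candidate.is_file() or md5_of(candidate) != recorded:
            problems.append(relative_name)
    return problems
```

A check like this catches any change to the stored bytes, whether accidental or deliberate, provided the manifest itself is kept somewhere an attacker cannot alter. MD5 is the algorithm named above; swapping in `hashlib.sha256` would give stronger protection against deliberately crafted changes.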

In future, the Library will need to increase cataloguing throughput to cope with the growing impact of electronic publishing. This means cataloguing for effective discovery outcomes rather than to technical cataloguing rules. People are discovering items through full-text search rather than metadata, so the Library can adopt a more minimal cataloguing policy. Keeping up with incoming content will increasingly be the biggest issue; the Library can address content discovery and preservation later.

ShareAlike Licence

Page last modified on 26 November 2006, at 06:34 PM