IDPD17: A day with the Digital Preservation Team

November 30th 2017 is the first International Digital Preservation Day (IDPD17). IDPD will take place on the last Thursday of November every year and was initiated by the Digital Preservation Coalition to raise awareness of digital preservation and to increase the visibility of worldwide activities which safeguard endangered digital materials.

As part of IDPD17, TIB’s Digital Preservation team is sharing insights into its daily activities in this blog post.

Oleg Nekhayenko:
I’m Oleg and I work in the DFG-funded LaZAR project. One of my current tasks in the project is mapping non-standard database metadata to Dublin Core. This will feed into the concept for an automatic ingest into our digital preservation system via the OAI interface.
In addition to the project-related work, I write scripts for various tasks, for example for our CD and USB imaging workflows, where the script separates the digital objects from the data carriers and prepares the package structure for digital preservation. Whenever I have some time left, I work on the further development of our submission application or of the TIB metadata proxy, which maps GVK metadata from the union catalog to Dublin Core so that it can be added to the preservation packages.
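To give an idea of what such a mapping to Dublin Core can look like, here is a minimal Python sketch. It is not the LaZAR mapping or the TIB metadata proxy: the source field names (`titel`, `verfasser`, `jahr`, `sprache`), the wrapper element `record` and the sample record are all made up for illustration. Only the Dublin Core element names and namespace are taken from the standard.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical mapping table: source database field -> Dublin Core element.
FIELD_TO_DC = {
    "titel": "title",
    "verfasser": "creator",
    "jahr": "date",
    "sprache": "language",
}


def to_dublin_core(record: dict) -> ET.Element:
    """Map one source record to a flat XML element holding Dublin Core fields."""
    root = ET.Element("record")  # wrapper element name is an assumption
    for field, value in record.items():
        dc_name = FIELD_TO_DC.get(field)
        if dc_name is None or value in (None, ""):
            continue  # unmapped or empty fields are skipped
        # A list value becomes one repeated Dublin Core element per entry.
        values = value if isinstance(value, list) else [value]
        for v in values:
            ET.SubElement(root, f"{{{DC_NS}}}{dc_name}").text = str(v)
    return root


# Invented sample record; "intern_id" has no mapping and is dropped.
example = {
    "titel": "Sample report",
    "verfasser": ["A. Author", "B. Author"],
    "jahr": 2004,
    "intern_id": "x1",
}
xml_text = ET.tostring(to_dublin_core(example), encoding="unicode")
print(xml_text)
```

Keeping the mapping in a plain table like `FIELD_TO_DC` means new source fields can be added without touching the conversion logic.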

Merle Friedrichsen:
I spent today trying to figure out which of our e-journal holdings are already being archived by other institutions. This work is necessary so that we can archive only those e-journals which no one else is “taking care of” already. As a first step, I received a complete list of our e-journal holdings from the serials team. The Keepers Registry collects information from different institutions that archive journals, such as Portico, the CLOCKSS Archive or the Library of Congress. Using the registry’s API, you can query which institution is archiving which volumes of a specific e-journal title. My colleague Oleg wrote a script which allows me to pass the list of our ISSNs to the API and receive a spreadsheet in response, containing information on the archival status. And lo and behold: the majority of our e-journals are not being archived yet!
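The general shape of such a script can be sketched in a few lines of Python. This is not Oleg’s script, and it deliberately does not call the Keepers Registry: the actual API request is left to a caller-supplied `lookup` function, and `fake_lookup`, the column names and the ISSNs below are placeholders invented for this example.

```python
import csv
import io
from typing import Callable, Iterable, List, TextIO


def write_archival_status(
    issns: Iterable[str],
    lookup: Callable[[str], List[str]],
    out: TextIO,
) -> None:
    """Write one CSV row per ISSN with the archiving agencies found for it.

    `lookup` stands in for the real registry query and must return the
    names of the agencies archiving the given ISSN (empty list if none).
    """
    writer = csv.writer(out)
    writer.writerow(["issn", "archived", "agencies"])
    for issn in issns:
        issn = issn.strip()
        if not issn:
            continue  # ignore blank lines in the input list
        agencies = lookup(issn)
        writer.writerow([issn, "yes" if agencies else "no", "; ".join(agencies)])


def fake_lookup(issn: str) -> List[str]:
    """Placeholder for the API call; the data here is invented."""
    sample = {"1234-5678": ["Portico", "CLOCKSS Archive"]}
    return sample.get(issn, [])


buffer = io.StringIO()
write_archival_status(["1234-5678", "8765-4321", ""], fake_lookup, buffer)
report = buffer.getvalue()
print(report)
```

In practice `out` would be an open file that can be loaded into a spreadsheet, and `lookup` would wrap the real registry request.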

Thomas Bähr:
Not every institution is able to implement an in-house solution for digital preservation. Because of this, we have developed a product which allows us to offer digital-preservation-as-a-service to other institutions. And because every institution has different needs and requirements, I am working on concepts and answers to the vast number of technical, organizational, legal and financial questions that come up. This is not work I can tackle all by myself, and a great team is contributing to the effort on a daily basis!

Michelle Lindlar:
TIB has a visiting delegation from the Institute of Medical Information and Library / Chinese Academy of Medical Sciences today. A central part of this visit is an information exchange on digital preservation activities. I’m really looking forward to the presentations of our Chinese colleagues! After all, China hosted the first of 14 iPRES conferences in 2004 (International Conference on Digital Preservation)!
Also, some work for the “Rosetta Format Library Working Group” (FLWG) is on the table today. The roadmap of the Rosetta software, which we use for the digital preservation of our holdings, is co-steered by customers. Working groups on various topics allow Rosetta customers from all around the globe to discuss needs and requirements and to feed those into institutional processes as well as into the product’s roadmap. The FLWG – as the name suggests – deals with everything-concerning-formats. Together with colleagues from New Zealand, the USA, Switzerland, Belgium and Germany, we are currently checking the mapping of various technical metadata extractors to the Rosetta DNX profile.

Franziska Schwab and “Nimbie”:
Our disc autoloader Nimbie and I are currently working on a project to safeguard unique digitized BMBF research reports, which are stored on CD-Rs as TIFF masters, PDF derivatives and OCR files. The oldest CDs in that collection are up to 13 years old and were burnt by a service provider. Here is a short project profile:

Project: Mass securing files from CDs using Nimbie USB robot
Holdings: approx. 14,000 CD-Rs with old digitization output from an outsourced project
Resources: 1 person 2 ½ h / day, 1 robot 9-10 h / day
Hardware: Multi-functional disc autoloader Nimbie USB Plus with ASUS BW-16D1HT drive and USB connector
Software: Disc to Computer

Workflow:

  1. Load Nimbie with 100 CDs
  2. Disc To Computer: check configuration and select target folder
  3. Start copy process
  4. …3 hours later…
  5. Save log file and check for errors
  6. Unload Nimbie
  7. Fill out checklist for documentation of copy process / identification of duplicates
  8. Check copied files for completeness
  9. Load Nimbie with 100 CDs
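A completeness check like the one in step 8 could, for example, test whether every copied disc contains all three kinds of files mentioned above (TIFF masters, PDF derivatives, OCR files). The following Python sketch is only an illustration of that idea and not the check we actually run: it assumes one subfolder per disc in the target folder, the folder names are invented, and the file extensions (especially `.txt` for OCR output) are guesses.

```python
import tempfile
from pathlib import Path

# Assumed file extensions per expected component; the OCR one is a guess.
EXPECTED = {
    "tiff_master": {".tif", ".tiff"},
    "pdf_derivative": {".pdf"},
    "ocr": {".txt"},
}


def missing_components(disc_folder: Path) -> list:
    """Return the names of expected components with no matching file."""
    found = {p.suffix.lower() for p in disc_folder.rglob("*") if p.is_file()}
    return sorted(name for name, exts in EXPECTED.items() if not (found & exts))


def check_target(target: Path) -> dict:
    """Map each incomplete disc folder below `target` to what it lacks."""
    problems = {}
    for disc_folder in sorted(target.iterdir()):
        if not disc_folder.is_dir():
            continue
        missing = missing_components(disc_folder)
        if missing:
            problems[disc_folder.name] = missing
    return problems


# Small self-contained demo with invented folder and file names.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp)
    complete = target / "disc_0001"
    incomplete = target / "disc_0002"
    complete.mkdir()
    incomplete.mkdir()
    for name in ("0001.tif", "report.pdf", "report.txt"):
        (complete / name).write_bytes(b"x")
    (incomplete / "0001.TIF").write_bytes(b"x")  # only the TIFF was copied
    result = check_target(target)

print(result)
```

Discs that show up in the result can then be noted on the checklist and copied again.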

After this, scripts are run across the output to prepare the copied files for digital preservation:

  1. Rename folders by catalog ID
  2. Define representations