Polona/Labs

Grzegorz Płoszajski & Tomasz Gruszkowski

A general approach to digitization

Cultural goods are increasingly being made publicly available. One may view them on a computer or smartphone screen, or listen to them, whenever one pleases. They may also be archived in digital form.

The process of creating digital equivalents of original works, called digital reproductions, is referred to as digitization (the word digit here refers to a numeral, since digital data are stored as numbers). Digitization of cultural heritage is especially significant because it makes that heritage accessible to a wider public and helps protect the originals. Libraries, archives, museums and other cultural institutions are encouraged to carry out digitization projects using funding from the state budget and EU funds.

First and foremost, digitization is expected to produce digital copies that are adequately faithful to the originals. It is also natural to expect that they will convey to users how beautiful the originals are. Digital copies should therefore be both faithful and beautiful. These two goals seem convergent, and certainly not opposed. Still, is taking care of faithfulness sufficient to preserve beauty? To what extent can digital reproductions be compared to translations, of which it is said that the beautiful ones are rarely faithful and the faithful ones are not necessarily beautiful?

[Picture 1, Drawing by C.K. Norwid “Eye”, from the collection of the National Library of Poland]

In certain situations, preserving beauty in the broad sense requires different measures, including a different organization of the digitization process, than situations in which the aim is faithfulness, especially in the technical sense. In simple cases, the point may be to ensure legible print, as well as uniform margin widths and a uniform paper color across the individual pages of a book. When digitizing other artifacts, such as documents with an embossed stamp or the beautiful bindings of old prints, properly adjusted, non-standard lighting may be necessary to achieve a decent visual effect and a full, vivid rendering of their form.

Damaged materials, e.g. faded documents and photographic and film materials, pose a special challenge. In such cases it is not sufficient for a digital copy to reflect their actual condition. Instead, one strives to reconstruct the original condition, sometimes also in order to restore and depict the beauty of the original. A telling example is old film stock, which loses its colors with age and often shows mechanical damage as well. Here not only must scratches be removed, but one should also try to restore the proper colors and sound. These additional activities, which go beyond pure digitization, are referred to as digital reconstruction or restoration.

[Picture 2, A film frame subject to digital reconstruction – before the intervention, presentation by Filmoteka Narodowa]

[Picture 3, A film frame subject to digital reconstruction – after the intervention, presentation by Filmoteka Narodowa]

Digitization in libraries

Libraries collect two groups of materials. The first consists of typical materials: books and journals, as well as graphics, maps, photographs, leaflets, posters, notes and other artifacts. Materials in this group may be viewed by library users without any specialist equipment.

The other, separate group of library holdings consists of collections that require proper equipment in order to be viewed or listened to, i.e.:

– sound and audiovisual materials: typically vinyl records, audio cassettes and video tapes, as well as CDs and DVDs;

– microfilms, made to safeguard the library's own collections and to present collections stored in other libraries.

[Pictures 4-5, Microfilm scanner, microfilm reader and devices used in the National Library of Poland]

To digitize these materials, one must have proper equipment, different for each category of materials in this group. This topic will be discussed later, after the first group of holdings has been covered.

Simply put, digitizing the materials in the first group consists in creating individual digital images (scans or photos) that present whole documents such as graphics, maps, photographs, leaflets and posters. For large artifacts, images of their parts are created instead; for books, journals and other multi-page documents, a series of images of the individual pages is created. Scanners or digital cameras are used to create such digital images.

In practice, however, many other factors must be taken into account. Since this text is of a practical nature, let’s start with the questions that should be posed when preparing a digitization project.

Digitization projects: preliminary questions

When planning a digitization project in a library, one should answer several questions. Before turning to the basic issue of funds and their availability, one should first determine the following:

  1. Should the collection be digitized at all? Is it unique, and has it not already been digitized by another institution? How much time can we spend on the project? What resources, such as devices, rooms and people, does our institution have? Can we purchase digitization services from an external entity?
  2. What is the material form of the holdings to be digitized? Are they multi-page or single-page artifacts? Are they of typical size, or very small or very large?
  3. What is the scale of the project? Are we talking about one hundred or several thousand books? What is the expected number of scans/photos? Do we need high-performance equipment, and if so, what performance should it have?

[Picture 6, part of a book by Bruno Schulz, “The Street of Crocodiles” from the collection of the National Library presented at the website polona.pl]

  4. What conservation constraints apply? What is the physical condition of the materials? How long can the works be exposed to light? Can a book, a journal or a bound annual volume of a journal be opened fully at the spine, or only partly? Can a book opened to a certain angle be pressed flat under a glass plate?
  5. What is the aim of the project? Is it to present printed documents so that everyone can read or view them? Is it to make access more efficient, for example through OCR or automatic music recognition (and if so, has particular OCR software been selected, and do we know its requirements regarding file parameters)? Or is it to present details such as small elements of illustrations or paper texture? What does this imply for the required parameters of the digital images?

[Picture 7, part of a music manuscript by Karol Szymanowski from the collections of the National Library of Poland and an equivalent part of a processed scan of a sonata from the book by Gebethner and Wolff provided by the University of Rochester]

  6. What requirements should be met as regards color fidelity? What does this imply for lab equipment and workstation lighting?
  7. What requirements should be adopted as regards the resolution of digital reproductions?
  8. Does the institution have adequate IT infrastructure for long-term storage of the files created during the digitization project? Does it want to present them on its own, or also make them available to existing digital libraries?
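
The questions about project scale, resolution and storage can be roughed out numerically before any equipment is chosen. The Python sketch below is only an illustration under stated assumptions: the function name estimate_project, the uncompressed 24-bit RGB master format and the throughput of 1000 scans per day on one work stand are hypothetical, and are not taken from this text or from any standard.

```python
from dataclasses import dataclass
import math


@dataclass
class ProjectEstimate:
    scans: int             # total number of images to create
    pixels_per_scan: int   # width x height of one image, in pixels
    storage_gb: float      # uncompressed size of all images, in GB (10**9 bytes)
    workdays: int          # working days needed on one work stand


def estimate_project(
    volumes: int,
    pages_per_volume: int,
    page_width_in: float,
    page_height_in: float,
    ppi: int,
    bytes_per_pixel: int = 3,    # assumption: 24-bit RGB, no compression
    scans_per_day: int = 1000,   # assumption: throughput of one work stand
) -> ProjectEstimate:
    """Rough estimate of scan count, storage and time for a book project."""
    # One scan per page; covers and fold-outs are ignored in this sketch.
    scans = volumes * pages_per_volume

    # Pixel dimensions follow from physical size times resolution.
    width_px = round(page_width_in * ppi)
    height_px = round(page_height_in * ppi)
    pixels = width_px * height_px

    storage_gb = scans * pixels * bytes_per_pixel / 1e9
    workdays = math.ceil(scans / scans_per_day)
    return ProjectEstimate(scans, pixels, storage_gb, workdays)


if __name__ == "__main__":
    # Hypothetical project: 100 books, 300 pages each, 6 x 9 inch pages, 300 ppi.
    print(estimate_project(100, 300, 6.0, 9.0, 300))
```

Under these assumptions, 100 books of 300 pages each come to 30,000 scans, roughly 437 GB of uncompressed files and 30 working days on a single stand. Compression, additional scans of covers and quality-control passes would change these figures, so the result is a starting point for discussion rather than a budget.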

More extensive information on digitization can be found, among other sources, in the book “Digitalizacja piśmiennictwa”, edited by Dariusz Paradowski and published by the National Library of Poland. The book, in Polish, is available online.