You are currently browsing the monthly archive for September 2008.

The John Rylands University Library at the University of Manchester has over 4600 images in its digital library, including several hundred images of papyri. The largest numbers are written in Coptic and Greek, but there are also some Demotic and hieroglyphic texts. The library plans to add more digitized materials over time, as this is only a fraction of their holdings (none of their Arabic-language papyri have yet been digitized, for instance).

Interestingly, there are two methods for accessing the materials: a web browser, and a downloadable client which provides greater functionality in searching and viewing. The client requires a username and password, but the page that explains the two options also provides a public username and password, so access is not restricted to University of Manchester users. This option strikes me as an excellent one, allowing individual users to choose what will be most effective for them.

The same page also gives overall copyright information on the collection (with a note that individual images may have different copyright restrictions). In general, private study and educational uses are permitted, although the latter must acknowledge the university, and boilerplate acknowledgment language is provided. Other uses require written permission and usually fees, and links to the request forms appear on the page.

I’m finding the differences between the ways that scholarly digital image collections are organized to be very interesting. The best of them have good searchable metadata, easy-to-use interfaces, and images that can be viewed in different resolutions. These are all obviously things to think about when creating or revamping such a collection.

The John Rylands Library is continuing to put additional rare and fragile manuscripts online (not just papyri). There’s a recent article from the Telegraph that indicates that a 14th-century recipe book is among the items to be digitally photographed and added to the collection in the next year. It’s really quite astonishing (and wonderful!) to see all of this work being done.

Oxford University has put digital facsimiles of more than 80 medieval manuscripts online. The images are copyrighted, but personal research use is permitted.

The digitization project proceeded in several phases. First a number of Celtic MSS were digitized, then additional medieval manuscripts deemed particularly valuable, useful, and/or fragile. These two phases were carried out with government funding. A server failure led to the takeover of the project by the Oxford University Library’s automation department, which also redesigned the website, and it is now controlled by the Oxford Digital Library.

During the site redesign, it was discovered that some images are missing from some of the MSS. A statement on the site indicates that the library is in the process of determining what is missing and what resources are needed to correct the problem.

Several potential issues with the creation of digital collections are thus highlighted. Funding may be temporary, and insufficient to digitize as much material as might be desired (by no means all of the medieval MSS held by Oxford colleges are included). If later problems are discovered, the funding may no longer be available to correct those problems. The technology may also fail, as happened with the server that originally housed this collection. This meant that the material was moved and now falls under the auspices of a different body.

The copyright restrictions on the images mean that an individual may download a single copy of each for private personal use (they may also be displayed in an academic lecture), but another website may use only a URL linking to the image location, not the image itself. This is a reasonable restriction under copyright law, but if the image locations were to change again later, those links would break and access would become difficult. That’s merely something to be considered.

The descriptions of the MSS (i.e., the metadata) are quite limited and not really searchable; the MSS are listed by college and shelfmark, with brief descriptions in the browsing area and longer ones when you click through to a specific MS. Medievalists are used to such things, though, so it’s less of a limitation than would be the case for born-digital items.

This post is simply some musings about what collections are, and what’s necessary to make them valuable.

One fascinating thing about digital/online collections is how incredibly varied they can be. Text-based, still images, images of texts, sounds, videos… no one’s managed to capture and transmit touch or taste or smell, so far as I know, but sight has long been the primary sense upon which we rely for information transmission, and sound the secondary one. (I’m thinking long-term and long-distance here, as opposed to in-person communication.)

So the medium isn’t key to defining a collection, though a digital collection is by definition digitized in some manner.

To call something a collection does imply that a number of different items are included. How many? That can vary tremendously. Dozens? Hundreds? Thousands? Millions? Perhaps that doesn’t matter as much as the fact that once a collection has over perhaps fifty items, what becomes important is how to find a given desired item. Searching is key, especially when the searcher is not already familiar with the collection’s contents.

Any search relies on metadata of some sort. The conclusion I am reaching is that coming up with metadata categories, and then terms, is absolutely key to making collections of any significant size actually usable and useful.
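To make that concrete, here is a minimal sketch in Python of what "categories, and then terms" might look like in practice. Everything in it is an assumption for illustration: the field names (`language`, `material`, `keywords`), the `search` helper, and the three sample records are all made up, and none of it describes how any of the collections mentioned above actually store or search their metadata.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One item in a hypothetical digital collection."""
    identifier: str
    title: str
    language: str   # a metadata category; "Greek" would be a term in it
    material: str
    keywords: set = field(default_factory=set)


def _matches(value, wanted):
    """Case-insensitive match against a single value or a set of values."""
    wanted = wanted.lower()
    if isinstance(value, (set, frozenset)):
        return wanted in {v.lower() for v in value}
    return str(value).lower() == wanted


def search(items, **criteria):
    """Return the items whose metadata matches every given category=term pair."""
    return [
        item for item in items
        if all(_matches(getattr(item, name), term)
               for name, term in criteria.items())
    ]


# Invented sample records, purely for illustration.
COLLECTION = [
    Item("demo-001", "Letter fragment", "Greek", "papyrus", {"letter"}),
    Item("demo-002", "Account list", "Coptic", "papyrus", {"accounts"}),
    Item("demo-003", "Recipe book", "English", "parchment", {"recipes", "cookery"}),
]

papyri = search(COLLECTION, material="papyrus")
coptic_papyri = search(COLLECTION, material="papyrus", language="coptic")
recipes = search(COLLECTION, keywords="Recipes")
```

Even in a toy like this, the search can only ever be as good as the categories someone chose and the terms someone typed in: an item whose language was never recorded simply cannot be found by language.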

All the information in the world is useless if it’s piled in a random heap.

The Disruptive Library Technology Jester has a post linking to a survey being carried out by David Lowe, Preservation Librarian at the University of Connecticut. Chapter 3 in the Lesk textbook is on “Images of Pages” and chapter 4 on “Multimedia Storage and Retrieval,” so people who are interested in collections of digital images may want to investigate this. The DLTJ post also has links below to some related posts/discussions of JPEG2000.

I’ve never been much of an image person myself (despite having taken art classes in high school and as an undergrad) and so my understanding of the different image file types is pretty minimal. I think this is probably an area I need to read up on, at least a little bit. Ideally digitized images would be faithful to the original, have a small file size, and bear metadata as part of the file – but I suspect it’s a case of “choose any two of the three”.