A Librarian with (a) View(s)

Readings on internet technology

Saturday, 25 July, 2009 in Information and Communication Technology | Tags: musings, technology

The readings by Miller this week were partly review for me of things I knew already, with some elaboration and detail, but some of the information was definitely new.

I’ve been reasonably familiar with the various mechanisms for accessing the internet since I’ve used many of them at one point or another – dialup modems, DSL, and cable modems at home, T1/T3 lines at work, and wireless at both places plus other areas. I hadn’t known all the history of them, though, for instance how early T1 lines were developed.

The discussion of packets in ch. 4 of Miller helped clarify my understanding of those, especially the explanation of the differences between TCP and UDP. Likewise the material on IP addresses and proxy servers was really useful; proxy servers are something that I only vaguely understood, and this helped me better grasp what they do. The same goes for routers: I had only vague notions about them, and I think I understand them better now. I remember WINSOCK and having to install several versions of that! It’s so much easier now that TCP/IP is built into Windows.

Domain names and lookups I did know something about already, since I’ve maintained a personal website for some years and have to pay an annual fee to my host for domain registration, and I recall from early web days occasionally using the numeric addresses instead of names to reach some sites. I also remember doing ping tests a few times.

The readings by Ojala and by Eliopoulos & Gotlieb were a little dated, since there have been some changes in search engines in the last 6-7 years, but still useful. The table in the Ojala article which summarized the differences and similarities between search engines was good (although sadly the scan made it hard to read which engine was being described in which column). It was also good to be reminded (in the Eliopoulos & Gotlieb article) that the 80 results after the first 20 often can also be highly relevant – so often we settle for what we first see because it seems acceptable and we don’t want to waste time.

The Burd chapter looked back to what we did in building hypothetical computer labs in the second week of class; good as a refresher.

References:

Burd, Stephen D. 2006. Systems architecture. 5th ed. Boston: Thomson.

Eliopoulos, Demetrios, and Calvin Gotlieb. 2003. Evaluating web search results rankings. Online 27 (March-April): 42-8.

Miller, Joseph B. 2009. Internet technologies and information services. Westport, CT: Libraries Unlimited.

Ojala, Marydee. 2002. Web search engines: Search syntax and features. Online 26 (Sept.-Oct.): 28-31.

Thoughts on connectedness and tools

Thursday, 23 July, 2009 in Information and Communication Technology | Tags: musings, technology

[This was originally written for a course on Information and Communication Technology]

“Connected data” and “connectedness” don’t necessarily carry the same meaning for me. “Connectedness” implies something more human – a person-to-person link – whereas “connected data” might foster the former, but are reducible to pixels and bytes.

That suggests that “connected data” are invariably digital… which isn’t quite what I meant, because it’s easy to think of ways in which non-digital information or data may be connected in various ways (texts by the same author, or on the same subject, for example). I do, however, think that digital data are easier to connect together, and that it’s easier for users to connect with those data as well. The hope with changing technologies is that they will enable users to make those connections faster/better/easier, allowing users to choose between different possibilities to find the one(s) that work(s) best for them.

We’ve looked at, used, and developed a number of different tools in this class, most although not all of them being web-related in some way (and even the ones that are not inherently web-related, like databases, can be accessed or used online). Most of these tools promote connections between pieces of data – databases most obviously, but web pages, blogs, and wikis all bring together information, for example by including links to other sites. Many of them also promote connections between the data and users, as in the way that wikis typically allow any user to add or edit information on their pages. Some of them facilitate connectedness between users, as with blogs where a reader can respond directly to the blogger.

The key thing to remember, though, is that no tool is perfect for every purpose, and also that however cool something is, if it doesn’t produce the kind of results we’re hoping for (whatever those may be), it’s pretty much useless. A wiki is a pallid and lifeless thing if no one out there is interested enough to contribute to it. Podcasts that no one wants to hear sound their barbaric yawp over the rooftops and into silence. (Sorry, got a bit carried away there.)

So I can see potentially using many or all of these tools, as a librarian, but I don’t think any one of them is truly necessary to do the job and do it well. They’re tools. Means to an end, not the end itself. If using podcasts and RSS feeds helps my library reach out to undergraduates and tell them about the things the library can assist them with, then those are good tools to use. If Adobe Connect Pro lets me meet virtually with colleagues at other institutions, so that we can brainstorm and work together without having to spend scarce funds on airfare and hotels, then that’s a good tool. If my department can use a wiki internally to get all of our procedural documents online, rather than producing dozens of paper copies and having to redo them all the time as the procedures are updated, then that’s a good tool, too.

We need to be aware of what’s out there, technologically. It’s changing so fast that it’s hard to keep up, and invariably we’ll get into some ruts. One can imagine hearing in ten years, “We’ve always had a weekly podcast!” The point is to do our best to stay aware, to stay flexible, to use the tools that work and not be afraid to try new ones or discard old ones if they’re not doing the job any more.

Medieval/Renaissance food image collection, with thoughts on copyright

Thursday, 4 December, 2008 in Digital Collections | Tags: collections, copyright, history, images, medieval texts, musings

I’ve drawn on the images from this site (http://www.godecookery.com/afeast/afeast.htm) in teaching, especially for a class that I have taught on food in pre-modern Europe, using several of the images to illustrate points in a lecture. It’s the simplest sort of a digital collection, really, maintained by a single person out of personal interest. The images are grouped thematically on separate pages, and each theme page has thumbnails and brief descriptions of the images; the images themselves can be reached by clicking the thumbnail. So it’s not a particularly sophisticated collection in terms of organization or labeling, but since it’s also not that large of one, the way that it is set up suffices for the purpose.

Given that I’ve just been thinking about copyright issues, it’s notable that the site’s owner is erratic in attributing the origins of the images displayed. Some are attributed; for instance, an image of a baker (http://www.godecookery.com/afeast/kitchens/kit055.html) is said to be from a 1432 Flemish manuscript of Boccaccio’s Decameron. Others, like this merchant with a nutmeg (http://www.godecookery.com/afeast/foods/food005.html), merely have an approximate date given, with no other indication of where the image came from. The images from A Canterbury Calendar, originally from a manuscript dated to about 1280, are taken from a book published in 1984, and I’d hazard that probably no permission was given to display them online, although I might be wrong.

This is one of those areas where I find copyright law problematic. Frankly I doubt that having these images available online is going to prevent anyone who might want that book from buying it; the dozen images alone hardly comprise the information that would normally be wanted. So it’s not going to cut into sales or use of the book (the edition from which the images are drawn is out of print, in fact, although there is a revised edition in print). It’s hard to see how enforcing copyright here would encourage greater creativity. Actually I’d suspect that having the images out there on the web is what might stimulate interest and possibly new thoughts and ideas on the topic of medieval food.

Tags in online archives

Thursday, 4 December, 2008 in Digital Collections | Tags: archives, collections, fanfiction, metadata, tags

I’m revisiting the Archive of Our Own (http://archiveofourown.org/) project because I wanted to talk a little bit about tags, and it’s an example of using them that illustrates a larger point. The Archive is still in closed beta, and does not (yet, as far as I can determine) have any FAQs available, so my comments are based strictly on what I can observe from poking around in the site; I know nothing about what is official policy.

From what I can tell, authors upload their stories and input various bits of information about them. A set of symbols identifies some characteristics of the stories: rating, type of relationship, whether there are content warnings, and whether the story is finished. Other information is given in text form, including title, author, fandom, characters, and specific warnings.

The specific warnings (convention in fandom calls for informing readers about certain elements in the story, such as sexual content, violence, death of major characters, etc.) are given in the form of tags, and the tags can be used for conveying other information as well. There is a page (http://archiveofourown.org/en/tags) where all tags that have been used are listed, in a cloud format that shows in larger type those tags more often used. The very most commonly used tags are not only in large type, but in red rather than black font as well; the three most often used are “Angst,” “First Time,” and “Humor.”

It seems clear that the tags are not taken from an established list (or not necessarily), but can be added by individual authors as desired. I deduce this from the variety of formats and meanings of the tags themselves, as shown by copying one line of tags at random:

This is a very Web 2.0 approach – the users create the content, and also identify it in ways that they choose, although some elements are standardized (title, fandom, etc.). It illustrates both the strengths and the weaknesses of such an approach, though. It gives the authors agency and ownership, which is very much in line with the purpose of the site. On the other hand, the total number of tags at this moment is 1085, a number that doubtless increases daily. Someone using the site might have a hard time thinking of what tag might identify the type of story they were hoping to read, especially given that different authors might use different tags to mean essentially the same thing, e.g. “smut” vs. “porn”.
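To make the synonym problem concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the sample tags and the synonym table are made up, and nothing here reflects how the Archive actually stores or processes tags. It counts how many stories use each tag (the raw material for a tag cloud) after folding variant spellings and synonyms into one preferred form.

```python
from collections import Counter

# Hypothetical synonym table: variant tag -> preferred tag (an assumption
# for illustration, not an official list from any site).
SYNONYMS = {"porn": "smut", "humour": "humor"}


def normalize(tag):
    """Trim whitespace, lower-case the tag, and map known variants to one form."""
    cleaned = tag.strip().lower()
    return SYNONYMS.get(cleaned, cleaned)


def tag_counts(stories):
    """Count how many stories carry each normalized tag."""
    counts = Counter()
    for tags in stories:
        # A set keeps one story from counting twice when it uses two variants.
        counts.update({normalize(t) for t in tags})
    return counts


# Made-up sample data: each inner list holds the tags on one story.
stories = [
    ["Angst", "First Time"],
    ["Humor", "smut"],
    ["porn", "Angst"],
    ["Humour", "Smut", "porn"],
]

counts = tag_counts(stories)
for tag, n in counts.most_common():
    print(f"{tag}: {n}")
```

Without the synonym table, “smut” and “porn” would be counted as separate tags, and a reader searching on one would miss stories filed under the other. The catch is that someone has to build and maintain that table, which is exactly the question of who should be responsible for metadata.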

So I wonder, are these tags really useful? That raises larger questions about metadata generally, who should create and maintain metadata, and how. Something worth considering even though I’m not sure there’s an ideal answer, certainly not in this specific case and perhaps not at all.

Digital collections and copyright

Thursday, 4 December, 2008 in Digital Collections | Tags: archives, collections, copyright, musings

If I thought copyright was a tricky and irritating thing, coming from the perspective of an instructor, it’s much more of a potential problem when looked at from the perspective of creating an online collection, as I discovered in writing a paper on the topic.

Copyright law is lengthy, complicated, and (in my opinion at least) frequently does not actually do what it is intended for, that is, “promote the progress of science and useful arts,” as stated in article 1, section 8 of the U.S. Constitution. A compilation of the U.S. Copyright Act of 1976 plus related and more recent acts runs to over 300 pages (http://www.copyright.gov/title17/circ92.pdf). Wading through the tortuous prose of the Act is not easy. Luckily there are various other publications which summarize certain areas of the law and discuss their applicability, like the ARL’s “Know Your Copy Rights®: Using copyrighted works in academic settings” (http://www.knowyourcopyrights.org/index.shtml).

I learned the interesting fact that copyright law in the U.S. is usually revised by having the industries involved work out the changes in the law and then give that suggested text to Congress (Dames 2006, 36). This struck me as extremely problematic; the industries, e.g. publishers, are going to have a lot more clout, and stand to profit more from changes in copyright law such as lengthened terms of copyright, than will the actual creators of the works. It’s difficult to see how, then, creative thought is likely to be stimulated by copyright.

On the other hand, the fact that copyright is now assumed and need not be formally registered (for a price) is a plus in these days of the internet, since someone (a U.S. citizen anyhow) who places a new creative work on the internet does retain copyright to that work. They may choose to relinquish some or all of their rights, e.g. with a Creative Commons license (see http://creativecommons.org/) which allows others to use the work in defined fashion, but it’s at their choice.

What I concluded overall from the research I did was that it’s critical for any person or institution that may be setting up a digital collection to be extremely careful ahead of time in paying attention to the issue of copyright, and ensuring that either the collection’s contents are in the public domain, or that all necessary rights permissions have been secured. While an instructor in a classroom may be able to use materials on the basis of fair use, creating a permanent online collection is a very different matter.

Reference:

Dames, K. Matthew. 2006. The copyright landscape: Introducing U.S. copyright law. Online (Sept./Oct.): 35-8.

Digitizing national collections

Friday, 28 November, 2008 in Digital Collections | Tags: archives, collections, history

As someone trained in European, and particularly British, history rather than American history, I have used the National Archives (NA) of the UK on a number of occasions but never the Library of Congress (LoC) which is to some extent the equivalent, inasmuch as both institutions house various collections of original documents considered significant in the nation’s history, although in other respects the type of materials collected may differ quite a bit.

Both institutions have digitized some of their holdings, but their purposes and how they make these accessible can be quite different.

The LoC states, “The mission of the Library of Congress is to make its resources available and useful to Congress and the American people and to sustain and preserve a universal collection of knowledge and creativity for future generations. The goal of the Library’s National Digital Library Program is to offer broad public access to a wide range of historical and cultural documents as a contribution to education and lifelong learning.

“The Library of Congress presents these documents as part of the record of the past. These primary historical documents reflect the attitudes, perspectives, and beliefs of different times. The Library of Congress does not endorse the views expressed in these collections, which may contain materials offensive to some readers.”

This is part of their general boilerplate for each online collection; I found it at http://lcweb2.loc.gov/ammem/sfbmhtml/sfbmhome.html, the collection of Samuel F. B. Morse’s papers at the LoC.

The overall statement of purpose for the LoC (at http://www.loc.gov/library/about-digital.html) states:

“The Library of Congress has made digitized versions of collection materials available online since 1994, concentrating on its most rare collections and those unavailable anywhere else. The following services are your gateway to a growing treasury of digitized photographs, manuscripts, maps, sound recordings, motion pictures, and books, as well as “born digital” materials such as Web sites. In addition, the Library maintains and promotes the use of digital library standards and provides online research and reference services.

“The Library provides one of the largest bodies of noncommercial high-quality content on the Internet. By providing these materials online, those who may never come to Washington can gain access to the treasures of the nation’s library. Such online access also helps preserve rare materials that may be too fragile to handle.”

As far as I could tell, the digitized collections at LoC are all freely available via the web, regardless of the location or identity of the potential user.

This is not the case with the digitized collections at the NA. There it is stated (http://www.nationalarchives.gov.uk/documentsonline/about.asp): “DocumentsOnline allows you online access to The National Archives’ collection of digitised public records, including both academic and family history sources. We are committed to providing online access to the records, and DocumentsOnline forms a key part of this strategy. DocumentsOnline can be used free of charge on public access PCs at The National Archives.”

Some of the online records can be downloaded for free from other computers, but by no means all. I happen to have used records (wills) from the Prerogative Court of Canterbury, several hundred of which I looked at in the originals at the archive some years back. To download a single will costs £3.50 (about $5.40 at today’s exchange rates). It would be impossible to carry out the research that I did without either very substantial funding or going to the archive and looking at documents the old-fashioned way.
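A back-of-the-envelope calculation shows how quickly the fees add up. This is only a rough sketch: the per-will prices are the ones quoted above, but the figure of 300 wills is a hypothetical stand-in for “several hundred,” not an exact count.

```python
# Per-will prices quoted in the post above.
PRICE_PER_WILL_GBP = 3.50
PRICE_PER_WILL_USD = 5.40  # approximate, at the exchange rate of the time


def download_cost(n_wills):
    """Return the total cost of downloading n_wills as (pounds, dollars)."""
    gbp = round(n_wills * PRICE_PER_WILL_GBP, 2)
    usd = round(n_wills * PRICE_PER_WILL_USD, 2)
    return gbp, usd


# Hypothetical sample size standing in for "several hundred" wills.
gbp, usd = download_cost(300)
print(f"300 wills: £{gbp:,.2f} (about ${usd:,.2f})")
```

On those assumptions the downloads alone would come to a little over a thousand pounds.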

One drawback to digital collections from either institution is that in neither case are they comprehensive. This is completely understandable – the total collections held are enormous, and the sheer time needed to digitize the documents (not to mention adding metadata and other necessary steps, or the costs involved) is prohibitive. Nevertheless having even some of this national-historical information available in digitized format, even for a fee, is potentially a tremendous benefit.

Whither digital repositories?

Friday, 28 November, 2008 in Digital Collections | Tags: articles, collections, musings

Dr. Martens pointed us at this article at LibraryJournal.com: At SPARC Digital Repository Meeting, Shulenberger Calls Out AAUP, ACS. Towards the end of the article the author (Andrew Albanese) notes, “Libraries, with stretched budgets, have bought fewer monographs, and the consolidation in the bookselling market has left university presses increasingly alone to fend for their survival. It’s time, Shulenberger, urged, for all campus units, to find ways to pull in the same direction, for everyone’s common benefit.”

This makes me think that academically-oriented online collections in general, and institutional repositories in particular, need to be reconceptualized in terms of what they do and how their functions interact with and/or replace those of earlier methods of sharing information.

There are many reasons why information is shared in an academic context, but two stand out. First, because that is in a sense the entire purpose of the academy: to create and disseminate knowledge. Second, because the way that higher education is structured, the professoriate is judged primarily on the basis of such creation and its dissemination through publication.

Publication in journals or monographs is not the only way in which information can be disseminated, however. The advantages of publication have traditionally been threefold. Publication enabled information to reach more people more quickly than did personal communication (before academic journals existed, the sharing of knowledge was largely informal, and carried out through direct communication until and unless the author published a book). This advantage is now far less relevant, since it is easy to put information on the internet. Publication also fixed the information in a standard, findable, permanent form; and that is an advantage not yet always extant in online formats, though the use of repositories should improve permanency and findability. Finally, publication involves a certain amount of gatekeeping; no publisher can publish everything submitted, since there are questions of cost and also of suitability. Thus the existence of peer review, which in theory ensures that published academic works are of high quality. This is something that repositories do not assist with at the present time, since their purpose is to archive rather than to review.

It would be entirely possible to add some sort of review function to a repository, however. It need not be required for all items, but could be an option. A university press could use the same group of reviewers now called upon to evaluate submitted manuscripts (books, articles, or both) to evaluate items deposited in an institutional repository. If of suitable quality, those items could be designated as “peer reviewed” and be considered the equivalent of a formal publication, with the imprimatur of the press; authors could also have the opportunity to revise the work if reviewers felt it was not up to standard.

Substituting peer-reviewed items in digital repositories for traditional university press-published journals and monographs seems to me to be a potential way to continue disseminating new knowledge, retain quality standards, and yet not continue to need to subsidize money-losing presses. If the digitized items in the repository could also be converted into printable formats, using the Espresso Book Machine or similar technology, then it’s hard to see what losses there would be in such a shift.

Repositories and copyright

Friday, 28 November, 2008 in Digital Collections | Tags: articles, collections

Greig, Morag. 2007. Repositories and copyright: Major hurdle or minor obstacle? ALISS Quarterly 3 (1): 16-9.

In doing some research on copyright I found this article, which looks at issues of copyright connected to the institutional repository at the University of Glasgow, specifically at that part of the repository which holds published materials (unpublished materials not usually being subject to copyright problems). Authors may either self-deposit or have repository staff make the deposit; the latter is generally preferred since authors do not feel they can accurately understand publishers’ copyright agreements, and are concerned lest they break the law.

Greig discusses the methods by which staff can determine if an article may be deposited. Often a particular version is authorized by the publisher for deposit, but it may be difficult to obtain the correct version. She notes that it is usually relatively straightforward to find out if journal articles may be deposited, but more difficult for other items such as conference papers. Books and book chapters are also difficult as the contracts between publishers and authors rarely stipulate what is permissible with respect to repositories, but some books have been placed in this repository and have been very frequently used (over 22,000 downloads in one case).

The final comment in the article is of especial interest. Greig points out that for all the problems that can exist in getting the appropriate permissions from publishers for deposit of materials in repositories, the greatest barrier remains the authors themselves, who must make the first effort to deposit, but who often still do not see this as part of what is expected of them.

The Google Books Settlement

Friday, 28 November, 2008 in Digital Collections | Tags: collections, google

This is a subject that will be ongoing for quite a long time, I think. I’m still trying to get my head around what the settlement does, and what the implications potentially are for both libraries and individuals.

The Disruptive Library Technology Jester so far has six posts dealing with the settlement, and I have found the comments and summaries there to be quite useful.

1. Google Book Search Settlement: Introduction, Public Announcements
2. Google Book Search Settlement: Reviewing the Notice of Settlement
3. Is OCLC’s Change of WorldCat Record Use/Transfer Policy Related to the Google Book Search Agreement?
4. Google Book Search Settlement: Public Access Service
5. Preliminary Court Approval of Google Book Settlement; Final Approval Hearing Set
6. Google Book Search Settlement and Library Consortia

What does seem clear to me so far is that any library that might wish to make use of the materials digitized by Google and its partners is going to have to read very carefully the settlement and the terms of use that Google is establishing. Libraries acting as part of consortiums may be able to get discounts in pricing.

Choosing metadata schema for digital collections

Sunday, 26 October, 2008 in Digital Collections | Tags: articles, collections, metadata

Kennedy, Marie R. 2008. Nine questions to guide you in choosing a metadata schema. Journal of Digital Information 9 (26). http://journals.tdl.org/jodi/article/view/226/205 (accessed October 26, 2008).

Kennedy assumes that the collection is already in development, with a known rationale for its creation, and that any copyright issues have been considered. She then lays out the following questions as guidelines for choosing a metadata scheme that will be useful for the collection’s users:

Who will use the collection?
Who will catalog the collection?
How much time and money is available?
How will the collection be accessed?
How does the collection relate to other collections?
What is the collection’s scope?
Will the metadata be harvested?
Will the collection “work with” other collections?
How much maintenance and quality control is desired?

Kennedy doesn’t so much promote specific existing metadata schemes (such as Dublin Core) as encourage developers to pay attention to the needs inherent in their own situation. This is a very useful checklist for someone developing any collection, but particularly for someone working on academic or institutional collections and repositories.
