Collection Assessment

Remember how I wrote sometime last year that I had an interesting assignment? “Last year?! Whoever remembers anything from last year?” is absolutely the right answer. I can hardly remember it myself, but luckily, my laptop is much smarter than I am and has saved a copy of the assignment, which I’d now like to share.

I am, of course, exercising common blogging courtesy and so will not post the entire 9-page paper here. However, perhaps a table of summary statistics and some circulation data will not prove terribly trying on your patience? Although I realize that there’s no accounting for taste, I find them interesting.


In October of 2011, I examined print materials at the Young Research Library (YRL) with Library of Congress (LC) call numbers in the range Z471-Z481, inclusive.

Summary Statistics
In Table 1, I present basic summary statistics on books in the range Z471-Z481, which covers American publishing and bookselling. The table includes every book that is either reported by the online catalog as being held by YRL or is physically on the shelves at YRL.

Table 1: Summary Statistics

Total number of books: 325
Number of unique titles: 275
Number of titles with duplicates: 15
Number of books currently checked out: 13
Range for dates of publication: 1884-2010
Number of serial titles: 17
Number of serials: 48
Ratio of serials to monographs: 17:100
Number of languages present: 3
Number of books not on shelf (but not marked as checked out or missing in catalog): 20
Number of books on shelves that are not marked as being in YRL: 20
Number of books on shelves that cannot be located from online catalog: 10

There were 325 items in this category, with 13 currently checked out to patrons. The section contains both monographs and serials, with monographs dominating in number. Because of the subject matter (primarily American bookselling and publishing, with a couple of tomes in the Z481 range covering Canadian publishing), all but 2 items are in English. The collection contains quite a lot of older materials. The average copyright year is 1974. Only 28 items were copyrighted in or after 2000, balanced against 53 items that were published before 1950.
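For anyone who wants to check the serials-to-monographs ratio in Table 1, here is a minimal sketch of the arithmetic. It assumes (my assumption, not something stated in the original paper) that the ratio counts serial volumes rather than serial titles, and that every item that is not a serial volume is a monograph. The function name is made up for illustration.

```python
# Figures taken from Table 1.
TOTAL_ITEMS = 325      # total number of books
SERIAL_VOLUMES = 48    # number of serials (volumes, not titles)


def serials_per_100_monographs(total_items, serial_volumes):
    """Return the number of serial volumes per 100 monographs, rounded.

    Assumption: every item that is not a serial volume is a monograph.
    """
    monographs = total_items - serial_volumes
    return round(100 * serial_volumes / monographs)


# 325 - 48 = 277 monographs, and 48 / 277 is roughly 0.173.
print(f"{serials_per_100_monographs(TOTAL_ITEMS, SERIAL_VOLUMES)}:100")  # prints 17:100
```

Counting serial titles instead (17 titles against 258 non-serial titles) would give a much lower ratio, which is why I suspect the table counts volumes.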

The last three rows in Table 1 demonstrate the discrepancies between the online library catalog and the reality on the shelves. In my mind, the most problematic issue is the fact that there are 10 books on the shelves that cannot be located by using the catalog. For all intents and purposes, these books are lost to the users since no one using the catalog would be informed of their existence. Although these 10 items represent only a small fraction of books and records, they are mostly very old items that may be very difficult to find elsewhere. (Admittedly, their age also means that few people are likely to be looking for them.)
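The three discrepancy rows amount to set comparisons between what the catalog claims and what is on the shelf. The sketch below shows one way to compute them, and it is not how I actually tallied the numbers (that was done by hand). The function name, the argument names, and the item identifiers are all hypothetical, and I am treating "cannot be located from the catalog" as "has no catalog record at all", which is a simplification.

```python
def shelf_discrepancies(catalog_at_yrl, catalog_elsewhere, unavailable, on_shelf):
    """Compare catalog records against a shelf inventory.

    All arguments are sets of item identifiers:
      catalog_at_yrl    - items the catalog reports as held at YRL
      catalog_elsewhere - items the catalog assigns to another location
      unavailable       - items marked as checked out or missing
      on_shelf          - items physically found on the YRL shelves
    """
    catalog_all = catalog_at_yrl | catalog_elsewhere
    return {
        # Should be on the shelf according to the catalog, but is not.
        "not_on_shelf": catalog_at_yrl - unavailable - on_shelf,
        # On the shelf, but the catalog does not place it at YRL.
        "not_marked_yrl": on_shelf & catalog_elsewhere,
        # On the shelf, with no catalog record to lead a user to it.
        "not_in_catalog": on_shelf - catalog_all,
    }


# Hypothetical identifiers, purely for illustration.
result = shelf_discrepancies(
    catalog_at_yrl={"A", "B", "C", "D"},
    catalog_elsewhere={"E"},
    unavailable={"B"},
    on_shelf={"A", "D", "E", "F"},
)
for name, items in result.items():
    print(name, sorted(items))
```

In this toy example, item C is missing from the shelf, item E is shelved at YRL but cataloged elsewhere, and item F is invisible to anyone searching the catalog.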

Circulation Statistics
Figure 1 shows a frequency distribution for the number of circulations in a random sample of 29 items. As the figure shows, a plurality of the items have never circulated (8 items with zero circulations). Two volumes of a serial were included in the random sample, and both had zero circulations. Two titles with duplicate copies were also in the sample, and both had strictly positive circulations (3 each). There are two “superstars” in the sample that had more than 20 circulations each. Overall, 21% of the items account for 85% of the total circulations, coming very close to the 80/20 rule first documented in Trueswell’s 1969 paper “Some Behavioral Patterns of Library Users: The 80/20 Rule”.
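The concentration figure is simply the share of all circulations that belongs to the most-circulated items. Here is a minimal sketch of that calculation. The function name is my own, and the circulation counts in the example are invented for illustration; they are not the 29 values from my sample.

```python
def top_k_share(circulations, k):
    """Return the fraction of total circulations held by the k busiest items."""
    ranked = sorted(circulations, reverse=True)  # most-circulated items first
    total = sum(ranked)
    if total == 0:
        return 0.0  # nothing has circulated, so there is no share to report
    return sum(ranked[:k]) / total


# Made-up counts for 10 items (NOT the real sample data).
example_counts = [0, 0, 0, 0, 1, 1, 2, 3, 12, 21]

# The top 2 of 10 items (20%) hold 33 of 40 circulations.
share = top_k_share(example_counts, 2)
print(f"{share:.1%}")  # prints 82.5%
```

For the real sample, 21% of 29 items works out to about 6 items, so the equivalent call would pass the 29 circulation counts with k set to 6.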

Figure 1: Frequency Distribution of Total Circulations

