Paper presented at the Digital Resources for the Humanities conference, Glasgow 1998; published in: Marilyn Deegan, Jean Anderson and Harold Short (eds) DRH98: Selected Papers from DRH98, Digital Resources for the Humanities Conference, University of Glasgow, September 1998 (London: Office for Humanities Communication, 2000), pp. 129-141.
Creating an Image Edition of Historical Material
Asia: Official British Documents, 1945-1965
Brad Scott
Asia: Official British Documents has been created largely independently of any of the debates on the use of document imaging and other technologies in the development of historical resources that have been taking place in academia over the past ten years or so. It was only during preparation for this paper that I became aware of the extent to which many of the issues we have been trying to address have been grappled with elsewhere. In a sense, this is one of the main points of this paper: that there is a considerable overlap between what we are all doing in humanities computing, as publishers, or as academics, librarians and archivists. Though as sub-groups within humanities computing we may each bring different areas of expertise and different agendas to our work, I believe that it is only through collaboration and discussion that we can each try to ensure the success of our own projects, in what is still a difficult and evolving technical and economic climate. Asia, then, is a project born out of pragmatism.
BACKGROUND
As a publisher, Routledge has produced a number of print editions of a range of historical resources over the years. These include edited summary collections in a single volume; some facsimile editions of documents; and some scholarly ‘full text’ editions. Like many others, as we formed our ideas about what it may be appropriate to develop in electronic media, historical materials were, on the face of it, an obvious choice.
In choosing to develop this project, we had two key requirements:
- That the resource would be useful to and suitable for a range of academic research users throughout the world; and
- That the model we developed for producing it would be sustainable, ie that the software could be re-used with minimal changes; that the editing and production practices could be applied to other image editions of historical materials, perhaps even from other archives than the Public Record Office; that the processes involved (such as writing the image metadata headers, image and text checking, and managing the electronic capture of the images) could be carried out in a predictable timeframe, by a small team over a six to nine month period; and that it would be economically sustainable (ie that, in the absence of grant funding, we could be reasonably confident that the project would be self-financing, cover all appropriate overheads (including distribution, warehousing, credit control, heating our offices, rent etc) and give us a small amount to help grow the business so that we will have a better chance to do it all again).
Of course, we have never expected that the model we would develop for this first project would necessarily provide all the answers. Over time, there are bound to be changes, modifications and improvements to the procedures we have set up, and to the software and general publishing strategy. Any model such as this has to be flexible enough to respond to user feedback, editorial needs and technical changes.
In this paper, I will address how we have attempted to meet these requirements, and detail those issues which have arisen. I will also cover the following additional themes:
- related projects elsewhere
- document selection
- features and requirements of the software
- workflow management; and
- image capture
IMAGE EDITIONS
In parallel with our work on the Asia project, the Model Editions Partnership (MEP) under David Chesnutt at the University of South Carolina has emerged as a major co-ordinated programme of historical digitization projects. The MEP combines seven projects to devise standards both for full text mark-up in SGML and for SGML metadata headers for images, which will be independent of commercial software products.1 The attention given to metadata headers is important in the context of this paper as two of the projects in the Partnership are what have been designated ‘image editions’, ie they contain searchable textual descriptors of documents linked to tiff images of them. Such editions may well provide a model for the future handling of large amounts of historical archival material, whereas a full text SGML transcript may prove too costly for all but the most important documentary collections.
One of the MEP image editions is the Margaret Sanger Papers project, based at New York University and directed by Esther Katz.2 Sanger was prominent in the international birth control movement from before the First World War through to the 1960s. Microfilms of her papers have already been produced and total nearly 250 reels containing over 115,000 multi-page documents. Though an early plan was to digitize all of these documents, the large quantity of material and the consequent data capture and editing costs mean that current plans are to develop electronic versions comprising topic-based selections of the total content. The editorial team are also preparing detailed contextual materials to help the user understand the documents. The MEP and the Sanger project are currently using DynaWeb to deliver their materials, though this is still experimental and is not yet publicly accessible. In addition, a print edition of the most important selection of the materials will be published around 2002. Similarly, the other MEP image edition, the Lincoln Legal Papers, based in Illinois,3 is planning to produce a digital edition of all the quarter of a million pages that have been identified, using the same approach.
EARLIER RELATED PROJECTS AT THE PRO
Like many national archival repositories, the Public Record Office (PRO) in London has been keenly exploring the opportunities that are opened up by digital technologies, including document imaging, full text and archival metadata. During 1994-5, the PRO worked on PROfiles 1964, an image edition of a comprehensive selection of the British government documents for 1964 released under the thirty-year rule. Published in four multiple-CD-ROM sets over two years and edited by Michael Kandiah and Gillian Staerck of the Institute of Contemporary British History, PROfiles 1964 contains 150,000 images with editorial commentary and searchable textual descriptions of the documents.4
As the first of the PROfiles products was nearing completion, Chris Doutney and Aidan Lawes of the PRO approached us with an idea to develop a range of subject-specific PROfiles-like resources. From the start, though, it was clear that the new resources would need a somewhat different approach from the earlier project:
- The subject area would need to be one which we could confidently sell in as wide an international market as possible;
- It would need to be priced more comparably with microfilm containing a similar amount of material, as financially, PROfiles had not been a success;
- We would need to revisit the software design; and
- Considering the breadth of any likely subject area covered, the resource would probably have to comprise a substantial selection of material, rather than the traditionally comprehensive collections which have been developed as a result of the microfilm model of archival document reproduction and supply. It was also likely that a different approach would need to be adopted in constructing the image metadata. Both considerations derived from the requirement that the project be sustainable and that it be completed within a reasonable time scale.
DEVELOPING THE IDEA
Of the range of possible subject areas under consideration in the early planning stages, Asia soon emerged as an obvious candidate. We needed to be reasonably confident that, whatever the subject area, we would sell sufficient numbers of what would be a high-priced resource throughout the world in order to make the project viable. Our long-standing relationships with the Asian markets had taught us much about library purchasing patterns and budgets in the region, and we knew that sales in Japan especially would be critical in making any such project financially viable irrespective of its content. Thereafter, we reasoned, once we had developed a coherent model for creating and selling this sort of electronic resource, some of the cost components would be reduced and other subject areas would more likely become viable, and our criterion of sustainability would be met.
In the course of developing our initial ideas, as with all our publishing projects, whether in print or electronic form, we spent considerable time finding out what people actually wanted in such a resource, how they wanted to use it, and how they wanted to buy it. To do this we adopted a number of methods, including in-depth interviews with, and proposal reviews by, leading academics working in the area. We also sent a questionnaire to a number of academics and Asian Studies subject librarians in some of the most likely target institutions, especially in the Pacific Rim region, and conducted several discussions with our agents in Asia to get a realistic idea of the number of copies we might expect to sell, and also to establish what library suppliers in the core markets would prefer for a product of this type in terms of discount structure, pre-publication offers and the extent and nature of marketing information that we would need to produce. In so doing, we discovered that translating the user manual, at least into Japanese, may also be a requirement, which is something we have already done with the Routledge Encyclopedia of Philosophy.
These enquiries led to the view that a document collection from the British national archive focussing on post-war Asia was most sensible for the following reasons:
- It is historically and politically interesting for researchers both in the region and elsewhere in that Britain was still an important power at the time and was closely involved with others, especially the US, as part of the Allied Powers; Britain was retreating from Empire in Southeast Asia (and elsewhere in the world) throughout this period (eg Singapore and Burma), and so its record of events is valuable for constructing accounts of the post-colonial world; and Britain’s officials commented on, and analysed, the actions of other powers, such as the USA during the occupation of Japan. Britain’s officials routinely sent back long and detailed reports to London. British documents are therefore of vital interest to historians.
- It is a manageable corpus from which to draw a comprehensive selection of materials. I will return to this issue later.
It is perhaps also worth emphasizing that the collection does not just contain documents created by the British government and its officials; of necessity, a significant proportion of the materials are communications from other governments and agencies which are included in the PRO’s collection. Given this initial plan, it was obviously vital that the document selection criteria were drawn up and reviewed by subject specialists. This plan was put together by Gillian Staerck and Michael Kandiah, from the Institute of Contemporary British History, the same team who had worked on PROfiles, who knew the modern document collection at the PRO, and had had the vital experience of selecting and annotating documents for an electronic edition. Our editorial board subsequently made suggestions for a number of other topics, themes and document classes to be considered for inclusion, while broadly supporting the approach proposed. These advisers were: Sumio Hatano, University of Tsukuba; Wang Gungwu, East Asian Institute, Singapore; Antony Best, London School of Economics; John Saltford, PRO.
SELECTIONS
Why have we elected to produce a resource containing selections (albeit comprehensive ones), rather than all documents relating to postwar Asia from the PRO?
From the outset, we have wanted to devise a model for this sort of publication that would be sustainable and repeatable; for anything but the narrowest archival collection, the only possible sustainable way to produce archival material electronically is to be (slightly) selective — a point which is acknowledged by the current plans of the Margaret Sanger Papers project to produce themed collections from the complete corpus. You cannot include everything in the same way as is possible with microfilm, due to the cost and time it takes to write the metadata, however brief it is. Still, all of us working on Asia: Official British Documents, both publishers and academics, have been careful to ensure that this will be a comprehensive collection, one in which every page will have been examined by academic advisers (who will also have written the metadata headers); which is not something that is the case with microfilm collections. The principle of selection accepts that people will still want to look at the bulk of any documentary collection, but we have done a good preliminary sifting; in many cases this approach meant that the pages that were omitted were things like compliments slips and standard covering notes from civil servants. Even so, this approach was not one that we could take lightly and necessarily figured large in our preliminary research. Interestingly then, the general impression from our informed advisers was that, as long as the selection criteria are coherent to the main body of researchers at whom the resource is aimed, and that it is clear to the user that there may be additional, perhaps more esoteric, materials at the PRO not included, then this is a sensible (and indeed necessary) course of action. In making such a selection, one can be fairly confident in providing the resources that will satisfy the research requirements of at least ninety per cent of users. 
As I have discovered from my experiences in developing the Arden Shakespeare CD-ROM,5 satisfying the often conflicting and irreconcilable requirements of all potential users of such an electronic resource is achievable only with extremely elastic budgets and indeterminate schedules.
The issue of document selection of course raises the question: what overall percentage of potentially relevant PRO files has been selected for the CD-ROM? Unfortunately this is difficult to answer as the files have never been quantified. However, some individual estimates are that the collection includes about 95 per cent of the relevant Cabinet files (which cover policy making); about 95 per cent of Foreign Office Ambassadors’ reports and Research Memoranda; and 50 per cent of the files from the Prime Minister’s office, the Board of Trade and the Dominion Office. For the Foreign Office political files which are included, it is really impossible to tell, since there is such a huge volume of material, large portions of which are duplicates of the output from other departments. In total then, Asia: Official British Documents comprises about 3500 documents made up of 40,000 images, which by way of comparison is about ten to twenty per cent of the total amount of material in the Sanger Papers or the Lincoln Legal Papers projects.
The other important feature of the resource that we all agreed on at an early stage was the nature of the searchable textual descriptions of the documents. From their experience working on PROfiles, Michael Kandiah and Gillian Staerck knew how long it took to write and edit free text descriptions, detailing all parties to correspondence (even relatively minor officials) together with their job titles, which necessitated a considerable amount of additional research. Considering the absolute requirement that the project be developed as a sustainable model, we all realised that, in terms of both time and money, adopting the same approach as in the earlier project would not be viable. Our solution was for the textual descriptions to comprise keyword phrases, amounting to between 200 and 500 words. In addition, the textual header for each document would also include: the date of the document, or a date range where it was compiled over or related to a long period; the countries to which the document relates; the PRO reference number; and the PRO ‘class description’, ie the broad outline of the provenance of the document collection. To aid the data creation process, Clarinet Systems Ltd, the software company we were working with, built an easy-to-use data entry environment in an Access database, allowing the researchers working at the PRO to compile the metadata more efficiently. This included a pick-list for country names, and standard date formats.
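The header fields just described can be pictured as a simple record with a few consistency checks. The following is only a minimal sketch in Python, not the Access data entry environment that Clarinet built: the field names, the short pick-list of countries and the sample reference are all hypothetical, and the 200 to 500 word check simply restates the target length given above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical pick-list; the real list used for data entry is not given in this paper.
COUNTRY_PICK_LIST = {"Burma", "China", "Japan", "Malaya", "Singapore"}


@dataclass
class DocumentHeader:
    """Illustrative header for one document; all field names are assumptions."""
    pro_reference: str
    class_description: str
    countries: list[str]
    keyword_phrases: list[str]
    date_from: date
    date_to: Optional[date] = None  # set only when the document covers a date range

    def problems(self) -> list[str]:
        """Return human-readable validation problems (an empty list if none)."""
        found = []
        if not self.pro_reference.strip():
            found.append("missing PRO reference number")
        unknown = [c for c in self.countries if c not in COUNTRY_PICK_LIST]
        if unknown:
            found.append(f"countries not in pick-list: {unknown}")
        if self.date_to is not None and self.date_to < self.date_from:
            found.append("date range ends before it starts")
        words = sum(len(phrase.split()) for phrase in self.keyword_phrases)
        if not 200 <= words <= 500:
            found.append(f"description has {words} words, expected 200 to 500")
        return found
```

Checks of this kind are cheap to run at the point of data entry, which is where a pick-list and standard date formats do most of their work.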
SOFTWARE SELECTION/FEATURES OF SOFTWARE
When we came to think about what we wanted from the software that would support the data we had been talking of creating, our key requirements were that: it was already available; was already being used for publishing applications; needed minimal customization for our requirements; was easy to use; and could support multiple disks and networked environments. During the course of our investigations we discovered that most image management software turned out to be corporate document management solutions for intranets, not easily reconfigurable to a CD-ROM ‘product’. We also considered creating the resource in an SGML environment, but at the time did not feel that any of the available browsers would be appropriate for handling such a large number of images; such applications would in any event have required more customization than we wanted.
In the end we chose Clarinet Systems Limited,6 who are based in Surrey, and whose Clearview Authoring and Retrieval software has been available in a number of versions since 1991. This software has been tried and tested in a wide range of publishing applications, especially in delivering large quantities of documents for the financial services industry, in products comprising several hundred CD-ROMs, which in these cases are supplied with juke boxes of sufficient size to enable the efficient retrieval of information. In addition, the software has been used in a number of other industries and in supplying local and central government with document management solutions. Though Clearview had not been used to handle historical materials before, it was not difficult to see that conceptually and technically it would not be too different from any of the other applications for which it had been used.
What attracted us to Clearview were the following features:
- Each document, comprising any number of page images, may have one or more searchable textual descriptions attached to it, detailing the important features of the document. The facility to have multiple such sheets is useful when handling long documents whose content may be considerably diverse.
- Through the search dialog, the user can build up quite complex queries, limiting their search to any of the data fields. The software supports Boolean and truncation searching, and indexes the words or (where appropriate) phrases in the data. The user can also search by date range, and can search the text of any public and private notes that have been created at their institution (if they have access rights), though these are, of course, not indexed. This note feature can be particularly useful for facilitating the transcribing of documents, for instance.
- Usefully, Clearview can handle pdf as well as tiff, should we need that option in future, and the images can easily be manipulated, such as by rotation and enlargement, which is useful for reading marginal scribbles on manuscripts. Search results are also sortable, such as by the number of the disk on which the images reside, which will minimize disk-swapping if only a single CD-ROM drive is available.
- The resource can easily be run off a juke box, on a stand alone machine, or networked (which may be useful in a teaching lab environment for sources and methods courses); and users can print any of the images and accompanying textual descriptions (including any notes attached to a document).
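To make the search features listed above concrete, the following minimal Python sketch shows field-limited searching with Boolean combination, truncation, a date range and results sorted by disk number. It is not Clearview's implementation or API: the record layout, the trailing `*` truncation syntax and the sample records are all assumptions made for illustration.

```python
import re
from datetime import date

# Hypothetical sample records; references and descriptions are invented.
RECORDS = [
    {"ref": "EXAMPLE 1/1", "disk": 2, "date": date(1947, 5, 1),
     "description": "occupation of Japan; economic reconstruction"},
    {"ref": "EXAMPLE 1/2", "disk": 1, "date": date(1955, 3, 9),
     "description": "Malayan independence negotiations; economy"},
    {"ref": "EXAMPLE 1/3", "disk": 1, "date": date(1962, 8, 20),
     "description": "trade relations with Japan"},
]


def term_matches(term, text):
    """Match a whole word, or a word prefix when the term ends in '*'."""
    words = re.findall(r"\w+", text.lower())
    term = term.lower()
    if term.endswith("*"):
        return any(word.startswith(term[:-1]) for word in words)
    return term in words


def search(records, terms, field="description", mode="AND", start=None, end=None):
    """Return matching records, sorted by disk number to reduce disk-swapping."""
    combine = all if mode == "AND" else any
    hits = []
    for record in records:
        if start is not None and record["date"] < start:
            continue
        if end is not None and record["date"] > end:
            continue
        if combine(term_matches(term, record[field]) for term in terms):
            hits.append(record)
    return sorted(hits, key=lambda record: record["disk"])
```

For example, `search(RECORDS, ["japan", "econom*"])` combines an exact word with a truncated one, while passing `start=date(1950, 1, 1)` restricts the same query to later documents.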
WORKFLOW MANAGEMENT AND PLANNING
Initial discussions about the project took place in the summer of 1995, which led to a formal proposal being presented to the Routledge publishing committee in September 1996. We then spent over a year piecing together the practicalities of the project and how to manage it. Unlike earlier projects which we had worked on which generally involved just two parties for the greater part of the projects’ duration, ie Routledge and a software developer, Asia was somewhat more complicated. Throughout the course of the project there have been four parties working together to establish the most efficient way of doing the work: Routledge; Clarinet; the academics; and the PRO (which involved both the publishing and reprographics departments). In addition, there has been on-going liaison with the image digitization company to make sure the way we were proposing to send the materials to them was acceptable and unambiguous. It has been essential for the smooth running of the project that as much as possible all parties were comfortable with and agreed to the schedule and the day-to-day practicalities, and that we mutually understood any constraints within which we were operating. Certainly, none of us would claim it has been without a hitch, but I have been especially pleased that the careful preliminary planning has meant that, having started the project for real in January 1998, together we managed to keep to budget and (largely) to schedule, and that it has been fairly straightforward. The product was published two months later than planned, in April 1999.
In practice, one of the most important features of the planning was that we spent two years talking about the organisational practicalities and the contracts to support them before embarking on the project for real. This meant that we all had a good understanding of what we were trying to achieve and has minimized potential sources of conflict and disagreement between parties. It also meant that we weren’t designing the project as we went; the detailed initial talking essentially cost only our respective salaries and some small scale trials; so, by the time the project started formally and larger sums of money had to be spent, the bulk of the practicalities had been sorted out and agreed. This has certainly led to a much more streamlined project, and one which can be more easily replicated in future.
IMAGE CAPTURE
How did we select the scanning company? We had three (potentially incompatible) considerations and requirements driving our choice:
- Cost
- Quality and reliability
- File numbering capabilities
An absolute requirement of the Clearview software is that the images for each document bundle are numbered with sequential numerical file extensions, eg .001, .002, .003 etc. Several of the (admittedly smaller) scanning companies we approached were not suitable because their equipment could not meet this requirement.
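The numbering convention is simple enough to check mechanically. The sketch below, in Python, generates the expected file names for a bundle and reports missing or duplicated page numbers in a delivered batch. The bundle name is hypothetical, and this is an illustration of the convention rather than a tool that was used on the project.

```python
from collections import Counter
from pathlib import PurePath


def expected_names(bundle, pages):
    """File names for a bundle of `pages` images: bundle.001, bundle.002, ..."""
    return [f"{bundle}.{number:03d}" for number in range(1, pages + 1)]


def sequence_problems(filenames):
    """Return (missing, duplicated) page numbers found in the file extensions."""
    numbers = [int(PurePath(name).suffix[1:]) for name in filenames]
    if not numbers:
        return [], []
    counts = Counter(numbers)
    missing = [n for n in range(1, max(numbers) + 1) if n not in counts]
    duplicated = sorted(n for n, count in counts.items() if count > 1)
    return missing, duplicated
```

A batch whose extensions run .001, .003, .003 would be reported as missing page 2 and duplicating page 3.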
In the end, we chose Portsmouth-based IMAGinations7 on the basis of recommendations from other publishers and software companies, and after comparing their quality, service and cost with a number of others. Throughout the project, they have been extremely helpful and accommodating.
Once our academic editors had selected the materials we obviously needed to capture the document image electronically. Three routes were open to us:
- Photocopy then scan
- Microfilm then scan
- Scan straight from the document
As the PRO Reprographics Department is in the process of increasing its capabilities in document scanning, the last option was a possibility, and certainly the most desirable, but this service was not going to be available for the volume of documents that we required and within the timeframe we needed. Both of the other two options are simple and straightforward, and we had already had experience of managing the process of creating images from microfilm from our work in developing the Arden Shakespeare CD-ROM. Still, in terms of both quality and cost, trials suggested that the best, most consistent results across a number of batches were obtainable from scanning from photocopies. Even so, there were four features of the process that we had not expected:
- We knew that it was neither practical nor realistic to change the working practices of the PRO Reprographics Department for this special job, and all parties took part in very useful discussions to make sure that, as much as possible, our needs were met. Nevertheless, as standard working practice is to copy a document from its beginning to its end, the photocopier delivered the pages to us in reverse order, and they had to be sorted by hand; in contrast, the scanner at the PRO produces its output in the correct order. Such a consideration is trivial for small data samples, but has the potential to become a major headache quickly when handling tens of thousands of pieces of paper.
- All documents are copied with a standard header slip detailing the PRO document reference number; this is of course essential for identifying the images both during the development process, and also for users of the finished resource, especially should they print any of the material. However, the use of these header slips meant that none of the copies were A4; they were all copied as A3 with a large amount of space down one side. One of the implications of this became clear when we sent the first batch of copies for scanning. The average file size was 162k, which suggested that the 40,000 images would need to be accommodated on ten disks, which was rather more than we planned, and certainly more than most users would find acceptable for the amount of material in the collection. Clearly, we knew that we had to crop dead space on the images to reduce their size. Initially we thought that, due to the necessary irregularities in document size and positioning on the sheets, and due to the complete absence of any fixed ‘anchor’ in the image such as a strong vertical black line, it was not going to be possible to automate the cropping of the images, and that all would have to be done by hand. This would have had the effect of more than doubling the cost of the image capture. In the end, after consulting with IMAGinations, automation of the cropping was achievable, reducing the file size by over 40 per cent, which kept the number of disks needed to supply all the material down to six.
- The A3 photocopies were sent for scanning as soon as they were received, as it was originally assumed that the main checking stage would focus on the digital images. However, once these were delivered to us it became apparent not only that a number of pages were missing or duplicated, but also that it was more practical to check the images from print-outs of them rather than on screen. This had two consequences: first, most of the image files would need to be renamed; and second, we had to pay for the supply of 40,000 print-outs which also carried details of the respective file names. In retrospect, it would have been more sensible to do the detailed image checking from the A3 originals; though subsequent checking would also have been necessary, there would have been less of a problem in renumbering the files.
- The renumbering of the files was a large issue, the scale of which had not been predicted. Using large spreadsheets detailing the existing file names, we spent several weeks checking and creating the correct file names in the spreadsheet. These tables were then used to generate batch files to do the renumbering. However, all of the images were on CD-ROM (totalling 4.5GB) and needed to be loaded onto a network environment before this could be done. Even when this process was complete, there were still some remaining difficulties; in trying to keep together all the documents from the same PRO class on the same final disk there was necessarily a great deal of sorting and resorting to create six chunks of data, each under 650MB so as to allow enough space for the program and index files. All of these processes necessitated constant updating of the Access database containing the metadata, and the Help file.
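The arithmetic and housekeeping behind the last three points can be sketched briefly in Python. The figures of 40,000 images, 162k per image, a 40 per cent reduction and 650MB disks come from the account above; everything else is an assumption made for illustration, including decimal units, the temporary file names, the use of the DOS `ren` command and the first-fit packing strategy. In practice the sorting of classes onto disks was done by hand.

```python
CD_CAPACITY_KB = 650_000  # 650MB per disk, assuming decimal units


def disks_needed(images, avg_kb, reduction_pct=0, reserved_kb=0):
    """Estimate how many disks a set of images needs after optional cropping."""
    total_kb = images * avg_kb * (100 - reduction_pct) // 100
    usable_kb = CD_CAPACITY_KB - reserved_kb
    return -(-total_kb // usable_kb)  # ceiling division


def rename_commands(rows):
    """Turn (old_name, new_name) rows, eg from a spreadsheet, into batch lines.

    Files pass through temporary names so that no rename overwrites a file
    that is itself still waiting to be renamed.
    """
    changes = [(old, new) for old, new in rows if old != new]
    unchanged = {old for old, new in rows if old == new}
    targets = [new for _, new in changes]
    if len(set(targets)) != len(targets) or unchanged & set(targets):
        raise ValueError("two files would end up with the same name")
    temps = [f"~{i:05d}.tmp" for i in range(len(changes))]
    first = [f'ren "{old}" "{tmp}"' for (old, _), tmp in zip(changes, temps)]
    second = [f'ren "{tmp}" "{new}"' for (_, new), tmp in zip(changes, temps)]
    return first + second


def pack_classes(class_sizes_kb, usable_kb):
    """First-fit decreasing: place each class whole on the first disk with room."""
    disks = []  # each entry is [free_kb, [class names]]
    for name, size in sorted(class_sizes_kb.items(), key=lambda item: -item[1]):
        if size > usable_kb:
            raise ValueError(f"{name} does not fit on a single disk")
        for disk in disks:
            if disk[0] >= size:
                disk[0] -= size
                disk[1].append(name)
                break
        else:
            disks.append([usable_kb - size, [name]])
    return [names for _, names in disks]
```

With these assumptions, `disks_needed(40000, 162)` gives ten disks and `disks_needed(40000, 162, reduction_pct=40)` gives six, matching the counts reported above. The packing function assumes every class fits on one disk, which would not hold for a very large class.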
FUTURE
Perhaps for some specific types of electronic data such as time-sensitive information, the market for it is beginning to settle down and web-based models are emerging as standard. This is certainly not the case though with more specialized resources and, as Carol Tenopir and Lisa Ennis have noted in a recent review of the electronic holdings of member institutions of the Association of Research Libraries, “despite dire predictions, CD-ROM continues (for now) to be a popular option in libraries”.8 For a resource like Asia: Official British Documents, or indeed any image edition of historical materials, CD-ROM still has clear benefits to users compared with web delivery. Download times are speedier and more reliable; the software solutions presently available are perhaps more flexible and well-suited to handling this sort of content; and institutions will retain archival control of the content in a way which is impossible with web-delivered and remotely-hosted data.
This is not to say, though, that this will always be the case. As content creators and distributors, we have to be alert to new ways of providing the data in ways in which users want it, as long as it is economic to do so. In time, if there is sufficient demand, it may be desirable to produce country-specific sub-sets of the data, perhaps for individuals or smaller institutions. This could be done using the same software, and we have certainly had this in mind as the resource has been put together, but could equally well be delivered through the Internet (eg using Clarinet’s ClearNet), or supplied as tiff image files for loading on an institution’s own document management system, or with the metadata headers converted to SGML to be loaded onto a local server; we will certainly be exploring the issues in converting them to either TEI/MEP or Dublin Core header form in the coming months.9
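As an indication of what a conversion to Dublin Core might involve, the following Python sketch maps one of our document headers onto elements from the Dublin Core element set. It is an assumption-laden illustration rather than a description of any conversion we have carried out: the record keys, the `metadata` wrapper element, the choice of XML rather than SGML, and the mapping itself (for instance, class description to `source`) are all hypothetical. Only the element names and the namespace URI are taken from Dublin Core.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def to_dublin_core(header):
    """Serialize a header dictionary as Dublin Core elements (illustrative mapping)."""
    root = ET.Element("metadata")  # hypothetical wrapper element

    def add(element, value):
        ET.SubElement(root, f"{{{DC_NS}}}{element}").text = value

    add("identifier", header["pro_reference"])
    add("source", header["class_description"])
    add("date", header["date"])
    for country in header["countries"]:
        add("coverage", country)
    for phrase in header["keyword_phrases"]:
        add("subject", phrase)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample header; the reference and contents are invented.
SAMPLE = {
    "pro_reference": "EXAMPLE 1/1",
    "class_description": "Example class",
    "date": "1947-05-01",
    "countries": ["Japan"],
    "keyword_phrases": ["occupation policy", "economic reconstruction"],
}
```

Because each keyword phrase becomes its own `subject` element, the phrase boundaries chosen by the academic editors would survive the conversion.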
This paper has assumed throughout that it is inconceivable that all possible archival digitization projects can be grant funded and that it is important to try to create sustainable models for an ongoing and evolving programme of such projects. Whether we work in a formally commercial sector or are answerable to grant agencies, we all face the ‘shrink-wrap imperative’ of economic accountability, and it is impossible to develop any electronic resource that will provide everything to all possible users on its first release, though of course anyone who tries to develop such resources can and should have users’ ideals in mind when crafting them. To have a chance of success though, the team must also be aware of where compromises can and can’t be made. Discussions at forums like Digital Resources for the Humanities and the Association for Computing in the Humanities, and with other academics and librarians, show that we cannot expect consensus among the diverse interests and requirements that are represented. Projects should certainly be viable from their inception, but may well evolve over time and respond to user feedback in ways that are generally impossible with print media.
We believe that projects like Asia: Official British Documents and the work of the MEP are exciting explorations of the issues involved in establishing user needs and the feasibility of making archives available digitally. Image data capture costs for electronic resources are not that different from those for microfilm, but the added components of compiling the metadata and licensing or modifying the software make up the greatest proportion of any budget. As a result, the shovelware approach of microfilm is unlikely to be easily replicated in an electronic environment.
We must not forget that microfilm has been an extremely successful delivery medium for historical materials due to its comparatively low cost and the minimal technology needed to access it. Current experiments in making digital resources also need a sustainable economic model for their creation; otherwise the projects available will be limited to those prestigious collections which can attract large amounts of grant dollars to create them, but which may not actually be the most intellectually interesting or useful. If that turns out to be the case, we will all have failed to realise one of the great hopes of the digital revolution, and access to information will be little better than it is today.
NOTES
[This article was written in 1998 and most of the web sources are no longer live. I've located any current versions and added a few extra links to fill in the gaps.]
1. The MEP home page is at <http://mep.cla.sc.edu/MEP-Home.HTM>. See also David R. Chesnutt’s paper at ACH/ALLC97, ‘The Model Editions Partnership–Towards a National Database’, <http://www.qucis.queensu.ca/achallc97/papers/p036.html> [now at <http://xml.coverpages.org/chesnuttACH97.html>, and see also David R. Chesnutt, ‘The Model Editions Partnership: "Smart Text" and Beyond’, D-Lib Magazine, July/August 1997 <http://www.dlib.org/dlib/july97/07chesnutt.html>].
2. Details of the project can be found at <http://www.nyu.edu/projects/sanger/>. See also Cathy Moran Hajo and Esther Katz (1998) ‘The Margaret Sanger Papers Project: A Documentary Edition in the Digital Age’, Connect, Spring issue, <http://www.nyu.edu/acf/pubs/connect/spring98/HumSangerSp98.html> [now at <http://www.nyu.edu/its/pubs/connect/archives/98spring/hajosanger.html>].
3. Martha L. Benner (1997) ‘The Lincoln Legal Papers and the New Age of Documentary Editing’, Computers and the Humanities, 30, 365-372, and see project web site at <http://www.fgi.net/LincolnLegalPapers/> [now at <http://www.papersofabrahamlincoln.org/>].
4. <http://www.pro.gov.uk/bookshop/electron.htm>.
5. See my response to various reviews, posted to Humanist, 12, No. 159, 5th August 1998 <http://lists.village.virginia.edu/lists_archive/Humanist/v12/0156.html>.
6. <http://www.clarinet.co.uk/>.
7. <http://dspace.dial.pipex.com/imaginations/>.
8. Carol Tenopir and Lisa Ennis (1998) ‘The Digital Reference World of Academic Libraries’, Online (July issue) <http://www.onlineinc.com/onlinemag/OL1998/tenopir7.html>.
9. For information on the development of TEI headers for the MEP, see David R. Chesnutt, ‘The Text Encoding Initiative and The Model Editions Partnership’, TEI Tenth Anniversary User Conference, November 1997, <http://www.stg.brown.edu/webs/tei10/tei10.papers/chesnutt.html> [now at <http://www.stg.brown.edu/conferences/tei10/tei10.papers/chesnutt.html>].