Guidelines for the Bentley Historical Library Web Archives
The Bentley Historical Library was established in 1935 by the University of Michigan Regents to carry out two functions: to serve as the official archives of the University and to document the history of the state of Michigan and the activities of its people, organizations and voluntary associations. The University Archives and Records Program (UARP) and Michigan Historical Collections (MHC) are committed to the preservation of records and papers of historical significance, regardless of format.
Given the ubiquity of online resources at the University of Michigan and among individuals and organizations across the state, the Bentley Historical Library sought an efficient and cost-effective means of preserving websites for future research and study. After evaluating several service providers, the Bentley Library subscribed to the California Digital Library's Web Archiving Service (WAS) on July 1, 2010. (See the WAS homepage for an overview of the service as well as specific information for researchers and webmasters of archived sites.)
This subscription-service model divides responsibilities for the development and support of web archives between the Bentley Historical Library and the California Digital Library. While no active participation is required of content owners, several steps may be taken to ensure that websites are preserved as completely as possible.
The Bentley Historical Library will...
- Identify, appraise, and select websites that reflect the mission and collecting interests of the University Archives and Records Program and the Michigan Historical Collections.
- Organize and manage archived websites to complement current holdings in the Bentley Historical
Library.
- Provide descriptions and contextual information for materials.
- Mediate access (via metadata, catalog records, and an access interface) to facilitate the search
and retrieval of content.
- Respect the intellectual property rights of owners:
  - Distinguish 'archived' sites from 'live' content with a prominent banner and statement
    at the top of each preserved web page.
  - Embargo archived content for six months after its capture.
  - Suppress content from public view or refrain from website preservation at the request of
    content owners.
- Reach out to webmasters when website design or configurations pose issues for the accurate
capture of content.
- Promote the use and development of new features for the web archives.
The California Digital Library will...
- Maintain the Heritrix web crawler, a computer program (or 'robot') that browses websites and
saves a copy of all the content and hypertext links it encounters. By default, Heritrix will not
degrade website performance, and WAS will suspend harvesting if technical difficulties are detected
on a target server.
- Securely store archived content in a digital preservation repository at the San
Diego Supercomputer Center.
- Host publicly available content from the University of California Office of the President Data
Center in Oakland, CA and resolve associated service outages or technical issues.
- Offer general technical assistance and customer support.
Content owners will be able to...
- Rely upon the Bentley Historical Library to identify, preserve, and provide access to multiple versions of select websites over time.
- Allow the WAS web crawler to preserve a website by including the following exception in the
site's robots.txt file:
    User-Agent: cdlwas_bot
    Disallow:
- Inform the Bentley Historical Library if a website is scheduled to go online, be decommissioned, or undergo significant changes.
- Request that archived content be suppressed from public view after captures have been completed.
- Follow best practices for the design and maintenance of websites (cf. UARP's
Guidelines for Web-Disseminated Records and/or Google's Webmaster
Guidelines).
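For webmasters who want to confirm that a robots.txt file admits the WAS crawler, the check can be sketched roughly as follows using Python's standard urllib.robotparser module. This is only an illustration: the sample file contents, the example.edu address, the "otherbot" agent, and the crawler_allowed helper are assumptions made for the example and are not part of WAS. Only the cdlwas_bot user agent and its empty Disallow rule come from the exception described above.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration; it parses the text locally
    and does not contact any server.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Assumed sample file: all crawlers are blocked except cdlwas_bot,
# whose empty Disallow rule means "nothing is disallowed".
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-Agent: cdlwas_bot
Disallow:
"""

if __name__ == "__main__":
    page = "https://www.example.edu/about/"  # placeholder URL
    for agent in ("cdlwas_bot", "otherbot"):
        verdict = "allowed" if crawler_allowed(SAMPLE_ROBOTS_TXT, agent, page) else "blocked"
        print(f"{agent}: {verdict}")
```

An empty Disallow line places no restriction on the named crawler, so the exception can sit alongside stricter rules for other robots.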
Please note: the Bentley Library may not be able to preserve the exact form, functionality, and content of sites as they appear on the live web. The following types of content present significant issues for capture and/or display:
- Dynamic scripts or applications such as JavaScript or Adobe Flash
- Streaming media players with video or audio content
- Password-protected material
- Forms or database-driven content that requires interaction with the site
- Exclusions specified in robots.txt files
Please send your comments, questions, and suggestions about the Bentley Historical Library Web Archives
to bhlwebarchive@umich.edu.