Lafayette College Web Archive · Special Collections & College Archives

Mission Statement

As an extension of its overall mission, the College Archives gathers web content made available via the institution’s public website. This includes administrative and academic webpages as well as publications, policies, events, and news of the College community as presented online.

Acquisition Method

The College Archives uses the subscription service Archive-It and its open source software Heritrix to crawl Lafayette College’s web site and harvest web pages. The crawler captures each web page by taking a snapshot and storing a copy in the Internet Archive, where it can be accessed through Archive-It and the Wayback Machine. Captured files are stored in the WARC (Web ARChive) file format, which is the Library of Congress’s preferred archival format for web sites harvested in bulk.
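
For readers curious about what a WARC file holds, the following is a minimal sketch of splitting one uncompressed WARC record into its version line, named header fields, and content block. It is illustrative only and is not part of the College Archives workflow: the function name `parse_warc_record` and the sample record (including its URL, date, and record ID) are hypothetical, and real WARC files are typically gzip-compressed and hold many records.

```python
from typing import Dict, Tuple


def parse_warc_record(raw: bytes) -> Tuple[str, Dict[str, str], bytes]:
    """Split one uncompressed WARC record into (version, headers, content)."""
    # The header block ends at the first blank line (CRLF CRLF).
    head, sep, rest = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no header terminator found")
    lines = head.decode("utf-8").split("\r\n")
    version = lines[0]  # for example "WARC/1.0"
    if not version.startswith("WARC/"):
        raise ValueError("not a WARC record")
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        # Split on the first colon only; values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    # Content-Length gives the size of the content block in bytes.
    length = int(headers["Content-Length"])
    return version, headers, rest[:length]


# Hypothetical sample record, written by hand for illustration only.
body = b"<html>Hello</html>"
sample = (
    b"WARC/1.0\r\n"
    b"WARC-Type: resource\r\n"
    b"WARC-Record-ID: <urn:uuid:00000000-0000-0000-0000-000000000000>\r\n"
    b"WARC-Target-URI: https://www.example.edu/\r\n"
    b"WARC-Date: 2000-03-02T00:00:00Z\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n"
    b"\r\n" + body + b"\r\n\r\n"
)
version, headers, payload = parse_warc_record(sample)
```

Each record carries its own capture date and target URL in the header fields, which is what allows a single file to preserve many pages harvested in one crawl.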

Crawl Scope & Limitations

Our current web archiving program crawls publicly available web content that is part of the Lafayette College web domain. File formats and types captured include HTML, JavaScript, PDFs, embedded images, videos, and audio. The College’s web site is captured several times a year, and web pages that change often (e.g., calendar events) are captured more frequently.

Many external sites that are administered by Lafayette College and linked from its web site are not captured. Limited captures are available of the College’s external athletics site and of the College’s primary social media pages on Facebook, Twitter, Flickr, and Instagram. Sites belonging to other organizations (e.g., CNN, The Chronicle of Higher Education, NASA) are not captured. Content not crawled also includes digital repository collections, databases, streaming media, and password-protected sites.
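
The domain rule described above can be pictured with a short sketch. This is only an illustration of the idea of "in scope" versus "out of scope" URLs; it is not how Archive-It scoping is configured, and it does not model the limited captures of external athletics and social media pages. The function name `in_scope` is hypothetical, and the domain `lafayette.edu` is an assumption taken from the contact address on this page.

```python
from urllib.parse import urlsplit

# Assumption: the College's web domain is lafayette.edu
# (inferred from the archives@lafayette.edu contact address).
IN_SCOPE_DOMAIN = "lafayette.edu"


def in_scope(url: str) -> bool:
    """Return True if the URL's host is the domain itself or a subdomain of it."""
    host = (urlsplit(url).hostname or "").lower()
    return host == IN_SCOPE_DOMAIN or host.endswith("." + IN_SCOPE_DOMAIN)


# Hypothetical example URLs.
college_page = in_scope("https://www.lafayette.edu/news/")
other_org = in_scope("https://www.cnn.com/")
```

Matching on the full host name, rather than on any URL that merely contains the domain text, keeps look-alike hosts from being treated as part of the College's site.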

Access & Use

Researchers can access Lafayette College’s web archive by using the search box at the top of this page or through the College’s collection page. Content is searchable by keyword or specific URL. Results can be narrowed with advanced search options, including file format and capture date range. Within each result, a Show All Captures link provides a chronological list of captures for that specific URL within the Wayback Machine. Captured sites are available to view in the Wayback Machine within 24 hours after a crawl has finished. Full-text searching is available 7 days after the crawl has been completed, tested, and saved.

Researchers may also click on this link to browse all captures of the Lafayette College website from March 2, 2000 through the most recent crawl.
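
Captures in the Wayback Machine are identified by a 14-digit timestamp (year, month, day, hour, minute, second). As a rough sketch of how a capture date range narrows a list of captures, the example below parses such timestamps and keeps those that fall between two dates. The function names and the sample timestamps are hypothetical and do not describe actual captures in this collection.

```python
from datetime import date, datetime
from typing import Iterable, List


def parse_capture_timestamp(ts: str) -> datetime:
    """Convert a 14-digit Wayback-style timestamp (YYYYMMDDhhmmss) to a datetime."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S")


def captures_in_range(timestamps: Iterable[str], start: date, end: date) -> List[str]:
    """Keep the timestamps whose calendar date falls within [start, end]."""
    return [
        ts for ts in timestamps
        if start <= parse_capture_timestamp(ts).date() <= end
    ]


# Hypothetical capture timestamps, for illustration only.
sample = ["20000302000000", "20150615123000", "20231101080910"]
recent = captures_in_range(sample, date(2015, 1, 1), date(2023, 12, 31))
```

Because the timestamps sort chronologically as plain text, a list of captures for one URL can also be ordered without parsing each value.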

Rights Management

The Lafayette College Web Archive is a collection of the institution’s publicly available web content. For inquiries relating to use, reproduction, and copyright, please contact archives@lafayette.edu.

Questions or comments?

Are you a member of the Lafayette College community whose public web content on the College’s web site is not currently captured by our archiving program?

Please contact us at archives@lafayette.edu.