As web technologies continue to develop, websites require increasing time and effort to maintain and to keep running securely as they age. Older websites that are not maintained can become unusable, and even hazardous to users, editors or the host institution.
Archiving older websites ensures that their contents and interfaces remain accessible in the long term, and that their data is stored safely for future use.
A limited time frame for keeping a site live (usually 5 years), followed by an appropriate archiving process, is a standard part of data management for projects funded by public grants. Our website archiving processes follow guidance from the UK Web Archive and The National Archives.
Our archived websites are available in two different versions:
The first version is a Wayback Machine snapshot. At the point of archiving, the website is submitted to the Wayback Machine, which uses its own capture method to create a snapshot of the site. These snapshots are stored within the Wayback Machine and can be viewed online using its built-in viewer. Wayback Machine snapshots are quick and easy to view, but cannot be downloaded for offline use. Because these snapshots are not created by the Digital Humanities Team, we cannot control the capture process, and elements may be missing from the snapshot.
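As an illustration, the Internet Archive offers a public "availability" API that returns the closest stored snapshot for a given address. The short Python sketch below queries it; the address used here is a placeholder, not a real archived project site:

```python
import json
from urllib.request import urlopen

# Query the Internet Archive's public availability API for the
# closest Wayback Machine snapshot of a given URL.
# "example.ex.ac.uk" is a placeholder, not a real archived site.
api_url = "https://archive.org/wayback/available?url=example.ex.ac.uk"

with urlopen(api_url) as response:
    data = json.load(response)

snapshot = data.get("archived_snapshots", {}).get("closest")
if snapshot:
    print("Snapshot URL:", snapshot["url"])        # opens in the built-in viewer
    print("Captured on: ", snapshot["timestamp"])  # format: YYYYMMDDhhmmss
else:
    print("No snapshot found for this URL.")
```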
The second version is a captured copy of the website which has been created and checked by the Digital Humanities Team and uploaded to several repositories for secure storage. These archived websites are usually stored as a single .warc or .wacz file, and are available for download from Open Research Exeter (ORE, the University of Exeter's institutional repository), GitHub and Zenodo.
To view these files, you will need a free viewer such as ReplayWeb.page, which allows you to browse the archived version much as you would the original website.
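If you would rather inspect a downloaded archive programmatically, the open-source warcio Python library can read .warc files directly. A minimal sketch, assuming a local file named site-archive.warc.gz (a placeholder filename for illustration):

```python
from warcio.archiveiterator import ArchiveIterator  # pip install warcio

# List the captured pages in a downloaded WARC file.
# "site-archive.warc.gz" is a placeholder filename.
with open("site-archive.warc.gz", "rb") as stream:
    for record in ArchiveIterator(stream):
        # 'response' records hold the archived HTTP responses
        # (pages, images, stylesheets, and so on).
        if record.rec_type == "response":
            uri = record.rec_headers.get_header("WARC-Target-URI")
            ctype = record.http_headers.get_header("Content-Type")
            print(uri, "-", ctype)
```

A .wacz file is a ZIP package that bundles WARC data with indexes and metadata, so it can be unzipped to reach the underlying .warc.gz files if needed.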
The archived versions contain as much of the content, structure and appearance of the original live website as possible. However, the processes and tools available for copying sites are not always able to capture the full content of each website. Search features and interactive elements such as embedded maps are particularly difficult to capture using current archiving tools and may not be functional in all archived copies.
Archives created using the Wayback Machine may contain slightly different content from those captured by the Digital Humanities Team, because different capture tools are used.
Where specific content or data could not be captured as part of the site, it is sometimes possible to archive it as a separate dataset alongside the website files. Where such datasets exist, they are included in the ORE, Zenodo and/or GitHub repositories alongside the web archive file.
We have deposited archived copies in our institutional repository and in trusted third-party repositories. Please see each repository's retention policy for details.
For more help, please contact the Digital Humanities Team.