University of Edinburgh: Digital archive captures health websites

The project will preserve 10,000 official and unofficial websites relating to health to create a resource available to researchers and the public.

The initiative, called ‘The Archive of Tomorrow: Health Information and Misinformation in the UK Web Archive’, has been launched by the National Library of Scotland and is funded by the Wellcome Trust.

Edinburgh’s Library and University Collections is one of the project partners, alongside Cambridge University Library and the Bodleian Libraries, Oxford. The British Library will also play a key role in the project.

The initiative will examine how websites and other online information about health are archived.

Digital resource
A new digital resource will be developed to collate and record online information on health issues – including representation of diverse and otherwise un-collected sources.

One of the considerations of the project is how to ethically capture online misinformation in relation to Covid-19.

Other aspects will include exploring best practice in preserving information published on the web, building a research network across a range of disciplines, and supporting different research approaches.

The archive will help users make sense of events and improve the ability to interrogate factual information and misinformation, organisers say.



The pandemic has contributed to a global crisis of information versus misinformation which has played out mostly online. Government and medical websites have changed on a daily basis as new information emerges, and there has been a massive proliferation of opining on social media and other online publications. Health advice, data and scientific evidence have been contested, revised, used and misused with dramatic and sometimes tragic consequences, and yet the digital record of this is fragile and difficult to access. How easy will it be in a few years’ time to source the tweets, blogs and news stories from the past 18 months and will we be able to make sense of it all? These are the questions we’ll be asking.

Joseph Marshall
National Library of Scotland’s Associate Director of Collections Management

The project will include the curation of a new collection of websites within the UK Web Archive – a partnership of UK Legal Deposit Libraries – under the theme of ‘Health and Misinformation’.

The project will help build a research network tackling questions such as techniques for data analysis and how to make web archives more representative, inclusive, and open for health research.

The Centre for Research Collections has well-established experience preserving collections to ensure their integrity and authenticity. As more information sources move online, we continue to commit to the same values and standards. This project provides an opportunity to explore best practice in preserving information published on the web and how to support different research approaches.

Daryl Green
Head of Special Collections and Deputy Head of the Centre for Research Collections (CRC) at the University of Edinburgh