The heritage and archaeology sectors are producing increasing volumes of very diverse data through extensive use of digital technologies and data analysis. Given the processes and the uniqueness of the objects and sites studied, such data is often derived from non-repeatable interventions. This, combined with its heterogeneity, makes the data particularly fragile and subject to obsolescence. Under these conditions, management of data is complex.
At the end of March 2020, SSHOC joined forces with the European Research Infrastructure for Heritage Science (E-RIHS) and the Saving European Archaeology from the Digital Dark Age COST Action (SEADDA) to co-host a webinar on the use and re-use of archaeological and heritage data. The purpose was to present best practice and guidelines that are relevant to the sector and transferable to other contexts.
The event was chaired by Julian Richards (University of York) and brought together experts with diverse backgrounds and expertise: Holly Wright (University of York), Jessica Hendy (University of York), Scott Orr (UCL) and Alejandra Albuerne (UCL).
FAIR in heritage science and archaeological data
The FAIR principles (Findable, Accessible, Interoperable and Reusable) are broadly accepted for research data, but implementing them can be tricky. Holly Wright presented the data curation policy framework recently developed by E-RIHS for heritage science data, which provides guidelines[1] for meeting the FAIR principles. Making data open is not enough to ensure its use and re-use: it is also necessary to understand how data is being reused from both a qualitative and a quantitative perspective. This has been a key objective for E-RIHS.
The guidelines are organised around the four components of FAIR:
Data archiving for re-use in heritage and archaeological science
One of the key strategies for enabling the re-use of data is effective data archiving. Jessica Hendy reflected on her work on molecular biology in archaeological science to offer her perspective on the practice of data archiving, discussing the possibilities and challenges it brings.
The benefits of archiving data are many. It allows interpretations to be scrutinised thoroughly and supports quality assurance. It enables the replication of data analysis strategies during peer review and afterwards. It also provides a means of long-term data storage that is not tied to a specific institution. Data archiving is key to allowing future research to reanalyse existing data when new strategies are developed. There is also an increasing demand for transparency on how data was generated, both in the lab and computationally, which is promoting the recording of laboratory protocols using protocols.io and of computational processing using GitHub or Bitbucket.
Nonetheless, data archiving also brings certain challenges. For example, datasets can be massive, so only institutions with sufficient computational support can analyse the available data. This can lead to those institutions dominating the field. In addition, data in heritage science and archaeology can be highly specialised, requiring specialist knowledge to interpret it critically. To address these challenges, Jessica Hendy suggested that data exploration should be made more easily accessible, so that it is not of interest to only a few research groups. This can be done, for example, through online processing capacity. In addition, since there remains a lack of awareness and communication about what data is produced and how it is stored, it is necessary to raise awareness among research partners and collaborators of the importance of sharing data and respecting community standards.
Exploring best practice and tools for use and re-use
The last part of the webinar, delivered by Scott Orr, offered hands-on advice in the form of best practices and tools for planning the re-use of scientific data in heritage and archaeological contexts. He suggested the following question to guide researchers in planning data management: if you come back to the data in a few years’ time, what will you want to know about it in order to interrogate it again?
Many points were discussed during this part with the following highlights:
Learning from each other
All presentations were followed by lively discussions in which participants shared their knowledge and experience. There was ample discussion about the CARE principles for Indigenous Data Governance and how they can be generalised to other contexts where privacy needs consideration. Other topics of discussion were the role of DOIs and URIs in linking data to sites and objects, and the challenges posed by the speed of data publication and by data embargoes in different fields of heritage and archaeological science. Finally, Ron Dekker, SSHOC coordinator and member of the EOSC Executive Board, shared information on the active role of SSHOC in supporting the SSH community, and underlined the specific need for SSH vocabularies.
E-RIHS wants to find out more about the different methodologies and workflows for data in heritage science and is calling for new case studies that will help it better understand how data is being used in the sector. If you have any suggestions, you can get in touch with Holly Wright.
In the coming months SSHOC will again join forces with key actors in heritage science data to prepare a workshop where leading experts from the field will discuss further aspects of the management of heritage data. Follow SSHOC event announcements on the SSHOC website or the SSHOC Twitter account so as not to miss it, or simply sign up for the SSHOC newsletter.
[1] The guidelines will be published this summer on the E-RIHS website, so keep an eye on E-RIHS updates.