On May 20th 2021, SSHOC hosted a Roundtable of Experts for Data Citation, to stimulate discussion on data citation in the Social Sciences and Humanities (SSH). The session was led by CNRS and had around 30 participants, including invited speakers from UGOE, CLARIN, CNR/ISTI, the Turing Institute, the Observatory of Paris - PSL Research University, Vienna - RDA data citation WG, OpenAIRE and CODATA.
This event followed on from discussions that began in November at “Realising the European Open Science Cloud”, a joint event organized by SSHOC, FREYA, and EOSC-hub. The session focused on data citation, with speakers from SSHOC and beyond discussing their approaches and experiences.
After an inventory of SSH citation practices, we began developing a prototype to implement what we called “FAIR SSH Data Citation”. Based on that work, we drafted a first set of recommendations about data citation, adapted to the specific needs of SSH. For these discussions we invited experts from well-known organizations (e.g. RDA, OpenAIRE, CODATA, Turing Institute, Observatory of Paris) to hear their thoughts and feedback on the work of our task and on data citation in SSH more generally.
The session began with a brief contextual presentation by Nicolas Larrousse (CNRS), who also presented the Recommendations for FAIR Data Citation in the SSH. He pointed out that, although there have been many initiatives to standardize data citation practices, many communities of practice have not yet harmonized theirs.
Following this introduction, Carlo Maria Zwolf (Observatory of Paris) presented the VAMDC, a world-renowned e-infrastructure for Astrophysics data. Although VAMDC provides a reliable mechanism to generate a citation (and therefore give credit to the author), it does not capture the context of a citation or, put another way, the intention behind it. For instance, it cannot answer the question: “was the citation made in a positive or negative way?”
Understanding the intention behind a citation is crucial for scientific reasons. There is therefore a need to give the community the means to define the ‘role’ of cited data in a machine-actionable way. This topic was presented at a Birds of a Feather (BoF) session during the 17th RDA plenary conference. The BoF then discussed what kind of annotation would be needed to express, in a machine-actionable way, the reason why A cites B.
The planned outcome is the creation of an RDA Interest Group to discuss this matter in more detail (e.g. granularity and curation of annotation).
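As an illustration of what such an annotation might look like, the following minimal Python sketch attaches a role to a citation link and serializes it. The role vocabulary, field names and identifiers are assumptions made for this example; they are not an outcome of the BoF, nor part of VAMDC.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class CitationRole(Enum):
    """Hypothetical vocabulary of reasons why A cites B (not a standard)."""
    SUPPORTS = "supports"
    DISPUTES = "disputes"
    REUSES = "reuses"
    MENTIONS = "mentions"


@dataclass(frozen=True)
class CitationAnnotation:
    citing: str          # identifier of the citing work (A)
    cited: str           # identifier of the cited dataset (B)
    role: CitationRole   # the intention behind the citation
    note: str = ""       # optional free-text justification

    def to_json(self) -> str:
        """Serialize the annotation so that software can read the role."""
        record = asdict(self)
        record["role"] = self.role.value  # store the plain string, not the Enum
        return json.dumps(record, sort_keys=True)


# Placeholder identifiers, used only for illustration.
annotation = CitationAnnotation(
    citing="doi:10.0000/example-article",
    cited="doi:10.0000/example-dataset",
    role=CitationRole.DISPUTES,
    note="Measurements could not be reproduced.",
)
print(annotation.to_json())
```

With a record of this kind, a harvester could tell a citation that disputes a dataset from one that reuses it, which a bare citation link cannot express.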
During the second part of the session, Cesare Concordia presented the Citation Service Prototype developed in the framework of task 3.4. The main goals are to:
Particular attention was paid to the “Citation Metadata Viewer”, in order to demonstrate the diversity of existing information about datasets: records retrieved from APIs, metadata embedded in landing pages, and citations extracted from the abstracts of the DH conference (organized by ADHO), amongst others.
This diversity shows why we need to standardize and curate information to make it machine-actionable, which can be done via the API component of the Citation Service Prototype.
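To make the idea of standardization concrete, here is a minimal Python sketch of how records from different sources could be mapped onto a common set of fields. The field names, aliases and sample records are hypothetical and do not describe the actual API of the Citation Service Prototype.

```python
# Hypothetical aliases per target field; real sources will differ.
FIELD_ALIASES = {
    "title": ("title", "name", "dc:title"),
    "identifier": ("identifier", "doi", "@id"),
    "creator": ("creator", "author", "dc:creator"),
    "year": ("year", "publicationYear", "datePublished"),
}


def normalize(record: dict) -> dict:
    """Map a source record onto common field names, keeping the first alias found."""
    normalized = {}
    for target, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            value = record.get(alias)
            if value not in (None, ""):
                normalized[target] = str(value).strip()
                break
        else:
            normalized[target] = None  # no alias matched: leave a gap for curation
    return normalized


def missing_fields(normalized: dict) -> list:
    """List the common fields that still need manual curation."""
    return [field for field, value in normalized.items() if value is None]


# Invented sample records standing in for an API response and a landing page.
api_record = {
    "doi": "10.0000/example",
    "title": " Survey data ",
    "publicationYear": 2020,
    "creator": "A. Author",
}
landing_page_record = {
    "@id": "https://example.org/dataset/1",
    "name": "Interview corpus",
}

print(normalize(api_record))
print(missing_fields(normalize(landing_page_record)))
```

Recording gaps explicitly, rather than dropping them, keeps the curation step visible: a record with missing fields can be flagged for review instead of being cited incompletely.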
The prototype was well received by the roundtable participants, who made several helpful suggestions for its future development.
Following each presentation, the experts put questions to the presenters and to one another, and shared their experiences with data citation and their hopes for the future. These rich discussions gave rise to the following points:
Coming away from these presentations and rich discussions, we as a community need to reflect on why we cite: is it to provide evidence, to foster reuse, to give access, to give credit, or all these things combined?
It seems that the citation prototype developed in the context of the task is going in the right direction. In particular, the citation viewer was met with strong interest.
All these remarks, suggestions and references will be used to:
A possible output of this roundtable is the creation of an RDA Interest Group, following the Birds of a Feather session “Rich Metadata for annotation of citations contexts and data-citations contexts” held during the 17th RDA plenary.