To promote effective sharing, we must create an enduring link between the people who generate data and its future uses, urge Heather H. Pierce and colleagues.
Much effort has gone towards crafting mandates and standards for researchers to share their data (refs 1–3). Considerably less time has been spent measuring just how valuable data sharing is, or recognizing the scientific contributions of the people responsible for those data sets. The impact of research continues to be measured by primary publications, rather than by subsequent uses of the data.
To incentivize the sharing of useful data, the scientific enterprise needs a well-defined system that links individuals with reuse of data sets they generate (ref. 4). To further this goal, the Association of American Medical Colleges (where H.H.P. and A.D. work) and the Multi-Regional Clinical Trials Center at Brigham and Women’s Hospital and Harvard Medical School (where E.S. and B.E.B. work), along with The New England Journal of Medicine, convened a 2018 workshop of representatives from 50 organizations to discuss and validate such a system. The workshop included major journals, funders, data-citation groups and academic centres (see Supplementary Information, Participant list) and was preceded by numerous meetings.
Here we propose a system for leveraging existing initiatives and infrastructure to track the use, reuse and impact of scientific data through the consistent adoption of unique identifiers. Our system begins when researchers deposit a data set that they have generated. It then links every use and published analysis of that data set back to the original researchers (see ‘Virtuous cycle’).
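To make the linking idea concrete, the following is a minimal sketch in Python of how unique identifiers could tie each published use of a data set back to the researchers who generated it. It is an illustration only, not the system proposed here: the class and method names (`ReuseRegistry`, `deposit`, `record_use`, `uses_credited_to`) and the identifier strings are all hypothetical placeholders, standing in for whatever persistent identifiers for data sets, people and publications are actually adopted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataSet:
    """A deposited data set (hypothetical record layout)."""
    dataset_id: str        # placeholder for a persistent data-set identifier
    generator_ids: tuple   # placeholders for persistent researcher identifiers


@dataclass
class ReuseRegistry:
    """Toy in-memory registry; a real system would span many repositories."""
    datasets: dict = field(default_factory=dict)  # dataset_id -> DataSet
    uses: dict = field(default_factory=dict)      # dataset_id -> [publication_id]

    def deposit(self, dataset_id, generator_ids):
        # Step 1: researchers deposit a data set under a unique identifier.
        if dataset_id in self.datasets:
            raise ValueError(f"{dataset_id} is already deposited")
        self.datasets[dataset_id] = DataSet(dataset_id, tuple(generator_ids))
        self.uses[dataset_id] = []

    def record_use(self, dataset_id, publication_id):
        # Step 2: each published analysis cites the data-set identifier.
        if dataset_id not in self.datasets:
            raise KeyError(f"unknown data set: {dataset_id}")
        self.uses[dataset_id].append(publication_id)

    def uses_credited_to(self, researcher_id):
        # Step 3: follow the identifiers back to the original researchers.
        return {
            ds_id: list(self.uses[ds_id])
            for ds_id, ds in self.datasets.items()
            if researcher_id in ds.generator_ids
        }


# Placeholder identifiers, invented purely for illustration.
registry = ReuseRegistry()
registry.deposit("dataset:0001", ["researcher:A", "researcher:B"])
registry.record_use("dataset:0001", "publication:X")
registry.record_use("dataset:0001", "publication:Y")
credited = registry.uses_credited_to("researcher:A")
```

In this sketch, `credited` maps each data set that researcher A helped to generate to the publications that later used it. The point is that credit follows from consistent identifiers alone, with no need to search the text of papers for informal mentions of a data set.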