A Data Sharing Story

Mercè Crosas

doi:10.7191/JESLIB.2012.1020

2012, Journal of eScience Librarianship

https://doi.org/10.7191/JESLIB.2012.1020

16 pages

Abstract

The paper discusses the fundamental principles and significance of data sharing in the digital age, emphasizing the replication standard as a key component for validating research and enabling further advancements. It highlights the importance of transparency in providing adequate information that allows other researchers to understand and reconstruct prior findings, drawing on foundational works in the field.

Sophia Lafferty-Hess

International Journal of Digital Curation, 2018

In response to widespread concerns about the integrity of research published in scholarly journals, several initiatives have emerged that are promoting research transparency through access to data underlying published scientific findings. Journal editors, in particular, have made a commitment to research transparency by issuing data policies that require authors to submit their data, code, and documentation to data repositories to allow for public access to the data. In the case of the American Journal of Political Science (AJPS) Data Replication Policy, the data also must undergo an independent verification process in which materials are reviewed for quality as a condition of final manuscript publication and acceptance. Aware of the specialized expertise of the data archives, AJPS called upon the Odum Institute Data Archive to provide a data review service that performs data curation and verification of replication datasets. This article presents a case study of the collaboration b...

Proposed Foundations for Evaluating Data Sharing and Reuse in the Biomedical Literature

Heather Piwowar

researchremix.org

Science progresses by building upon previous research. Progress can be most rapid, efficient, and focused when raw datasets from previous studies are available for reuse. To facilitate this practice, funders and journals have begun to request and require that investigators share their primary datasets with other researchers. Unfortunately, it is difficult to evaluate the effectiveness of these policies. This study aims to develop foundations for evaluating data sharing and reuse decisions in the biomedical literature by developing tools to answer the following research questions, within the context of biomedical gene expression datasets: What is the prevalence of biomedical research data sharing? Biomedical research data reuse? What features are most associated with an investigator's decision to share or reuse a biomedical research dataset? Does sharing or reusing data contribute to the impact of a research article, independently of other factors? What do the results suggest for developing efficient, effective policies, tools, and initiatives for promoting data sharing and reuse? I suggest a novel approach to identifying publications that share and reuse datasets, through the application of natural language processing techniques to the full text of primary research articles. Using these classifications and extracted covariates, univariate and multivariate analysis will assess which features are most important to data sharing and reuse prevalence, and also estimate the contribution that sharing data and reusing data make to a publication's research impact. I hope the results will inform the development of effective policies and tools to facilitate this important aspect of scientific research and information exchange.

Whitepaper: Practical challenges for researchers in data sharing

Mathias Astell

2018

In one of the largest surveys of researchers about research data (with over 7,700 respondents), Springer Nature finds widespread data sharing associated with published works and a desire from researchers that their data are discoverable.

This whitepaper examines the results of this survey and discusses the challenges that researchers face in sharing their data. The whitepaper looks at data sharing attitudes globally, as well as in relation to region, subject and seniority.

Infographic: https://doi.org/10.6084/m9.figshare.5996786

An examination of research data sharing and re-use: implications for data citation practice

Dietmar Wolfram, Hyoungjoo Park

This study examines characteristics of data sharing and data re-use in Genetics and Heredity, where data citation is most common. This study applies an exploratory method because data citation is a relatively new area. The Data Citation Index (DCI) on the Web of Science was selected because DCI provides a single access point to over 500 data repositories worldwide and to over two million data studies and datasets across multiple disciplines and monitors quality research data through a peer review process. We explore data citations for Genetics and Heredity as a case study, by examining formal citations recorded in the DCI and, informally, by sampling a selection of papers for implicit data citations within publications. Citer-based analysis is conducted in order to remedy self-citation in the data citation phenomena. We explore 148 sampled citing articles in order to identify factors that influence data sharing and data re-use, including references, main text, supplementary data/information, acknowledgments, funding information, author information, and web/author resources. This study is unique in that it relies on a citer-based analysis approach and analyzes peer-reviewed and published data, data repositories, and citing articles of highly productive authors where data sharing is most prevalent. This research is intended to provide a methodological and practical contribution to the study of data citation.

Formalised data citation practices would encourage more authors to make their data available for reuse

Hyoungjoo Park

2017

It is increasingly common for researchers to make their data freely available. This is often a requirement of funding agencies but also consistent with the principles of open science, according to which all research data should be shared and made available for reuse. Once data is reused, the researchers who have provided access to it should be acknowledged for their contributions, much as authors are recognised for their publications through citation. Hyoungjoo Park and Dietmar Wolfram have studied characteristics of data sharing, reuse, and citation and found that current data citation practices do not yet benefit data sharers, with little or no consistency in their format. More formalised citation practices might encourage more authors to make their data available for reuse.

The Reproducibility Project: A model of large-scale collaboration for empirical research on reproducibility.

Calvin Lai

Data Management and Data Sharing in Science and Technology Studies

Manfred Laubichler

Science, Technology, & Human Values

This paper presents reports on discussions among an international group of science and technology studies (STS) scholars who convened at the US National Science Foundation (January 2015) to think about data sharing and open STS. The first report, which reflects discussions among members of the Society for Social Studies of Science (4S), relates the potential benefits of data sharing and open science for STS. The second report, which reflects discussions among scholars from many professional STS societies (i.e., European Association for the Study of Science and Technology [EASST], 4S, Society for the History of Technology [SHOT], History of Science Society [HSS], and Philosophy of Science Association [PSA]), focuses on practical and conceptual issues related to managing, storing, and curating STS data. As is the case for all reports of such open discussions, a scholar’s presence at the meeting does not necessarily mean that they agree with all aspects of the text to follow.

Data sharing practices and data availability upon request differ across scientific disciplines

Kajar Köster

Scientific Data

Data sharing is one of the cornerstones of modern science that enables large-scale analyses and reproducibility. We evaluated data availability in research articles across nine disciplines in Nature and Science magazines and recorded corresponding authors’ concerns, requests and reasons for declining data sharing. Although data sharing has improved in the last decade and particularly in recent years, data availability and willingness to share data still differ greatly among disciplines. We observed that statements of data availability upon (reasonable) request are inefficient and should not be allowed by journals. To improve data sharing at the time of manuscript acceptance, researchers should be better motivated to release their data with real benefits such as recognition, or bonus points in grant and job applications. We recommend that data management costs should be covered by funding agencies; publicly available research data ought to be included in the evaluation of application...

The Essential Nature of Sharing in Science

Michael Zigmond

Science and Engineering Ethics, 2010

Advances in science are the combined result of the efforts of a great many scientists, and in many cases, their willingness to share the products of their research. These products include data sets, both small and large, and unique research resources not commercially available, such as cell lines and software programs. The sharing of these resources enhances both the scope and the depth of research, while making more efficient use of time and money. However, sharing is not without costs, many of which are borne by the individual who develops the research resource. Sharing, for example, reduces the uniqueness of the resources available to a scientist, potentially influencing the originator's perceived productivity and ultimately his or her competitiveness for jobs, promotions, and grants. Nevertheless, for most researchers, particularly those using public funds, sharing is no longer optional but must be considered an obligation to science, the funding agency, and ultimately society at large. Most funding agencies, journals, and professional societies now require a researcher who has published work involving a unique resource to make that resource available to other investigators. Changes could be implemented to mitigate some of the costs. The creator of the resource ...

This paper is based on a presentation made at the conference "New Capabilities, Emerging Issues, and Responsible Conduct in Data Management" jointly sponsored by the U.S. Office of Research Integrity and the University of Maryland Baltimore and held in Baltimore, Maryland on September 28-29, 2006. This paper and associated references reflect the state of the field at that time.

Taking the pain out of data sharing

Matthew Hutson

Nature, 2022

Journals and funding bodies increasingly require manuscript authors to share data on request or make the information publicly available. It's a big ask from a technical standpoint, but some straightforward strategies can simplify the process. Scientific papers rarely include all the data used to justify the conclusions, even in the supplementary material. Authors might fear getting scooped, or that other researchers will use the raw data to make fresh discoveries, or they might wish to protect the privacy of study participants. Or, more probably, authors have neither the time nor the expertise to package the data for others to view and understand. Such reticence costs the research community. Data transparency allows others to repeat analyses and catch mistakes or fraudulent ...

Despite agreeing to make raw data available, some authors fail to comply. The right strategies and platforms can ease the task.

Mercè Crosas

New England Journal of Medicine, 2017

Promise and practice in data sharing

Paul Wouters

2003

Once collected and safely deposited, the promise of research data is that in principle they can be put to use all over the global science system. And this can be done in principle at negligible additional cost, once the prospective users have the necessary ICT infrastructure available. In this way, sharing of digital research data opens up substantial new vistas for international scientific cooperation. Starting with the data will simplify co-operative arrangements because the value of the data in principle will only increase by making them available for multiple uses. Data in principle are wear-proof and the more they are used by more researchers, the more results they will produce. After all, the ultimate goal of the parties investing in the collection and storage of research data is to get as much knowledge (end product) as possible out of their investments in data resources. In the context of publicly financed science, the investor does not have to worry even if total strangers are using 'his' data: the results will be publicly available. The more data value you spread around, the more knowledge value you will get in return. To more than one listener/reader this may sound like some overgrown dotcom wisdom. The principle may be correct, but it will not come as a surprise that there can exist a certain discrepancy between the current research practice and the principles that will characterise the future international digital research environment. Instead of considering data sharing as getting free additional help in getting the intended scientific work done, initial investors in digital data are often suspicious of unfair competition and free riding. Researchers looking for existing data sources from colleagues are not always welcomed heartily. Researchers, officials and managers mention financial, legal, and organisational barriers, cultural and ethical problems that complicate the full realisation of the potential of digital data resources.
The more data are used, the more results they will produce. More and more fields in science, social science and the humanities are becoming "data rich".

A Replication Manifesto

Joseph Valacich

AIS Transactions on Replication Research

Replication is one of the main principles of the scientific method. The social sciences, and in particular the information systems discipline, have lagged behind the physical sciences, which have more established traditions of independently replicating studies from other labs. In this essay, we outline the need for replication in the information systems discipline, identifying three possible approaches for executing such studies. There are numerous benefits to the discipline from embracing and valuing replication research. Replication will either improve confidence in our research findings or identify important boundary conditions. Replications also enhance various scientific processes and offer methodical and educational improvements. Collectively, these benefits will help the information systems discipline mature and prosper.

If We Share Data, Will Anyone Use Them? Data Sharing and Reuse in the Long Tail of Science and Technology

Jillian Wallis

PLoS ONE, 2013

Research on practices to share and reuse data will inform the design of infrastructure to support data collection, management, and discovery in the long tail of science and technology. These are research domains in which data tend to be local in character, minimally structured, and minimally documented. We report on a ten-year study of the Center for Embedded Network Sensing (CENS), a National Science Foundation Science and Technology Center. We found that CENS researchers are willing to share their data, but few are asked to do so, and in only a few domain areas do their funders or journals require them to deposit data. Few repositories exist to accept data in CENS research areas. Data sharing tends to occur only through interpersonal exchanges. CENS researchers obtain data from repositories, and occasionally from registries and individuals, to provide context, calibration, or other forms of background for their studies. Neither CENS researchers nor those who request access to CENS data appear to use external data for primary research questions or for replication of studies. CENS researchers are willing to share data if they receive credit and retain first rights to publish their results. Practices of releasing, sharing, and reusing of data in CENS reaffirm the gift culture of scholarship, in which goods are bartered between trusted colleagues rather than treated as commodities.

Data accessibility is not sufficient for making replication studies a matter of course

Gert Wagner

2012

Research with Built-in Replication: Comment and Further Suggestion for Replication Research

Scott Armstrong

In this brief commentary on the paper Designing Research with In-Built Differentiated Replication, we expand on concerns about a lack of replication research raised by the authors by focusing on three key questions of continuous importance: Why should more replication research be conducted? Why do we find so few replication studies? What can be done about it? We identify barriers preventing replication related to the scientific system, the replication researcher, and the initial research. Suggestions are made that all papers should be published electronically along with reviews, authors should take steps to encourage replications of their work, and editors should invite replications of important papers. Moreover, the scientific community should establish a replication index as a measure of output quality.

Sharing data: Practices, barriers, and incentives

Jim Malone

Proceedings of the American Society for Information Science and Technology, 2011

Bringing together a panel of researchers who have conducted surveys regarding current data sharing practices and scientific perceptions of it, this paper addresses findings from surveys including the PARSE Insight survey, DataONE survey, Data Conservancy/University of Illinois and Purdue interviews, as well as a survey and interviews of scientists in the Southeast US done for USGS. The paper analyzes the findings of these surveys and interviews and discusses the advantages of data sharing. It addresses the varying degrees of data sharing and data hoarding and insight regarding the sharing of data among respondents. It also touches on concerns of those who are reluctant to share data and the role the development of cyberinfrastructure will play in future data sharing. The surveys and in-depth interviews discussed in this panel will help information scientists and system designers understand the current practices, barriers to data sharing, and needs of scientists into the future. Inculcating a culture of data sharing and curation requires first understanding the motivations and concerns of the scientists who collect and use research data.

Research: Data Sharing

Nicola Stingelin

The availability of high quality data is a foundational component of developments in health care, with one step in the research process being to develop an ethically appropriate data sharing policy that will optimize the benefits derived from a particular research project, while protecting rights and interests. The work of devising a policy would benefit from having an ethics framework available, although the impacts on research of the ongoing rapid advances in science and technology constantly bring new challenges for the ethics of data sharing. Some of these challenges are addressed in this entry, starting by making an overview of the main data sharing stakeholders and the most important settings and perspectives that frame the ethics of data sharing. The main ethics principles and positions relevant to an ethics framework are introduced, noting that the moral default position is held to be that data should be shared so as to achieve a just distribution of research benefits, although some conditions must be attached to protect and respect tangential rights and interests. A case study is then described and discussed. The central point of the concluding comments is that 'data' is not a meaningful object of ethics reflection if disconnected and isolated from its context.

The future of replication

Gary King

2012

Abstract: Since the replication standard was proposed for political science research, more journals have required or encouraged authors to make data available, and more authors have shared their data. The calls for continuing this trend are more persistent than ever, and the agreement among journal editors in this Symposium continues this trend. In this article, I offer a vision of a possible future of the replication movement.

Healthcare data-sharing from the perspective of a scientist

Ciprian Cornea

2015
