DataCite Metadata: Getting Connected!

MOHAMED YAHIA

doi:10.5281/ZENODO.5534094

2021

https://doi.org/10.5281/ZENODO.5534094

25 pages, 1 file

Abstract

The role of DataCite and other large-scale infrastructures is evolving from identifying things to connecting things, and DataCite metadata includes many ways to make connections. We will concentrate on relatedIdentifiers (and citations), nameIdentifiers, and affiliationIdentifiers, and explore how these connectors are being used in DataCite metadata. Adding these connectors to your DataCite metadata gives you a direct way to improve connectivity for your datasets and your users. A recording of the presentation can be found at: https://youtu.be/5WIBGY-Z7E8
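To make the three connectors concrete, here is a minimal Python sketch of a record carrying all of them. The key names follow DataCite's JSON serialization as commonly documented, but treat the exact layout as an assumption. Every identifier value and the helper `list_connections` are hypothetical placeholders, not taken from the presentation.

```python
# Illustrative DataCite-style record. All identifier values are placeholders.
record = {
    "doi": "10.1234/example-dataset",  # hypothetical DOI
    "creators": [
        {
            "name": "Doe, Jane",
            # nameIdentifiers connect a creator to a person identifier.
            "nameIdentifiers": [
                {
                    "nameIdentifier": "https://orcid.org/0000-0000-0000-0000",
                    "nameIdentifierScheme": "ORCID",
                }
            ],
            # affiliationIdentifier connects the creator to an organization.
            "affiliation": [
                {
                    "name": "Example University",
                    "affiliationIdentifier": "https://ror.org/00example0",
                    "affiliationIdentifierScheme": "ROR",
                }
            ],
        }
    ],
    # relatedIdentifiers connect this record to other works.
    "relatedIdentifiers": [
        {
            "relatedIdentifier": "10.1234/example-article",
            "relatedIdentifierType": "DOI",
            "relationType": "IsCitedBy",
        }
    ],
}


def list_connections(rec):
    """Return (kind, scheme_or_relation, identifier) tuples found in a record."""
    found = []
    for creator in rec.get("creators", []):
        for ni in creator.get("nameIdentifiers", []):
            found.append(("person", ni.get("nameIdentifierScheme"), ni["nameIdentifier"]))
        for aff in creator.get("affiliation", []):
            # Affiliations without an identifier are plain text, not connections.
            if "affiliationIdentifier" in aff:
                found.append(
                    ("organization", aff.get("affiliationIdentifierScheme"), aff["affiliationIdentifier"])
                )
    for rel in rec.get("relatedIdentifiers", []):
        found.append(("work", rel.get("relationType"), rel["relatedIdentifier"]))
    return found


connections = list_connections(record)
```

Each tuple in `connections` is one outgoing link from the dataset: to a person, an organization, or another work.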

DataID: towards semantically rich metadata for complex datasets

Ivan Ermilov

The constantly growing number of Linked Open Data (LOD) datasets creates a need for rich metadata descriptions that enable users to discover, understand and process the available data. This metadata is often created, maintained and stored in diverse data repositories featuring disparate data models that are often unable to provide the metadata necessary to automatically process the datasets described. This paper proposes DataID, a best practice for LOD dataset descriptions that uses RDF files hosted together with the datasets, under the same domain. We describe the data model, which is based on the widely used DCAT and VoID vocabularies, as well as supporting tools to create and publish DataIDs and use cases that show the benefits of providing semantically rich metadata for complex datasets. As a proof of concept, we generated a DataID for the DBpedia dataset, which we present in the paper.

Recommendations for Data/Publication Linkage

Brian Matthews

A key aim of the CLADDIER project is to investigate the cross-linking and citation of resources (in particular data and their associated publications) held in institutional and subject-based repositories within the research sector. Traditional citations are typically partial: they are "backward citations", referring to work that influenced the current research, and they cite only other formal publications, ignoring other artefacts that are the output of research, in particular research data. Online repositories storing more dynamic digital objects give the opportunity to provide a more complete picture of the relationships between them, with backward and forward citations to data and publications being propagated between repositories.

An Interoperability Infrastructure for Digital Identifiers in e-science

Paolo Bouquet

The rapid increase of scientific digital assets in recent years has made clear that digital identifiers are crucial for effectively publishing, accessing and managing digital information in e-science contexts. Originally persistent keys for access to digital objects in network environments, persistent identifiers have more recently been extended to also identify physical objects such as people, institutions and any type of relevant entity in the e-science domain. This opens the way to an integrated information space where a network of resources can be resolved, linked, navigated and analyzed, as the Linked Open Data approach envisions for the Web. However, the creation and full exploitation of this valuable network of connections is currently hindered by the fragmentation and lack of coordination of the digital identifier ecosystem. The aim of this paper is to propose an open, distributed and scalable infrastructure for interoperating existing persistent identifiers and other digital identifier systems (like Cool URIs) in e-science, overcoming geographical, disciplinary and organizational boundaries. The Digital Identifier interoperability infrastructure is presented as a crosscutting solution of core services enabling interoperability at three different levels: identifier, co-reference and semantic.

Richard Grunzke, Volker Hartmann, Thomas Jejkal, Ajinkya Prabhune, Hendrik Herold, Aline Deicke, Alexander Hoffmann, Torsten Schrade, Gotthard Meinel, Sonja Herres-Pawlis, Rainer Stotzka, Wolfgang E. Nagel: Towards a Metadata-driven Multi-community Research Data Management Service

Aline Deicke

S. Gesing/J. Krüger (Eds.), Proceedings of the 8th International Workshop on Science Gateways (IWSG 2016). Rome, Italy, June 8-10, 2016. Nowadays, the daily work of many research communities is characterized by an increasing amount and complexity of data. This makes the data increasingly difficult to manage, access and utilize in order to gain scientific insights from it. At the same time, domain scientists want to focus on their science instead of IT. The solution is research data management, which stores data in a structured way to enable easy discovery for future reference. An integral part is the use of metadata. With it, data becomes accessible by its content instead of only its name and location. The use of metadata should be as automatic and seamless as possible in order to foster high usability. Here we present the architecture and initial steps of the MASi project, which aims to build a comprehensive research data management service. First, it extends the existing KIT Data Manager framework with a generic programming interface and a generic graphical web interface. Advanced additional features include the integration of provenance metadata and persistent identifiers. The MASi service aims to be easily adaptable for arbitrary communities with limited effort. The requirements for the initial use cases within geography, chemistry and digital humanities are elucidated. The MASi research data management service is currently being built to satisfy these complex and varying requirements in an efficient way.

The road towards structured affiliation information in a national bibliographic database

Peter Aspeslagh

2021

The implementation of a Flemish research evaluation parameter highlights the complexity of author affiliation data collection for publications not included in major bibliographic databases. In this paper, we discuss a set of fundamental challenges that were encountered during a first data collection project. More specifically, we elaborate on the multifaceted data retrieval approach, the quest for a sustainable way of data registration, and the development of the necessary infrastructure and procedures. Although considerable effort is being invested in optimizing the exchange of well-structured author affiliation data, we zoom in on opportunities that might arise to facilitate similar projects in the future.

Linking Data and Publications: Towards a Cross-Disciplinary Approach

Paolo Manghi

2013

In this paper, we tackle the challenge of linking scholarly information in multi-disciplinary research infrastructures. There is a trend towards linking publications with research data and other information, but, as it is still emerging, this is handled differently by various initiatives and disciplines. For OpenAIRE, a European cross-disciplinary publication infrastructure, this poses the challenge of supporting these heterogeneous practices. Hence, OpenAIRE wants to contribute to the development of a common approach for discipline-independent linking practices between publications, data, project information and researchers. To this end, we constructed two demonstrators to identify commonalities and differences. The results show the importance of stable and unique identifiers, and support a "by reference" approach of interlinking research results. This approach allows discipline-specific research information to be managed independently in distributed systems and avoids redundant maintenance. Furthermore, it allows these disciplinary systems to manage the specialized structures of their contents themselves.

Collaborate, automate, prepare, prioritize: Creating metadata for legacy research data

Inna Kouper

2013

Data curation projects frequently deal with data that were not created for the purposes of long-term preservation and re-use. How can curation of such legacy data be improved by supplying necessary metadata? In this report, we address this and other questions by creating robust metadata for twenty legacy research datasets. We report on the metrics of creating domain-specific metadata and propose a four-prong framework of metadata creation for legacy research data. Our findings indicate that there is a steep learning curve in encoding metadata using the FGDC content standard for digital geospatial metadata. Our project demonstrates that when data curators are handed research data "as is," they may be successful in incorporating such data into a data sharing environment. We found that data curators can be successful in creating descriptive metadata and enhancing discoverability via subject analysis. However, curators must be aware of the limitations in applying structural and administrative metadata for legacy data.

Adding escience assets to the data web

Robert Sanderson

2009

Aggregations of Web resources are increasingly important in scholarship as it adopts new methods that are data-centric, collaborative, and network-based. The same notion of aggregations of resources is common to the mashed-up, socially networked information environment of Web 2.0. We present a mechanism to identify and describe aggregations of Web resources that has resulted from the Open Archives Initiative Object Reuse and Exchange (OAI-ORE) project. The OAI-ORE specifications are based on the principles of the Architecture of the World Wide Web, the Semantic Web, and the Linked Data effort. Therefore, their incorporation into the cyberinfrastructure that supports eScholarship will ensure the integration of the products of scholarly research into the Data Web.

Expanding the Metadata Librarian Horizon: Reflections on the Metadata Practices in the Web and Digital Repositories

Sai Deng

2018

Prof. Rupak Chakravarty

Zenodo (CERN European Organization for Nuclear Research), 2021

Objective: The enormous growth in research data generated today has highlighted the value of research data management (RDM) in making research FAIR (Findable, Accessible, Interoperable and Reusable). Well-managed data lets researchers use and reuse that data with appropriate citations and attribute it to its author. Data citation refers to the practice of providing a reference to data in the same way that researchers routinely provide bibliographic references to printed resources. The objective of this paper is to investigate the activities of the DataCite website in managing research data. Methodology: The study examined the website of DataCite, a non-profit organization that provides persistent identifiers (DOIs) for research. The research examines the statistics systems and other critical resources. Registrations by the Collective group and by the most involved repositories are included in the statistical analysis. The basic resources include top executives, OAI-PMH, DataCite Public Roadmap, DataCite Commons, DataCite/ORCID Auto-update and Service Providers. The outcomes were analysed with MS Excel. Results: The registry had 293 members from different countries. The USA ranked first by registration with 137 members, while countries such as India, Finland and Spain had at least one each. Germany was listed as the top member and most of the repository holding companies. Datafirst is the only server found in an Indian context. DataCite Commons was found to be a discovery tool that allows simple searches by works, individuals and organisations while giving users a detailed overview of the relationships between the entities in the research setting. Using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), the DataCite service exposes metadata stored in the DataCite Metadata Store (MDS). DataCite Auto-update unambiguously categorises researchers and provides tools to automate the link between researchers and their creative work.
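As a rough illustration of how a harvester addresses an OAI-PMH service, the sketch below only builds a request URL and performs no network call. The base URL `https://oai.datacite.org/oai`, the set name `EXAMPLE.SET`, and the helper `build_oai_request` are assumptions made for this example. `ListRecords`, `metadataPrefix`, and `set` are standard OAI-PMH protocol terms.

```python
from urllib.parse import urlencode

# Assumed base URL for the OAI-PMH endpoint; verify before use.
OAI_BASE = "https://oai.datacite.org/oai"


def build_oai_request(verb, **params):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = {"verb": verb, **params}
    return f"{OAI_BASE}?{urlencode(query)}"


# oai_dc is the Dublin Core format every OAI-PMH repository must support.
# EXAMPLE.SET is a placeholder set name.
url = build_oai_request("ListRecords", metadataPrefix="oai_dc", set="EXAMPLE.SET")
```

A real harvester would fetch this URL, parse the XML response, and follow any `resumptionToken` to page through results.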

ODIN: the ORCID and DataCite interoperability network

Sergio Pareja ruiz

International Journal of Knowledge and Learning, 2014

Research data is increasingly seen as the most significant untapped resource in scholarship. Awareness and practice of referencing and citing research data is increasing, and different initiatives to unambiguously identify datasets are in place. Steps are being taken to identify the individuals who created or contributed to research outputs. Lack of interoperability between the different initiatives to identify datasets and contributors remains a major hurdle. The ODIN project (ORCID and DataCite Interoperability Network) tries to address this need. ODIN builds on the ORCID and DataCite initiatives to uniquely identify scientists and data sets and connect this information

Pre-Metadata Counseling: Putting the DataCite relationType Attribute into Action

Ayla S. Kenfield

2017

Linking Publications and Data: Challenges, Trends, and Opportunities

Eric Nienhouse

D-Lib Magazine, 2016

The Digital Object Identifier (DOI®) System is a managed system for persistent identification of content on digital networks. It can be used to identify physical, digital, or abstract entities. The identifiers (DOI names) resolve to data specified by the registrant, and use an extensible metadata model to associate descriptive and other elements of data with the DOI name. The DOI system is implemented through a federation of registration agencies, under policies and common infrastructure provided by the International DOI Foundation, which developed and controls the system. The DOI system has been developed and implemented in a range of publishing applications since 2000; by early 2009 over 40 million DOIs had been assigned. The DOI system provides identifiers that are persistent, unique, resolvable, and interoperable, and so useful for management of content on digital networks in automated and controlled ways.
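Resolution in practice usually goes through the `https://doi.org/` proxy, the form this page itself uses for its own DOI. The following sketch turns a DOI name into such a resolver URL. The helper name `doi_to_url` and the set of accepted prefixes are assumptions for illustration.

```python
from urllib.parse import quote

RESOLVER = "https://doi.org/"


def doi_to_url(doi):
    """Turn a DOI name (optionally prefixed) into a resolver URL."""
    name = doi.strip()
    # Strip common prefixes so 'doi:10.x/y' and full URLs are both accepted.
    for prefix in ("doi:", "https://doi.org/", "http://doi.org/"):
        if name.lower().startswith(prefix):
            name = name[len(prefix):]
            break
    if not name.startswith("10."):
        raise ValueError(f"not a DOI name: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping '/'.
    return RESOLVER + quote(name, safe="/")


url = doi_to_url("doi:10.5281/ZENODO.5534094")
```

Requesting the resulting URL redirects to whatever location the registrant has recorded for that DOI name.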

Managing Metadata in Web-Scale Discovery Systems

Rene Erlandson

Journal of Electronic Resources Librarianship, 2017

The future of interlinked, interoperable and scalable metadata

Getaneh Agegn Alemu

International Journal of Metadata, Semantics and Ontologies, 2020

With the growing diversity of information resources, the emphasis on data-centric applications such as big data, metadata, semantics and ontologies has become central. This editorial paper presents a summary of recent developments in metadata, semantics and ontologies, focusing in particular on metadata enriching, linking and interoperability. National libraries and archives are devising new bibliographic models and metadata presentation formats. Bibliographic metadata sets are being made available using new data formats such as RDF. The new formats aim to represent data in granular structures and define unique identification protocols such as URIs. The paper concludes by introducing the five papers included in the special issue, which present novel approaches to metadata integration, interoperability frameworks, re-use of metadata ontologies and methods of metadata quality analysis.

Writeslike.us: Linking people through OAI Metadata

Emma Tonkin

Informal scholarly communication is an important aspect of discourse both within research communities and in the dissemination and reuse of data and findings. Various tools exist that are designed to facilitate informal communication between researchers, such as social networking software, including tools dedicated specifically to academics. Others make use of existing information sources, in particular structured information such as social network data (e.g. FOAF) or bibliographic data, in order to identify links between individuals: co-authorship, membership of the same organisation, attendance at the same conferences, and so forth. Writeslike.us is a prototype designed to support the aim of establishing informal links between researchers. It makes use of data harvested from OAI repositories as an initial resource. This raises problems less evident in the use of more consistently structured data. The information extracted is filtered using a variety of processes to identify and benefit from systematic features in the data. Following this, the record is analysed for subject, author name, and full text link or source; this is spidered to extract full text, where available, to which a formal metadata extraction package is applied, extracting several relevant features ranging from document format to author email address and citations. The process is supported using data from Wikipedia. Once available, this information may be explored using both graph and matrix-based approaches; we present a method based on spreading activation energy, and a similar mechanism based on cosine similarity metrics. A number of prototype interfaces and data access methods are described, along with relevant use cases, in this paper.
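For readers unfamiliar with the cosine similarity metric mentioned above, here is a generic sketch that compares two researchers by the terms in their records. It is not the Writeslike.us implementation. The term-weight dictionaries and the function name are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-weight vectors (dicts)."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical term counts taken from each author's harvested records.
author_x = {"metadata": 2, "repositories": 1}
author_y = {"metadata": 2, "repositories": 1}
author_z = {"geology": 3}

same_topic = cosine_similarity(author_x, author_y)
different_topic = cosine_similarity(author_x, author_z)
```

Authors with identical term profiles score close to 1, and authors with no terms in common score 0.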

Conference Linked Data: The ScholarlyData Project

Aldo Gangemi

Lecture Notes in Computer Science, 2016

Metadata Approaches for Shareable and LOD-enabled Bibliographic Data from Open Repositories

Marcia Zeng

International Conference on Dublin Core and Metadata Applications, 2011

Keywords: interoperable metadata; LOD-enabled metadata; open bibliographic data; metadata mapping approaches; methodology

This poster presents the processes and paths of the authors who have recently prepared a report on descriptive metadata encoding recommendations for a European project, VOA3R (Virtual Open Access in Agriculture and Aquaculture Repository), which aims to deploy a virtual entry-point for

Cross-linking and referencing data and publications in Claddier

Brian Matthews, Catherine Jones

Proc. UK e-Science 2007 All Hands Meeting, 2007

Institutional repositories are becoming an established part of research communication, giving an opportunity to explore their relationship with the underlying data. The JISC funded Citation, Location and Deposition in Discipline & Institutional Repositories (CLADDIER) project in the UK has been investigating the issue of linking publications held in institutional repositories to the underlying data held in specialist repositories, such as NERC data centres, by developing the theme of citations, not only for publications but also ...
