Projects

COMET organisers and community members conduct projects that produce shared, durable assets such as enrichment methods, gold-standard benchmarks, and enriched datasets for round-tripping into the systems that maintain and disseminate scholarly metadata. Explore the projects below and get in touch to collaborate.

In progress

Improve funding metadata

Developing a parsing model for funding statements to address information gaps within scholarly metadata. Piloting with arXiv preprints.

In progress

Improve parsing of ETDs

Collaborators: University of Oklahoma

Creating approaches for parsing author, advisor and degree metadata from electronic theses and dissertations (ETDs), recognising the distinct challenges and opportunities presented by this important type of scholarly output.

In progress

Posters.science

Collaborators: The FAIR Data Innovations Hub at California Medical Innovations Institute (CalMI²) LEAD

Collaborating on a scientific poster classification model, developing standards for converting posters to JSON, and round-tripping enriched metadata to DataCite DOI records.

Planned

Improve affiliations parsing of multilingual articles

Assessing current approaches to affiliations parsing and developing new methods for extracting accurate institutional information from a multilingual corpus. Piloting the approach with open-access articles from the PKP Beacon Journal.

Complete

Match preprints to published articles

Developing a matching strategy to connect preprints to published articles, improving knowledge of how research moves from early sharing to formal publication. Piloting with arXiv preprints.

Complete

Leverage affiliations from CRIS systems

Collaborators: European Molecular Biology Lab (EMBL); Wageningen University & Research (WUR)

Exploring pathways to integrate high-quality affiliation metadata maintained by institutional Current Research Information Systems (CRIS) into the scholarly metadata ecosystem.

Complete

Improve resource type classification

Enhancing the discoverability of research outputs by developing a resource type classifier. Piloting the approach with a subset of DataCite DOI records registered with generic resource types.

Complete

Improve affiliations parsing of preprints

Assessing current approaches to affiliations parsing and developing new methods for extracting accurate institutional information from preprint documents. Piloting with arXiv preprints.