Resources

Presentations

In this Community Meeting, COMET Organizers give an update on COMET's progress to date, introduce the COMET Model, and demonstrate pilot projects. Watch on Zenodo.

‘The Collaborative Metadata Enrichment Taskforce (COMET): Uniting Stakeholders for Collaborative Metadata Enrichment’, presented by John Chodacki at the CNI Spring 2025 Membership Meeting, April 7-8, 2025.

On March 5, 2025, the Collaborative Metadata Enrichment Taskforce (COMET) conveners held a wider community webinar to introduce the work of COMET and its Community Call to Action.

Case Studies

Project Briefs

Improve Resource Type Classification of DOI Records

We're exploring an automated system to improve how research outputs are categorized, piloting the approach with subsets of DOI records registered with DataCite. Instead of vague labels like "Text" or "Other," our classifier will, when appropriate, assign specific, meaningful categories like "Dataset," "Journal Article," or "Software" to improve research discoverability.

Improve Affiliations Parsing of Preprints

We're developing improved parsing methods for preprints, using works from arXiv in the pilot, whilst simultaneously evaluating how well OpenAlex captures author and affiliation metadata through the well-established GROBID parsing tool.

Match Preprints to Published Articles

We're developing automated tools to connect preprints with their corresponding published journal articles, piloting the approach with preprints in the arXiv repository. The aim is a comprehensive map of how research moves from early sharing to formal publication.

Improve Titles of DOI Records

We're developing systematic approaches to identify and repair missing or generic titles, transforming unhelpful text such as "Dataset" or blank title fields into meaningful, descriptive titles that enable proper discovery and citation.
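As an illustration of the identification step only, here is a minimal Python sketch that flags records whose title is blank or matches a short list of generic placeholders. It is not COMET's actual method: the GENERIC_TITLES list, the is_missing_or_generic helper, the record layout, and the example DOIs are all hypothetical, chosen just to make the idea concrete.

```python
# Hypothetical list of placeholder titles; a real project would curate its own.
GENERIC_TITLES = {"dataset", "data", "untitled", "text", "other", "n/a"}


def is_missing_or_generic(title):
    """Return True if the title is absent, blank, or a known generic placeholder."""
    if title is None:
        return True
    # Collapse internal whitespace, trim surrounding punctuation, ignore case.
    normalized = " ".join(title.split()).strip(" .:;-").casefold()
    return normalized == "" or normalized in GENERIC_TITLES


# Hypothetical records; the DOIs and field names are placeholders, not real data.
records = [
    {"doi": "10.1234/example.1", "title": "Dataset"},
    {"doi": "10.1234/example.2", "title": ""},
    {"doi": "10.1234/example.3", "title": "Soil moisture readings, 2019-2021"},
    {"doi": "10.1234/example.4"},  # title field missing entirely
]

# Collect the DOIs of records that would need a title repair.
flagged = [r["doi"] for r in records if is_missing_or_generic(r.get("title"))]
print(flagged)
```

A check like this only finds candidates for repair; producing a meaningful replacement title would need other metadata or the underlying work itself.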

Improve Affiliations Parsing of Journal Articles

We're evaluating how accurately OpenAlex captures affiliation metadata from journal articles hosted on the PKP platform, creating pathways to improve metadata quality for open access journals that serve diverse global research communities.

Improve Affiliations Parsing of ETDs

We're developing a specialised PDF parsing pipeline for Electronic Theses and Dissertations (ETDs), enabling accurate extraction of author and affiliation metadata from works by early-stage researchers.