Enrichment Projects
The COMET enrichment projects are real-world demonstrations of collaborative metadata improvement in action. Each project addresses critical challenges in scholarly metadata whilst testing and refining the COMET Model through practical application.
These projects serve as proof-of-concept initiatives that evaluate the effectiveness, accuracy, cost, and scalability of community-driven metadata enrichment approaches. By focusing on high-impact problems with measurable consequences, we're building the evidence base needed to transform how the scholarly community maintains and improves research metadata.
Match Preprints to Published Articles
Status: Assessment Phase
Developing a matching strategy to connect preprints to their published articles, piloted with arXiv preprints, to improve understanding of how research moves from early sharing to formal publication.
Improve Funding Metadata
Status: Active Development
Developing a parsing model for funding statements to address gaps in funding information within scholarly metadata, improving the visibility of research investments and enabling better tracking of funded research outcomes.
Improve Affiliations Parsing of Preprints
Status: Active Development
Assessing current approaches to affiliations parsing and developing new methods for extracting accurate institutional information from preprint documents, piloting with arXiv preprints.
Improve Resource Type Classification
Status: Active Development
Enhancing research discoverability by developing a classifier to improve how research outputs are categorised, piloting the approach with a subset of DataCite DOI records registered with generic resource types.
Improve Affiliations Parsing of ETDs
Status: Active Development
Creating approaches for parsing affiliation metadata from electronic theses and dissertations (ETDs), recognising the distinct challenges and opportunities presented by this important type of scholarly output.
Improve Affiliations Parsing of Articles
Status: Active Development
Assessing current approaches to affiliations parsing and developing new methods for extracting accurate institutional information, piloting the approach with open-access articles from the PKP Beacon Journal.
Leverage Affiliations from CRIS Systems
Status: Active Development
Exploring pathways to integrate high-quality affiliation metadata maintained by institutional Current Research Information Systems (CRIS) into the scholarly metadata ecosystem.
How Projects Work
Each enrichment project follows a structured approach designed to maximise learning and community benefit:
Transparent Documentation: Every project maintains a public Project Hub Doc that provides comprehensive visibility into objectives, methodologies, progress updates, and results.
Open Results: All code and enriched datasets are published openly to enable review and reuse by the community.
Community Engagement: Each project includes community participants and welcomes community contributions throughout its development.
Rigorous Assessment: Each project concludes with thorough evaluation involving both internal analysis and community input to capture learnings and guide future development and collaborations.
Get Involved
Contribute to Active Projects: Each Project Hub Doc includes opportunities for community input through commenting and direct engagement with project teams.
Participate in Assessments: We conduct community assessments of completed projects. These sessions provide opportunities to evaluate outcomes, share expertise, and influence the direction of the COMET initiative.
Share Your Use Cases: Have metadata enrichment use cases that align with our approach? We're interested in learning about community needs that could inform future project development. Contribute enrichments or contact us to discuss collaboration opportunities.
Stay Updated
Follow our progress through project documentation and newsletter updates as we continue to demonstrate the transformative potential of collaborative metadata stewardship.