COMET Enrichment Projects: From Ideas to Action

The scholarly metadata ecosystem has long operated under an outdated paradigm: one where individual depositors bear the burden of maintaining metadata whilst the broader community, despite possessing valuable knowledge and expertise, remains unable to contribute. The COMET initiative represents a fundamental shift from this individual stewardship model to collaborative community curation, driven by a clear vision: metadata quality should reflect the collective expertise of those who depend on it.

At our Community Meeting in July, we introduced the COMET demonstration enrichment projects, which are designed to test ideas and prove through practice that this transformation is both technically achievable and practically beneficial.

Current Enrichment Projects

We are running seven demonstration projects that address critical metadata challenges across the scholarly ecosystem:

  1. Match preprints to published articles

  2. Improve resource type classification

  3. Improve funding metadata

  4. Improve affiliations parsing of preprints

  5. Improve affiliations parsing of electronic theses and dissertations (ETDs)

  6. Improve affiliations parsing of journal articles

  7. Leverage affiliations from CRIS systems

From Vision to Reality

By serving as real-world test cases, these projects allow us to try out different approaches to metadata enrichment, rigorously evaluating the effectiveness, accuracy, cost, and scalability of these methods. This helps us determine what resources are required to support community-driven workflows by tackling genuine challenges with real stakeholders.

Each project focuses on high-impact problems that have measurable consequences across the scholarly record. This targeted approach helps us diagnose and quantify the true extent of metadata quality issues, ensuring our solutions address the most pressing needs whilst building confidence in the broader COMET vision.

The projects are designed to establish the operational foundations that will define COMET's success: requirements for community interaction, workflows for data sharing, and shared evaluative frameworks that ensure the infrastructure serves its diverse stakeholders effectively. Every challenge we encounter and solve strengthens the COMET model for future adoption.

Through these initial collaborations, we're starting to build a network of stakeholders who are invested in the long-term success of collaborative metadata infrastructure. By delivering tangible improvements to the scholarly record, we're demonstrating the potential value to researchers, institutions, funders, and the broader community, building momentum for the widespread transformation we envision.

To keep our work tangible and achievable, the COMET Model applies a "fields as features" approach that breaks down the overwhelming complexity of metadata improvement into manageable, measurable progress. Rather than attempting to solve everything at once, we're systematically working on high-impact metadata fields, building on proven methodologies and creating new approaches where gaps exist.
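As a rough illustration of what "measurable progress" per field could look like, the sketch below computes the share of records that carry a non-empty value for each metadata field. It is a minimal sketch and not the COMET method: the field names, the record structure, and the `field_completeness` helper are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical field names, chosen only for illustration.
FIELDS = ["affiliations", "resource_type", "funding", "related_identifiers"]


def field_completeness(records, fields=FIELDS):
    """Return, for each field, the share of records with a non-empty value."""
    if not records:
        return {field: 0.0 for field in fields}
    filled = Counter()
    for record in records:
        for field in fields:
            # A missing key, None, "" and [] all count as empty.
            if record.get(field):
                filled[field] += 1
    return {field: filled[field] / len(records) for field in fields}


# Made-up sample records; real metadata records will look different.
sample = [
    {"affiliations": ["Example University"], "resource_type": "preprint"},
    {"affiliations": [], "resource_type": "journal-article",
     "funding": ["Example Funder"]},
]

report = field_completeness(sample)
for field, share in report.items():
    print(f"{field}: {share:.0%}")
```

Running the same measurement before and after an enrichment gives a simple per-field view of what changed. Completeness is only one possible measure; it says nothing about whether the values are accurate.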

Four of our projects focus on affiliations metadata, recognising that incomplete affiliations create cascading problems throughout the research ecosystem. When affiliation metadata is poor, researchers struggle to find relevant work, institutions cannot demonstrate their impact, and funders lose visibility into the outcomes of their investments. By improving this key metadata field, we're working to remove barriers to equity and inclusion in global scholarship. Our additional pilots target resource type classification, funding statements, and related identifiers.

True transformation requires solutions that work across the full spectrum of scholarly content. Our projects also deliberately address diverse content types, ensuring that the improvements we develop can be applied broadly rather than benefiting only specific communities.

Join Us to Drive This Transformation

We've established 'Project Hubs' as the operational centre for each demonstration project. These publicly accessible Google Docs embody our commitment to transparency whilst offering a springboard for community contribution. This isn't just documentation. It's an active collaboration where your expertise can directly influence project direction and outcomes.

Each Project Hub provides visibility into objectives, methodologies, progress updates, and results. It serves as both a working document and a permanent record of our systematic approach to metadata transformation.

Learning from Results: Published Findings

We're committed to sharing not just our successes but our complete learnings, including the challenges we encounter along the way. The first project results report, for the "Match preprints to published articles" project, is available here, and we expect to publish the results from the other projects soon, with the "Improve resource type classification" project coming next.

Get Involved Now: Community Assessments

Every project concludes with rigorous assessment designed to capture learnings and guide COMET's strategic direction. We invite community members to participate in these assessment sessions, contributing their expertise to evaluate project outcomes and influence the development of the COMET initiative. The first community assessment for the "Match preprints to published articles" project is being scheduled for September, and we're actively seeking participants who can bring diverse perspectives to this critical evaluation process. To join us, complete this availability poll and we’ll be in touch.


Ready to contribute your expertise? Engage with the Enrichment Projects, contribute to our community project assessments, or share how your organisation can bring its metadata enrichments to the COMET initiative. Together, we're not just improving metadata; we're fundamentally transforming how the scholarly community creates, maintains, and benefits from the research information infrastructure that powers research discovery and evaluation.
