Projects
Improve funding metadata
Developing a parsing model for funding statements to address information gaps within scholarly metadata. Piloting with arXiv preprints.
Improve parsing of ETDs
Collaborators: University of Oklahoma
Creating approaches for parsing author, advisor and degree metadata from electronic theses and dissertations (ETDs), recognising the distinct challenges and opportunities presented by this important type of scholarly output.
Improve affiliations parsing of multilingual articles
Assessing current approaches to affiliations parsing and developing new methods for extracting accurate institutional information from a multilingual corpus. Piloting the approach with open-access articles from the PKP Beacon Journal.
Posters.Science
Collaborators: The FAIR Data Innovations Hub at California Medical Innovations Institute (CalMI2), lead
Collaborating on a scientific poster classification model, developing standards for converting posters to JSON, and round-tripping enriched metadata to DataCite DOI records.
Match preprints to published articles
Developing a matching strategy to connect preprints to published articles, improving knowledge of how research moves from early sharing to formal publication. Piloting with arXiv preprints.
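As a rough illustration of what one component of such a matching strategy might look like, the sketch below compares normalised titles and accepts the best candidate only above a similarity threshold. This is an assumption for illustration, not the project's actual method: the function names, the 0.9 threshold, and the use of titles alone are all hypothetical, and a real matcher would likely also weigh authors, dates, and identifiers.

```python
import re
from difflib import SequenceMatcher
from typing import Iterable, Optional


def normalize_title(title: str) -> str:
    """Lowercase, replace anything other than ASCII letters, digits and whitespace with a space, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", title.lower())
    return " ".join(cleaned.split())


def title_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalised titles."""
    return SequenceMatcher(None, normalize_title(a), normalize_title(b)).ratio()


def best_match(
    preprint_title: str,
    candidate_titles: Iterable[str],
    threshold: float = 0.9,  # hypothetical cut-off, would need tuning on real data
) -> Optional[str]:
    """Return the most similar candidate title, or None if none reaches the threshold."""
    best_title, best_score = None, 0.0
    for candidate in candidate_titles:
        score = title_similarity(preprint_title, candidate)
        if score > best_score:
            best_title, best_score = candidate, score
    return best_title if best_score >= threshold else None
```

The ASCII-only normalisation here would discard non-Latin characters, so a multilingual corpus would need a different normalisation step.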
Leverage affiliations from CRIS systems
Collaborators: European Molecular Biology Lab (EMBL); Wageningen University & Research (WUR)
Exploring pathways to integrate high-quality affiliation metadata maintained by institutional Current Research Information Systems (CRIS) into the scholarly metadata ecosystem.
Improve resource type classification
Enhancing the discoverability of research outputs by developing a resource type classifier. Piloting the approach with a subset of DataCite DOI records registered with generic resource types.