Annotate them all

Led by Tiago Lubiana

Teaching computers how to read articles can bridge the publication-to-knowledge gap, accelerate scientific discovery, and free up time to focus on improving research culture.


The vast number of scientific articles published today makes it nearly impossible to read and interpret all relevant work. We end up missing essential discoveries and failing to compare findings across the literature. The lack of standardisation makes it harder to find researchers working on a given topic, and it hinders the precise description of research settings needed to ensure reproducibility. This gap affects the everyday work of every researcher, limiting the reach of our hypotheses and discussions and the information-based decision making that can solve real-world problems.

Work at the Sprint

To integrate knowledge with computers, we first need to annotate scientific texts. We will build a prototype that lets the community annotate scientific texts with Wikidata items. Wikidata is designed as a single system for all concepts, including those outside the biological domain. Using a single database lowers the barrier to editing, which is likely to improve user engagement. The prototype will save the annotations in a format compatible with the EuropePMC annotations framework.
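As a rough illustration of what a saved annotation could look like, the sketch below links a span of text to a Wikidata item and serialises it as JSON. The field names (`src`, `id`, `provider`, `anns`, `exact`, `prefix`, `postfix`, `tags`) are an assumption modelled on the general shape of text annotations with surrounding context; they would need to be checked against the EuropePMC documentation before use. The article identifier and provider name are hypothetical placeholders.

```python
import json
from dataclasses import dataclass

WIKIDATA_ENTITY_PREFIX = "http://www.wikidata.org/entity/"


@dataclass
class Annotation:
    exact: str    # the annotated span exactly as it appears in the text
    prefix: str   # text immediately before the span, used to locate it
    postfix: str  # text immediately after the span
    qid: str      # Wikidata item identifier, e.g. "Q7187"
    label: str    # human-readable label of the Wikidata item


def annotate_span(text, start, end, qid, label, context=20):
    """Build an Annotation for text[start:end] with up to `context` chars on each side."""
    return Annotation(
        exact=text[start:end],
        prefix=text[max(0, start - context):start],
        postfix=text[end:end + context],
        qid=qid,
        label=label,
    )


def to_record(source, article_id, provider, annotations):
    """Group annotations for one article into a JSON-serialisable dict.

    The key names are assumed, not taken from an official specification.
    """
    return {
        "src": source,
        "id": article_id,
        "provider": provider,
        "anns": [
            {
                "exact": a.exact,
                "prefix": a.prefix,
                "postfix": a.postfix,
                "tags": [{"name": a.label, "uri": WIKIDATA_ENTITY_PREFIX + a.qid}],
            }
            for a in annotations
        ],
    }


sentence = "Each gene encodes a product."
start = sentence.index("gene")
annotation = annotate_span(sentence, start, start + len("gene"), qid="Q7187", label="gene")

# "PMC0000000" and "annotate-them-all" are placeholders for illustration only.
record = to_record("PMC", "PMC0000000", "annotate-them-all", [annotation])
print(json.dumps(record, indent=2))
```

Storing the text before and after each span, rather than only character offsets, is one way to keep annotations usable if the article text is re-rendered with different whitespace.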

The tool may use the Wikidata API to power a user interface that suggests candidate matches for terms in open full texts and abstracts. After quality control, this organised corpus of Wikidata-annotated texts can be fed to EuropePMC’s system, which displays annotations using the already functioning SciLite system.
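One possible way to obtain candidate matches is the `wbsearchentities` action of the Wikidata API, which returns items whose labels or aliases match a search string. The sketch below is not the prototype's design; the function names and the `User-Agent` string are hypothetical. URL construction and response parsing are kept separate from the network call so they can be checked offline.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(term, language="en", limit=5):
    """Return a wbsearchentities URL that looks up `term` among Wikidata items."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def parse_candidates(payload):
    """Reduce a wbsearchentities response to the fields a user interface would show."""
    return [
        {
            "qid": hit["id"],
            "label": hit.get("label", ""),
            "description": hit.get("description", ""),
        }
        for hit in payload.get("search", [])
    ]


def fetch_candidates(term, language="en", limit=5):
    """Query the live API (needs network access) and return candidate items."""
    request = Request(
        build_search_url(term, language, limit),
        headers={"User-Agent": "annotation-prototype-sketch/0.1"},  # hypothetical name
    )
    with urlopen(request, timeout=10) as response:
        return parse_candidates(json.load(response))


# Offline example using a hand-written response with the same overall shape.
sample_response = {
    "search": [
        {"id": "Q7187", "label": "gene", "description": "basic physical and functional unit of heredity"},
    ],
    "success": 1,
}
candidates = parse_candidates(sample_response)
search_url = build_search_url("cell cycle")
```

In an interface, a curator would pick the right item from the list returned by `fetch_candidates`, and the chosen identifier would then be stored with the annotated span.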

I am looking for

Contributors skilled in UX and software development, and with domain expertise in natural language processing, annotations and bioinformatics.