Aravind Venkatesan

Senior Data Scientist

Europe PMC is a digital repository that indexes life science scholarly publications. It provides intuitive and powerful search tools and links the underlying data to the relevant biological data resources. Europe PMC hosts 40.5 million abstracts and 7.8 million full-text articles, including research articles, preprints, books, protocols, and reviews. Europe PMC uses text-mining techniques, including machine learning, to annotate literature from the Open Access and CC-BY set with information such as relevant biological terms, their relationships, and data citations or accession numbers. The text-mined information is publicly available and programmatically accessible through our Annotation API in a standardised, machine-readable format for reusability. This helps stakeholders, including scientists, bioinformaticians, and curators, to access the underlying data, and it promotes Open Science. In this talk, we will give an overview of our text and data mining activities and their impact.