Workshop: Accessing knowledge from legacy biodiversity literature
Part of Biodiversity_Next Leiden / Netherlands
Wednesday, 23 October 2019, 13:30 – 15:00 h
Stadsgehoorzaal, Breestraat 60, Room: WAALSE KERK, Leiden / Netherlands
Giuseppe Abrami, Text Technology Lab, Goethe University Frankfurt
Gerwin Kasperek, University Library J.C. Senckenberg
Alexander Mehler, Text Technology Lab, Goethe University Frankfurt
The workshop is aimed at scientists working on all data-intensive aspects of biodiversity research and comprises three sections:
- An introduction to the BIOfid web portal, which provides fast and easy access, through a visual interface, to literature, facts, and concepts extracted from historical texts.
- Participants will use state-of-the-art, easy-to-use Natural Language Processing (NLP) tools, e.g. deep-learning-based analysis of text content. We will automatically analyse large text corpora to extract knowledge and link it to established ontologies and knowledge bases. Participants are invited to bring a selection of their own texts to explore with our methods.
- The BIOfid team will support participants in establishing custom workflows that cover all stages from source materials to processable texts, so that they achieve the best results with the BIOfid methods.
In all sections, participants will learn how the BIOfid team overcame diverse challenges regarding data quality, text recognition, information extraction, and linking.
The main goal of BIOfid is to make knowledge and data from legacy biodiversity literature available. To this end, we bring together the expertise of biologists and computer scientists to give biodiversity researchers a gateway to the data in historical biodiversity literature and to supply them with high-quality tools for text mining. The project currently focuses on Central European literature about three taxonomic groups: vascular plants, birds, and moths and butterflies.