TTLab – Text Technology Lab

The TTLab (Text Technology Lab), headed by Prof. Alexander Mehler, is part of the Department of Computer Science and Mathematics (Fachbereich Informatik und Mathematik) at the Goethe Universität in Frankfurt. It investigates formal, algorithmic models to deepen our understanding of information processing in the humanities. We examine diachronic (time-dependent) as well as synchronic aspects of processing linguistic and non-linguistic, multimodal signs. The Lab works across several disciplines to bridge computer science on the one hand and corpus-based research in the humanities on the other. To this end, we develop information models and algorithms for the analysis of texts, images, and other objects relevant to research in the humanities.

News

  • Two publications accepted at IJCNLP-AACL

    The following publications were accepted at the International Joint Conference on Natural Language Processing & Asia-Pacific Chapter of the Association for Computational Linguistics (IJCNLP-AACL):

    Leon Hammerla, Alexander Mehler and Giuseppe Abrami. 2025. Standardizing Heterogeneous Corpora with DUUR: A Dual Data- and Process-Oriented Approach to Enhancing NLP Pipeline Integration. Proceedings of 2025 International Joint Conference on Natural Language Processing & Asia-Pacific Chapter of the Association for Computational Linguistics (IJCNLP-AACL-Findings). accepted.
    BibTeX
    @inproceedings{Hammerla:et:al:2025a,
      author    = {Hammerla, Leon and Mehler, Alexander and Abrami, Giuseppe},
      title     = {Standardizing Heterogeneous Corpora with DUUR: A Dual Data- and
                   Process-Oriented Approach to Enhancing NLP Pipeline Integration},
      booktitle = {Proceedings of 2025 International Joint Conference on Natural
                   Language Processing \& Asia-Pacific Chapter of the Association
                   for Computational Linguistics (IJCNLP-AACL-Findings)},
      year      = {2025},
      note      = {accepted},
      keywords  = {neglab}
    }
    Leon Hammerla, Andy Lücking, Carolin Reinert and Alexander Mehler. 2025. D-Neg: Syntax-Aware Graph Reasoning for Negation Detection. Proceedings of 2025 International Joint Conference on Natural Language Processing & Asia-Pacific Chapter of the Association for Computational Linguistics (IJCNLP-AACL-Findings). accepted.
    BibTeX
    @inproceedings{Hammerla:et:al:2025b,
      author    = {Hammerla, Leon and Lücking, Andy and Reinert, Carolin and Mehler, Alexander},
      title     = {D-Neg: Syntax-Aware Graph Reasoning for Negation Detection},
      booktitle = {Proceedings of 2025 International Joint Conference on Natural
                   Language Processing \& Asia-Pacific Chapter of the Association
                   for Computational Linguistics (IJCNLP-AACL-Findings)},
      year      = {2025},
      note      = {accepted},
      keywords  = {neglab}
    }

  • New EMNLP 2025 publication accepted

    The publication MedLinkDE – MedDRA Entity Linking for German with Guided Chain of Thought Reasoning was accepted at EMNLP 2025.

    Roman Christof, Farnaz Zeidi, Manuela Messelhäußer, Dirk Mentzer, Renate Koenig, Liam Childs and Alexander Mehler. November, 2025. MedLinkDE – MedDRA Entity Linking for German with Guided Chain of Thought Reasoning. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 31569–31581.
    BibTeX
    @inproceedings{Christof:et:al:2025,
      author    = {Christof, Roman and Zeidi, Farnaz and Messelhäußer, Manuela and Mentzer, Dirk
                   and Koenig, Renate and Childs, Liam and Mehler, Alexander},
      title     = {{M}ed{L}ink{DE} {--} {M}ed{DRA} Entity Linking for {G}erman with
                   Guided Chain of Thought Reasoning},
      editor    = {Christodoulopoulos, Christos and Chakraborty, Tanmoy and Rose, Carolyn
                   and Peng, Violet},
      booktitle = {Proceedings of the 2025 Conference on Empirical Methods in Natural
                   Language Processing},
      month     = {nov},
      year      = {2025},
      address   = {Suzhou, China},
      publisher = {Association for Computational Linguistics},
      url       = {https://aclanthology.org/2025.emnlp-main.1609/},
      doi       = {10.18653/v1/2025.emnlp-main.1609},
      pages     = {31569--31581},
      isbn      = {979-8-89176-332-6},
      pdf       = {https://aclanthology.org/2025.emnlp-main.1609.pdf},
      abstract  = {In pharmacovigilance, effective automation of medical data structuring,
                   especially linking entities to standardized terminologies such
                   as MedDRA, is critical. This challenge is rarely addressed for
                   German data. With MedLinkDE we address German MedDRA entity linking
                   for adverse drug reactions in a two-step approach: (1) retrieval
                   of medical terms with fine-tuned embedding models, followed (2)
                   by guided chain-of-thought re-ranking using LLMs. To this end,
                   we introduce RENOde, a German real-world MedDRA dataset consisting
                   of reportings from patients and healthcare professionals. To overcome
                   the challenges posed by the linguistic diversity of these reports,
                   we generate synthetic data mapping the two reporting styles of
                   patients and healthcare professionals. Our embedding models, fine-tuned
                   on these synthetic, quasi-personalized datasets, show competitive
                   performance with real datasets in terms of accuracy at high top-
                   recall, providing a robust basis for re-ranking. Our subsequent
                   guided Chain of Thought (CoT) re-ranking, informed by MedDRA coding
                   guidelines, improves entity linking accuracy by approximately
                   15{\%} (Acc@1) compared to embedding-only strategies. In this
                   way, our approach demonstrates the feasibility of entity linking
                   in medical reports under the constraints of data scarcity by relying
                   on synthetic data reflecting different informant roles of reporting
                   persons.}
    }

  • Invited Talk

    Andy Lücking has been invited to give a talk at the joint MMSR/ISA workshop, held as part of the International Conference on Computational Semantics.

Sign up to our mailing list to receive news updates.