TTLab – Text Technology Lab

The TTLab (Text Technology Lab), headed by Prof. Alexander Mehler, is part of the Department of Computer Science and Mathematics (Fachbereich Informatik und Mathematik) at the Goethe Universität in Frankfurt. It investigates formal, algorithmic models to deepen our understanding of information processing in the humanities. We examine diachronic (time-dependent) as well as synchronic aspects of processing linguistic and non-linguistic, multimodal signs. The Lab works across several disciplines to bridge computer science on the one hand and corpus-based research in the humanities on the other. To this end, we develop information models and algorithms for the analysis of texts, images, and other objects relevant to research in the humanities.

News

  • New publication in the journal PLOS ONE

    We are pleased to announce that the article Syntactic language change in English and German: Metrics, parsers, and convergences has been published in PLOS ONE.

    Yanran Chen, Wei Zhao, Anne Breitbarth, Manuel Stoeckel, Alexander Mehler, Dominik Schlechtweg and Steffen Eger. April, 2026. Syntactic language change in English and German: Metrics, parsers, and convergences. PLOS ONE, 21(4):1–33.
    BibTeX
    @article{Chen:et:al:2026,
      doi       = {10.1371/journal.pone.0346096},
      author    = {Chen, Yanran and Zhao, Wei and Breitbarth, Anne and Stoeckel, Manuel
                   and Mehler, Alexander and Schlechtweg, Dominik and Eger, Steffen},
      journal   = {PLOS ONE},
      publisher = {Public Library of Science},
      title     = {Syntactic language change in English and German: Metrics, parsers,
                   and convergences},
      year      = {2026},
      month     = {04},
      volume    = {21},
      url       = {https://doi.org/10.1371/journal.pone.0346096},
      pages     = {1--33},
      abstract  = {Syntactic language change has gained increasing attention in recent
                   years. Previous computational work based on dependency relations
                   has focused on diachronic trends in dependency distance, which
                   measures the linear distance between dependent words, using dependency
                   trees automatically predicted by a dependency parser (mostly the
                   Stanford CoreNLP parser). In this work, we introduce a set of
                   15 syntax metrics that extend the analysis beyond linear distance
                   by incorporating both linear and tree graph properties of dependency
                   trees, such as tree height and degree. Besides, we propose a multi-parser
                   approach to reduce the impact of using specific parsers, thereby
                   increasing the robustness of the detected language changes. Through
                   a cross-lingual investigation of English and German in parliamentary
                   debates from the last 160 years, using 6 different parsers (CoreNLP
                   and five newer alternatives), we demonstrate that: (1) Relying
                   on one single parser can be problematic, as the agreement on predicted
                   trends can be low across parsers. (2) Our set of metrics can capture
                   subtle patterns of syntactic changes. Our analysis shows that
                   syntactic change over the time period inspected is largely similar
                   between English and German, with only 2.2% of cases yielding opposite
                   trends in these metrics. (3) We also show that changes in syntactic
                   metrics seem to be more frequent at the tails of sentence length
                   distributions and often move in opposite directions for short
                   and long sentences. To our best knowledge, ours is the most comprehensive
                   computational analysis of syntactic language change using modern
                   NLP technology in recent corpora of English and German.},
      number    = {4}
    }
  • New publications at SemEval-2026

    We are pleased to inform you about the acceptance of papers at the International Workshop on Semantic Evaluation (SemEval-2026):

    Yahya Missaoui, Solomon Kebede, Mounika Marreddy and Alexander Mehler. 2026. SemEval-2026 Task 3: Dimensional Aspect-Based Sentiment Analysis. Proceedings of the International Workshop on Semantic Evaluation (SemEval-2026). accepted.
    BibTeX
    @inproceedings{Missaoui:et:al:2026,
      title     = {SemEval-2026 Task 3: Dimensional Aspect-Based Sentiment Analysis},
      author    = {Missaoui, Yahya and Kebede, Solomon and Marreddy, Mounika and Mehler, Alexander},
      booktitle = {Proceedings of the International Workshop on Semantic Evaluation (SemEval-2026)},
      year      = {2026},
      publisher = {Association for Computational Linguistics},
      note      = {accepted}
    }

    Noah Tratzsch, Asmaa Al-Raian, Mounika Marreddy and Alexander Mehler. 2026. SemEval-2026 Task 11: Reducing Content Effects Using Layered Activation Steering. Proceedings of the International Workshop on Semantic Evaluation (SemEval-2026). accepted.
    BibTeX
    @inproceedings{Tratzsch:et:al:2026,
      title     = {SemEval-2026 Task 11: Reducing Content Effects Using Layered Activation Steering},
      author    = {Tratzsch, Noah and Al-Raian, Asmaa and Marreddy, Mounika and Mehler, Alexander},
      booktitle = {Proceedings of the International Workshop on Semantic Evaluation (SemEval-2026)},
      year      = {2026},
      publisher = {Association for Computational Linguistics},
      note      = {accepted}
    }

    Samuel Richer, Mounika Marreddy and Alexander Mehler. 2026. TTLab at SemEval-2026 Task 10: Transformer-based Approaches for Psycholinguistic Conspiracy Detection in Social Media Discourse. Proceedings of the International Workshop on Semantic Evaluation (SemEval-2026). accepted.
    BibTeX
    @inproceedings{Richer:et:al:2026,
      title     = {TTLab at SemEval-2026 Task 10: Transformer-based Approaches for
                   Psycholinguistic Conspiracy Detection in Social Media Discourse},
      author    = {Richer, Samuel and Marreddy, Mounika and Mehler, Alexander},
      booktitle = {Proceedings of the International Workshop on Semantic Evaluation (SemEval-2026)},
      year      = {2026},
      publisher = {Association for Computational Linguistics},
      note      = {accepted}
    }

  • New workshop publications at LREC 2026

    We are pleased to inform you about the acceptance of papers at the Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT7) as well as the Workshop on Structured Linguistic Data and Evaluation (SLiDE), co-located with the Language Resources and Evaluation Conference (LREC 2026).

    TTLab at AraSentEval: SARF (صرف) Sentiment Analysis via Root-based Fusion for Multi-Dialectal Arabic
    Ali Abusaleh, Bhuvanesh Verma and Alexander Mehler. 2026. TTLab at AraSentEval: SARF (صرف) Sentiment Analysis via Root-based Fusion for Multi-Dialectal Arabic. Proceedings of the 7th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT7), co-located with the Language Resources and Evaluation Conference (LREC 2026). accepted.
    BibTeX
    @inproceedings{Abusaleh:et:al:2026:sarf,
      title     = {TTLab at AraSentEval: SARF (صرف) Sentiment Analysis via Root-based
                   Fusion for Multi-Dialectal Arabic},
      author    = {Abusaleh, Ali and Verma, Bhuvanesh and Mehler, Alexander},
      booktitle = {Proceedings of the 7th Workshop on Open-Source Arabic Corpora
                   and Processing Tools (OSACT7), co-located with the Language Resources
                   and Evaluation Conference (LREC 2026)},
      eventdate = {May, 2026},
      location  = {Palma, Mallorca, Spain},
      year      = {2026},
      keywords  = {NLP, Sentiment Analysis, Arabic analysis, new-data-spaces, circlet, satek},
      abstract  = {Arabic sentiment analysis is challenged by morphological complexity
                   and lexical variation across Arabic dialects, compounded by subjectivity
                   in how speakers and writers express sentiment. In this paper,
                   we present our submission for the AraSentEval 2026 Shared Task
                   on Arabic Dialect Sentiment Analysis. We propose SARF (صرف) a
                   multi-view architectural framework that integrates surface-level
                   context with stemmed and rooted morphological perspectives using
                   a shared MARBERTv2 encoder. Our system employs a hybrid BERT-CNN-BiLSTM-Attention
                   architecture to capture both local sentiment n-grams and global
                   sequential dependencies. Experimental results show that while
                   individual morphological normalization strategies (stemming or
                   rooting) may degrade performance, their joint integration via
                   cross-morphological attention provides robust features across
                   diverse dialects. Our final system achieved a competitive macro-F1-score
                   of 0.9263, ranking 2nd out of 15 participating teams.},
      note      = {accepted}
    }
    Gutenberg+: A More Temporally Faithful Corpus for Diachronic NLP
    Leon Hammerla and Alexander Mehler. 2026. Gutenberg+: A More Temporally Faithful Corpus for Diachronic NLP. Proceedings of the Workshop on Structured Linguistic Data and Evaluation (SLiDE 2026), co-located with the Language Resources and Evaluation Conference (LREC 2026). accepted.
    BibTeX
    @inproceedings{Hammerla:Mehler:2026:a,
      title     = {{Gutenberg+}: A More Temporally Faithful Corpus for Diachronic {NLP}},
      author    = {Hammerla, Leon and Mehler, Alexander},
      booktitle = {Proceedings of the Workshop on Structured Linguistic Data and Evaluation
                   (SLiDE 2026), co-located with the Language Resources and Evaluation
                   Conference (LREC 2026)},
      address   = {Palma de Mallorca (Spain)},
      year      = {2026},
      keywords  = {neglab},
      note      = {accepted}
    }
