News

Student tutors for the “Programming Practical Course (PPR)” (winter semester 2025 / 2026)

The Institute of Computer Science is recruiting tutors for the winter semester 2025 / 2026 to assist in various courses.

The Chair of Computational Humanities / Text Technology is responsible for the basic course “Programming Practical Course (PPR)”. We are particularly looking for students who have already passed this course.

Several tutor positions are available; all further information is given in the announcement here.

The deadline for applications is July 31, 2025 and applications can be sent to .

We look forward to receiving your application.

New publications accepted

The following publications have been accepted at the conferences listed below:

ACM Hypertext 2025 (36th ACM Conference on Hypertext and Social Media)

Giuseppe Abrami, Daniel Bundan, Chrisowaladis Manolis and Alexander Mehler. 2025. VR-ParlExplorer: A Hypertext System for the Collaborative Interaction in Parliamentary Debate Spaces. Proceedings of the 36th ACM Conference on Hypertext and Social Media, 177–183.
BibTeX
@inproceedings{Abrami:et:al:2025:c,
  author    = {Abrami, Giuseppe and Bundan, Daniel and Manolis, Chrisowaladis
               and Mehler, Alexander},
  title     = {VR-ParlExplorer: A Hypertext System for the Collaborative Interaction
               in Parliamentary Debate Spaces},
  year      = {2025},
  isbn      = {9798400715341},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  url       = {https://doi.org/10.1145/3720553.3746672},
  doi       = {10.1145/3720553.3746672},
  abstract  = {The enhanced visualization and interaction with information in
               collaborative VR environments enabled by chatbots is currently
               rather limited. To fill this gap and create a concrete application
               that combines spatial and virtual concepts of hypertext systems
               based on the use of LLMs, we present VR-ParlExplorer as a system
               for virtualizing plenary debates that allows users to interact
               with virtual members of parliament through chatbots. VR-ParlExplorer
               is implemented as a Plugin for Va.Si.Li-Lab to enable immersion
               in the dynamics of communication in parliamentary debates. The
               paper describes the functionality of VR-ParlExplorer and discusses
               specifics of the use case it addresses.},
  booktitle = {Proceedings of the 36th ACM Conference on Hypertext and Social Media},
  pages     = {177--183},
  numpages  = {7},
  location  = {Chicago, USA},
  series    = {HT '25},
  pdf       = {https://dl.acm.org/doi/pdf/10.1145/3720553.3746672}
}


KONVENS 2025 (21st Conference on Natural Language Processing)

Daniel Bundan, Giuseppe Abrami and Alexander Mehler. 2025. Multimodal Docker Unified UIMA Interface: New Horizons for Distributed Microservice-Oriented Processing of Corpora using UIMA. Proceedings of the 21st Conference on Natural Language Processing (KONVENS 2025): Long and Short Papers, 257–268.
BibTeX
@inproceedings{Bundan:Abrami:Mehler:2025,
  author    = {Bundan, Daniel and Abrami, Giuseppe and Mehler, Alexander},
  title     = {Multimodal Docker Unified {UIMA} Interface: New Horizons for Distributed
               Microservice-Oriented Processing of Corpora using {UIMA}},
  booktitle = {Proceedings of the 21st Conference on Natural Language Processing
               (KONVENS 2025): Long and Short Papers},
  year      = {2025},
  editor    = {Wartena, Christian and Heid, Ulrich},
  location  = {Hildesheim, Germany},
  address   = {Hannover, Germany},
  publisher = {HsH Applied Academics},
  pages     = {257--268},
  series    = {KONVENS '25},
  url       = {https://aclanthology.org/2025.konvens-1.22/},
  pdf       = {https://aclanthology.org/2025.konvens-1.22.pdf},
  poster    = {https://www.texttechnologylab.org/wp-content/uploads/2025/09/Poster_Multimodal_DUUI_KONVENS_2025.pdf},
  keywords  = {duui,neglab,new-data-spaces,circlet}
}

New publication accepted in ACL Findings 2025

Our paper, Filling the Temporal Void: Recovering Missing Publication Years in the Project Gutenberg Corpus Using LLMs, has been accepted to the Findings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025).

Omar Momen, Manuel Schaaf and Alexander Mehler. July 2025. Filling the Temporal Void: Recovering Missing Publication Years in the Project Gutenberg Corpus Using LLMs. Findings of the Association for Computational Linguistics: ACL 2025, 17318–17334.
BibTeX
@inproceedings{Momen:Schaaf:Mehler:2025,
  title     = {Filling the Temporal Void: Recovering Missing Publication Years
               in the Project Gutenberg Corpus Using {LLM}s},
  author    = {Momen, Omar and Schaaf, Manuel and Mehler, Alexander},
  editor    = {Che, Wanxiang and Nabende, Joyce and Shutova, Ekaterina and Pilehvar, Mohammad Taher},
  booktitle = {Findings of the Association for Computational Linguistics: ACL 2025},
  month     = {jul},
  year      = {2025},
  address   = {Vienna, Austria},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2025.findings-acl.890/},
  pages     = {17318--17334},
  isbn      = {979-8-89176-256-5},
  abstract  = {Analysing texts spanning long periods of time is critical for
               researchers in historical linguistics and related disciplines.
               However, publicly available corpora suitable for such analyses
               are scarce. The Project Gutenberg (PG) corpus presents a significant
               yet underutilized opportunity in this context, due to the absence
               of accurate temporal metadata. We take advantage of language models
               and information retrieval to explore four sources of information
               {--} Open Web, Wikipedia, Open Library API, and PG books texts
               {--} to add missing temporal metadata to the PG corpus. Through
               20 experiments employing state-of-the-art Large Language Models
               (LLMs) and Retrieval-Augmented Generation (RAG) methods, we estimate
               the production years of all PG books. We curate an enriched metadata
               repository for the PG corpus and propose a refined version for
               it, which includes 53,774 books with a total of 3.8 billion tokens
               in 11 languages, produced between 1600 and 2000. This work provides
               a new resource for computational linguistics and humanities studies
               focusing on diachronic analyses. The final dataset and all experiments
               data are publicly available (https://github.com/OmarMomen14/pg-dates).},
  pdf       = {https://aclanthology.org/2025.findings-acl.890.pdf}
}