Teaching – Text Technology Lab

Thesis Topic Proposals

2026

Master Thesis: AnonymizeR: Anonymizing Real-Time Augmented Reality Video Feedback in Real-World Settings.

Description

As Augmented Reality devices are showing the first signs of becoming prominent in everyday life (e.g., Meta Ray-Bans), various societal issues and challenges arise that will need to be resolved in the near future. An important aspect of these developments concerns the privacy of the people who share a space with the user, as well as the privacy of the users themselves and of their environment. It should therefore be possible to anonymize certain aspects in real time, both during live AR use and in recordings, before this data is sent off for LLM inference or permanently saved. This includes faces and voices, but also other information such as visible documents, names, and further personally identifying elements. Throughout this thesis, you will explore the field of automatic anonymization and develop novel solutions for anonymizing surrounding information in both AR and video contexts. This work sits at the intersection of privacy-preserving machine learning and human-computer interaction (HCI), and will involve designing, implementing, and evaluating a prototype that anonymizes sensitive visual and auditory information in real time without compromising the user's experience.

Corresponding Lab Member: Patrick Schrottenbacher and Alexander Mehler.

Bachelor/Master Thesis: SoundAideR: An Augmented Reality Application for Visualizing and Contextualizing Auditory Signals.

Description

As Augmented Reality devices are showing the first signs of becoming prominent in everyday life (e.g., Meta Ray-Bans), various societal issues and challenges arise that will need to be resolved in the near future. Focusing on the new capabilities these glasses allow for, there is also an exciting, more inclusive future ahead of us. One major aspect concerns aiding individuals with restricted hearing by visualizing information that would otherwise be missing or difficult to notice. The obvious use case is visualizing real-time transcripts of a person's spoken utterances within the user's peripheral vision, which, besides translation, is a well-established area. When using such tools, it becomes clear that not only the information conveyed by the transcript matters but, even more importantly, the space within which the user interacts. Highlighting a specific interlocutor that the user wants to focus on, or visualizing the data at a certain position in space, are examples of this. Additionally, sounds originating outside the user's field of view could be of vital importance (e.g., a car approaching from behind). Throughout this thesis, you will work on utilizing existing models and algorithms to contextualize and visualize auditory signals. This work lies strongly in the field of human-computer interaction (HCI), and will involve designing, implementing, and evaluating an AR prototype that helps users perceive and interpret sound events in their environment more effectively.

Corresponding Lab Member: Patrick Schrottenbacher and Alexander Mehler.

Master Thesis: Generative Frameworks for Multimodal Survey Data Anonymization.

Description

Ensuring privacy in open-ended survey research demands sophisticated anonymization frameworks capable of sanitizing text, images, and audio without destroying the downstream research utility of the data. This Master's thesis explores the intersection of deep generative modeling and privacy preservation to build an end-to-end multimodal anonymization pipeline. For text, the student will investigate advanced pseudonymization and contextual text-replacement strategies leveraging state-of-the-art LLMs and privacy-filtering guardrails. For visual data, the project will combine semantic segmentation and OCR with generative Diffusion Models to perform secure, context-aware inpainting (e.g., replacing faces or identifying text while preserving the overall scene context). For audio, the student will explore voice-masking and speech synthesis techniques to redact speaker identity. The core research focus will be a rigorous, quantitative evaluation of the trade-off between strict privacy preservation and the semantic utility of the sanitized datasets for downstream empirical analysis.
Corresponding Lab Member: Ali Abusaleh and Alexander Mehler.

Bachelor Thesis: Joint Sarcasm, Sentiment, and Stance Detection in Underrepresented Dialects.

Description

Colloquial communication on social media frequently blends emotional expression with social or political viewpoints, making the simultaneous detection of sentiment, stance, and sarcasm highly interconnected. Sarcasm heavily co-exists with these features, often acting as a linguistic disruptor that flips sentiment polarity and obscures an author's true stance toward a target. This thesis involves developing a unified prompt-based or multi-task classification pipeline to jointly model sarcasm, sentiment, and stance in dialectal text. The research will primarily anchor on Arabic dialects (such as Levantine, Egyptian, and Gulf) and extend to underrepresented Indic dialects (such as regional variants of Hindi or Bhojpuri). The student will leverage state-of-the-art dialectal and multilingual language models (e.g., MARBERTv2, IndicBERT) to evaluate how models capture subjective alignment under low-resource constraints. Tasks include benchmarking multi-dialectal datasets, implementing joint classification layers, and analyzing error patterns caused by sarcasm-induced stance flips.
Corresponding Lab Member: Ali Abusaleh and Alexander Mehler.

(Joint) Bachelor/Master Thesis: Graph-Based Modeling, Visualization, and Extension of a Discourse Segmentation Tool.

Description

This thesis focuses on the development and extension of a discourse text and reasoning segmentation tool. The tool currently segments texts into labeled reasoning and discourse steps, which are represented as chain-like graphs with labels on the nodes. Possible thesis directions include the development of a web-based frontend, the exploration and evaluation of visualization approaches for these graphs, including possible three-dimensional representations, and backend extensions such as introducing additional LLM-based engines for deriving labeled edges between segments after the initial segmentation step. Depending on the thesis type and scope, the work may focus only on the frontend and visualization component, only on the graph-based backend extension, or on both. The combined frontend and backend extension is especially suitable for a Master's thesis or as a joint project for two Bachelor's thesis candidates. In this way, the project connects graph-based modeling, discourse segmentation, LLM-based analysis, graph visualization, and extensible tool development.
Corresponding Lab Member: Leon Hammerla and Alexander Mehler.

2025

Bachelor Thesis: Full-text Scientific Argument Mining using Large Language Models.

Description

Scientific articles contain a mix of argumentative and non-argumentative content, yet only argumentative sentences, particularly claims, contribute to the scientific discourse and are therefore central to argument mining. A key challenge is not only to identify whether a sentence expresses a claim, but also to distinguish between own claims (novel contributions by the author), background claims (statements grounded in prior work, often signaled by citations), data or evidence (empirical results that support claims), and non-argumentative content (methodological or descriptive text). This project proposes to address the task of claim detection and classification in full-text scientific articles by leveraging large language models, beginning with binary classification of claim versus non-claim sentences and extending to multi-class classification across the four categories. The approach will explore prompt-based classification and domain-specific fine-tuning, with the potential integration of citation-aware heuristics, aiming to establish a robust baseline for scientific claim detection as a foundation for downstream argument mining tasks.

Corresponding Lab Member: Bhuvanesh Verma and Alexander Mehler.

Bachelor/Master Thesis: Constructing and Evaluating Human Digital Twins from Browsing Data.

Description

The topic of digital twins, specifically Human Digital Twins (HDTs), is growing fast, with many works surveying architectures, applications, and ethical issues. One research problem concerns the creation of these HDTs. Browsing histories and search behavior provide a rich source of data detailing what people read and search for. Using this data, an HDT might be able to predict a user's behavior or responses, analogous to how advertisers profile users for targeted recommendations. The task of this thesis is to design a system for creating such HDTs from browsing data and to evaluate its effectiveness. For this purpose, a dataset of existing browsing data and user responses can be used, although extending it is encouraged. The system can be further extended into the area of survey studies to validate the accuracy and perceived relevance of the Human Digital Twin by comparing its predictions with self-reported user responses. This work lies at the intersection of human-computer interaction (HCI), generative modeling, and computational social science.

Corresponding Lab Member: Patrick Schrottenbacher and Alexander Mehler.

Bachelor/Master Thesis: Holodecks: Real-Time Interactive 3D Scene Generation with Large Language Models.

Description

The creation of immersive digital environments lies at the core of every immersive interaction. Designing such spaces, however, is time-consuming, and most systems do not support environment creation in real time or at runtime. Several studies have explored the generation of environments using Large Language Models (LLMs) and predefined object databases, enabling a user to describe a scene which the LLM then reconstructs in 3D space. This remains an unsolved challenge, as the adjustable dimensions and flexibility of generated spaces are still limited. Furthermore, as object generation based on LLM prompts grows ever more capable, integrating such capabilities with scene generation becomes increasingly relevant. Real-time deployment of these systems could enable the creation of holodeck-like experiences, in which anything a user describes can be dynamically generated and interacted with in 3D space. The task of this thesis is to design an LLM-based framework that allows users to request the generation of 3D scenes through natural language input. This includes both partial descriptions of individual elements and complete, coherent scene specifications. The elements within the scene may optionally be derived from 3D models generated by a separate generative model. Moreover, these environments should not be confined to a single user but should be usable and shareable by multiple users. This work lies at the intersection of computational design, generative modeling, and human-computer interaction (HCI). It aims to contribute toward more intuitive, language-driven methods for creating and manipulating virtual environments.

Corresponding Lab Member: Patrick Schrottenbacher and Alexander Mehler.

Master Thesis: Multi-Modal AI Agents for Immersive Virtual Reality.

Description

As virtual reality (VR) systems become ever more capable and immersive, the need for highly interactive environments continues to grow. One area of active research is that of AI agents, software entities that dynamically interact with their surroundings to perform tasks aligned with predefined goals. The incorporation of such agents into virtual environments has attracted similarly high interest, even in enterprise settings, e.g., with NVIDIA's Autonomous Game Characters. These systems can seamlessly converse with a user through natural language and even execute predefined tasks. However, in the multi-modal landscape enabled by VR technology, these predominantly auditory interactions remain limited compared to the modalities such a system could leverage. Furthermore, the rise of more modular and capable LLM systems, enabled by mechanisms such as the Model Context Protocol, allows for highly adaptable and extensible agents. The task of this thesis is to design and implement an AI agent that is not solely mono-modal. Besides understanding natural spoken language, this could include aspects such as facial expressions, the environment surrounding the agent, and physical interactions. These expanded input and output modalities should be reflected in the agent’s range of actions and behaviors. This work lies at the intersection of virtual reality, embodied artificial intelligence, and human-computer interaction (HCI). It aims to contribute to the development of more immersive, expressive, and responsive AI agents that bridge the gap between language-driven interaction and embodied presence in virtual environments.

Corresponding Lab Member: Patrick Schrottenbacher and Alexander Mehler.

Bachelor Thesis: Language Forensics in the Age of AI: Retrospective Watermarking for Text Authenticity.

Description

Recent advances in large language models (LLMs) have made it increasingly difficult to distinguish between human-written and AI-generated text. While proactive watermarking techniques can embed detectable patterns during text generation, they rely on control over the generation process, a condition often unmet in real-world scenarios where AI-generated texts circulate freely online without prior tagging. This thesis explores retrospective watermarking and tagging, that is, the post-hoc identification and labeling of AI-generated text after its creation and distribution. You will investigate methods that combine linguistic stylometry, statistical signal analysis, and semantic fingerprinting to identify traces of machine generation. Furthermore, the thesis examines how artificial "watermarks" can be retroactively embedded or inferred to improve downstream detection models and content source attribution. This work lies at the intersection of computational linguistics, machine learning, and digital forensics, and aims to address pressing societal concerns regarding misinformation and authorship transparency in the age of generative AI.
Corresponding Lab Member: Alexander Mehler.

Bachelor Thesis: Multimodal Sentiment Analysis via Cross-Attention Fusion in Latent Space.

Description

Accurately estimating sentiments in video data is a complex challenge that requires integrating visual, audio, and textual cues. This proposal involves developing a multimodal sentiment analysis pipeline that uses state-of-the-art models to extract embeddings from different modalities: Qwen2.5 for text, JEPA 2 for visuals, and WhisperX (pyannote) for audio. The embeddings then undergo projection into a shared latent space and fusion via cross-attention mechanisms to capture intermodal dependencies. A probabilistic output layer will then estimate sentiment distributions over fine-grained emotions. To accelerate the analysis process for real-time applications, the pipeline should use a stream-like vector projection method that updates latent space representations incrementally instead of reprocessing entire sequences. The pipeline will be built using DUUI to ensure modularity, scalability, and reproducibility. The goal of this thesis is to tackle limitations of unimodal sentiment analysis and traditional fusion methods to achieve efficient, accurate, and scalable multimodal sentiment analysis.
Corresponding Lab Member: Ali Abusaleh and Alexander Mehler.
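
The fusion step named in the proposal above (projection into a shared latent space followed by cross-attention) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy vectors stand in for real model embeddings, the projection matrices `W_text` and `W_audio` are hypothetical, and plain Python replaces any deep learning framework or DUUI component the thesis would actually use.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def project(vectors, matrix):
    """Map each vector into the shared space; matrix has one row per output dimension."""
    return [[sum(w * x for w, x in zip(row, vec)) for row in matrix] for vec in vectors]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query attends over all keys."""
    scale = math.sqrt(len(keys[0]))
    fused = []
    for q in queries:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        fused.append([sum(w * v[i] for w, v in zip(weights, values))
                      for i in range(len(values[0]))])
    return fused

# Toy stand-ins for model outputs (hypothetical numbers, not real embeddings).
text_emb = [[0.2, 0.1, 0.7], [0.9, 0.3, 0.1]]      # 2 text tokens, dimension 3
audio_emb = [[0.5, 0.4], [0.1, 0.8], [0.6, 0.6]]   # 3 audio frames, dimension 2

# Hypothetical projection matrices into a shared 2-dimensional latent space.
W_text = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
W_audio = [[1.0, 0.0], [0.0, 1.0]]

text_latent = project(text_emb, W_text)
audio_latent = project(audio_emb, W_audio)

# Text tokens act as queries; audio frames supply keys and values.
fused = cross_attention(text_latent, audio_latent, audio_latent)
print(fused)
```

Each fused vector is a weighted average of the audio frames, with weights determined by how well each frame matches the text token. In a trained model the projections would be learned, and the same pattern would be repeated for the visual modality.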

Master Thesis: Negation and LLM Reasoning.

Description

As lexical and logical negation appears to play a crucial role in human reasoning and inquiry, we are interested in analyzing negation patterns in reasoning traces produced by large language models (LLMs), as well as in LLM reasoning frameworks that explicitly incorporate negation, with the goal of better mimicking human reasoning. Possible directions for this thesis include: (1) The development of LLM reasoning frameworks centered around the phenomenon of negation and their evaluation against existing frameworks such as Chain-of-Thought (CoT) or Tree-of-Thought (ToT). (2) Negation-centered fine-tuning of LLM reasoning. (3) Qualitative and quantitative analysis of reasoning traces produced by LLMs, focusing on negation patterns.
Corresponding Lab Member: Leon Hammerla and Alexander Mehler.

Bachelor Thesis: Detecting the Negated Event/Detecting the Focus of Negation.

Description

Classical negation annotation in computational linguistics involves identifying the negation cue, determining the scope of the negation, and detecting both the negated event and the most prominent part of the scope that is negated (the focus). While reliable systems already exist for detecting negation cues and scopes, current frameworks need to be extended to identify the negated event and/or the focus. For a Bachelor's thesis, addressing one of these two aspects is sufficient; for a Master's thesis, both should be tackled. A Python-based pipeline for cue and scope detection is already available, and the newly developed detection modules can be integrated into this existing framework.
Corresponding Lab Member: Leon Hammerla and Alexander Mehler.
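
To make the four annotation layers in the proposal above concrete, the following sketch represents one sentence with its cue, scope, negated event, and focus. The class `NegationAnnotation` and its fields are hypothetical and do not reflect the data format of the existing pipeline. Scope conventions differ between corpora, and the focus shown is only one plausible reading of the sentence.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class NegationAnnotation:
    """Hypothetical container for the four annotation layers of one negation."""
    tokens: List[str]
    cue: List[int]                       # token indices of the negation cue
    scope: List[int]                     # token indices inside the scope
    event: Optional[int] = None          # index of the negated event (to be predicted)
    focus: Optional[List[int]] = None    # indices of the focus (to be predicted)

    def span(self, indices):
        """Return the surface string for a list of token indices."""
        return " ".join(self.tokens[i] for i in indices)

tokens = ["She", "did", "not", "read", "the", "book", "yesterday"]

# Cue and scope are the layers an existing detector would supply.
# Assumed convention: the cue itself is excluded from the scope.
ann = NegationAnnotation(tokens=tokens, cue=[2], scope=[0, 1, 3, 4, 5, 6])

# The two layers targeted by the thesis, filled in by hand for illustration.
ann.event = 3      # "read" is the negated event
ann.focus = [6]    # one reading: she read the book, but not yesterday

print("cue:  ", ann.span(ann.cue))
print("scope:", ann.span(ann.scope))
print("event:", ann.span([ann.event]))
print("focus:", ann.span(ann.focus))
```

Keeping event and focus optional mirrors the setting described in the proposal: cue and scope are available first, and new modules add the remaining layers.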

Bachelor Thesis: Can we use scientific mentions to reconstruct or identify scientific argumentative text?

Description

Scientific articles contain numerous mentions of datasets, methods, tasks, and metrics, which capture essential elements of the scientific discourse. A key question is whether these scientific mentions and their interrelations can be leveraged to reconstruct or identify argumentative text, such as claims and supporting evidence. Existing resources like SciER and SciREX provide annotations for such mentions and their relations, which can be used to detect how claims are formulated or to identify sentences that express claims in context. Beyond leveraging existing mentions, identifying additional scientific entities and their relations could further enrich the representation of scientific arguments. Given the current lack of full-text scientific argument mining datasets, this task has the potential to support the creation of a large-scale corpus of argumentative sentences and their relational structure, providing a foundation for downstream tasks in scientific argument mining and automated knowledge extraction. See also:

Corresponding Lab Member: Bhuvanesh Verma and Alexander Mehler.

Bachelor Thesis: Modelling Task Effects: Human-AI Judgements.

Description

Motivation
Given both abstract and concrete words, does a word (e.g., "apartment") always have the same representation? If the word appears in different views (with a picture, in a sentence, and in a word cloud), does the task (e.g., "Can you take a rest in it?") under which the word is presented alter its representation? The idea is to examine whether the task effects associated with words, as labelled by humans, align with the labels generated by LLMs.

Tasks

1. Create questions related to daily-life activities and determine how relevant these questions are to abstract and concrete words.
2. The dataset is the Pereira dataset, consisting of 180 words (120 concrete and 60 abstract). Every word is presented in three views.
3. Using human annotators, annotate how relevant a word is w.r.t. a task in a given view.
4. Perform a similar analysis with LLMs, annotating the relevance of a word w.r.t. a task in a given view.
5. Compare the human judgements with those of the LLM-based annotators and describe the unique capabilities and drawbacks of these LLMs (GPT-3.5, GPT-4, and Mistral models).

Main Research Questions
To what extent do large language model (LLM) annotations of task relevance for concepts align with human annotations across different representational views and concept types?

View-Based Alignment:
a. How does the similarity between human and LLM annotations of concept relevance vary across different presentation views (e.g., sentences, pictures, word clouds)?
b. In which representational view is the alignment between human and LLM relevance ratings most pronounced?

Concept Type Sensitivity:
a. Do LLMs align more closely with human judgments for concrete concepts than for abstract concepts, or vice versa?
b. How does the type of concept (abstract vs. concrete) affect the degree of alignment between human and LLM annotations?

Interaction Effects:
a. Is there an interaction between concept type and representational view that influences the alignment of task-relevance judgments between humans and LLMs?
b. Are certain combinations of view and concept type particularly conducive to high alignment between LLM and human annotations?

Cognitive Modeling Potential: Can LLMs effectively model human-like concept relevance judgments across varying contexts of presentation and abstraction?

Goal
The thesis should work towards publishing the results at a conference or workshop.

Corresponding Lab Member: Mounika Marreddy and Alexander Mehler.

Bachelor Thesis: Exploring Pretrained Retrievers and Embedding-Based Search for Accurate Book Metadata Retrieval in RAG Pipelines.

Description

Retrieving accurate book metadata is essential for enhancing the performance of Retrieval-Augmented Generation (RAG) pipelines. This project explores modern, non-heuristic approaches to metadata retrieval, focusing on the use of pretrained retrievers and embedding-based similarity search. Instead of relying on manually crafted heuristics, these methods leverage embeddings generated by state-of-the-art models to identify the most relevant metadata and associated texts. The experiment will utilize large indexed corpora, such as Wikipedia and online library databases, to evaluate the efficacy of pretrained retrievers and embedding similarity for matching input metadata with incomplete or ambiguous information. The project will involve indexing metadata and textual content from publicly available sources (e.g., Open Library, Google Books, Wikipedia) using vector-based search frameworks. Pretrained models, such as dense retrievers (e.g., DPR, SentenceTransformers), will be used to generate embeddings for both input metadata and indexed corpora. The results will be compared to traditional heuristic-based methods to evaluate retrieval accuracy, scalability, and adaptability to incomplete metadata scenarios. This research addresses a significant bottleneck in RAG pipelines, where retrieval systems must efficiently integrate external knowledge to improve language model performance in answering specific queries. While this study focuses on bibliographic data, the proposed methods are generalizable and applicable to other domains requiring accurate and scalable metadata retrieval. The outcomes will provide insights into the trade-offs between heuristic and non-heuristic approaches and contribute to advancing metadata retrieval techniques for knowledge-intensive NLP tasks. References:

Corresponding Lab Member: Alexander Mehler.

Bachelor Thesis: Developing a Heuristic for Retrieving Specific Book Metadata in Retrieval-Augmented Generation (RAG) Pipelines.

Description

Accurate retrieval of book metadata is a critical challenge in the development of Retrieval-Augmented Generation (RAG) pipelines. This project aims to develop a heuristic-based procedure for retrieving the most valid metadata - and potentially the text - of books from various online library databases using publicly available APIs. These databases contain large collections of book records, often with incomplete or inconsistent metadata. This makes querying and matching a specific publication a complex task, especially when dealing with incomplete input metadata. The procedure will address cases where multiple books share similar metadata, such as the same title and author, but belong to different editions or publications. The proposed heuristic will analyze and rank the results of API queries to identify the best match for the input data. The approach involves a detailed study of metadata patterns in online libraries and the development of robust matching criteria that account for variations and gaps in the data. This work contributes to an emerging area in natural language processing where RAG pipelines rely on external knowledge sources to augment large language models (LLMs) with domain-specific information. By addressing the challenge of metadata retrieval, this project will improve the accuracy and reliability of downstream tasks, such as answering questions about specific books. Although the focus of this work is on bibliographic data, the developed heuristic has the potential to be generalized for metadata retrieval in other domains. The outcome of this project will be a validated methodology that can be seamlessly integrated into RAG pipelines, representing a significant step forward in leveraging external databases for high quality contextual information retrieval. References:

Corresponding Lab Member: Alexander Mehler.
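
The ranking idea in the proposal above (score each API result against possibly incomplete input metadata and pick the best match) can be sketched as follows. This is only an illustration under stated assumptions: the field names, the weights in `WEIGHTS`, and the candidate records are made up for the example and are not output of any library API, and the string similarity from Python's standard `difflib` is a stand-in for whatever matching criteria the thesis develops.

```python
from difflib import SequenceMatcher

# Hypothetical weights; a real heuristic would be tuned on annotated matches.
WEIGHTS = {"title": 0.5, "author": 0.3, "year": 0.2}

def field_score(query_value, record_value):
    """String similarity in [0, 1]; a missing value on either side scores 0."""
    if not query_value or not record_value:
        return 0.0
    return SequenceMatcher(None, str(query_value).lower(),
                           str(record_value).lower()).ratio()

def score(query, record):
    """Weighted sum over fields; fields missing from the query contribute 0."""
    total = 0.0
    for field, weight in WEIGHTS.items():
        if field == "year":
            # Years must match exactly; this helps to separate editions.
            same_year = (query.get("year") is not None
                         and query.get("year") == record.get("year"))
            total += weight * float(same_year)
        else:
            total += weight * field_score(query.get(field), record.get(field))
    return total

def rank(query, records):
    """Return the records sorted from best to worst match."""
    return sorted(records, key=lambda r: score(query, r), reverse=True)

# Toy input and candidate records, invented for illustration.
query = {"title": "Moby Dick", "author": "Melville", "year": 1851}
candidates = [
    {"title": "Moby-Dick; or, The Whale", "author": "Herman Melville", "year": 1851},
    {"title": "Moby Dick", "author": "Herman Melville", "year": 1992},
    {"title": "Billy Budd", "author": "Herman Melville", "year": 1924},
]

ranked = rank(query, candidates)
for record in ranked:
    print(round(score(query, record), 3), record["title"], record["year"])
```

Whether an exact title match or a matching publication year should win is precisely the kind of trade-off such weights encode, which is why the proposal calls for a detailed study of metadata patterns before fixing the matching criteria.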

Bachelor Thesis: How does Language Bias Affect Pretrained Language Models?

Description

Does language bias exist in pretrained large language models, such as those trained using a masked language modeling objective? What are the core components of these models that tend to produce this bias? Language bias refers to the tendency of multilingual models to prefer answering or selecting responses (e.g., in question-answering or information retrieval tasks) in the same language as the query, even when more likely candidate answers are available in other languages. What are the primary causes of this behavior? Are they linguistic, embedded in the training objective, or influenced by the loss function? These questions remain unresolved. Bachelor's and Master's theses are invited to explore these or related questions. References:

Language Bias in Multilingual Information Retrieval: The Nature of the Beast and Mitigation Methods

Corresponding Lab Member: Alexander Mehler.

Bachelor Thesis: A comparative study of methodologies used to identify human vs. automatically generated text.

Description

With the advent of large language models such as ChatGPT, growing ethical concerns have emerged, highlighting the need for models that recognize automatically generated text. Such detection models are becoming increasingly popular but remain underexplored and not well established. A study is needed to provide an overview of existing work in this area and evaluate its usefulness. Bachelor's and Master's theses are invited to explore this field through a comparative approach by reimplementing and testing a range of established methods. References:

Scalable watermarking for identifying large language model outputs

Corresponding Lab Member: Alexander Mehler.

Master Thesis: Unlocking Wikipedia for Research: A Modular Toolkit for Structured NLP Applications.

Description

Wikipedia serves as a vast and diverse resource that is widely used in research domains to address a variety of tasks and questions. However, its size, semi-structured form, inconsistent formatting, and noisy elements (e.g., infoboxes) pose significant challenges to its accessibility and usability in structured research applications. This thesis aims to develop a comprehensive framework to overcome these challenges and enable researchers to effectively use Wikipedia's content for NLP and other structured research purposes. The proposed work focuses on the design of a modular, database-driven toolkit that supports the local use of Wikipedia for NLP processing. Key objectives include exploring existing tools and databases, integrating Wikidata, and leveraging different database solutions to address different use cases. Specific tasks include selecting and evaluating databases, designing database schemas, processing Wikipedia dump files as source data, and implementing robust mechanisms for data extraction, parsing (e.g., Wikitext), and updating. Additional challenges such as constructing category and social graphs, managing interlanguage links, handling revisions, and integrating DUUI (Docker Unified UIMA Interface) will also be addressed. The goal of this thesis is to provide a practical toolkit for researchers that facilitates the effective and flexible use of Wikipedia's content for a wide range of applications. See also:

Corresponding Lab Member: Daniel Baumartz and Alexander Mehler.

Bachelor Thesis: Multimodal data integration and processing in DUUI.

Description

The Docker Unified UIMA Interface (DUUI) is a tool designed for the automated analysis of large corpora using a variety of NLP tools. Currently, DUUI supports the processing of text, audio, and video data. To extend its capabilities, additional support for multimodal data, such as that provided by Va.Si.Li-Lab – which includes motion data, object interaction data, and more – should be integrated into DUUI. All integrated data will need to be linked through a new type system tailored to each modality. Furthermore, processes such as motion detection must be incorporated to effectively process and analyze these new data types within DUUI. Bachelor's and Master's theses are invited to explore this multimodal model extension and integration. References:

Corresponding Lab Member: Mevlüt Bagci and Alexander Mehler.

Bachelor Thesis: Affiliation of Speech and Gesture through LLMs.

Description

Most "referential" gestures have a docking point in accompanying speech, known as the lexical affiliate. This bachelor’s thesis leverages this empirical fact to utilize large language models (LLMs) for gesture annotation. Each occurrence of a referential gesture in a multimodal dataset is presented to an LLM, which is tasked with identifying the corresponding affiliate expression in speech. Through this process, a gesture interpretation is derived. Additionally, the approach aims to detect gestures that lack an overt affiliate. Building on the strong performance of LLMs in handling bridging relations, the thesis proposes a frame-based interpretation for such gestures. This work makes a central topic of multimodal communication accessible to modern computational techniques, provides quantitative insights into speech-gesture affiliation, and lays the foundation for further gesture classifications.
Corresponding Lab Member: Alexander Mehler.

Master Thesis: Aristotelian Modification of Nominals.

Description

The standard semantics of noun-modifying adjectives is typically explained in terms of set membership in one way or another. Modern theories often incorporate scales, particularly for measure adjectives. This master's thesis will generalize such approaches by employing more general property spaces, which can be conceptualized as accidental qualities, a notion derived from Aristotle’s linguistic work. The accidental qualities of nominals will be determined by clustering adjectives from large corpora, thereby enriching lexical entries. This thesis complements computational linguistic research on the generative lexicon, has relevance for multimodal speech-gesture integration, and offers a novel perspective on the metaphoric use of adjectives.
Corresponding Lab Member: Alexander Mehler.

Bachelor/Master Thesis: Multimodal VR Data Meets DUUI.

Description

The processing of large and extensive unstructured corpora is a constant challenge for various scientific disciplines. For this purpose, the Docker Unified UIMA Interface (DUUI) was developed, which provides NLP analysis methods based on container services to perform horizontally and vertically distributed big data analyses in a unified, standardized, reusable, and schema-based process. The first steps towards multimodality have also already been taken. The task of this thesis is to adapt DUUI processing so that it can also be used to process multimodal data collected through VR experiments. The main difficulty lies in the alignment of speech, transcription, and movements.
See also:

Unlocking the Heterogeneous Landscape of Big Data NLP with DUUI
Efficient, uniform and scalable parallel NLP pre-processing with DUUI: Perspectives and Best Practice for the Digital Humanities
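
The alignment problem named in the description can be illustrated with a minimal sketch. It assumes that transcript tokens carry start and end times and that motion samples carry timestamps on the same clock; the `Token` class and the `align` function are hypothetical names chosen for illustration and are not part of DUUI.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass
class Token:
    """A transcribed word with its time span (hypothetical structure)."""
    text: str
    start: float  # seconds
    end: float    # seconds


def align(tokens, motion_times):
    """Map each token index to the indices of motion samples in [start, end].

    motion_times must be sorted ascending and use the same clock as the tokens.
    """
    aligned = {}
    for i, tok in enumerate(tokens):
        lo = bisect_left(motion_times, tok.start)
        hi = bisect_right(motion_times, tok.end)
        aligned[i] = list(range(lo, hi))
    return aligned


# Made-up example data: two tokens and motion samples recorded at 4 Hz.
tokens = [Token("look", 0.0, 0.4), Token("there", 0.5, 0.9)]
motion_times = [0.0, 0.25, 0.5, 0.75, 1.0]
alignment = align(tokens, motion_times)
```

In real VR recordings the streams rarely share a clock or a sampling rate, so a thesis would likely need to handle clock offsets and drift before an interval lookup of this kind becomes meaningful.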

Corresponding Lab Member: Alexander Mehler.

Master Thesis: Natural Human Interactions with LLMs via Audio.

Description

Natural spoken conversation is the default mode of communication between people, and it is also possible with large language models (LLMs): human speech is converted to text, which serves as input for the LLM, and the LLM's output is then converted back to audio. However, due to latency and the nature of audio output, it remains a major challenge to integrate a chatbot that can communicate naturally in both text and audio without human interlocutors noticing the delay, especially in multilingual environments. Bachelor's or Master's theses that address these latency issues are therefore invited. See also:

LLaMA-Omni: Seamless Speech Interaction with Large Language Model
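
As a starting point for studying the latency issue, the cascade described above (speech to text, LLM, text to speech) can be instrumented stage by stage. The following is a minimal sketch under stated assumptions: `transcribe`, `generate`, and `synthesize` are hypothetical placeholders that stand in for real models and do no actual speech or language processing.

```python
import time


def transcribe(audio: bytes) -> str:
    """Placeholder speech-to-text stage (hypothetical stub)."""
    return "hello"


def generate(prompt: str) -> str:
    """Placeholder LLM stage (hypothetical stub)."""
    return f"echo: {prompt}"


def synthesize(text: str) -> bytes:
    """Placeholder text-to-speech stage (hypothetical stub)."""
    return text.encode("utf-8")


def run_pipeline(audio: bytes):
    """Run the three stages in sequence and record per-stage latency in seconds."""
    timings = {}
    value = audio
    for name, stage in (("stt", transcribe), ("llm", generate), ("tts", synthesize)):
        start = time.perf_counter()
        value = stage(value)
        timings[name] = time.perf_counter() - start
    # Sum of the three stage latencies, computed before "total" is stored.
    timings["total"] = sum(timings.values())
    return value, timings


reply_audio, timings = run_pipeline(b"\x00\x01")
```

Because the stages run strictly one after another, the total delay is the sum of the stage delays. Streaming partial results between stages is one common way to reduce the perceived delay, and measuring each stage separately shows where such work would pay off most.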

Corresponding Lab Member: Mevlüt Bagci and Alexander Mehler.

Bachelor Thesis: Diversification of the container landscape for DUUI.

Description

The processing of large and extensive unstructured corpora remains a significant challenge for various scientific disciplines. To address this, the Docker Unified UIMA Interface (DUUI) was developed. DUUI provides NLP methods through container services to perform horizontally and vertically distributed big data analysis in a unified, standardized, reusable, and schema-based process. In the medium to long term, DUUI can leverage a variety of container services to implement optimal processing solutions tailored to specific scenarios and environmental parameters. This involves the creation, implementation, and evaluation of container services for DUUI that have not yet been integrated. Bachelor's or Master's theses are invited to address this task of service integration. See also:

Corresponding Lab Member: Giuseppe Abrami and Alexander Mehler.

Bachelor Thesis: Retrieval-Augmented Generation (RAG): Synthesizing Knowledge from Large Corpora.

Description

The growth of textual data in scientific and other domains has created an urgent need for tools that can efficiently retrieve accurate information from large corpora. Can large language models help researchers identify critical information - metaphorically, "needles in a haystack"? This research explores Retrieval-Augmented Generation (RAG) as a framework for proposing pipelines and models capable of locating specific units of information in response to user queries. Crucially, this approach avoids the need for explicit fine-tuning of large language models on domain-specific data. Instead, it emphasizes techniques such as prompt engineering, advanced data retrieval mechanisms, and innovative query formulation. Possible methodologies include the use of embedding spaces, graph databases, or hybrid architectures to improve retrieval accuracy and synthesis capabilities. Bachelor's or Master's theses are invited to contribute novel solutions to this interdisciplinary challenge. See also: OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented LMs; CCC-BERT | Kaggle
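
The retrieval step of such a pipeline can be sketched in a few lines. This is a toy illustration, not a proposed method: the bag-of-words `embed` function is a stand-in for a learned embedding model, the passages are made-up examples, and all function names are hypothetical. The resulting prompt would be passed to an LLM without any fine-tuning.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=2):
    """Assemble an LLM prompt from the retrieved passages and the query."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return f"Answer using only the context below.\n{context}\nQuestion: {query}"


# Made-up example corpus.
passages = [
    "DUUI distributes NLP pipelines across containers.",
    "Gestures often align with a lexical affiliate in speech.",
    "Retrieval-augmented generation adds retrieved passages to the prompt.",
]
query = "What does retrieval-augmented generation add to the prompt?"
top = retrieve(query, passages, k=1)
prompt = build_prompt(query, passages, k=1)
```

Swapping `embed` for a dense embedding model, or replacing the linear scan with a vector index or graph database, changes only the retrieval function; the prompt assembly stays the same, which is what makes the design modular.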
Corresponding Lab Member: Alexander Mehler.

Courses

Summer Semester, 2026

Lecture: NLP-gestützte Data Science. Alexander Mehler and Giuseppe Abrami.
_QIS _OLAT

Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.
_QIS _OLAT

Practical: Deep Learning for Text Imaging. Alexander Mehler and Giuseppe Abrami.
_QIS _OLAT

Seminar: Research Methodology in Computational Humanities. Alexander Mehler.
_QIS _OLAT

Seminar: Computational Humanities. Alexander Mehler.
_QIS _OLAT

Seminar: Text Analytics. Alexander Mehler.
_QIS _OLAT

Winter Semester, 2025

Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.
_QIS _OLAT

Practical: Transformer-based Natural Language Processing. Alexander Mehler, Manuel Schaaf and Mounika Marreddy.
_QIS _OLAT

Practical: Time Machines on Virtual- and Augmented Reality. Alexander Mehler and Giuseppe Abrami.
_QIS _OLAT

Practical: Programmierpraktikum. Alexander Mehler and Giuseppe Abrami.
_QIS _OLAT

Seminar: Research Methodology in Computational Humanities. Alexander Mehler.
_QIS _OLAT

Seminar: Text Analytics. Alexander Mehler.
_QIS _OLAT

Seminar: Computational Humanities. Alexander Mehler.
_QIS _OLAT

Summer Semester, 2025

Lecture: NLP-gestützte Data Science. Alexander Mehler and Manuel Schaaf.
_QIS _OLAT

Practical: Deep Learning for Text Imaging. Alexander Mehler and Giuseppe Abrami.
_QIS _OLAT

Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.
_QIS _OLAT

Practical: Transformer-based Natural Language Processing. Alexander Mehler and Manuel Schaaf.
_QIS _OLAT

Practical: Multimodal AI. Alexander Mehler, Andy Lücking and Alexander Henlein.
_QIS _OLAT

Seminar: Text Analytics. Alexander Mehler.
_QIS _OLAT

Seminar: Computational Humanities. Alexander Mehler.
_QIS _OLAT

Winter Semester, 2024

Practical: Transformer-based Natural Language Processing. Alexander Mehler and Manuel Schaaf.
_QIS _OLAT

Practical: Programmierpraktikum. Alexander Mehler and Giuseppe Abrami.
_QIS _OLAT

Practical: Time Machines on Virtual- and Augmented Reality. Alexander Mehler and Giuseppe Abrami.
_QIS _OLAT

Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.
_QIS _OLAT

Practical: Multimodal AI. Alexander Mehler, Andy Lücking and Alexander Henlein.
_QIS _OLAT

Summer Semester, 2024

Lecture: NLP-gestützte Data Science. Alexander Mehler and Manuel Stoeckel.
_QIS _OLAT

Practical: Deep Learning for Text Imaging. Alexander Mehler and Giuseppe Abrami.
_QIS _OLAT

Practical: Multimodal AI. Alexander Mehler, Andy Lücking and Alexander Henlein.
_QIS _OLAT

Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.
_QIS _OLAT

Practical: Transformer-based Natural Language Processing. Alexander Mehler and Manuel Stoeckel.
_QIS _OLAT

Seminar: Computational Humanities. Alexander Mehler.
_QIS _OLAT

Seminar: Text Analytics. Alexander Mehler.
_QIS _OLAT

Winter Semester, 2023

Lecture: Einführung Computational Humanities. Alexander Mehler and Manuel Stoeckel.
_QIS _OLAT

Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.
_QIS _OLAT

Practical: Multimodal Computing: Machine Learning, virtuelle Realität und Kommunikation. Alexander Mehler, Andy Lücking and Alexander Henlein.
_QIS _OLAT

Practical: Time Machines on Virtual- and Augmented Reality. Alexander Mehler and Giuseppe Abrami.
_QIS _OLAT

Practical: Programmierpraktikum. Alexander Mehler and Giuseppe Abrami.
_QIS _OLAT

Practical: Transformer-based Natural Language Processing. Alexander Mehler and Manuel Stoeckel.
_QIS _OLAT

Seminar: Computational Humanities. Alexander Mehler.
_QIS _OLAT

Seminar: Text Analytics. Alexander Mehler.
_QIS _OLAT

Summer Semester, 2023

Lecture: NLP-gestützte Data Science. Alexander Mehler, Manuel Stoeckel and Giuseppe Abrami.
_QIS _OLAT

Practical: Transformer-based Natural Language Processing. Alexander Mehler and Manuel Stoeckel.
_QIS _OLAT

Practical: Multimodal Computing: Machine Learning, virtuelle Realität und Kommunikation. Alexander Mehler, Andy Lücking and Alexander Henlein.
_QIS _OLAT

Practical: Deep Learning for Text Imaging. Alexander Mehler and Giuseppe Abrami.
_QIS _OLAT

Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.
_QIS _OLAT

Seminar: Computational Humanities. Alexander Mehler.
_QIS _OLAT

Seminar: Text Analytics. Alexander Mehler.
_QIS _OLAT
