Teaching

Resources

Under Teaching/Resources you will find:

Thesis Topic Proposals

2025

Master Thesis: Can Adversarial Text Snippets Achieve Refusal Dimension Deletion?
Description
The threat of abuse by determined adversaries makes the safety of public-facing LLMs a key priority for developers and researchers alike.
Despite intensive efforts, recent research shows that "refusal in language models [may be] mediated by a [one-dimensional subspace in the model's weights]" (Arditi et al., 2024) and that it is possible to create text snippets that circumvent harmful response prevention in open- and closed-source LLMs using adversarial algorithms (Zou et al., 2023). This raises the question of whether these two methods of "jailbreaking" LLMs align, i.e., whether adversarially generated text segments can shift a model's hidden states into a position that effectively approaches refusal dimension deletion.
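
As a rough illustration of the comparison this topic asks about, the sketch below removes the component of a hidden-state vector along a given direction and measures how much of that direction is left. It assumes the refusal direction is already available as a plain vector; the function names `ablate_direction` and `refusal_component` are hypothetical, and the vectors are toy values rather than real model activations.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    """Scale v to unit length; a zero vector has no direction."""
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    return [a / norm for a in v]


def refusal_component(hidden, direction):
    """Signed length of the projection of `hidden` onto `direction`."""
    return dot(hidden, unit(direction))


def ablate_direction(hidden, direction):
    """Return `hidden` with its component along `direction` removed."""
    r = unit(direction)
    coeff = dot(hidden, r)
    return [h - coeff * ri for h, ri in zip(hidden, r)]


if __name__ == "__main__":
    hidden = [3.0, 4.0, 0.0]      # toy hidden state (assumption)
    direction = [0.0, 2.0, 0.0]   # toy "refusal direction" (assumption)
    print(refusal_component(hidden, direction))
    print(ablate_direction(hidden, direction))
```

Under these assumptions, an adversarial snippet would approximate the deletion to the extent that the hidden states it induces have a `refusal_component` close to zero.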

Related Work

Corresponding Lab Member: Manuel Schaaf and Alexander Mehler.
Master Thesis: Enhancing Audio Transcription with Visual Cues: A Multimodal Approach Utilizing Lip Movements and Facial Expressions for German Language Applications.
Description
Accurate audio transcription remains a challenge in environments with background noise, low-quality recordings, or overlapping speech. While significant progress has been made using audio-only approaches powered by deep learning and automatic speech recognition (ASR) systems (Graves et al., 2013), such methods often fail in adverse acoustic conditions. This thesis proposes the design and implementation of a multimodal transcription tool that integrates visual information, such as lip movements and facial expressions, to improve transcription accuracy, with a focus on adapting this approach to the German language. The proposed tool leverages the correlation between spoken words and their associated visual signals, such as lip shape dynamics and facial expressions, to improve the decoding of ambiguous or misinterpreted audio signals (Chung et al., 2017). It combines deep learning-based audio and video models to refine transcription results (Afouras et al., 2018). Existing datasets will form the basis for training and testing. For English, datasets such as LRS3 (Afouras et al., 2018) will be used. For German, the GLips (German Lips) dataset (Zöllner et al., 2022) provides extensive video data suitable for word-level lip-reading research. An important sub-task is the fine-tuning of existing pre-trained models for German-specific linguistic and phonetic features. This requires transfer learning techniques to adapt models trained on English datasets to German phoneme distributions, articulatory patterns, and grammatical structures. Experimental evaluation will measure transcription accuracy in both English and German, especially under noisy conditions, to quantify the advantages of the multimodal approach. 
This work aims to advance multilingual ASR systems by demonstrating the benefits of integrating audiovisual data for transcription. The results will demonstrate the effectiveness of combining existing datasets and adapting pre-trained models to improve transcription accuracy in real-world scenarios.
  • Graves et al., 2013
  • Chung et al., 2017
  • Afouras et al., 2018
  • Zöllner et al., 2022

  • Corresponding Lab Member: Maxim Konca and Alexander Mehler.
    Master Thesis: Unlocking Wikipedia for Research: A Modular Toolkit for Structured NLP Applications.
    Description
    Wikipedia serves as a vast and diverse resource that is widely used in research domains to address a variety of tasks and questions. However, its size, semi-structured form, inconsistent formatting, and noisy elements (e.g., infoboxes) pose significant challenges to its accessibility and usability in structured research applications. This thesis aims to develop a comprehensive framework to overcome these challenges and enable researchers to effectively use Wikipedia's content for NLP and other structured research purposes. The proposed work focuses on the design of a modular, database-driven toolkit that supports the local use of Wikipedia for NLP processing. Key objectives include exploring existing tools and databases, integrating Wikidata, and leveraging different database solutions to address different use cases. Specific tasks include selecting and evaluating databases, designing database schemas, processing Wikipedia dump files as source data, and implementing robust mechanisms for data extraction, parsing (e.g., Wikitext), and updating. Additional challenges such as constructing category and social graphs, managing interlanguage links, handling revisions, and integrating DUUI (Docker Unified UIMA Interface) will also be addressed. The goal of this thesis is to provide a practical toolkit for researchers that facilitates the effective and flexible use of Wikipedia's content for a wide range of applications. See also:
    Corresponding Lab Member: Daniel Baumartz and Alexander Mehler.
    Bachelor Thesis: Development of an HTML Parser for Efficient Extraction of Search Engine Results.
    Description
    The exponential growth of online information has made search engines indispensable tools for accessing relevant data. Search engines such as Google, Bing, and Yandex generate results that serve a variety of needs, from academic research to commercial applications. However, accessing and analyzing these results often requires parsing the underlying HTML code of the search results pages. This thesis investigates the design and implementation of an HTML parser capable of extracting, structuring, and analyzing search engine results in a reliable and efficient manner. The goal of this project is to develop a robust HTML parser tailored for extracting search results from multiple search engines, while addressing challenges such as dynamic content loading, anti-scraping measures, and variations in HTML structures. The parser will identify key elements such as titles, URLs, snippets, and metadata, standardize the extracted data into a consistent format, and output it for further analysis or integration with other systems. The implementation involves a combination of web scraping libraries, regular expressions, and advanced parsing techniques, with an emphasis on handling dynamic web content rendered through JavaScript. The project also addresses ethical and legal considerations related to web scraping, and proposes mechanisms for compliance with search engine terms of service and applicable data usage regulations. The developed parser will be evaluated based on its accuracy, speed, and adaptability to changes in search engine HTML structures. Performance benchmarks and use cases, such as competitive analysis and data aggregation, will be presented to demonstrate the utility and versatility of the system. The outcome of this thesis aims to contribute to the fields of data mining and web technologies by providing a fundamental tool for generically accessing and leveraging search engine data.
    Corresponding Lab Member: Maxim Konca and Alexander Mehler.
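
For the HTML parser topic above, the extraction step could be sketched with Python's standard `html.parser` module. The markup convention used here (each hit wrapped in a `div` with class `result`) is purely an assumption for illustration; real search engine pages use different and frequently changing structures, and content rendered through JavaScript is not covered by this sketch.

```python
from html.parser import HTMLParser


class ResultParser(HTMLParser):
    """Collects title/URL pairs from links inside <div class="result"> blocks."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._div_depth = 0   # > 0 while inside a result block
        self._href = None     # href of the link currently being read
        self._text = []       # text fragments of the current link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self._div_depth > 0:
                self._div_depth += 1          # nested div inside a result
            elif "result" in (attrs.get("class") or "").split():
                self._div_depth = 1           # entering a result block
        elif tag == "a" and self._div_depth > 0 and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse whitespace so titles come out in a consistent format.
            title = " ".join("".join(self._text).split())
            self.results.append({"title": title, "url": self._href})
            self._href = None
        elif tag == "div" and self._div_depth > 0:
            self._div_depth -= 1


def parse_results(html):
    """Parse an HTML string and return a list of {'title', 'url'} dicts."""
    parser = ResultParser()
    parser.feed(html)
    parser.close()
    return parser.results
```

Links outside a result block, such as navigation links, are ignored, which is one simple way to keep the output limited to actual hits.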
    Bachelor Thesis: Multimodal data integration and processing in DUUI.
    Description
    The Docker Unified UIMA Interface (DUUI) is a tool designed for the automated analysis of large corpora using a variety of NLP tools. Currently, DUUI supports the processing of text, audio, and video data. To extend its capabilities, additional support for multimodal data, such as that provided by Va.Si.Li-Lab – which includes motion data, object interaction data, and more – should be integrated into DUUI. All integrated data will need to be linked through a new type system tailored to each modality. Furthermore, processes such as motion detection must be incorporated to effectively process and analyze these new data types within DUUI. Bachelor's and Master's theses are invited to explore this multimodal model extension and integration. References:
    Corresponding Lab Member: Mevlüt Bagci and Alexander Mehler.
    Bachelor Thesis: Bridging the Gap Between Virtual Environments and Reality.
    Description
    Virtual Reality (VR) enables immersive user experiences by providing highly realistic environments and interactions, particularly with advances in hand, eye, and face tracking. These technologies enhance engagement and facilitate more natural communication, effectively reducing the perceived physical distance between users. However, most virtual meeting environments remain entirely synthetic, disconnected from the physical spaces of users. Despite ongoing improvements in realistic digital avatars (e.g., MetaHumans), the creation and accessibility of authentic virtual environments remain limited. To address this, we propose a novel approach using real-time photogrammetry to reconstruct physical spaces in VR accurately. This method enables users to virtually visit each other's physical environments, seamlessly blending virtual and real spaces, thereby narrowing the gap between digital and physical interactions. Bachelor's and Master's theses are invited to experiment with and evaluate these emerging technologies. See also:
    Corresponding Lab Member: Patrick Schrottenbacher and Alexander Mehler.
    Bachelor Thesis: Affiliation of Speech and Gesture through LLMs.
    Description
    Most "referential" gestures have a docking point in accompanying speech, known as the lexical affiliate. This bachelor’s thesis leverages this empirical fact to utilize large language models (LLMs) for gesture annotation. Each occurrence of a referential gesture in a multimodal dataset is presented to an LLM, which is tasked with identifying the corresponding affiliate expression in speech. Through this process, a gesture interpretation is derived. Additionally, the approach aims to detect gestures that lack an overt affiliate. Building on the strong performance of LLMs in handling bridging relations, the thesis proposes a frame-based interpretation for such gestures. This work makes a central topic of multimodal communication accessible to modern computational techniques, provides quantitative insights into speech-gesture affiliation, and lays the foundation for further gesture classifications.
    Corresponding Lab Member: Andy Lücking and Alexander Mehler.
    Master Thesis: Aristotelian Modification of Nominals.
    Description
    The standard semantics of noun-modifying adjectives is typically explained in terms of set membership in one way or another. Modern theories often incorporate scales, particularly for measure adjectives. This master's thesis will generalize such approaches by employing more general property spaces, which can be conceptualized as accidental qualities, a notion derived from Aristotle’s linguistic work. The accidental qualities of nominals will be determined by clustering adjectives from large corpora, thereby enriching lexical entries. This thesis complements computational linguistic research on the generative lexicon, has relevance for multimodal speech-gesture integration, and offers a novel perspective on the metaphoric use of adjectives.
    Corresponding Lab Member: Andy Lücking and Alexander Mehler.
    Bachelor Thesis: A Comparative Study of Methodologies for Distinguishing Human-Written from Automatically Generated Text.
    Description
    With the advent of large language models such as ChatGPT, growing ethical concerns have emerged, highlighting the need for models that recognize automatically generated text. Such detection models are becoming increasingly popular but remain underexplored and not well established. A study is needed to provide an overview of existing work in this area and evaluate its usefulness. Bachelor's and Master's theses are invited to explore this field through a comparative approach by reimplementing and testing a range of established methods. References:
    Corresponding Lab Member: Ali Raza and Alexander Mehler.
    Bachelor Thesis: How Does Language Bias Affect Pretrained Language Models?
    Description
    Does language bias exist in pretrained large language models, such as those trained using a masked language modeling objective? What are the core components of these models that tend to produce this bias? Language bias refers to the tendency of multilingual models to prefer answering or selecting responses (e.g., in question-answering or information retrieval tasks) in the same language as the query, even when more likely candidate answers are available in other languages. What are the primary causes of this behavior? Are they linguistic, embedded in the training objective, or influenced by the loss function? These questions remain unresolved. Bachelor's and Master's theses are invited to explore these or related questions. References:
    Corresponding Lab Member: Ali Raza and Alexander Mehler.
    Bachelor Thesis: Exploring Pretrained Retrievers and Embedding-Based Search for Accurate Book Metadata Retrieval in RAG Pipelines.
    Description
    Retrieving accurate book metadata is essential for enhancing the performance of Retrieval-Augmented Generation (RAG) pipelines. This project explores modern, non-heuristic approaches to metadata retrieval, focusing on the use of pretrained retrievers and embedding-based similarity search. Instead of relying on manually crafted heuristics, these methods leverage embeddings generated by state-of-the-art models to identify the most relevant metadata and associated texts. The experiment will utilize large indexed corpora, such as Wikipedia and online library databases, to evaluate the efficacy of pretrained retrievers and embedding similarity for matching input metadata with incomplete or ambiguous information. The project will involve indexing metadata and textual content from publicly available sources (e.g., Open Library, Google Books, Wikipedia) using vector-based search frameworks. Pretrained models, such as dense retrievers (e.g., DPR, SentenceTransformers), will be used to generate embeddings for both input metadata and indexed corpora. The results will be compared to traditional heuristic-based methods to evaluate retrieval accuracy, scalability, and adaptability to incomplete metadata scenarios. This research addresses a significant bottleneck in RAG pipelines, where retrieval systems must efficiently integrate external knowledge to improve language model performance in answering specific queries. While this study focuses on bibliographic data, the proposed methods are generalizable and applicable to other domains requiring accurate and scalable metadata retrieval. The outcomes will provide insights into the trade-offs between heuristic and non-heuristic approaches and contribute to advancing metadata retrieval techniques for knowledge-intensive NLP tasks. References:
    Corresponding Lab Member: Omar Momen and Alexander Mehler.
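
The embedding-based similarity search described in the topic above can be sketched as a cosine-similarity ranking over an index of vectors. In practice the vectors would come from a pretrained retriever; here they are hand-written toy values, and the record keys and the helper name `top_k` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query_vec, index, k=3):
    """Return the k (key, score) pairs most similar to the query vector."""
    scored = [(cosine(query_vec, vec), key) for key, vec in index.items()]
    scored.sort(reverse=True)
    return [(key, score) for score, key in scored[:k]]


if __name__ == "__main__":
    # Toy index: record key -> embedding (made-up values, not model output).
    index = {
        "faust-1808": [0.9, 0.1, 0.0],
        "faust-1832": [0.7, 0.3, 0.1],
        "buddenbrooks": [0.0, 0.2, 0.9],
    }
    for key, score in top_k([1.0, 0.0, 0.0], index, k=2):
        print(key, round(score, 3))
```

A vector-based search framework would replace the linear scan in `top_k` with an approximate nearest-neighbour index once the corpus grows large.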
    Bachelor Thesis: Developing a Heuristic for Retrieving Specific Book Metadata in Retrieval-Augmented Generation (RAG) Pipelines.
    Description
    Accurate retrieval of book metadata is a critical challenge in the development of Retrieval-Augmented Generation (RAG) pipelines. This project aims to develop a heuristic-based procedure for retrieving the most valid metadata - and potentially the text - of books from various online library databases using publicly available APIs. These databases contain large collections of book records, often with incomplete or inconsistent metadata. This makes querying and matching a specific publication a complex task, especially when dealing with incomplete input metadata. The procedure will address cases where multiple books share similar metadata, such as the same title and author, but belong to different editions or publications. The proposed heuristic will analyze and rank the results of API queries to identify the best match for the input data. The approach involves a detailed study of metadata patterns in online libraries and the development of robust matching criteria that account for variations and gaps in the data. This work contributes to an emerging area in natural language processing where RAG pipelines rely on external knowledge sources to augment large language models (LLMs) with domain-specific information. By addressing the challenge of metadata retrieval, this project will improve the accuracy and reliability of downstream tasks, such as answering questions about specific books. Although the focus of this work is on bibliographic data, the developed heuristic has the potential to be generalized for metadata retrieval in other domains. The outcome of this project will be a validated methodology that can be seamlessly integrated into RAG pipelines, representing a significant step forward in leveraging external databases for high quality contextual information retrieval. References:
    Corresponding Lab Member: Omar Momen and Alexander Mehler.
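
One possible shape of the ranking heuristic described in the topic above is a weighted field comparison that skips fields missing from the input. This is only a sketch under stated assumptions: the field names, the weights in `WEIGHTS`, and the record layout are invented for illustration and do not reflect any particular library API.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; a real heuristic would tune these on data.
WEIGHTS = {"title": 0.5, "author": 0.3, "year": 0.2}


def normalize(text):
    """Lower-case and collapse whitespace before comparing strings."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """String similarity in [0, 1]; empty values never match."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def score(query, record):
    """Weighted match score, renormalized over the fields the query provides."""
    total = 0.0
    used = 0.0
    for field, weight in WEIGHTS.items():
        wanted = query.get(field)
        if wanted is None:
            continue  # incomplete input metadata: ignore this field
        used += weight
        found = record.get(field)
        if field == "year":
            total += weight * (1.0 if found == wanted else 0.0)
        else:
            total += weight * similarity(wanted, found or "")
    return total / used if used else 0.0


def rank(query, records):
    """Sort candidate records from best to worst match."""
    return sorted(records, key=lambda rec: score(query, rec), reverse=True)
```

With this scoring, two editions that share title and author are separated by the year when the query provides one, and tie otherwise, which is the ambiguous case the description singles out.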
    Bachelor/Master Thesis: Live VR Experiment Visualisation.
    Description
    Va.Si.Li-Lab is a virtual reality-based system designed for tracking and analyzing interpersonal communication by integrating extensive tracking capabilities, such as hand, face, and eye movements, alongside audio data. It enables controlled multi-user scenarios, allowing researchers to assign roles, impose modality-specific restrictions, and analyze communication behavior through aligned multimodal data stored in a central database. It can be problematic both to track the progress of an experiment and to process the data in a meaningful way afterwards. This thesis is about the meaningful processing of the tracked data, both live and afterwards.
    See also:
    Bachelor/Master Thesis: Multimodal VR Data Meets DUUI.
    Description
    The processing of large and extensive unstructured corpora is a constant challenge for various scientific disciplines. For this purpose, the Docker Unified UIMA Interface (DUUI) was developed, which provides NLP analysis methods based on container services to perform horizontally and vertically distributed big data analyses in a unified, standardized, reusable, and schema-based process. The first steps towards multimodality have also already been taken. The task of this thesis is to adapt DUUI processing so that it can also be used to process multimodal data collected through VR experiments. The main difficulty lies in the alignment of speech, transcription, and movements.
    See also:
    Master Thesis: Natural Human Interactions with LLMs via Audio.
    Description
    Natural spoken conversation is the norm between people, and it is also possible with large language models (LLMs). Human speech can be converted to text, which can then be used as input for the LLM. The output of the LLM is then converted back to audio. However, due to latency and the nature of audio output, it is still a major challenge to integrate a chatbot that can communicate naturally in both text and audio without human interlocutors noticing this latency, especially in multilingual environments. Therefore, Bachelor's or Master's theses are invited that address these latency issues. See also:
    Corresponding Lab Member: Mevlüt Bagci and Alexander Mehler.
    Bachelor Thesis: The emperor's new clothes alias TextAnnotator's new and responsive interface.
    Description
    TextAnnotator, a web-based tool for platform-independent, simultaneous, and collaborative semi-automated annotation of unstructured corpora based on UIMA, is a flexible and feature-rich solution for annotating various linguistic and semantic features using multiple annotation views and tools. However, TextAnnotator is currently implemented at the visual interface level using an older version of ExtJS, which needs to be upgraded to a modern interface. This upgrade is necessary in the short to medium term to enable the creation and implementation of more modular and interchangeable components. Bachelor's or Master's theses are invited that aim to develop and test new interfaces to enhance TextAnnotator's versatility and attractiveness by leveraging modern web interface technologies. See also:
    Corresponding Lab Member: Giuseppe Abrami and Alexander Mehler.
    Bachelor Thesis: Diversification of the container landscape for DUUI.
    Description
    The processing of large and extensive unstructured corpora remains a significant challenge for various scientific disciplines. To address this, the Docker Unified UIMA Interface (DUUI) was developed. DUUI provides NLP methods through container services to perform horizontally and vertically distributed big data analysis in a unified, standardized, reusable, and schema-based process. In the medium to long term, DUUI can leverage a variety of container services to implement optimal processing solutions tailored to specific scenarios and environmental parameters. This involves the creation, implementation, and evaluation of container services for DUUI that have not yet been integrated. Bachelor's or Master's theses are invited to address this task of services integration. See also:
    Corresponding Lab Member: Giuseppe Abrami and Alexander Mehler.
    Bachelor Thesis: Retrieval-Augmented Generation (RAG): Synthesizing Knowledge from Large Corpora.
    Description
    The increase of textual data in scientific and other domains has created an urgent need for tools that can efficiently retrieve accurate information from large corpora. Can large language models help researchers identify critical information - metaphorically, "needles in a haystack"? This research explores Retrieval-Augmented Generation (RAG) as a framework for proposing pipelines and models capable of locating specific units of information in response to user queries. Crucially, this approach avoids the need for explicit fine-tuning of large language models on domain-specific data. Instead, it emphasizes techniques such as prompt engineering, advanced data retrieval mechanisms, and innovative query formulation. Possible methodologies include the use of embedding spaces, graph databases, or hybrid architectures to improve retrieval accuracy and synthesis capabilities. Bachelor's or Master's theses are invited to contribute novel solutions to this interdisciplinary challenge. See also: OPEN SCHOLAR: SYNTHESIZING SCIENTIFIC LITERATURE WITH RETRIEVAL-AUGMENTED LMS; CCC-BERT | Kaggle
    Corresponding Lab Member: Kevin Boenisch and Alexander Mehler.
    Bachelor Thesis: Bringing Order to Chaos: Structuring Unstructured Documents.
    Description
    The increasing volume and diversity of text corpora generated daily from various sources poses significant challenges for their processing and analysis. Generic tools that facilitate the exploration and understanding of such corpora in a standardized and intuitive manner are rare. A key issue is the transformation of unstructured plain text into generically structured formats that allow efficient reading, sorting, and searching. This task aims to develop models or algorithms that can process raw, unstructured text and produce structured outputs based on predefined rules, algorithms, or models. These outputs should be compatible with the Docker Unified UIMA Interface (DUUI), our general purpose corpus annotation tool. The structured format must also comply with the UIMA (Unstructured Information Management Architecture) standard. Bachelor's or Master's theses are invited to explore this topic and contribute to the broader goal of making diverse textual corpora more accessible and manageable. See also: Unlocking the Heterogeneous Landscape of Big Data NLP with DUUI - ACL Anthology
    Corresponding Lab Member: Kevin Boenisch and Alexander Mehler.

    Courses

    Winter Semester, 2024

    Practical: Time Machines on Virtual- and Augmented Reality. Alexander Mehler and Giuseppe Abrami.
    Practical: Transformer-based Natural Language Processing. Alexander Mehler and Manuel Schaaf.
    Practical: Multimodal AI. Alexander Mehler, Andy Lücking and Alexander Henlein.
    Practical: Programmierpraktikum. Alexander Mehler and Giuseppe Abrami.
    Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.

    Summer Semester, 2024

    Lecture: NLP-gestützte Data Science. Alexander Mehler and Manuel Stoeckel.
    Practical: Transformer-based Natural Language Processing. Alexander Mehler and Manuel Stoeckel.
    Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.
    Practical: Multimodal AI. Alexander Mehler, Andy Lücking and Alexander Henlein.
    Practical: Deep Learning for Text Imaging. Alexander Mehler and Giuseppe Abrami.
    Seminar: Text Analytics. Alexander Mehler.
    Seminar: Computational Humanities. Alexander Mehler.

    Winter Semester, 2023

    Lecture: Einführung Computational Humanities. Alexander Mehler and Manuel Stoeckel.
    Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.
    Practical: Multimodal Computing: Machine Learning, virtuelle Realität und Kommunikation. Alexander Mehler, Andy Lücking and Alexander Henlein.
    Practical: Time Machines on Virtual- and Augmented Reality. Alexander Mehler and Giuseppe Abrami.
    Practical: Transformer-based Natural Language Processing. Alexander Mehler and Manuel Stoeckel.
    Practical: Programmierpraktikum. Alexander Mehler and Giuseppe Abrami.
    Seminar: Computational Humanities. Alexander Mehler.
    Seminar: Text Analytics. Alexander Mehler.

    Summer Semester, 2023

    Lecture: NLP-gestützte Data Science. Alexander Mehler, Manuel Stoeckel and Giuseppe Abrami.
    Practical: Multimodal Computing: Machine Learning, virtuelle Realität und Kommunikation. Alexander Mehler, Andy Lücking and Alexander Henlein.
    Practical: Deep Learning for Text Imaging. Alexander Mehler and Giuseppe Abrami.
    Practical: Transformer-based Natural Language Processing. Alexander Mehler and Manuel Stoeckel.
    Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.
    Seminar: Text Analytics. Alexander Mehler.
    Seminar: Computational Humanities. Alexander Mehler.