
Research assistant
Goethe-Universität Frankfurt am Main
Robert-Mayer-Straße 10
Room 401e
D-60325 Frankfurt am Main
D-60629 Frankfurt am Main (use for package delivery)
Postfach / P.O. Box: 154
Thesis topic proposals
2026
Master Thesis: Generative Frameworks for Multimodal Survey Data Anonymization.
Description
Ensuring privacy in open-ended survey research demands anonymization frameworks that can sanitize text, images, and audio without destroying the downstream research utility of the data. This Master's thesis explores the intersection of deep generative modeling and privacy preservation to build an end-to-end multimodal anonymization pipeline. For text, the student will investigate pseudonymization and contextual text-replacement strategies that leverage state-of-the-art LLMs and privacy-filtering guardrails. For visual data, the project will combine semantic segmentation and OCR with generative diffusion models to perform secure, context-aware inpainting (e.g., replacing faces or personally identifying text while preserving the overall scene context). For audio, the student will explore voice-masking and speech synthesis techniques to redact speaker identity. The core research focus will be a rigorous, quantitative evaluation of the trade-off between strict privacy preservation and the semantic utility of the sanitized datasets for downstream empirical analysis.
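To make the idea of consistent pseudonymization concrete, here is a minimal rule-based sketch in Python. It is only a simplified stand-in for the LLM-based text replacement the topic targets: the regular expression, the function name `pseudonymize`, and the placeholder format `[EMAIL_n]` are illustrative assumptions, not part of the proposal.

```python
import re

# Assumption: a deliberately simple e-mail pattern, used only for illustration.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize(text, pattern=EMAIL, label="EMAIL"):
    """Replace every match with a placeholder; equal values get the same placeholder."""
    mapping = {}

    def replace(match):
        value = match.group(0)
        if value not in mapping:
            # Number placeholders in order of first appearance.
            mapping[value] = f"[{label}_{len(mapping) + 1}]"
        return mapping[value]

    return pattern.sub(replace, text), mapping


text = "Contact anna@example.org or bob@example.com; anna@example.org replies faster."
sanitized, mapping = pseudonymize(text)
print(sanitized)
```

Because repeated mentions map to the same placeholder, co-reference within a response survives the replacement, which is one aspect of the utility side of the privacy-utility trade-off described above.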
Corresponding Lab Member:
Bachelor Thesis: Joint Sarcasm, Sentiment, and Stance Detection in Underrepresented Dialects.
Description
Colloquial communication on social media frequently blends emotional expression with social or political viewpoints, so the detection of sentiment, stance, and sarcasm is highly interconnected. Sarcasm frequently co-occurs with these signals, often acting as a linguistic disruptor that flips sentiment polarity and obscures an author's true stance toward a target. This thesis involves developing a unified prompt-based or multi-task classification pipeline to jointly model sarcasm, sentiment, and stance in dialectal text. The research will focus primarily on Arabic dialects (such as Levantine, Egyptian, and Gulf) and extend to underrepresented Indic dialects (such as regional variants of Hindi or Bhojpuri). The student will leverage state-of-the-art dialectal and multilingual language models (e.g., MARBERTv2, IndicBERT) to evaluate how models capture subjective alignment under low-resource constraints. Tasks include benchmarking multi-dialectal datasets, implementing joint classification layers, and analyzing error patterns caused by sarcasm-induced stance flips.
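One possible reading of "joint classification layers" is a shared sentence representation feeding one small classification head per task, trained with a summed loss. The Python sketch below illustrates only that structure. The vector `shared` stands in for a pooled encoder output (no MARBERTv2 or IndicBERT call is made), and all weights, label sets, and function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weights, bias):
    """Apply one linear layer; weights holds one row per output class."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def joint_loss(shared, heads, labels):
    """Sum the cross-entropy losses of all task heads on one shared representation."""
    total = 0.0
    for task, (weights, bias) in heads.items():
        probs = softmax(linear(shared, weights, bias))
        total += -math.log(probs[labels[task]])
    return total


# Hypothetical stand-in for a pooled encoder output of dimension 3.
shared = [0.2, -0.1, 0.4]

# Hypothetical heads: sarcasm (2 classes), sentiment (3 classes), stance (3 classes).
heads = {
    "sarcasm": ([[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]], [0.0, 0.0]),
    "sentiment": ([[0.1, 0.2, 0.3], [0.3, 0.2, 0.1], [0.0, 0.0, 0.0]], [0.0, 0.0, 0.0]),
    "stance": ([[0.2, 0.0, 0.1], [0.0, 0.2, 0.1], [0.1, 0.1, 0.0]], [0.0, 0.0, 0.0]),
}
labels = {"sarcasm": 1, "sentiment": 0, "stance": 2}

loss = joint_loss(shared, heads, labels)
print(loss)
```

Summing the three losses lets a sarcasm-labelled example also shape the representation used for sentiment and stance, which is the kind of interaction the thesis would analyze. Unequal task weights are an obvious variation.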
Corresponding Lab Member:
2025
Bachelor Thesis: Multimodal Sentiment Analysis via Cross-Attention Fusion in Latent Space.
Description
Accurately estimating sentiments in video data is a complex challenge that requires integrating visual, audio, and textual cues. This proposal involves developing a multimodal sentiment analysis pipeline that uses state-of-the-art models to extract embeddings from different modalities: Qwen2.5 for text, JEPA 2 for visuals, and WhisperX (pyannote) for audio. The embeddings then undergo projection into a shared latent space and fusion via cross-attention mechanisms to capture intermodal dependencies. A probabilistic output layer will then estimate sentiment distributions over fine-grained emotions. To accelerate the analysis process for real-time applications, the pipeline should use a stream-like vector projection method that updates latent space representations incrementally instead of reprocessing entire sequences. The pipeline will be built using DUUI to ensure modularity, scalability, and reproducibility. The goal of this thesis is to tackle limitations of unimodal sentiment analysis and traditional fusion methods to achieve efficient, accurate, and scalable multimodal sentiment analysis.
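The projection and fusion step described above can be sketched with plain scaled dot-product cross-attention. The following Python example is a toy illustration under stated assumptions: the embeddings are made-up numbers rather than Qwen2.5, JEPA 2, or WhisperX outputs, the projection matrices `W_text` and `W_audio` are hypothetical, and a real pipeline would use learned query, key, and value projections and multiple heads.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key rows."""
    d = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value rows, one output row per query.
        fused.append([sum(w * v[j] for w, v in zip(weights, values))
                      for j in range(len(values[0]))])
    return fused


# Toy inputs (hypothetical): 2 text tokens of dimension 3, 3 audio frames of dimension 2.
text = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Hypothetical projections into a shared 2-dimensional latent space.
W_text = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
W_audio = [[1.0, 0.0], [0.0, 1.0]]

text_latent = matmul(text, W_text)
audio_latent = matmul(audio, W_audio)

# Text positions query the audio frames; the result has one fused row per text token.
fused = cross_attention(text_latent, audio_latent, audio_latent)
print(fused)
```

Here the text tokens act as queries and the audio frames as keys and values, so each text position receives an audio summary weighted by similarity in the shared space. Swapping the roles, or adding the visual modality as a further key-value source, follows the same pattern.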
Corresponding Lab Member:
If you have any suggestions of your own relating to this or our other proposed topics, please do not hesitate to contact us.
In addition, we offer a free mailing list, which we use to send regular updates on new qualification and research work as well as other information relating to Texttechnology.
Publications
2026
May, 2026.
From Images to Topics: Evaluating Vision-Language Models for Topic
Classification of Election Advertising. Companion Publication of the 2026 18th ACM Web Science Conference, 10–14.
BibTeX
@inproceedings{weiss:et:al:2026,
series = {WebSci Companion ’26},
title = {From Images to Topics: Evaluating Vision-Language Models for Topic
Classification of Election Advertising},
url = {http://dx.doi.org/10.1145/3795513.3807426},
doi = {10.1145/3795513.3807426},
booktitle = {Companion Publication of the 2026 18th ACM Web Science Conference},
publisher = {ACM},
author = {Weiss, Julia and Burger, Axel and Roßmann, Joss and Meurer, Jan Eric
and Abusaleh, Ali},
year = {2026},
month = {May},
pages = {10--14},
collection = {WebSci Companion ’26},
keywords = {Multimodal Large Language Models, Political communication, Privacy-aware AI, new-data-spaces, circlet}
}
2026.
TTLab at AraSentEval: SARF (صرف) Sentiment Analysis via Root-based
Fusion for Multi-Dialectal Arabic. Proceedings of the 7th Workshop on Open-Source Arabic Corpora
and Processing Tools (OSACT7), co-located with the Language Resources
and Evaluation Conference (LREC 2026).
accepted.
BibTeX
@inproceedings{Abusaleh:et:al:2026:sarf,
title = {TTLab at AraSentEval: SARF (صرف) Sentiment Analysis via Root-based
Fusion for Multi-Dialectal Arabic},
author = {Abusaleh, Ali and Verma, Bhuvanesh and Mehler, Alexander},
booktitle = {Proceedings of the 7th Workshop on Open-Source Arabic Corpora
and Processing Tools (OSACT7), co-located with the Language Resources
and Evaluation Conference (LREC 2026)},
eventdate = {May, 2026},
location = {Palma, Mallorca, Spain},
year = {2026},
keywords = {NLP, Sentiment Analysis, Arabic analysis, new-data-spaces, circlet, satek},
abstract = {Arabic sentiment analysis is challenged by morphological complexity
and lexical variation across Arabic dialects, compounded by subjectivity
in how speakers and writers express sentiment. In this paper,
we present our submission for the AraSentEval 2026 Shared Task
on Arabic Dialect Sentiment Analysis. We propose SARF (صرف), a
multi-view architectural framework that integrates surface-level
context with stemmed and rooted morphological perspectives using
a shared MARBERTv2 encoder. Our system employs a hybrid BERT-CNN-BiLSTM-Attention
architecture to capture both local sentiment n-grams and global
sequential dependencies. Experimental results show that while
individual morphological normalization strategies (stemming or
rooting) may degrade performance, their joint integration via
cross-morphological attention provides robust features across
diverse dialects. Our final system achieved a competitive macro-F1-score
of 0.9263, ranking 2nd out of 15 participating teams.},
note = {accepted}
}
2026.
Learning to Detect Cross-Modal Negation: An Analysis of Latent
Representations and an Attention-Based Solution. 2026 8th International Conference on Natural Language Processing (ICNLP), 613–622.
BibTeX
@inproceedings{Abusaleh:et:al:2026,
author = {AbuSaleh, Ali and Hammerla, Leon and Mehler, Alexander},
booktitle = {2026 8th International Conference on Natural Language Processing (ICNLP)},
title = {Learning to Detect Cross-Modal Negation: An Analysis of Latent
Representations and an Attention-Based Solution},
year = {2026},
pages = {613--622},
keywords = {Modeling;Videos;Labeling;Visualization;Signal detection;Large language models;Head;Media;Accuracy;Annotations;Vision language model;Natural language processing;Cross-modal retrieval;negation detection;video analysis;Multimodal analysis;Political Communication, neglab, new-data-spaces, circlet},
doi = {10.1109/ICNLP69856.2026.11527861},
abstract = {Detecting high-level semantic concepts like negation across modalities
remains a challenge for current multimodal systems. We analyze
this as a fundamental representation learning problem, providing
the first evidence that negation does not form a linearly or non-linearly
separable class in the latent spaces of standard vision-language
models (VLMs). We demonstrate that pretrained embeddings primarily
encode modality-specific features, lacking a generalizable negation
signal. To overcome this, we propose a novel cross-modal attention
architecture that explicitly models inter-modal dependencies,
achieving performance gains of up to +7.03\% F1 over unimodal baselines.
Our analysis reveals a key asymmetry: while textual negation often
appears independently, visual negation is semantically dependent
on linguistic context, a finding validated through our statistical
analysis of 3,222 political video-text pairs automatically annotated
via Qwen2.5-VL. By combining this analysis with self-supervised
video representations (JEPA2), we advance the modeling of temporal
negation. This work provides new methods and insights for learning
robust, semantically-aligned representations in multimodal systems.}
}
2025
2025.
GENERATIVE AI ON CGM: TOWARDS A FOUNDATION MODEL FOR GLUCOSE PREDICTION,
ROOT CAUSE ANALYSIS AND ANOMALY DETECTION. DIABETES TECHNOLOGY & THERAPEUTICS, 27:E144–E144.
BibTeX
@article{rahim2025generative,
title = {GENERATIVE AI ON CGM: TOWARDS A FOUNDATION MODEL FOR GLUCOSE PREDICTION,
ROOT CAUSE ANALYSIS AND ANOMALY DETECTION},
author = {Rahim, Mehdi and Abusaleh, Ali},
journal = {DIABETES TECHNOLOGY \& THERAPEUTICS},
volume = {27},
pages = {E144--E144},
year = {2025},
publisher = {MARY ANN LIEBERT, INC},
address = {140 HUGUENOT STREET, 3RD FL, NEW ROCHELLE, NY 10801 USA}
}
2024
2024.
A Multitask VAE for Time Series Preprocessing and Prediction of
Blood Glucose Level.
BibTeX
@misc{Abusaleh:Rahim:2024,
title = {A Multitask VAE for Time Series Preprocessing and Prediction of
Blood Glucose Level},
author = {Ali Abusaleh and Mehdi Rahim},
year = {2024},
eprint = {2410.00015},
archiveprefix = {arXiv},
primaryclass = {eess.SP},
url = {https://arxiv.org/abs/2410.00015}
}
