Staff Member
Goethe-Universität Frankfurt am Main
Robert-Mayer-Straße 10
Room 401e
D-60325 Frankfurt am Main
D-60054 Frankfurt am Main (use for package delivery)
Postfach / P.O. Box: 154
Office Hours: Thursday, 10 AM-12 PM
“Let there be neither dissatisfaction with oneself – for that would be pusillanimity; – nor self-satisfaction, for that would be stupidity.”
– Baltasar Gracián
Hi there,
I worked for 5 years as a professional Software Developer (C# .NET Full-Stack) before transitioning to a Researcher role in AI and Natural Language Processing here at the Text Technology Lab, while pursuing my Master’s in Computer Science. I’ve also been a tutor twice.
My published research spans several fields, including visualization (2D & 3D) through easy-to-use user interfaces and more complex spatial systems, virtual reality adaptations, and NLP techniques such as information retrieval, topic extraction, text classification, and fine-tuning or training of (Large) Language Models. I have also worked with traditional machine learning and deep learning methods for regression and classification tasks. Whatever research I do, I try to combine the practical, hands-on craft of classical Software Engineering with the more scientific and theoretical side of research. You can find my publications and projects below.
Finally, I like to participate in software development and data science competitions on platforms like Kaggle, with results published on my GitHub and other outlets. Out of those competitions, my biggest accomplishments are:
- First place in the IAV-Coding competition: “Automated driving – GUI for visual evaluation of drone data”
- Best Dynamic Website Designs with Stunning Visuals and Interactive Elements by DESIGNRUSH
- Kaggle Discussion Master
Whatever project I do, I approach it with a strong sense of purpose, genuine enjoyment and the desire to become better, which is why I take pride in the things I create.
If anything in this description resonates with you, feel free to contact me. I’d like to hear from you as well.
Projects
As governments worldwide continue to release vast amounts of textual information, the need for efficient and insightful tools to extract, interpret and present this data has become increasingly critical. Towards solving this issue, we present the Bundestags-Mine: an environment that periodically retrieves pertinent data from the German parliament, parses and analyzes it using pipelines for natural language processing, and then displays the results in a publicly accessible web application. Bundestags-Mine helps to extract key information from parliamentary documents in a visually appealing manner for many use cases. For instance, the tool can be leveraged by journalists for news detection, lawyers for compliance checking, linguists for discourse analysis, and the general public to inform themselves about the positions of political party members on a topic.
New Data Spaces for the Social Sciences
In order to more precisely research the major societal challenges of the coming decades, including digitization, climate change, and war- and pandemic-related societal changes, and to be able to identify the need for political action on this basis, the social sciences need innovative research data and methods. The DFG has established the long-term infrastructure priority programme “New Data Spaces” (SPP 2431) to open up and develop such new data spaces.
It is managed by the programme committee, consisting of Prof. Dr. Cordula Artelt (spokesperson) and Prof. Dr. Corinna Kleinert (both LIfBi), Prof. Dr. Reinhard Pollak (GESIS), Prof. Dr. Stefan Liebig (FU Berlin) and Prof. Dr. Alexander Mehler (Goethe University Frankfurt).
Viki LibraRy is a first implementation for generating and exploring online information based on hypertext systems in a three-dimensional environment using virtual reality. It creates a virtual library based on Wikipedia, in which Rooms are dynamically populated with data provided via a RESTful backend. In these Rooms, the user can browse through all kinds of articles of the corresponding category in the form of Books. In addition, users can access different Rooms through virtual portals. Beyond that, the explorations can be done alone or collaboratively, using Ubiq.
The specialised information service BIOfid (www.biofid.de) is oriented towards the special needs of scientists researching biodiversity topics at research institutions and in natural history collections. Since 2017, BIOfid has been building an infrastructure that contributes to the provision and mobilisation of research-relevant data in a variety of ways in the context of current developments in biodiversity research.
Publications
2024
BibTeX
@inbook{Mehler:et:al:2024:a,
author = {Mehler, Alexander and Bagci, Mevl{\"u}t and Schrottenbacher, Patrick
and Henlein, Alexander and Konca, Maxim and Abrami, Giuseppe and B{\"o}nisch, Kevin
and Stoeckel, Manuel and Spiekermann, Christian and Engel, Juliane},
editor = {Zlatkin-Troitschanskaia, Olga and Nagel, Marie-Theres and Klose, Verena
and Mehler, Alexander},
title = {Towards New Data Spaces for the Study of Multiple Documents with
Va.Si.Li-Lab: A Conceptual Analysis},
booktitle = {Students', Graduates' and Young Professionals' Critical Use of
Online Information: Digital Performance Assessment and Training
within and across Domains},
year = {2024},
publisher = {Springer Nature Switzerland},
address = {Cham},
pages = {259--303},
abstract = {The constitution of multiple documents has so far been studied
essentially as a process in which a single learner consults a
number (of segments) of different documents in the context of
the task at hand in order to construct a mental model for the
purpose of completing the task. As a result of this research focus,
the constitution of multiple documents appears predominantly as
a monomodal, non-interactive process in which mainly textual units
are studied, supplemented by images, text-image relations and
comparable artifacts. This approach is reflected in the contextual
fixity of the research design, in which the learners under study
search for information using suitably equipped computers. If,
on the other hand, we consider the openness of multi-agent learning
situations, this scenario lacks the aspects of interactivity,
contextual openness and, above all, the multimodality of information
objects, information processing and information exchange. This
is where the chapter comes in. It describes Va.Si.Li-Lab as an
instrument for multimodal measurement for studying and modeling
multiple documents in the context of interactive learning in a
multi-agent environment. To this end, the chapter places Va.Si.Li-Lab
in the spectrum of evolutionary approaches that vary the combination
of human and machine innovation and selection. It also combines
the requirements of multimodal representational learning with
various aspects of contextual plasticity to prepare Va.Si.Li-Lab
as a system that can be used for experimental research. The chapter
is conceptual in nature, designing a system of requirements using
the example of Va.Si.Li-Lab to outline an experimental environment
in which the study of Critical Online Reasoning (COR) as a group
process becomes possible. Although the chapter illustrates some
of these requirements with realistic data from the field of simulation-based
learning, the focus is still conceptual rather than experimental,
hypothesis-driven. That is, the chapter is concerned with the
design of a technology for future research into COR processes.},
isbn = {978-3-031-69510-0},
doi = {10.1007/978-3-031-69510-0_12},
url = {https://doi.org/10.1007/978-3-031-69510-0_12}
}
@inproceedings{Boenisch:Mehler:2024,
title = {Finding Needles in Emb(a)dding Haystacks: Legal Document Retrieval
via Bagging and SVR Ensembles},
author = {B\"{o}nisch, Kevin and Mehler, Alexander},
year = {2024},
booktitle = {Proceedings of the 2nd Legal Information Retrieval meets Artificial
Intelligence Workshop LIRAI 2024},
location = {Poznan, Poland},
publisher = {CEUR-WS.org},
address = {Aachen, Germany},
series = {CEUR Workshop Proceedings},
note = {accepted},
abstract = {We introduce a retrieval approach leveraging Support Vector Regression
(SVR) ensembles, bootstrap aggregation (bagging), and embedding
spaces on the German Dataset for Legal Information Retrieval (GerDaLIR).
By conceptualizing the retrieval task in terms of multiple binary
needle-in-a-haystack subtasks, we show improved recall over the
baselines (0.849 > 0.803 | 0.829) using our voting ensemble, suggesting
promising initial results, without training or fine-tuning any
deep learning models. Our approach holds potential for further
enhancement, particularly through refining the encoding models
and optimizing hyperparameters.},
keywords = {legal information retrieval, support vector regression, word embeddings, bagging ensemble}
}
@article{Boenisch:et:al:2024:b,
author = {B\"{o}nisch, Kevin and Mehler, Alexander and Babbili, Shaduan
and Heinrich, Yannick and Stephan, Philipp and Abrami, Giuseppe},
abstract = {We present Viki LibraRy, a dynamically built library in virtual
reality (VR) designed to visualize hypertext systems, with an
emphasis on collaborative interaction and spatial immersion. Viki
LibraRy goes beyond traditional methods of text distribution by
providing a platform where users can share, process, and engage
with textual information. It operates at the interface of VR,
collaborative learning and spatial data processing to make reading
tangible and memorable in a spatially mediated way. The article
describes the building blocks of Viki LibraRy, its underlying
architecture, and several use cases. It evaluates Viki LibraRy
in comparison to a conventional web interface for text retrieval
and reading. The article shows that Viki LibraRy provides users
with spatial references for structuring their recall, so that
they can better remember consulted texts and their meta-information
(e.g. in terms of subject areas and content categories).},
title = {{Viki LibraRy: Collaborative Hypertext Browsing and Navigation
in Virtual Reality}},
journal = {New Review of Hypermedia and Multimedia},
volume = {0},
number = {0},
pages = {1--31},
year = {2024},
publisher = {Taylor \& Francis},
doi = {10.1080/13614568.2024.2383581},
url = {https://doi.org/10.1080/13614568.2024.2383581},
eprint = {https://doi.org/10.1080/13614568.2024.2383581}
}
@inproceedings{Boenisch:et:al:2024,
author = {B\"{o}nisch, Kevin and Stoeckel, Manuel and Mehler, Alexander},
title = {HyperCausal: Visualizing Causal Inference in 3D Hypertext},
year = {2024},
isbn = {9798400705953},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3648188.3677049},
doi = {10.1145/3648188.3677049},
abstract = {We present HyperCausal, a 3D hypertext visualization framework
for exploring causal inference in generative Large Language Models
(LLMs). HyperCausal maps the generative processes of LLMs into
spatial hypertexts, where tokens are represented as nodes connected
by probability-weighted edges. The edges are weighted by the prediction
scores of next tokens, depending on the underlying language model.
HyperCausal facilitates navigation through the causal space of
the underlying LLM, allowing users to explore predicted word sequences
and their branching. Through comparative analysis of LLM parameters
such as token probabilities and search algorithms, HyperCausal
provides insight into model behavior and performance. Implemented
using the Hugging Face transformers library and Three.js, HyperCausal
ensures cross-platform accessibility to advance research in natural
language processing using concepts from hypertext research. We
demonstrate several use cases of HyperCausal and highlight the
potential for detecting hallucinations generated by LLMs using
this framework. The connection with hypertext research arises
from the fact that HyperCausal relies on user interaction to unfold
graphs with hierarchically appearing branching alternatives in
3D space. This approach refers to spatial hypertexts and early
concepts of hierarchical hypertext structures. A third connection
concerns hypertext fiction, since the branching alternatives mediated
by HyperCausal manifest non-linearly organized reading threads
along artificially generated texts that the user decides to follow
optionally depending on the reading context.},
booktitle = {Proceedings of the 35th ACM Conference on Hypertext and Social Media},
pages = {330--336},
numpages = {7},
keywords = {3D hypertext, large language models, visualization},
location = {Poznan, Poland},
series = {HT '24},
video = {https://www.youtube.com/watch?v=ANHFTupnKhI}
}
2023
@bathesis{boenisch:2023,
author = {Kevin B{\"o}nisch},
title = {Dialog generation using language models},
institution = {Goethe University},
pages = {28},
year = {2023},
url = {https://publikationen.ub.uni-frankfurt.de/opus4/frontdoor/index/index/docId/79165},
repository = {https://github.com/texttechnologylab/ROBERT}
}
@inproceedings{Boenisch:et:al:2023,
title = {{Bundestags-Mine}: Natural Language Processing for Extracting
Key Information from Government Documents},
isbn = {9781643684734},
issn = {1879-8314},
url = {http://dx.doi.org/10.3233/FAIA230996},
doi = {10.3233/faia230996},
booktitle = {Legal Knowledge and Information Systems},
publisher = {IOS Press},
author = {B\"{o}nisch, Kevin and Abrami, Giuseppe and Wehnert, Sabine and Mehler, Alexander},
year = {2023}
}
@inproceedings{Babbili:et:al:2023,
author = {Babbili, Shaduan and B\"{o}nisch, Kevin and Heinrich, Yannick
and Stephan, Philipp and Abrami, Giuseppe and Mehler, Alexander},
title = {Viki LibraRy: A Virtual Reality Library for Collaborative Browsing
and Navigation through Hypertext},
year = {2023},
isbn = {9798400702327},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3603163.3609079},
doi = {10.1145/3603163.3609079},
abstract = {We present Viki LibraRy, a virtual-reality-based system for generating
and exploring online information as a spatial hypertext. It creates
a virtual library based on Wikipedia in which Rooms are used to
make data available via a RESTful backend. In these Rooms, users
can browse through all articles of the corresponding Wikipedia
category in the form of Books. In addition, users can access different
Rooms, through virtual portals. Beyond that, the explorations
can be done alone or collaboratively, using Ubiq.},
booktitle = {Proceedings of the 34th ACM Conference on Hypertext and Social Media},
articleno = {6},
numpages = {3},
keywords = {virtual reality simulation, virtual reality, virtual hypertext, virtual museum},
location = {Rome, Italy},
series = {HT '23},
pdf = {https://dl.acm.org/doi/pdf/10.1145/3603163.3609079}
}