Staff Member
Goethe-Universität Frankfurt am Main
Robert-Mayer-Straße 10
Room 401e
D-60325 Frankfurt am Main
D-60054 Frankfurt am Main (use for package delivery)
Postfach / P.O. Box: 154
Office Hours: Thursday, 10 AM-12 PM
“Let there be neither dissatisfaction with oneself – for that would be pusillanimity; – nor self-satisfaction, for that would be stupidity.”
– Baltasar Gracián
Projects
As governments worldwide continue to release vast amounts of textual information, the need for efficient and insightful tools to extract, interpret and present this data has become increasingly critical. To address this need, we present the Bundestags-Mine: an environment that periodically retrieves pertinent data from the German parliament, parses and analyzes it using pipelines for natural language processing, and then displays the results in a publicly accessible web application. Bundestags-Mine helps to extract key information from parliamentary documents in a visually appealing manner for many use cases. For instance, the tool can be used by journalists for news detection, lawyers for compliance checking, linguists for discourse analysis, and the general public to inform themselves about the positions of political party members on a topic.
The specialised information service BIOfid (www.biofid.de) is oriented towards the special needs of scientists researching biodiversity topics at research institutions and in natural history collections. Since 2017, BIOfid has been building an infrastructure that contributes to the provision and mobilisation of research-relevant data in a variety of ways in the context of current developments in biodiversity research.
New Data Spaces for the Social Sciences
In order to more precisely research the major societal challenges of the coming decades, including digitization, climate change, and war- and pandemic-related societal changes, and to be able to identify the need for political action on this basis, the social sciences need innovative research data and methods. The DFG has established the long-term infrastructure priority programme “New Data Spaces” (SPP 2431) to open up and develop such new data spaces.
It is managed by the programme committee, consisting of Prof. Dr. Cordula Artelt (spokesperson) and Prof. Dr. Corinna Kleinert (both LIfBi), Prof. Dr. Reinhard Pollak (GESIS), Prof. Dr. Stefan Liebig (FU Berlin) and Prof. Dr. Alexander Mehler (Goethe University Frankfurt).
Viki LibraRy is a first implementation for generating and exploring online information based on hypertext systems in a three-dimensional environment using virtual reality. It creates a virtual library based on Wikipedia, in which Rooms are dynamically populated with data provided via a RESTful backend. In these Rooms, users can browse all articles of the corresponding category in the form of Books. In addition, users can move between Rooms through virtual portals. Explorations can be done alone or collaboratively, using Ubiq.
Publications
2024
BibTeX
@inproceedings{Boenisch:et:al:2024,
author = {B\"{o}nisch, Kevin and Stoeckel, Manuel and Mehler, Alexander},
title = {HyperCausal: Visualizing Causal Inference in 3D Hypertext},
year = {2024},
isbn = {9798400705953},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3648188.3677049},
doi = {10.1145/3648188.3677049},
abstract = {We present HyperCausal, a 3D hypertext visualization framework
for exploring causal inference in generative Large Language Models
(LLMs). HyperCausal maps the generative processes of LLMs into
spatial hypertexts, where tokens are represented as nodes connected
by probability-weighted edges. The edges are weighted by the prediction
scores of next tokens, depending on the underlying language model.
HyperCausal facilitates navigation through the causal space of
the underlying LLM, allowing users to explore predicted word sequences
and their branching. Through comparative analysis of LLM parameters
such as token probabilities and search algorithms, HyperCausal
provides insight into model behavior and performance. Implemented
using the Hugging Face transformers library and Three.js, HyperCausal
ensures cross-platform accessibility to advance research in natural
language processing using concepts from hypertext research. We
demonstrate several use cases of HyperCausal and highlight the
potential for detecting hallucinations generated by LLMs using
this framework. The connection with hypertext research arises
from the fact that HyperCausal relies on user interaction to unfold
graphs with hierarchically appearing branching alternatives in
3D space. This approach refers to spatial hypertexts and early
concepts of hierarchical hypertext structures. A third connection
concerns hypertext fiction, since the branching alternatives mediated
by HyperCausal manifest non-linearly organized reading threads
along artificially generated texts that the user decides to follow
optionally depending on the reading context.},
booktitle = {Proceedings of the 35th ACM Conference on Hypertext and Social Media},
pages = {330--336},
numpages = {7},
keywords = {3D hypertext, large language models, visualization},
location = {Poznan, Poland},
series = {HT '24},
video = {https://www.youtube.com/watch?v=ANHFTupnKhI}
}
@article{Boenisch:et:al:2024:b,
author = {B\"{o}nisch, Kevin and Mehler, Alexander and Babbili, Shaduan
and Heinrich, Yannick and Stephan, Philipp and Abrami, Giuseppe},
abstract = {We present Viki LibraRy, a dynamically built library in virtual
reality (VR) designed to visualize hypertext systems, with an
emphasis on collaborative interaction and spatial immersion. Viki
LibraRy goes beyond traditional methods of text distribution by
providing a platform where users can share, process, and engage
with textual information. It operates at the interface of VR,
collaborative learning and spatial data processing to make reading
tangible and memorable in a spatially mediated way. The article
describes the building blocks of Viki LibraRy, its underlying
architecture, and several use cases. It evaluates Viki LibraRy
in comparison to a conventional web interface for text retrieval
and reading. The article shows that Viki LibraRy provides users
with spatial references for structuring their recall, so that
they can better remember consulted texts and their meta-information
(e.g. in terms of subject areas and content categories).},
title = {{Viki LibraRy: Collaborative Hypertext Browsing and Navigation
in Virtual Reality}},
year = {2024},
journal = {New Review of Hypermedia and Multimedia},
numpages = {29},
publisher = {Taylor \& Francis},
note = {accepted}
}
@inproceedings{Boenisch:Mehler:2024,
title = {Finding Needles in Emb(a)dding Haystacks: Legal Document Retrieval
via Bagging and SVR Ensembles},
author = {B\"{o}nisch, Kevin and Mehler, Alexander},
year = {2024},
booktitle = {Proceedings of the 2nd Legal Information Retrieval meets Artificial
Intelligence Workshop LIRAI 2024},
location = {Poznan, Poland},
publisher = {CEUR-WS.org},
address = {Aachen, Germany},
series = {CEUR Workshop Proceedings},
note = {accepted},
abstract = {We introduce a retrieval approach leveraging Support Vector Regression
(SVR) ensembles, bootstrap aggregation (bagging), and embedding
spaces on the German Dataset for Legal Information Retrieval (GerDaLIR).
By conceptualizing the retrieval task in terms of multiple binary
needle-in-a-haystack subtasks, we show improved recall over the
baselines (0.849 > 0.803 | 0.829) using our voting ensemble, suggesting
promising initial results, without training or fine-tuning any
deep learning models. Our approach holds potential for further
enhancement, particularly through refining the encoding models
and optimizing hyperparameters.},
keywords = {legal information retrieval, support vector regression, word embeddings, bagging ensemble}
}
2023
@inproceedings{Babbili:et:al:2023,
author = {Babbili, Shaduan and B\"{o}nisch, Kevin and Heinrich, Yannick
and Stephan, Philipp and Abrami, Giuseppe and Mehler, Alexander},
title = {Viki LibraRy: A Virtual Reality Library for Collaborative Browsing
and Navigation through Hypertext},
year = {2023},
isbn = {9798400702327},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3603163.3609079},
doi = {10.1145/3603163.3609079},
abstract = {We present Viki LibraRy, a virtual-reality-based system for generating
and exploring online information as a spatial hypertext. It creates
a virtual library based on Wikipedia in which Rooms are used to
make data available via a RESTful backend. In these Rooms, users
can browse through all articles of the corresponding Wikipedia
category in the form of Books. In addition, users can access different
Rooms, through virtual portals. Beyond that, the explorations
can be done alone or collaboratively, using Ubiq.},
booktitle = {Proceedings of the 34th ACM Conference on Hypertext and Social Media},
articleno = {6},
numpages = {3},
keywords = {virtual reality simulation, virtual reality, virtual hypertext, virtual museum},
location = {Rome, Italy},
series = {HT '23},
pdf = {https://dl.acm.org/doi/pdf/10.1145/3603163.3609079}
}
@inproceedings{Boenisch:et:al:2023,
title = {{Bundestags-Mine}: Natural Language Processing for Extracting
Key Information from Government Documents},
isbn = {9781643684734},
issn = {1879-8314},
url = {http://dx.doi.org/10.3233/FAIA230996},
doi = {10.3233/faia230996},
booktitle = {Legal Knowledge and Information Systems},
publisher = {IOS Press},
author = {B\"{o}nisch, Kevin and Abrami, Giuseppe and Wehnert, Sabine and Mehler, Alexander},
year = {2023}
}
@bathesis{boenisch:2023,
author = {B\"{o}nisch, Kevin},
title = {Dialog generation using language models},
institution = {Goethe University},
pages = {28},
year = {2023},
url = {https://publikationen.ub.uni-frankfurt.de/opus4/frontdoor/index/index/docId/79165},
repository = {https://github.com/texttechnologylab/ROBERT}
}