Bhuvanesh Verma

Research assistant

Goethe-Universität Frankfurt am Main
Robert-Mayer-Straße 10
Room 401c
D-60325 Frankfurt am Main
D-60054 Frankfurt am Main (use for package delivery)
Postfach / P.O. Box: 154
Phone:
Mail:

Thesis topic proposals

2025

Bachelor Thesis: Full-text Scientific Argument Mining using Large Language Models.
Description
Scientific articles contain a mix of argumentative and non-argumentative content, yet only argumentative sentences, particularly claims, contribute to the scientific discourse and are therefore central to argument mining. A key challenge is not only to identify whether a sentence expresses a claim, but also to distinguish between own claims (novel contributions by the author), background claims (statements grounded in prior work, often signaled by citations), data or evidence (empirical results that support claims), and non-argumentative content (methodological or descriptive text). This project proposes to address the task of claim detection and classification in full-text scientific articles by leveraging large language models, beginning with binary classification of claim versus non-claim sentences and extending to multi-class classification across the four categories. The approach will explore prompt-based classification and domain-specific fine-tuning, with the potential integration of citation-aware heuristics, aiming to establish a robust baseline for scientific claim detection as a foundation for downstream argument mining tasks.
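As a rough illustration of how prompt-based classification and a citation-aware heuristic might fit together, the following minimal Python sketch detects citation markers with a regular expression, passes that signal into a classification prompt, and maps a free-text model response back onto the four categories. The label names, the citation pattern, the prompt wording, and the fallback rule are all assumptions made for this sketch and are not part of the proposal; the call to an actual language model is deliberately left out.

```python
import re

# Hypothetical label names for the four categories described above.
LABELS = ("own_claim", "background_claim", "data", "non_argumentative")

# Assumed citation patterns: numeric markers such as [12] or [3, 4],
# and simple author-year markers such as (Smith et al., 2020).
CITATION_RE = re.compile(
    r"\[\d+(?:\s*[,-]\s*\d+)*\]|\([A-Z][A-Za-z-]+(?: et al\.)?,? \d{4}\)"
)


def has_citation(sentence: str) -> bool:
    """Return True if the sentence contains one of the assumed citation markers."""
    return CITATION_RE.search(sentence) is not None


def build_prompt(sentence: str) -> str:
    """Build a classification prompt that includes the citation hint."""
    if has_citation(sentence):
        hint = "The sentence contains a citation."
    else:
        hint = "The sentence contains no citation."
    return (
        "Classify the sentence from a scientific article into exactly one of: "
        + ", ".join(LABELS)
        + ".\n"
        + hint
        + "\nSentence: "
        + sentence
        + "\nLabel:"
    )


def parse_label(response: str) -> str:
    """Map a free-text model response onto a label (fallback: non_argumentative)."""
    text = response.strip().lower()
    for label in LABELS:
        if label in text:
            return label
    return "non_argumentative"
```

The binary claim versus non-claim setting could be derived from the same labels, for example by treating the two claim categories as positive; whether data or evidence counts as a claim is a design decision the sketch leaves open.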

Corresponding Lab Members: Bhuvanesh Verma and Alexander Mehler.
Bachelor Thesis: Can we use scientific mentions to reconstruct or identify scientific argumentative text?
Description
Scientific articles contain numerous mentions of datasets, methods, tasks, and metrics, which capture essential elements of the scientific discourse. A key question is whether these scientific mentions and their interrelations can be leveraged to reconstruct or identify argumentative text, such as claims and supporting evidence. Existing resources like SciER and SciREX provide annotations for such mentions and their relations, which can be used to detect how claims are formulated or to identify sentences that express claims in context. Beyond leveraging existing mentions, identifying additional scientific entities and their relations could further enrich the representation of scientific arguments. Given the current lack of full-text scientific argument mining datasets, this task has the potential to support the creation of a large-scale corpus of argumentative sentences and their relational structure, providing a foundation for downstream tasks in scientific argument mining and automated knowledge extraction.
Corresponding Lab Members: Bhuvanesh Verma and Alexander Mehler.
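One simple way to probe the question raised in this proposal is to check which sentences mention several kinds of scientific entities at once. The Python sketch below flags sentences whose mentions cover at least two distinct types (for example a method together with a metric) as candidates for argumentative text. The `Mention` record, its field names, and the co-occurrence threshold are hypothetical and chosen for illustration only; they do not reflect the actual annotation formats of SciER or SciREX.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """Hypothetical mention record; not the SciER or SciREX schema."""
    sentence_id: int
    text: str
    kind: str  # assumed values: "dataset", "method", "task", "metric"


def candidate_sentences(mentions, min_kinds=2):
    """Return sorted ids of sentences that mention at least min_kinds distinct types."""
    kinds_by_sentence = defaultdict(set)
    for mention in mentions:
        kinds_by_sentence[mention.sentence_id].add(mention.kind)
    return sorted(
        sentence_id
        for sentence_id, kinds in kinds_by_sentence.items()
        if len(kinds) >= min_kinds
    )


# Usage with made-up mentions: only sentence 0 combines two mention types.
example = [
    Mention(0, "BERT", "method"),
    Mention(0, "F1", "metric"),
    Mention(1, "SQuAD", "dataset"),
    Mention(2, "BERT", "method"),
    Mention(2, "RoBERTa", "method"),
]
candidates = candidate_sentences(example)
```

Such a filter would only produce candidates; deciding whether a flagged sentence actually expresses a claim or evidence would still require annotation or a trained classifier.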

If you have suggestions of your own relating to these or our other proposed topics, please do not hesitate to contact us.

In addition, we offer a free mailing list, which we use to send regular updates on new qualification and research work as well as other information relating to Texttechnology.

Publications

2024

Babajide Owoyele, Bhuvanesh Verma, Victor Omolaoye, Jonathan Antonio Edelman, Derk Loorbach and Gerard de Melo. 2024. Socio-Semantic X-Ray of Multi-Actor Constellations using Topics and Interstitial Authors: A Toolkit for Augmenting Computational Literature Reviews. Available at SSRN 4713155.
BibTeX
@article{Owoyele:et:al:2020,
  title     = {Socio-Semantic X-Ray of Multi-Actor Constellations using Topics
               and Interstitial Authors: A Toolkit for Augmenting Computational
               Literature Reviews},
  author    = {Owoyele, Babajide and Verma, Bhuvanesh and Omolaoye, Victor and Edelman, Jonathan Antonio
               and Loorbach, Derk and de Melo, Gerard},
  journal   = {Available at SSRN 4713155},
  doi       = {10.2139/ssrn.4713155},
  url       = {https://dx.doi.org/10.2139/ssrn.4713155},
  year      = {2024}
}
Babajide Alamu Owoyele, Martin Schilling, Rohan Sawahn, Niklas Kaemer, Pavel Zherebenkov, Bhuvanesh Verma, Wim Pouw and Gerard de Melo. 2024. MaskAnyone Toolkit: Offering Strategies for Minimizing Privacy Risks and Maximizing Utility in Audio-Visual Data Archiving.
BibTeX
@misc{Owoyele:et:al:2024,
  title     = {MaskAnyone Toolkit: Offering Strategies for Minimizing Privacy
               Risks and Maximizing Utility in Audio-Visual Data Archiving},
  author    = {Babajide Alamu Owoyele and Martin Schilling and Rohan Sawahn and Niklas Kaemer
               and Pavel Zherebenkov and Bhuvanesh Verma and Wim Pouw and Gerard de Melo},
  year      = {2024},
  eprint    = {2408.03185},
  archiveprefix = {arXiv},
  primaryclass = {cs.CR},
  url       = {https://arxiv.org/abs/2408.03185}
}
Bhuvanesh Verma and Lisa Raithel. 2024. DFKI-NLP at SemEval-2024 Task 2: Towards Robust LLMs Using Data Perturbations and MinMax Training.
BibTeX
@misc{Verma:Raithel:2024,
  title     = {DFKI-NLP at SemEval-2024 Task 2: Towards Robust LLMs Using Data
               Perturbations and MinMax Training},
  author    = {Bhuvanesh Verma and Lisa Raithel},
  year      = {2024},
  eprint    = {2405.00321},
  archiveprefix = {arXiv},
  primaryclass = {cs.CL},
  url       = {https://arxiv.org/abs/2405.00321}
}
Lisa Raithel, Philippe Thomas, Bhuvanesh Verma, Roland Roller, Hui-Syuan Yeh, Shuntaro Yada, Cyril Grouin, Shoko Wakamiya, Eiji Aramaki, Sebastian Möller and Pierre Zweigenbaum. August, 2024. Overview of #SMM4H 2024 – Task 2: Cross-Lingual Few-Shot Relation Extraction for Pharmacovigilance in French, German, and Japanese. Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks, 170–182.
BibTeX
@inproceedings{Raithel:et:al:2024,
  title     = {Overview of {\#}{SMM}4{H} 2024 {--} Task 2: Cross-Lingual Few-Shot
               Relation Extraction for Pharmacovigilance in {F}rench, {G}erman,
               and {J}apanese},
  author    = {Raithel, Lisa and Thomas, Philippe and Verma, Bhuvanesh and Roller, Roland
               and Yeh, Hui-Syuan and Yada, Shuntaro and Grouin, Cyril and Wakamiya, Shoko
               and Aramaki, Eiji and M{\"o}ller, Sebastian and Zweigenbaum, Pierre},
  editor    = {Xu, Dongfang and Gonzalez-Hernandez, Graciela},
  booktitle = {Proceedings of The 9th Social Media Mining for Health Research
               and Applications (SMM4H 2024) Workshop and Shared Tasks},
  month     = {aug},
  year      = {2024},
  address   = {Bangkok, Thailand},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2024.smm4h-1.39/},
  pages     = {170--182},
  abstract  = {This paper provides an overview of Task 2 from the Social Media
               Mining for Health 2024 shared task ({\#}SMM4H 2024), which focused
               on Named Entity Recognition (NER, Subtask 2a) and the joint task
               of NER and Relation Extraction (RE, Subtask 2b) for detecting
               adverse drug reactions (ADRs) in German, Japanese, and French
               texts written by patients. Participants were challenged with a
               few-shot learning scenario, necessitating models that can effectively
               generalize from limited annotated examples. Despite the diverse
               strategies employed by the participants, the overall performance
               across submissions from three teams highlighted significant challenges.
               The results underscored the complexity of extracting entities
               and relations in multi-lingual contexts, especially from the noisy
               and informal nature of user-generated content. Further research
               is required to develop robust systems capable of accurately identifying
               and associating ADR-related information in low-resource and multilingual
               settings.}
}

2022

Arne Binder, Bhuvanesh Verma and Leonhard Hennig. 2022. Full-Text Argumentation Mining on Scientific Publications.
BibTeX
@misc{Binder:et:al:2022,
  title     = {Full-Text Argumentation Mining on Scientific Publications},
  author    = {Arne Binder and Bhuvanesh Verma and Leonhard Hennig},
  year      = {2022},
  eprint    = {2210.13084},
  archiveprefix = {arXiv},
  primaryclass = {cs.CL},
  url       = {https://arxiv.org/abs/2210.13084}
}

2020

Arati Paul, Bhuvanesh Verma and Debasish Chakraborty. 2020. Estimating electrification using multi-temporal DMSP/OLS night imagery as proxy measure of human well-being in India. Spatial Information Research, 28:469–473.
BibTeX
@article{Paul:et:al:2020,
  title     = {Estimating electrification using multi-temporal DMSP/OLS night
               imagery as proxy measure of human well-being in India},
  author    = {Paul, Arati and Verma, Bhuvanesh and Chakraborty, Debasish},
  journal   = {Spatial Information Research},
  volume    = {28},
  issn      = {2366-3294},
  pages     = {469--473},
  year      = {2020},
  url       = {http://dx.doi.org/10.1007/s41324-019-00307-8},
  doi       = {10.1007/s41324-019-00307-8},
  publisher = {Springer}
}