Prof. Dr. Alexander Mehler

Fachbereich für Informatik und Mathematik
Goethe-Universität Frankfurt am Main
Robert-Mayer-Straße 10
Raum 403
D-60325 Frankfurt am Main
D-60054 Frankfurt am Main (use for package delivery)

Postfach / P.O. Box: 154

Office hour: Thursday, 10 AM – 12 PM (please make an appointment)

Follow me on ResearchGate.


Alexander Mehler is professor of Computational Humanities / Text Technology at Goethe University Frankfurt, where he heads the Text Technology Lab (TTLab). He served on the executive committee of the German Society for Computational Linguistics & Language Technology, where he headed the research group on Quantitative Corpus Linguistics. He was head of the research group Computational Semiotics of the German Society of Semiotics and served on the executive committee of the LOEWE Priority Program Digital Humanities. He was also a member of the executive committee of the Center for Digital Research in the Humanities, Social Sciences and Education Sciences (CEDIFOR). Alexander Mehler is a founding member of the German Society for Network Research (DGNet). His research interests include the quantitative analysis, simulative synthesis and formal modeling of textual units in spoken and written communication. To this end, he investigates linguistic networks based on contemporary and historical languages (using models of language evolution). A current research interest of Alexander Mehler concerns 4D text technologies based on Virtual Reality (VR), Augmented Reality (AR) and Augmented Virtuality (AV).

Projects

ENTAILab - Research Infrastructure and Innovation Lab. 2024 – . Funded by DFG (539634240).
Description
The DFG Infrastructure Priority Program New Data Spaces for the Social Sciences (InfPP) was established to meet the challenges of traditional survey research and to make use of newly available data sources. ENTAILab, the central feature of the InfPP, offers infrastructure services and excellent research opportunities and will promote the dissemination of results into existing and emerging research and data provision programs. There are clear upscaling and synergy effects of ENTAILab for the projects within the InfPP, leading to opportunities and successful realizations that - partly due to high costs for individual projects and a lack of infrastructural equipment - would otherwise not exist. By providing the infrastructure for a range of projects at the same time, e.g. the connection to existing data, samples and panels, server capacities, code and tools, expert counseling, and research overviews, we make use of synergies and create added value while at the same time conducting cross-cutting research on these infrastructures. The measures combined in ENTAILab represent the central and unifying element for the research, development and transfer activities of the priority program. ENTAILab will unlock and sustainably use emerging opportunities of the new data spaces and meet the requirements of the InfPP in general.
BibTeX
@project{spp-2431-entailab,
  name      = {ENTAILab - Research Infrastructure and Innovation Lab},
  abstract  = {The DFG Infrastructure Priority Program New Data Spaces for the
               Social Sciences (InfPP) was established in order to meet the challenges
               of traditional survey research and to make use of newly available
               data sources. The ENTAILab as the central feature of the InfPP
               offers infrastructure services, excellent research opportunities
               and will promote the dissemination of results into existing and
               emerging research and data provision programs. There are clear
               upscaling and synergy effects of ENTAILab for the projects within
               the InfPP, leading to opportunities and successful realizations
               that - partly due to high costs for individual projects and a
               lack of infrastructural equipment - would otherwise not exist.
               By providing the infrastructure for a range of projects at the
               same time, e.g. the connection to existing data, samples and panels,
               server capacities, code and tools, expert counseling, and research
               overviews, we make use of synergies and create added value, while
               at the same time conducting cross-cutting research on these infrastructures.
               The measures combined in ENTAILab represent the central and unifying
               element for the research and development and transfer activities
               of the priority program. ENTAILab will unlock and sustainably
               use emerging opportunities of the new data spaces and meet
               the requirements of the InfPP in general.},
  year      = {2024},
  funded_by = {DFG (539634240)},
  funded_by_url = {https://gepris.dfg.de/gepris/projekt/539634240},
  url       = {https://www.new-data-spaces.de/de-de/Start/Infrastructure-Priority-Programme/ENTAILab},
  logo      = {/wp-content/uploads/2024/01/logo-NewDataSpaces-long.png}
}
Feasibility, Acceptance and Data Quality of New Multimodal Surveys (FACES). 2024 – . Funded by DFG (539621548).
Description
The project opens up a new multimodal data space for survey research from a survey-analytical and a computer-science perspective. This data space will use and further develop recent innovations in VR and AI in order to replace face-to-face interviews and thus solve the problem of rising costs and declining response rates of interviewer-based surveys. To this end, a multi-interface system for online surveys based on VR and mixed reality will be developed, then tested and evaluated through a series of experiments and in-depth comparisons with video-based methods. The system will feature a wide range of variability with regard to avatars and situational parameters of interviews, interfaces, and AI technologies for the automatic processing of speech and behavioral data. Avatar-based, human-controlled live surveys will be systematically compared with video-based live surveys. In the former, the behavioral degrees of freedom of the interactants are considerably extended by the choice of avatars. Because of the large number of possible feature combinations, the project proceeds in two ways: (1) the effects of avatar and situational features are examined in experiments in order to make a preselection; (2) promising feature combinations are identified and tested under real conditions (real interviews). We will address three research questions: RQ1: What are the advantages of avatar-based interviews compared to video-based interviews in terms of acceptance, feasibility and data quality? RQ2: Which interviewer effects are reduced by which combinations of features, and how do they interact? RQ3: How can the results be integrated into a theory that represents the training of automated, fully immersive virtual interviewers?
To answer these questions, the project goes beyond existing studies and considers four scenarios. In scenarios 2 and 3, the multimodal behavior of the interviewee is still active but, unlike in scenario 1, cannot be recorded. Scenarios 1-3 are compared with scenario 4 in order to examine the effects of avatars. They differ in the interviewee's degree of immersion and allow the influence of avatars in (partially) virtualized interviews to be examined, while scenario 1 additionally enables the study of the Proteus effect. Scenarios 1-2 are examined experimentally; scenarios 3-4 are additionally examined under real interview conditions with former NEPS SC3 participants. In this way, the project will be the first to examine self- and other-avatar effects, both with regard to avatar properties and in comparison with classical interviews. Since an open-source system will be developed within the project, it will pave the way for future avatar-based live surveys.
BibTeX
@project{spp-2431-faces,
  name      = {Durchführbarkeit, Akzeptanz und Datenqualität neuer multimodaler Erhebungen (FACES)},
  abstract  = {Das Projekt eröffnet einen neuen multimodalen Datenraum für die
               Umfrageforschung aus einer survey-analytischen und einer informatischen
               Perspektive. Dieser Datenraum wird jüngste Innovationen im Bereich
               der VR und der KI nutzen und weiterentwickeln, um Face-to-Face-Interviews
               zu ersetzen und so das Problem der steigenden Kosten und sinkenden
               Rücklaufquoten interviewerbasierter Erhebungen zu lösen. Zu diesem
               Zweck wird ein Multi-Interface-System für Online-Befragungen auf
               der Grundlage von VR und Mixed Reality entwickelt und durch eine
               Reihe von Experimenten und eingehenden Vergleichen mit videobasierten
               Methoden getestet und bewertet. Das System wird ein breites Spektrum
               an Variabilität in Bezug auf Avatare und situative Parameter von
               Interviews, Schnittstellen und KI-Technologien zur automatischen
               Verarbeitung von Sprach- und Verhaltensdaten aufweisen. Dabei
               werden systematisch avatar-basierte, menschengesteuerte Lebensumfragen
               mit videobasierten Lebensumfragen verglichen. Bei ersteren werden
               die Verhaltensfreiheitsgrade der Interaktanten durch die Wahl
               der Avatare deutlich erweitert. Aufgrund der großen Anzahl möglicher
               Merkmalskombinationen wird auf zwei Arten vorgegangen: (1) die
               Auswirkungen von Avatar- und Situationsmerkmalen werden in Experimenten
               untersucht, um eine Vorauswahl zu treffen; (2) es werden vielversprechende
               Merkmalskombinationen identifiziert und unter realen Bedingungen
               (reale Interviews) getestet. Wir werden drei Forschungsfragen
               behandeln: RQ1: Was sind die Vorteile von avatar-basierten Interviews
               im Vergleich zu videobasierten Interviews in Bezug auf Akzeptanz,
               Machbarkeit und Datenqualität? RQ2: Welche Interviewereffekte
               werden durch welche Kombinationen von Merkmalen reduziert und
               wie interagieren sie? RQ3: Wie können die Ergebnisse in eine Theorie
               integriert werden, die das Training von automatisierten, vollständig
               immersiven virtuellen Interviewern darstellt? Zur Beantwortung
               der Fragen geht das Projekt über bestehende Studien hinaus und
               betrachtet vier Szenarien. In den Szenarien 2 und 3 ist das multimodale
               Verhalten der befragten Person noch aktiv, kann aber nicht wie
               in Szenario 1 aufgezeichnet werden. Die Szenarien 1-3 werden mit
               Szenario 4 verglichen, um die Auswirkungen von Avataren zu untersuchen.
               Sie unterscheiden sich durch den Grad der Immersion des Befragten
               und ermöglichen Betrachtungen zum Einfluss von Avataren in (teil-)virtualisierten
               Interviews, während Szenario 1 die zusätzliche Untersuchung des
               Proteus-Effekts ermöglicht. Die Szenarien 1-2 werden experimentell
               untersucht, die Szenarien 3-4 zusätzlich auch unter realen Interviewbedingungen
               mit ehemaligen NEPS SC3 Teilnehmern. Auf diese Weise wird das
               Projekt das erste sein, das Selbst- und Fremd-Avatar-Effekte,
               sowohl in Bezug auf Avatar-Eigenschaften als auch im Vergleich
               zu klassischen Interviews untersucht. Da im Rahmen des Projekts
               ein Open-Source-System entwickelt wird, wird es den Weg für künftige
               Avatar-basierte Lebensumfragen ebnen.},
  year      = {2024},
  funded_by = {DFG (539621548)},
  funded_by_url = {https://gepris.dfg.de/gepris/projekt/539621548},
  url       = {https://www.new-data-spaces.de/de-de/Start/Infrastructure-Priority-Programme/Projects},
  logo      = {/wp-content/uploads/2024/01/logo-NewDataSpaces-long.png}
}
Negation in Language and Beyond (NegLaB). 2024 – . Funded by DFG (SFB 1629).
Description
Negation is a fundamental and unique property of human language since it allows us to reason about what is not the case. It not only expresses a clearly defined grammatical function but also interacts with various aspects of grammar and cognition. The acquisition and processing of negation encompass linguistic as well as non-linguistic cognitive procedures. Hence, negation constitutes an ideal testing ground to differentiate cognitive mechanisms that are grammatical in nature from those that are shared with other cognitive domains, such as memory, attention, decision making and cognitive control. We intend to explore how the expression of negation is cross-linguistically associated with grammatical and non-linguistic cognitive operations, and also whether the operations observed in negative utterances are part of negation itself or, rather, arise as an effect of the grammatical system and cognitive functions. While the semantics of negation is generally analyzed as a unique propositional operator, its morphosyntactic expression is much more varied and often involves more than one morphological exponent. Hence, there is a tension between a rich morphosyntax and a more straightforward semantics. The semantics of negation leads one to expect negation to be expressed by a single morpheme positioned at the beginning of the clause (Neg-Only Hypothesis). The rich and variable morphosyntax leads us to expect that negation requires a number of conditions in the semantics (Neg-Plus Hypothesis). We aim to solve this puzzle by covering several empirical domains. More grammatical effects than the semantics would lead us to expect are visible in the interaction between negative utterances and the cognitive processing and semantic evaluation of alternative propositions. This is reflected in acquisition, since children produce negative utterances relatively early, but all the aspects of negation take a rather long time to be acquired.
Downstream effects of this can be seen in adult processing, as the comprehension of negative sentences is costlier than that of positive sentences. This is supposedly due to the inhibition of the corresponding positive sentence that is necessary for the interpretation of negative statements. Our exploration of the way negation and other grammatical categories or non-linguistic cognitive functions interact will lead us to identify how negation functions in natural language and how it favors or hinders other (extra-)grammatical components or processes. Why do some of them need to occur together with negation (e.g., negative polarity items), and why are others incompatible with it (as are some types of imperatives)? Our general aim is to develop a theoretical perspective on the way negation manifests itself in natural language, how it is acquired and processed, and why it varies so much across languages. Thereby, we will gain a better understanding of the connections between linguistic competence and general cognition.
BibTeX
@project{sfb-neglab,
  name      = {Negation in Language and Beyond (NegLaB)},
  abstract  = {Negation is a fundamental and unique property of human language
               since it allows us to reason about what is not the case. It not
               only expresses a clearly defined grammatical function but also
               interacts with various aspects of grammar and cognition. The acquisition
               and processing of negation encompass linguistic as well as non-linguistic
               cognitive procedures. Hence, negation constitutes an ideal testing
               ground to differentiate cognitive mechanisms that are grammatical
               in nature from those that are shared with other cognitive domains,
               such as memory, attention, decision making and cognitive control.
               We intend to explore how the expression of negation is cross-linguistically
               associated with grammatical and non-linguistic cognitive operations
               and also whether the operations observed in negative utterances
               are part of negation itself or, rather, arise as an effect of
               the grammatical system and cognitive functions. While the semantics
               of negation is generally analyzed as a unique propositional operator,
               its morphosyntactic expression is much more varied and often involves
               more than one morphological exponent. Hence, there is a tension
               between a rich morphosyntax and a more straightforward semantics.
               The semantics of negation leads one to expect negation to be expressed
               by a single morpheme positioned at the beginning of the clause
               (Neg-Only Hypothesis). The rich and variable morphosyntax leads
               us to expect that negation requires a number of conditions in
               the semantics (Neg-Plus Hypothesis). We aim to solve this puzzle
               by covering several empirical domains. More grammatical effects
               than the semantics would lead us to expect are visible in the interaction
               between negative utterances and the cognitive processing and semantic
               evaluation of alternative propositions. This is reflected in acquisition,
               since children produce negative utterances relatively early, but
               all the aspects of negation take a rather long time to be acquired.
               Downstream effects of this can be seen in adult processing as
               the comprehension of negative sentences is costlier than that of
               positive sentences. This is supposedly due to the inhibition of the corresponding
               positive sentence that is necessary for the interpretation of
               negative statements. Our exploration into the way negation and
               other grammatical categories or non-linguistic cognitive functions
               interact will lead us to identify how negation functions in natural
               language and how it favors or hinders other (extra-)grammatical
               components or processes. Why do some of them need to occur together
               with negation (e.g., negative polarity items) and why are others
               incompatible with it (as are some types of imperatives)? Our general
               aim is to develop a theoretical perspective on the way negation
               manifests itself in natural language, how it is acquired and processed,
               and why it varies so much across languages. Thereby, we will gain
               a better understanding of the connections between linguistic competence
               and general cognition.},
  year      = {2024},
  funded_by = {DFG (SFB 1629)},
  funded_by_url = {https://gepris.dfg.de/gepris/projekt/509468465?language=en},
  url       = {https://www.uni-frankfurt.de/149292001/Negation_in_Language_and_Beyond},
  logo      = {/wp-content/uploads/2024/04/logo-NegLaB.jpg}
}
C08: Integration of Students' Process and Text Data to Measure the Interdependence of Domain-Specific and Generic Critical Online Reasoning (DOM-COR and GEN-COR). 2023 – . Funded by DFG (462702138).
Description
The standard approach to assessing learning outcomes regards assessment as a process in which statements about learners' knowledge and skills are derived from necessarily limited evidence about their activities. The analysis of process and text data generated during learning as coherent behavioral sequences is considered a more realistic alternative. These multimodal data have the potential to convey a more complete picture of students' critical online reasoning (COR) processes and can at the same time be analyzed with data science methods. This raises the question of the extent to which data science methods are comparable with standard assessment approaches for examining COR processes more closely. C08 pursues three main goals: (1) providing an authentic digital assessment and learning environment in the AZURE cloud in which students can behave as they do on their own computers; (2) integrating students' activities, based on multimodal text and response process data, into a research infrastructure called the Multimodal Learning Data Science System (MLDS); this enables the analysis of process data (e.g., scrolling of web pages, browsing history, time spent) and text data (e.g., web pages used, written text) of students working on generic (GEN) and domain-specific (DOM) COR tasks; (3) analyzing the multimodal data in order to uncover latent relationships between the text data students process or write and their behavioral data when solving COR tasks. C08's digital assessment and learning environment will allow learning behavior on the COR tasks of the A projects to be recorded in real internet scenarios and comparable simulations.
C08 will collect text and process data in its MLDS research infrastructure, examine their role and interaction in working on COR tasks, and clarify how they relate to students' domain knowledge and personal characteristics. C08 examines the significance of data science methods in education. It identifies the added value and the limits of data science methods for processing multimodal text and process data generated in GEN and DOM COR assessments, in order to contribute new insights and methods to educational research. C08 will collaborate with all projects in the research unit (FOR) to create and evaluate a novel big-data dataset for the GEN and DOM COR studies, using a novel infrastructure to be developed for these analyses. C08 contributes data science expertise to the FOR and requires the expertise of the educational researchers to adapt and calibrate its methods.
BibTeX
@project{core-c08-integration,
  name      = {C08: Integration von Prozess- und Textdaten der Studierenden zur Messung der Interdependenz von domänenspezifischem und generischem kritischen Online Reasoning (DOM-COR und GEN-COR)},
  abstract  = {Der Standardansatz zur Bewertung von Lernergebnissen sieht Bewertung
               als einen Prozess an, bei dem aus den notwendigerweise begrenzten
               Nachweisen zu den Aktivitäten von Lernenden Aussagen über ihr
               Wissen und ihre Fähigkeiten gemacht werden können. Die Analyse
               von Prozess- und Textdaten, die beim Lernen als zusammenhängende
               Verhaltenssequenzen generiert werden, gilt demgegenüber als realitätsnähere
               Alternative. Diese multimodalen Daten haben das Potenzial, ein
               vollständigeres Bild von kritischen Online-Reasoning-Prozessen
               (COR) von Studierenden wiederzugeben und können zugleich mit datenwissenschaftlichen
               Methoden analysiert werden. Dabei stellt sich die Frage, inwieweit
               datenwissenschaftliche Methoden mit Standardbewertungsansätzen
               vergleichbar sind, um COR-Prozesse näher zu untersuchen. C08 verfolgt
               drei Hauptziele: (1) Bereitstellung einer authentischen digitalen
               Bewertungs- und Lernumgebung in der AZURE-Cloud, in der sich Studierende
               so verhalten können, wie sie es auf ihren Computern tun; (2) Integration
               der Aktivitäten von Studierenden anhand von multimodalen Text-
               und Antwortprozess-Daten in einer Forschungsinfrastruktur namens
               Multimodal Learning Data Science System (MLDS) – dies ermöglicht
               die Analyse von Prozessdaten (z.B. Scrollen von Webseiten, Browsing-Historie,
               Zeitaufwand) und Textdaten (z.B. genutzte Webseiten, geschriebener
               Text) von Studierenden bei Bearbeitung von generischen (GEN) und
               domänenspezifischen (DOM) COR-Aufgaben; (3) Analyse der multimodalen
               Daten, um latente Beziehungen zwischen den von Studierenden bearbeiteten
               oder geschriebenen Textdaten und ihren Verhaltensdaten bei der
               Lösung von COR-Aufgaben aufzudecken. Die digitale Bewertungs-
               und Lernumgebung von C08 wird die Erfassung von Lernverhalten
               bei COR-Aufgaben der A-Projekte in realen Internetszenarien und
               vergleichbaren Simulationen erlauben. C08 wird Text- und Prozessdaten
               in seiner MLDS-Forschungsinfrastruktur erfassen, ihre Rolle und
               Interaktion bei der Bearbeitung von COR-Aufgaben untersuchen und
               klären, inwiefern sie mit dem Fachwissen und persönlichen Eigenschaften
               der Studierenden zusammenhängen. C08 prüft die Bedeutung von datenwissenschaftlichen
               Methoden im Bildungsbereich. Es identifiziert den Mehrwert und
               die Grenzen datenwissenschaftlicher Methoden für die Verarbeitung
               multimodaler Text- und Prozessdaten, die in GEN- und DOM-COR-Bewertungen
               generiert werden, um neue Erkenntnisse und Methoden für die erziehungswissenschaftliche
               Forschung beizutragen. C08 wird mit allen Projekten in der Forschungsgruppe
               (FOR) zusammenarbeiten, um einen neuartigen Big-Data-Datensatz
               für die GEN- und DOM-COR-Studien zu erstellen und auszuwerten,
               und zwar mittels einer für diese Analysen zu entwickelnden neuartigen
               Infrastruktur. C08 bringt datenwissenschaftliche Expertise in
               die FOR ein und erfordert die Expertise der Bildungswissenschaftler*innen,
               um seine Methoden anzupassen und zu kalibrieren.},
  year      = {2023},
  funded_by = {DFG (462702138)},
  funded_by_url = {https://gepris.dfg.de/gepris/projekt/520631675},
  url       = {https://de.core.uni-mainz.de/c08/},
  logo      = {/wp-content/uploads/2024/11/CORE_Logo_neu.png}
}
B05: Modeling the Information Landscape (IL) for the Assessment and Analysis of Domain-Specific and Generic Critical Online Reasoning (DOM-COR and GEN-COR). 2023 – . Funded by DFG (462702138).
Description
The role of linguistic indicators of text readability or the credibility of web sources has already been researched intensively. Using a corpus of short offline texts, the applicants were also able to show that linguistic features allow predictions about students' results in domain-specific knowledge tests. The extent to which such relationships generalize to test tasks in complex open information landscapes is an important desideratum. B05 therefore focuses on modeling linguistic features of the online information landscape (IL) in which students move when solving critical online reasoning (COR) tasks. B05 aims to develop a theoretically grounded linguistic feature model based on the texts that students use or produce as components of the online IL when solving COR tasks. The model is intended to allow predictions about COR processes and COR performance. B05 focuses on three research questions: (i) To what extent do linguistic features differ between generic and domain-specific COR (GEN-COR/DOM-COR) and within the domains of economics, medicine, sociology and physics? (ii) How do these features differ with regard to the three cognitive COR facets: online information acquisition, critical information evaluation, and evidence-based reasoning and synthesis of information? (iii) At which levels do these features operate: at the level of individual texts, multiple texts, domains, genres, the IL, or the underlying language (e.g., German)? B05 starts from a qualitative selection of linguistic features concerning evidential status, information source and text organization. The quantitative part operationalizes these features by means of an extended machine learning model and tests their predictive power and specificity with regard to the three research questions.
The qualitative and quantitative analyses are linked via a computational hermeneutic circle in which the quantitative part generates statistical evaluations and predictions whose interpretability rests on qualitative linguistic analyses. For the research unit (FOR), B05 provides machine learning models that automatically analyze the linguistic features of multiple texts as part of the IL and that are based on a linguistic theory of COR addressing the level of fine-grained linguistic information units. The A projects provide texts and information about students' COR test results and receive linguistic analyses from B05. As the most detailed information units in the FOR, linguistic features are relevant for the other B projects with regard to media and content properties (B04) and narrative structures (B06). C08's Multimodal Learning Data Science System is central to the integration of all text and performance data in B05.
BibTeX
@project{core-b05-modellierung,
  name      = {B05: Modellierung der Informationslandschaft (IL) zur Bewertung und Analyse von domänenspezifischem und generischem Critical Online Reasoning (DOM-COR und GEN-COR)},
  abstract  = {Die Rolle linguistischer Indikatoren für die Lesbarkeit von Texten
               oder die Glaubwürdigkeit von Webquellen wurde bereits intensiv
               erforscht. Die Antragssteller konnten anhand eines Korpus kurzer
               Offline-Texte zudem zeigen, dass linguistische Merkmale Vorhersagen
               über die Ergebnisse von Studierenden in domänenspezifischen Wissenstests
               erlauben. Inwieweit solche Zusammenhänge für Testaufgaben in komplexen
               offenen Informationslandschaften generalisierbar sind, ist ein
               wichtiges Desiderat. Daher fokussiert B05 auf die Modellierung
               linguistischer Merkmale der online Informationslandschaft (IL),
               in der sich Studierende zur Lösung von Aufgaben zum Critical Online-Reasoning
               (COR) bewegen. B05 zielt auf die Entwicklung eines theoretisch
               fundierten linguistischen Merkmalsmodells ab, das auf den Texten
               basiert, die Studierende als Komponenten der online IL bei der
               Lösung von COR-Aufgaben nutzen bzw. produzieren. Das Modell soll
               Vorhersagen über COR-Prozesse und COR-Leistungen erlauben. Dabei
               fokussiert B05 auf drei Forschungsfragen: (i) Inwieweit unterscheiden
               sich die linguistischen Merkmale beim generischen vs. domänenspezifischen
               COR (GEN-COR/DOM-COR) sowie innerhalb der Domänen Wirtschaft,
               Medizin, Soziologie und Physik? (ii) Wie unterscheiden sich diese
               Merkmale bezüglich der drei kognitiven COR-Facetten: Online-Informationsbeschaffung,
               kritische Informationsbewertung sowie evidenzbasiertes Argumentieren
               und Synthetisieren von Informationen. (iii) Auf welchen Ebenen
               wirken diese Merkmale: auf der Ebene einzelner Texte, multipler
               Texte, Domänen, Genres, der IL oder der zugrundeliegenden Sprache
               (z.B. Deutsch)? B05 geht von der qualitativen Auswahl linguistischer
               Merkmale zum Evidenzstatus, zur Informationsquelle und zur Textorganisation
               aus. Der quantitative Teil operationalisiert diese Merkmale mittels
               eines erweiterten maschinellen Lernmodells und testet ihre Vorhersagekraft
               und Spezifität bezüglich der drei Forschungsfragen. Die Verknüpfung
               von qualitativen und quantitativen Analysen erfolgt über einen
               computationellen hermeneutischen Zirkel, in dem der quantitative
               Teil statistische Auswertungen und Vorhersagen generiert, deren
               Interpretierbarkeit auf qualitativen linguistischen Analysen fußt.
               B05 stellt für die Forschungsgruppe (FOR) maschinelle Lernmodelle
               bereit, die die linguistischen Merkmale multipler Texte als Teil
               der IL automatisch analysieren und auf einer linguistischen Theorie
               zu COR basieren, die die Ebene feinkörniger linguistischer Informationseinheiten
               adressiert. Die A-Projekte stellen Texte und Informationen über
               die COR-Testergebnisse von Studierenden zur Verfügung und erhalten
               linguistische Analysen von B05. Als die detailliertesten Informationseinheiten
               in der FOR sind linguistische Merkmale für die anderen B-Projekte
               bezüglich Medien- und Inhaltseigenschaften (B04) und narrative
               Strukturen (B06) relevant. Das Multimodal Learning Data Science
               System von C08 ist zentral für die Integration aller Text- und
               Leistungsdaten in B05.},
  year      = {2023},
  funded_by = {DFG (462702138)},
  funded_by_url = {https://gepris.dfg.de/gepris/projekt/520621868},
  url       = {https://de.core.uni-mainz.de/b05/},
  logo      = {/wp-content/uploads/2024/11/CORE_Logo_neu.png}
}
New Data Spaces for the Social Sciences. 2023 – . Funded by DFG (SPP 2431).
Description
In order to more precisely research the major societal challenges of the coming decades, including digitization, climate change, and war- and pandemic-related societal changes, and to be able to identify the need for political action on this basis, the social sciences need innovative research data and methods.
BibTeX
@project{spp-2431-new-data-spaces,
  name      = {New Data Spaces for the Social Sciences},
  abstract  = {In order to more precisely research the major societal challenges
               of the coming decades, including digitization, climate change,
               and war- and pandemic-related societal changes, and to be able
               to identify the need for political action on this basis, the social
               sciences need innovative research data and methods.},
  year      = {2023},
  funded_by = {DFG (SPP 2431)},
  funded_by_url = {https://www.dfg.de/de/aktuelles/neuigkeiten-themen/info-wissenschaft/2023/info-wissenschaft-23-20},
  url       = {https://www.new-data-spaces.de/en-us/},
  logo      = {/wp-content/uploads/2024/01/logo-NewDataSpaces-long.png}
}
Virtual Reality Sustained Multimodal Distributional Semantics for Gestures in Dialogue (GeMDiS). 2021 – . Funded by DFG (SPP 2392).
Description
Both corpus-based linguistics and contemporary computational linguistics rely on the use of often large linguistic resources. The expansion of the linguistic subject area to include visual means of communication such as gesticulation has not yet been backed up with corresponding corpora. This means that “multimodal linguistics” and dialogue theory cannot participate in established distributional methods of corpus linguistics and computational semantics. The main reason for this is the difficulty of collecting multimodal data in an appropriate way and at an appropriate scale. Using the latest VR-based recording methods, the GeMDiS project aims to close this data gap and to investigate visual communication with machine-based methods, using neural and active learning for small data along the systematic reference dimensions of associativity and contiguity of the features of visual and non-visual communicative signs.
BibTeX
@project{vicom-gemdis,
  name      = {Virtual Reality Sustained Multimodal Distributional Semantics for Gestures in Dialogue (GeMDiS)},
  abstract  = {Both corpus-based linguistics and contemporary computational linguistics
               rely on the use of often large, linguistic resources. The expansion
               of the linguistic subject area to include visual means of communication
               such as gesticulation has not yet been backed up with corresponding
               corpora. This means that “multimodal linguistics” and dialogue
               theory cannot participate in established distributional methods
               of corpus linguistics and computational semantics. The main reason
               for this is the difficulty of collecting multimodal data in an
               appropriate way and at an appropriate scale. Using the latest
               VR-based recording methods, the GeMDiS project aims to close this
               data gap and to investigate visual communication by means of machine-based
               methods and innovative use of neuronal and active learning for
               small data using the systematic reference dimensions of associativity
               and contiguity of the features of visual and non-visual communicative
               signs.},
  year      = {2021},
  funded_by = {DFG (SPP 2392)},
  funded_by_url = {https://www.dfg.de/en/research_funding/announcements_proposals/2021/info_wissenschaft_21_45/index.html},
  url       = {https://vicom.info/projects/virtual-reality-sustained-multimodal-distributional-semantics-for-gestures-in-dialogue-gemdis/},
  logo      = {/wp-content/uploads/2024/01/ViComGeMDis.png}
}
LOEWE-Schwerpunkt "Minderheitenstudien: Sprache und Identität". 2020 – . Funded by LOEWE.
Description
The LOEWE research cluster "Minorities: Language and Identity" develops an interdisciplinary investigation of identity formation among minorities. To this end, we examine three kinds of relations: the relation between minorities "at home" and minorities "abroad"; the relation between minorities' self-perception and their perception by others (both "at home" and "abroad"); and the mutual relation of the identity-shaping factors language, religion, culture, and ethnicity, in self-perception and external perception "at home" and "abroad".
BibTeX
@project{loewe-minderheitenstudien,
  name      = {LOEWE-Schwerpunkt "Minderheitenstudien: Sprache und Identität"},
  abstract  = {Der LOEWE-Schwerpunkt "Minderheiten: Sprache und Identität" erarbeitet
               eine interdisziplinäre Untersuchung der Problematik von Identitätsbildung
               bei Minderheiten. Dazu untersuchen wir drei Arten von Relationen:
               die Relation zwischen Minderheiten "im eigenen Land" und Minderheiten
               "im Ausland"; die Relation zwischen Selbstwahrnehmung und Fremdwahrnehmung
               von Minderheiten (sowohl "im eigenen Land" als auch im "Ausland");
               und die wechselseitige Relation der identitätsbedingenden Vorgaben
               Sprache, Religion, Kultur und Ethnos, in Selbstsicht und Fremdsicht
               "im eigenen Land" und "im Ausland".},
  year      = {2020},
  funded_by = {LOEWE},
  funded_by_url = {https://proloewe.de/de/loewe-vorhaben/nach-themen/minderheitenstudien/},
  url       = {https://sprache-identitaet.uni-frankfurt.de/},
  logo      = {/wp-content/uploads/2024/02/logo-loewe-minderheitenstudien-blau.png}
}
Berufspraktische Bildungsprozesse im Recht- und Lehramtsreferendariat sowie der Medizin unter Nutzung digitaler Medien (BRIDGE). 2020 – 2023. Funded by BMBF (01JD1906B).
Description
The use of online media is growing in all areas of education. Learners also increasingly rely on online media when entering a profession. Whether usage behavior differs by profession, and whether profession-specific differences exist, has not yet been studied. This is the starting point of the joint research project of Johannes Gutenberg University Mainz and Goethe University Frankfurt. The researchers investigate how career entrants in medicine (during the practical year) as well as teacher trainees and legal trainees use online media. The interdisciplinary team compares generic with profession-specific use of online media in a longitudinal study based on a representative sample. Using innovative approaches from computational linguistics and learning analytics, online trainings are developed and the participants' learning processes are examined. To interpret the data from multiple perspectives, the project collaborates closely with practitioners. The Mainz team coordinates the joint project and contributes the perspectives of business education and law as well as expertise in competence development. The representative, method-integrating study yields scientific insights into the influence of profession-related media use in professional practice and into how it can be fostered digitally. In addition, results are expected that can inform the design of links between formal and non-formal learning opportunities. Practice partners, such as trainers, can directly apply and use the adaptive training concepts developed.
BibTeX
@project{bridge,
  name      = {Berufspraktische Bildungsprozesse im Recht- und Lehramtsreferendariat sowie der Medizin unter Nutzung digitaler Medien (BRIDGE)},
  abstract  = {Die Nutzung von Onlinemedien steigt in allen Bildungsbereichen
               zunehmend an. Auch für den Berufseinstieg nutzen Lernende immer
               häufiger online verfügbare Medien. Ob sich das Nutzungsverhalten
               je nach Beruf unterscheidet und ob es berufsspezifische Unterschiede
               gibt, ist bislang nicht erforscht. Hier setzt das Forschungsprojekt
               der Johannes Gutenberg-Universität Mainz und der Johann Wolfgang
               Goethe-Universität Frankfurt an. Die Wissenschaftlerinnen und
               Wissenschaftler untersuchen, wie Berufseinsteigende der Medizin
               im Praktischen Jahr, Lehramts- sowie Rechtsreferendarinnen und
               -referendare online verfügbare Medien nutzen. Hierbei vergleicht
               das interdisziplinäre Team die generelle mit der berufsspezifischen
               Nutzung von Onlinemedien anhand einer repräsentativen Stichprobe
               im Längsschnitt. Mithilfe von innovativen Ansätzen aus der Computerlinguistik
               und aus dem Bereich der Learning Analytics werden Online-Trainings
               entwickelt und die Lernprozesse der Probanden untersucht. Um die
               Daten multiperspektivisch zu interpretieren, wird eng mit der
               Praxis zusammengearbeitet. Das Team der Universität Mainz koordiniert
               das Verbundprojekt und bringt die wirtschaftspädagogische und
               rechtswissenschaftliche Perspektive sowie Expertise zur Kompetenzentwicklung
               ein. Die repräsentative und methodenintegrative Studie liefert
               wissenschaftliche Erkenntnisse zum Einfluss und zur digitalen
               Förderbarkeit der berufsbezogenen Mediennutzung in der Berufspraxis.
               Außerdem sind Ergebnisse zu erwarten, die zur Gestaltung der Verbindung
               von formalen und non-formalen Lerngelegenheiten genutzt werden
               können. Praxispartner, wie zum Beispiel Ausbildnerinnen und Ausbildner,
               können die entwickelten adaptiven Trainingskonzepte praktisch
               anwenden und nutzen.},
  year      = {2020},
  until     = {2023},
  funded_by = {BMBF (01JD1906B)},
  funded_by_url = {https://www.empirische-bildungsforschung-bmbf.de/de/Themenfinder-1720.html/projekt/01JD1906A},
  url       = {https://bridge.uni-mainz.de/},
  logo      = {/wp-content/uploads/2024/01/logo-BRIDGE.png}
}
Specialised Information Service Biodiversity Research (BIOfid). 2017 – . Funded by DFG (FID 326061700).
Description
The specialised information service BIOfid (www.biofid.de) is oriented towards the special needs of scientists researching biodiversity topics at research institutions and in natural history collections. Since 2017, BIOfid has been building an infrastructure that contributes to the provision and mobilisation of research-relevant data in a variety of ways in the context of current developments in biodiversity research.
BibTeX
@project{biofid,
  name      = {Specialised Information Service Biodiversity Research (BIOfid)},
  abstract  = {The specialised information service BIOfid (www.biofid.de) is
               oriented towards the special needs of scientists researching biodiversity
               topics at research institutions and in natural history collections.
               Since 2017, BIOfid has been building an infrastructure that contributes
               to the provision and mobilisation of research-relevant data in
               a variety of ways in the context of current developments in biodiversity
               research.},
  year      = {2017},
  funded_by = {DFG (FID 326061700)},
  funded_by_url = {https://gepris.dfg.de/gepris/projekt/326061700},
  url       = {https://www.biofid.de/en/},
  repository = {https://github.com/FID-Biodiversity},
  logo      = {/wp-content/uploads/2024/01/logo-BIOfid.png},
  keywords  = {biofid,biodiversity}
}

Teaching


Bachelor Thesis: Diversification of the container landscape for DUUI.
Description
The processing of large and extensive unstructured corpora remains a significant challenge for various scientific disciplines. To address this, the Docker Unified UIMA Interface (DUUI) was developed. DUUI provides NLP methods through container services to perform horizontally and vertically distributed big data analysis in a unified, standardized, reusable, and schema-based process. In the medium to long term, DUUI can leverage a variety of container services to implement optimal processing solutions tailored to specific scenarios and environmental parameters. This involves the creation, implementation, and evaluation of container services for DUUI that have not yet been integrated. Bachelor's or Master's theses are invited to address this task of service integration. See also:
Corresponding Lab Member: Giuseppe Abrami and Alexander Mehler.
Bachelor Thesis: How does Language Bias Affect Pretrained Language Models?.
Description
Does language bias exist in pretrained large language models, such as those trained using a masked language modeling objective? What are the core components of these models that tend to produce this bias? Language bias refers to the tendency of multilingual models to prefer answering or selecting responses (e.g., in question-answering or information retrieval tasks) in the same language as the query, even when more likely candidate answers are available in other languages. What are the primary causes of this behavior? Are they linguistic, embedded in the training objective, or influenced by the loss function? These questions remain unresolved. Bachelor's and Master's theses are invited to explore these or related questions. References:
Corresponding Lab Member: Ali Raza and Alexander Mehler.
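The language-bias phenomenon described above can be made operational with a small measurement sketch. Everything here is illustrative: the two-dimensional vectors are toy stand-ins for real multilingual embeddings, and the function name is an assumption, not part of any existing benchmark or library.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def shows_language_bias(query_vec, query_lang, candidates):
    """candidates: (embedding, language, model_score) triples.
    Returns True if the model's top-scored answer shares the query's
    language although a candidate in another language is closer in
    embedding space - i.e., the bias pattern the thesis topic describes."""
    by_model = max(candidates, key=lambda c: c[2])
    by_similarity = max(candidates, key=lambda c: cosine(query_vec, c[0]))
    return by_model[1] == query_lang and by_similarity[1] != query_lang
```

Applied over a multilingual QA test set, the fraction of queries for which this predicate holds would give one simple bias score to compare models against.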
Master Thesis: Aristotelian Modification of Nominals.
Description
The standard semantics of noun-modifying adjectives is typically explained in terms of set membership in one way or another. Modern theories often incorporate scales, particularly for measure adjectives. This master's thesis will generalize such approaches by employing more general property spaces, which can be conceptualized as accidental qualities, a notion derived from Aristotle’s linguistic work. The accidental qualities of nominals will be determined by clustering adjectives from large corpora, thereby enriching lexical entries. This thesis complements computational linguistic research on the generative lexicon, has relevance for multimodal speech-gesture integration, and offers a novel perspective on the metaphoric use of adjectives.
Corresponding Lab Member: Andy Lücking and Alexander Mehler.
Bachelor Thesis: Exploring Pretrained Retrievers and Embedding-Based Search for Accurate Book Metadata Retrieval in RAG Pipelines.
Description
Retrieving accurate book metadata is essential for enhancing the performance of Retrieval-Augmented Generation (RAG) pipelines. This project explores modern, non-heuristic approaches to metadata retrieval, focusing on the use of pretrained retrievers and embedding-based similarity search. Instead of relying on manually crafted heuristics, these methods leverage embeddings generated by state-of-the-art models to identify the most relevant metadata and associated texts. The experiment will utilize large indexed corpora, such as Wikipedia and online library databases, to evaluate the efficacy of pretrained retrievers and embedding similarity for matching input metadata with incomplete or ambiguous information. The project will involve indexing metadata and textual content from publicly available sources (e.g., Open Library, Google Books, Wikipedia) using vector-based search frameworks. Pretrained models, such as dense retrievers (e.g., DPR, SentenceTransformers), will be used to generate embeddings for both input metadata and indexed corpora. The results will be compared to traditional heuristic-based methods to evaluate retrieval accuracy, scalability, and adaptability to incomplete metadata scenarios. This research addresses a significant bottleneck in RAG pipelines, where retrieval systems must efficiently integrate external knowledge to improve language model performance in answering specific queries. While this study focuses on bibliographic data, the proposed methods are generalizable and applicable to other domains requiring accurate and scalable metadata retrieval. The outcomes will provide insights into the trade-offs between heuristic and non-heuristic approaches and contribute to advancing metadata retrieval techniques for knowledge-intensive NLP tasks. References:
Corresponding Lab Member: Omar Momen and Alexander Mehler.
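The embedding-based similarity search at the heart of the project above can be sketched in a few lines. This is a minimal stand-in, assuming precomputed vectors: a real system would obtain embeddings from a dense retriever (e.g., SentenceTransformers) and use an approximate-nearest-neighbor index instead of a full sort.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def top_k(query_vec, index, k=3):
    """index: list of (record_id, embedding) pairs, e.g. embedded book
    metadata. Returns the ids of the k records nearest to the query."""
    ranked = sorted(index, key=lambda rec: cosine(query_vec, rec[1]),
                    reverse=True)
    return [rid for rid, _ in ranked[:k]]
```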
Bachelor Thesis: A comparative study of methodologies used to identify human- vs. automatically generated text.
Description
With the advent of large language models such as ChatGPT, growing ethical concerns have emerged, highlighting the need for models that recognize automatically generated text. Such detection models are becoming increasingly popular but remain underexplored and not well established. A study is needed to provide an overview of existing work in this area and evaluate its usefulness. Bachelor's and Master's theses are invited to explore this field through a comparative approach by reimplementing and testing a range of established methods. References:
Corresponding Lab Member: Ali Raza and Alexander Mehler.
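As a starting point for such a comparative study, two classic surface cues from the detection literature can be computed directly: type-token ratio (lexical diversity) and sentence-length variance ("burstiness"). This sketch is illustrative only; competitive detectors rely on model-based signals such as perplexity, which are out of scope here.

```python
import re

def stylometric_features(text):
    """Compute two simple surface features often discussed as cues for
    distinguishing human from machine-generated text."""
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    ttr = len(set(tokens)) / len(tokens)          # type-token ratio
    sents = [s for s in re.split(r"[.!?]", text) if s.strip()]
    lengths = [len(s.split()) for s in sents]
    mean = sum(lengths) / len(lengths)
    var = sum((n - mean) ** 2 for n in lengths) / len(lengths)
    return {"ttr": ttr, "burstiness": var}
```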
Bachelor Thesis: Affiliation of Speech and Gesture through LLMs.
Description
Most "referential" gestures have a docking point in accompanying speech, known as the lexical affiliate. This bachelor’s thesis leverages this empirical fact to utilize large language models (LLMs) for gesture annotation. Each occurrence of a referential gesture in a multimodal dataset is presented to an LLM, which is tasked with identifying the corresponding affiliate expression in speech. Through this process, a gesture interpretation is derived. Additionally, the approach aims to detect gestures that lack an overt affiliate. Building on the strong performance of LLMs in handling bridging relations, the thesis proposes a frame-based interpretation for such gestures. This work makes a central topic of multimodal communication accessible to modern computational techniques, provides quantitative insights into speech-gesture affiliation, and lays the foundation for further gesture classifications.
Corresponding Lab Member: Andy Lücking and Alexander Mehler.
Bachelor Thesis: Multimodal data integration and processing in DUUI.
Description
The Docker Unified UIMA Interface (DUUI) is a tool designed for the automated analysis of large corpora using a variety of NLP tools. Currently, DUUI supports the processing of text, audio, and video data. To extend its capabilities, additional support for multimodal data, such as that provided by Va.Si.Li-Lab – which includes motion data, object interaction data, and more – should be integrated into DUUI. All integrated data will need to be linked through a new type system tailored to each modality. Furthermore, processes such as motion detection must be incorporated to effectively process and analyze these new data types within DUUI. Bachelor's and Master's theses are invited to explore this multimodal model extension and integration. References:
Corresponding Lab Member: Mevlüt Bagci and Alexander Mehler.
Bachelor Thesis: Bridging the Gap Between Virtual Environments and Reality.
Description
Virtual Reality (VR) enables immersive user experiences by providing highly realistic environments and interactions, particularly with advances in hand, eye, and face tracking. These technologies enhance engagement and facilitate more natural communication, effectively reducing the perceived physical distance between users. However, most virtual meeting environments remain entirely synthetic, disconnected from the physical spaces of users. Despite ongoing improvements in realistic digital avatars (e.g., MetaHumans), the creation and accessibility of authentic virtual environments remain limited. To address this, we propose a novel approach using real-time photogrammetry to reconstruct physical spaces in VR accurately. This method enables users to virtually visit each other's physical environments, seamlessly blending virtual and real spaces, thereby narrowing the gap between digital and physical interactions. Bachelor's and Master's theses are invited to experiment with and evaluate these emerging technologies. See also:
Corresponding Lab Member: Patrick Schrottenbacher and Alexander Mehler.
Master Thesis: Enhancing Audio Transcription with Visual Cues: A Multimodal Approach Utilizing Lip Movements and Facial Expressions for German Language Applications.
Description
Accurate audio transcription remains a challenge in environments with background noise, low-quality recordings, or overlapping speech. While significant progress has been made using audio-only approaches powered by deep learning and automatic speech recognition (ASR) systems (Graves et al., 2013), such methods often fail in adverse acoustic conditions. This thesis proposes the design and implementation of a multimodal transcription tool that integrates visual information, such as lip movements and facial expressions, to improve transcription accuracy, with a focus on adapting this approach to the German language. The proposed tool leverages the correlation between spoken words and their associated visual signals, such as lip shape dynamics and facial expressions, to improve the decoding of ambiguous or misinterpreted audio signals (Chung et al., 2017). It combines deep learning-based audio and video models to refine transcription results (Afouras et al., 2018). Existing datasets will form the basis for training and testing. For English, datasets such as LRS3 (Afouras et al., 2018) will be used. For German, the GLips (German Lips) dataset (Zöllner et al., 2022) provides extensive video data suitable for word-level lip-reading research. An important sub-task is the fine-tuning of existing pre-trained models for German-specific linguistic and phonetic features. This requires transfer learning techniques to adapt models trained on English datasets to German phoneme distributions, articulatory patterns, and grammatical structures. Experimental evaluation will measure transcription accuracy in both English and German, especially under noisy conditions, to quantify the advantages of the multimodal approach. 
This work aims to advance multilingual ASR systems by demonstrating the benefits of integrating audiovisual data for transcription. The results are expected to show the effectiveness of combining existing datasets and adapting pre-trained models to improve transcription accuracy in real-world scenarios.
  • Graves et al., 2013
  • Chung et al., 2017
  • Afouras et al., 2018
  • Zöllner et al., 2022

Corresponding Lab Member: Maxim Konca and Alexander Mehler.
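The audiovisual combination step can be illustrated with a minimal late-fusion sketch: per-token log-probabilities from an audio model and a lip-reading model are interpolated log-linearly. The weighting scheme, probabilities, and token inventory are illustrative assumptions, not taken from the cited systems.

```python
from math import log

def late_fusion(audio_logp, visual_logp, alpha=0.7):
    """Log-linear fusion of candidate-token scores from an audio and a
    visual (lip-reading) model; alpha is an assumed interpolation weight."""
    fused = {tok: alpha * audio_logp[tok] + (1 - alpha) * visual_logp[tok]
             for tok in audio_logp}
    return max(fused, key=fused.get)

# Example: the audio model slightly prefers "bad", but the visible lip
# closure makes the visual model strongly prefer "pad".
audio = {"bad": log(0.50), "pad": log(0.45)}
visual = {"bad": log(0.10), "pad": log(0.90)}
```

With the visual stream included, the ambiguous audio evidence is overruled; with alpha=1.0 the decision falls back to audio alone.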
Master Thesis: Can Adversarial Text Snippets Achieve Refusal Dimension Deletion?.
Description
The threat of abuse by determined adversaries makes the safety of public-facing LLMs a key priority for developers and researchers alike.
Despite intensive efforts, recent research shows that "refusal in language models [may be] mediated by a [one-dimensional subspace in the model's weights]" (Arditi et al., 2024) and that adversarial algorithms can create text snippets that circumvent harmful-response prevention in open- and closed-source LLMs (Zou et al., 2023). This raises the question of whether these two methods of "jailbreaking" LLMs align, i.e., whether adversarially generated text segments can shift a model's hidden states into a position that effectively approximates refusal dimension deletion.

Related Work

Corresponding Lab Member: Manuel Schaaf and Alexander Mehler.
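The directional-ablation idea referenced above (Arditi et al., 2024) amounts to removing the component of a hidden state along a single direction: h' = h − (h·r̂)r̂. The sketch below uses toy vectors in place of actual model hidden states.

```python
def ablate_direction(h, r):
    """Project hidden state h orthogonally to direction r, i.e. remove
    the component along r (the putative 'refusal direction')."""
    norm = sum(x * x for x in r) ** 0.5
    r_hat = [x / norm for x in r]
    coeff = sum(a * b for a, b in zip(h, r_hat))
    return [a - coeff * b for a, b in zip(h, r_hat)]
```

The thesis question then becomes whether adversarial suffixes push hidden states toward the same region that this explicit projection produces.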
Master Thesis: Unlocking Wikipedia for Research: A Modular Toolkit for Structured NLP Applications.
Description
Wikipedia serves as a vast and diverse resource that is widely used in research domains to address a variety of tasks and questions. However, its size, semi-structured form, inconsistent formatting, and noisy elements (e.g., infoboxes) pose significant challenges to its accessibility and usability in structured research applications. This thesis aims to develop a comprehensive framework to overcome these challenges and enable researchers to effectively use Wikipedia's content for NLP and other structured research purposes. The proposed work focuses on the design of a modular, database-driven toolkit that supports the local use of Wikipedia for NLP processing. Key objectives include exploring existing tools and databases, integrating Wikidata, and leveraging different database solutions to address different use cases. Specific tasks include selecting and evaluating databases, designing database schemas, processing Wikipedia dump files as source data, and implementing robust mechanisms for data extraction, parsing (e.g., Wikitext), and updating. Additional challenges such as constructing category and social graphs, managing interlanguage links, handling revisions, and integrating DUUI (Docker Unified UIMA Interface) will also be addressed. The goal of this thesis is to provide a practical toolkit for researchers that facilitates the effective and flexible use of Wikipedia's content for a wide range of applications. See also:
Corresponding Lab Member: Daniel Baumartz and Alexander Mehler.
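The dump-processing step described above can be sketched with the standard library alone. This is a deliberately simplified reading of the MediaWiki export format: real dumps are multi-gigabyte compressed streams carrying an XML namespace and many revisions per page, all of which are omitted here.

```python
import xml.etree.ElementTree as ET

def iter_pages(dump_xml):
    """Yield (title, wikitext) pairs from a MediaWiki export fragment."""
    root = ET.fromstring(dump_xml)
    for page in root.iter("page"):
        yield page.findtext("title"), page.findtext("./revision/text")
```

For full dumps, `ET.iterparse` over a streaming decompressor would replace `fromstring`, so that pages can be written to the target database incrementally.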
Bachelor Thesis: Development of an HTML Parser for Efficient Extraction of Search Engine Results.
Description
The exponential growth of online information has made search engines indispensable tools for accessing relevant data. Search engines such as Google, Bing, and Yandex generate results that serve a variety of needs, from academic research to commercial applications. However, accessing and analyzing these results often requires parsing the underlying HTML code of the search results pages. This thesis investigates the design and implementation of an HTML parser capable of extracting, structuring, and analyzing search engine results in a reliable and efficient manner. The goal of this project is to develop a robust HTML parser tailored for extracting search results from multiple search engines, while addressing challenges such as dynamic content loading, anti-scraping measures, and variations in HTML structures. The parser will identify key elements such as titles, URLs, snippets, and metadata, standardize the extracted data into a consistent format, and output it for further analysis or integration with other systems. The implementation involves a combination of web scraping libraries, regular expressions, and advanced parsing techniques, with an emphasis on handling dynamic web content rendered through JavaScript. The project also addresses ethical and legal considerations related to web scraping, and proposes mechanisms for compliance with search engine terms of service and applicable data usage regulations. The developed parser will be evaluated based on its accuracy, speed, and adaptability to changes in search engine HTML structures. Performance benchmarks and use cases, such as competitive analysis and data aggregation, will be presented to demonstrate the utility and versatility of the system. The outcome of this thesis aims to contribute to the fields of data mining and web technologies by providing a fundamental tool for generically accessing and leveraging search engine data.
Corresponding Lab Member: Maxim Konca and Alexander Mehler.
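The core extraction step can be sketched with the standard-library `html.parser`. The `class="result"` marker below is a hypothetical placeholder: real search engines use different, frequently changing markup, which is exactly the robustness problem the thesis addresses.

```python
from html.parser import HTMLParser

class ResultParser(HTMLParser):
    """Collect (title, url) pairs from anchors carrying an assumed
    'result' class attribute."""
    def __init__(self):
        super().__init__()
        self.results, self._href, self._buf = [], None, []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "a" and "result" in a.get("class", ""):
            self._href, self._buf = a.get("href"), []

    def handle_data(self, data):
        if self._href is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.results.append(("".join(self._buf).strip(), self._href))
            self._href = None
```

JavaScript-rendered result pages would additionally require a headless browser before this parsing stage.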
Bachelor Thesis: Bringing Order to Chaos: Structuring Unstructured Documents.
Description
The increasing volume and diversity of text corpora generated daily from various sources poses significant challenges for their processing and analysis. Generic tools that facilitate the exploration and understanding of such corpora in a standardized and intuitive manner are rare. A key issue is the transformation of unstructured plain text into generically structured formats that allow efficient reading, sorting, and searching. This task aims to develop models or algorithms that can process raw, unstructured text and produce structured outputs based on predefined rules, algorithms, or models. These outputs should be compatible with the Docker Unified UIMA Interface (DUUI), our general purpose corpus annotation tool. The structured format must also comply with the UIMA (Unstructured Information Management Architecture) standard. Bachelor's or Master's theses are invited to explore this topic and contribute to the broader goal of making diverse textual corpora more accessible and manageable. See also: Unlocking the Heterogeneous Landscape of Big Data NLP with DUUI - ACL Anthology
Corresponding Lab Member: Kevin Boenisch and Alexander Mehler.
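The rule-based structuring step can be sketched as a simple heuristic segmenter: a short line without terminal punctuation is taken to open a new section. The heading heuristic is an assumption for illustration, and the DUUI/UIMA serialization of the result is omitted.

```python
def segment(text):
    """Split plain text into (heading, body) sections using a simple
    heading heuristic."""
    sections, heading, body = [], None, []
    for line in text.splitlines():
        s = line.strip()
        if s and len(s.split()) <= 6 and not s.endswith((".", "!", "?", ":")):
            if heading is not None or body:
                sections.append((heading, " ".join(body).strip()))
            heading, body = s, []
        elif s:
            body.append(s)
    sections.append((heading, " ".join(body).strip()))
    return sections
```

A thesis would replace this heuristic with learned models and map each section onto a UIMA type so DUUI components can consume it.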
Bachelor Thesis: Retrieval-Augmented Generation (RAG): Synthesizing Knowledge from Large Corpora.
Description
The increase in textual data in scientific and other domains has created an urgent need for tools that can efficiently retrieve accurate information from large corpora. Can large language models help researchers identify critical information - metaphorically, "needles in a haystack"? This research explores Retrieval-Augmented Generation (RAG) as a framework for proposing pipelines and models capable of locating specific units of information in response to user queries. Crucially, this approach avoids the need for explicit fine-tuning of large language models on domain-specific data. Instead, it emphasizes techniques such as prompt engineering, advanced data retrieval mechanisms, and innovative query formulation. Possible methodologies include the use of embedding spaces, graph databases, or hybrid architectures to improve retrieval accuracy and synthesis capabilities. Bachelor's or Master's theses are invited to contribute novel solutions to this interdisciplinary challenge. See also: OPEN SCHOLAR: SYNTHESIZING SCIENTIFIC LITERATURE WITH RETRIEVAL-AUGMENTED LMS; CCC-BERT | Kaggle
Corresponding Lab Member: Kevin Boenisch and Alexander Mehler.
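The RAG pattern described above can be reduced to two steps: retrieve the most relevant documents, then assemble an augmented prompt for a generator. The retriever below scores by bag-of-words overlap, a transparent stand-in for the embedding- or graph-based retrievers the topic names; the prompt template and the absent generator call are assumptions.

```python
def retrieve(query, corpus, k=2):
    """Rank documents by token overlap with the query and return the top k."""
    q = set(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query, corpus, k=2):
    """Assemble the augmented prompt that would be passed to an LLM."""
    context = "\n".join(retrieve(query, corpus, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```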
    Bachelor/Master Thesis: Multimodal VR Data Meets DUUI.
    Description
    The processing of large and extensive unstructured corpora is a constant challenge for various scientific disciplines. For this purpose, the Docker Unified UIMA Interface (DUUI) was developed, which provides NLP analysis methods based on container services to perform horizontally and vertically distributed big-data analyses in a unified, standardized, reusable and schema-based process. The first steps towards multimodality have also already been taken. The task of this thesis is to adapt DUUI processing so that it can also be used to process multimodal data collected through VR experiments. The main difficulty lies in the alignment of speech, transcription and movements.
    See also:
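The alignment problem can be illustrated with a simple nearest-timestamp pairing of two modality streams. This is a hypothetical sketch, not Va.Si.Li-Lab or DUUI code; the tolerance value is an assumption:

```python
import bisect

def align(reference, other, tolerance=0.25):
    """Pair each reference event (t, label) with the temporally closest
    event from another modality, if it lies within `tolerance` seconds."""
    times = [t for t, _ in other]  # assumed sorted by timestamp
    pairs = []
    for t, label in reference:
        i = bisect.bisect_left(times, t)
        # The nearest neighbour is either just before or just after t.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        best = min(candidates, key=lambda j: abs(times[j] - t), default=None)
        if best is not None and abs(times[best] - t) <= tolerance:
            pairs.append((label, other[best][1]))
        else:
            pairs.append((label, None))  # no movement event close enough
    return pairs
```

Real VR data would additionally need clock synchronization across devices before such pairing is meaningful.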
    Bachelor Thesis: The emperor's new clothes alias TextAnnotator's new and responsive interface.
    Description
    TextAnnotator, a web-based tool for platform-independent, simultaneous, and collaborative semi-automated annotation of unstructured corpora based on UIMA, is a flexible and feature-rich solution for annotating various linguistic and semantic features using multiple annotation views and tools. However, TextAnnotator is currently implemented at the visual interface level using an older version of ExtJS, which needs to be upgraded to a modern interface. This upgrade is necessary in the short to medium term to enable the creation and implementation of more modular and interchangeable components. Bachelor's or Master's theses are invited that aim to develop and test new interfaces to enhance TextAnnotator's versatility and attractiveness by leveraging modern web interface technologies. See also:
    Corresponding Lab Member: Giuseppe Abrami and Alexander Mehler.
    Master Thesis: Natural Human Interactions with LLMs via Audio.
    Description
    Humans converse naturally by voice, and this is also possible with large language models (LLMs): human speech can be converted to text, which serves as input for the LLM, and the LLM's output is then converted back to audio. However, due to latency and the nature of audio output, it remains a major challenge to build a chatbot that communicates naturally in both text and audio without human interlocutors noticing this latency, especially in multilingual environments. Bachelor's or Master's theses are therefore invited that address these latency issues. See also:
    Corresponding Lab Member: Mevlüt Bagci and Alexander Mehler.
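One common latency-masking strategy is to start speech synthesis on sentence-sized chunks while the LLM is still generating, instead of waiting for the full answer. A hypothetical sketch (the `tts` callback stands in for a real TTS engine):

```python
import queue
import threading

def stream_chunks(token_stream, boundary=frozenset(".!?")):
    """Yield speakable chunks as soon as a sentence boundary arrives."""
    buf = []
    for tok in token_stream:
        buf.append(tok)
        if tok and tok[-1] in boundary:
            yield " ".join(buf)
            buf = []
    if buf:                       # flush any trailing partial sentence
        yield " ".join(buf)

def speak_concurrently(token_stream, tts):
    """Feed chunks to a TTS callback on a worker thread while the
    LLM is (conceptually) still producing tokens."""
    q = queue.Queue()

    def worker():
        while True:
            chunk = q.get()
            if chunk is None:     # sentinel: generation finished
                break
            tts(chunk)

    t = threading.Thread(target=worker)
    t.start()
    for chunk in stream_chunks(token_stream):
        q.put(chunk)
    q.put(None)
    t.join()
```

The perceived latency then shrinks to the time until the first sentence boundary rather than the full generation time.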
    Bachelor/Master Thesis: Live VR Experiment Visualisation.
    Description
    Va.Si.Li-Lab is a virtual reality-based system designed for tracking and analyzing interpersonal communication by integrating extensive tracking capabilities, such as hand, face, and eye movements, alongside audio data. It enables controlled multi-user scenarios, allowing researchers to assign roles, impose modality-specific restrictions, and analyze communication behavior through aligned multimodal data stored in a central database. It can be problematic both to track the progress of an experiment and to process the data in a meaningful way afterwards. This thesis is about the meaningful processing of the tracked data, both live and afterwards.
    See also:
    Bachelor Thesis: Developing a Heuristic for Retrieving Specific Book Metadata in Retrieval-Augmented Generation (RAG) Pipelines.
    Description
    Accurate retrieval of book metadata is a critical challenge in the development of Retrieval-Augmented Generation (RAG) pipelines. This project aims to develop a heuristic-based procedure for retrieving the most valid metadata - and potentially the text - of books from various online library databases using publicly available APIs. These databases contain large collections of book records, often with incomplete or inconsistent metadata. This makes querying and matching a specific publication a complex task, especially when dealing with incomplete input metadata. The procedure will address cases where multiple books share similar metadata, such as the same title and author, but belong to different editions or publications. The proposed heuristic will analyze and rank the results of API queries to identify the best match for the input data. The approach involves a detailed study of metadata patterns in online libraries and the development of robust matching criteria that account for variations and gaps in the data. This work contributes to an emerging area in natural language processing where RAG pipelines rely on external knowledge sources to augment large language models (LLMs) with domain-specific information. By addressing the challenge of metadata retrieval, this project will improve the accuracy and reliability of downstream tasks, such as answering questions about specific books. Although the focus of this work is on bibliographic data, the developed heuristic has the potential to be generalized for metadata retrieval in other domains. The outcome of this project will be a validated methodology that can be seamlessly integrated into RAG pipelines, representing a significant step forward in leveraging external databases for high quality contextual information retrieval. References:
    Corresponding Lab Member: Omar Momen and Alexander Mehler.
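The kind of ranking heuristic envisaged here can be sketched as a weighted fuzzy match over whatever metadata fields are available. The weights and field names below are illustrative assumptions; a real heuristic would be tuned on library data:

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def rank_candidates(query: dict, candidates: list) -> list:
    """Rank API results by a weighted match score over shared fields,
    tolerating missing values and partial matches (e.g. subtitle noise)."""
    weights = {"title": 0.5, "author": 0.3, "year": 0.2}  # illustrative

    def score(cand):
        s = 0.0
        for field, w in weights.items():
            if query.get(field) and cand.get(field):
                s += w * similarity(str(query[field]), str(cand[field]))
        return s

    return sorted(candidates, key=score, reverse=True)
```

Because the score degrades gracefully with missing or noisy fields, records for different editions of the same work can still be ordered sensibly.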

    Winter Semester, 2024

    Practical: Programmierpraktikum. Alexander Mehler and Giuseppe Abrami.
    Practical: Multimodal AI. Alexander Mehler, Andy Lücking and Alexander Henlein.
    Practical: Time Machines on Virtual- and Augmented Reality. Alexander Mehler and Giuseppe Abrami.
    Practical: Transformer-based Natural Language Processing. Alexander Mehler and Manuel Schaaf.
    Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.

    Summer Semester, 2024

    Lecture: NLP-gestützte Data Science. Alexander Mehler and Manuel Stoeckel.
    Practical: Deep Learning for Text Imaging. Alexander Mehler and Giuseppe Abrami.
    Practical: Transformer-based Natural Language Processing. Alexander Mehler and Manuel Stoeckel.
    Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.
    Practical: Multimodal AI. Alexander Mehler, Andy Lücking and Alexander Henlein.
    Seminar: Text Analytics. Alexander Mehler.
    Seminar: Computational Humanities. Alexander Mehler.

    Winter Semester, 2023

    Lecture: Einführung Computational Humanities. Alexander Mehler and Manuel Stoeckel.
    Practical: Transformer-based Natural Language Processing. Alexander Mehler and Manuel Stoeckel.
    Practical: Programmierpraktikum. Alexander Mehler and Giuseppe Abrami.
    Practical: Multimodal Computing: Machine Learning, virtuelle Realität und Kommunikation. Alexander Mehler, Andy Lücking and Alexander Henlein.
    Practical: Time Machines on Virtual- and Augmented Reality. Alexander Mehler and Giuseppe Abrami.
    Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.
    Seminar: Text Analytics. Alexander Mehler.
    Seminar: Computational Humanities. Alexander Mehler.

    Summer Semester, 2023

    Lecture: NLP-gestützte Data Science. Alexander Mehler, Manuel Stoeckel and Giuseppe Abrami.
    Practical: Multimodal Computing: Machine Learning, virtuelle Realität und Kommunikation. Alexander Mehler, Andy Lücking and Alexander Henlein.
    Practical: Transformer-based Natural Language Processing. Alexander Mehler and Manuel Stoeckel.
    Practical: Deep Learning for Text Imaging. Alexander Mehler and Giuseppe Abrami.
    Practical: Multilingual systems with AI. Alexander Mehler and Mevlüt Bagci.
    Seminar: Text Analytics. Alexander Mehler.
    Seminar: Computational Humanities. Alexander Mehler.

    Supervisions

    Alexander Henlein. 2023. PhD Thesis: Toward context-based text-to-3D scene generation.
    BibTeX
    @phdthesis{Henlein:2023,
      author    = {Alexander Henlein},
      title     = {Toward context-based text-to-3D scene generation},
      type      = {doctoralthesis},
      pages     = {199},
      school    = {Johann Wolfgang Goethe-Universität},
      doi       = {10.21248/gups.73448},
      year      = {2023},
      pdf       = {https://publikationen.ub.uni-frankfurt.de/files/73448/main.pdf}
    }
    Wahed Hemati. 2020. PhD Thesis: TextImager-VSD : large scale verb sense disambiguation and named entity recognition in the context of TextImager.
    BibTeX
    @phdthesis{Hemati:2020,
      author    = {Wahed Hemati},
      title     = {TextImager-VSD : large scale verb sense disambiguation and named
                   entity recognition in the context of TextImager},
      pages     = {174},
      year      = {2020},
      url       = {http://publikationen.ub.uni-frankfurt.de/frontdoor/index/index/docId/56089},
      pdf       = {http://publikationen.ub.uni-frankfurt.de/files/56089/dissertation_Wahed_Hemati.pdf}
    }
    Tolga Uslu. 2020. PhD Thesis: Multi-document analysis : semantic analysis of large text corpora beyond topic modeling.
    BibTeX
    @phdthesis{Uslu:2020,
      author    = {Tolga Uslu},
      title     = {Multi-document analysis : semantic analysis of large text corpora
                   beyond topic modeling},
      pages     = {204},
      year      = {2020},
      url       = {http://publikationen.ub.uni-frankfurt.de/frontdoor/index/index/docId/56140},
      pdf       = {http://publikationen.ub.uni-frankfurt.de/files/56140/Dissertation_Tolga_Uslu.pdf}
    }
    Armin Hoenen. 2018. PhD Thesis: Tools, evaluation and preprocessing for stemmatology.
    BibTeX
    @phdthesis{Hoenen2018,
      type      = {Dissertation},
      author    = {Armin Hoenen},
      title     = {Tools, evaluation and preprocessing for stemmatology},
      school    = {Goethe University Frankfurt},
      year      = {2018}
    }
    Mohammad Zahurul Islam. 2015. PhD Thesis: Multilingual text classification using information-theoretic features.
    BibTeX
    @phdthesis{Islam:2015,
      author    = {Mohammad Zahurul Islam},
      title     = {Multilingual text classification using information-theoretic features},
      pages     = {189},
      year      = {2015},
      pdf       = {http://publikationen.ub.uni-frankfurt.de/files/38157/thesis.pdf},
      abstract  = {The number of multilingual texts in the World Wide Web (WWW) is
                   increasing dramatically and a multilingual economic zone like
                   the European Union (EU) requires the availability of multilingual
                   Natural Language Processing (NLP) tools. Due to a rapid development
                   of NLP tools, many lexical, syntactic, semantic and other linguistic
                   features have been used in different NLP applications. However,
                   there are some situations where these features can not be used
                   due the application type or unavailability of NLP resources for
                   some of the languages. That is why an application that is intended
                   to handle multilingual texts must have features that are not dependent
                   on a particular language and specific linguistic tools. In this
                   thesis, we will focus on two such applications: text readability
                   and source and translation classification. In this thesis, we
                   provide 18 features that are not only suitable for both applications,
                   but are also language and linguistic tools independent. In order
                   to build a readability classifier, we use texts from three different
                   languages: English, German and Bangla. Our proposed features achieve
                   a classification accuracy that is comparable with a classifier
                   using 40 linguistic features. The readability classifier achieves
                   a classification F-score of 74.21\% on the English Wikipedia corpus,
                   an F-score of 75.47\% on the English textbook corpus, an F-score
                   of 86.46\% on the Bangla textbook corpus and an F-score of 86.26\%
                   on the German GEO/GEOLino corpus. We used more than two million
                   sentence pairs from 21 European languages in order to build the
                   source and translation classifier. The classifier using the same
                   eighteen features achieves a classification accuracy of 86.63\%.
                   We also used the same features to build a classifier that classifies
                   translated texts based on their origin. The classifier achieves
                   classification accuracy of 75\% for texts from 10 European languages.
                   In this thesis, we also provide four different corpora, three
                   for text readability analysis and one for corpus based translation
                   studies.}
    }
    Olga Abramov. 2012. PhD Thesis: Network theory applied to linguistics: new advances in language classification and typology.
    BibTeX
    @phdthesis{Abramov:2012,
      author    = {Abramov, Olga},
      title     = {Network theory applied to linguistics: new advances in language
                   classification and typology},
      school    = {Bielefeld University, Germany},
      abstract  = {This thesis bridges between two scientific fields -- linguistics
                   and computer science -- in terms of Linguistic Networks. From
                   the linguistic point of view we examine whether languages can
                   be distinguished when looking at network topology of different
                   linguistic networks. We deal with up to 17 languages and ask how
                   far the methods of network theory reveal the peculiarities of
                   single languages. We present and apply network models from different
                   levels of linguistic representation: syntactic, phonological and
                   morphological. The network models presented here allow to integrate
                   various linguistic features at once, which enables a more abstract,
                   holistic view at the particular language. From the point of view
                   of computer science we elaborate the instrumentarium of network
                   theory applying it to a new field. We study the expressiveness
                   of different network features and their ability to characterize
                   language structure. We evaluate the interplay of these features
                   and their goodness in the task of classifying languages genealogically.
                   Among others we compare network features related to: average degree,
                   average geodesic distance, clustering, entropy-based indices,
                   assortativity, centrality, compactness etc. We also propose some
                   new indices that can serve as additional characteristics of networks.
                   The results obtained show that network models succeed in classifying
                   related languages, and allow to study language structure in general.
                   The mathematical analysis of the particular network indices brings
                   new insights into the nature of these indices and their potential
                   when applied to different networks.},
      pdf       = {https://pub.uni-bielefeld.de/download/2538828/2542368},
      website   = {http://pub.uni-bielefeld.de/publication/2538828},
      year      = {2012}
    }

    Publications

    Total: 292

    2025

    Giuseppe Abrami, Markos Genios, Filip Fitzermann, Daniel Baumartz and Alexander Mehler. 2025. Docker Unified UIMA Interface: New perspectives for NLP on big data. SoftwareX, 29:102033.
    BibTeX
    @article{Abrami:et:al:2025:a,
      title     = {Docker Unified UIMA Interface: New perspectives for NLP on big data},
      journal   = {SoftwareX},
      volume    = {29},
      pages     = {102033},
      year      = {2025},
      issn      = {2352-7110},
      doi       = {10.1016/j.softx.2024.102033},
      url       = {https://www.sciencedirect.com/science/article/pii/S2352711024004047},
      author    = {Giuseppe Abrami and Markos Genios and Filip Fitzermann and Daniel Baumartz
                   and Alexander Mehler},
      keywords  = {duui, Docker, Kubernetes, UIMA, Distributed NLP},
      abstract  = {Processing large amounts of natural language text using machine
                   learning-based models is becoming important in many disciplines.
                   This demand is being met by a variety of approaches, resulting
                   in the heterogeneous deployment of separate, partly incompatible,
                   not natively scalable applications. To overcome the technological
                   bottleneck involved, we have developed Docker Unified UIMA Interface,
                   a system for the standardized, parallel, platform-independent,
                   distributed and microservices-based solution for processing large
                   and extensive text corpora with any NLP method. We present DUUI
                   as a framework that enables automated orchestration of GPU-based
                   NLP processes beyond the existing Docker Swarm cluster variant,
                   and in addition to the adaptation to new runtime environments
                   such as Kubernetes. Therefore, a new driver for DUUI is introduced,
                   which enables the lightweight orchestration of DUUI processes
                   within a Kubernetes environment in a scalable setup. In this way,
                   the paper opens up novel text-technological perspectives for existing
                   practices in disciplines that deal with the scientific analysis
                   of large amounts of data based on NLP.}
    }
    Giuseppe Abrami, Daniel Baumartz and Alexander Mehler. 2025. DUUI: A Toolbox for the Construction of a new Kind of Natural Language Processing. Proceedings of the DHd 2025: Under Construction. Geisteswissenschaften und Data Humanities. accepted.
    BibTeX
    @inproceedings{Abrami:et:al:2025:b,
      author    = {Abrami, Giuseppe and Baumartz, Daniel and Mehler, Alexander},
      title     = {DUUI: A Toolbox for the Construction of a new Kind of Natural
                   Language Processing},
      year      = {2025},
      booktitle = {Proceedings of the DHd 2025: Under Construction. Geisteswissenschaften
                   und Data Humanities},
      numpages  = {3},
      location  = {Bielefeld, Germany},
      series    = {DHd 2025},
      keywords  = {duui},
      note      = {accepted}
    }

    2024

    Alexander Mehler, Mevlüt Bagci, Patrick Schrottenbacher, Alexander Henlein, Maxim Konca, Giuseppe Abrami, Kevin Bönisch, Manuel Stoeckel, Christian Spiekermann and Juliane Engel. 2024. Towards New Data Spaces for the Study of Multiple Documents with Va.Si.Li-Lab: A Conceptual Analysis. In: Students', Graduates' and Young Professionals' Critical Use of Online Information: Digital Performance Assessment and Training within and across Domains, 259–303. Ed. by Olga Zlatkin-Troitschanskaia, Marie-Theres Nagel, Verena Klose and Alexander Mehler. Springer Nature Switzerland.
    BibTeX
    @inbook{Mehler:et:al:2024:a,
      author    = {Mehler, Alexander and Bagci, Mevl{\"u}t and Schrottenbacher, Patrick
                   and Henlein, Alexander and Konca, Maxim and Abrami, Giuseppe and B{\"o}nisch, Kevin
                   and Stoeckel, Manuel and Spiekermann, Christian and Engel, Juliane},
      editor    = {Zlatkin-Troitschanskaia, Olga and Nagel, Marie-Theres and Klose, Verena
                   and Mehler, Alexander},
      title     = {Towards New Data Spaces for the Study of Multiple Documents with
                   Va.Si.Li-Lab: A Conceptual Analysis},
      booktitle = {Students', Graduates' and Young Professionals' Critical Use of
                   Online Information: Digital Performance Assessment and Training
                   within and across Domains},
      year      = {2024},
      publisher = {Springer Nature Switzerland},
      address   = {Cham},
      pages     = {259--303},
      abstract  = {The constitution of multiple documents has so far been studied
                   essentially as a process in which a single learner consults a
                   number (of segments) of different documents in the context of
                   the task at hand in order to construct a mental model for the
                   purpose of completing the task. As a result of this research focus,
                   the constitution of multiple documents appears predominantly as
                   a monomodal, non-interactive process in which mainly textual units
                   are studied, supplemented by images, text-image relations and
                   comparable artifacts. This approach is reflected in the contextual
                   fixity of the research design, in which the learners under study
                   search for information using suitably equipped computers. If,
                   on the other hand, we consider the openness of multi-agent learning
                   situations, this scenario lacks the aspects of interactivity,
                   contextual openness and, above all, the multimodality of information
                   objects, information processing and information exchange. This
                   is where the chapter comes in. It describes Va.Si.Li-Lab as an
                   instrument for multimodal measurement for studying and modeling
                   multiple documents in the context of interactive learning in a
                   multi-agent environment. To this end, the chapter places Va.Si.Li-Lab
                   in the spectrum of evolutionary approaches that vary the combination
                   of human and machine innovation and selection. It also combines
                   the requirements of multimodal representational learning with
                   various aspects of contextual plasticity to prepare Va.Si.Li-Lab
                   as a system that can be used for experimental research. The chapter
                   is conceptual in nature, designing a system of requirements using
                   the example of Va.Si.Li-Lab to outline an experimental environment
                   in which the study of Critical Online Reasoning (COR) as a group
                   process becomes possible. Although the chapter illustrates some
                   of these requirements with realistic data from the field of simulation-based
                   learning, the focus is still conceptual rather than experimental,
                   hypothesis-driven. That is, the chapter is concerned with the
                   design of a technology for future research into COR processes.},
      isbn      = {978-3-031-69510-0},
      doi       = {10.1007/978-3-031-69510-0_12},
      url       = {https://doi.org/10.1007/978-3-031-69510-0_12}
    }
    Maxim Konca, Alexander Mehler, Andy Lücking and Daniel Baumartz. 2024. Visualizing Domain-specific and Generic Critical Online Reasoning Related Structures of Online Texts: A Hybrid Approach. In: Students', Graduates' and Young Professionals' Critical Use of Online Information: Digital Performance Assessment and Training within and across Domains, 195–239. Ed. by Olga Zlatkin-Troitschanskaia, Marie-Theres Nagel, Verena Klose and Alexander Mehler. Springer Nature Switzerland.
    BibTeX
    @inbook{Konca:et:al:2024:a,
      author    = {Konca, Maxim and Mehler, Alexander and L{\"u}cking, Andy and Baumartz, Daniel},
      editor    = {Zlatkin-Troitschanskaia, Olga and Nagel, Marie-Theres and Klose, Verena
                   and Mehler, Alexander},
      title     = {Visualizing Domain-specific and Generic Critical Online Reasoning
                   Related Structures of Online Texts: A Hybrid Approach},
      booktitle = {Students', Graduates' and Young Professionals' Critical Use of
                   Online Information: Digital Performance Assessment and Training
                   within and across Domains},
      year      = {2024},
      publisher = {Springer Nature Switzerland},
      address   = {Cham},
      pages     = {195--239},
      abstract  = {Besides ``traditional'' educational media, young professionals
                   in higher education use the Internet to obtain information. To
                   utilize their online research in professional contexts, they critically
                   evaluate the information they access and its sources. One dimension
                   of this evaluation is an assessment of the linguistic state of
                   the online sources, either implicitly or explicitly. This computational
                   educational linguistic study applies methods from computational
                   linguistics to online sources visited by young professionals from
                   three fields (law students, teacher trainees, and medicine student)
                   and develops partly novel visualizations that allow to quickly
                   discover similarities as well as differences between multi-heterogeneous
                   Internet sources, that is, sources that exhibit various topics,
                   genres, and textual structure, among others. The visualizations
                   also allow a comparison of search behaviour between different
                   professional fields. In this way, we found that (1) genre classification
                   has a significant impact on reliability scores, (2) young professionals'
                   search approaches vary by their professional field, and, (3) the
                   best predictor of reliability is indeed the linguistic profile
                   of an online source.},
      isbn      = {978-3-031-69510-0},
      doi       = {10.1007/978-3-031-69510-0_10},
      url       = {https://doi.org/10.1007/978-3-031-69510-0_10}
    }
    Olga Zlatkin-Troitschanskaia, Marie-Theres Nagel, Verena Klose and Alexander Mehler, eds. 2024. Students’, Graduates’ and Young Professionals’ Critical Use of Online Information: Digital Performance Assessment and Training within and across Domains. Springer Cham.
    BibTeX
    @book{Zlatkin-Troitschanskaia:et:al:2024,
      title     = {Students’, Graduates’ and Young Professionals’ Critical Use of
                   Online Information: Digital Performance Assessment and Training
                   within and across Domains},
      editor    = {Zlatkin-Troitschanskaia, Olga and Nagel, Marie-Theres and Klose, Verena
                   and Mehler, Alexander},
      isbn      = {9783031695100},
      url       = {http://dx.doi.org/10.1007/978-3-031-69510-0},
      doi       = {10.1007/978-3-031-69510-0},
      publisher = {Springer Cham},
      year      = {2024},
      abstract  = {This book addresses the topic of online information for everyday
                   personal and professional use by students, graduates, and young
                   professionals. It focuses on the development of the job-related
                   use of online information by young professionals in their practical
                   phases of education (traineeship/practical year) in the domains
                   of law, teaching, and medicine. The research conducted in this
                   context investigates the general and domain-specific use of online
                   resources in educational contexts and examines the effectiveness
                   of an innovative digital training approach in enhancing skills
                   required for the competent use of online information. For this
                   purpose, the presented research uses a yet unprecedented approach
                   of data triangulation, in which self-rated data, digitally and
                   in vivo assessed response process data and expert ratings are
                   integrated into a theoretically founded assessment framework and
                   are examined from various interdisciplinary perspectives with
                   different analysis methods. Overall, this work addresses key research
                   questions related to the use of online information in practical
                   tasks as well as to the impact of digital training. It provides
                   in-depth multidisciplinary analyses of multimodal processes and
                   performance data, allowing implications equally relevant for practitioners,
                   policymakers, and researchers in the field of education.}
    }
    Alexander Henlein, Andy Lücking and Alexander Mehler. 2024. Virtually Restricting Modalities in Interactions: Va.Si.Li-Lab for Experimental Multimodal Research. Proceedings of the 2nd International Symposium on Multimodal Communication (MMSYM 2024), Frankfurt, 25-27 September 2024, 96–97.
    BibTeX
    @inproceedings{Henlein:Luecking:Mehler:2024,
      title     = {Virtually Restricting Modalities in Interactions: Va.Si.Li-Lab
                   for Experimental Multimodal Research},
      author    = {Henlein, Alexander and L{\"u}cking, Andy and Mehler, Alexander},
      booktitle = {Proceedings of the 2nd International Symposium on Multimodal Communication
                   (MMSYM 2024), Frankfurt, 25-27 September 2024},
      pages     = {96--97},
      year      = {2024},
      pdf       = {http://mmsym.org/wp-content/uploads/2024/09/BookOfAbstractsMMSYM2024-3.pdf}
    }
    Andy Lücking, Alexander Mehler and Alexander Henlein. 2024. The Gesture–Prosody Link in Multimodal Grammar. Proceedings of the 2nd International Symposium on Multimodal Communication (MMSYM 2024), Frankfurt, 25-27 September 2024, 128–129.
    BibTeX
    @inproceedings{Luecking:Mehler:Henlein:2024,
      title     = {The Gesture–Prosody Link in Multimodal Grammar},
      author    = {L{\"u}cking, Andy and Mehler, Alexander and Henlein, Alexander},
      booktitle = {Proceedings of the 2nd International Symposium on Multimodal Communication
                   (MMSYM 2024), Frankfurt, 25-27 September 2024},
      pages     = {128--129},
      year      = {2024},
      pdf       = {http://mmsym.org/wp-content/uploads/2024/09/BookOfAbstractsMMSYM2024-3.pdf}
    }
    Andy Lücking, Alexander Mehler and Alexander Henlein. 2024. The Linguistic Interpretation of Non-emblematic Gestures Must be agreed in Dialogue: Combining Perceptual Classifiers and Grounding/Clarification Mechanisms. Proceedings of the 28th Workshop on The Semantics and Pragmatics of Dialogue.
    BibTeX
    @inproceedings{Luecking:Mehler:Henlein:2024-classifier,
      title     = {The Linguistic Interpretation of Non-emblematic Gestures Must
                   be agreed in Dialogue: Combining Perceptual Classifiers and Grounding/Clarification
                   Mechanisms},
      author    = {Lücking, Andy and Mehler, Alexander and Henlein, Alexander},
      year      = {2024},
      booktitle = {Proceedings of the 28th Workshop on The Semantics and Pragmatics of Dialogue},
      series    = {SemDial'24 -- TrentoLogue},
      location  = {Università di Trento, Palazzo Piomarta, Rovereto},
      url       = {https://www.semdial.org/anthology/papers/Z/Z24/Z24-4031/},
      pdf       = {http://semdial.org/anthology/Z24-Lucking_semdial_0031.pdf}
    }
    Dominik Mattern, Wahed Hemati, Andy Lücking and Alexander Mehler. Sep., 2024. On German verb sense disambiguation: A three-part approach based on linking a sense inventory (GermaNet) to a corpus through annotation (TGVCorp) and using the corpus to train a VSD classifier (TTvSense). Journal of Language Modelling, 12(1):155–212.
    BibTeX
    @article{Mattern:Hemati:Luecking:Mehler:2024,
      author    = {Mattern, Dominik and Hemati, Wahed and L{\"u}cking, Andy and Mehler, Alexander},
      title     = {On German verb sense disambiguation: A three-part approach based
                   on linking a sense inventory (GermaNet) to a corpus through annotation
                   (TGVCorp) and using the corpus to train a VSD classifier (TTvSense)},
      abstractnote = {We develop a three-part approach to Verb Sense Disambiguation (VSD) in German. After considering a set of lexical resources and corpora, we arrive at a statistically motivated selection of a subset of verbs and their senses from GermaNet. This sub-inventory is then used to disambiguate the occurrences of the corresponding verbs in a corpus resulting from the union of TüBa-D/Z, Salsa, and E-VALBU. The corpus annotated in this way is called TGVCorp. It is used in the third part of the paper for training a classifier for VSD and for its comparative evaluation with a state-of-the-art approach in this research area, namely EWISER. Our simple classifier outperforms the transformer-based approach on the same data in both accuracy and speed in German but not in English and we discuss possible reasons.},
      journal   = {Journal of Language Modelling},
      volume    = {12},
      number    = {1},
      year      = {2024},
      month     = {Sep.},
      pages     = {155--212},
      url       = {https://jlm.ipipan.waw.pl/index.php/JLM/article/view/356}
    }
    Kevin Bönisch and Alexander Mehler. 2024. Finding Needles in Emb(a)dding Haystacks: Legal Document Retrieval via Bagging and SVR Ensembles. Proceedings of the 2nd Legal Information Retrieval meets Artificial Intelligence Workshop LIRAI 2024. Accepted.
    BibTeX
    @inproceedings{Boenisch:Mehler:2024,
      title     = {Finding Needles in Emb(a)dding Haystacks: Legal Document Retrieval
                   via Bagging and SVR Ensembles},
      author    = {B\"{o}nisch, Kevin and Mehler, Alexander},
      year      = {2024},
      booktitle = {Proceedings of the 2nd Legal Information Retrieval meets Artificial
                   Intelligence Workshop LIRAI 2024},
      location  = {Poznan, Poland},
      publisher = {CEUR-WS.org},
      address   = {Aachen, Germany},
      series    = {CEUR Workshop Proceedings},
      note      = {accepted},
      abstract  = {We introduce a retrieval approach leveraging Support Vector Regression
                   (SVR) ensembles, bootstrap aggregation (bagging), and embedding
                   spaces on the German Dataset for Legal Information Retrieval (GerDaLIR).
                   By conceptualizing the retrieval task in terms of multiple binary
                   needle-in-a-haystack subtasks, we show improved recall over the
                   baselines (0.849 > 0.803 | 0.829) using our voting ensemble, suggesting
                   promising initial results, without training or fine-tuning any
                   deep learning models. Our approach holds potential for further
                   enhancement, particularly through refining the encoding models
                   and optimizing hyperparameters.},
      keywords  = {legal information retrieval, support vector regression, word embeddings, bagging ensemble}
    }
    Patrick Schrottenbacher, Alexander Mehler, Theresa Berg, Jasper Hustedt, Julian Gagel, Timo Lüttig and Giuseppe Abrami. 2024. Geo-spatial hypertext in virtual reality: mapping and navigating global news event spaces. New Review of Hypermedia and Multimedia, 0(0):1–30.
    BibTeX
    @article{Schrottenbacher:et:al:2024,
      author    = {Schrottenbacher, Patrick and Mehler, Alexander and Berg, Theresa
                   and Hustedt, Jasper and Gagel, Julian and Lüttig, Timo and Abrami, Giuseppe},
      title     = {Geo-spatial hypertext in virtual reality: mapping and navigating
                   global news event spaces},
      journal   = {New Review of Hypermedia and Multimedia},
      volume    = {0},
      number    = {0},
      pages     = {1--30},
      year      = {2024},
      publisher = {Taylor \& Francis},
      doi       = {10.1080/13614568.2024.2383601},
      url       = {https://doi.org/10.1080/13614568.2024.2383601},
      eprint    = {https://doi.org/10.1080/13614568.2024.2383601},
      abstract  = {Every day, a myriad of events take place that are documented and
                   shared online through news articles from a variety of sources.
                   As a result, as users navigate the Web, the volume of data can
                   lead to information overload, making it difficult to find specific
                   details about an event. We present News in Time and Space (NiTS)
                   to address this issue: NiTS is a fully immersive system integrated
                   into Va.Si.Li-Lab that organises textual information in a geospatial
                   hypertext system in virtual reality. With NiTS, users can visualise,
                   filter and interact with information currently based on GDELT
                   on a virtual globe providing document networks to analyse global
                   events and trends. The article describes NiTS, its event semantics
                   and architecture. It evaluates NiTS in comparison to a classic
                   search engine website, extended by NiTS's information filtering
                   capabilities to make it comparable. Our comparison with this website
                   technology, which is directly linked to the user's usage habits,
                   shows that NiTS enables comparable information exploration even
                   if the users have little or no experience with VR. That is, we
                   observe an equivalent search result behaviour, but with the advantage
                   that VR allows users to get their results with a higher level
                   of usability without distracting them from their tasks. Through
                   its integration with Va.Si.Li-Lab, a simulation-based learning
                   environment, NiTS can be used in simulations of learning processes
                   aimed at studying critical online reasoning, where Va.Si.Li-Lab
                   guarantees that this can be done in relation to individual or
                   groups of learners.}
    }
    Kevin Bönisch, Alexander Mehler, Shaduan Babbili, Yannick Heinrich, Philipp Stephan and Giuseppe Abrami. 2024. Viki LibraRy: Collaborative Hypertext Browsing and Navigation in Virtual Reality. New Review of Hypermedia and Multimedia, 0(0):1–31.
    BibTeX
    @article{Boenisch:et:al:2024:b,
      author    = {B\"{o}nisch, Kevin and Mehler, Alexander and Babbili, Shaduan
                   and Heinrich, Yannick and Stephan, Philipp and Abrami, Giuseppe},
      abstract  = {We present Viki LibraRy, a dynamically built library in virtual
                   reality (VR) designed to visualize hypertext systems, with an
                   emphasis on collaborative interaction and spatial immersion. Viki
                   LibraRy goes beyond traditional methods of text distribution by
                   providing a platform where users can share, process, and engage
                   with textual information. It operates at the interface of VR,
                   collaborative learning and spatial data processing to make reading
                   tangible and memorable in a spatially mediated way. The article
                   describes the building blocks of Viki LibraRy, its underlying
                   architecture, and several use cases. It evaluates Viki LibraRy
                   in comparison to a conventional web interface for text retrieval
                   and reading. The article shows that Viki LibraRy provides users
                   with spatial references for structuring their recall, so that
                   they can better remember consulted texts and their meta-information
                   (e.g. in terms of subject areas and content categories).},
      title     = {{Viki LibraRy: Collaborative Hypertext Browsing and Navigation
                   in Virtual Reality}},
      journal   = {New Review of Hypermedia and Multimedia},
      volume    = {0},
      number    = {0},
      pages     = {1--31},
      year      = {2024},
      publisher = {Taylor \& Francis},
      doi       = {10.1080/13614568.2024.2383581},
      url       = {https://doi.org/10.1080/13614568.2024.2383581},
      eprint    = {https://doi.org/10.1080/13614568.2024.2383581}
    }
    Kevin Bönisch, Manuel Stoeckel and Alexander Mehler. 2024. HyperCausal: Visualizing Causal Inference in 3D Hypertext. Proceedings of the 35th ACM Conference on Hypertext and Social Media, 330–336.
    BibTeX
    @inproceedings{Boenisch:et:al:2024,
      author    = {B\"{o}nisch, Kevin and Stoeckel, Manuel and Mehler, Alexander},
      title     = {HyperCausal: Visualizing Causal Inference in 3D Hypertext},
      year      = {2024},
      isbn      = {9798400705953},
      publisher = {Association for Computing Machinery},
      address   = {New York, NY, USA},
      url       = {https://doi.org/10.1145/3648188.3677049},
      doi       = {10.1145/3648188.3677049},
      abstract  = {We present HyperCausal, a 3D hypertext visualization framework
                   for exploring causal inference in generative Large Language Models
                   (LLMs). HyperCausal maps the generative processes of LLMs into
                   spatial hypertexts, where tokens are represented as nodes connected
                   by probability-weighted edges. The edges are weighted by the prediction
                   scores of next tokens, depending on the underlying language model.
                   HyperCausal facilitates navigation through the causal space of
                   the underlying LLM, allowing users to explore predicted word sequences
                   and their branching. Through comparative analysis of LLM parameters
                   such as token probabilities and search algorithms, HyperCausal
                   provides insight into model behavior and performance. Implemented
                   using the Hugging Face transformers library and Three.js, HyperCausal
                   ensures cross-platform accessibility to advance research in natural
                   language processing using concepts from hypertext research. We
                   demonstrate several use cases of HyperCausal and highlight the
                   potential for detecting hallucinations generated by LLMs using
                   this framework. The connection with hypertext research arises
                   from the fact that HyperCausal relies on user interaction to unfold
                   graphs with hierarchically appearing branching alternatives in
                   3D space. This approach refers to spatial hypertexts and early
                   concepts of hierarchical hypertext structures. A third connection
                   concerns hypertext fiction, since the branching alternatives mediated
                   by HyperCausal manifest non-linearly organized reading threads
                   along artificially generated texts that the user decides to follow
                   optionally depending on the reading context.},
      booktitle = {Proceedings of the 35th ACM Conference on Hypertext and Social Media},
      pages     = {330--336},
      numpages  = {7},
      keywords  = {3D hypertext, large language models, visualization},
      location  = {Poznan, Poland},
      series    = {HT '24},
      video     = {https://www.youtube.com/watch?v=ANHFTupnKhI}
    }
    Daniel Baumartz, Maxim Konca, Alexander Mehler, Patrick Schrottenbacher and Dominik Braunheim. 2024. Measuring Group Creativity of Dialogic Interaction Systems by Means of Remote Entailment Analysis. Proceedings of the 35th ACM Conference on Hypertext and Social Media, 153–166.
    BibTeX
    @inproceedings{Baumartz:et:al:2024,
      author    = {Baumartz, Daniel and Konca, Maxim and Mehler, Alexander and Schrottenbacher, Patrick
                   and Braunheim, Dominik},
      title     = {Measuring Group Creativity of Dialogic Interaction Systems by
                   Means of Remote Entailment Analysis},
      year      = {2024},
      isbn      = {9798400705953},
      publisher = {Association for Computing Machinery},
      address   = {New York, NY, USA},
      url       = {https://doi.org/10.1145/3648188.3675140},
      doi       = {10.1145/3648188.3675140},
      abstract  = {We present a procedure for assessing group creativity that allows
                   us to compare the contributions of human interlocutors and chatbots
                   based on generative AI such as ChatGPT. We focus on everyday creativity
                   in terms of dialogic communication and test four hypotheses about
                   the difference between human and artificial communication. Our
                   procedure is based on a test that requires interlocutors to cooperatively
                   interpret a sequence of sentences for which we control for coherence
                   gaps with reference to the notion of entailment. Using NLP methods,
                   we automatically evaluate the spoken or written contributions
                   of interlocutors (human or otherwise). The paper develops a routine
                   for automatic transcription based on Whisper, for sampling texts
                   based on their entailment relations, for analyzing dialogic contributions
                   along their semantic embeddings, and for classifying interlocutors
                   and interaction systems based on them. In this way, we highlight
                   differences between human and artificial conversations under conditions
                   that approximate free dialogic communication. We show that despite
                   their obvious classificatory differences, it is difficult to see
                   clear differences even in the domain of dialogic communication
                   given the current instruments of NLP.},
      booktitle = {Proceedings of the 35th ACM Conference on Hypertext and Social Media},
      pages     = {153--166},
      numpages  = {14},
      keywords  = {Creative AI, Creativity, Generative AI, Hermeneutics, NLP},
      location  = {Poznan, Poland},
      series    = {HT '24}
    }
    Giuseppe Abrami, Dominik Alexander Wontke, Gurpreet Singh and Alexander Mehler. 2024. Va.Si.Li-ES: VR-based Dynamic Event Processing, Environment Change and User Feedback in Va.Si.Li-Lab. Proceedings of the 35th ACM Conference on Hypertext and Social Media, 357–368.
    BibTeX
    @inproceedings{Abrami:et:al:2024:b,
      author    = {Abrami, Giuseppe and Wontke, Dominik Alexander and Singh, Gurpreet
                   and Mehler, Alexander},
      title     = {Va.Si.Li-ES: VR-based Dynamic Event Processing, Environment Change
                   and User Feedback in Va.Si.Li-Lab},
      year      = {2024},
      isbn      = {9798400705953},
      publisher = {Association for Computing Machinery},
      address   = {New York, NY, USA},
      url       = {https://doi.org/10.1145/3648188.3675154},
      doi       = {10.1145/3648188.3675154},
      abstract  = {Flexibility, adaptability, modularity, and extensibility in the
                   context of a collaborative system are critical features for multi-user
                   hypertext systems. In addition to facilitating acceptance and
                   increasing reusability, these features simplify development cycles
                   and enable a larger range of application areas. However, especially
                   in virtual 3D hypertext systems, many of the features are only
                   partially available or not available at all. To fill this gap,
                   we present an approach to virtual hypertext systems for the realization
                   of dynamic event systems. Such an event system can be created
                   and serialized simultaneously at run time regarding the modification
                   of situational, environmental parameters. This includes informing
                   users and allowing them to participate in the environmental dynamics
                   of the system. We present Va.Si.Li-ES as a module of Va.Si.Li-Lab,
                   describe several environmental scenarios that can be adapted,
                   and provide use cases in the context of 3D hypertext systems.},
      booktitle = {Proceedings of the 35th ACM Conference on Hypertext and Social Media},
      pages     = {357--368},
      numpages  = {12},
      keywords  = {Collaborative Simulation, Environmental Event System, Hypertext, Ubiq, Va.Si.Li-Lab, Virtual Reality},
      location  = {Poznan, Poland},
      series    = {HT '24}
    }
    Alexander Henlein, Anastasia Bauer, Reetu Bhattacharjee, Aleksandra Ćwiek, Alina Gregori, Frank Kügler, Jens Lemanski, Andy Lücking, Alexander Mehler, Pilar Prieto, Paula G. Sánchez-Ramón, Job Schepens, Martin Schulte-Rüther, Stefan R. Schweinberger and Celina I. von Eiff. 2024. An Outlook for AI Innovation in Multimodal Communication Research. Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management, 182–234.
    BibTeX
    @inproceedings{Henlein:et:al:2024-vicom,
      title     = {An Outlook for AI Innovation in Multimodal Communication Research},
      author    = {Henlein, Alexander and Bauer, Anastasia and Bhattacharjee, Reetu
                   and Ćwiek, Aleksandra and Gregori, Alina and Kügler, Frank and Lemanski, Jens
                   and Lücking, Andy and Mehler, Alexander and Prieto, Pilar and Sánchez-Ramón, Paula G.
                   and Schepens, Job and Schulte-Rüther, Martin and Schweinberger, Stefan R.
                   and von Eiff, Celina I.},
      editor    = {Duffy, Vincent G.},
      year      = {2024},
      booktitle = {Digital Human Modeling and Applications in Health, Safety, Ergonomics
                   and Risk Management},
      series    = {HCII 2024. Lecture Notes in Computer Science},
      publisher = {Springer},
      address   = {Cham},
      pages     = {182--234},
      isbn      = {978-3-031-61066-0}
    }
    Giuseppe Abrami and Alexander Mehler. August, 2024. Efficient, uniform and scalable parallel NLP pre-processing with DUUI: Perspectives and Best Practice for the Digital Humanities. Digital Humanities Conference 2024 - Book of Abstracts (DH 2024), 15–18.
    BibTeX
    @inproceedings{Abrami:Mehler:2024,
      author    = {Abrami, Giuseppe and Mehler, Alexander},
      title     = {Efficient, uniform and scalable parallel NLP pre-processing with
                   DUUI: Perspectives and Best Practice for the Digital Humanities},
      year      = {2024},
      month     = {08},
      editor    = {Karajgikar, Jajwalya and Janco, Andrew and Otis, Jessica},
      booktitle = {Digital Humanities Conference 2024 - Book of Abstracts (DH 2024)},
      location  = {Washington, DC, USA},
      series    = {DH},
      keywords  = {duui},
      publisher = {Zenodo},
      doi       = {10.5281/zenodo.13761079},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2024/12/DH2024_Poster.pdf},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2024/12/DH2024_Abstract.pdf},
      url       = {https://doi.org/10.5281/zenodo.13761079},
      pages     = {15--18},
      numpages  = {4}
    }
    Andy Lücking, Giuseppe Abrami, Leon Hammerla, Marc Rahn, Daniel Baumartz, Steffen Eger and Alexander Mehler. May, 2024. Dependencies over Times and Tools (DoTT). Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 4641–4653.
    BibTeX
    @inproceedings{Luecking:et:al:2024,
      abstract  = {Purpose: Based on the examples of English and German, we investigate
                   to what extent parsers trained on modern variants of these languages
                   can be transferred to older language levels without loss. Methods:
                   We developed a treebank called DoTT (https://github.com/texttechnologylab/DoTT)
                   which covers, roughly, the time period from 1800 until today,
                   in conjunction with the further development of the annotation
                   tool DependencyAnnotator. DoTT consists of a collection of diachronic
                   corpora enriched with dependency annotations using 3 parsers,
                   6 pre-trained language models, 5 newly trained models for German,
                   and two tag sets (TIGER and Universal Dependencies). To assess
                   how the different parsers perform on texts from different time
                   periods, we created a gold standard sample as a benchmark. Results:
                   We found that the parsers/models perform quite well on modern
                   texts (document-level LAS ranging from 82.89 to 88.54) and slightly
                   worse on older texts, as expected (average document-level LAS
                   84.60 vs. 86.14), but not significantly. For German texts, the
                   (German) TIGER scheme achieved slightly better results than UD.
                   Conclusion: Overall, this result speaks for the transferability
                   of parsers to past language levels, at least dating back until
                   around 1800. This very transferability, it is however argued,
                   means that studies of language change in the field of dependency
                   syntax can draw on dependency distance but miss out on some grammatical
                   phenomena.},
      address   = {Torino, Italy},
      author    = {L{\"u}cking, Andy and Abrami, Giuseppe and Hammerla, Leon and Rahn, Marc
                   and Baumartz, Daniel and Eger, Steffen and Mehler, Alexander},
      booktitle = {Proceedings of the 2024 Joint International Conference on Computational
                   Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
      editor    = {Calzolari, Nicoletta and Kan, Min-Yen and Hoste, Veronique and Lenci, Alessandro
                   and Sakti, Sakriani and Xue, Nianwen},
      month     = {may},
      pages     = {4641--4653},
      publisher = {ELRA and ICCL},
      title     = {Dependencies over Times and Tools ({D}o{TT})},
      url       = {https://aclanthology.org/2024.lrec-main.415},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2024/05/LREC_2024_Poster_DoTT.pdf},
      year      = {2024}
    }
    Maxim Konca, Andy Lücking and Alexander Mehler. May, 2024. German SRL: Corpus Construction and Model Training. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 7717–7727.
    BibTeX
    @inproceedings{Konca:et:al:2024,
      abstract  = {A useful semantic role-annotated resource for training semantic
                   role models for the German language is missing. We point out some
                   problems of previous resources and provide a new one due to a
                   combined translation and alignment process: The gold standard
                   CoNLL-2012 semantic role annotations are translated into German.
                   Semantic role labels are transferred due to alignment models.
                   The resulting dataset is used to train a German semantic role
                   model. With F1-scores around 0.7, the major roles achieve competitive
                   evaluation scores, but avoid limitations of previous approaches.
                   The described procedure can be applied to other languages as well.},
      address   = {Torino, Italy},
      author    = {Konca, Maxim and L{\"u}cking, Andy and Mehler, Alexander},
      booktitle = {Proceedings of the 2024 Joint International Conference on Computational
                   Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
      editor    = {Calzolari, Nicoletta and Kan, Min-Yen and Hoste, Veronique and Lenci, Alessandro
                   and Sakti, Sakriani and Xue, Nianwen},
      month     = {may},
      pages     = {7717--7727},
      publisher = {ELRA and ICCL},
      title     = {{G}erman {SRL}: Corpus Construction and Model Training},
      url       = {https://aclanthology.org/2024.lrec-main.682},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2024/05/LREC_2024_Poster_GERMAN_SRL.pdf},
      year      = {2024}
    }
    Giuseppe Abrami, Mevlüt Bagci and Alexander Mehler. 2024. German Parliamentary Corpus (GerParCor) Reloaded. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 7707–7716.
    BibTeX
    @inproceedings{Abrami:et:al:2024:a,
      abstract  = {In 2022, the largest German-speaking corpus of parliamentary protocols
                   from three different centuries, on a national and federal level
                   from the countries of Germany, Austria, Switzerland and Liechtenstein,
                   was collected and published - GerParCor. Through GerParCor, it
                   became possible to provide for the first time various parliamentary
                   protocols which were not available digitally and, moreover, could
                   not be retrieved and processed in a uniform manner. Furthermore,
                   GerParCor was additionally preprocessed using NLP methods and
                   made available in XMI format. In this paper, GerParCor is significantly
                   updated by including all new parliamentary protocols in the corpus,
                   as well as adding and preprocessing further parliamentary protocols
                   previously not covered, so that a period up to 1797 is now covered.
                   Besides the integration of a new, state-of-the-art and appropriate
                   NLP preprocessing for the handling of large text corpora, this
                   update also provides an overview of the further reuse of GerParCor
                   by presenting various provisioning capabilities such as API's,
                   among others.},
      address   = {Torino, Italy},
      author    = {Abrami, Giuseppe and Bagci, Mevl{\"u}t and Mehler, Alexander},
      booktitle = {Proceedings of the 2024 Joint International Conference on Computational
                   Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
      editor    = {Calzolari, Nicoletta and Kan, Min-Yen and Hoste, Veronique and Lenci, Alessandro
                   and Sakti, Sakriani and Xue, Nianwen},
      pages     = {7707--7716},
      publisher = {ELRA and ICCL},
      title     = {{G}erman Parliamentary Corpus ({G}er{P}ar{C}or) Reloaded},
      url       = {https://aclanthology.org/2024.lrec-main.681},
      pdf       = {https://aclanthology.org/2024.lrec-main.681.pdf},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2024/05/GerParCor_Reloaded_Poster.pdf},
      video     = {https://www.youtube.com/watch?v=5X-w_oXOAYo},
      keywords  = {gerparcor,corpus},
      year      = {2024}
    }

    2023

    Alina Gregori, Federica Amici, Ingmar Brilmayer, Aleksandra Ćwiek, Lennart Fritzsche, Susanne Fuchs, Alexander Henlein, Oliver Herbort, Frank Kügler, Jens Lemanski, Katja Liebal, Andy Lücking, Alexander Mehler, Kim Tien Nguyen, Wim Pouw, Pilar Prieto, Patrick Louis Rohrer, Paula G. Sánchez-Ramón, Martin Schulte-Rüther, Petra B. Schumacher, Stefan R. Schweinberger, Volker Struckmeier, Patrick C. Trettenbrein and Celina I. von Eiff. 2023. A Roadmap for Technological Innovation in Multimodal Communication Research. Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management, 402–438.
    BibTeX
    @inproceedings{Gregori:et:al:2023-vicom,
      author    = {Gregori, Alina and Amici, Federica and Brilmayer, Ingmar and {\'{C}}wiek, Aleksandra
                   and Fritzsche, Lennart and Fuchs, Susanne and Henlein, Alexander and Herbort, Oliver
                   and K{\"u}gler, Frank and Lemanski, Jens and Liebal, Katja and L{\"u}cking, Andy
                   and Mehler, Alexander and Nguyen, Kim Tien and Pouw, Wim and Prieto, Pilar
                   and Rohrer, Patrick Louis and S{\'a}nchez-Ram{\'o}n, Paula G. and Schulte-R{\"u}ther, Martin
                   and Schumacher, Petra B. and Schweinberger, Stefan R. and Struckmeier, Volker
                   and Trettenbrein, Patrick C. and von Eiff, Celina I.},
      editor    = {Duffy, Vincent G.},
      title     = {A Roadmap for Technological Innovation in Multimodal Communication Research},
      booktitle = {Digital Human Modeling and Applications in Health, Safety, Ergonomics
                   and Risk Management},
      year      = {2023},
      publisher = {Springer Nature Switzerland},
      address   = {Cham},
      pages     = {402--438},
      abstract  = {Multimodal communication research focuses on how different means
                   of signalling coordinate to communicate effectively. This line
                   of research is traditionally influenced by fields such as cognitive
                   and neuroscience, human-computer interaction, and linguistics.
                   With new technologies becoming available in fields such as natural
                   language processing and computer vision, the field can increasingly
                   avail itself of new ways of analyzing and understanding multimodal
                   communication. As a result, there is a general hope that multimodal
                   research may be at the ``precipice of greatness'' due to technological
                   advances in computer science and resulting extended empirical
                   coverage. However, for this to come about there must be sufficient
                   guidance on key (theoretical) needs of innovation in the field
                   of multimodal communication. Absent such guidance, the research
                   focus of computer scientists might increasingly diverge from crucial
                   issues in multimodal communication. With this paper, we want to
                   further promote interaction between these fields, which may enormously
                   benefit both communities. The multimodal research community (represented
                   here by a consortium of researchers from the Visual Communication
                   [ViCom] Priority Programme) can engage in the innovation by clearly
                   stating which technological tools are needed to make progress
                   in the field of multimodal communication. In this article, we
                   try to facilitate the establishment of a much needed common ground
                   on feasible expectations (e.g., in terms of terminology and measures
                   to be able to train machine learning algorithms) and to critically
                   reflect possibly idle hopes for technical advances, informed by
                   recent successes and challenges in computer science, social signal
                   processing, and related domains.},
      isbn      = {978-3-031-35748-0},
      pdf       = {https://pure.mpg.de/rest/items/item_3511464_5/component/file_3520176/content}
    }
    Kevin Bönisch, Giuseppe Abrami, Sabine Wehnert and Alexander Mehler. 2023. Bundestags-Mine: Natural Language Processing for Extracting Key Information from Government Documents. Legal Knowledge and Information Systems.
    BibTeX
    @inproceedings{Boenisch:et:al:2023,
      title     = {{Bundestags-Mine}: Natural Language Processing for Extracting
                   Key Information from Government Documents},
      isbn      = {9781643684734},
      issn      = {1879-8314},
      url       = {http://dx.doi.org/10.3233/FAIA230996},
      doi       = {10.3233/faia230996},
      booktitle = {Legal Knowledge and Information Systems},
      publisher = {IOS Press},
      author    = {B\"{o}nisch, Kevin and Abrami, Giuseppe and Wehnert, Sabine and Mehler, Alexander},
      year      = {2023}
    }
    Alexander Leonhardt, Giuseppe Abrami, Daniel Baumartz and Alexander Mehler. 2023. Unlocking the Heterogeneous Landscape of Big Data NLP with DUUI. Findings of the Association for Computational Linguistics: EMNLP 2023, 385–399.
    BibTeX
    @inproceedings{Leonhardt:et:al:2023,
      title     = {Unlocking the Heterogeneous Landscape of Big Data {NLP} with {DUUI}},
      author    = {Leonhardt, Alexander and Abrami, Giuseppe and Baumartz, Daniel
                   and Mehler, Alexander},
      editor    = {Bouamor, Houda and Pino, Juan and Bali, Kalika},
      booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2023},
      year      = {2023},
      address   = {Singapore},
      publisher = {Association for Computational Linguistics},
      url       = {https://aclanthology.org/2023.findings-emnlp.29},
      pages     = {385--399},
      pdf       = {https://aclanthology.org/2023.findings-emnlp.29.pdf},
      abstract  = {Automatic analysis of large corpora is a complex task, especially
                   in terms of time efficiency. This complexity is increased by the
                   fact that flexible, extensible text analysis requires the continuous
                   integration of ever new tools. Since there are no adequate frameworks
                   for these purposes in the field of NLP, and especially in the
                   context of UIMA, that are not outdated or unusable for security
                   reasons, we present a new approach to address the latter task:
                   Docker Unified UIMA Interface (DUUI), a scalable, flexible, lightweight,
                   and feature-rich framework for automatic distributed analysis
                   of text corpora that leverages Big Data experience and virtualization
                   with Docker. We evaluate DUUI{'}s communication approach against
                   a state-of-the-art approach and demonstrate its outstanding behavior
                   in terms of time efficiency, enabling the analysis of big text
                   data.},
      keywords  = {duui}
    }
    Alexander Henlein, Andy Lücking, Mevlüt Bagci and Alexander Mehler. 2023. Towards grounding multimodal semantics in interaction data with Va.Si.Li-Lab. Proceedings of the 8th Conference on Gesture and Speech in Interaction (GESPIN).
    BibTeX
    @inproceedings{Henlein:et:al:2023c,
      title     = {Towards grounding multimodal semantics in interaction data with Va.Si.Li-Lab},
      author    = {Henlein, Alexander and Lücking, Andy and Bagci, Mevlüt and Mehler, Alexander},
      booktitle = {Proceedings of the 8th Conference on Gesture and Speech in Interaction (GESPIN)},
      location  = {Nijmegen, Netherlands},
      year      = {2023},
      keywords  = {vasililab},
      pdf       = {https://www.gespin2023.nl/documents/talks_and_posters/GeSpIn_2023_papers/GeSpIn_2023_paper_1692.pdf}
    }
    Shaduan Babbili, Kevin Bönisch, Yannick Heinrich, Philipp Stephan, Giuseppe Abrami and Alexander Mehler. 2023. Viki LibraRy: A Virtual Reality Library for Collaborative Browsing and Navigation through Hypertext. Proceedings of the 34th ACM Conference on Hypertext and Social Media.
    BibTeX
    @inproceedings{Babbili:et:al:2023,
      author    = {Babbili, Shaduan and B\"{o}nisch, Kevin and Heinrich, Yannick
                   and Stephan, Philipp and Abrami, Giuseppe and Mehler, Alexander},
      title     = {Viki LibraRy: A Virtual Reality Library for Collaborative Browsing
                   and Navigation through Hypertext},
      year      = {2023},
      isbn      = {9798400702327},
      publisher = {Association for Computing Machinery},
      address   = {New York, NY, USA},
      url       = {https://doi.org/10.1145/3603163.3609079},
      doi       = {10.1145/3603163.3609079},
      abstract  = {We present Viki LibraRy, a virtual-reality-based system for generating
                   and exploring online information as a spatial hypertext. It creates
                   a virtual library based on Wikipedia in which Rooms are used to
                   make data available via a RESTful backend. In these Rooms, users
                   can browse through all articles of the corresponding Wikipedia
                   category in the form of Books. In addition, users can access different
                   Rooms, through virtual portals. Beyond that, the explorations
                   can be done alone or collaboratively, using Ubiq.},
      booktitle = {Proceedings of the 34th ACM Conference on Hypertext and Social Media},
      articleno = {6},
      numpages  = {3},
      keywords  = {virtual reality simulation, virtual reality, virtual hypertext, virtual museum},
      location  = {Rome, Italy},
      series    = {HT '23},
      pdf       = {https://dl.acm.org/doi/pdf/10.1145/3603163.3609079}
    }
    Julian Gagel, Jasper Hustedt, Timo Lüttig, Theresa Berg, Giuseppe Abrami and Alexander Mehler. 2023. News in Time and Space: Global Event Exploration in Virtual Reality. Proceedings of the 34th ACM Conference on Hypertext and Social Media.
    BibTeX
    @inproceedings{Gagel:et:al:2023,
      author    = {Gagel, Julian and Hustedt, Jasper and L\"{u}ttig, Timo and Berg, Theresa
                   and Abrami, Giuseppe and Mehler, Alexander},
      title     = {News in Time and Space: Global Event Exploration in Virtual Reality},
      year      = {2023},
      isbn      = {9798400702327},
      publisher = {Association for Computing Machinery},
      address   = {New York, NY, USA},
      url       = {https://doi.org/10.1145/3603163.3609080},
      doi       = {10.1145/3603163.3609080},
      abstract  = {We present News in Time and Space (NiTS), a virtual reality application
                   for visualization, filtering and interaction with geo-referenced
                   events based on GDELT. It can be used both via VR glasses and
                   as a desktop solution for shared use by multiple users with Ubiq.
                   The aim of NiTS is to provide overviews of global events and trends
                   in order to create a resource for their monitoring and analysis.},
      booktitle = {Proceedings of the 34th ACM Conference on Hypertext and Social Media},
      articleno = {7},
      numpages  = {3},
      keywords  = {virtual hypertext, human data interaction, spatial computing, virtual reality simulation, geographic information systems, virtual reality},
      location  = {Rome, Italy},
      series    = {HT '23},
      pdf       = {https://dl.acm.org/doi/pdf/10.1145/3603163.3609080}
    }
    Giuseppe Abrami, Alexander Mehler, Mevlüt Bagci, Patrick Schrottenbacher, Alexander Henlein, Christian Spiekermann, Juliane Engel and Jakob Schreiber. 2023. Va.Si.Li-Lab as a Collaborative Multi-User Annotation Tool in Virtual Reality and Its Potential Fields of Application. Proceedings of the 34th ACM Conference on Hypertext and Social Media.
    BibTeX
    @inproceedings{Abrami:et:al:2023,
      author    = {Abrami, Giuseppe and Mehler, Alexander and Bagci, Mevl\"{u}t and Schrottenbacher, Patrick
                   and Henlein, Alexander and Spiekermann, Christian and Engel, Juliane
                   and Schreiber, Jakob},
      title     = {Va.Si.Li-Lab as a Collaborative Multi-User Annotation Tool in
                   Virtual Reality and Its Potential Fields of Application},
      year      = {2023},
      isbn      = {9798400702327},
      publisher = {Association for Computing Machinery},
      address   = {New York, NY, USA},
      url       = {https://doi.org/10.1145/3603163.3609076},
      doi       = {10.1145/3603163.3609076},
      abstract  = {During the last thirty years a variety of hypertext approaches
                   and virtual environments -- some virtual hypertext environments
                   -- have been developed and discussed. Although the development
                   of virtual and augmented reality technologies is rapid and improving,
                   and many technologies can be used at affordable conditions, their
                   usability for hypertext systems has not yet been explored. At
                   the same time, even for three-dimensional virtual and
                   augmented environments, there is no generally accepted concept
                   that is similar or nearly as elegant as hypertext. This gap will
                   have to be filled in the next years and a good concept should
                   be developed; in this article we aim to contribute in this direction
                   and also introduce a prototype for a possible implementation of
                   criteria for virtual hypertext simulations.},
      booktitle = {Proceedings of the 34th ACM Conference on Hypertext and Social Media},
      articleno = {22},
      numpages  = {9},
      keywords  = {VaSiLiLab, virtual hypertext, virtual reality, virtual reality simulation, authoring system},
      location  = {Rome, Italy},
      series    = {HT '23},
      pdf       = {https://dl.acm.org/doi/pdf/10.1145/3603163.3609076}
    }
    Alexander Henlein, Anju Gopinath, Nikhil Krishnaswamy, Alexander Mehler and James Pustejovsky. 2023. Grounding human-object interaction to affordance behavior in multimodal datasets. Frontiers in Artificial Intelligence, 6.
    BibTeX
    @article{Henlein:et:al:2023a,
      author    = {Henlein, Alexander and Gopinath, Anju and Krishnaswamy, Nikhil
                   and Mehler, Alexander and Pustejovsky, James},
      doi       = {10.3389/frai.2023.1084740},
      issn      = {2624-8212},
      journal   = {Frontiers in Artificial Intelligence},
      title     = {Grounding human-object interaction to affordance behavior in multimodal datasets},
      url       = {https://www.frontiersin.org/articles/10.3389/frai.2023.1084740},
      volume    = {6},
      year      = {2023}
    }
    Alexander Henlein, Attila Kett, Daniel Baumartz, Giuseppe Abrami, Alexander Mehler, Johannes Bastian, Yannic Blecher, David Budgenhagen, Roman Christof, Tim-Oliver Ewald, Tim Fauerbach, Patrick Masny, Julian Mende, Paul Schnüre and Marc Viel. 2023. Semantic Scene Builder: Towards a Context Sensitive Text-to-3D Scene Framework. Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management, 461–479.
    BibTeX
    @inproceedings{Henlein:et:al:2023b,
      author    = {Henlein, Alexander and Kett, Attila and Baumartz, Daniel and Abrami, Giuseppe
                   and Mehler, Alexander and Bastian, Johannes and Blecher, Yannic and Budgenhagen, David
                   and Christof, Roman and Ewald, Tim-Oliver and Fauerbach, Tim and Masny, Patrick
                   and Mende, Julian and Schn{\"u}re, Paul and Viel, Marc},
      editor    = {Duffy, Vincent G.},
      title     = {Semantic Scene Builder: Towards a Context Sensitive Text-to-3D Scene Framework},
      booktitle = {Digital Human Modeling and Applications in Health, Safety, Ergonomics
                   and Risk Management},
      year      = {2023},
      publisher = {Springer Nature Switzerland},
      address   = {Cham},
      pages     = {461--479},
      abstract  = {We introduce Semantic Scene Builder (SeSB), a VR-based text-to-3D
                   scene framework using SemAF (Semantic Annotation Framework) as
                   a scheme for annotating discourse structures. SeSB integrates
                   a variety of tools and resources by using SemAF and UIMA as a
                   unified data structure to generate 3D scenes from textual descriptions.
                   Based on VR, SeSB allows its users to change annotations through
                   body movements instead of symbolic manipulations: from annotations
                   in texts to corrections in editing steps to adjustments in generated
                   scenes, all this is done by grabbing and moving objects. We evaluate
                   SeSB in comparison with a state-of-the-art open source text-to-scene
                   method (the only one which is publicly available) and find that
                   our approach not only performs better, but also allows for modeling
                   a greater variety of scenes.},
      isbn      = {978-3-031-35748-0},
      doi       = {10.1007/978-3-031-35748-0_32}
    }
    Alexander Mehler, Mevlüt Bagci, Alexander Henlein, Giuseppe Abrami, Christian Spiekermann, Patrick Schrottenbacher, Maxim Konca, Andy Lücking, Juliane Engel, Marc Quintino, Jakob Schreiber, Kevin Saukel and Olga Zlatkin-Troitschanskaia. 2023. A Multimodal Data Model for Simulation-Based Learning with Va.Si.Li-Lab. Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management, 539–565.
    BibTeX
    @inproceedings{Mehler:et:al:2023:a,
      abstract  = {Simulation-based learning is a method in which learners learn
                   to master real-life scenarios and tasks from simulated application
                   contexts. It is particularly suitable for the use of VR technologies,
                   as these allow immersive experiences of the targeted scenarios.
                   VR methods are also relevant for studies on online learning, especially
                   in groups, as they provide access to a variety of multimodal learning
                   and interaction data. However, VR leads to a trade-off between
                   technological conditions of the observability of such data and
                   the openness of learner behavior. We present Va.Si.Li-Lab, a VR-Lab
                   for Simulation-based Learning developed to address this trade-off.
                   Va.Si.Li-Lab uses a graph-theoretical model based on hypergraphs
                   to represent the data diversity of multimodal learning and interaction.
                   We develop this data model in relation to mono- and multimodal,
                   intra- and interpersonal data and interleave it with ISO-Space
                   to describe distributed multiple documents from the perspective
                   of their interactive generation. The paper adds three use cases
                   to motivate the broad applicability of Va.Si.Li-Lab and its data
                   model.},
      address   = {Cham},
      author    = {Mehler, Alexander and Bagci, Mevl{\"u}t and Henlein, Alexander
                   and Abrami, Giuseppe and Spiekermann, Christian and Schrottenbacher, Patrick
                   and Konca, Maxim and L{\"u}cking, Andy and Engel, Juliane and Quintino, Marc
                   and Schreiber, Jakob and Saukel, Kevin and Zlatkin-Troitschanskaia, Olga},
      booktitle = {Digital Human Modeling and Applications in Health, Safety, Ergonomics
                   and Risk Management},
      editor    = {Duffy, Vincent G.},
      isbn      = {978-3-031-35741-1},
      pages     = {539--565},
      publisher = {Springer Nature Switzerland},
      title     = {A Multimodal Data Model for Simulation-Based Learning with Va.Si.Li-Lab},
      year      = {2023},
      doi       = {10.1007/978-3-031-35741-1_39}
    }

    2022

    Cornelia Ebert, Andy Lücking and Alexander Mehler. 2022. Introduction to the 2nd Edition of “Semantic, Artificial and Computational Interaction Studies”. HCI International 2022 - Late Breaking Papers. Multimodality in Advanced Interaction Environments, 36–47.
    BibTeX
    @inproceedings{Ebert:et:al:2022,
      abstract  = {``Behavioromics'' is a term that has been invented to cover the
                   study of multimodal interaction from various disciplines and points
                   of view. These disciplines and points of view, however, lack a
                   platform for exchange. The workshop session on ``Semantic, artificial
                   and computational interaction studies'' provides such a platform.
                   We motivate behavioromics, sketch its historical background, and
                   summarize this year's contributions.},
      address   = {Cham},
      author    = {Ebert, Cornelia and L{\"u}cking, Andy and Mehler, Alexander},
      booktitle = {HCI International 2022 - Late Breaking Papers. Multimodality in
                   Advanced Interaction Environments},
      editor    = {Kurosu, Masaaki and Yamamoto, Sakae and Mori, Hirohiko and Schmorrow, Dylan D.
                   and Fidopiastis, Cali M. and Streitz, Norbert A. and Konomi, Shin'ichi},
      isbn      = {978-3-031-17618-0},
      pages     = {36--47},
      publisher = {Springer Nature Switzerland},
      title     = {Introduction to the 2nd Edition of ``Semantic, Artificial and
                   Computational Interaction Studies''},
      doi       = {10.1007/978-3-031-17618-0_3},
      year      = {2022}
    }
    Sajawel Ahmed, Rob van der Goot, Misbahur Rehman, Carl Kruse, Ömer Özsoy, Alexander Mehler and Gemma Roig. October, 2022. Tafsir Dataset: A Novel Multi-Task Benchmark for Named Entity Recognition and Topic Modeling in Classical Arabic Literature. Proceedings of the 29th International Conference on Computational Linguistics, 3753–3768.
    BibTeX
    @inproceedings{Ahmed:et:al:2022,
      title     = {Tafsir Dataset: A Novel Multi-Task Benchmark for Named Entity
                   Recognition and Topic Modeling in Classical {A}rabic Literature},
      author    = {Ahmed, Sajawel and van der Goot, Rob and Rehman, Misbahur and Kruse, Carl
                   and {\"O}zsoy, {\"O}mer and Mehler, Alexander and Roig, Gemma},
      booktitle = {Proceedings of the 29th International Conference on Computational Linguistics},
      month     = {oct},
      year      = {2022},
      address   = {Gyeongju, Republic of Korea},
      publisher = {International Committee on Computational Linguistics},
      url       = {https://aclanthology.org/2022.coling-1.330},
      pages     = {3753--3768},
      abstract  = {Various historical languages, which used to be lingua franca of
                   science and arts, deserve the attention of current NLP research.
                   In this work, we take the first data-driven steps towards this
                   research line for Classical Arabic (CA) by addressing named entity
                   recognition (NER) and topic modeling (TM) on the example of CA
                   literature. We manually annotate the encyclopedic work of Tafsir
                   Al-Tabari with span-based NEs, sentence-based topics, and span-based
                   subtopics, thus creating the Tafsir Dataset with over 51,000 sentences,
                   the first large-scale multi-task benchmark for CA. Next, we analyze
                   our newly generated dataset, which we make open-source available,
                   with current language models (lightweight BiLSTM, transformer-based
                   MaChAmP) along a novel script compression method, thereby achieving
                   state-of-the-art performance for our target task CA-NER. We also
                   show that CA-TM from the perspective of historical topic models,
                   which are central to Arabic studies, is very challenging. With
                   this interdisciplinary work, we lay the foundations for future
                   research on automatic analysis of CA literature.}
    }
    Maxim Konca, Andy Lücking, Alexander Mehler, Marie-Theres Nagel and Olga Zlatkin-Troitschanskaia. April, 2022. Computational educational linguistics for 'Critical Online Reasoning' among young professionals in medicine, law and teaching.
    BibTeX
    @misc{Konca:et:al:2022,
      author    = {Konca, Maxim and L{\"u}cking, Andy and Mehler, Alexander and Nagel, Marie-Theres
                   and Zlatkin-Troitschanskaia, Olga},
      howpublished = {Presentation given at the AERA annual meeting, April 21--26, 2022, WERA symposium},
      month     = {04},
      title     = {Computational educational linguistics for `Critical Online Reasoning'
                   among young professionals in medicine, law and teaching},
      year      = {2022},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2022/10/BRIDGE_WERA_AERA-2022_reduce.pdf}
    }
    Alexander Mehler, Maxim Konca, Marie-Theres Nagel, Andy Lücking and Olga Zlatkin-Troitschanskaia. March, 2022. On latent domain-specific textual preferences in solving Internet-based generic tasks among graduates/young professionals from three domains.
    BibTeX
    @misc{Mehler:et:al:2022,
      author    = {Mehler, Alexander and Konca, Maxim and Nagel, Marie-Theres and L\"{u}cking, Andy
                   and Zlatkin-Troitschanskaia, Olga},
      year      = {2022},
      month     = {03},
      howpublished = {Presentation at BEBF 2022},
      title     = {On latent domain-specific textual preferences in solving Internet-based
                   generic tasks among graduates/young professionals from three domains},
      abstract  = {Although Critical Online Reasoning (COR) is often viewed as a
                   general competency (e.g. Alexander et al. 2016), studies have
                   found evidence supporting their domain-specificity (Toplak et
                   al. 2002). To investigate this assumption, we focus on commonalities
                   and differences in textual preferences in solving COR-related
                   tasks between graduates/young professionals from three domains.
                   For this reason, we collected data by requiring participants to
                   solve domain-specific (DOM-COR) and generic (GEN-COR) tasks in
                   an authentic Internet-based COR performance assessment (CORA),
                   allowing us to disentangle the assumed components of COR abilities.
                   Here, we focus on GEN-COR to distinguish between different groups
                   of graduates from the three disciplines in the context of generic
                   COR tasks. We present a computational model for educationally
                   relevant texts that combines features at multiple levels (lexical,
                   syntactic, semantic). We use machine learning to predict domain-specific
                   group membership based on documents consulted during task solving.
                   A major contribution of our analyses is a multi-part text classification
                   system that contrasts human annotation and rating of the documents
                   used with a semi-automatic classification to predict the document
                   type of web pages. That is, we work with competing classifications
                   to support our findings. In this way, we develop a computational
                   linguistic model that correlates GEN-COR abilities with properties
                   of documents consulted for solving the GEN-COR tasks. Results
                   show that participants from different domains indeed inquire different
                   sets of online sources for the same task. Machine learning-based
                   classifications show that the distributional differences can be
                   reproduced by computational linguistic models.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2022/04/On_latent_domain-specific_textual_preferences_in_solving_Internet-based_generic_tasks_among_graduates__young_professionals_from_three_domains.pdf}
    }
    Alexander Henlein and Alexander Mehler. 2022. What do Toothbrushes do in the Kitchen? How Transformers Think our World is Structured. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 5791–5807.
    BibTeX
    @inproceedings{Henlein:Mehler:2022,
      title     = {What do Toothbrushes do in the Kitchen? How Transformers Think
                   our World is Structured},
      author    = {Henlein, Alexander and Mehler, Alexander},
      booktitle = {Proceedings of the 2022 Conference of the North American Chapter
                   of the Association for Computational Linguistics: Human Language
                   Technologies},
      year      = {2022},
      address   = {Seattle, United States},
      publisher = {Association for Computational Linguistics},
      url       = {https://aclanthology.org/2022.naacl-main.425},
      doi       = {10.18653/v1/2022.naacl-main.425},
      pages     = {5791--5807},
      abstract  = {Transformer-based models are now predominant in NLP. They outperform
                   approaches based on static models in many respects. This success
                   has in turn prompted research that reveals a number of biases
                   in the language models generated by transformers. In this paper
                   we utilize this research on biases to investigate to what extent
                   transformer-based language models allow for extracting knowledge
                   about object relations (X occurs in Y; X consists of Z; action
                   A involves using X). To this end, we compare contextualized models
                   with their static counterparts. We make this comparison dependent
                   on the application of a number of similarity measures and classifiers.
                   Our results are threefold: Firstly, we show that the models combined
                   with the different similarity measures differ greatly in terms
                   of the amount of knowledge they allow for extracting. Secondly,
                   our results suggest that similarity measures perform much worse
                   than classifier-based approaches. Thirdly, we show that, surprisingly,
                   static models perform almost as well as contextualized models
                   {--} in some cases even better.}
    }
    Giuseppe Abrami, Mevlüt Bagci, Leon Hammerla and Alexander Mehler. 2022. German Parliamentary Corpus (GerParCor). Proceedings of the Language Resources and Evaluation Conference, 1900–1906.
    BibTeX
    @inproceedings{Abrami:Bagci:Hammerla:Mehler:2022,
      author    = {Abrami, Giuseppe and Bagci, Mevlüt and Hammerla, Leon and Mehler, Alexander},
      editor    = {Calzolari, Nicoletta and B\'echet, Fr\'ed\'eric and Blache, Philippe
                   and Choukri, Khalid and Cieri, Christopher and Declerck, Thierry and Goggi, Sara
                   and Isahara, Hitoshi and Maegaard, Bente and Mariani, Joseph and Mazo, H\'el\`ene
                   and Odijk, Jan and Piperidis, Stelios},
      title     = {German Parliamentary Corpus (GerParCor)},
      booktitle = {Proceedings of the Language Resources and Evaluation Conference},
      year      = {2022},
      address   = {Marseille, France},
      publisher = {European Language Resources Association},
      pages     = {1900--1906},
      abstract  = {Parliamentary debates represent a large and partly unexploited
                   treasure trove of publicly accessible texts. In the German-speaking
                   area, there is a certain deficit of uniformly accessible and annotated
                   corpora covering all German-speaking parliaments at the national
                   and federal level. To address this gap, we introduce the German
                   Parliamentary Corpus (GerParCor). GerParCor is a genre-specific
                   corpus of (predominantly historical) German-language parliamentary
                   protocols from three centuries and four countries, including state
                   and federal level data. In addition, GerParCor contains conversions
                   of scanned protocols and, in particular, of protocols in Fraktur
                   converted via an OCR process based on Tesseract. All protocols
                   were preprocessed by means of the NLP pipeline of spaCy3 and automatically
                   annotated with metadata regarding their session date. GerParCor
                   is made available in the XMI format of the UIMA project. In this
                   way, GerParCor can be used as a large corpus of historical texts
                   in the field of political communication for various tasks in NLP.},
      url       = {https://aclanthology.org/2022.lrec-1.202},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2022/06/GerParCor_LREC_2022.pdf},
      keywords  = {gerparcor},
      pdf       = {http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.202.pdf}
    }
    Andy Lücking, Manuel Stoeckel, Giuseppe Abrami and Alexander Mehler. 2022. I still have Time(s): Extending HeidelTime for German Texts. Proceedings of the 13th Language Resources and Evaluation Conference.
    BibTeX
    @inproceedings{Luecking:Stoeckel:Abrami:Mehler:2022,
      author    = {L{\"u}cking, Andy and Stoeckel, Manuel and Abrami, Giuseppe and Mehler, Alexander},
      title     = {I still have Time(s): Extending {HeidelTime} for {German} Texts},
      booktitle = {Proceedings of the 13th Language Resources and Evaluation Conference},
      series    = {LREC 2022},
      location  = {Marseille, France},
      year      = {2022},
      url       = {https://aclanthology.org/2022.lrec-1.505},
      pdf       = {https://aclanthology.org/2022.lrec-1.505.pdf}
    }

    2021

    Maxim Konca, Alexander Mehler, Daniel Baumartz and Wahed Hemati. 2021. From distinguishability to informativity. A quantitative text model for detecting random texts. Language and Text: Data, models, information and applications, 356:145–162.
    BibTeX
    @article{Konca:et:al:2021,
      title     = {From distinguishability to informativity. A quantitative text
                   model for detecting random texts},
      author    = {Konca, Maxim and Mehler, Alexander and Baumartz, Daniel and Hemati, Wahed},
      journal   = {Language and Text: Data, models, information and applications},
      volume    = {356},
      pages     = {145--162},
      year      = {2021},
      editor    = {Adam Paw{\l}owski and Jan Ma{\v{c}}utek and Sheila Embleton and George Mikros},
      publisher = {John Benjamins Publishing Company},
      doi       = {10.1075/cilt.356.10kon}
    }
    Tatiana Lokot, Olga Abramov and Alexander Mehler. November, 2021. On the asymptotic behavior of the average geodesic distance L and the compactness CB of simple connected undirected graphs whose order approaches infinity. PLOS ONE, 16(11):1–13.
    BibTeX
    @article{Lokot:Abramov:Mehler:2021,
      doi       = {10.1371/journal.pone.0259776},
      author    = {Lokot, Tatiana and Abramov, Olga and Mehler, Alexander},
      journal   = {PLOS ONE},
      publisher = {Public Library of Science},
      title     = {On the asymptotic behavior of the average geodesic distance L
                   and the compactness CB of simple connected undirected graphs whose
                   order approaches infinity},
      year      = {2021},
      month     = {11},
      volume    = {16},
      url       = {https://doi.org/10.1371/journal.pone.0259776},
      pages     = {1-13},
      abstract  = {The average geodesic distance L Newman (2003) and the compactness
                   CB Botafogo (1992) are important graph indices in applications
                   of complex network theory to real-world problems. Here, for simple
                   connected undirected graphs G of order n, we study the behavior
                   of L(G) and CB(G), subject to the condition that their order |V(G)|
                   approaches infinity. We prove that the limit of L(G)/n and CB(G)
                   lies within the interval [0;1/3] and [2/3;1], respectively. Moreover,
                   for any not necessarily rational number β ∈ [0;1/3] (α ∈ [2/3;1])
                   we show how to construct the sequence of graphs {G}, |V(G)| =
                   n → ∞, for which the limit of L(G)/n (CB(G)) is exactly β (α)
                   (Theorems 1 and 2). Based on these results, our work points to
                   novel classification possibilities of graphs at the node level
                   as well as to the information-theoretic classification of the
                   structural complexity of graph indices.},
      number    = {11}
    }
    Alexander Mehler, Daniel Baumartz and Tolga Uslu. 2021. SemioGraphs: Visualizing Topic Networks as Multi-Codal Graphs. International Quantitative Linguistics Conference (QUALICO 2021).
    BibTeX
    @inproceedings{Mehler:Uslu:Baumartz:2021,
      author    = {Mehler, Alexander and Baumartz, Daniel and Uslu, Tolga},
  title     = {{SemioGraphs:} Visualizing Topic Networks as Multi-Codal Graphs},
      booktitle = {International Quantitative Linguistics Conference (QUALICO 2021)},
      series    = {QUALICO 2021},
      location  = {Tokyo, Japan},
      year      = {2021},
      poster    = {https://www.texttechnologylab.org/files/Qualico_2021_Semiograph_Poster.pdf}
    }
    Alexander Henlein, Giuseppe Abrami, Attila Kett, Christian Spiekermann and Alexander Mehler. 2021. Digital Learning, Teaching and Collaboration in an Era of ubiquitous Quarantine. Remote Learning in Times of Pandemic - Issues, Implications and Best Practice.
    BibTeX
    @incollection{Henlein:et:al:2021,
      author    = {Alexander Henlein and Giuseppe Abrami and Attila Kett and Christian Spiekermann
                   and Alexander Mehler},
      title     = {Digital Learning, Teaching and Collaboration in an Era of ubiquitous Quarantine},
  editor    = {Linda Daniela and Anna Visvizi},
      booktitle = {Remote Learning in Times of Pandemic - Issues, Implications and Best Practice},
      publisher = {Routledge},
      address   = {Thames, Oxfordshire, England, UK},
      year      = {2021},
      chapter   = {3}
    }
    Andy Lücking, Christine Driller, Manuel Stoeckel, Giuseppe Abrami, Adrian Pachzelt and Alexander Mehler. 2021. Multiple Annotation for Biodiversity: Developing an annotation framework among biology, linguistics and text technology. Language Resources and Evaluation.
    BibTeX
    @article{Luecking:et:al:2021,
  author    = {Andy L{\"u}cking and Christine Driller and Manuel Stoeckel and Giuseppe Abrami
               and Adrian Pachzelt and Alexander Mehler},
      year      = {2021},
      journal   = {Language Resources and Evaluation},
      title     = {Multiple Annotation for Biodiversity: Developing an annotation
                   framework among biology, linguistics and text technology},
      editor    = {Nancy Ide and Nicoletta Calzolari},
      doi       = {10.1007/s10579-021-09553-5},
      pdf       = {https://link.springer.com/content/pdf/10.1007/s10579-021-09553-5.pdf},
      keywords  = {biofid}
    }
    Pascal Fischer, Alen Smajic, Giuseppe Abrami and Alexander Mehler. 2021. Multi-Type-TD-TSR - Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations. Proceedings of the 44th German Conference on Artificial Intelligence.
    BibTeX
    @inproceedings{Fischer:et:al:2021,
      author    = {Fischer, Pascal and Smajic, Alen and Abrami, Giuseppe and Mehler, Alexander},
      title     = {Multi-Type-TD-TSR - Extracting Tables from Document Images using
                   a Multi-stage Pipeline for Table Detection and Table Structure
                   Recognition: from OCR to Structured Table Representations},
      booktitle = {Proceedings of the 44th German Conference on Artificial Intelligence},
      series    = {KI2021},
      location  = {Berlin, Germany},
      year      = {2021},
      url       = {https://www.springerprofessional.de/multi-type-td-tsr-extracting-tables-from-document-images-using-a/19711570},
      pdf       = {https://arxiv.org/pdf/2105.11021.pdf}
    }
    Mark Klement, Alexander Henlein and Alexander Mehler. June, 2021. VoxML Annotation Tool Review and Suggestions for Improvement. Proceedings of the Seventeenth Joint ACL - ISO Workshop on Interoperable Semantic Annotation (ISA-17, Note for special track on visual information annotation).
    BibTeX
    @inproceedings{Klement:et:al:2021,
      author    = {Klement, Mark and Henlein, Alexander and Mehler, Alexander},
      title     = {VoxML Annotation Tool Review and Suggestions for Improvement},
      booktitle = {Proceedings of the Seventeenth Joint ACL - ISO Workshop on Interoperable
                   Semantic Annotation (ISA-17, Note for special track on visual
                   information annotation)},
      series    = {ISA-17},
      location  = {Groningen, Netherlands},
      month     = {June},
      year      = {2021},
      pdf       = {https://sigsem.uvt.nl/isa17/32_Klement-Paper.pdf}
    }
    Giuseppe Abrami, Alexander Henlein, Andy Lücking, Attila Kett, Pascal Adeberg and Alexander Mehler. June, 2021. Unleashing annotations with TextAnnotator: Multimedia, multi-perspective document views for ubiquitous annotation. Proceedings of the 17th Joint ACL - ISO Workshop on Interoperable Semantic Annotation, 65–75.
    BibTeX
    @inproceedings{Abrami:et:al:2021,
  author    = {Abrami, Giuseppe and Henlein, Alexander and L{\"u}cking, Andy and Kett, Attila
               and Adeberg, Pascal and Mehler, Alexander},
      title     = {Unleashing annotations with {TextAnnotator}: Multimedia, multi-perspective
                   document views for ubiquitous annotation},
      booktitle = {Proceedings of the 17th Joint ACL - ISO Workshop on Interoperable
                   Semantic Annotation},
      series    = {ISA-17},
      publisher = {Association for Computational Linguistics},
      address   = {Groningen, The Netherlands (online)},
      month     = {June},
      editor    = {Bunt, Harry},
      year      = {2021},
      url       = {https://aclanthology.org/2021.isa-1.7},
      pages     = {65--75},
      keywords  = {textannotator},
      pdf       = {https://iwcs2021.github.io/proceedings/isa/pdf/2021.isa-1.7.pdf},
      abstract  = {We argue that mainly due to technical innovation in the landscape
                   of annotation tools, a conceptual change in annotation models
                   and processes is also on the horizon. It is diagnosed that these
                   changes are bound up with multi-media and multi-perspective facilities
                   of annotation tools, in particular when considering virtual reality
                   (VR) and augmented reality (AR) applications, their potential
                   ubiquitous use, and the exploitation of externally trained natural
                   language pre-processing methods. Such developments potentially
                   lead to a dynamic and exploratory heuristic construction of the
                   annotation process. With TextAnnotator an annotation suite is
                   introduced which focuses on multi-mediality and multi-perspectivity
                   with an interoperable set of task-specific annotation modules
                   (e.g., for word classification, rhetorical structures, dependency
                   trees, semantic roles, and more) and their linkage to VR and mobile
                   implementations. The basic architecture and usage of TextAnnotator
                   is described and related to the above mentioned shifts in the
                   field.}
    }
    Andy Lücking, Sebastian Brückner, Giuseppe Abrami, Tolga Uslu and Alexander Mehler. 2021. Computational linguistic assessment of textbooks and online texts by means of threshold concepts in economics. Frontiers in Education.
    BibTeX
    @article{Luecking:Brueckner:Abrami:Uslu:Mehler:2021,
      journal   = {Frontiers in Education},
      doi       = {10.3389/feduc.2020.578475},
      title     = {Computational linguistic assessment of textbooks and online texts
                   by means of threshold concepts in economics},
      author    = {L{\"u}cking, Andy and Br{\"u}ckner, Sebastian and Abrami, Giuseppe
                   and Uslu, Tolga and Mehler, Alexander},
      eid       = {578475},
      url       = {https://www.frontiersin.org/articles/10.3389/feduc.2020.578475/},
      year      = {2021}
    }

    2020

    Alexander Mehler, Wahed Hemati, Pascal Welke, Maxim Konca and Tolga Uslu. 2020. Multiple Texts as a Limiting Factor in Online Learning: Quantifying (Dis-)similarities of Knowledge Networks. Frontiers in Education, 5:206.
    BibTeX
    @article{Mehler:Hemati:Welke:Konca:Uslu:2020,
      abstract  = {We test the hypothesis that the extent to which one obtains information
                   on a given topic through Wikipedia depends on the language in
                   which it is consulted. Controlling the size factor, we investigate
                   this hypothesis for a number of 25 subject areas. Since Wikipedia
                   is a central part of the web-based information landscape, this
                   indicates a language-related, linguistic bias. The article therefore
                   deals with the question of whether Wikipedia exhibits this kind
                   of linguistic relativity or not. From the perspective of educational
                   science, the article develops a computational model of the information
                   landscape from which multiple texts are drawn as typical input
                   of web-based reading. For this purpose, it develops a hybrid model
                   of intra- and intertextual similarity of different parts of the
                   information landscape and tests this model on the example of 35
                   languages and corresponding Wikipedias. In the way it measures
                   the similarities of hypertexts, the article goes beyond existing
                   approaches by examining their structural and semantic aspects
                   intra- and intertextually. In this way it builds a bridge between
                   reading research, educational science, Wikipedia research and
                   computational linguistics.},
      author    = {Mehler, Alexander and Hemati, Wahed and Welke, Pascal and Konca, Maxim
                   and Uslu, Tolga},
      doi       = {10.3389/feduc.2020.562670},
      issn      = {2504-284X},
      journal   = {Frontiers in Education},
      pages     = {206},
      title     = {Multiple Texts as a Limiting Factor in Online Learning: Quantifying
                   (Dis-)similarities of Knowledge Networks},
      url       = {https://www.frontiersin.org/article/10.3389/feduc.2020.562670},
      pdf       = {https://www.frontiersin.org/articles/10.3389/feduc.2020.562670/pdf},
      volume    = {5},
      year      = {2020}
    }
    Andy Lücking, Sebastian Brückner, Giuseppe Abrami, Tolga Uslu and Alexander Mehler. 2020. Computational linguistic assessment of textbook and online learning media by means of threshold concepts in business education. CoRR, abs/2008.02096.
    BibTeX
    @article{Luecking:et:al:2020,
      author    = {Andy L{\"{u}}cking and Sebastian Br{\"{u}}ckner and Giuseppe Abrami
                   and Tolga Uslu and Alexander Mehler},
      title     = {Computational linguistic assessment of textbook and online learning
                   media by means of threshold concepts in business education},
      journal   = {CoRR},
      volume    = {abs/2008.02096},
      year      = {2020},
      url       = {https://arxiv.org/abs/2008.02096},
      archiveprefix = {arXiv},
      eprint    = {2008.02096},
    }
    Christine Driller, Markus Koch, Giuseppe Abrami, Wahed Hemati, Andy Lücking, Alexander Mehler, Adrian Pachzelt and Gerwin Kasperek. 2020. Fast and Easy Access to Central European Biodiversity Data with BIOfid. Biodiversity Information Science and Standards, 4:e59157.
    BibTeX
    @article{Driller:et:al:2020,
  author    = {Christine Driller and Markus Koch and Giuseppe Abrami and Wahed Hemati
               and Andy L{\"u}cking and Alexander Mehler and Adrian Pachzelt and Gerwin Kasperek},
      title     = {Fast and Easy Access to Central European Biodiversity Data with BIOfid},
      volume    = {4},
      year      = {2020},
      doi       = {10.3897/biss.4.59157},
      publisher = {Pensoft Publishers},
      abstract  = {The storage of data in public repositories such as the Global
                   Biodiversity Information Facility (GBIF) or the National Center
                   for Biotechnology Information (NCBI) is nowadays stipulated in
                   the policies of many publishers in order to facilitate data replication
                   or proliferation. Species occurrence records contained in legacy
                   printed literature are no exception to this. The extent of their
                   digital and machine-readable availability, however, is still far
                   from matching the existing data volume (Thessen and Parr 2014).
                   But precisely these data are becoming more and more relevant to
                   the investigation of ongoing loss of biodiversity. In order to
                   extract species occurrence records at a larger scale from available
                   publications, one has to apply specialised text mining tools.
                   However, such tools are in short supply especially for scientific
                   literature in the German language.The Specialised Information
                   Service Biodiversity Research*1 BIOfid (Koch et al. 2017) aims
                   at reducing this desideratum, inter alia, by preparing a searchable
                   text corpus semantically enriched by a new kind of multi-label
                   annotation. For this purpose, we feed manual annotations into
                   automatic, machine-learning annotators. This mixture of automatic
                   and manual methods is needed, because BIOfid approaches a new
                   application area with respect to language (mainly German of the
                   19th century), text type (biological reports), and linguistic
                   focus (technical and everyday language).We will present current
                   results of the performance of BIOfid’s semantic search engine
                   and the application of independent natural language processing
                   (NLP) tools. Most of these are freely available online, such as
                   TextImager (Hemati et al. 2016). We will show how TextImager is
                   tied into the BIOfid pipeline and how it is made scalable (e.g.
                   extendible by further modules) and usable on different systems
                   (docker containers).Further, we will provide a short introduction
                   to generating machine-learning training data using TextAnnotator
                   (Abrami et al. 2019) for multi-label annotation. Annotation reproducibility
                   can be assessed by the implementation of inter-annotator agreement
                   methods (Abrami et al. 2020). Beyond taxon recognition and entity
                   linking, we place particular emphasis on location and time information.
                   For this purpose, our annotation tag-set combines general categories
                   and biology-specific categories (including taxonomic names) with
                   location and time ontologies. The application of the annotation
                   categories is regimented by annotation guidelines (Lücking et
                   al. 2020). Within the next years, our work deliverable will be
                   a semantically accessible and data-extractable text corpus of
                   around two million pages. In this way, BIOfid is creating a new
                   valuable resource that expands our knowledge of biodiversity and
                   its determinants.},
      pages     = {e59157},
      url       = {https://doi.org/10.3897/biss.4.59157},
      eprint    = {https://doi.org/10.3897/biss.4.59157},
      journal   = {Biodiversity Information Science and Standards},
      keywords  = {biofid}
    }
    Giuseppe Abrami, Alexander Mehler and Manuel Stoeckel. 2020. TextAnnotator: A web-based annotation suite for texts. Proceedings of the Digital Humanities 2020.
    BibTeX
    @inproceedings{Abrami:Mehler:Stoeckel:2020,
      author    = {Abrami, Giuseppe and Mehler, Alexander and Stoeckel, Manuel},
      title     = {{TextAnnotator}: A web-based annotation suite for texts},
      booktitle = {Proceedings of the Digital Humanities 2020},
      series    = {DH 2020},
      location  = {Ottawa, Canada},
      year      = {2020},
      url       = {https://dh2020.adho.org/wp-content/uploads/2020/07/547_TextAnnotatorAwebbasedannotationsuitefortexts.html},
  doi       = {10.17613/tenm-4907},
      abstract  = {The TextAnnotator is a tool for simultaneous and collaborative
                   annotation of texts with visual annotation support, integration
                   of knowledge bases and, by pipelining the TextImager, a rich variety
                   of pre-processing and automatic annotation tools. It includes
                   a variety of modules for the annotation of texts, which contains
                   the annotation of argumentative, rhetorical, propositional and
                   temporal structures as well as a module for named entity linking
                   and rapid annotation of named entities. Especially the modules
                   for annotation of temporal, argumentative and propositional structures
                   are currently unique in web-based annotation tools. The TextAnnotator,
                   which allows the annotation of texts as a platform, is divided
                   into a front- and a backend component. The backend is a web service
                   based on WebSockets, which integrates the UIMA Database Interface
                   to manage and use texts. Texts are made accessible by using the
                   ResourceManager and the AuthorityManager, based on user and group
                   access permissions. Different views of a document can be created
                   and used depending on the scenario. Once a document has been opened,
                   access is gained to the annotations stored within annotation views
                   in which these are organized. Any annotation view can be assigned
                   with access permissions and by default, each user obtains his
                   or her own user view for every annotated document. In addition,
                   with sufficient access permissions, all annotation views can also
                   be used and curated. This allows the possibility to calculate
                   an Inter-Annotator-Agreement for a document, which shows an agreement
                   between the annotators. Annotators without sufficient rights cannot
                   display this value so that the annotators do not influence each
                   other. This contribution is intended to reflect the current state
                   of development of TextAnnotator, demonstrate the possibilities
                   of an instantaneous Inter-Annotator-Agreement and trigger a discussion
                   about further functions for the community.},
      keywords  = {textannotator},
      poster    = {https://hcommons.org/deposits/download/hc:31816/CONTENT/dh2020_textannotator_poster.pdf}
    }
    Giuseppe Abrami, Manuel Stoeckel and Alexander Mehler. 2020. TextAnnotator: A UIMA Based Tool for the Simultaneous and Collaborative Annotation of Texts. Proceedings of The 12th Language Resources and Evaluation Conference, 891–900.
    BibTeX
    @inproceedings{Abrami:Stoeckel:Mehler:2020,
      author    = {Abrami, Giuseppe and Stoeckel, Manuel and Mehler, Alexander},
      title     = {TextAnnotator: A UIMA Based Tool for the Simultaneous and Collaborative
                   Annotation of Texts},
      booktitle = {Proceedings of The 12th Language Resources and Evaluation Conference},
      year      = {2020},
      address   = {Marseille, France},
      publisher = {European Language Resources Association},
      pages     = {891--900},
      isbn      = {979-10-95546-34-4},
      abstract  = {The annotation of texts and other material in the field of digital
                   humanities and Natural Language Processing (NLP) is a common task
                   of research projects. At the same time, the annotation of corpora
                   is certainly the most time- and cost-intensive component in research
                   projects and often requires a high level of expertise according
                   to the research interest. However, for the annotation of texts,
                   a wide range of tools is available, both for automatic and manual
                   annotation. Since the automatic pre-processing methods are not
                   error-free and there is an increasing demand for the generation
                   of training data, also with regard to machine learning, suitable
                   annotation tools are required. This paper defines criteria of
                   flexibility and efficiency of complex annotations for the assessment
                   of existing annotation tools. To extend this list of tools, the
                   paper describes TextAnnotator, a browser-based, multi-annotation
                   system, which has been developed to perform platform-independent
                   multimodal annotations and annotate complex textual structures.
                   The paper illustrates the current state of development of TextAnnotator
                   and demonstrates its ability to evaluate annotation quality (inter-annotator
                   agreement) at runtime. In addition, it will be shown how annotations
                   of different users can be performed simultaneously and collaboratively
                   on the same document from different platforms using UIMA as the
                   basis for annotation.},
      url       = {https://www.aclweb.org/anthology/2020.lrec-1.112},
      keywords  = {textannotator},
      pdf       = {http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.112.pdf}
    }
    Giuseppe Abrami, Alexander Henlein, Attila Kett and Alexander Mehler. 2020. Text2SceneVR: Generating Hypertexts with VAnnotatoR as a Pre-processing Step for Text2Scene Systems. Proceedings of the 31st ACM Conference on Hypertext and Social Media, 177–186.
    BibTeX
    @inproceedings{Abrami:Henlein:Kett:Mehler:2020,
      author    = {Abrami, Giuseppe and Henlein, Alexander and Kett, Attila and Mehler, Alexander},
      title     = {{Text2SceneVR}: Generating Hypertexts with VAnnotatoR as a Pre-processing
                   Step for Text2Scene Systems},
      booktitle = {Proceedings of the 31st ACM Conference on Hypertext and Social Media},
      series    = {HT ’20},
      year      = {2020},
      location  = {Virtual Event, USA},
      isbn      = {9781450370981},
      publisher = {Association for Computing Machinery},
      address   = {New York, NY, USA},
      url       = {https://doi.org/10.1145/3372923.3404791},
      doi       = {10.1145/3372923.3404791},
  pages     = {177--186},
      numpages  = {10},
      pdf       = {https://dl.acm.org/doi/pdf/10.1145/3372923.3404791}
    }
    Manuel Stoeckel, Alexander Henlein, Wahed Hemati and Alexander Mehler. May, 2020. Voting for POS tagging of Latin texts: Using the flair of FLAIR to better Ensemble Classifiers by Example of Latin. Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages, 130–135.
    BibTeX
    @inproceedings{Stoeckel:et:al:2020,
      author    = {Stoeckel, Manuel and Henlein, Alexander and Hemati, Wahed and Mehler, Alexander},
      title     = {{Voting for POS tagging of Latin texts: Using the flair of FLAIR
                   to better Ensemble Classifiers by Example of Latin}},
      booktitle = {Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies
                   for Historical and Ancient Languages},
      month     = {May},
      year      = {2020},
      address   = {Marseille, France},
      publisher = {European Language Resources Association (ELRA)},
      pages     = {130--135},
      abstract  = {Despite the great importance of the Latin language in the past,
                   there are relatively few resources available today to develop
                   modern NLP tools for this language. Therefore, the EvaLatin Shared
                   Task for Lemmatization and Part-of-Speech (POS) tagging was published
                   in the LT4HALA workshop. In our work, we dealt with the second
                   EvaLatin task, that is, POS tagging. Since most of the available
                   Latin word embeddings were trained on either few or inaccurate
                   data, we trained several embeddings on better data in the first
                   step. Based on these embeddings, we trained several state-of-the-art
                   taggers and used them as input for an ensemble classifier called
                   LSTMVoter. We were able to achieve the best results for both the
                   cross-genre and the cross-time task (90.64\% and 87.00\%) without
                   using additional annotated data (closed modality). In the meantime,
                   we further improved the system and achieved even better results
                   (96.91\% on classical, 90.87\% on cross-genre and 87.35\% on cross-time).},
      url       = {https://www.aclweb.org/anthology/2020.lt4hala-1.21},
      pdf       = {http://www.lrec-conf.org/proceedings/lrec2020/workshops/LT4HALA/pdf/2020.lt4hala-1.21.pdf}
    }
    Alexander Mehler, Bernhard Jussen, Tim Geelhaar, Alexander Henlein, Giuseppe Abrami, Daniel Baumartz, Tolga Uslu and Wahed Hemati. 2020. The Frankfurt Latin Lexicon. From Morphological Expansion and Word Embeddings to SemioGraphs. Studi e Saggi Linguistici, 58(1):121–155.
    BibTeX
    @article{Mehler:et:al:2020b,
      author    = {Mehler, Alexander and Jussen, Bernhard and Geelhaar, Tim and Henlein, Alexander
                   and Abrami, Giuseppe and Baumartz, Daniel and Uslu, Tolga and Hemati, Wahed},
      title     = {{The Frankfurt Latin Lexicon. From Morphological Expansion and
                   Word Embeddings to SemioGraphs}},
      journal   = {Studi e Saggi Linguistici},
      doi       = {10.4454/ssl.v58i1.276},
      year      = {2020},
      volume    = {58},
      number    = {1},
      pages     = {121--155},
      abstract  = {In this article we present the Frankfurt Latin Lexicon (FLL),
                   a lexical resource for Medieval Latin that is used both for the
                   lemmatization of Latin texts and for the post-editing of lemmatizations.
                   We describe recent advances in the development of lemmatizers
                   and test them against the Capitularies corpus (comprising Frankish
                   royal edicts, mid-6th to mid-9th century), a corpus created as
                   a reference for processing Medieval Latin. We also consider the
                   post-correction of lemmatizations using a limited crowdsourcing
                   process aimed at continuous review and updating of the FLL. Starting
                   from the texts resulting from this lemmatization process, we describe
                   the extension of the FLL by means of word embeddings, whose interactive
                   traversing by means of SemioGraphs completes the digital enhanced
                   hermeneutic circle. In this way, the article argues for a more
                   comprehensive understanding of lemmatization, encompassing classical
                   machine learning as well as intellectual post-corrections and,
                   in particular, human computation in the form of interpretation
                   processes based on graph representations of the underlying lexical
                   resources.},
      url       = {https://www.studiesaggilinguistici.it/index.php/ssl/article/view/276},
      pdf       = {https://www.studiesaggilinguistici.it/index.php/ssl/article/download/276/219}
    }
    Alexander Henlein, Giuseppe Abrami, Attila Kett and Alexander Mehler. May, 2020. Transfer of ISOSpace into a 3D Environment for Annotations and Applications. Proceedings of the 16th Joint ACL - ISO Workshop on Interoperable Semantic Annotation, 32–35.
    BibTeX
    @inproceedings{Henlein:et:al:2020,
      author    = {Henlein, Alexander and Abrami, Giuseppe and Kett, Attila and Mehler, Alexander},
      title     = {Transfer of ISOSpace into a 3D Environment for Annotations and Applications},
      booktitle = {Proceedings of the 16th Joint ACL - ISO Workshop on Interoperable
                   Semantic Annotation},
      month     = {May},
      year      = {2020},
      address   = {Marseille},
      publisher = {European Language Resources Association},
      pages     = {32--35},
      abstract  = {People's visual perception is very pronounced and therefore it
                   is usually no problem for them to describe the space around them
                   in words. Conversely, people also have no problems imagining a
                   concept of a described space. In recent years many efforts have
                   been made to develop a linguistic concept for spatial and spatial-temporal
                   relations. However, the systems have not really caught on so far,
                   which in our opinion is due to the complex models on which they
                   are based and the lack of available training data and automated
                   taggers. In this paper we describe a project to support spatial
                   annotation, which could facilitate annotation by its many functions,
                   but also enrich it with many more information. This is to be achieved
                   by an extension by means of a VR environment, with which spatial
                   relations can be better visualized and connected with real objects.
                   And we want to use the available data to develop a new state-of-the-art
                   tagger and thus lay the foundation for future systems such as
                   improved text understanding for Text2Scene.},
      url       = {https://www.aclweb.org/anthology/2020.isa-1.4},
      pdf       = {http://www.lrec-conf.org/proceedings/lrec2020/workshops/ISA16/pdf/2020.isa-1.4.pdf}
    }
    Jonathan Hildebrand, Wahed Hemati and Alexander Mehler. May, 2020. Recognizing Sentence-level Logical Document Structures with the Help of Context-free Grammars. Proceedings of The 12th Language Resources and Evaluation Conference, 5282–5290.
    BibTeX
    @inproceedings{Hildebrand:Hemati:Mehler:2020,
      author    = {Hildebrand, Jonathan and Hemati, Wahed and Mehler, Alexander},
      title     = {Recognizing Sentence-level Logical Document Structures with the
                   Help of Context-free Grammars},
      booktitle = {Proceedings of The 12th Language Resources and Evaluation Conference},
      month     = {May},
      year      = {2020},
      address   = {Marseille, France},
      publisher = {European Language Resources Association},
      pages     = {5282--5290},
      abstract  = {Current sentence boundary detectors split documents into sequentially
                   ordered sentences by detecting their beginnings and ends. Sentences,
                   however, are more deeply structured even on this side of constituent
                   and dependency structure: they can consist of a main sentence
                   and several subordinate clauses as well as further segments (e.g.
                   inserts in parentheses); they can even recursively embed whole
                   sentences and then contain multiple sentence beginnings and ends.
                   In this paper, we introduce a tool that segments sentences into
                   tree structures to detect this type of recursive structure. To
                   this end, we retrain different constituency parsers with the help
                   of modified training data to transform them into sentence segmenters.
                   With these segmenters, documents are mapped to sequences of sentence-related
                   “logical document structures”. The resulting segmenters aim to
                   improve downstream tasks by providing additional structural information.
                   In this context, we experiment with German dependency parsing.
                   We show that for certain sentence categories, which can be determined
                   automatically, improvements in German dependency parsing can be
                   achieved using our segmenter for preprocessing. This suggests
                   that improvements can also be achieved in other languages and
                   tasks.},
      url       = {https://www.aclweb.org/anthology/2020.lrec-1.650},
      pdf       = {http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.650.pdf}
    }
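    The segmentation described in this abstract maps sentences to tree structures in which parenthesized inserts (and, in the paper, further clause types) become embedded segments. As a purely illustrative stand-in for the retrained constituency parsers, the following sketch (all names hypothetical, assuming balanced parentheses) recursively extracts parenthesized inserts into a nested structure:

    ```python
    # Toy sketch of recursive sentence segmentation: each parenthesized insert
    # becomes a child node of the surrounding segment. This is NOT the paper's
    # parser-based method, only an illustration of the target tree structure.
    def segment(text):
        """Return (outer_text, children), where children are recursively
        segmented parenthesized inserts removed from the outer text."""
        outer, children = [], []
        depth, start = 0, 0
        for i, ch in enumerate(text):
            if ch == "(":
                if depth == 0:
                    start = i + 1  # insert begins after the opening paren
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    children.append(segment(text[start:i]))
            elif depth == 0:
                outer.append(ch)
        # normalize whitespace left behind by removed inserts
        return (" ".join("".join(outer).split()), children)
    ```

    For example, `segment("The tool (which we retrain (see above)) segments sentences.")` yields an outer segment with one child that itself embeds a further insert.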
    Alexander Henlein and Alexander Mehler. May, 2020. On the Influence of Coreference Resolution on Word Embeddings in Lexical-semantic Evaluation Tasks. Proceedings of The 12th Language Resources and Evaluation Conference, 27–33.
    BibTeX
    @inproceedings{Henlein:Mehler:2020,
      author    = {Henlein, Alexander and Mehler, Alexander},
      title     = {{On the Influence of Coreference Resolution on Word Embeddings
                   in Lexical-semantic Evaluation Tasks}},
      booktitle = {Proceedings of The 12th Language Resources and Evaluation Conference},
      month     = {May},
      year      = {2020},
      address   = {Marseille, France},
      publisher = {European Language Resources Association},
      pages     = {27--33},
      abstract  = {Coreference resolution (CR) aims to find all spans of a text that
                   refer to the same entity. The F1-scores on this task have been
                   greatly improved by newly developed end-to-end approaches and
                   transformer networks. The inclusion of CR as a pre-processing step is expected
                   to lead to improvements in downstream tasks. The paper examines
                   this effect with respect to word embeddings. That is, we analyze
                   the effects of CR on six different embedding methods and evaluate
                   them in the context of seven lexical-semantic evaluation tasks
                   and instantiation/hypernymy detection. Especially for the latter
                   tasks we hoped for a significant increase in performance. We show
                   that none of the word embedding approaches benefits significantly
                   from pronoun substitution. The measurable improvements are only
                   marginal (around 0.5\% in most test cases). We explain this result
                   with the loss of contextual information, reduction of the relative
                   occurrence of rare words and the lack of pronouns to be replaced.},
      url       = {https://www.aclweb.org/anthology/2020.lrec-1.4},
      pdf       = {http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.4.pdf}
    }
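    The preprocessing studied in this paper substitutes pronouns with their coreferent mentions before embeddings are trained. A minimal sketch of that substitution step, assuming coreference clusters have already been computed by some CR system and are given as lists of token spans (a toy format; all names hypothetical):

    ```python
    # Sketch: replace pronominal mentions with each cluster's first
    # (representative) mention before embedding training. Clusters are lists
    # of (start, end) token spans with end exclusive; the first span is
    # treated as the representative mention.
    PRONOUNS = {"he", "she", "it", "they", "him", "her", "them",
                "his", "its", "their"}

    def substitute_pronouns(tokens, clusters):
        replacements = {}
        for cluster in clusters:
            rep_start, rep_end = cluster[0]
            rep = tokens[rep_start:rep_end]
            for start, end in cluster[1:]:
                span = tokens[start:end]
                # only substitute single-token pronominal mentions
                if len(span) == 1 and span[0].lower() in PRONOUNS:
                    replacements[start] = (end, rep)
        out, i = [], 0
        while i < len(tokens):
            if i in replacements:
                end, rep = replacements[i]
                out.extend(rep)  # splice in the representative mention
                i = end
            else:
                out.append(tokens[i])
                i += 1
        return out
    ```

    On `"Anna wrote a paper and she submitted it"`, with clusters linking "she" to "Anna" and "it" to "a paper", this yields `"Anna wrote a paper and Anna submitted a paper"`, the kind of substituted text the paper feeds to the embedding methods.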
    Alexander Mehler, Rüdiger Gleim, Regina Gaitsch, Tolga Uslu and Wahed Hemati. 2020. From Topic Networks to Distributed Cognitive Maps: Zipfian Topic Universes in the Area of Volunteered Geographic Information. Complexity, 4:1–47.
    BibTeX
    @article{Mehler:Gleim:Gaitsch:Uslu:Hemati:2020,
      author    = {Alexander Mehler and R{\"{u}}diger Gleim and Regina Gaitsch and Tolga Uslu
                   and Wahed Hemati},
      title     = {From Topic Networks to Distributed Cognitive Maps: {Zipfian} Topic
                   Universes in the Area of Volunteered Geographic Information},
      journal   = {Complexity},
      volume    = {4},
      doi       = {10.1155/2020/4607025},
      pages     = {1--47},
      issuetitle = {Cognitive Network Science: A New Frontier},
      year      = {2020}
    }
    Vincent Kühn, Giuseppe Abrami and Alexander Mehler. 2020. WikNectVR: A Gesture-Based Approach for Interacting in Virtual Reality Based on WikNect and Gestural Writing. Virtual, Augmented and Mixed Reality. Design and Interaction - 12th International Conference, VAMR 2020, Held as Part of the 22nd HCI International Conference, HCII 2020, Copenhagen, Denmark, July 19-24, 2020, Proceedings, Part I, 12190:299–312.
    BibTeX
    @inproceedings{Kuehn:Abrami:Mehler:2020,
      author    = {Vincent K{\"{u}}hn and Giuseppe Abrami and Alexander Mehler},
      editor    = {Jessie Y. C. Chen and Gino Fragomeni},
      title     = {WikNectVR: {A} Gesture-Based Approach for Interacting in Virtual
                   Reality Based on WikNect and Gestural Writing},
      booktitle = {Virtual, Augmented and Mixed Reality. Design and Interaction -
                   12th International Conference, {VAMR} 2020, Held as Part of the
                   22nd {HCI} International Conference, {HCII} 2020, Copenhagen,
                   Denmark, July 19-24, 2020, Proceedings, Part {I}},
      series    = {Lecture Notes in Computer Science},
      volume    = {12190},
      pages     = {299--312},
      publisher = {Springer},
      year      = {2020},
      url       = {https://doi.org/10.1007/978-3-030-49695-1_20},
      doi       = {10.1007/978-3-030-49695-1_20},
      timestamp = {Tue, 14 Jul 2020 10:55:57 +0200},
      biburl    = {https://dblp.org/rec/conf/hci/KuhnAM20.bib},
      bibsource = {dblp computer science bibliography, https://dblp.org}
    }
    Giuseppe Abrami, Alexander Mehler, Christian Spiekermann, Attila Kett, Simon Lööck and Lukas Schwarz. 2020. Educational Technologies in the area of ubiquitous historical computing in virtual reality. In: New Perspectives on Virtual and Augmented Reality: Finding New Ways to Teach in a Transformed Learning Environment. Ed. by Linda Daniela. Taylor & Francis.
    BibTeX
    @inbook{Abrami:et:al:2020,
      author    = {Abrami, Giuseppe and Mehler, Alexander and Spiekermann, Christian
                   and Kett, Attila and L{\"o}{\"o}ck, Simon and Schwarz, Lukas},
      editor    = {Daniela, Linda},
      title     = {Educational Technologies in the area of ubiquitous historical
                   computing in virtual reality},
      booktitle = {New Perspectives on Virtual and Augmented Reality: Finding New
                   Ways to Teach in a Transformed Learning Environment},
      year      = {2020},
      publisher = {Taylor \& Francis},
      abstract  = {At ever shorter intervals, new technologies are being developed
                   that are opening up more and more areas of application. This regards,
                   for example, Virtual Reality (VR) and Augmented Reality (AR) devices.
                   In addition to the private sector, the public and education sectors,
                   which already make intensive use of these devices, benefit from
                   these technologies. However, especially in the field of historical
                   education, there are not many frameworks for generating immersive
                   virtual environments that can be used flexibly enough. This chapter
                   addresses this gap by means of VAnnotatoR. VAnnotatoR is a versatile
                   framework for the creation and use of virtual environments that
                   serve to model historical processes in historical education. The
                   paper describes the building blocks of VAnnotatoR and describes
                   applications in historical education.},
      isbn      = {978-0-367-43211-9},
      url       = {https://www.routledge.com/New-Perspectives-on-Virtual-and-Augmented-Reality-Finding-New-Ways-to-Teach/Daniela/p/book/9780367432119}
    }
    Christian Stegbauer and Alexander Mehler. 2020. Ursachen der Entstehung von ubiquitären Zentrum-Peripheriestrukturen und ihre Folgen. Soziale Welt – Zeitschrift für sozialwissenschaftliche Forschung und Praxis (SozW), Sonderband 23:265–284.
    BibTeX
    @article{Stegbauer:Mehler:2020,
      author    = {Christian Stegbauer and Alexander Mehler},
      title     = {Ursachen der Entstehung von ubiquit{\"{a}}ren Zentrum-Peripheriestrukturen
                   und ihre Folgen},
      journal   = {Soziale Welt -- Zeitschrift f\"{u}r sozialwissenschaftliche Forschung und Praxis (SozW)},
      volume    = {Sonderband 23},
      year      = {2020},
      pages     = {265--284}
    }

    2019

    Olga Zlatkin-Troitschanskaia, Walter Bisang, Alexander Mehler, Mita Banerjee and Jochen Roeper. 2019. Positive Learning in the Internet Age: Developments and Perspectives in the PLATO Program. In: Frontiers and Advances in Positive Learning in the Age of InformaTiOn (PLATO), 1–5. Ed. by Olga Zlatkin-Troitschanskaia. Springer International Publishing.
    BibTeX
    @inbook{Zlatkin-Troitschanskaia:et:al:2019,
      author    = {Zlatkin-Troitschanskaia, Olga and Bisang, Walter and Mehler, Alexander
                   and Banerjee, Mita and Roeper, Jochen},
      editor    = {Zlatkin-Troitschanskaia, Olga},
      title     = {Positive Learning in the Internet Age: Developments and Perspectives
                   in the PLATO Program},
      booktitle = {Frontiers and Advances in Positive Learning in the Age of InformaTiOn (PLATO)},
      year      = {2019},
      publisher = {Springer International Publishing},
      address   = {Cham},
      pages     = {1--5},
      abstract  = {The Internet has become the main informational entity, i.e., a
                   public source of information. The Internet offers many new benefits
                   and opportunities for human learning, teaching, and research.
                   However, by providing a vast amount of information from innumerable
                   sources, it also enables the manipulation of information; there
                   are countless examples of disseminated misinformation and false
                   data in mass and social media. Much of the information presented
                   online is conflicting, preselected, or algorithmically obscure,
                   often colliding with fundamental humanistic values and posing
                   moral or ethical problems.},
      isbn      = {978-3-030-26578-6},
      doi       = {10.1007/978-3-030-26578-6_1},
      url       = {https://doi.org/10.1007/978-3-030-26578-6_1}
    }
    Alexander Mehler and Visvanathan Ramesh. 2019. TextInContext: On the Way to a Framework for Measuring the Context-Sensitive Complexity of Educationally Relevant Texts—A Combined Cognitive and Computational Linguistic Approach. In: Frontiers and Advances in Positive Learning in the Age of InformaTiOn (PLATO), 167–195. Ed. by Olga Zlatkin-Troitschanskaia. Springer International Publishing.
    BibTeX
    @inbook{Mehler:Ramesh:2019,
      author    = {Mehler, Alexander and Ramesh, Visvanathan},
      editor    = {Zlatkin-Troitschanskaia, Olga},
      title     = {{TextInContext}: On the Way to a Framework for Measuring the Context-Sensitive
                   Complexity of Educationally Relevant Texts---A Combined Cognitive
                   and Computational Linguistic Approach},
      booktitle = {Frontiers and Advances in Positive Learning in the Age of InformaTiOn (PLATO)},
      year      = {2019},
      publisher = {Springer International Publishing},
      address   = {Cham},
      pages     = {167--195},
      abstract  = {We develop a framework for modeling the context sensitivity of
                   text interpretation. As a point of reference, we focus on the
                   complexity of educational texts. To open up a broader basis for
                   representing phenomena of context sensitivity, we integrate a
                   learning theory (i.e., the Cognitive Load Theory) with a theory
                   of discourse comprehension (i.e., the Construction Integration
                   Model) and a theory of cognitive semantics (i.e., the theory of
                   Conceptual Spaces). The aim is to construct measures that view
                   text complexity as a relational attribute by analogy to the relational
                   concept of meaning in situation semantics. To this end, we reconstruct
                   the situation semantic notion of relational meaning from the perspective
                   of a computationally informed cognitive semantics. The aim is
                   to prepare the development of measurements for predicting learning
                   outcomes in the form of positive or negative learning. This prediction
                   ideally depends on the underlying learning material, the learner's
                   situational context, and knowledge retrieved from his or her long-term
                   memory, which he or she uses to arrive at coherent mental representations
                   of the underlying texts. Finally, our model refers to machine
                   learning as a tool for modeling such memory content. In this way,
                   the chapter integrates approaches from different disciplines (linguistic
                   semantics, computational linguistics, cognitive science, and data
                   science).},
      isbn      = {978-3-030-26578-6},
      doi       = {10.1007/978-3-030-26578-6_14},
      url       = {https://doi.org/10.1007/978-3-030-26578-6_14}
    }
    Manuel Stoeckel, Wahed Hemati and Alexander Mehler. November, 2019. When Specialization Helps: Using Pooled Contextualized Embeddings to Detect Chemical and Biomedical Entities in Spanish. Proceedings of The 5th Workshop on BioNLP Open Shared Tasks, 11–15.
    BibTeX
    @inproceedings{Stoeckel:Hemati:Mehler:2019,
      title     = {When Specialization Helps: Using Pooled Contextualized Embeddings
                   to Detect Chemical and Biomedical Entities in {S}panish},
      author    = {Stoeckel, Manuel and Hemati, Wahed and Mehler, Alexander},
      booktitle = {Proceedings of The 5th Workshop on BioNLP Open Shared Tasks},
      month     = {nov},
      year      = {2019},
      address   = {Hong Kong, China},
      publisher = {Association for Computational Linguistics},
      url       = {https://www.aclweb.org/anthology/D19-5702},
      doi       = {10.18653/v1/D19-5702},
      pages     = {11--15},
      abstract  = {The recognition of pharmacological substances, compounds and proteins
                   is an essential preliminary work for the recognition of relations
                   between chemicals and other biomedically relevant units. In this
                   paper, we describe an approach to Task 1 of the PharmaCoNER Challenge,
                   which involves the recognition of mentions of chemicals and drugs
                   in Spanish medical texts. We train a state-of-the-art BiLSTM-CRF
                   sequence tagger with stacked Pooled Contextualized Embeddings,
                   word and sub-word embeddings using the open-source framework FLAIR.
                   We present a new corpus composed of articles and papers from Spanish
                   health science journals, termed the Spanish Health Corpus, and
                   use it to train domain-specific embeddings which we incorporate
                   in our model training. We achieve a result of 89.76{\%} F1-score
                   using pre-trained embeddings and are able to improve these results
                   to 90.52{\%} F1-score using specialized embeddings.}
    }
    Sajawel Ahmed, Manuel Stoeckel, Christine Driller, Adrian Pachzelt and Alexander Mehler. 2019. BIOfid Dataset: Publishing a German Gold Standard for Named Entity Recognition in Historical Biodiversity Literature. Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL), 871–880.
    BibTeX
    @inproceedings{Ahmed:Stoeckel:Driller:Pachzelt:Mehler:2019,
      author    = {Sajawel Ahmed and Manuel Stoeckel and Christine Driller and Adrian Pachzelt
                   and Alexander Mehler},
      title     = {{BIOfid Dataset: Publishing a German Gold Standard for Named Entity
                   Recognition in Historical Biodiversity Literature}},
      publisher = {Association for Computational Linguistics},
      year      = {2019},
      booktitle = {Proceedings of the 23rd Conference on Computational Natural Language
                   Learning (CoNLL)},
      address   = {Hong Kong, China},
      url       = {https://www.aclweb.org/anthology/K19-1081},
      doi       = {10.18653/v1/K19-1081},
      pages     = {871--880},
      abstract  = {The Specialized Information Service Biodiversity Research (BIOfid)
                   has been launched to mobilize valuable biological data from printed
                   literature hidden in German libraries over the past 250 years.
                   In this project, we annotate German texts converted by OCR from
                   historical scientific literature on the biodiversity of plants,
                   birds, moths and butterflies. Our work enables the automatic extraction
                   of biological information previously buried in the mass of papers
                   and volumes. For this purpose, we generated training data for
                   the tasks of Named Entity Recognition (NER) and Taxa Recognition
                   (TR) in biological documents. We use this data to train a number
                   of leading machine learning tools and create a gold standard for
                   TR in biodiversity literature. More specifically, we perform a
                   practical analysis of our newly generated BIOfid dataset through
                   various downstream-task evaluations and establish a new state
                   of the art for TR with 80.23{\%} F-score. In this sense, our paper
                   lays the foundations for future work in the field of information
                   extraction in biology texts.},
      keywords  = {biofid}
    }
    Alexander Mehler and Giuseppe Abrami. October 10–11, 2019. VAnnotatoR: A framework for the multimodal reconstruction of historical situations and spaces. Proceedings of the Time Machine Conference.
    BibTeX
    @inproceedings{Mehler:Abrami:2019,
      author    = {Mehler, Alexander and Abrami, Giuseppe},
      title     = {{VAnnotatoR}: A framework for the multimodal reconstruction of
                   historical situations and spaces},
      booktitle = {Proceedings of the Time Machine Conference},
      year      = {2019},
      date      = {October 10-11},
      address   = {Dresden, Germany},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2019/09/TimeMachineConference.pdf}
    }
    Alex Hunziker, Hasanagha Mammadov, Wahed Hemati and Alexander Mehler. 2019. Corpus2Wiki: A MediaWiki-based Tool for Automatically Generating Wikiditions in Digital Humanities. INF-DH-2019.
    BibTeX
    @inproceedings{Hunziker:et:al:2019,
      author    = {Hunziker, Alex and Mammadov, Hasanagha and Hemati, Wahed and Mehler, Alexander},
      title     = {{Corpus2Wiki}: A MediaWiki-based Tool for Automatically Generating
                   Wikiditions in Digital Humanities},
      booktitle = {INF-DH-2019},
      year      = {2019},
      editor    = {Burghardt, Manuel and Müller-Birn, Claudia},
      publisher = {Gesellschaft für Informatik e.V.},
      address   = {Bonn}
    }
    Wahed Hemati and Alexander Mehler. March, 2019. CRFVoter: gene and protein related object recognition using a conglomerate of CRF-based tools. Journal of Cheminformatics, 11(1):11.
    BibTeX
    @article{Hemati:Mehler:2019b,
      author    = {Hemati, Wahed and Mehler, Alexander},
      title     = {{{CRFVoter}: gene and protein related object recognition using
                   a conglomerate of CRF-based tools}},
      journal   = {Journal of Cheminformatics},
      year      = {2019},
      month     = {Mar},
      day       = {14},
      volume    = {11},
      number    = {1},
      pages     = {11},
      abstract  = {Gene and protein related objects are an important class of entities
                   in biomedical research, whose identification and extraction from
                   scientific articles is attracting increasing interest. In this
                   work, we describe an approach to the BioCreative V.5 challenge
                   regarding the recognition and classification of gene and protein
                   related objects. For this purpose, we transform the task as posed
                   by BioCreative V.5 into a sequence labeling problem. We present
                   a series of sequence labeling systems that we used and adapted
                   in our experiments for solving this task. Our experiments show
                   how to optimize the hyperparameters of the classifiers involved.
                   To this end, we utilize various algorithms for hyperparameter
                   optimization. Finally, we present CRFVoter, a two-stage application
                   of Conditional Random Field (CRF) that integrates the optimized
                   sequence labelers from our study into one ensemble classifier.},
      issn      = {1758-2946},
      doi       = {10.1186/s13321-019-0343-x},
      url       = {https://doi.org/10.1186/s13321-019-0343-x}
    }
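    CRFVoter integrates the optimized base sequence labelers through a second-stage Conditional Random Field trained on their outputs. As a simplified, pure-Python stand-in for that second stage (names hypothetical; a plain per-token majority vote rather than a learned CRF), the combination step can be sketched as:

    ```python
    # Simplified ensemble step: per-token majority vote over the label
    # sequences produced by several base taggers. CRFVoter itself learns a
    # second-stage CRF over these outputs; this sketch only illustrates the
    # combination of aligned label sequences.
    from collections import Counter

    def vote(label_sequences):
        """label_sequences: one label sequence per base tagger, all of the
        same length. Returns the per-token majority label, with ties broken
        in favor of the first tagger."""
        merged = []
        for labels in zip(*label_sequences):
            counts = Counter(labels)
            top = counts.most_common(1)[0][1]
            # among labels with the top count, prefer the first tagger's
            merged.append(next(l for l in labels if counts[l] == top))
        return merged
    ```

    A learned second stage can additionally exploit per-tagger reliability and label context, which is where the two-stage CRF design goes beyond simple voting.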
    Giuseppe Abrami, Alexander Mehler, Andy Lücking, Elias Rieb and Philipp Helfrich. May, 2019. TextAnnotator: A flexible framework for semantic annotations. Proceedings of the Fifteenth Joint ACL - ISO Workshop on Interoperable Semantic Annotation, (ISA-15).
    BibTeX
    @inproceedings{Abrami:et:al:2019,
      author    = {Abrami, Giuseppe and Mehler, Alexander and Lücking, Andy and Rieb, Elias
                   and Helfrich, Philipp},
      title     = {{TextAnnotator}: A flexible framework for semantic annotations},
      booktitle = {Proceedings of the Fifteenth Joint ACL - ISO Workshop on Interoperable
                   Semantic Annotation, (ISA-15)},
      series    = {ISA-15},
      location  = {Gothenburg, Sweden},
      month     = {May},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2019/04/TextAnnotator_IWCS_Göteborg.pdf},
      year      = {2019},
      keywords  = {textannotator},
      abstract  = {Modern annotation tools should meet at least the following general
                   requirements: they can handle diverse data and annotation levels
                   within one tool, and they support the annotation process with
                   automatic (pre-)processing outcomes as much as possible. We developed
                   a framework that meets these general requirements and that enables
                   versatile and browser-based annotations of texts, the TextAnnotator.
                   It combines NLP methods of pre-processing with methods of flexible
                   post-processing. In fact, machine learning (ML) requires a lot
                   of training and test data, but is usually far from achieving perfect
                   results. Producing high-level annotations for ML and post-correcting
                   its results are therefore necessary. This is the purpose of TextAnnotator,
                   which is entirely implemented in ExtJS and provides a range of
                   interactive visualizations of annotations. In addition, it allows
                   for flexibly integrating knowledge resources, e.g. in the course
                   of post-processing named entity recognition. The paper describes
                   TextAnnotator’s architecture together with three use cases: annotating
                   temporal structures, argument structures and named entity linking.}
    }
    Tolga Uslu, Alexander Mehler and Daniel Baumartz. 2019. Computing Classifier-based Embeddings with the Help of text2ddc. Proceedings of the 20th International Conference on Computational Linguistics and Intelligent Text Processing, (CICLing 2019).
    BibTeX
    @inproceedings{Uslu:Mehler:Baumartz:2019,
      author    = {Uslu, Tolga and Mehler, Alexander and Baumartz, Daniel},
      booktitle = {{Proceedings of the 20th International Conference on Computational
                   Linguistics and Intelligent Text Processing, (CICLing 2019)}},
      location  = {La Rochelle, France},
      series    = {{CICLing 2019}},
      title     = {{Computing Classifier-based Embeddings with the Help of text2ddc}},
      year      = {2019}
    }
    Tolga Uslu, Alexander Mehler, Clemens Schulz and Daniel Baumartz. 2019. BigSense: a Word Sense Disambiguator for Big Data. Proceedings of the Digital Humanities 2019, (DH2019).
    BibTeX
    @inproceedings{Uslu:Mehler:Schulz:Baumartz:2019,
      author    = {Uslu, Tolga and Mehler, Alexander and Schulz, Clemens and Baumartz, Daniel},
      booktitle = {{Proceedings of the Digital Humanities 2019, (DH2019)}},
      location  = {Utrecht, Netherlands},
      series    = {{DH2019}},
      title     = {{{BigSense}: a Word Sense Disambiguator for Big Data}},
      year      = {2019},
      url       = {https://dev.clariah.nl/files/dh2019/boa/0199.html}
    }
    Wahed Hemati and Alexander Mehler. January, 2019. LSTMVoter: chemical named entity recognition using a conglomerate of sequence labeling tools. Journal of Cheminformatics, 11(1):7.
    BibTeX
    @article{Hemati:Mehler:2019a,
      abstract  = {Chemical and biomedical named entity recognition (NER) is an essential
                   preprocessing task in natural language processing. The identification
                   and extraction of named entities from scientific articles is also
                   attracting increasing interest in many scientific disciplines.
                   Locating chemical named entities in the literature is an essential
                   step in chemical text mining pipelines for identifying chemical
                   mentions, their properties, and relations as discussed in the
                   literature. In this work, we describe an approach to the BioCreative
                   V.5 challenge regarding the recognition and classification of
                   chemical named entities. For this purpose, we transform the task
                   of NER into a sequence labeling problem. We present a series of
                   sequence labeling systems that we used, adapted and optimized
                   in our experiments for solving this task. To this end, we experiment
                   with hyperparameter optimization. Finally, we present LSTMVoter,
                   a two-stage application of recurrent neural networks that integrates
                   the optimized sequence labelers from our study into a single ensemble
                   classifier.},
      author    = {Hemati, Wahed and Mehler, Alexander},
      day       = {10},
      doi       = {10.1186/s13321-018-0327-2},
      issn      = {1758-2946},
      journal   = {Journal of Cheminformatics},
      month     = {Jan},
      number    = {1},
      pages     = {7},
      title     = {{{LSTMVoter}: chemical named entity recognition using a conglomerate
                   of sequence labeling tools}},
      url       = {https://doi.org/10.1186/s13321-018-0327-2},
      volume    = {11},
      year      = {2019}
    }
    Giuseppe Abrami, Alexander Mehler and Christian Spiekermann. July, 2019. Graph-based Format for Modeling Multimodal Annotations in Virtual Reality by Means of VAnnotatoR. Proceedings of the 21st International Conference on Human-Computer Interaction, HCII 2019, 351–358.
    BibTeX
    @inproceedings{Abrami:Mehler:Spiekermann:2019,
      author    = {Abrami, Giuseppe and Mehler, Alexander and Spiekermann, Christian},
      title     = {{Graph-based Format for Modeling Multimodal Annotations in Virtual
                   Reality by Means of VAnnotatoR}},
      booktitle = {Proceedings of the 21st International Conference on Human-Computer
                   Interaction, HCII 2019},
      series    = {HCII 2019},
      location  = {Orlando, Florida, USA},
      editor    = {Stephanidis, Constantine and Antona, Margherita},
      month     = {July},
      publisher = {Springer International Publishing},
      address   = {Cham},
      pages     = {351--358},
      abstract  = {Projects in the field of Natural Language Processing (NLP), the
                   Digital Humanities (DH) and related disciplines dealing with machine
                   learning of complex relationships between data objects need annotations
                   to obtain sufficiently rich training and test sets. The visualization
                   of such data sets and their underlying Human Computer Interaction
                   (HCI) are perennial problems of computer science. However, despite
                   some success stories, the clarity of information presentation
                   and the flexibility of the annotation process may decrease with
                   the complexity of the underlying data objects and their relationships.
                   In order to face this problem, the so-called VAnnotatoR was developed
                   as a flexible annotation tool using 3D glasses and augmented reality
                   devices, which enables annotation and visualization in three-dimensional
                   virtual environments. In addition, multimodal objects are annotated
                   and visualized within a graph-based approach.},
      isbn      = {978-3-030-30712-7},
      pdf       = {https://link.springer.com/content/pdf/10.1007\%2F978-3-030-30712-7_44.pdf},
      year      = {2019}
    }
    Alexander Mehler, Tolga Uslu, Rüdiger Gleim and Daniel Baumartz. 2019. text2ddc meets Literature - Ein Verfahren für die Analyse und Visualisierung thematischer Makrostrukturen. Proceedings of the 6th Digital Humanities Conference in the German-speaking Countries, DHd 2019.
    BibTeX
    @inproceedings{Mehler:Uslu:Gleim:Baumartz:2019,
      author    = {Mehler, Alexander and Uslu, Tolga and Gleim, Rüdiger and Baumartz, Daniel},
      title     = {{text2ddc meets Literature - Ein Verfahren für die Analyse und
                   Visualisierung thematischer Makrostrukturen}},
      booktitle = {Proceedings of the 6th Digital Humanities Conference in the German-speaking
                   Countries, DHd 2019},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2019/04/DHD_Poster___text2ddc_meets_Literature_Poster.pdf},
      series    = {DHd 2019},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2019/04/Preprint_DHd2019_text2ddc_meets_Literature.pdf},
      location  = {Frankfurt, Germany},
      year      = {2019}
    }
    Giuseppe Abrami, Christian Spiekermann and Alexander Mehler. 2019. VAnnotatoR: Ein Werkzeug zur Annotation multimodaler Netzwerke in dreidimensionalen virtuellen Umgebungen. Proceedings of the 6th Digital Humanities Conference in the German-speaking Countries, DHd 2019.
    BibTeX
    @inproceedings{Abrami:Spiekermann:Mehler:2019,
      author    = {Abrami, Giuseppe and Spiekermann, Christian and Mehler, Alexander},
      title     = {{VAnnotatoR: Ein Werkzeug zur Annotation multimodaler Netzwerke
                   in dreidimensionalen virtuellen Umgebungen}},
      booktitle = {Proceedings of the 6th Digital Humanities Conference in the German-speaking
                   Countries, DHd 2019},
      series    = {DHd 2019},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2019/04/Preprint_VAnnotatoR_DHd2019.pdf},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2019/04/DHDVAnnotatoRPoster.pdf},
      location  = {Frankfurt, Germany},
      year      = {2019}
    }
    Wahed Hemati, Alexander Mehler, Tolga Uslu and Giuseppe Abrami. 2019. Der TextImager als Front- und Backend für das verteilte NLP von Big Digital Humanities Data. Proceedings of the 6th Digital Humanities Conference in the German-speaking Countries, DHd 2019.
    BibTeX
    @inproceedings{Hemati:Mehler:Uslu:Abrami:2019,
      author    = {Hemati, Wahed and Mehler, Alexander and Uslu, Tolga and Abrami, Giuseppe},
      title     = {{Der TextImager als Front- und Backend für das verteilte NLP von
                   Big Digital Humanities Data}},
      booktitle = {Proceedings of the 6th Digital Humanities Conference in the German-speaking
                   Countries, DHd 2019},
      series    = {DHd 2019},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2019/04/Der-TextImager-als-Fron-und-Backend.pdf},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2019/04/DHD19_TextImager.pdf},
      location  = {Frankfurt, Germany},
      year      = {2019}
    }
    Rüdiger Gleim, Steffen Eger, Alexander Mehler, Tolga Uslu, Wahed Hemati, Andy Lücking, Alexander Henlein, Sven Kahlsdorf and Armin Hoenen. 2019. A practitioner's view: a survey and comparison of lemmatization and morphological tagging in German and Latin. Journal of Language Modeling.
    BibTeX
    @article{Gleim:Eger:Mehler:2019,
      author    = {Gleim, R\"{u}diger and Eger, Steffen and Mehler, Alexander and Uslu, Tolga
                   and Hemati, Wahed and L\"{u}cking, Andy and Henlein, Alexander and Kahlsdorf, Sven
                   and Hoenen, Armin},
      title     = {A practitioner's view: a survey and comparison of lemmatization
                   and morphological tagging in German and Latin},
      journal   = {Journal of Language Modeling},
      year      = {2019},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2019/07/jlm-tagging.pdf},
      doi       = {10.15398/jlm.v7i1.205},
      url       = {http://jlm.ipipan.waw.pl/index.php/JLM/article/view/205}
    }

    2018

    Tatiana Lokot, Alexander Mehler and Olga Abramov. November, 2018. On the limit value of compactness of some graph classes. PLOS ONE, 13(11):1–8.
    BibTeX
    @article{Lokot:Mehler:Abramov:2018,
      author    = {Lokot, Tatiana and Mehler, Alexander and Abramov, Olga},
      journal   = {PLOS ONE},
      publisher = {Public Library of Science},
      title     = {On the limit value of compactness of some graph classes},
      year      = {2018},
      month     = nov,
      volume    = {13},
      url       = {https://doi.org/10.1371/journal.pone.0207536},
      pages     = {1--8},
      abstract  = {In this paper, we study the limit of compactness which is a graph
                   index originally introduced for measuring structural characteristics
                   of hypermedia. Applying compactness to large-scale small-world
                   graphs, Mehler (2008) observed its limit behaviour to equal
                   1. The striking question concerning this finding was whether this
                   limit behaviour resulted from the specifics of small-world graphs
                   or was simply an artefact. In this paper, we determine the necessary
                   and sufficient conditions for any sequence of connected graphs
                   resulting in a limit value of CB = 1 which can be generalized
                   with some consideration for the case of disconnected graph classes
                   (Theorem 3). This result can be applied to many well-known classes
                   of connected graphs. Here, we illustrate it by considering four
                   examples. In fact, our proof-theoretical approach allows for quickly
                   obtaining the limit value of compactness for many graph classes
                   while sparing computational costs.},
      number    = {11},
      doi       = {10.1371/journal.pone.0207536}
    }
    Eleanor Rutherford, Wahed Hemati and Alexander Mehler. 2018. Corpus2Wiki: A MediaWiki based Annotation & Visualisation Tool for the Digital Humanities. INF-DH-2018.
    BibTeX
    @inproceedings{Rutherford:et:al:2018,
      author    = {Rutherford, Eleanor and Hemati, Wahed and Mehler, Alexander},
      title     = {{Corpus2Wiki}: A MediaWiki based Annotation \& Visualisation Tool
                   for the Digital Humanities},
      booktitle = {INF-DH-2018},
      year      = {2018},
      editor    = {Burghardt, Manuel and Müller-Birn, Claudia},
      publisher = {Gesellschaft für Informatik e.V.},
      address   = {Bonn}
    }
    Giuseppe Abrami, Alexander Mehler, Philipp Helfrich and Elias Rieb. 2018. TextAnnotator: A Browser-based Framework for Annotating Textual Data in Digital Humanities. Proceedings of the Digital Humanities Austria 2018.
    BibTeX
    @inproceedings{Abrami:et:al:2018,
      author    = {Giuseppe Abrami and Alexander Mehler and Philipp Helfrich and Elias Rieb},
      title     = {{TextAnnotator}: A Browser-based Framework for Annotating Textual
                   Data in Digital Humanities},
      booktitle = {Proceedings of the Digital Humanities Austria 2018},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2019/04/TA__A_Browser_based_Framework_for_Annotating_Textual_Data_in_Digital_Humanities.pdf},
      location  = {Salzburg, Austria},
      year      = {2018}
    }
    Sajawel Ahmed and Alexander Mehler. 2018. Resource-Size matters: Improving Neural Named Entity Recognition with Optimized Large Corpora. Proceedings of the 17th IEEE International Conference on Machine Learning and Applications (ICMLA).
    BibTeX
    @inproceedings{Ahmed:Mehler:2018,
      author    = {Sajawel Ahmed and Alexander Mehler},
      title     = {{Resource-Size matters: Improving Neural Named Entity Recognition
                   with Optimized Large Corpora}},
      abstract  = {This study improves the performance of neural named entity recognition
                   by a margin of up to 11\% in terms of F-score for
                   a low-resource language such as German, thereby outperforming existing
                   baselines and establishing a new state-of-the-art on each single
                   open-source dataset (CoNLL 2003, GermEval 2014 and Tübingen Treebank
                   2018). Rather than designing deeper and wider hybrid neural architectures,
                   we gather all available resources and perform a detailed optimization
                   and grammar-dependent morphological processing consisting of lemmatization
                   and part-of-speech tagging prior to exposing the raw data to any
                   training process. We test our approach in a threefold monolingual
                   experimental setup of a) single, b) joint, and c) optimized training
                   and shed light on the dependency of downstream-tasks on the size
                   of corpora used to compute word embeddings.},
      booktitle = {Proceedings of the 17th IEEE International Conference on Machine
                   Learning and Applications (ICMLA)},
      location  = {Orlando, Florida, USA},
      pdf       = {https://arxiv.org/pdf/1807.10675.pdf},
      year      = {2018}
    }
    Claus Weiland, Christine Driller, Markus Koch, Marco Schmidt, Giuseppe Abrami, Sajawel Ahmed, Alexander Mehler, Adrian Pachzelt, Gerwin Kasperek, Angela Hausinger and Thomas Hörnschemeyer. 2018. BIOfid, a platform to enhance accessibility of biodiversity data. Proceedings of the 10th International Conference on Ecological Informatics.
    BibTeX
    @inproceedings{Weiland:et:al:2018,
      author    = {Claus Weiland and Christine Driller and Markus Koch and Marco Schmidt
                   and Giuseppe Abrami and Sajawel Ahmed and Alexander Mehler and Adrian Pachzelt
                   and Gerwin Kasperek and Angela Hausinger and Thomas Hörnschemeyer},
      title     = {{BioFID}, a platform to enhance accessibility of biodiversity data},
      booktitle = {Proceedings of the 10th International Conference on Ecological Informatics},
      year      = {2018},
      url       = {https://www.researchgate.net/profile/Marco_Schmidt3/publication/327940813_BIOfid_a_Platform_to_Enhance_Accessibility_of_Biodiversity_Data/links/5bae3e3e92851ca9ed2cd60f/BIOfid-a-Platform-to-Enhance-Accessibility-of-Biodiversity-Data.pdf?origin=publication_detail},
      location  = {Jena, Germany}
    }
    Attila Kett, Giuseppe Abrami, Alexander Mehler and Christian Spiekermann. 2018. Resources2City Explorer: A System for Generating Interactive Walkable Virtual Cities out of File Systems. Proceedings of the 31st ACM User Interface Software and Technology Symposium.
    BibTeX
    @inproceedings{Kett:et:al:2018,
      author    = {Attila Kett and Giuseppe Abrami and Alexander Mehler and Christian Spiekermann},
      title     = {{Resources2City Explorer}: A System for Generating Interactive
                   Walkable Virtual Cities out of File Systems},
      booktitle = {Proceedings of the 31st ACM User Interface Software and Technology Symposium},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2018/10/UIST2018Final.pdf},
      location  = {Berlin, Germany},
      abstract  = {We present Resources2City Explorer (R2CE), a tool for representing
                   file systems as interactive, walkable virtual cities. R2CE visualizes
                   file systems based on concepts of spatial, 3D information processing.
                   For this purpose, it extends the range of functions of conventional
                   file browsers considerably. Visual elements in a city generated
                   by R2CE represent (relations of) objects of the underlying file
                   system. The paper describes the functional spectrum of R2CE and
                   illustrates it by visualizing a sample of 940 files.},
      year      = {2018}
    }
    Tolga Uslu and Alexander Mehler. 2018. PolyViz: a Visualization System for a Special Kind of Multipartite Graphs. Proceedings of the IEEE VIS 2018.
    BibTeX
    @inproceedings{Uslu:Mehler:2018,
      author    = {Tolga Uslu and Alexander Mehler},
      title     = {{PolyViz}: a Visualization System for a Special Kind of Multipartite Graphs},
      booktitle = {Proceedings of the IEEE VIS 2018},
      series    = {IEEE VIS 2018},
      location  = {Berlin, Germany},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/07/polyviz-visualization-system.pdf},
      year      = {2018}
    }
    Daniel Baumartz, Tolga Uslu and Alexander Mehler. 2018. LTV: Labeled Topic Vector. Proceedings of COLING 2018, the 27th International Conference on Computational Linguistics: System Demonstrations, August 20-26.
    BibTeX
    @inproceedings{Baumartz:Uslu:Mehler:2018,
      author    = {Daniel Baumartz and Tolga Uslu and Alexander Mehler},
      title     = {{LTV}: Labeled Topic Vector},
      booktitle = {Proceedings of {COLING 2018}, the 27th International Conference
                   on Computational Linguistics: System Demonstrations, August 20-26},
      year      = {2018},
      address   = {Santa Fe, New Mexico, USA},
      publisher = {The COLING 2018 Organizing Committee},
      abstract  = {In this paper, we present LTV, a website and an API that generate
                   labeled topic classifications based on the Dewey Decimal Classification
                   (DDC), an international standard for topic classification in libraries.
                   We introduce nnDDC, a largely language-independent neural network-based
                   classifier for DDC-related topic classification, which we optimized
                   using a wide range of linguistic features to achieve an F-score
                   of 87.4\%. To show that our approach is language-independent,
                   we evaluate nnDDC using up to 40 different languages. We derive
                   a topic model based on nnDDC, which generates probability distributions
                   over semantic units for any input on sense-, word- and text-level.
                   Unlike related approaches, however, these probabilities are estimated
                   by means of nnDDC so that each dimension of the resulting vector
                   representation is uniquely labeled by a DDC class. In this way,
                   we introduce a neural network-based Classifier-Induced Semantic
                   Space (nnCISS).},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/06/coling2018.pdf}
    }
    Christine Driller, Markus Koch, Marco Schmidt, Claus Weiland, Thomas Hörnschemeyer, Thomas Hickler, Giuseppe Abrami, Sajawel Ahmed, Rüdiger Gleim, Wahed Hemati, Tolga Uslu, Alexander Mehler, Adrian Pachzelt, Jashar Rexhepi, Thomas Risse, Janina Schuster, Gerwin Kasperek and Angela Hausinger. 2018. Workflow and Current Achievements of BIOfid, an Information Service Mobilizing Biodiversity Data from Literature Sources. Biodiversity Information Science and Standards, 2:e25876.
    BibTeX
    @article{Driller:et:al:2018,
      author    = {Christine Driller and Markus Koch and Marco Schmidt and Claus Weiland
                   and Thomas Hörnschemeyer and Thomas Hickler and Giuseppe Abrami and Sajawel Ahmed
                   and Rüdiger Gleim and Wahed Hemati and Tolga Uslu and Alexander Mehler
                   and Adrian Pachzelt and Jashar Rexhepi and Thomas Risse and Janina Schuster
                   and Gerwin Kasperek and Angela Hausinger},
      title     = {Workflow and Current Achievements of BIOfid, an Information Service
                   Mobilizing Biodiversity Data from Literature Sources},
      volume    = {2},
      year      = {2018},
      doi       = {10.3897/biss.2.25876},
      publisher = {Pensoft Publishers},
      abstract  = {BIOfid is a specialized information service currently being developed
                   to mobilize biodiversity data dormant in printed historical and
                   modern literature and to offer a platform for open access journals
                   on the science of biodiversity. Our team of librarians, computer
                   scientists and biologists produces high-quality text digitizations,
                   develop new text-mining tools and generate detailed ontologies
                   enabling semantic text analysis and semantic search by means of
                   user-specific queries. In a pilot project we focus on German publications
                   on the distribution and ecology of vascular plants, birds, moths
                   and butterflies extending back to the Linnaeus period about 250
                   years ago. The three organism groups have been selected according
                   to current demands of the relevant research community in Germany.
                   The text corpus defined for this purpose comprises over 400 volumes
                   with more than 100,000 pages to be digitized and will be complemented
                   by journals from other digitization projects, copyright-free and
                   project-related literature. With TextImager (Natural Language
                   Processing \& Text Visualization) and TextAnnotator (Discourse
                   Semantic Annotation) we have already extended and launched tools
                   that focus on the text-analytical section of our project. Furthermore,
                   taxonomic and anatomical ontologies elaborated by us for the taxa
                   prioritized by the project’s target group - German institutions
                   and scientists active in biodiversity research - are constantly
                   improved and expanded to maximize scientific data output. Our
                   poster describes the general workflow of our project ranging from
                   literature acquisition via software development, to data availability
                   on the BIOfid web portal (http://biofid.de/), and the implementation
                   into existing platforms which serve to promote global accessibility
                   of biodiversity data.},
      pages     = {e25876},
      url       = {https://doi.org/10.3897/biss.2.25876},
      eprint    = {https://doi.org/10.3897/biss.2.25876},
      journal   = {Biodiversity Information Science and Standards},
      keywords  = {biofid}
    }
    Alexander Mehler, Giuseppe Abrami, Christian Spiekermann and Matthias Jostock. 2018. VAnnotatoR: A Framework for Generating Multimodal Hypertexts. Proceedings of the 29th ACM Conference on Hypertext and Social Media.
    BibTeX
    @inproceedings{Mehler:Abrami:Spiekermann:Jostock:2018,
      author    = {Mehler, Alexander and Abrami, Giuseppe and Spiekermann, Christian
                   and Jostock, Matthias},
      title     = {{VAnnotatoR}: {A} Framework for Generating Multimodal Hypertexts},
      booktitle = {Proceedings of the 29th ACM Conference on Hypertext and Social Media},
      series    = {HT '18},
      year      = {2018},
      location  = {Baltimore, Maryland},
      publisher = {ACM},
      address   = {New York, NY, USA},
      pdf       = {http://delivery.acm.org/10.1145/3210000/3209572/p150-mehler.pdf}
    }
    Wahed Hemati, Alexander Mehler, Tolga Uslu, Daniel Baumartz and Giuseppe Abrami. 2018. Evaluating and Integrating Databases in the Area of NLP. International Quantitative Linguistics Conference (QUALICO 2018).
    BibTeX
    @inproceedings{Hemati:Mehler:Uslu:Baumartz:Abrami:2018,
      author    = {Wahed Hemati and Alexander Mehler and Tolga Uslu and Daniel Baumartz
                   and Giuseppe Abrami},
      title     = {Evaluating and Integrating Databases in the Area of {NLP}},
      booktitle = {International Quantitative Linguistics Conference (QUALICO 2018)},
      year      = {2018},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/04/Hemat-Mehler-Uslu-Baumartz-Abrami-Qualico-2018.pdf},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2018/10/qualico2018_databases_poster_hemati_mehler_uslu_baumartz_abrami.pdf},
      location  = {Wroclaw, Poland}
    }
    Alexander Mehler, Wahed Hemati, Rüdiger Gleim and Daniel Baumartz. 2018. VienNA: Auf dem Weg zu einer Infrastruktur für die verteilte interaktive evolutionäre Verarbeitung natürlicher Sprache. Forschungsinfrastrukturen und digitale Informationssysteme in der germanistischen Sprachwissenschaft, 6.
    BibTeX
    @incollection{Mehler:Hemati:Gleim:Baumartz:2018,
      author    = {Alexander Mehler and Wahed Hemati and Rüdiger Gleim and Daniel Baumartz},
      title     = {{VienNA}: {Auf dem Weg zu einer Infrastruktur für die verteilte
                   interaktive evolutionäre Verarbeitung natürlicher Sprache}},
      booktitle = {Forschungsinfrastrukturen und digitale Informationssysteme in
                   der germanistischen Sprachwissenschaft},
      publisher = {De Gruyter},
      editor    = {Henning Lobin and Roman Schneider and Andreas Witt},
      volume    = {6},
      address   = {Berlin},
      year      = {2018}
    }
    Alexander Mehler, Wahed Hemati, Tolga Uslu and Andy Lücking. 2018. A Multidimensional Model of Syntactic Dependency Trees for Authorship Attribution. Quantitative analysis of dependency structures.
    BibTeX
    @incollection{Mehler:Hemati:Uslu:Luecking:2018,
      author    = {Alexander Mehler and Wahed Hemati and Tolga Uslu and Andy Lücking},
      title     = {A Multidimensional Model of Syntactic Dependency Trees for Authorship
                   Attribution},
      booktitle = {Quantitative analysis of dependency structures},
      publisher = {De Gruyter},
      editor    = {Jingyang Jiang and Haitao Liu},
      address   = {Berlin/New York},
      abstract  = {In this chapter we introduce a multidimensional model
                   of syntactic dependency trees. Our ultimate goal is to generate
                   fingerprints of such trees to predict the author of the underlying
                   sentences. The chapter makes a first attempt to create such fingerprints
                   for sentence categorization via the detour of text categorization.
                   We show that at text level, aggregated dependency structures actually
                   provide information about authorship. At the same time, we show
                   that this does not hold for topic detection. We evaluate our model
                   using a quarter of a million sentences collected in two corpora:
                   the first is sampled from literary texts, the second from Wikipedia
                   articles. As a second finding of our approach, we show that quantitative
                   models of dependency structure do not yet allow for detecting
                   syntactic alignment in written communication. We conclude that
                   this is mainly due to effects of lexical alignment on syntactic
                   alignment.},
      keywords  = {Dependency structure, Authorship attribution, Text
                   categorization, Syntactic Alignment},
      year      = {2018}
    }
    Tolga Uslu, Alexander Mehler and Dirk Meyer. 2018. LitViz: Visualizing Literary Data by Means of text2voronoi. Proceedings of the Digital Humanities 2018.
    BibTeX
    @inproceedings{Uslu:Mehler:Meyer:2018,
      author    = {Tolga Uslu and Alexander Mehler and Dirk Meyer},
      title     = {{LitViz}: Visualizing Literary Data by Means of text2voronoi},
      booktitle = {Proceedings of the Digital Humanities 2018},
      series    = {DH2018},
      location  = {Mexico City, Mexico},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/03/LitViz.pdf},
      year      = {2018}
    }
    Christian Spiekermann, Giuseppe Abrami and Alexander Mehler. 2018. VAnnotatoR: a Gesture-driven Annotation Framework for Linguistic and Multimodal Annotation. Proceedings of the Annotation, Recognition and Evaluation of Actions (AREA 2018) Workshop.
    BibTeX
    @inproceedings{Spiekerman:Abrami:Mehler:2018,
      author    = {Christian Spiekermann and Giuseppe Abrami and Alexander Mehler},
      title     = {{VAnnotatoR}: a Gesture-driven Annotation Framework for Linguistic
                   and Multimodal Annotation},
      booktitle = {Proceedings of the Annotation, Recognition and Evaluation of Actions
                   (AREA 2018) Workshop},
      series    = {AREA},
      location  = {Miyazaki, Japan},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/03/VAnnotatoR.pdf},
      year      = {2018}
    }
    Tolga Uslu, Lisa Miebach, Steffen Wolfsgruber, Michael Wagner, Klaus Fließbach, Rüdiger Gleim, Wahed Hemati, Alexander Henlein and Alexander Mehler. 2018. Automatic Classification in Memory Clinic Patients and in Depressive Patients. Proceedings of Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric impairments (RaPID-2).
    BibTeX
    @inproceedings{Uslu:et:al:2018:a,
      author    = {Tolga Uslu and Lisa Miebach and Steffen Wolfsgruber and Michael Wagner
                   and Klaus Fließbach and Rüdiger Gleim and Wahed Hemati and Alexander Henlein
                   and Alexander Mehler},
      title     = {{Automatic Classification in Memory Clinic Patients and in Depressive Patients}},
      booktitle = {Proceedings of Resources and ProcessIng of linguistic, para-linguistic
                   and extra-linguistic Data from people with various forms of cognitive/psychiatric
                   impairments (RaPID-2)},
      series    = {RaPID},
      location  = {Miyazaki, Japan},
      year      = {2018}
    }
    Alexander Mehler, Rüdiger Gleim, Andy Lücking, Tolga Uslu and Christian Stegbauer. 2018. On the Self-similarity of Wikipedia Talks: a Combined Discourse-analytical and Quantitative Approach. Glottometrics, 40:1–44.
    BibTeX
    @article{Mehler:Gleim:Luecking:Uslu:Stegbauer:2018,
      author    = {Alexander Mehler and Rüdiger Gleim and Andy Lücking and Tolga Uslu
                   and Christian Stegbauer},
      title     = {On the Self-similarity of {Wikipedia} Talks: a Combined Discourse-analytical
                   and Quantitative Approach},
      journal   = {Glottometrics},
      volume    = {40},
      pages     = {1--44},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/03/Glottometrics-Mehler.pdf},
      year      = {2018}
    }
    Tolga Uslu, Alexander Mehler, Andreas Niekler and Daniel Baumartz. 2018. Towards a DDC-based Topic Network Model of Wikipedia. Proceedings of the 2nd International Workshop on Modeling, Analysis, and Management of Social Networks and their Applications (SOCNET 2018), February 28, 2018.
    BibTeX
    @inproceedings{Uslu:Mehler:Niekler:Baumartz:2018,
      author    = {Tolga Uslu and Alexander Mehler and Andreas Niekler and Daniel Baumartz},
      title     = {Towards a {DDC}-based Topic Network Model of Wikipedia},
      booktitle = {Proceedings of the 2nd International Workshop on Modeling, Analysis,
                   and Management of Social Networks and their Applications (SOCNET
                   2018), February 28, 2018},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/03/TowardsDDC.pdf},
      year      = {2018}
    }
    Tolga Uslu, Alexander Mehler, Daniel Baumartz, Alexander Henlein and Wahed Hemati. 2018. fastSense: An Efficient Word Sense Disambiguation Classifier. Proceedings of the 11th edition of the Language Resources and Evaluation Conference, May 7 - 12.
    BibTeX
    @inproceedings{Uslu:et:al:2018,
      author    = {Tolga Uslu and Alexander Mehler and Daniel Baumartz and Alexander Henlein
                   and Wahed Hemati},
      title     = {fastSense: An Efficient Word Sense Disambiguation Classifier},
      booktitle = {Proceedings of the 11th edition of the Language Resources and
                   Evaluation Conference, May 7 - 12},
      series    = {LREC 2018},
      address   = {Miyazaki, Japan},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/03/fastSense.pdf},
      year      = {2018}
    }
    Rüdiger Gleim, Alexander Mehler and Sung Y. Song. 2018. WikiDragon: A Java Framework For Diachronic Content And Network Analysis Of MediaWikis. Proceedings of the 11th edition of the Language Resources and Evaluation Conference, May 7 - 12.
    BibTeX
    @inproceedings{Gleim:Mehler:Song:2018,
      author    = {R{\"u}diger Gleim and Alexander Mehler and Sung Y. Song},
      title     = {WikiDragon: A Java Framework For Diachronic Content And Network
                   Analysis Of MediaWikis},
      booktitle = {Proceedings of the 11th edition of the Language Resources and
                   Evaluation Conference, May 7 - 12},
      series    = {LREC 2018},
      address   = {Miyazaki, Japan},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/03/WikiDragon.pdf},
      year      = {2018}
    }
    Philipp Helfrich, Elias Rieb, Giuseppe Abrami, Andy Lücking and Alexander Mehler. 2018. TreeAnnotator: Versatile Visual Annotation of Hierarchical Text Relations. Proceedings of the 11th edition of the Language Resources and Evaluation Conference, May 7 - 12.
    BibTeX
    @inproceedings{Helfrich:et:al:2018,
      author    = {Philipp Helfrich and Elias Rieb and Giuseppe Abrami and Andy L{\"u}cking
                   and Alexander Mehler},
      title     = {TreeAnnotator: Versatile Visual Annotation of Hierarchical Text Relations},
      booktitle = {Proceedings of the 11th edition of the Language Resources and
                   Evaluation Conference, May 7 - 12},
      series    = {LREC 2018},
      address   = {Miyazaki, Japan},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/03/TreeAnnotator.pdf},
      year      = {2018}
    }
    Giuseppe Abrami and Alexander Mehler. May, 2018. A UIMA Database Interface for Managing NLP-related Text Annotations. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018).
    BibTeX
    @inproceedings{Abrami:Mehler:2018,
      address   = {Miyazaki, Japan},
      author    = {Abrami, Giuseppe and Mehler, Alexander},
      booktitle = {Proceedings of the Eleventh International Conference on Language
                   Resources and Evaluation ({LREC} 2018)},
      editor    = {Calzolari, Nicoletta and Choukri, Khalid and Cieri, Christopher
                   and Declerck, Thierry and Goggi, Sara and Hasida, Koiti and Isahara, Hitoshi
                   and Maegaard, Bente and Mariani, Joseph and Mazo, H{\'e}l{\`e}ne and Moreno, Asuncion
                   and Odijk, Jan and Piperidis, Stelios and Tokunaga, Takenobu},
      month     = {may},
      series    = {LREC 2018},
      keywords  = {UIMA},
      pdf       = {https://aclanthology.org/L18-1212.pdf},
      publisher = {European Language Resources Association (ELRA)},
      title     = {A {UIMA} Database Interface for Managing {NLP}-related Text Annotations},
      url       = {https://aclanthology.org/L18-1212},
      year      = {2018}
    }
    Alexander Mehler, Christian Stegbauer and Barbara Frank-Job. 2018. Ferdinand de Saussure. 1916. Cours de linguistique générale. Payot, Lausanne/Paris. In: Schlüsselwerke der Netzwerkforschung. Ed. by Christian Stegbauer and Boris Holzer. Springer VS.
    BibTeX
    @inbook{Mehler:Stegbauer:Frank-Job:2018,
      author    = {Alexander Mehler and Christian Stegbauer and Barbara Frank-Job},
      editor    = {Christian Stegbauer and Boris Holzer},
      title     = {{Ferdinand de Saussure. 1916. Cours de linguistique générale.
                   Payot, Lausanne/Paris}},
      publisher = {Springer VS},
      address   = {Wiesbaden},
      booktitle = {Schlüsselwerke der Netzwerkforschung},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2017/11/Saussure2.pdf},
      year      = {2018}
    }
    Alexander Mehler, Olga Zlatkin-Troitschanskaia, Wahed Hemati, Dimitri Molerov, Andy Lücking and Susanne Schmidt. 2018. Integrating Computational Linguistic Analysis of Multilingual Learning Data and Educational Measurement Approaches to Explore Learning in Higher Education. In: Positive Learning in the Age of Information: A Blessing or a Curse?, 145–193. Ed. by Olga Zlatkin-Troitschanskaia, Gabriel Wittum and Andreas Dengel. Springer Fachmedien Wiesbaden.
    BibTeX
    @inbook{Mehler:et:al:2018,
      abstract  = {This chapter develops a computational linguistic model for analyzing
                   and comparing multilingual data as well as its application to
                   a large body of standardized assessment data from higher education.
                   The approach employs both an automatic and a manual annotation
                   of the data on several linguistic layers (including parts of speech,
                   text structure and content). Quantitative features of the textual
                   data are explored that are related to both the students' (domain-specific
                   knowledge) test results and their level of academic experience.
                   The respective analysis involves statistics of distance correlation,
                   text categorization with respect to text types (questions and
                   response options) as well as languages (English and German), and
                   network analysis to assess dependencies between features. The
                   correlation between correct test results of students and linguistic
                   features of the verbal presentations of tests indicate to what
                   extent language influences higher education test performance.
                   It has also been found that this influence relates to specialized
                   language. Thus, this integrative modeling approach contributes
                   a test basis for a large-scale analysis of learning data and points
                   to a number of subsequent, more detailed research questions.},
      address   = {Wiesbaden},
      author    = {Mehler, Alexander and Zlatkin-Troitschanskaia, Olga and Hemati, Wahed
                   and Molerov, Dimitri and L{\"u}cking, Andy and Schmidt, Susanne},
      booktitle = {Positive Learning in the Age of Information: A Blessing or a Curse?},
      doi       = {10.1007/978-3-658-19567-0_10},
      editor    = {Zlatkin-Troitschanskaia, Olga and Wittum, Gabriel and Dengel, Andreas},
      isbn      = {978-3-658-19567-0},
      pages     = {145--193},
      publisher = {Springer Fachmedien Wiesbaden},
      title     = {Integrating Computational Linguistic Analysis of Multilingual
                   Learning Data and Educational Measurement Approaches to Explore
                   Learning in Higher Education},
      url       = {https://doi.org/10.1007/978-3-658-19567-0_10},
      year      = {2018}
    }
    Giuseppe Abrami, Sajawel Ahmed, Rüdiger Gleim, Wahed Hemati, Alexander Mehler and Tolga Uslu. March, 2018. Natural Language Processing and Text Mining for BIOfid.
    BibTeX
    @misc{Abrami:et:al:2018b,
      author    = {Abrami, Giuseppe and Ahmed, Sajawel and Gleim, R{\"u}diger and Hemati, Wahed
                   and Mehler, Alexander and Uslu, Tolga},
      title     = {{Natural Language Processing and Text Mining for BIOfid}},
      howpublished = {Presentation at the 1st Meeting of the Scientific Advisory Board of the BIOfid Project},
      address   = {Goethe-University, Frankfurt am Main, Germany},
      year      = {2018},
      month     = {March},
      day       = {08},
    }

    2017

    Alexander Mehler and Andy Lücking. 2017. Modelle sozialer Netzwerke und Natural Language Processing: eine methodologische Randnotiz. Soziologie, 46(1):43–47.
    BibTeX
    @article{Mehler:Luecking:2017,
      author    = {Alexander Mehler and Andy Lücking},
      title     = {Modelle sozialer Netzwerke und Natural Language Processing: eine
                   methodologische Randnotiz},
      journal   = {Soziologie},
      volume    = {46},
      number    = {1},
      pages     = {43--47},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/03/Soziologe-NetzwerkeundNLP.pdf},
      year      = {2017}
    }
    Wahed Hemati, Alexander Mehler and Tolga Uslu. 2017. CRFVoter: Chemical Entity Mention, Gene and Protein Related Object recognition using a conglomerate of CRF based tools. BioCreative V.5. Proceedings.
    BibTeX
    @inproceedings{Hemati:Mehler:Uslu:2017,
      author    = {Wahed Hemati and Alexander Mehler and Tolga Uslu},
      title     = {{CRFVoter}: Chemical Entity Mention, Gene and Protein Related
                   Object recognition using a conglomerate of CRF based tools},
      booktitle = {BioCreative V.5. Proceedings},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/03/CRFVoter.pdf},
      year      = {2017}
    }
    Wahed Hemati, Tolga Uslu and Alexander Mehler. 2017. TextImager as an interface to BeCalm. BioCreative V.5. Proceedings.
    BibTeX
    @inproceedings{Hemati:Uslu:Mehler:2017,
      author    = {Wahed Hemati and Tolga Uslu and Alexander Mehler},
      title     = {{TextImager} as an interface to {BeCalm}},
      booktitle = {BioCreative V.5. Proceedings},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/03/TextImager_BeCalm.pdf},
      year      = {2017}
    }
    Alexander Mehler, Giuseppe Abrami, Steffen Bruendel, Lisa Felder, Thomas Ostertag and Christian Spiekermann. 2017. Stolperwege: An App for a Digital Public History of the Holocaust. Proceedings of the 28th ACM Conference on Hypertext and Social Media, 319–320.
    BibTeX
    @inproceedings{Mehler:et:al:2017:a,
      author    = {Alexander Mehler and Giuseppe Abrami and Steffen Bruendel and Lisa Felder
                   and Thomas Ostertag and Christian Spiekermann},
      title     = {{Stolperwege:} An App for a Digital Public History of the {Holocaust}},
      booktitle = {Proceedings of the 28th ACM Conference on Hypertext and Social Media},
      series    = {HT '17},
      pages     = {319--320},
      address   = {New York, NY, USA},
      publisher = {ACM},
      abstract  = {We present the Stolperwege app, a web-based framework for ubiquitous
                   modeling of historical processes. Starting from the art project
                   Stolpersteine of Gunter Demnig, it allows for virtually connecting
                   these stumbling blocks with information about the biographies
                   of victims of Nazism. According to the practice of public history,
                   the aim of Stolperwege is to deepen public knowledge of the Holocaust
                   in the context of our everyday environment. Stolperwege uses an
                   information model that allows for modeling social networks of
                   agents starting from information about portions of their life.
                   The paper exemplifies how Stolperwege is informationally enriched
                   by means of historical maps and 3D animations of (historical)
                   buildings.},
      acmid     = {3078748},
      doi       = {10.1145/3078714.3078748},
      isbn      = {978-1-4503-4708-2},
      keywords  = {3d, geocaching, geotagging, historical maps,
                       historical processes, public history of the holocaust,
                       ubiquitous computing},
      location  = {Prague, Czech Republic},
      numpages  = {2},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2017/07/poster_ht2017.pdf},
      url       = {http://doi.acm.org/10.1145/3078714.3078748},
      year      = {2017}
    }
    Alexander Mehler, Rüdiger Gleim, Wahed Hemati and Tolga Uslu. 2017. Skalenfreie online soziale Lexika am Beispiel von Wiktionary. Proceedings of the 53rd Annual Conference of the Institut für Deutsche Sprache (IDS), March 14-16, Mannheim, Germany. In German. Title translates as: Scale-Free Online Social Lexica: The Example of Wiktionary.
    BibTeX
    @inproceedings{Mehler:Gleim:Hemati:Uslu:2017,
      author    = {Alexander Mehler and Rüdiger Gleim and Wahed Hemati and Tolga Uslu},
      title     = {{Skalenfreie online soziale Lexika am Beispiel von Wiktionary}},
      booktitle = {Proceedings of the 53rd Annual Conference of the Institut für Deutsche
                   Sprache (IDS), March 14-16, Mannheim, Germany},
      editor    = {Stefan Engelberg and Henning Lobin and Kathrin Steyer and Sascha Wolfer},
      address   = {Berlin},
      publisher = {De Gruyter},
      note      = {In German. Title translates as: Scale-Free Online
                       Social Lexica: The Example of Wiktionary},
      abstract  = {In English: The paper deals with characteristics of the structural,
                   thematic and participatory dynamics of collaboratively generated
                   lexical networks. This is done by example of Wiktionary. Starting
                   from a network-theoretical model in terms of so-called multi-layer
                   networks, we describe Wiktionary as a scale-free lexicon. Systems
                   of this sort are characterized by the fact that their content-related
                   dynamics is determined by the underlying dynamics of collaborating
                   authors. This happens in a way that social structure imprints
                   on content structure. According to this conception, the unequal
                   distribution of the activities of authors results in a correspondingly
                   unequal distribution of the information units documented within
                   the lexicon. The paper focuses on foundations for describing such
                   systems starting from a parameter space which requires to deal
                   with Wiktionary as an issue in big data analysis. In German: Der
                   Beitrag thematisiert Eigenschaften der strukturellen, thematischen
                   und partizipativen Dynamik kollaborativ erzeugter lexikalischer
                   Netzwerke am Beispiel von Wiktionary. Ausgehend von einem netzwerktheoretischen
                   Modell in Form so genannter Mehrebenennetzwerke wird Wiktionary
                   als ein skalenfreies Lexikon beschrieben. Systeme dieser Art zeichnen
                   sich dadurch aus, dass ihre inhaltliche Dynamik durch die zugrundeliegende
                   Kollaborationsdynamik bestimmt wird, und zwar so, dass sich die
                   soziale Struktur der entsprechenden inhaltlichen Struktur aufprägt.
                   Dieser Auffassung gemäß führt die Ungleichverteilung der Aktivitäten
                   von Lexikonproduzenten zu einer analogen Ungleichverteilung der
                   im Lexikon dokumentierten Informationseinheiten. Der Beitrag thematisiert
                   Grundlagen zur Beschreibung solcher Systeme ausgehend von einem
                   Parameterraum, welcher die netzwerkanalytische Betrachtung von
                   Wiktionary als Big-Data-Problem darstellt.},
      year      = {2017}
    }
    Tolga Uslu, Wahed Hemati, Alexander Mehler and Daniel Baumartz. 2017. TextImager as a Generic Interface to R. Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017).
    BibTeX
    @inproceedings{Uslu:Hemati:Mehler:Baumartz:2017,
      author    = {Tolga Uslu and Wahed Hemati and Alexander Mehler and Daniel Baumartz},
      title     = {{TextImager} as a Generic Interface to {R}},
      booktitle = {Software Demonstrations of the 15th Conference of the European
                   Chapter of the Association for Computational Linguistics (EACL
                   2017)},
      location  = {Valencia, Spain},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/03/TextImager.pdf},
      year      = {2017}
    }

    2016

    Steffen Eger, Armin Hoenen and Alexander Mehler. 2016. Language classification from bilingual word embedding graphs. Proceedings of COLING 2016.
    BibTeX
    @inproceedings{Eger:Hoenen:Mehler:2016,
      author    = {Steffen Eger and Armin Hoenen and Alexander Mehler},
      title     = {Language classification from bilingual word embedding graphs},
      booktitle = {Proceedings of COLING 2016},
      publisher = {ACL},
      location  = {Osaka},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2016/10/eger_hoenen_mehler_COLING2016.pdf},
      year      = {2016}
    }
    Wahed Hemati, Tolga Uslu and Alexander Mehler. 2016. TextImager: a Distributed UIMA-based System for NLP. Proceedings of the COLING 2016 System Demonstrations.
    BibTeX
    @inproceedings{Hemati:Uslu:Mehler:2016,
      author    = {Wahed Hemati and Tolga Uslu and Alexander Mehler},
      title     = {{TextImager}: a Distributed {UIMA}-based System for {NLP}},
      booktitle = {Proceedings of the COLING 2016 System Demonstrations},
      location  = {Osaka, Japan},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2018/03/TextImager2016.pdf},
      year      = {2016}
    }
    Alexander Mehler, Tolga Uslu and Wahed Hemati. 2016. Text2voronoi: An Image-driven Approach to Differential Diagnosis. Proceedings of the 5th Workshop on Vision and Language (VL'16) hosted by the 54th Annual Meeting of the Association for Computational Linguistics (ACL), Berlin.
    BibTeX
    @inproceedings{Mehler:Uslu:Hemati:2016,
      author    = {Alexander Mehler and Tolga Uslu and Wahed Hemati},
      title     = {Text2voronoi: An Image-driven Approach to Differential Diagnosis},
      booktitle = {Proceedings of the 5th Workshop on Vision and Language (VL'16)
                   hosted by the 54th Annual Meeting of the Association for Computational
                   Linguistics (ACL), Berlin},
      pdf       = {https://aclweb.org/anthology/W/W16/W16-3212.pdf},
      year      = {2016}
    }
    Steffen Eger and Alexander Mehler. 2016. On the linearity of semantic change: Investigating meaning variation via dynamic graph models. Proceedings of ACL 2016.
    BibTeX
    @inproceedings{Eger:Mehler:2016,
      author    = {Steffen Eger and Alexander Mehler},
      title     = {On the linearity of semantic change: {I}nvestigating meaning variation
                   via dynamic graph models},
      booktitle = {Proceedings of ACL 2016},
      location  = {Berlin},
      pdf       = {https://www.aclweb.org/anthology/P/P16/P16-2009.pdf},
      year      = {2016}
    }
    Steffen Eger, Tim vor der Brück and Alexander Mehler. 2016. A Comparison of Four Character-Level String-to-String Translation Models for (OCR) Spelling Error Correction. The Prague Bulletin of Mathematical Linguistics, 105:77–99.
    BibTeX
    @article{Eger:vorDerBrueck:Mehler:2016,
      author    = {Eger, Steffen and vor der Brück, Tim and Mehler, Alexander},
      title     = {A Comparison of Four Character-Level String-to-String Translation
                   Models for (OCR) Spelling Error Correction},
      journal   = {The Prague Bulletin of Mathematical Linguistics},
      volume    = {105},
      pages     = {77--99},
      doi       = {10.1515/pralin-2016-0004},
      pdf       = {https://ufal.mff.cuni.cz/pbml/105/art-eger-vor-der-brueck.pdf},
      year      = {2016}
    }
    Alexander Mehler, Benno Wagner and Rüdiger Gleim. 2016. Wikidition: Towards A Multi-layer Network Model of Intertextuality. Proceedings of DH 2016, 12-16 July.
    BibTeX
    @inproceedings{Mehler:Wagner:Gleim:2016,
      author    = {Mehler, Alexander and Wagner, Benno and Gleim, R\"{u}diger},
      title     = {Wikidition: Towards A Multi-layer Network Model of Intertextuality},
      booktitle = {Proceedings of DH 2016, 12-16 July},
      series    = {DH 2016},
      abstract  = {The paper presents Wikidition, a novel text mining tool for generating
                   online editions of text corpora. It explores lexical, sentential
                   and textual relations to span multi-layer networks (linkification)
                   that allow for browsing syntagmatic and paradigmatic relations
                   among the constituents of its input texts. In this way, relations
                   of text reuse can be explored together with lexical relations
                   within the same literary memory information system. Beyond that,
                   Wikidition contains a module for automatic lexiconisation to extract
                   author specific vocabularies. Based on linkification and lexiconisation,
                   Wikidition does not only allow for traversing input corpora on
                   different (lexical, sentential and textual) levels. Rather, its
                   readers can also study the vocabulary of authors on several levels
                   of resolution including superlemmas, lemmas, syntactic words and
                   wordforms. We exemplify Wikidition by a range of literary texts
                   and evaluate it by means of the apparatus of quantitative network
                   analysis.},
      location  = {Kraków},
      url       = {http://dh2016.adho.org/abstracts/250},
      year      = {2016}
    }
    Tim vor der Brück and Alexander Mehler. 2016. TLT-CRF: A Lexicon-supported Morphological Tagger for Latin Based on Conditional Random Fields. Proceedings of the 10th International Conference on Language Resources and Evaluation.
    BibTeX
    @inproceedings{vorderBrueck:Mehler:2016,
      author    = {vor der Br\"{u}ck, Tim and Mehler, Alexander},
      title     = {{TLT-CRF}: A Lexicon-supported Morphological Tagger for {Latin}
                   Based on Conditional Random Fields},
      booktitle = {Proceedings of the 10th International Conference on Language Resources
                   and Evaluation},
      series    = {LREC 2016},
      location  = {{Portoro\v{z} (Slovenia)}},
      pdf       = {http://www.texttechnologylab.org/wp-content/uploads/2016/04/lrec2016_tagger.pdf},
      year      = {2016}
    }
    Steffen Eger, Rüdiger Gleim and Alexander Mehler. 2016. Lemmatization and Morphological Tagging in German and Latin: A comparison and a survey of the state-of-the-art. Proceedings of the 10th International Conference on Language Resources and Evaluation.
    BibTeX
    @inproceedings{Eger:Mehler:Gleim:2016,
      author    = {Eger, Steffen and Gleim, R\"{u}diger and Mehler, Alexander},
      title     = {Lemmatization and Morphological Tagging in {German} and {Latin}:
                   A comparison and a survey of the state-of-the-art},
      booktitle = {Proceedings of the 10th International Conference on Language Resources
                   and Evaluation},
      series    = {LREC 2016},
      location  = {Portoro\v{z} (Slovenia)},
      pdf       = {http://www.texttechnologylab.org/wp-content/uploads/2016/04/lrec_eger_gleim_mehler.pdf},
      year      = {2016}
    }
    Andy Lücking, Alexander Mehler, Désirée Walther, Marcel Mauri and Dennis Kurfürst. 2016. Finding Recurrent Features of Image Schema Gestures: the FIGURE corpus. Proceedings of the 10th International Conference on Language Resources and Evaluation.
    BibTeX
    @inproceedings{Luecking:Mehler:Walther:Mauri:Kurfuerst:2016,
      author    = {L\"{u}cking, Andy and Mehler, Alexander and Walther, D\'{e}sir\'{e}e
                   and Mauri, Marcel and Kurf\"{u}rst, Dennis},
      title     = {Finding Recurrent Features of Image Schema Gestures: the {FIGURE} corpus},
      booktitle = {Proceedings of the 10th International Conference on Language Resources
                   and Evaluation},
      series    = {LREC 2016},
      location  = {Portoro\v{z} (Slovenia)},
      pdf       = {http://www.texttechnologylab.org/wp-content/uploads/2016/04/lrec2016-gesture-study-final-version-short.pdf},
      year      = {2016}
    }
    Andy Lücking, Armin Hoenen and Alexander Mehler. 2016. TGermaCorp – A (Digital) Humanities Resource for (Computational) Linguistics. Proceedings of the 10th International Conference on Language Resources and Evaluation.
    BibTeX
    @inproceedings{Luecking:Hoenen:Mehler:2016,
      author    = {L\"{u}cking, Andy and Hoenen, Armin and Mehler, Alexander},
      title     = {{TGermaCorp} -- A (Digital) Humanities Resource for (Computational) Linguistics},
      booktitle = {Proceedings of the 10th International Conference on Language Resources
                   and Evaluation},
      series    = {LREC 2016},
      islrn     = {536-382-801-278-5},
      location  = {Portoro\v{z} (Slovenia)},
      pdf       = {http://www.texttechnologylab.org/wp-content/uploads/2016/04/lrec2016-ttgermacorp-final.pdf},
      year      = {2016}
    }
    Benno Wagner, Alexander Mehler and Hanno Biber. 2016. Transbiblionome Daten in der Literaturwissenschaft. Texttechnologische Erschließung und digitale Visualisierung intertextueller Beziehungen digitaler Korpora. DHd 2016.
    BibTeX
    @inproceedings{Wagner:Mehler:Biber:2016,
      author    = {Wagner, Benno and Mehler, Alexander and Biber, Hanno},
      title     = {{Transbiblionome Daten in der Literaturwissenschaft. Texttechnologische
                   Erschließung und digitale Visualisierung intertextueller Beziehungen
                   digitaler Korpora}},
      booktitle = {DHd 2016},
      url       = {http://www.dhd2016.de/abstracts/sektionen-005.html#index.xml-body.1_div.4},
      year      = {2016}
    }
    Alexander Mehler, Rüdiger Gleim, Tim vor der Brück, Wahed Hemati, Tolga Uslu and Steffen Eger. 2016. Wikidition: Automatic Lexiconization and Linkification of Text Corpora. Information Technology, 58:70–79.
    BibTeX
    @article{Mehler:et:al:2016,
      author    = {Alexander Mehler and Rüdiger Gleim and Tim vor der Brück and Wahed Hemati
                   and Tolga Uslu and Steffen Eger},
      title     = {Wikidition: Automatic Lexiconization and Linkification of Text Corpora},
      journal   = {Information Technology},
      volume    = {58},
      pages     = {70--79},
      abstract  = {We introduce a new text technology, called Wikidition, which automatically
                   generates large scale editions of corpora of natural language
                   texts. Wikidition combines a wide range of text mining tools for
                   automatically linking lexical, sentential and textual units. This
                   includes the extraction of corpus-specific lexica down to the
                   level of syntactic words and their grammatical categories. To
                   this end, we introduce a novel measure of text reuse and exemplify
                   Wikidition by means of the capitularies, that is, a corpus of
                   Medieval Latin texts.},
      doi       = {10.1515/itit-2015-0035},
      year      = {2016}
    }
    Armin Hoenen, Alexander Mehler and Jost Gippert. 2016. Corpora and Resources for (Historical) Low Resource Languages. 31(2). JLCL.
    BibTeX
    @collection{GSCL:JLCL:2016:2,
      bibsource = {GSCL, http://www.gscl.info/},
      editor    = {Armin Hoenen and Alexander Mehler and Jost Gippert},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2017/10/Titelblatt-Heft2-2016.png},
      issn      = {2190-6858},
      number    = {2},
      pdf       = {http://www.jlcl.org/2016_Heft2/Heft2-2016.pdf},
      publisher = {JLCL},
      title     = {{Corpora and Resources for (Historical) Low Resource Languages}},
      volume    = {31},
      year      = {2016}
    }
    Armin Hoenen, Alexander Mehler and Jost Gippert. 2016. Editorial. JLCL, 31(2):iii–iv.
    BibTeX
    @article{Hoenen:Mehler:Gippert:2016,
      author    = {Armin Hoenen and Alexander Mehler and Jost Gippert},
      title     = {{Editorial}},
      journal   = {JLCL},
      volume    = {31},
      number    = {2},
      pages     = {iii--iv},
      pdf       = {http://www.jlcl.org/2016_Heft2/Heft2-2016.pdf},
      year      = {2016}
    }

    2015

    Chris Biemann and Alexander Mehler, eds. 2015. Text Mining: From Ontology Learning to Automated Text Processing Applications. Festschrift in Honor of Gerhard Heyer. Theory and Applications of Natural Language Processing. Springer.
    BibTeX
    @book{Biemann:Mehler:2015,
      editor    = {Biemann, Chris and Mehler, Alexander},
      title     = {{Text Mining: From Ontology Learning to Automated Text Processing
                   Applications. Festschrift in Honor of Gerhard Heyer}},
      publisher = {Springer},
      series    = {Theory and Applications of Natural Language Processing},
      address   = {Heidelberg},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/TextMiningsmall.jpg},
      year      = {2015}
    }
    Tim vor der Brück, Steffen Eger and Alexander Mehler. 2015. Complex Decomposition of the Negative Distance Kernel. IEEE International Conference on Machine Learning and Applications.
    BibTeX
    @inproceedings{vor:der:Bruck:Eger:Mehler:2015,
      author    = {vor der Br{\"u}ck, Tim and Eger, Steffen and Mehler, Alexander},
      title     = {Complex Decomposition of the Negative Distance Kernel},
      booktitle = {IEEE International Conference on Machine Learning and Applications},
      location  = {Miami, Florida, USA},
      year      = {2015}
    }
    Steffen Eger, Tim vor der Brück and Alexander Mehler. 2015. Lexicon-assisted tagging and lemmatization in Latin: A comparison of six taggers and two lemmatization methods. Proceedings of the 9th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH 2015).
    BibTeX
    @inproceedings{Eger:vor:der:Brueck:Mehler:2015,
      author    = {Eger, Steffen and vor der Brück, Tim and Mehler, Alexander},
      title     = {Lexicon-assisted tagging and lemmatization in {Latin}: A comparison
                   of six taggers and two lemmatization methods},
      booktitle = {Proceedings of the 9th Workshop on Language Technology for Cultural
                   Heritage, Social Sciences, and Humanities ({LaTeCH 2015})},
      address   = {Beijing, China},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/Lexicon-assisted_tagging.pdf},
      year      = {2015}
    }
    Alexander Mehler, Andy Lücking, Sven Banisch, Philippe Blanchard and Barbara Frank-Job, eds. 2015. Towards a Theoretical Framework for Analyzing Complex Linguistic Networks. Understanding Complex Systems. Springer.
    BibTeX
    @book{Mehler:Luecking:Banisch:Blanchard:Frank-Job:2015,
      editor    = {Mehler, Alexander and Lücking, Andy and Banisch, Sven and Blanchard, Philippe
                   and Frank-Job, Barbara},
      title     = {Towards a Theoretical Framework for Analyzing Complex Linguistic Networks},
      publisher = {Springer},
      series    = {Understanding Complex Systems},
      address   = {Berlin and New York},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/UCS_17-2-tmp.png},
      isbn      = {978-3-662-47237-8},
      year      = {2015}
    }
    Alexander Mehler and Rüdiger Gleim. 2015. Linguistic Networks – An Online Platform for Deriving Collocation Networks from Natural Language Texts. Towards a Theoretical Framework for Analyzing Complex Linguistic Networks.
    BibTeX
    @incollection{Mehler:Gleim:2015:a,
      author    = {Mehler, Alexander and Gleim, Rüdiger},
      title     = {Linguistic Networks -- An Online Platform for Deriving Collocation
                   Networks from Natural Language Texts},
      booktitle = {Towards a Theoretical Framework for Analyzing Complex Linguistic Networks},
      publisher = {Springer},
      editor    = {Mehler, Alexander and Lücking, Andy and Banisch, Sven and Blanchard, Philippe
                   and Frank-Job, Barbara},
      series    = {Understanding Complex Systems},
      year      = {2015}
    }
    Steffen Eger, Niko Schenk and Alexander Mehler. June, 2015. Towards Semantic Language Classification: Inducing and Clustering Semantic Association Networks from Europarl. Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics, 127–136.
    BibTeX
    @inproceedings{Eger:Schenk:Mehler:2015,
      author    = {Eger, Steffen and Schenk, Niko and Mehler, Alexander},
      title     = {Towards Semantic Language Classification: Inducing and Clustering
                   Semantic Association Networks from Europarl},
      booktitle = {Proceedings of the Fourth Joint Conference on Lexical and Computational
                   Semantics},
      pages     = {127--136},
      publisher = {Association for Computational Linguistics},
      month     = {June},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/starsem2015-corrected-version.pdf},
      url       = {http://www.aclweb.org/anthology/S15-1014},
      year      = {2015}
    }
    Alexander Mehler, Tim vor der Brück, Rüdiger Gleim and Tim Geelhaar. 2015. Towards a Network Model of the Coreness of Texts: An Experiment in Classifying Latin Texts using the TTLab Latin Tagger. Text Mining: From Ontology Learning to Automated Text Processing Applications, 87–112.
    BibTeX
    @incollection{Mehler:Brueck:Gleim:Geelhaar:2015,
      author    = {Mehler, Alexander and vor der Brück, Tim and Gleim, Rüdiger and Geelhaar, Tim},
      title     = {Towards a Network Model of the Coreness of Texts: An Experiment
                   in Classifying Latin Texts using the TTLab Latin Tagger},
      booktitle = {Text Mining: From Ontology Learning to Automated Text Processing Applications},
      publisher = {Springer},
      editor    = {Chris Biemann and Alexander Mehler},
      series    = {Theory and Applications of Natural Language Processing},
      pages     = {87--112},
      address   = {Berlin/New York},
      abstract  = {The analysis of longitudinal corpora of historical texts requires
                   the integrated development of tools for automatically preprocessing
                   these texts and for building representation models of their genre-
                   and register-related dynamics. In this chapter we present such
                   a joint endeavor that ranges from resource formation via preprocessing
                   to network-based text representation and classification. We start
                   with presenting the so-called TTLab Latin Tagger (TLT) that preprocesses
                   texts of classical and medieval Latin. Its lexical resource in
                   the form of the Frankfurt Latin Lexicon (FLL) is also briefly
                   introduced. As a first test case for showing the expressiveness
                   of these resources, we perform a tripartite classification task
                   of authorship attribution, genre detection and a combination thereof.
                   To this end, we introduce a novel text representation model that
                   explores the core structure (the so-called coreness) of lexical
                   network representations of texts. Our experiment shows the expressiveness
                   of this representation format and mediately of our Latin preprocessor.},
      website   = {http://link.springer.com/chapter/10.1007/978-3-319-12655-5_5},
      year      = {2015}
    }
    Rüdiger Gleim and Alexander Mehler. 2015. TTLab Preprocessor – Eine generische Web-Anwendung für die Vorverarbeitung von Texten und deren Evaluation. Proceedings of the Jahrestagung der Digital Humanities im deutschsprachigen Raum.
    BibTeX
    @inproceedings{Gleim:Mehler:2015,
      author    = {Gleim, Rüdiger and Mehler, Alexander},
      title     = {TTLab Preprocessor -- Eine generische Web-Anwendung für die Vorverarbeitung
                   von Texten und deren Evaluation},
      booktitle = {Proceedings of the Jahrestagung der Digital Humanities
                   im deutschsprachigen Raum},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/Gleim_Mehler_PrePro_DHGraz2015.pdf},
      year      = {2015}
    }
    Giuseppe Abrami, Alexander Mehler and Susanne Zeunert. 2015. Ontologiegestützte geisteswissenschaftliche Annotationen mit dem OWLnotator. Proceedings of the Jahrestagung der Digital Humanities im deutschsprachigen Raum.
    BibTeX
    @inproceedings{Abrami:Mehler:Zeunert:2015:a,
      author    = {Abrami, Giuseppe and Mehler, Alexander and Zeunert, Susanne},
      title     = {Ontologiegestützte geisteswissenschaftliche Annotationen mit dem OWLnotator},
      booktitle = {Proceedings of the Jahrestagung der Digital Humanities im deutschsprachigen Raum},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/Abrami_Mehler_Zeunert_DHd_2015_abstract.pdf},
      year      = {2015}
    }
    Giuseppe Abrami, Alexander Mehler and Dietmar Pravida. 2015. Fusing Text and Image Data with the Help of the OWLnotator. Human Interface and the Management of Information. Information and Knowledge Design, 9172:261–272.
    BibTeX
    @incollection{Abrami:Mehler:Pravida:2015:b,
      author    = {Abrami, Giuseppe and Mehler, Alexander and Pravida, Dietmar},
      title     = {Fusing Text and Image Data with the Help of the OWLnotator},
      booktitle = {Human Interface and the Management of Information. Information
                   and Knowledge Design},
      publisher = {Springer International Publishing},
      editor    = {Yamamoto, Sakae},
      volume    = {9172},
      series    = {Lecture Notes in Computer Science},
      pages     = {261-272},
      doi       = {10.1007/978-3-319-20612-7_25},
      isbn      = {978-3-319-20611-0},
      language  = {English},
      website   = {http://dx.doi.org/10.1007/978-3-319-20612-7_25},
      year      = {2015}
    }

    2014

    Giuseppe Abrami, Alexander Mehler, Dietmar Pravida and Susanne Zeunert. December, 2014. Rubrik: Neues aus dem Netz. Kunstchronik, 12:623.
    BibTeX
    @article{Abrami:Mehler:Pravida:Zeunert:2014,
      author    = {Abrami, Giuseppe and Mehler, Alexander and Pravida, Dietmar and Zeunert, Susanne},
      title     = {Rubrik: Neues aus dem Netz},
      journal   = {Kunstchronik},
      volume    = {12},
      pages     = {623},
      address   = {München},
      month     = {12},
      publisher = {Zentralinstitut für Kunstgeschichte},
      website   = {http://www.zikg.eu/publikationen/laufende-publikationen/kunstchronik},
      year      = {2014}
    }
    Alexander Mehler. 2014. On the Expressiveness, Validity and Reproducibility of Models of Language Evolution. Comment on 'Modelling language evolution: Examples and predictions' by Tao Gong, Lan Shuai, and Menghan Zhang. Physics of Life Reviews.
    BibTeX
    @article{Mehler:2014,
      author    = {Mehler, Alexander},
      title     = {On the Expressiveness, Validity and Reproducibility of Models
                   of Language Evolution. Comment on 'Modelling language evolution:
                   Examples and predictions' by Tao Gong, Lan Shuai, and Menghan
                   Zhang},
      journal   = {Physics of Life Reviews},
      pdf       = {http://www.sciencedirect.com/science/article/pii/S1571064514000529/pdfft?md5=6a2cbbfc083d7bc3adfd26d431cc55d8&pid=1-s2.0-S1571064514000529-main.pdf},
      website   = {https://www.researchgate.net/publication/261290946_On_the_expressiveness_validity_and_reproducibility_of_models_of_language_evolution_Comment_on_Modelling_language_evolution_Examples_and_predictions_by_Tao_Gong_Shuai_Lan_and_Menghan_Zhang},
      year      = {2014}
    }
    Chris Biemann, Gregory R. Crane, Christiane D. Fellbaum and Alexander Mehler. 2014. Computational Humanities - bridging the gap between Computer Science and Digital Humanities (Dagstuhl Seminar 14301). Dagstuhl Reports, 4(7):80–111.
    BibTeX
    @article{Biemann:Crane:Fellbaum:Mehler:2014,
      author    = {Chris Biemann and Gregory R. Crane and Christiane D. Fellbaum
                   and Alexander Mehler},
      title     = {Computational Humanities - bridging the gap between Computer Science
                   and Digital Humanities (Dagstuhl Seminar 14301)},
      journal   = {Dagstuhl Reports},
      volume    = {4},
      number    = {7},
      pages     = {80-111},
      abstract  = {Research in the field of Digital Humanities, also known as Humanities
                   Computing, has seen a steady increase over the past years. Situated
                   at the intersection of computing science and the humanities, present
                   efforts focus on making resources such as texts, images, musical
                   pieces and other semiotic artifacts digitally available, searchable
                   and analysable. To this end, computational tools enabling textual
                   search, visual analytics, data mining, statistics and natural
                   language processing are harnessed to support the humanities researcher.
                   The processing of large data sets with appropriate software opens
                   up novel and fruitful approaches to questions in the traditional
                   humanities. This report summarizes the Dagstuhl seminar 14301
                   on “Computational Humanities – bridging the gap between Computer
                   Science and Digital Humanities”},
      issn      = {2192-5283},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/dagrep_v004_i007_p080_s14301.pdf},
      publisher = {Schloss Dagstuhl--Leibniz-Zentrum für Informatik},
      year      = {2014}
    }
    Md. Zahurul Islam, Md. Rashedur Rahman and Alexander Mehler. 2014. Readability Classification of Bangla Texts. 15th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing), Kathmandu, Nepal.
    BibTeX
    @inproceedings{Islam:Rahman:Mehler:2014,
      author    = {Islam, Md. Zahurul and Rahman, Md. Rashedur and Mehler, Alexander},
      title     = {Readability Classification of Bangla Texts},
      booktitle = {15th International Conference on Intelligent Text Processing and
                   Computational Linguistics (CICLing), Kathmandu, Nepal},
      abstract  = {Readability classification is an important application of Natural
                   Language Processing. It aims at judging the quality of documents
                   and at assisting writers in identifying possible problems. This
                   paper presents a readability classifier for Bangla textbooks using
                   information-theoretic and lexical features. Altogether, 18 features
                   are explored to achieve an F-score of 86.46\%.},
      year      = {2014}
    }
    Alexander Mehler, Tim vor der Brück and Andy Lücking. 2014. Comparing Hand Gesture Vocabularies for HCI. Proceedings of HCI International 2014, 22 - 27 June 2014, Heraklion, Greece.
    BibTeX
    @incollection{Mehler:vor:der:Brueck:Luecking:2014,
      author    = {Mehler, Alexander and vor der Brück, Tim and Lücking, Andy},
      title     = {Comparing Hand Gesture Vocabularies for HCI},
      booktitle = {Proceedings of HCI International 2014, 22 - 27 June 2014, Heraklion, Greece},
      publisher = {Springer},
      address   = {Berlin/New York},
      abstract  = {HCI systems are often equipped with gestural interfaces drawing
                   on a predefined set of admitted gestures. We provide an assessment
                   of the fitness of such gesture vocabularies in terms of their
                   learnability and naturalness. This is done by example of rivaling
                   gesture vocabularies of the museum information system WikiNect.
                   In this way, we do not only provide a procedure for evaluating
                   gesture vocabularies, but additionally contribute to design criteria
                   to be followed by the gestures.},
      keywords  = {wikinect},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/Comparing-Gesture-Vocabularies-1_1.pdf},
      website   = {http://link.springer.com/chapter/10.1007/978-3-319-07230-2_8#page-1},
      year      = {2014}
    }
    Alexander Mehler, Andy Lücking and Giuseppe Abrami. 2014. WikiNect: Image Schemata as a Basis of Gestural Writing for Kinetic Museum Wikis. Universal Access in the Information Society, 1–17.
    BibTeX
    @article{Mehler:Luecking:Abrami:2014,
      author    = {Mehler, Alexander and Lücking, Andy and Abrami, Giuseppe},
      title     = {{WikiNect}: Image Schemata as a Basis of Gestural Writing for
                   Kinetic Museum Wikis},
      journal   = {Universal Access in the Information Society},
      pages     = {1-17},
      abstract  = {This paper provides a theoretical assessment of gestures in the
                   context of authoring image-related hypertexts by example of the
                   museum information system WikiNect. To this end, a first implementation
                   of gestural writing based on image schemata is provided (Lakoff
                   in Women, fire, and dangerous things: what categories reveal about
                   the mind. University of Chicago Press, Chicago, 1987). Gestural
                   writing is defined as a sort of coding in which propositions are
                   only expressed by means of gestures. In this respect, it is shown
                   that image schemata allow for bridging between natural language
                   predicates and gestural manifestations. Further, it is demonstrated
                   that gestural writing primarily focuses on the perceptual level
                   of image descriptions (Hollink et al. in Int J Hum Comput Stud
                   61(5):601–626, 2004). By exploring the metaphorical potential
                   of image schemata, it is finally illustrated how to extend the
                   expressiveness of gestural writing in order to reach the conceptual
                   level of image descriptions. In this context, the paper paves
                   the way for implementing museum information systems like WikiNect
                   as systems of kinetic hypertext authoring based on full-fledged
                   gestural writing.},
      doi       = {10.1007/s10209-014-0386-8},
      issn      = {1615-5289},
      keywords  = {wikinect},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/art_10.1007_s10209-014-0386-8.pdf},
      website   = {http://dx.doi.org/10.1007/s10209-014-0386-8},
      year      = {2014}
    }
    Tim vor der Brück, Alexander Mehler and Md. Zahurul Islam. 2014. ColLex.EN: Automatically Generating and Evaluating a Full-form Lexicon for English. Proceedings of LREC 2014.
    BibTeX
    @inproceedings{vor:der:Brueck:Mehler:Islam:2014,
      author    = {vor der Brück, Tim and Mehler, Alexander and Islam, Md. Zahurul},
      title     = {ColLex.EN: Automatically Generating and Evaluating a Full-form
                   Lexicon for English},
      booktitle = {Proceedings of LREC 2014},
      address   = {Reykjavik, Iceland},
      abstract  = {Currently, a large number of different lexica is available for
                   English. However, substantial and freely available fullform lexica
                   with a high number of named entities are rather rare even in the
                   case of this lingua franca. Existing lexica are often limited
                   in several respects as explained in Section 2. What is missing
                   so far is a freely available substantial machine-readable lexical
                   resource of English that contains a high number of word forms
                   and a large collection of named entities. In this paper, we describe
                   a procedure to generate such a resource by example of English.
                   This lexicon, henceforth called ColLex.EN (for Collecting Lexica
                   for English), will be made freely available to the public.
                   In this paper, we describe how ColLex.EN was collected from existing
                   lexical resources and specify the statistical procedures that
                   we developed to extend and adjust it. No manual modifications
                   were done on the generated word forms and lemmas. Our fully automatic
                   procedure has the advantage that whenever new versions of the
                   source lexica are available, a new version of ColLex.EN can be
                   automatically generated with low effort.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/vdbrueck_mehler_islam_collex_lrec.pdf},
      website   = {http://aclanthology.info/papers/collex-en-automatically-generating-and-evaluating-a-full-form-lexicon-for-english},
      year      = {2014}
    }

    2013

    Alexander Mehler, Roman Schneider and Angelika Storrer. 2013. Webkorpora in Computerlinguistik und Sprachforschung. Ed. by Roman Schneider, Angelika Storrer and Alexander Mehler. Journal for Language Technology and Computational Linguistics (JLCL), 28(2). JLCL.
    BibTeX
    @book{Schneider:Storrer:Mehler:2013,
      author    = {Mehler, Alexander and Schneider, Roman and Storrer, Angelika},
      editor    = {Roman Schneider and Angelika Storrer and Alexander Mehler},
      title     = {Webkorpora in Computerlinguistik und Sprachforschung},
      publisher = {JLCL},
      volume    = {28},
      number    = {2},
      series    = {Journal for Language Technology and Computational
                       Linguistics (JLCL)},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/Webkorpora-300-20.png},
      issn      = {2190-6858},
      pagetotal = {107},
      pdf       = {http://www.jlcl.org/2013_Heft2/H2013-2.pdf},
      year      = {2013}
    }
    Alexander Mehler, Andy Lücking, Tim vor der Brück and Giuseppe Abrami. November, 2013. WikiNect - A Kinetic Artwork Wiki for Exhibition Visitors.
    BibTeX
    @misc{Mehler:Luecking:vor:der:Brueck:2013:a,
      author    = {Mehler, Alexander and Lücking, Andy and vor der Brück, Tim and Abrami, Giuseppe},
      title     = {WikiNect - A Kinetic Artwork Wiki for Exhibition Visitors},
      howpublished = {Poster Presentation at the Scientific Computing and
                       Cultural Heritage 2013 Conference, Heidelberg},
      keywords  = {wikinect},
      month     = {11},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/SCCHPoster2013.pdf},
      url       = {http://scch2013.wordpress.com/},
      year      = {2013}
    }
    Andy Lücking and Alexander Mehler. 2013. On Three Notions of Grounding of Artificial Dialog Companions. Science, Technology & Innovation Studies, 10(1):31–36.
    BibTeX
    @article{Luecking:Mehler:2013:a,
      author    = {Lücking, Andy and Mehler, Alexander},
      title     = {On Three Notions of Grounding of Artificial Dialog Companions},
      journal   = {Science, Technology \& Innovation Studies},
      volume    = {10},
      number    = {1},
      pages     = {31-36},
      abstract  = {We provide a new, theoretically motivated evaluation grid for
                   assessing the conversational achievements of Artificial Dialog
                   Companions (ADCs). The grid is spanned along three grounding problems.
                   Firstly, it is argued that symbol grounding in general has to
                   be intrinsic. Current approaches in this context, however, are
                   limited to a certain kind of expression that can be grounded in
                   this way. Secondly, we identify three requirements for conversational
                   grounding, the process leading to mutual understanding. Finally,
                   we sketch a test case for symbol grounding in the form of the
                   philosophical grounding problem that involves the use of modal
                   language. Together, the three grounding problems provide a grid
                   that allows us to assess ADCs’ dialogical performances and to
                   pinpoint future developments on these grounds.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/STI-final-badge.pdf},
      website   = {http://www.sti-studies.de/ojs/index.php/sti/article/view/143},
      year      = {2013}
    }
    Barbara Frank-Job, Alexander Mehler and Tilmann Sutter, eds. 2013. Die Dynamik sozialer und sprachlicher Netzwerke: Konzepte, Methoden und empirische Untersuchungen an Beispielen des WWW. Springer VS.
    BibTeX
    @book{FrankJob:Mehler:Sutter:2013,
      editor    = {Barbara Frank-Job and Alexander Mehler and Tilmann Sutter},
      title     = {Die Dynamik sozialer und sprachlicher Netzwerke: Konzepte, Methoden
                   und empirische Untersuchungen an Beispielen des WWW},
      publisher = {Springer VS},
      address   = {Wiesbaden},
      abstract  = {In diesem Band pr{\"a}sentieren Medien- und Informationswissenschaftler,
                   Netzwerkforscher aus Informatik, Texttechnologie und Physik, Soziologen
                   und Linguisten interdisziplin{\"a}r Aspekte der Erforschung komplexer
                   Mehrebenen-Netzwerke. Im Zentrum ihres Interesses stehen Untersuchungen
                   zum Zusammenhang zwischen sozialen und sprachlichen Netzwerken
                   und ihrer Dynamiken, aufgezeigt an empirischen Beispielen aus
                   dem Bereich des Web 2.0, aber auch an historischen Dokumentenkorpora
                   sowie an Rezeptions-Netzwerken aus Kunst- und Literaturwissenschaft.},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/DieDynamikSozialerUndSprachlicherNetzwerke.jpg},
      pagetotal = {240},
      year      = {2013}
    }
    Md. Zahurul Islam and Alexander Mehler. 2013. Automatic Readability Classification of Crowd-Sourced Data based on Linguistic and Information-Theoretic Features. 14th International Conference on Intelligent Text Processing and Computational Linguistics.
    BibTeX
    @inproceedings{Islam:Mehler:2013:a,
      author    = {Islam, Md. Zahurul and Mehler, Alexander},
      title     = {Automatic Readability Classification of Crowd-Sourced Data based
                   on Linguistic and Information-Theoretic Features},
      booktitle = {14th International Conference on Intelligent Text Processing and
                   Computational Linguistics},
      abstract  = {This paper presents a classifier of text readability based on
                   information-theoretic features. The classifier was developed based
                   on a linguistic approach to readability that explores lexical,
                   syntactic and semantic features. For this evaluation we extracted
                   a corpus of 645 articles from Wikipedia together with their quality
                   judgments. We show that information-theoretic features perform
                   as well as their linguistic counterparts even if we explore several
                   linguistic levels at once.},
      pdf       = {http://www.cys.cic.ipn.mx/ojs/index.php/CyS/article/download/1516/1497},
      website   = {http://www.redalyc.org/articulo.oa?id=61527437002},
      year      = {2013}
    }
    Alexander Mehler, Christian Stegbauer and Rüdiger Gleim. 2013. Zur Struktur und Dynamik der kollaborativen Plagiatsdokumentation am Beispiel des GuttenPlag Wiki: eine Vorstudie. Die Dynamik sozialer und sprachlicher Netzwerke. Konzepte, Methoden und empirische Untersuchungen am Beispiel des WWW.
    BibTeX
    @incollection{Mehler:Stegbauer:Gleim:2013,
      author    = {Mehler, Alexander and Stegbauer, Christian and Gleim, Rüdiger},
      title     = {Zur Struktur und Dynamik der kollaborativen Plagiatsdokumentation
                   am Beispiel des GuttenPlag Wiki: eine Vorstudie},
      booktitle = {Die Dynamik sozialer und sprachlicher Netzwerke. Konzepte, Methoden
                   und empirische Untersuchungen am Beispiel des WWW},
      publisher = {VS Verlag},
      editor    = {Frank-Job, Barbara and Mehler, Alexander and Sutter, Tilman},
      address   = {Wiesbaden},
      year      = {2013}
    }
    Nicole Beckage, Michael S. Vitevitch, Alexander Mehler and Eliana Colunga. 2013. Using Complex Network Analysis in the Cognitive Sciences. Proceedings of the 35th Annual Meeting of the Cognitive Science Society, CogSci 2013, Berlin, Germany, July 31 - August 3, 2013.
    BibTeX
    @inproceedings{Beckage:et:al:2013,
      author    = {Nicole Beckage and Michael S. Vitevitch and Alexander Mehler and Eliana Colunga},
      title     = {Using Complex Network Analysis in the Cognitive Sciences},
      booktitle = {Proceedings of the 35th Annual Meeting of the Cognitive Science
                   Society, CogSci 2013, Berlin, Germany, July 31 - August 3, 2013},
      editor    = {Markus Knauff and Michael Pauen and Natalie Sebanz and Ipke Wachsmuth},
      publisher = {cognitivesciencesociety.org},
      year      = {2013}
    }

    2012

    Alexander Mehler and Laurent Romary. 2012. Handbook of Technical Communication. De Gruyter Mouton.
    BibTeX
    @book{Mehler:Romary:2012,
      author    = {Mehler, Alexander and Romary, Laurent},
      title     = {Handbook of Technical Communication},
      publisher = {De Gruyter Mouton},
      address   = {Berlin},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/HandbookTechnicalCommunication.jpg},
      pagetotal = {839},
      year      = {2012}
    }
    Alexander Mehler, Christian Stegbauer and Rüdiger Gleim. July, 2012. Latent Barriers in Wiki-based Collaborative Writing. Proceedings of the Wikipedia Academy: Research and Free Knowledge. June 29 - July 1 2012.
    BibTeX
    @inproceedings{Mehler:Stegbauer:Gleim:2012:b,
      author    = {Mehler, Alexander and Stegbauer, Christian and Gleim, Rüdiger},
      title     = {Latent Barriers in Wiki-based Collaborative Writing},
      booktitle = {Proceedings of the Wikipedia Academy: Research and Free Knowledge.
                   June 29 - July 1 2012},
      address   = {Berlin},
      month     = {July},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/12_Paper_Alexander_Mehler_Christian_Stegbauer_Ruediger_Gleim.pdf},
      year      = {2012}
    }
    Md. Zahurul Islam, Alexander Mehler and Rashedur Rahman. 2012. Text Readability Classification of Textbooks of a Low-Resource Language. Accepted in the 26th Pacific Asia Conference on Language, Information, and Computation (PACLIC 26).
    BibTeX
    @inproceedings{Islam:Mehler:Rahman:2012,
      author    = {Islam, Md. Zahurul and Mehler, Alexander and Rahman, Rashedur},
      title     = {Text Readability Classification of Textbooks of a Low-Resource Language},
      booktitle = {Accepted in the 26th Pacific Asia Conference on Language, Information,
                   and Computation (PACLIC 26)},
      abstract  = {There are many languages considered to be low-density languages,
                   either because the population speaking the language is not very
                   large, or because insufficient digitized text material is available
                   in the language even though millions of people speak the language.
                   Bangla is one of the latter ones. Readability classification is
                   an important Natural Language Processing (NLP) application that
                   can be used to judge the quality of documents and assist writers
                   to locate possible problems. This paper presents a readability
                   classifier of Bangla textbook documents based on information-theoretic
                   and lexical features. The features proposed in this paper result
                   in an F-score that is 50\% higher than that for traditional readability
                   formulas.},
      pdf       = {http://www.aclweb.org/anthology/Y12-1059},
      website   = {http://www.researchgate.net/publication/256648250_Text_Readability_Classification_of_Textbooks_of_a_Low-Resource_Language},
      year      = {2012}
    }
    Alexander Mehler, Laurent Romary and Dafydd Gibbon. 2012. Introduction: Framing Technical Communication. Handbook of Technical Communication, 8:1–26.
    BibTeX
    @incollection{Mehler:Romary:Gibbon:2012,
      author    = {Mehler, Alexander and Romary, Laurent and Gibbon, Dafydd},
      title     = {Introduction: Framing Technical Communication},
      booktitle = {Handbook of Technical Communication},
      publisher = {De Gruyter Mouton},
      editor    = {Alexander Mehler and Laurent Romary and Dafydd Gibbon},
      volume    = {8},
      series    = {Handbooks of Applied Linguistics},
      pages     = {1-26},
      address   = {Berlin and Boston},
      year      = {2012}
    }
    Alexander Mehler and Andy Lücking. 2012. Pathways of Alignment between Gesture and Speech: Assessing Information Transmission in Multimodal Ensembles. Proceedings of the International Workshop on Formal and Computational Approaches to Multimodal Communication under the auspices of ESSLLI 2012, Opole, Poland, 6-10 August.
    BibTeX
    @inproceedings{Mehler:Luecking:2012:d,
      author    = {Mehler, Alexander and Lücking, Andy},
      title     = {Pathways of Alignment between Gesture and Speech: Assessing Information
                   Transmission in Multimodal Ensembles},
      booktitle = {Proceedings of the International Workshop on Formal and Computational
                   Approaches to Multimodal Communication under the auspices of ESSLLI
                   2012, Opole, Poland, 6-10 August},
      editor    = {Gianluca Giorgolo and Katya Alahverdzhieva},
      abstract  = {We present an empirical account of multimodal ensembles based
                   on Hjelmslev’s notion of selection. This is done to get measurable
                   evidence for the existence of speech-and-gesture ensembles. Utilizing
                   information theory, we show that there is an information transmission
                   that makes a gesture’s representation technique predictable when
                   merely knowing its lexical affiliate – in line with the notion
                   of the primacy of language. Thus, there is evidence for a one-way
                   coupling – going from words to gestures – that leads to speech-and-gesture
                   alignment and underlies the constitution of multimodal ensembles.},
      keywords  = {wikinect},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2016/06/Mehler_Luecking_FoCoMC2012-2.pdf},
      website   = {http://www.researchgate.net/publication/268368670_Pathways_of_Alignment_between_Gesture_and_Speech_Assessing_Information_Transmission_in_Multimodal_Ensembles},
      year      = {2012}
    }
    Alexander Mehler and Andy Lücking. 2012. WikiNect: Towards a Gestural Writing System for Kinetic Museum Wikis. Proceedings of the International Workshop On User Experience in e-Learning and Augmented Technologies in Education (UXeLATE 2012) in Conjunction with ACM Multimedia 2012, 29 October- 2 November, Nara, Japan, 7–12.
    BibTeX
    @inproceedings{Mehler:Luecking:2012:c,
      author    = {Mehler, Alexander and Lücking, Andy},
      title     = {WikiNect: Towards a Gestural Writing System for Kinetic Museum Wikis},
      booktitle = {Proceedings of the International Workshop On User Experience in
                   e-Learning and Augmented Technologies in Education (UXeLATE 2012)
                   in Conjunction with ACM Multimedia 2012, 29 October- 2 November,
                   Nara, Japan},
      pages     = {7-12},
      abstract  = {We introduce WikiNect as a kinetic museum information system that
                   allows museum visitors to give on-site feedback about exhibitions.
                   To this end, WikiNect integrates three approaches to Human-Computer
                   Interaction (HCI): games with a purpose, wiki-based collaborative
                   writing and kinetic text-technologies. Our aim is to develop kinetic
                   technologies as a new paradigm of HCI. They dispense with classical
                   interfaces (e.g., keyboards) in that they build on non-contact
                   modes of communication like gestures or facial expressions as
                   input displays. In this paper, we introduce the notion of gestural
                   writing as a kinetic text-technology that underlies WikiNect to
                   enable museum visitors to communicate their feedback. The basic
                   idea is to explore sequences of gestures that share the semantic
                   expressivity of verbally manifested speech acts. Our task is to
                   identify such gestures that are learnable on-site in the usage
                   scenario of WikiNect. This is done by referring to so-called transient
                   gestures as part of multimodal ensembles, which are candidate
                   gestures of the desired functionality.},
      keywords  = {wikinect},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/UXeLATE2012-copyright.pdf},
      website   = {http://www.researchgate.net/publication/262319200_WikiNect_towards_a_gestural_writing_system_for_kinetic_museum_wikis},
      year      = {2012}
    }
    Rüdiger Gleim, Alexander Mehler and Alexandra Ernst. 2012. SOA implementation of the eHumanities Desktop. Proceedings of the Workshop on Service-oriented Architectures (SOAs) for the Humanities: Solutions and Impacts, Digital Humanities 2012, Hamburg, Germany.
    BibTeX
    @inproceedings{Gleim:Mehler:Ernst:2012,
      author    = {Gleim, Rüdiger and Mehler, Alexander and Ernst, Alexandra},
      title     = {SOA implementation of the eHumanities Desktop},
      booktitle = {Proceedings of the Workshop on Service-oriented Architectures
                   (SOAs) for the Humanities: Solutions and Impacts, Digital Humanities
                   2012, Hamburg, Germany},
      abstract  = {The eHumanities Desktop is a system which allows users to upload,
                   organize and share resources using a web interface. Furthermore
                   resources can be processed, annotated and analyzed in various
                   ways. Registered users can organize themselves in groups and collaboratively
                   work on their data. The eHumanities Desktop is platform independent
                   and runs in a web browser. This paper presents the system focusing
                   on its service orientation and process management.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/dhc2012.pdf},
      year      = {2012}
    }
    Alexander Mehler and Christian Stegbauer. 2012. On the Self-similarity of Intertextual Structures in Wikipedia. Proceedings of the HotSocial '12: The First ACM International Workshop on Hot Topics on Interdisciplinary Social Networks Research, 65–68.
    BibTeX
    @inproceedings{Mehler:Stegbauer:2012,
      author    = {Mehler, Alexander and Stegbauer, Christian},
      title     = {On the Self-similarity of Intertextual Structures in Wikipedia},
      booktitle = {Proceedings of the HotSocial '12: The First ACM International
                   Workshop on Hot Topics on Interdisciplinary Social Networks Research},
      editor    = {Xiaoming Fu and Peter Gloor and Jie Tang},
      pages     = {65-68},
      address   = {Beijing, China},
      pdf       = {http://wan.poly.edu/KDD2012/forms/workshop/HotSocial12/doc/p64_mehler.pdf},
      website   = {http://dl.acm.org/citation.cfm?id=2392633&bnc=1},
      year      = {2012}
    }
    Alexander Mehler, Silke Schwandt, Rüdiger Gleim and Alexandra Ernst. 2012. Inducing Linguistic Networks from Historical Corpora: Towards a New Method in Historical Semantics. Proceedings of the Conference on New Methods in Historical Corpora, 3:257–274.
    BibTeX
    @incollection{Mehler:Schwandt:Gleim:Ernst:2012,
      author    = {Mehler, Alexander and Schwandt, Silke and Gleim, Rüdiger and Ernst, Alexandra},
      title     = {Inducing Linguistic Networks from Historical Corpora: Towards
                   a New Method in Historical Semantics},
      booktitle = {Proceedings of the Conference on New Methods in Historical Corpora},
      publisher = {Narr},
      editor    = {Paul Bennett and Martin Durrell and Silke Scheible and Richard J. Whitt},
      volume    = {3},
      series    = {Corpus Linguistics and Interdisciplinary Perspectives
                   on Language (CLIP)},
      pages     = {257--274},
      address   = {Tübingen},
      year      = {2012}
    }
    Alexander Mehler, Andy Lücking and Peter Menke. 2012. Assessing Cognitive Alignment in Different Types of Dialog by means of a Network Model. Neural Networks, 32:159–164.
    BibTeX
    @article{Mehler:Luecking:Menke:2012,
      author    = {Mehler, Alexander and Lücking, Andy and Menke, Peter},
      title     = {Assessing Cognitive Alignment in Different Types of Dialog by
                   means of a Network Model},
      journal   = {Neural Networks},
      volume    = {32},
      pages     = {159--164},
      abstract  = {We present a network model of dialog lexica, called TiTAN (Two-layer
                   Time-Aligned Network) series. TiTAN series capture the formation
                   and structure of dialog lexica in terms of serialized graph representations.
                   The dynamic update of TiTAN series is driven by the dialog-inherent
                   timing of turn-taking. The model provides a link between neural,
                   connectionist underpinnings of dialog lexica on the one hand and
                   observable symbolic behavior on the other. On the neural side,
                   priming and spreading activation are modeled in terms of TiTAN
                   networking. On the symbolic side, TiTAN series account for cognitive
                   alignment in terms of the structural coupling of the linguistic
                   representations of dialog partners. This structural stance allows
                   us to apply TiTAN in machine learning of data of dialogical alignment.
                   In previous studies, it has been shown that aligned dialogs can
                   be distinguished from non-aligned ones by means of TiTAN -based
                   modeling. Now, we simultaneously apply this model to two types
                   of dialog: task-oriented, experimentally controlled dialogs on
                   the one hand and more spontaneous, direction giving dialogs on
                   the other. We ask whether it is possible to separate aligned dialogs
                   from non-aligned ones in a type-crossing way. Starting from a
                   recent experiment (Mehler, Lücking, \& Menke, 2011a), we show
                   that such a type-crossing classification is indeed possible. This
                   hints at a structural fingerprint left by alignment in networks
                   of linguistic items that are routinely co-activated during conversation.},
      doi       = {10.1016/j.neunet.2012.02.013},
      website   = {http://www.sciencedirect.com/science/article/pii/S0893608012000421},
      year      = {2012}
    }
    Md. Zahurul Islam and Alexander Mehler. 2012. Customization of the Europarl Corpus for Translation Studies. Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC).
    BibTeX
    @inproceedings{Islam:Mehler:2012:a,
      author    = {Islam, Md. Zahurul and Mehler, Alexander},
      title     = {Customization of the Europarl Corpus for Translation Studies},
      booktitle = {Proceedings of the 8th International Conference on Language Resources
                   and Evaluation (LREC)},
      abstract  = {Currently, the area of translation studies lacks corpora by which
                   translation scholars can validate their theoretical claims, for
                   example, regarding the scope of the characteristics of the translation
                   relation. In this paper, we describe a customized resource in
                   the area of translation studies that mainly addresses research
                   on the properties of the translation relation. Our experimental
                   results show that the Type-Token-Ratio (TTR) is not a universally
                   valid indicator of the simplification of translation.},
      pdf       = {http://www.lrec-conf.org/proceedings/lrec2012/pdf/729_Paper.pdf},
      year      = {2012}
    }
    Andy Lücking and Alexander Mehler. 2012. What's the Scope of the Naming Game? Constraints on Semantic Categorization. Proceedings of the 9th International Conference on the Evolution of Language, 196–203.
    BibTeX
    @inproceedings{Luecking:Mehler:2012,
      author    = {Lücking, Andy and Mehler, Alexander},
      title     = {What's the Scope of the Naming Game? Constraints on Semantic Categorization},
      booktitle = {Proceedings of the 9th International Conference on the Evolution of Language},
      pages     = {196--203},
      address   = {Kyoto, Japan},
      abstract  = {The Naming Game (NG) has become a vivid research paradigm for
                   simulation studies on language evolution and the establishment
                   of naming conventions. Recently, NGs were used for reconstructing
                   the creation of linguistic categories, most notably for color
                   terms. We recap the functional principle of NGs and the latter
                   Categorization Games (CGs) and evaluate them in the light of semantic
                   data of linguistic categorization outside the domain of colors.
                   This comparison reveals two specifics of the CG paradigm: Firstly,
                   the emerging categories draw basically on the predefined topology
                   of the learning domain. Secondly, the kind of categories that
                   can be learnt in CGs is bound to context-independent intersective
                   categories. This suggests that the NG and the CG focus on a special
                   aspect of natural language categorization, which disregards context-sensitive
                   categories used in a non-compositional manner.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/Evolang2012-AL_AM.pdf},
      url       = {http://kyoto.evolang.org/},
      website   = {https://www.researchgate.net/publication/267858061_WHAT'S_THE_SCOPE_OF_THE_NAMING_GAME_CONSTRAINTS_ON_SEMANTIC_CATEGORIZATION},
      year      = {2012}
    }
    Maria Sukhareva, Md. Zahurul Islam, Armin Hoenen and Alexander Mehler. 2012. A Three-step Model of Language Detection in Multilingual Ancient Texts. Proceedings of Workshop on Annotation of Corpora for Research in the Humanities.
    BibTeX
    @inproceedings{Sukhareva:Islam:Hoenen:Mehler:2012,
      author    = {Sukhareva, Maria and Islam, Md. Zahurul and Hoenen, Armin and Mehler, Alexander},
      title     = {A Three-step Model of Language Detection in Multilingual Ancient Texts},
      booktitle = {Proceedings of Workshop on Annotation of Corpora for Research in the Humanities},
      address   = {Heidelberg, Germany},
      abstract  = {Ancient corpora contain various multilingual patterns. This imposes
                   numerous problems on their manual annotation and automatic processing.
                   We introduce a lexicon building system, called Lexicon Expander,
                   that has an integrated language detection module, Language Detection
                   (LD) Toolkit. The Lexicon Expander post-processes the output of
                   the LD Toolkit which leads to the improvement of f-score and accuracy
                   values. Furthermore, the functionality of the Lexicon Expander
                   also includes manual editing of lexical entries and automatic
                   morphological expansion by means of a morphological grammar.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/sukhareva_islam_hoenen_mehler_2011.pdf},
      website   = {https://www.academia.edu/2236625/A_Three-step_Model_of_Language_Detection_in_Multilingual_Ancient_Texts},
      year      = {2012}
    }

    2011

    Andy Lücking and Alexander Mehler. 2011. A Model of Complexity Levels of Meaning Constitution in Simulation Models of Language Evolution. International Journal of Signs and Semiotic Systems, 1(1):18–38.
    BibTeX
    @article{Luecking:Mehler:2011,
      author    = {Lücking, Andy and Mehler, Alexander},
      title     = {A Model of Complexity Levels of Meaning Constitution in Simulation
                   Models of Language Evolution},
      journal   = {International Journal of Signs and Semiotic Systems},
      volume    = {1},
      number    = {1},
      pages     = {18--38},
      abstract  = {Currently, some simulative accounts exist within dynamic or evolutionary
                   frameworks that are concerned with the development of linguistic
                   categories within a population of language users. Although these
                   studies mostly emphasize that their models are abstract, the paradigm
                   categorization domain is preferably that of colors. In this paper,
                   the authors argue that color adjectives are special predicates
                   in both linguistic and metaphysical terms: semantically, they
                   are intersective predicates, metaphysically, color properties
                   can be empirically reduced onto purely physical properties. The
                   restriction of categorization simulations to the color paradigm
                   systematically leads to ignoring two ubiquitous features of natural
                   language predicates, namely relativity and context-dependency.
                   Therefore, the models for simulation models of linguistic categories
                   are not able to capture the formation of categories like perspective-dependent
                   predicates ‘left’ and ‘right’, subsective predicates like ‘small’
                   and ‘big’, or predicates that make reference to abstract objects
                   like ‘I prefer this kind of situation’. The authors develop a
                   three-dimensional grid of ascending complexity that is partitioned
                   according to the semiotic triangle. They also develop a conceptual
                   model in the form of a decision grid by means of which the complexity
                   level of simulation models of linguistic categorization can be
                   assessed in linguistic terms.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/luecking_mehler_article_IJSSS.pdf},
      year      = {2011}
    }
    Alexander Mehler, Olga Abramov and Nils Diewald. 2011. Geography of Social Ontologies: Testing a Variant of the Sapir-Whorf Hypothesis in the Context of Wikipedia. Computer Speech and Language, 25(3):716–740.
    BibTeX
    @article{Mehler:Abramov:Diewald:2011:a,
      author    = {Mehler, Alexander and Abramov, Olga and Diewald, Nils},
      title     = {Geography of Social Ontologies: Testing a Variant of the Sapir-Whorf
                   Hypothesis in the Context of Wikipedia},
      journal   = {Computer Speech and Language},
      volume    = {25},
      number    = {3},
      pages     = {716--740},
      abstract  = {In this article, we test a variant of the Sapir-Whorf Hypothesis
                   in the area of complex network theory. This is done by analyzing
                   social ontologies as a new resource for automatic language classification.
                   Our method is to solely explore structural features of social
                   ontologies in order to predict family resemblances of languages
                   used by the corresponding communities to build these ontologies.
                   This approach is based on a reformulation of the Sapir-Whorf Hypothesis
                   in terms of distributed cognition. Starting from a corpus of 160
                   Wikipedia-based social ontologies, we test our variant of the
                   Sapir-Whorf Hypothesis by several experiments, and find out that
                   we outperform the corresponding baselines. All in all, the article
                   develops an approach to classify linguistic networks of tens of
                   thousands of vertices by exploring a small range of mathematically
                   well-established topological indices.},
      doi       = {10.1016/j.csl.2010.05.006},
      website   = {http://www.sciencedirect.com/science/article/pii/S0885230810000434},
      year      = {2011}
    }
    Alexander Mehler. 2011. Social Ontologies as Generalized Nearly Acyclic Directed Graphs: A Quantitative Graph Model of Social Ontologies by Example of Wikipedia. Towards an Information Theory of Complex Networks: Statistical Methods and Applications, 259–319.
    BibTeX
    @incollection{Mehler:2011:c,
      author    = {Mehler, Alexander},
      title     = {Social Ontologies as Generalized Nearly Acyclic Directed Graphs:
                   A Quantitative Graph Model of Social Ontologies by Example of
                   Wikipedia},
      booktitle = {Towards an Information Theory of Complex Networks: Statistical
                   Methods and Applications},
      publisher = {Birkh{\"a}user},
      editor    = {Dehmer, Matthias and Emmert-Streib, Frank and Mehler, Alexander},
      pages     = {259--319},
      address   = {Boston/Basel},
      year      = {2011}
    }
    Alexander Mehler and Andy Lücking. September, 2011. A Graph Model of Alignment in Multilog. Proceedings of IEEE Africon 2011.
    BibTeX
    @inproceedings{Mehler:Luecking:2011,
      author    = {Mehler, Alexander and Lücking, Andy},
      title     = {A Graph Model of Alignment in Multilog},
      booktitle = {Proceedings of IEEE Africon 2011},
      series    = {IEEE Africon},
      address   = {Zambia},
      organization = {IEEE},
      month     = {September},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/africon2011-paper-Alexander_Mehler_Andy_Luecking.pdf},
      website   = {https://www.researchgate.net/publication/267941012_A_Graph_Model_of_Alignment_in_Multilog},
      year      = {2011}
    }
    Christian Stegbauer and Alexander Mehler. 2011. Positionssensitive Dekomposition von Potenzgesetzen am Beispiel von Wikipedia-basierten Kollaborationsnetzwerken. Proceedings of the 4th Workshop Digital Social Networks at INFORMATIK 2011: Informatik schafft Communities, Oct 4-7, 2011, Berlin.
    BibTeX
    @inproceedings{Stegbauer:Mehler:2011,
      author    = {Stegbauer, Christian and Mehler, Alexander},
      title     = {Positionssensitive Dekomposition von Potenzgesetzen am Beispiel
                   von Wikipedia-basierten Kollaborationsnetzwerken},
      booktitle = {Proceedings of the 4th Workshop Digital Social Networks at INFORMATIK
                   2011: Informatik schafft Communities, Oct 4-7, 2011, Berlin},
      pdf       = {http://www.user.tu-berlin.de/komm/CD/paper/090423.pdf},
      specialnote = {Best Paper Award},
      specialnotewebsite = {http://www.digitale-soziale-netze.de/gi-workshop/index.php?site=review2011},
      year      = {2011}
    }
    Mathias Lösch, Ulli Waltinger, Wolfram Horstmann and Alexander Mehler. 2011. Building a DDC-annotated Corpus from OAI Metadata. Journal of Digital Information, 12(2).
    BibTeX
    @article{Loesch:Waltinger:Horstmann:Mehler:2011,
      author    = {Lösch, Mathias and Waltinger, Ulli and Horstmann, Wolfram and Mehler, Alexander},
      title     = {Building a DDC-annotated Corpus from OAI Metadata},
      journal   = {Journal of Digital Information},
      volume    = {12},
      number    = {2},
      pdf       = {https://journals.tdl.org/jodi/index.php/jodi/article/download/1765/1767},
      website   = {http://journals.tdl.org/jodi/article/view/1765},
      year      = {2011}
    }
    Markus Lux, Jan Laußmann, Alexander Mehler and Christian Menßen. 2011. An Online Platform for Visualizing Time Series in Linguistic Networks. Proceedings of the Demonstrations Session of the 2011 IEEE / WIC / ACM International Conferences on Web Intelligence and Intelligent Agent Technology, 22 - 27 August 2011, Lyon, France.
    BibTeX
    @inproceedings{Lux:Laussmann:Mehler:Menssen:2011,
      author    = {Lux, Markus and Lau{\ss}mann, Jan and Mehler, Alexander and Men{\ss}en, Christian},
      title     = {An Online Platform for Visualizing Time Series in Linguistic Networks},
      booktitle = {Proceedings of the Demonstrations Session of the 2011 IEEE / WIC
                   / ACM International Conferences on Web Intelligence and Intelligent
                   Agent Technology, 22 - 27 August 2011, Lyon, France},
      poster    = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/wi-iat-poster-2011.pdf},
      website   = {http://dl.acm.org/citation.cfm?id=2052396},
      year      = {2011}
    }
    Alexander Mehler, Nils Diewald, Ulli Waltinger, Rüdiger Gleim, Dietmar Esch, Barbara Job, Thomas Küchelmann, Olga Abramov and Philippe Blanchard. 2011. Evolution of Romance Language in Written Communication: Network Analysis of Late Latin and Early Romance Corpora. Leonardo, 44(3).
    BibTeX
    @article{Mehler:Diewald:Waltinger:et:al:2010,
      author    = {Mehler, Alexander and Diewald, Nils and Waltinger, Ulli and Gleim, Rüdiger
                   and Esch, Dietmar and Job, Barbara and Küchelmann, Thomas and Abramov, Olga
                   and Blanchard, Philippe},
      title     = {Evolution of Romance Language in Written Communication: Network
                   Analysis of Late Latin and Early Romance Corpora},
      journal   = {Leonardo},
      volume    = {44},
      number    = {3},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_diewald_waltinger_gleim_esch_job_kuechelmann_pustylnikov_blanchard_2010.pdf},
      publisher = {MIT Press},
      year      = {2011}
    }
    Alexander Mehler, Andy Lücking and Peter Menke. 2011. From Neural Activation to Symbolic Alignment: A Network-Based Approach to the Formation of Dialogue Lexica. Proceedings of the International Joint Conference on Neural Networks (IJCNN 2011), San Jose, California, July 31 – August 5.
    BibTeX
    @inproceedings{Mehler:Luecking:Menke:2011,
      author    = {Mehler, Alexander and Lücking, Andy and Menke, Peter},
      title     = {From Neural Activation to Symbolic Alignment: A Network-Based
                   Approach to the Formation of Dialogue Lexica},
      booktitle = {Proceedings of the International Joint Conference on Neural Networks
                   (IJCNN 2011), San Jose, California, July 31 -- August 5},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/neural-align-final.pdf},
      website   = {http://dx.doi.org/10.1109/IJCNN.2011.6033266},
      year      = {2011}
    }
    Andy Lücking, Olga Abramov, Alexander Mehler and Peter Menke. 2011. The Bielefeld Jigsaw Map Game (JMG) Corpus. Abstracts of the Corpus Linguistics Conference 2011.
    BibTeX
    @inproceedings{Luecking:Abramov:Mehler:Menke:2011,
      author    = {Lücking, Andy and Abramov, Olga and Mehler, Alexander and Menke, Peter},
      title     = {The Bielefeld Jigsaw Map Game (JMG) Corpus},
      booktitle = {Abstracts of the Corpus Linguistics Conference 2011},
      series    = {CL2011},
      address   = {Birmingham},
      pdf       = {http://www.birmingham.ac.uk/documents/college-artslaw/corpus/conference-archives/2011/Paper-137.pdf},
      website   = {http://www.birmingham.ac.uk/research/activity/corpus/publications/conference-archives/2011-birmingham.aspx},
      year      = {2011}
    }
    Rüdiger Gleim, Armin Hoenen, Nils Diewald, Alexander Mehler and Alexandra Ernst. 2011. Modeling, Building and Maintaining Lexica for Corpus Linguistic Studies by Example of Late Latin. Corpus Linguistics 2011, 20-22 July, Birmingham.
    BibTeX
    @inproceedings{Gleim:Hoenen:Diewald:Mehler:Ernst:2011,
      author    = {Gleim, Rüdiger and Hoenen, Armin and Diewald, Nils and Mehler, Alexander
                   and Ernst, Alexandra},
      title     = {Modeling, Building and Maintaining Lexica for Corpus Linguistic
                   Studies by Example of Late Latin},
      booktitle = {Corpus Linguistics 2011, 20-22 July, Birmingham},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/Paper-48.pdf},
      year      = {2011}
    }
    Peter Menke and Alexander Mehler. 2011. From experiments to corpora: The Ariadne Corpus Management System. Corpus Linguistics 2011, 20-22 July, Birmingham.
    BibTeX
    @inproceedings{Menke:Mehler:2011,
      author    = {Menke, Peter and Mehler, Alexander},
      title     = {From experiments to corpora: The Ariadne Corpus Management System},
      booktitle = {Corpus Linguistics 2011, 20-22 July, Birmingham},
      website   = {https://www.researchgate.net/publication/260186214_From_Experiments_to_Corpora_The_Ariadne_Corpus_Management_System},
      year      = {2011}
    }
    Matthias Dehmer, Frank Emmert-Streib and Alexander Mehler, eds. 2011. Towards an Information Theory of Complex Networks: Statistical Methods and Applications. Birkhäuser.
    BibTeX
    @book{Dehmer:EmmertStreib:Mehler:2009:a,
      editor    = {Dehmer, Matthias and Emmert-Streib, Frank and Mehler, Alexander},
      title     = {Towards an Information Theory of Complex Networks: Statistical
                   Methods and Applications},
      publisher = {Birkh{\"a}user},
      address   = {Boston/Basel},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/InformationTheoryComplexNetworks.jpg},
      pagetotal = {395},
      website   = {http://link.springer.com/book/10.1007/978-0-8176-4904-3/page/1},
      year      = {2011}
    }
    Alexander Mehler, Andy Lücking and Peter Menke. 2011. Assessing Lexical Alignment in Spontaneous Direction Dialogue Data by Means of a Lexicon Network Model. Proceedings of the 12th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing), February 20–26, Tokyo, 368–379.
    BibTeX
    @inproceedings{Mehler:Luecking:Menke:2011:a,
      author    = {Mehler, Alexander and Lücking, Andy and Menke, Peter},
      title     = {Assessing Lexical Alignment in Spontaneous Direction Dialogue
                   Data by Means of a Lexicon Network Model},
      booktitle = {Proceedings of the 12th International Conference on Intelligent
                   Text Processing and Computational Linguistics (CICLing), February
                   20--26, Tokyo},
      series    = {CICLing'11},
      pages     = {368--379},
      address   = {Berlin/New York},
      publisher = {Springer},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/titan-cicling-camera-ready.pdf},
      website   = {http://www.springerlink.com/content/g7p2250025u20010/},
      year      = {2011}
    }
    Peter Geibel, Alexander Mehler and Kai-Uwe Kühnberger. 2011. Learning Methods for Graph Models of Document Structure. Modeling, Learning and Processing of Text Technological Data Structures.
    BibTeX
    @incollection{Geibel:Mehler:Kuehnberger:2011:a,
      author    = {Geibel, Peter and Mehler, Alexander and Kühnberger, Kai-Uwe},
      title     = {Learning Methods for Graph Models of Document Structure},
      booktitle = {Modeling, Learning and Processing of Text Technological Data Structures},
      publisher = {Springer},
      editor    = {Mehler, Alexander and Kühnberger, Kai-Uwe and Lobin, Henning and Lüngen, Harald
                   and Storrer, Angelika and Witt, Andreas},
      series    = {Studies in Computational Intelligence},
      address   = {Berlin/New York},
      website   = {http://www.springerlink.com/content/p095331472h76v56/},
      year      = {2011}
    }
    Alexander Mehler and Ulli Waltinger. 2011. Integrating Content and Structure Learning: A Model of Hypertext Zoning and Sounding. Modeling, Learning and Processing of Text Technological Data Structures.
    BibTeX
    @incollection{Mehler:Waltinger:2011:a,
      author    = {Mehler, Alexander and Waltinger, Ulli},
      title     = {Integrating Content and Structure Learning: A Model of Hypertext
                   Zoning and Sounding},
      booktitle = {Modeling, Learning and Processing of Text Technological Data Structures},
      publisher = {Springer},
      editor    = {Mehler, Alexander and Kühnberger, Kai-Uwe and Lobin, Henning and Lüngen, Harald
                   and Storrer, Angelika and Witt, Andreas},
      series    = {Studies in Computational Intelligence},
      address   = {Berlin/New York},
      website   = {http://rd.springer.com/chapter/10.1007/978-3-642-22613-7_15},
      year      = {2011}
    }
    Olga Abramov and Alexander Mehler. 2011. Automatic Language Classification by Means of Syntactic Dependency Networks. Journal of Quantitative Linguistics, 18(4):291–336.
    BibTeX
    @article{Abramov:Mehler:2011:a,
      author    = {Abramov, Olga and Mehler, Alexander},
      title     = {Automatic Language Classification by Means of Syntactic Dependency Networks},
      journal   = {Journal of Quantitative Linguistics},
      volume    = {18},
      number    = {4},
      pages     = {291--336},
      abstract  = {This article presents an approach to automatic language classification
                   by means of linguistic networks. Networks of 11 languages were
                   constructed from dependency treebanks, and the topology of these
                   networks serves as input to the classification algorithm. The
                   results match the genealogical similarities of these languages.
                   In addition, we test two alternative approaches to automatic language
                   classification – one based on n-grams and the other on quantitative
                   typological indices. All three methods show good results in identifying
                   genealogical groups. Beyond genetic similarities, network features
                   (and feature combinations) offer a new source of typological information
                   about languages. This information can contribute to a better understanding
                   of the interplay of single linguistic phenomena observed in language.},
      website   = {http://www.researchgate.net/publication/220469321_Automatic_Language_Classification_by_means_of_Syntactic_Dependency_Networks},
      year      = {2011}
    }
    Alexander Mehler, Kai-Uwe Kühnberger, Henning Lobin, Harald Lüngen, Angelika Storrer and Andreas Witt, eds. 2011. Modeling, Learning and Processing of Text Technological Data Structures. Studies in Computational Intelligence. Springer.
    BibTeX
    @book{Mehler:Kuehnberger:Lobin:Luengen:Storrer:Witt:2011,
      editor    = {Mehler, Alexander and Kühnberger, Kai-Uwe and Lobin, Henning and Lüngen, Harald
                   and Storrer, Angelika and Witt, Andreas},
      title     = {Modeling, Learning and Processing of Text Technological Data Structures},
      publisher = {Springer},
      series    = {Studies in Computational Intelligence},
      address   = {Berlin/New York},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/modelingLearningAndProcessing-medium.jpg},
      pagetotal = {400},
      website   = {/books/texttechnologybook/},
      year      = {2011}
    }
    Ulli Waltinger, Alexander Mehler, Mathias Lösch and Wolfram Horstmann. 2011. Hierarchical Classification of OAI Metadata Using the DDC Taxonomy. Advanced Language Technologies for Digital Libraries (ALT4DL), 29–40.
    BibTeX
    @incollection{Waltinger:Mehler:Loesch:Horstmann:2011,
      author    = {Waltinger, Ulli and Mehler, Alexander and Lösch, Mathias and Horstmann, Wolfram},
      title     = {Hierarchical Classification of OAI Metadata Using the DDC Taxonomy},
      booktitle = {Advanced Language Technologies for Digital Libraries (ALT4DL)},
      publisher = {Springer},
      editor    = {Raffaella Bernardi and Sally Chambers and Bjoern Gottfried and Frederique Segond
                   and Ilya Zaihrayeu},
      series    = {LNCS},
      pages     = {29--40},
      address   = {Berlin},
      abstract  = {In the area of digital library services, the access to subject-specific
                   metadata of scholarly publications is of utmost interest. One
                   of the most prevalent approaches for metadata exchange is the
                   XML-based Open Archive Initiative (OAI) Protocol for Metadata
                   Harvesting (OAI-PMH). However, due to its loose requirements regarding
                   metadata content there is no strict standard for consistent subject
                   indexing specified, which is furthermore needed in the digital
                   library domain. This contribution addresses the problem of automatic
                   enhancement of OAI metadata by means of the most widely used universal
                   classification schemes in libraries—the Dewey Decimal Classification
                   (DDC). To be more specific, we automatically classify scientific
                   documents according to the DDC taxonomy within three levels using
                   a machine learning-based classifier that relies solely on OAI
                   metadata records as the document representation. The results show
                   an asymmetric distribution of documents across the hierarchical
                   structure of the DDC taxonomy and issues of data sparseness. However,
                   the performance of the classifier shows promising results on all
                   three levels of the DDC.},
      website   = {http://www.springerlink.com/content/x20257512g818377/},
      year      = {2011}
    }
    Alexander Mehler, Silke Schwandt, Rüdiger Gleim and Bernhard Jussen. 2011. Der eHumanities Desktop als Werkzeug in der historischen Semantik: Funktionsspektrum und Einsatzszenarien. Journal for Language Technology and Computational Linguistics (JLCL), 26(1):97–117.
    BibTeX
    @article{Mehler:Schwandt:Gleim:Jussen:2011,
      author    = {Mehler, Alexander and Schwandt, Silke and Gleim, Rüdiger and Jussen, Bernhard},
      title     = {Der eHumanities Desktop als Werkzeug in der historischen Semantik:
                   Funktionsspektrum und Einsatzszenarien},
      journal   = {Journal for Language Technology and Computational Linguistics (JLCL)},
      volume    = {26},
      number    = {1},
      pages     = {97-117},
      abstract  = {Die Digital Humanities bzw. die Computational Humanities entwickeln
                   sich zu eigenst{\"a}ndigen Disziplinen an der Nahtstelle von Geisteswissenschaft
                   und Informatik. Diese Entwicklung betrifft zunehmend auch die
                   Lehre im Bereich der geisteswissenschaftlichen Fachinformatik.
                   In diesem Beitrag thematisieren wir den eHumanities Desktop als
                   ein Werkzeug für diesen Bereich der Lehre. Dabei geht es genauer
                   um einen Brückenschlag zwischen Geschichtswissenschaft und Informatik:
                   Am Beispiel der historischen Semantik stellen wir drei Lehrszenarien
                   vor, in denen der eHumanities Desktop in der geschichtswissenschaftlichen
                   Lehre zum Einsatz kommt. Der Beitrag schliesst mit einer Anforderungsanalyse
                   an zukünftige Entwicklungen in diesem Bereich.},
      pdf       = {http://media.dwds.de/jlcl/2011_Heft1/8.pdf},
      year      = {2011}
    }
    Md. Zahurul Islam, Roland Mittmann and Alexander Mehler. 2011. Multilingualism in Ancient Texts: Language Detection by Example of Old High German and Old Saxon. GSCL conference on Multilingual Resources and Multilingual Applications (GSCL 2011), 28-30 September, Hamburg, Germany.
    BibTeX
    @inproceedings{Zahurul:Mittmann:Mehler:2011,
      author    = {Islam, Md. Zahurul and Mittmann, Roland and Mehler, Alexander},
      title     = {Multilingualism in Ancient Texts: Language Detection by Example
                   of Old High German and Old Saxon},
      booktitle = {GSCL conference on Multilingual Resources and Multilingual Applications
                   (GSCL 2011), 28-30 September, Hamburg, Germany},
      abstract  = {In this paper, we present an approach to language detection in
                   streams of multilingual ancient texts. We introduce a supervised
                   classifier that detects, amongst others, Old High German (OHG)
                   and Old Saxon (OS). We evaluate our model by means of three experiments
                   that show that language detection is possible even for dead languages.
                   Finally, we present an experiment in unsupervised language detection
                   as a tertium comparationis for our supervised classifier.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/Multilingualism_in_Ancient_Texts_Language_Detection_by_Example_of_Old_High_German_and_Old_Saxon.pdf},
      timestamp = {2011.08.25},
      year      = {2011}
    }
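The supervised language-detection setting of the entry above can be approximated by a minimal character-n-gram profile matcher, a common baseline rather than the paper's actual classifier. The training snippets below are invented stand-ins (Latin and modern German); the paper works with Old High German and Old Saxon corpora.

```python
from collections import Counter
import math

def ngrams(text, n=3):
    """Character n-gram counts of a text: a crude language profile."""
    text = " " + text.lower() + " "
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def detect(snippet, profiles):
    """Return the language whose n-gram profile is closest to the snippet."""
    probe = ngrams(snippet)
    return max(profiles, key=lambda lang: cosine(probe, profiles[lang]))

# Hypothetical training snippets, not from the paper's corpus.
profiles = {
    "latin":  ngrams("in principio erat verbum et verbum erat apud deum"),
    "german": ngrams("im anfang war das wort und das wort war bei gott"),
}

print(detect("et deus erat verbum", profiles))
```

Applied utterance by utterance, the same scheme can segment a multilingual stream, which is the use case the abstract describes for mixed Old High German and Old Saxon texts.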

    2010

    Alexander Mehler. 2010. Minimum Spanning Markovian Trees: Introducing Context-Sensitivity into the Generation of Spanning Trees. Structural Analysis of Complex Networks, 381–401.
    BibTeX
    @incollection{Mehler:2010:a,
      author    = {Mehler, Alexander},
      title     = {Minimum Spanning Markovian Trees: Introducing Context-Sensitivity
                   into the Generation of Spanning Trees},
      booktitle = {Structural Analysis of Complex Networks},
      publisher = {Birkh{\"a}user Publishing},
      editor    = {Dehmer, Matthias},
      pages     = {381-401},
      address   = {Basel},
      abstract  = {This chapter introduces a novel class of graphs: Minimum Spanning
                   Markovian Trees (MSMTs). The idea behind MSMTs is to provide spanning
                   trees that minimize the costs of edge traversals in a Markovian
                   manner, that is, in terms of the path starting with the root of
                   the tree and ending at the vertex under consideration. In a second
                   part, the chapter generalizes this class of spanning trees in
                   order to allow for damped Markovian effects in the course of spanning.
                   These two effects, (1) the sensitivity to the contexts generated
                   by consecutive edges and (2) the decreasing impact of more antecedent
                   (or 'weakly remembered') vertices, are well known in cognitive
                   modeling [6, 10, 21, 23]. In this sense, the chapter can also
                   be read as an effort to introduce a graph model to support the
                   simulation of cognitive systems. Note that MSMTs are not to be
                   confused with branching Markov chains or Markov trees [20] as
                   we focus on generating spanning trees from given weighted undirected
                   networks.},
      website   = {https://www.researchgate.net/publication/226700676_Minimum_Spanning_Markovian_Trees_Introducing_Context-Sensitivity_into_the_Generation_of_Spanning_Trees},
      year      = {2010}
    }
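As a point of reference for the MSMT construction sketched in the abstract above, the non-Markovian baseline is a plain shortest-path tree, in which each vertex is attached via its cheapest root-to-vertex path, so an edge's cost is already judged in the context of the whole path from the root. That baseline can be computed with Dijkstra's algorithm; the toy graph below is invented. MSMTs then generalize this by making the traversal costs context-sensitive and damped, which this sketch does not implement.

```python
import heapq

def shortest_path_tree(graph, root):
    """Dijkstra shortest-path tree over a weighted graph given as adjacency lists."""
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return parent, dist

# Toy weighted undirected network: the direct edge a-c costs 4,
# but the path a-b-c costs only 3, so c is spanned via b.
G = {
    "a": [("b", 1), ("c", 4)],
    "b": [("a", 1), ("c", 2)],
    "c": [("a", 4), ("b", 2)],
}
parent, dist = shortest_path_tree(G, "a")
print(parent)
```

Note the contrast with a minimum spanning tree, which weighs each edge in isolation; the path-dependence shown here is the entry point for the Markovian generalization the chapter develops.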
    Rüdiger Gleim and Alexander Mehler. 2010. Computational Linguistics for Mere Mortals – Powerful but Easy-to-use Linguistic Processing for Scientists in the Humanities. Proceedings of LREC 2010.
    BibTeX
    @inproceedings{Gleim:Mehler:2010:b,
      author    = {Gleim, Rüdiger and Mehler, Alexander},
      title     = {Computational Linguistics for Mere Mortals – Powerful but Easy-to-use
                   Linguistic Processing for Scientists in the Humanities},
      booktitle = {Proceedings of LREC 2010},
      address   = {Malta},
      publisher = {ELDA},
      abstract  = {Delivering linguistic resources and easy-to-use methods to a broad
                   public in the humanities is a challenging task. On the one hand
                   users rightly demand easy to use interfaces but on the other hand
                   want to have access to the full flexibility and power of the functions
                   being offered. Even though a growing number of excellent systems
                   exist which offer convenient means to use linguistic resources
                   and methods, they usually focus on a specific domain, as for example
                   corpus exploration or text categorization. Architectures which
                   address a broad scope of applications are still rare. This article
                   introduces the eHumanities Desktop, an online system for corpus
                   management, processing and analysis which aims at bridging the
                   gap between powerful command line tools and intuitive user interfaces.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/gleim_mehler_2010.pdf},
      year      = {2010}
    }
    Alexander Mehler, Andy Lücking and Petra Weiß. 2010. A Network Model of Interpersonal Alignment. Entropy, 12(6):1440–1483.
    BibTeX
    @article{Mehler:Weiss:Luecking:2010:a,
      author    = {Mehler, Alexander and Lücking, Andy and Wei{\ss}, Petra},
      title     = {A Network Model of Interpersonal Alignment},
      journal   = {Entropy},
      volume    = {12},
      number    = {6},
      pages     = {1440-1483},
      abstract  = {In dyadic communication, both interlocutors adapt to each other
                   linguistically, that is, they align interpersonally. In this article,
                   we develop a framework for modeling interpersonal alignment in
                   terms of the structural similarity of the interlocutors’ dialog
                   lexica. This is done by means of so-called two-layer time-aligned
                   network series, that is, a time-adjusted graph model. The graph
                   model is partitioned into two layers, so that the interlocutors’
                   lexica are captured as subgraphs of an encompassing dialog graph.
                   Each constituent network of the series is updated utterance-wise.
                   Thus, both the inherent bipartition of dyadic conversations and
                   their gradual development are modeled. The notion of alignment
                   is then operationalized within a quantitative model of structure
                   formation based on the mutual information of the subgraphs that
                   represent the interlocutor’s dialog lexica. By adapting and further
                   developing several models of complex network theory, we show that
                   dialog lexica evolve as a novel class of graphs that have not
                   been considered before in the area of complex (linguistic) networks.
                   Additionally, we show that our framework allows for classifying
                   dialogs according to their alignment status. To the best of our
                   knowledge, this is the first approach to measuring alignment in
                   communication that explores the similarities of graph-like cognitive
                   representations.},
      doi       = {10.3390/e12061440},
      pdf       = {http://www.mdpi.com/1099-4300/12/6/1440/pdf},
      website   = {http://www.mdpi.com/1099-4300/12/6/1440/},
      year      = {2010}
    }
    Alexander Mehler, Serge Sharoff and Marina Santini. 2010. Genres on the Web: Computational Models and Empirical Studies. Ed. by Alexander Mehler, Serge Sharoff and Marina Santini. Springer.
    BibTeX
    @book{Mehler:Sharoff:Santini:2010:a,
      author    = {Mehler, Alexander and Sharoff, Serge and Santini, Marina},
      editor    = {Mehler, Alexander and Sharoff, Serge and Santini, Marina},
      title     = {Genres on the Web: Computational Models and Empirical Studies},
      publisher = {Springer},
      address   = {Dordrecht},
      abstract  = {The volume 'Genres on the Web' has been designed for a wide audience,
                   from the expert to the novice. It is a required book for scholars,
                   researchers and students who want to become acquainted with the
                   latest theoretical, empirical and computational advances in the
                   expanding field of web genre research. The study of web genre
                   is an overarching and interdisciplinary novel area of research
                   that spans from corpus linguistics, computational linguistics,
                   NLP, and text-technology, to web mining, webometrics, social network
                   analysis and information studies. This book gives readers a thorough
                   grounding in the latest research on web genres and emerging document
                   types. The book covers a wide range of web-genre focussed subjects,
                   such as: -The identification of the sources of web genres -Automatic
                   web genre identification -The presentation of structure-oriented
                   models -Empirical case studies One of the driving forces behind
                   genre research is the idea of a genre-sensitive information system,
                   which incorporates genre cues complementing the current keyword-based
                   search and retrieval applications.},
      booktitle = {Genres on the Web: Computational Models and Empirical Studies},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/GenresOnTheWeb.jpg},
      pagetotal = {376},
      review    = {http://www.springerlink.com/content/ym07440380524721/},
      website   = {http://www.springer.com/computer/ai/book/978-90-481-9177-2},
      year      = {2010}
    }
    Tilmann Sutter and Alexander Mehler. 2010. Medienwandel als Wandel von Interaktionsformen – von frühen Medienkulturen zum Web 2.0. Ed. by Tilmann Sutter and Alexander Mehler. Verlag für Sozialwissenschaften.
    BibTeX
    @book{Sutter:Mehler:2010,
      author    = {Sutter, Tilmann and Mehler, Alexander},
      editor    = {Sutter, Tilmann and Mehler, Alexander},
      title     = {Medienwandel als Wandel von Interaktionsformen – von frühen Medienkulturen
                   zum Web 2.0},
      publisher = {Verlag für Sozialwissenschaften},
      address   = {Wiesbaden},
      abstract  = {Die Beitr{\"a}ge des Bandes untersuchen den Medienwandel von frühen
                   europ{\"a}ischen Medienkulturen bis zu aktuellen Formen der Internetkommunikation
                   unter soziologischer, kulturwissenschaftlicher und linguistischer
                   Perspektive. Zwar haben sich die Massenmedien von den Beschr{\"a}nkungen
                   sozialer Interaktionen gelöst, sie weisen dem Publikum aber eine
                   distanzierte, blo{\ss} rezipierende Rolle zu. Dagegen eröffnen
                   neue Formen 'interaktiver' Medien gesteigerte Möglichkeiten der
                   Rückmeldung und der Mitgestaltung für die Nutzer. Der vorliegende
                   Band fragt nach der Qualit{\"a}t dieses Medienwandels: Werden
                   Medien tats{\"a}chlich interaktiv? Was bedeutet die Interaktivit{\"a}t
                   neuer Medien? Werden die durch neue Medien eröffneten Beteiligungsmöglichkeiten
                   realisiert?},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/Medienwandel.jpg},
      pagetotal = {289},
      website   = {http://www.springer.com/de/book/9783531156422},
      year      = {2010}
    }
    Alexander Mehler, Petra Weiß, Peter Menke and Andy Lücking. 2010. Towards a Simulation Model of Dialogical Alignment. Proceedings of the 8th International Conference on the Evolution of Language (Evolang8), 14-17 April 2010, Utrecht, 238–245.
    BibTeX
    @inproceedings{Mehler:Weiss:Menke:Luecking:2010,
      author    = {Mehler, Alexander and Wei{\ss}, Petra and Menke, Peter and Lücking, Andy},
      title     = {Towards a Simulation Model of Dialogical Alignment},
      booktitle = {Proceedings of the 8th International Conference on the Evolution
                   of Language (Evolang8), 14-17 April 2010, Utrecht},
      pages     = {238-245},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/Alexander_Mehler_Petra_Weiss_Peter_Menke_Andy_Luecking.pdf},
      website   = {http://www.let.uu.nl/evolang2010.nl/},
      year      = {2010}
    }
    Fiorella Foscarini, Yunhyong Kim, Christopher A. Lee, Alexander Mehler, Gillian Oliver and Seamus Ross. 2010. On the Notion of Genre in Digital Preservation. Automation in Digital Preservation.
    BibTeX
    @inproceedings{Foscarini:Kim:Lee:Mehler:Oliver:Ross:2010,
      author    = {Foscarini, Fiorella and Kim, Yunhyong and Lee, Christopher A.
                   and Mehler, Alexander and Oliver, Gillian and Ross, Seamus},
      title     = {On the Notion of Genre in Digital Preservation},
      booktitle = {Automation in Digital Preservation},
      editor    = {Chanod, Jean-Pierre and Dobreva, Milena and Rauber, Andreas and Ross, Seamus},
      number    = {10291},
      series    = {Dagstuhl Seminar Proceedings},
      address   = {Dagstuhl, Germany},
      publisher = {Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany},
      annote    = {Keywords: Digital preservation, genre analysis, context modeling,
                   diplomatics, information retrieval},
      issn      = {1862-4405},
      pdf       = {http://drops.dagstuhl.de/opus/volltexte/2010/2763/pdf/10291.MehlerAlexander.Paper.2763.pdf},
      website   = {http://drops.dagstuhl.de/opus/volltexte/2010/2763},
      year      = {2010}
    }
    Alexander Mehler, Rüdiger Gleim, Ulli Waltinger and Nils Diewald. 2010. Time Series of Linguistic Networks by Example of the Patrologia Latina. Proceedings of INFORMATIK 2010: Service Science, September 27 - October 01, 2010, Leipzig, 2:609–616.
    BibTeX
    @inproceedings{Mehler:Gleim:Waltinger:Diewald:2010,
      author    = {Mehler, Alexander and Gleim, Rüdiger and Waltinger, Ulli and Diewald, Nils},
      title     = {Time Series of Linguistic Networks by Example of the Patrologia Latina},
      booktitle = {Proceedings of INFORMATIK 2010: Service Science, September 27
                   - October 01, 2010, Leipzig},
      editor    = {F{\"a}hnrich, Klaus-Peter and Franczyk, Bogdan},
      volume    = {2},
      series    = {Lecture Notes in Informatics},
      pages     = {609-616},
      publisher = {GI},
      pdf       = {http://subs.emis.de/LNI/Proceedings/Proceedings176/586.pdf},
      year      = {2010}
    }
    Rüdiger Gleim, Paul Warner and Alexander Mehler. 2010. eHumanities Desktop - An Architecture for Flexible Annotation in Iconographic Research. Proceedings of the 6th International Conference on Web Information Systems and Technologies (WEBIST '10), April 7-10, 2010, Valencia.
    BibTeX
    @inproceedings{Gleim:Warner:Mehler:2010,
      author    = {Gleim, Rüdiger and Warner, Paul and Mehler, Alexander},
      title     = {eHumanities Desktop - An Architecture for Flexible Annotation
                   in Iconographic Research},
      booktitle = {Proceedings of the 6th International Conference on Web Information
                   Systems and Technologies (WEBIST '10), April 7-10, 2010, Valencia},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/gleim_warner_mehler_2010.pdf},
      website   = {https://www.researchgate.net/publication/220724277_eHumanities_Desktop_-_An_Architecture_for_Flexible_Annotation_in_Iconographic_Research},
      year      = {2010}
    }
    Peter Menke and Alexander Mehler. 2010. The Ariadne System: A flexible and extensible framework for the modeling and storage of experimental data in the humanities. Proceedings of LREC 2010.
    BibTeX
    @inproceedings{Menke:Mehler:2010,
      author    = {Menke, Peter and Mehler, Alexander},
      title     = {The Ariadne System: A flexible and extensible framework for the
                   modeling and storage of experimental data in the humanities},
      booktitle = {Proceedings of LREC 2010},
      address   = {Malta},
      publisher = {ELDA},
      abstract  = {This paper introduces the Ariadne Corpus Management System. First,
                   the underlying data model is presented which enables users to
                   represent and process heterogeneous data sets within a single,
                   consistent framework. Secondly, a set of automatized procedures
                   is described that offers assistance to researchers in various
                   data-related use cases. Finally, an approach to easy yet powerful
                   data retrieval is introduced in form of a specialised querying
                   language for multimodal data.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/menke_mehler_2010.pdf},
      website   = {http://arnetminer.org/publication/the-ariadne-system-a-flexible-and-extensible-framework-for-the-modeling-and-storage-of-experimental-data-in-the-humanities-2839925.html},
      year      = {2010}
    }
    Tilmann Sutter and Alexander Mehler. 2010. Einleitung: Der aktuelle Medienwandel im Blick einer interdisziplinären Medienwissenschaft. In: Medienwandel als Wandel von Interaktionsformen, 7–16. Ed. by Tilmann Sutter and Alexander Mehler. VS Verlag für Sozialwissenschaften.
    BibTeX
    @inbook{Sutter2010,
      author    = {Sutter, Tilmann and Mehler, Alexander},
      editor    = {Sutter, Tilmann and Mehler, Alexander},
      title     = {Einleitung: Der aktuelle Medienwandel im Blick einer interdisziplin{\"a}ren
                   Medienwissenschaft},
      pages     = {7--16},
      publisher = {VS Verlag f{\"u}r Sozialwissenschaften},
      address   = {Wiesbaden},
      abstract  = {Die Herausforderung, die der Wandel von Kommunikationsmedien f{\"u}r
                   die Medienwissenschaft darstellt, resultiert nicht nur aus der
                   ungeheuren Beschleunigung des Medienwandels. Die Herausforderung
                   stellt sich auch mit der Frage, welches die neuen Formen und Strukturen
                   sind, die aus dem Wandel der Medien hervorgehen. R{\"u}ckt man
                   diese Frage in den Fokus der {\"U}berlegungen, kommen erstens
                   Entwicklungen im Wechsel von Massenmedien zu neuen, „interaktiven``
                   Medien in den Blick. Dies betrifft den Wandel von den alten Medien
                   in Form von Einwegkommunikation zu den neuen Medien in Form von
                   Netzkommunikation. Dieser Wandel wurde in zahlreichen Analysen
                   als eine Revolution beschrieben: Im Unterschied zur einseitigen,
                   r{\"u}ckkopplungsarmen Kommunikationsform der Massenmedien sollen
                   neue, computergest{\"u}tzte Formen der Medienkommunikation „interaktiv``
                   sein, d.h. gesteigerte R{\"u}ckkopplungs- und Eingriffsm{\"o}glichkeiten
                   f{\"u}r die Adressaten und Nutzer bieten. Sozialwissenschaftlich
                   bedeutsam ist dabei die Einsch{\"a}tzung der Qualit{\"a}t und
                   des Umfangs dieser neuen M{\"o}glichkeiten und Leistungen. Denn
                   bislang bedeutete Medienwandel im Kern eine zunehmende Ausdifferenzierung
                   alter und neuer Medien mit je spezifischen Leistungen, d.h. neue
                   Medien ersetzen die {\"a}lteren nicht, sondern sie erg{\"a}nzen
                   und erweitern sie. Allerdings wird im Zuge des aktuellen Medienwandels
                   immer deutlicher, dass die neuen Medien durchaus imstande sind,
                   die Leistungen massenmedialer Verbreitung von Kommunikation zu
                   {\"u}bernehmen. Stehen wir also, wie das schon seit l{\"a}ngerem
                   k{\"u}hn vorhergesagt wird, vor der Etablierung eines Universalmediums,
                   das in der Lage ist, die Formen und Funktionen anderer Medien
                   zu {\"u}bernehmen?},
      booktitle = {Medienwandel als Wandel von Interaktionsformen},
      doi       = {10.1007/978-3-531-92292-8_1},
      isbn      = {978-3-531-92292-8},
      url       = {https://doi.org/10.1007/978-3-531-92292-8_1},
      year      = {2010}
    }

    2009

    Marina Santini, Alexander Mehler and Serge Sharoff. 2009. Riding the Rough Waves of Genre on the Web: Concepts and Research Questions. Genres on the Web: Computational Models and Empirical Studies, 3–32.
    BibTeX
    @incollection{Santini:Mehler:Sharoff:2009,
      author    = {Santini, Marina and Mehler, Alexander and Sharoff, Serge},
      title     = {Riding the Rough Waves of Genre on the Web: Concepts and Research Questions},
      booktitle = {Genres on the Web: Computational Models and Empirical Studies},
      publisher = {Springer},
      editor    = {Mehler, Alexander and Sharoff, Serge and Santini, Marina},
      pages     = {3-32},
      address   = {Berlin/New York},
      abstract  = {This chapter outlines the state of the art of empirical and computational
                   webgenre research. First, it highlights why the concept of genre
                   is profitable for a range of disciplines. At the same time, it
                   lists a number of recent interpretations that can inform and influence
                   present and future genre research. Last but not least, it breaks
                   down a series of open issues that relate to the modelling of the
                   concept of webgenre in empirical and computational studies.},
      year      = {2009}
    }
    Alexander Mehler, Rüdiger Gleim, Ulli Waltinger, Alexandra Ernst, Dietmar Esch and Tobias Feith. 2009. eHumanities Desktop – eine webbasierte Arbeitsumgebung für die geisteswissenschaftliche Fachinformatik. Proceedings of the Symposium "Sprachtechnologie und eHumanities", 26.–27. Februar, Duisburg-Essen University.
    BibTeX
    @inproceedings{Mehler:Gleim:Waltinger:Ernst:Esch:Feith:2009,
      author    = {Mehler, Alexander and Gleim, Rüdiger and Waltinger, Ulli and Ernst, Alexandra
                   and Esch, Dietmar and Feith, Tobias},
      title     = {eHumanities Desktop – eine webbasierte Arbeitsumgebung für die
                   geisteswissenschaftliche Fachinformatik},
      booktitle = {Proceedings of the Symposium "Sprachtechnologie und eHumanities",
                   26.–27. Februar, Duisburg-Essen University},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_gleim_waltinger_ernst_esch_feith_2009.pdf},
      website   = {http://duepublico.uni-duisburg-essen.de/servlets/DocumentServlet?id=37041},
      year      = {2009}
    }
    Benno Wagner, Alexander Mehler, Christian Wolff and Bernhard Dotzler. 2009. Bausteine eines Literary Memory Information System (LiMeS) am Beispiel der Kafka-Forschung. Proceedings of the Symposium "Sprachtechnologie und eHumanities", 26.–27. Februar, Duisburg-Essen University.
    BibTeX
    @inproceedings{Wagner:Mehler:Wolff:Dotzler:2009,
      author    = {Wagner, Benno and Mehler, Alexander and Wolff, Christian and Dotzler, Bernhard},
      title     = {Bausteine eines Literary Memory Information System (LiMeS) am
                   Beispiel der Kafka-Forschung},
      booktitle = {Proceedings of the Symposium "Sprachtechnologie und eHumanities",
                   26.–27. Februar, Duisburg-Essen University},
      abstract  = {In dem Paper beschreiben wir Bausteine eines Literary Memory Information
                   System (LiMeS), das die literaturwissenschaftliche Erforschung
                   von so genannten Matrixtexten – das sind Prim{\"a}rtexte eines
                   bestimmten literarischen Gesamtwerks – unter dem Blickwinkel gro{\ss}er
                   Mengen so genannter Echotexte (Topia 1984; Wagner/Reinhard 2007)
                   – das sind Subtexte im Sinne eines literaturwissenschaftlichen
                   Intertextualit{\"a}tsbegriffs – ermöglicht. Den Ausgangspunkt
                   dieses computerphilologischen Informationssystems bildet ein Text-Mining-Modell
                   basierend auf dem Intertextualit{\"a}tsbegriff in Verbindung mit
                   dem Begriff des Semantic Web (Mehler, 2004b, 2005a, b, Wolff 2005).
                   Wir zeigen, inwiefern dieses Modell über bestehende Informationssystemarchitekturen
                   hinausgeht und schlie{\ss}en einen Brückenschlag zur derzeitigen
                   Entwicklung von Arbeitsumgebungen in der geisteswissenschaftlichen
                   Fachinformatik in Form eines eHumanities Desktop.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/wagner_mehler_wolff_dotzler_2009.pdf},
      website   = {http://epub.uni-regensburg.de/6795/},
      year      = {2009}
    }
    Ulli Waltinger, Alexander Mehler and Armin Wegner. 2009. A Two-Level Approach to Web Genre Classification. Proceedings of the 5th International Conference on Web Information Systems and Technologies (WEBIST '09), March 23-26, 2009, Lisboa.
    BibTeX
    @inproceedings{Waltinger:Mehler:Wegner:2009,
      author    = {Waltinger, Ulli and Mehler, Alexander and Wegner, Armin},
      title     = {A Two-Level Approach to Web Genre Classification},
      booktitle = {Proceedings of the 5th International Conference on Web Information
                   Systems and Technologies (WEBIST '09), March 23-26, 2009, Lisboa},
      abstract  = {This paper presents an approach of two-level categorization of
                   web pages. In contrast to related approaches the model additionally
                   explores and categorizes functionally and thematically demarcated
                   segments of the hypertext types to be categorized. By classifying
                   these segments conclusions can be drawn about the type of the
                   corresponding compound web document.},
      pdf       = {http://www.ulliwaltinger.de/pdf/Webist_2009_TwoLevel_Genre_Classification_WaltingerMehlerWegner.pdf},
      year      = {2009}
    }
    Alexander Mehler. 2009. Structure Formation in the Web. A Graph-Theoretical Model of Hypertext Types. Linguistic Modeling of Information and Markup Languages. Contributions to Language Technology.
    BibTeX
    @incollection{Mehler:2009:b,
      author    = {Mehler, Alexander},
      title     = {Structure Formation in the Web. A Graph-Theoretical Model of Hypertext Types},
      booktitle = {Linguistic Modeling of Information and Markup Languages. Contributions
                   to Language Technology},
      publisher = {Springer},
      editor    = {Witt, Andreas and Metzing, Dieter},
      series    = {Text, Speech and Language Technology},
      address   = {Dordrecht},
      abstract  = {In this chapter we develop a representation model of web document
                   networks. Based on the notion of uncertain web document structures,
                   the model is defined as a template which grasps nested manifestation
                   levels of hypertext types. Further, we specify the model on the
                   conceptual, formal and physical level and exemplify it by reconstructing
                   competing web document models.},
      website   = {http://www.springerlink.com/content/t27782w8j2125112/},
      year      = {2009}
    }
    Rüdiger Gleim, Alexander Mehler, Ulli Waltinger and Peter Menke. 2009. eHumanities Desktop – An extensible Online System for Corpus Management and Analysis. 5th Corpus Linguistics Conference, University of Liverpool.
    BibTeX
    @inproceedings{Gleim:Mehler:Waltinger:Menke:2009,
      author    = {Gleim, Rüdiger and Mehler, Alexander and Waltinger, Ulli and Menke, Peter},
      title     = {eHumanities Desktop – An extensible Online System for Corpus Management
                   and Analysis},
      booktitle = {5th Corpus Linguistics Conference, University of Liverpool},
      abstract  = {This paper presents the eHumanities Desktop - an online system
                   for corpus management and analysis in support of computing in
                   the humanities. Design issues and the overall architecture are
                   described, as well as an outline of the applications offered by
                   the system.},
      pdf       = {http://www.ulliwaltinger.de/pdf/eHumanitiesDesktop-AnExtensibleOnlineSystem-CL2009.pdf},
      website   = {http://www.ulliwaltinger.de/ehumanities-desktop-an-extensible-online-system-for-corpus-management-and-analysis/},
      year      = {2009}
    }
    Alexander Mehler and Andy Lücking. 2009. A Structural Model of Semiotic Alignment: The Classification of Multimodal Ensembles as a Novel Machine Learning Task. Proceedings of IEEE Africon 2009, September 23-25, Nairobi, Kenya.
    BibTeX
    @inproceedings{Mehler:Luecking:2009,
      author    = {Mehler, Alexander and Lücking, Andy},
      title     = {A Structural Model of Semiotic Alignment: The Classification of
                   Multimodal Ensembles as a Novel Machine Learning Task},
      booktitle = {Proceedings of IEEE Africon 2009, September 23-25, Nairobi, Kenya},
      publisher = {IEEE},
      abstract  = {In addition to the well-known linguistic alignment processes in
                   dyadic communication – e.g., phonetic, syntactic, semantic alignment
                   – we provide evidence for a genuine multimodal alignment process,
                   namely semiotic alignment. Communicative elements from different
                   modalities 'routinize into' cross-modal 'super-signs', which we
                   call multimodal ensembles. Computational models of human communication
                   are in need of expressive models of multimodal ensembles. In this
                   paper, we exemplify semiotic alignment by means of empirical examples
                   of the building of multimodal ensembles. We then propose a graph
                   model of multimodal dialogue that is expressive enough to capture
                   multimodal ensembles. In line with this model, we define a novel
                   task in machine learning with the aim of training classifiers
                   that can detect semiotic alignment in dialogue. This model is
                   in support of approaches which need to gain insights into realistic
                   human-machine communication.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_luecking_2009.pdf},
      website   = {http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?reload=true&arnumber=5308098},
      year      = {2009}
    }
    Alexander Mehler. 2009. Generalized Shortest Paths Trees: A Novel Graph Class Applied to Semiotic Networks. Analysis of Complex Networks: From Biology to Linguistics, 175–220.
    BibTeX
    @incollection{Mehler:2009:c,
      author    = {Mehler, Alexander},
      title     = {Generalized Shortest Paths Trees: A Novel Graph Class Applied
                   to Semiotic Networks},
      booktitle = {Analysis of Complex Networks: From Biology to Linguistics},
      publisher = {Wiley-VCH},
      editor    = {Dehmer, Matthias and Emmert-Streib, Frank},
      pages     = {175-220},
      address   = {Weinheim},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_2009_b.pdf},
      website   = {https://www.researchgate.net/publication/255666602_1_Generalised_Shortest_Paths_Trees_A_Novel_Graph_Class_Applied_to_Semiotic_Networks},
      year      = {2009}
    }
    Alexander Mehler and Ulli Waltinger. 2009. Enhancing Document Modeling by Means of Open Topic Models: Crossing the Frontier of Classification Schemes in Digital Libraries by Example of the DDC. Library Hi Tech, 27(4):520–539.
    BibTeX
    @article{Mehler:Waltinger:2009:b,
      author    = {Mehler, Alexander and Waltinger, Ulli},
      title     = {Enhancing Document Modeling by Means of Open Topic Models: Crossing
                   the Frontier of Classification Schemes in Digital Libraries by
                   Example of the DDC},
      journal   = {Library Hi Tech},
      volume    = {27},
      number    = {4},
      pages     = {520-539},
      abstract  = {Purpose: We present a topic classification model using the Dewey
                   Decimal Classification (DDC) as the target scheme. This is done
                   by exploring metadata as provided by the Open Archives Initiative
                   (OAI) to derive document snippets as minimal document representations.
                   The reason is to reduce the effort of document processing in digital
                   libraries. Further, we perform feature selection and extension
                   by means of social ontologies and related web-based lexical resources.
                   This is done to provide reliable topic-related classifications
                   while circumventing the problem of data sparseness. Finally, we
                   evaluate our model by means of two language-specific corpora.
                   This paper bridges digital libraries on the one hand and computational
                   linguistics on the other. The aim is to make computational
                   linguistic methods accessible to provide thematic classifications
                   in digital libraries based on closed topic models such as the
                   DDC. Design/methodology/approach:
                   text classification, text-technology, computational linguistics,
                   computational semantics, social semantics. Findings: We show that
                   SVM-based classifiers perform best by exploring certain selections
                   of OAI document metadata. Research limitations/implications: The
                   findings show that it is necessary to further develop SVM-based
                   DDC-classifiers by using larger training sets possibly for more
                   than two languages in order to get better F-measure values. Practical
                   implications: We can show that DDC-classifications which primarily
                   explore OAI metadata come into reach. Originality/value: We provide
                   algorithmic and formal-mathematical information on how to build
                   DDC-classifiers for digital libraries.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_waltinger_2009_b.pdf},
      website   = {http://biecoll.ub.uni-bielefeld.de/frontdoor.php?source_opus=5001&la=de},
      year      = {2009}
    }
    Rüdiger Gleim, Ulli Waltinger, Alexandra Ernst, Alexander Mehler, Dietmar Esch and Tobias Feith. 2009. The eHumanities Desktop – An Online System for Corpus Management and Analysis in Support of Computing in the Humanities. Proceedings of the Demonstrations Session of the 12th Conference of the European Chapter of the Association for Computational Linguistics EACL 2009, 30 March – 3 April, Athens.
    BibTeX
    @inproceedings{Gleim:Waltinger:Ernst:Mehler:Esch:Feith:2009,
      author    = {Gleim, Rüdiger and Waltinger, Ulli and Ernst, Alexandra and Mehler, Alexander
                   and Esch, Dietmar and Feith, Tobias},
      title     = {The eHumanities Desktop – An Online System for Corpus Management
                   and Analysis in Support of Computing in the Humanities},
      booktitle = {Proceedings of the Demonstrations Session of the 12th Conference
                   of the European Chapter of the Association for Computational Linguistics
                   EACL 2009, 30 March – 3 April, Athens},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/gleim_waltinger_ernst_mehler_esch_feith_2009.pdf},
      year      = {2009}
    }
    Alexander Mehler. 2009. Artifizielle Interaktivität. Eine semiotische Betrachtung. Medienwandel als Wandel von Interaktionsformen – von frühen Medienkulturen zum Web 2.0.
    BibTeX
    @incollection{Mehler:2009:d,
      author    = {Mehler, Alexander},
      title     = {Artifizielle Interaktivit{\"a}t. Eine semiotische Betrachtung},
      booktitle = {Medienwandel als Wandel von Interaktionsformen – von frühen Medienkulturen
                   zum Web 2.0},
      publisher = {VS},
      editor    = {Sutter, Tilmann and Mehler, Alexander},
      address   = {Wiesbaden},
      year      = {2009}
    }
    Ulli Waltinger and Alexander Mehler. 2009. The Feature Difference Coefficient: Classification by Means of Feature Distributions. Proceedings of the Conference on Text Mining Services (TMS 2009), 159–168.
    BibTeX
    @inproceedings{Waltinger:Mehler:2009:a,
      author    = {Waltinger, Ulli and Mehler, Alexander},
      title     = {The Feature Difference Coefficient: Classification by Means of
                   Feature Distributions},
      booktitle = {Proceedings of the Conference on Text Mining Services (TMS 2009)},
      series    = {Leipziger Beitr{\"a}ge zur Informatik: Band XIV},
      pages     = {159–168},
      address   = {Leipzig},
      publisher = {Leipzig University},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/waltinger_mehler_2009_a.pdf},
      year      = {2009}
    }
    Marina Santini, Georg Rehm, Serge Sharoff and Alexander Mehler. 2009. Automatic Genre Identification: Issues and Prospects. Ed. by Marina Santini, Georg Rehm, Serge Sharoff and Alexander Mehler. Journal for Language Technology and Computational Linguistics (JLCL), 24(1). GSCL.
    BibTeX
    @book{Santini:Rehm:Sharoff:Mehler:2009,
      author    = {Santini, Marina and Rehm, Georg and Sharoff, Serge and Mehler, Alexander},
      editor    = {Santini, Marina and Rehm, Georg and Sharoff, Serge and Mehler, Alexander},
      title     = {Automatic Genre Identification: Issues and Prospects},
      publisher = {GSCL},
      volume    = {24(1)},
      series    = {Journal for Language Technology and Computational
                       Linguistics (JLCL)},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/AutomaticGenreIdentification.png},
      pagetotal = {148},
      pdf       = {http://www.jlcl.org/2009_Heft1/JLCL24(1).pdf},
      year      = {2009}
    }
    Ulli Waltinger, Alexander Mehler and Rüdiger Gleim. 2009. Social Semantics And Its Evaluation By Means of Closed Topic Models: An SVM-Classification Approach Using Semantic Feature Replacement By Topic Generalization. Proceedings of the Biennial GSCL Conference 2009, September 30 – October 2, Universität Potsdam.
    BibTeX
    @inproceedings{Waltinger:Mehler:Gleim:2009:a,
      author    = {Waltinger, Ulli and Mehler, Alexander and Gleim, Rüdiger},
      title     = {Social Semantics And Its Evaluation By Means of Closed Topic Models:
                   An SVM-Classification Approach Using Semantic Feature Replacement
                   By Topic Generalization},
      booktitle = {Proceedings of the Biennial GSCL Conference 2009, September 30
                   – October 2, Universit{\"a}t Potsdam},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/GSCL_2009_WaltingerMehlerGleim_camera_ready.pdf},
      year      = {2009}
    }
    Ulli Waltinger and Alexander Mehler. 2009. Social Semantics and Its Evaluation By Means Of Semantic Relatedness And Open Topic Models. IEEE/WIC/ACM International Conference on Web Intelligence, September 15–18, Milano.
    BibTeX
    @inproceedings{Waltinger:Mehler:2009:c,
      author    = {Waltinger, Ulli and Mehler, Alexander},
      title     = {Social Semantics and Its Evaluation By Means Of Semantic Relatedness
                   And Open Topic Models},
      booktitle = {IEEE/WIC/ACM International Conference on Web Intelligence, September
                   15–18, Milano},
      abstract  = {This paper presents an approach using social semantics for the
                   task of topic labelling by means of Open Topic Models. Our approach
                   utilizes a social ontology to create an alignment of documents
                   within a social network. Comprised category information is used
                   to compute a topic generalization. We propose a feature-frequency-based
                   method for measuring semantic relatedness which is needed in order
                   to reduce the number of document features for the task of topic
                   labelling. This method is evaluated against multiple human judgement
                   experiments comprising two languages and three different resources.
                   Overall the results show that social ontologies provide a rich
                    source of terminological knowledge. The performance of the semantic
                    relatedness measure, with correlation values of up to .77, is
                    quite promising. Results on the topic labelling experiment show, with
                   an accuracy of up to .79, that our approach can be a valuable
                   method for various NLP applications.},
      website   = {http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5284920&abstractAccess=no&userType=inst},
      year      = {2009}
    }

    2008

    Maik Stührenberg, Michael Beißwenger, Kai-Uwe Kühnberger, Alexander Mehler, Harald Lüngen, Dieter Metzing and Uwe Mönnich. 2008. Sustainability of Text-Technological Resources. Proceedings of the Post LREC-2008 Workshop: Sustainability of Language Resources and Tools for Natural Language Processing, Marrakech, Morocco.
    BibTeX
    @inproceedings{Stuehrenberg:Beisswenger:Kuehnberger:Mehler:Luengen:Metzing:Moennich:2008,
      author    = {Stührenberg, Maik and Bei{\ss}wenger, Michael and Kühnberger, Kai-Uwe
                   and Mehler, Alexander and Lüngen, Harald and Metzing, Dieter and Mönnich, Uwe},
      title     = {Sustainability of Text-Technological Resources},
      booktitle = {Proceedings of the Post LREC-2008 Workshop: Sustainability of
                   Language Resources and Tools for Natural Language Processing,
                   Marrakech, Morocco},
      abstract  = {We consider that there are obvious relationships between research
                   on sustainability of language and linguistic resources on the
                   one hand and work undertaken in the Research Unit 'Text-Technological
                   Modelling of Information' on the other. Currently the main focus
                   in sustainability research is concerned with archiving methods
                   of textual resources, i.e. methods for sustainability of primary
                   and secondary data; these aspects are addressed in our work as
                   well. However, we believe that there are certain additional aspects
                   of sustainability on which new light is shed by the procedures,
                   algorithms and dynamic processes undertaken in our Research Unit.},
      pdf       = {http://www.michael-beisswenger.de/pub/lrec-sustainability.pdf},
      year      = {2008}
    }
    Alexander Mehler, Barbara Job, Philippe Blanchard and Hans-Jürgen Eikmeyer. 2008. Sprachliche Netzwerke. Netzwerkanalyse und Netzwerktheorie, 413–427.
    BibTeX
    @incollection{Mehler:Job:Blanchard:Eikmeyer:2008,
      author    = {Mehler, Alexander and Job, Barbara and Blanchard, Philippe and Eikmeyer, Hans-Jürgen},
      title     = {Sprachliche Netzwerke},
      booktitle = {Netzwerkanalyse und Netzwerktheorie},
      publisher = {VS},
      editor    = {Stegbauer, Christian},
      pages     = {413-427},
      address   = {Wiesbaden},
      abstract  = {In this chapter we describe so-called linguistic networks,
                   that is, networks of linguistic units which are analyzed with
                   respect to their embedding in the network of the language
                   community that produced these units and their interlinking.
                   We discuss a three-level model for the analysis of such
                   networks and exemplify this model by means of several
                   special-purpose wikis. A main focus of the chapter is on a
                   multilevel network model, departing from the unipartite
                   graph models of complex network theory.},
      year      = {2008}
    }
    Olga Abramov, Alexander Mehler and Rüdiger Gleim. 2008. A Unified Database of Dependency Treebanks. Integrating, Quantifying and Evaluating Dependency Data. Proceedings of the 6th Language Resources and Evaluation Conference (LREC 2008), Marrakech (Morocco).
    BibTeX
    @inproceedings{Pustylnikov:Mehler:Gleim:2008,
      author    = {Abramov, Olga and Mehler, Alexander and Gleim, Rüdiger},
      title     = {A Unified Database of Dependency Treebanks. Integrating, Quantifying
                   and Evaluating Dependency Data},
      booktitle = {Proceedings of the 6th Language Resources and Evaluation Conference
                   (LREC 2008), Marrakech (Morocco)},
      abstract  = {This paper describes a database of 11 dependency treebanks which
                   were unified by means of a two-dimensional graph format. The format
                   was evaluated with respect to storage-complexity on the one hand,
                   and efficiency of data access on the other hand. An example of
                   how the treebanks can be integrated within a unique interface
                   is given by means of the DTDB interface.},
      pdf       = {http://wwwhomes.uni-bielefeld.de/opustylnikov/pustylnikov/pdfs/LREC08_full.pdf},
      year      = {2008}
    }
    Alexander Mehler. 2008. Structural Similarities of Complex Networks: A Computational Model by Example of Wiki Graphs. Applied Artificial Intelligence, 22(7&8):619–683.
    BibTeX
    @article{Mehler:2008:a,
      author    = {Mehler, Alexander},
      title     = {Structural Similarities of Complex Networks: A Computational Model
                   by Example of Wiki Graphs},
      journal   = {Applied Artificial Intelligence},
      volume    = {22},
      number    = {7\&8},
      pages     = {619–683},
      abstract  = {This article elaborates a framework for representing and classifying
                   large complex networks by example of wiki graphs. By means of
                   this framework we reliably measure the similarity of document,
                   agent, and word networks by solely regarding their topology. In
                   doing so, the article departs from classical approaches to complex
                   network theory which focus on topological characteristics in
                   order to check their small-world property. This includes not only
                   characteristics that have been studied in complex network
                   theory, but also some that were invented in social network
                   analysis and hypertext theory. We show that network classifications
                   come into reach which go beyond the hypertext structures traditionally
                   analyzed in web mining. The reason is that we focus on networks
                   as a whole as units to be classified—above the level of websites
                   and their constitutive pages. As a consequence, we bridge classical
                   approaches to text and web mining on the one hand and complex
                   network theory on the other hand. Last but not least, this approach
                   also provides a framework for quantifying the linguistic notion
                   of intertextuality.},
      doi       = {10.1080/08839510802164085},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2016/10/mehler_2008_Structural_Similarities_of_Complex_Networks.pdf},
      website   = {https://www.researchgate.net/publication/200772675_Structural_similarities_of_complex_networks_A_computational_model_by_example_of_wiki_graphs},
      year      = {2008}
    }
    Alexander Mehler. 2008. Lexical-Semantic Resources in Automated Discourse Analysis. Ed. by Harald Lüngen, Alexander Mehler and Angelika Storrer. Journal for Language Technology and Computational Linguistics (JLCL), 23(2). GSCL.
    BibTeX
    @book{Luengen:Mehler:Storrer:2008:a,
      author    = {Mehler, Alexander},
      editor    = {Lüngen, Harald and Mehler, Alexander and Storrer, Angelika},
      title     = {Lexical-Semantic Resources in Automated Discourse Analysis},
      publisher = {GSCL},
      volume    = {23(2)},
      series    = {Journal for Language Technology and Computational
                       Linguistics (JLCL)},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/LexicalSemanticResources-300-20.png},
      pagetotal = {111},
      pdf       = {http://www.jlcl.org/2008_Heft2/JLCL23(2).pdf},
      website   = {https://www.researchgate.net/publication/228956889_Lexical-Semantic_Resources_in_Automated_Discourse_Analysis},
      year      = {2008}
    }
    Alexander Mehler. 2008. Large Text Networks as an Object of Corpus Linguistic Studies. Corpus Linguistics. An International Handbook of the Science of Language and Society, 328–382.
    BibTeX
    @incollection{Mehler:2008:b,
      author    = {Mehler, Alexander},
      title     = {Large Text Networks as an Object of Corpus Linguistic Studies},
      booktitle = {Corpus Linguistics. An International Handbook of the Science of
                   Language and Society},
      publisher = {De Gruyter},
      editor    = {Lüdeling, Anke and Kytö, Merja},
      pages     = {328–382},
      address   = {Berlin/New York},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_2007_a.pdf},
      year      = {2008}
    }
    Olga Pustylnikov and Alexander Mehler. 2008. Text classification by means of structural features. What kind of information about texts is captured by their structure? Proceedings of RUSSIR '08, September 1-5, Taganrog, Russia.
    BibTeX
    @inproceedings{Pustylnikov:Mehler:2008:c,
      author    = {Pustylnikov, Olga and Mehler, Alexander},
      title     = {Text classification by means of structural features. What kind
                   of information about texts is captured by their structure?},
      booktitle = {Proceedings of RUSSIR '08, September 1-5, Taganrog, Russia},
      pdf       = {http://www.texttechnologylab.org/data/pdf/mehler_geibel_pustylnikov_2007.pdf},
      year      = {2008}
    }
    Ulli Waltinger, Alexander Mehler and Maik Stührenberg. 2008. An Integrated Model of Lexical Chaining: Applications, Resources and their Format. Proceedings of KONVENS 2008 – Ergänzungsband Textressourcen und lexikalisches Wissen, 59–70.
    BibTeX
    @inproceedings{Waltinger:Mehler:Stuehrenberg:2008,
      author    = {Waltinger, Ulli and Mehler, Alexander and Stührenberg, Maik},
      title     = {An Integrated Model of Lexical Chaining: Applications, Resources
                   and their Format},
      booktitle = {Proceedings of KONVENS 2008 – Erg{\"a}nzungsband Textressourcen
                   und lexikalisches Wissen},
      editor    = {Storrer, Angelika and Geyken, Alexander and Siebert, Alexander
                   and Würzner, Kay-Michael},
      pages     = {59-70},
      pdf       = {http://www.ulliwaltinger.de/pdf/Konvens_2008_Integrated_Model_of_Lexical_Chaining_WaltingerMehlerStuehrenberg.pdf},
      year      = {2008}
    }
    Alexander Mehler. 2008. A Model of the Distribution of the Distances of Alike Elements in Dialogical Communication. Proceedings of the International Conference on Information Theory and Statistical Learning (ITSL '08), July 14-15, 2008, Las Vegas, 45–50.
    BibTeX
    @inproceedings{Mehler:2008:c,
      author    = {Mehler, Alexander},
      title     = {A Model of the Distribution of the Distances of Alike Elements
                   in Dialogical Communication},
      booktitle = {Proceedings of the International Conference on Information Theory
                   and Statistical Learning (ITSL '08), July 14-15, 2008, Las Vegas},
      pages     = {45-50},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_2008_c.pdf},
      year      = {2008}
    }
    Ulli Waltinger, Alexander Mehler and Gerhard Heyer. 2008. Towards Automatic Content Tagging: Enhanced Web Services in Digital Libraries Using Lexical Chaining. 4th Int. Conf. on Web Information Systems and Technologies (WEBIST '08), 4-7 May, Funchal, Portugal, 231–236.
    BibTeX
    @inproceedings{Waltinger:Mehler:Heyer:2008,
      author    = {Waltinger, Ulli and Mehler, Alexander and Heyer, Gerhard},
      title     = {Towards Automatic Content Tagging: Enhanced Web Services in Digital
                   Libraries Using Lexical Chaining},
      booktitle = {4th Int. Conf. on Web Information Systems and Technologies (WEBIST
                   '08), 4-7 May, Funchal, Portugal},
      editor    = {Cordeiro, José and Filipe, Joaquim and Hammoudi, Slimane},
      pages     = {231-236},
      address   = {Barcelona},
      publisher = {INSTICC Press},
      pdf       = {http://www.ulliwaltinger.de/pdf/Webist_2008_Towards_Automatic_Content_Tagging_WaltingerMehlerHeyer.pdf},
      url       = {http://dblp.uni-trier.de/db/conf/webist/webist2008-2.html#WaltingerMH08},
      website   = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.463.3097},
      year      = {2008}
    }
    Alexander Mehler. 2008. A Short Note on Social-Semiotic Networks from the Point of View of Quantitative Semantics. Proceedings of the Dagstuhl Seminar on Social Web Communities, September 21-26, Dagstuhl.
    BibTeX
    @inproceedings{Mehler:2008:f,
      author    = {Mehler, Alexander},
      title     = {A Short Note on Social-Semiotic Networks from the Point of View
                   of Quantitative Semantics},
      booktitle = {Proceedings of the Dagstuhl Seminar on Social Web Communities,
                   September 21-26, Dagstuhl},
      editor    = {Alani, Harith and Staab, Steffen and Stumme, Gerd},
      pdf       = {http://drops.dagstuhl.de/opus/volltexte/2008/1788/pdf/08391.MehlerAlexander.ExtAbstract.1788.pdf},
      year      = {2008}
    }
    Alexander Mehler, Rüdiger Gleim, Alexandra Ernst and Ulli Waltinger. 2008. WikiDB: Building Interoperable Wiki-Based Knowledge Resources for Semantic Databases. Sprache und Datenverarbeitung. International Journal for Language Data Processing, 32(1):47–70.
    BibTeX
    @article{Mehler:Gleim:Ernst:Waltinger:2008,
      author    = {Mehler, Alexander and Gleim, Rüdiger and Ernst, Alexandra and Waltinger, Ulli},
      title     = {WikiDB: Building Interoperable Wiki-Based Knowledge Resources
                   for Semantic Databases},
      journal   = {Sprache und Datenverarbeitung. International Journal
                       for Language Data Processing},
      volume    = {32},
      number    = {1},
      pages     = {47-70},
      abstract  = {This article describes an API for exploring the logical document
                   and the logical network structure of wikis. It introduces an algorithm
                   for the semantic preprocessing, filtering and typing of these
                   building blocks. Further, this article models the process of wiki
                   generation based on a unified format of syntactic, semantic and
                   pragmatic representations. This three-level approach to making
                   syntactic, semantic and pragmatic aspects of wiki-based structure
                   formation accessible is complemented by a corresponding database model –
                   called WikiDB – and an API operating thereon. Finally, the article
                   provides an empirical study of using the three-fold representation
                   format in conjunction with WikiDB.},
      pdf       = {http://www.ulliwaltinger.de/pdf/Konvens_2008_WikiDB_Building_Semantic_Databases_MehlerGleimErnstWaltinger.pdf},
      year      = {2008}
    }
    Ulli Waltinger and Alexander Mehler. 2008. Who is it? Context sensitive named entity and instance recognition by means of Wikipedia. Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence (WI-2008), 381–384.
    BibTeX
    @inproceedings{Waltinger:Mehler:2008:a,
      author    = {Waltinger, Ulli and Mehler, Alexander},
      title     = {Who is it? Context sensitive named entity and instance recognition
                   by means of Wikipedia},
      booktitle = {Proceedings of the 2008 IEEE/WIC/ACM International Conference
                   on Web Intelligence (WI-2008)},
      pages     = {381–384},
      publisher = {IEEE Computer Society},
      pdf       = {http://www.ulliwaltinger.de/pdf/WI_2008_Context_Sensitive_Instance_Recognition_WaltingerMehler.pdf},
      website   = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.324.5881},
      year      = {2008}
    }
    Andy Lücking, Alexander Mehler and Peter Menke. June 2–4, 2008. Taking Fingerprints of Speech-and-Gesture Ensembles: Approaching Empirical Evidence of Intrapersonal Alignment in Multimodal Communication. LONDIAL 2008: Proceedings of the 12th Workshop on the Semantics and Pragmatics of Dialogue (SEMDIAL), 157–164.
    BibTeX
    @inproceedings{Luecking:Mehler:Menke:2008,
      author    = {Lücking, Andy and Mehler, Alexander and Menke, Peter},
      title     = {Taking Fingerprints of Speech-and-Gesture Ensembles: Approaching
                   Empirical Evidence of Intrapersonal Alignment in Multimodal Communication},
      booktitle = {LONDIAL 2008: Proceedings of the 12th Workshop on the Semantics
                   and Pragmatics of Dialogue (SEMDIAL)},
      pages     = {157–164},
      address   = {King's College London},
      month     = {June 2–4},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/luecking_mehler_menke_2008.pdf},
      website   = {https://www.researchgate.net/publication/237305375_Taking_Fingerprints_of_Speech-and-Gesture_Ensembles_Approaching_Empirical_Evidence_of_Intrapersonal_Alignment_in_Multimodal_Communication},
      year      = {2008}
    }
    Alexander Mehler and Tilmann Sutter. 2008. Interaktive Textproduktion in Wiki-basierten Kommunikationssystemen. Kommunikation, Partizipation und Wirkungen im Social Web – Weblogs, Wikis, Podcasts und Communities aus interdisziplinärer Sicht, 267–300.
    BibTeX
    @incollection{Mehler:Sutter:2008,
      author    = {Mehler, Alexander and Sutter, Tilmann},
      title     = {Interaktive Textproduktion in Wiki-basierten Kommunikationssystemen},
      booktitle = {Kommunikation, Partizipation und Wirkungen im Social Web – Weblogs,
                   Wikis, Podcasts und Communities aus interdisziplin{\"a}rer Sicht},
      publisher = {Herbert von Halem},
      editor    = {Zerfa{\ss}, Ansgar and Welker, Martin and Schmidt, Jan},
      pages     = {267-300},
      address   = {Köln},
      year      = {2008}
    }
    Alexander Mehler. 2008. On the Impact of Community Structure on Self-Organizing Lexical Networks. Proceedings of the 7th Evolution of Language Conference (Evolang 2008), March 11-15, 2008, Barcelona, 227–234.
    BibTeX
    @inproceedings{Mehler:2008:e,
      author    = {Mehler, Alexander},
      title     = {On the Impact of Community Structure on Self-Organizing Lexical Networks},
      booktitle = {Proceedings of the 7th Evolution of Language Conference (Evolang
                   2008), March 11-15, 2008, Barcelona},
      editor    = {Smith, Andrew D. M. and Smith, Kenny and Cancho, Ramon Ferrer i},
      pages     = {227-234},
      publisher = {World Scientific},
      abstract  = {This paper presents a simulation model of self-organizing lexical
                   networks. Its starting point is the notion of an association game
                   in which the impact of varying community models is studied on
                   the emergence of lexical networks. The paper reports on experiments
                   whose results are in accordance with findings in the framework
                   of the naming game. This is done by means of a multilevel network
                   model in which the correlation of social and of linguistic networks
                   is studied.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_2008_b.pdf},
      website   = {http://stel.ub.edu/evolang2008/evo10.htm},
      year      = {2008}
    }
    Olga Abramov and Alexander Mehler. 2008. Towards a Uniform Representation of Treebanks: Providing Interoperability for Dependency Tree Data. Proceedings of the First International Conference on Global Interoperability for Language Resources (ICGL 2008), Hong Kong SAR, January 9-11.
    BibTeX
    @inproceedings{Pustylnikov:Mehler:2008:a,
      author    = {Abramov, Olga and Mehler, Alexander},
      title     = {Towards a Uniform Representation of Treebanks: Providing Interoperability
                   for Dependency Tree Data},
      booktitle = {Proceedings of the First International Conference on Global Interoperability
                   for Language Resources (ICGL 2008), Hong Kong SAR, January 9-11},
      abstract  = {In this paper we present a corpus representation format which
                   unifies the representation of a wide range of dependency treebanks
                   within a single model. This approach provides interoperability
                   and reusability of annotated syntactic data which in turn extends
                   its applicability within various research contexts. We demonstrate
                   our approach by means of dependency treebanks of 11 languages.
                   Further, we perform a comparative quantitative analysis of these
                   treebanks in order to demonstrate the interoperability of our
                   approach.},
      pdf       = {http://wwwhomes.uni-bielefeld.de/opustylnikov/pustylnikov/pdfs/acl07.1.0.pdf},
      website   = {https://www.researchgate.net/publication/242681771_Towards_a_Uniform_Representation_of_Treebanks_Providing_Interoperability_for_Dependency_Tree_Data},
      year      = {2008}
    }
    Georg Rehm, Marina Santini, Alexander Mehler, Pavel Braslavski, Rüdiger Gleim, Andrea Stubbe, Svetlana Symonenko, Mirko Tavosanis and Vedrana Vidulin. 2008. Towards a Reference Corpus of Web Genres for the Evaluation of Genre Identification Systems. Proceedings of the 6th Language Resources and Evaluation Conference (LREC 2008), Marrakech (Morocco).
    BibTeX
    @inproceedings{Rehm:Santini:Mehler:Braslavski:Gleim:Stubbe:Symonenko:Tavosanis:Vidulin:2008,
      author    = {Rehm, Georg and Santini, Marina and Mehler, Alexander and Braslavski, Pavel
                   and Gleim, Rüdiger and Stubbe, Andrea and Symonenko, Svetlana and Tavosanis, Mirko
                   and Vidulin, Vedrana},
      title     = {Towards a Reference Corpus of Web Genres for the Evaluation of
                   Genre Identification Systems},
      booktitle = {Proceedings of the 6th Language Resources and Evaluation Conference
                   (LREC 2008), Marrakech (Morocco)},
      abstract  = {We present initial results from an international and multi-disciplinary
                   research collaboration that aims at the construction of a reference
                   corpus of web genres. The primary application scenario for which
                   we plan to build this resource is the automatic identification
                   of web genres. Web genres are rather difficult to capture and
                   to describe in their entirety, but we plan for the finished reference
                   corpus to contain multi-level tags of the respective genre or
                   genres a web document or a website instantiates. As the construction
                   of such a corpus is by no means a trivial task, we discuss several
                   alternatives that are, for the time being, mostly based on existing
                   collections. Furthermore, we discuss a shared set of genre categories
                   and a multi-purpose tool as two additional prerequisites for a
                   reference corpus of web genres.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/rehm_santini_mehler_braslavski_gleim_stubbe_symonenko_tavosanis_vidulin_2008.pdf},
      website   = {http://www.lrec-conf.org/proceedings/lrec2008/summaries/94.html},
      year      = {2008}
    }

    2007

    Rüdiger Gleim, Alexander Mehler, Matthias Dehmer and Olga Abramov. 2007. Aisles through the Category Forest – Utilising the Wikipedia Category System for Corpus Building in Machine Learning. 3rd International Conference on Web Information Systems and Technologies (WEBIST '07), March 3-6, 2007, Barcelona, 142–149.
    BibTeX
    @inproceedings{Gleim:Mehler:Dehmer:Abramov:2007,
      author    = {Gleim, Rüdiger and Mehler, Alexander and Dehmer, Matthias and Abramov, Olga},
      title     = {Aisles through the Category Forest – Utilising the Wikipedia Category
                   System for Corpus Building in Machine Learning},
      booktitle = {3rd International Conference on Web Information Systems and Technologies
                   (WEBIST '07), March 3-6, 2007, Barcelona},
      editor    = {Filipe, Joaquim and Cordeiro, José and Encarnação, Bruno and Pedrosa, Vitor},
      pages     = {142-149},
      address   = {Barcelona},
  abstract  = {The World Wide Web is a continuous challenge to machine learning.
                   Established approaches have to be enhanced and new methods be
                   developed in order to tackle the problem of finding and organising
               relevant information. It has often been argued that semantic
               classifications of input documents help to solve this task. But
                   while approaches of supervised text categorisation perform quite
                   well on genres found in written text, newly evolved genres on
                   the web are much more demanding. In order to successfully develop
                   approaches to web mining, respective corpora are needed. However,
                   the composition of genre- or domain-specific web corpora is still
               an unsolved problem. It is time-consuming to build large corpora
                   of good quality because web pages typically lack reliable meta
                   information. Wikipedia along with similar approaches of collaborative
                   text production offers a way out of this dilemma. We examine how
                   social tagging, as supported by the MediaWiki software, can be
                   utilised as a source of corpus building. Further, we describe
                   a representation format for social ontologies and present the
                   Wikipedia Category Explorer, a tool which supports categorical
                   views to browse through the Wikipedia and to construct domain
                   specific corpora for machine learning.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2016/10/webist_2007-gleim_mehler_dehmer_pustylnikov.pdf},
      year      = {2007}
    }
Alexander Mehler, Rüdiger Gleim and Armin Wegner. 2007. Structural Uncertainty of Hypertext Types. An Empirical Study. Proceedings of the Workshop "Towards Genre-Enabled Search Engines: The Impact of NLP", September 30, 2007, in conjunction with RANLP 2007, Borovets, Bulgaria, 13–19.
    BibTeX
    @inproceedings{Mehler:Gleim:Wegner:2007,
      author    = {Mehler, Alexander and Gleim, Rüdiger and Wegner, Armin},
      title     = {Structural Uncertainty of Hypertext Types. An Empirical Study},
      booktitle = {Proceedings of the Workshop "Towards Genre-Enabled Search Engines:
               The Impact of NLP", September 30, 2007, in conjunction with RANLP
                   2007, Borovets, Bulgaria},
      editor    = {Rehm, Georg and Santini, Marina},
      pages     = {13-19},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/RANLP.pdf},
      year      = {2007}
    }
    Alexander Mehler. 2007. Evolving Lexical Networks. A Simulation Model of Terminological Alignment. Proceedings of the Workshop on Language, Games, and Evolution at the 9th European Summer School in Logic, Language and Information (ESSLLI 2007), Trinity College, Dublin, 6-17 August, 57–67.
    BibTeX
    @inproceedings{Mehler:2007:d,
      author    = {Mehler, Alexander},
      title     = {Evolving Lexical Networks. A Simulation Model of Terminological Alignment},
      booktitle = {Proceedings of the Workshop on Language, Games, and Evolution
                   at the 9th European Summer School in Logic, Language and Information
                   (ESSLLI 2007), Trinity College, Dublin, 6-17 August},
      editor    = {Benz, Anton and Ebert, Christian and van Rooij, Robert},
      pages     = {57-67},
      abstract  = {In this paper we describe a simulation model of terminological
                   alignment in a multiagent community. It is based on the notion
                   of an association game which is used instead of the classical
                   notion of a naming game (Steels, 1996). The simulation model integrates
                   a small world-like agent community which restricts agent communication.
                   We hypothesize that this restriction is decisive when it comes
                   to simulate terminological alignment based on lexical priming.
                   The paper presents preliminary experimental results in support
                   of this hypothesis.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_2007_d.pdf},
      year      = {2007}
    }
    Alexander Mehler, Peter Geibel, Rüdiger Gleim, Sebastian Herold, Brijnesh-Johannes Jain and Olga Abramov. 2007. Much Ado About Text Content. Learning Text Types Solely by Structural Differentiae. Proceedings of OTT '06 – Ontologies in Text Technology: Approaches to Extract Semantic Knowledge from Structured Information, 63–71.
    BibTeX
    @inproceedings{Mehler:Geibel:Gleim:Herold:Jain:Pustylnikov:2007,
      author    = {Mehler, Alexander and Geibel, Peter and Gleim, Rüdiger and Herold, Sebastian
                   and Jain, Brijnesh-Johannes and Abramov, Olga},
      title     = {Much Ado About Text Content. Learning Text Types Solely by Structural
                   Differentiae},
      booktitle = {Proceedings of OTT '06 – Ontologies in Text Technology: Approaches
                   to Extract Semantic Knowledge from Structured Information},
      editor    = {Mönnich, Uwe and Kühnberger, Kai-Uwe},
      series    = {Publications of the Institute of Cognitive Science
                       (PICS)},
      pages     = {63-71},
      address   = {Osnabrück},
      abstract  = {In this paper, we deal with classifying texts into classes which
                   denote text types whose textual instances serve more or less homogeneous
               functions. Unlike mainstream approaches to text classification,
                   which rely on the vector space model [30] or some of its descendants
                   [2] and, thus, on content-related lexical features, we solely
                   refer to structural differentiae, that is, to patterns of text
                   structure as determinants of class membership. Further, we suppose
                   that text types span a type hierarchy based on the type-subtype
                   relation [31]. Thus, although we admit that class membership is
                   fuzzy so that overlapping classes are inevitable, we suppose a
                   non-overlapping type system structured into a rooted tree – whether
               solely based on functional or additionally on, e.g., content- or
               media-based criteria [1]. As regards criteria of goodness of
                   classification, we perform a classical supervised categorization
                   experiment [30] based on cross-validation as a method of model
                   selection [11]. That is, we perform a categorization experiment
                   in which for all training and test cases class membership is known
                   ex ante. In summary, we perform a supervised experiment of text
                   classification in order to learn functionally grounded text types
                   where membership to these types is solely based on structural
                   criteria.},
      pdf       = {http://ikw.uni-osnabrueck.de/~ott06/ott06-abstracts/Mehler_Geibel_abstract.pdf},
      year      = {2007}
    }
    Matthias Dehmer, Alexander Mehler and Frank Emmert-Streib. 2007. Graph-theoretical Characterizations of Generalized Trees. Proceedings of the 2007 International Conference on Machine Learning: Models, Technologies & Applications (MLMTA '07), June 25-28, 2007, Las Vegas, 113–117.
    BibTeX
    @inproceedings{Dehmer:Mehler:Emmert-Streib:2007:a,
      author    = {Dehmer, Matthias and Mehler, Alexander and Emmert-Streib, Frank},
      title     = {Graph-theoretical Characterizations of Generalized Trees},
      booktitle = {Proceedings of the 2007 International Conference on Machine Learning:
                   Models, Technologies \& Applications (MLMTA '07), June 25-28,
                   2007, Las Vegas},
      pages     = {113-117},
      website   = {https://www.researchgate.net/publication/221188591_Graph-theoretical_Characterizations_of_Generalized_Trees},
      year      = {2007}
    }
    Rüdiger Gleim, Alexander Mehler and Hans-Jürgen Eikmeyer. 2007. Representing and Maintaining Large Corpora. Proceedings of the Corpus Linguistics 2007 Conference, Birmingham (UK).
    BibTeX
    @inproceedings{Gleim:Mehler:Eikmeyer:2007:a,
      author    = {Gleim, Rüdiger and Mehler, Alexander and Eikmeyer, Hans-Jürgen},
      title     = {Representing and Maintaining Large Corpora},
      booktitle = {Proceedings of the Corpus Linguistics 2007 Conference, Birmingham (UK)},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/gleim_mehler_eikmeyer_2007_a.pdf},
      year      = {2007}
    }
    Peter Geibel, Olga Abramov, Alexander Mehler, Helmar Gust and Kai-Uwe Kühnberger. 2007. Classification of Documents Based on the Structure of Their DOM Trees. Proceedings of ICONIP 2007 (14th International Conference on Neural Information Processing), 779–788.
    BibTeX
    @inproceedings{Geibel:Pustylnikov:Mehler:Gust:Kuehnberger:2007,
      author    = {Geibel, Peter and Abramov, Olga and Mehler, Alexander and Gust, Helmar
                   and Kühnberger, Kai-Uwe},
      title     = {Classification of Documents Based on the Structure of Their DOM Trees},
      booktitle = {Proceedings of ICONIP 2007 (14th International Conference on Neural
                   Information Processing)},
      series    = {Lecture Notes in Computer Science 4985},
  pages     = {779-788},
      publisher = {Springer},
      abstract  = {In this paper, we discuss kernels that can be applied for the
                   classification of XML documents based on their DOM trees. DOM
                   trees are ordered trees in which every node might be labeled by
                   a vector of attributes including its XML tag and the textual content.
                   We describe five new kernels suitable for such structures: a kernel
                   based on predefined structural features, a tree kernel derived
                   from the well-known parse tree kernel, the set tree kernel that
                   allows permutations of children, the string tree kernel being
                   an extension of the so-called partial tree kernel, and the soft
                   tree kernel as a more efficient alternative. We evaluate the kernels
                   experimentally on a corpus containing the DOM trees of newspaper
                   articles and on the well-known SUSANNE corpus.},
      website   = {http://www.springerlink.com/content/x414002113425742/},
      year      = {2007}
    }
    Bernhard Jussen, Alexander Mehler and Alexandra Ernst. 2007. A Corpus Management System for Historical Semantics. Sprache und Datenverarbeitung. International Journal for Language Data Processing, 31(1-2):81–89.
    BibTeX
    @article{Jussen:Mehler:Ernst:2007,
      author    = {Jussen, Bernhard and Mehler, Alexander and Ernst, Alexandra},
      title     = {A Corpus Management System for Historical Semantics},
      journal   = {Sprache und Datenverarbeitung. International Journal
                       for Language Data Processing},
      volume    = {31},
      number    = {1-2},
      pages     = {81-89},
  abstract  = {This article describes a corpus management system for historical
               semantics. It is based on a notion of meaning that, methodologically
               speaking, rests on the analysis of diachronic corpora. The aim of
               analyzing these corpora is to study semantic change as a reference
               point for the change of social systems. The corpus management system
               presented here supports this kind of corpus-based historical
               semantics.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/jussen_mehler_ernst_2007.pdf},
      year      = {2007}
    }
    Alexander Mehler and Reinhard Köhler. 2007. Machine Learning in a Semiotic Perspective. Aspects of Automatic Text Analysis, 1–29.
    BibTeX
    @incollection{Mehler:Koehler:2007:b,
      author    = {Mehler, Alexander and Köhler, Reinhard},
      title     = {Machine Learning in a Semiotic Perspective},
      booktitle = {Aspects of Automatic Text Analysis},
      publisher = {Springer},
      editor    = {Mehler, Alexander and Köhler, Reinhard},
      series    = {Studies in Fuzziness and Soft Computing},
      pages     = {1-29},
      address   = {Berlin/New York},
  abstract  = {The subject of this article is the connotative aspect of the meanings
               of texts. The starting point of these reflections on the connotation
               of texts is the view that the constitution of word and text meaning
               results from a circular process which is responsible for the emergence
               of a hierarchy of nested linguistic units. The process of sign
               articulation proceeds along these levels and, by connecting the
               (connotative) content plane and the expression plane at the text
               level, produces the text sign. In contrast to a strict interpretation
               of Frege's principle of compositionality, according to which the
               meanings of linguistic units are presupposed as fixed, context-free
               quantities, the present approach treats even lexical meaning as a
               quantity that may vary depending on its context. From a semiotic
               perspective, it is above all its gestalt character that exempts
               connotative text meaning from an application of the Frege principle.
               In other words, the connotative meaning of a text can by no means
               be decomposed into a structure of 'atomic' representations. The
               hierarchical organization of texts proves to be complex insofar as
               their meanings result from a circular process which acts on the
               meanings of the text constituents in a confirming and/or modifying
               manner. This circularity implies that texts are to be regarded not
               only as sites of the manifestation of word meaning structures but
               also as starting points for the modification and emergence of such
               structures. In the following, drawing on Copenhagen structuralism,
               a model of the connotative meaning of texts is developed which is
               oriented, among other things, towards the glossematic notion of the
               constant. The model is formalized by means of the concept of the
               fuzzy set. To this end, the fuzzy usage regularities of words are
               analyzed on the basis of a two-stage procedure which takes into
               account the syntagmatic and paradigmatic regularities of word usage.
               The role of the sentence level within the process of the constitution
               of connotative text meaning is indicated. Finally, the algorithm is
               exemplified by the automatic analysis of a text corpus.},
      website   = {http://rd.springer.com/chapter/10.1007/978-3-540-37522-7_1},
      year      = {2007}
    }
    Alexander Mehler, Ulli Waltinger and Armin Wegner. 2007. A Formal Text Representation Model Based on Lexical Chaining. Proceedings of the KI 2007 Workshop on Learning from Non-Vectorial Data (LNVD 2007) September 10, Osnabrück, 17–26.
    BibTeX
    @inproceedings{Mehler:Waltinger:Wegner:2007:a,
      author    = {Mehler, Alexander and Waltinger, Ulli and Wegner, Armin},
      title     = {A Formal Text Representation Model Based on Lexical Chaining},
      booktitle = {Proceedings of the KI 2007 Workshop on Learning from Non-Vectorial
                   Data (LNVD 2007) September 10, Osnabrück},
      editor    = {Geibel, Peter and Jain, Brijnesh J.},
      pages     = {17-26},
      address   = {Osnabrück},
      publisher = {Universit{\"a}t Osnabrück},
      abstract  = {This paper presents a formal text representation model as an alternative
                   to the vector space model. It combines a tree-like model with
                   graph-inducing lexical relations. The paper aims at formalizing
               two hitherto unrelated approaches, i.e. lexical chaining [3] and quantitative
                   structure analysis [9], in order to combine content and structure
                   modeling.},
      pdf       = {http://www.ulliwaltinger.de/pdf/LNVD07MehlerWaltingerWegner.pdf},
      year      = {2007}
    }
    Alexander Mehler, Peter Geibel and Olga Abramov. 2007. Structural Classifiers of Text Types: Towards a Novel Model of Text Representation. Journal for Language Technology and Computational Linguistics (JLCL), 22(2):51–66.
    BibTeX
    @article{Mehler:Geibel:Pustylnikov:2007,
      author    = {Mehler, Alexander and Geibel, Peter and Abramov, Olga},
      title     = {Structural Classifiers of Text Types: Towards a Novel Model of
                   Text Representation},
      journal   = {Journal for Language Technology and Computational
                       Linguistics (JLCL)},
      volume    = {22},
      number    = {2},
      pages     = {51-66},
      abstract  = {Texts can be distinguished in terms of their content, function,
                   structure or layout (Brinker, 1992; Bateman et al., 2001; Joachims,
               2002; Power et al., 2003). These reference points do not necessarily
               open orthogonal perspectives on text classification. As
                   part of explorative data analysis, text classification aims at
                   automatically dividing sets of textual objects into classes of
                   maximum internal homogeneity and external heterogeneity. This
                   paper deals with classifying texts into text types whose instances
               serve more or less homogeneous functions. Unlike mainstream
                   approaches, which rely on the vector space model (Sebastiani,
                   2002) or some of its descendants (Baeza-Yates and Ribeiro-Neto,
                   1999) and, thus, on content-related lexical features, we solely
                   refer to structural differentiae. That is, we explore patterns
               of text structure as determinants of class membership. Our starting
               point is tree-like text representations which induce feature
                   vectors and tree kernels. These kernels are utilized in supervised
                   learning based on cross-validation as a method of model selection
                   (Hastie et al., 2001) by example of a corpus of press communication.
                   For a subset of categories we show that classification can be
               performed very well by structural differentiae only.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_geibel_pustylnikov_2007.pdf},
      website   = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.604},
      year      = {2007}
    }
    Olga Abramov and Alexander Mehler. 2007. Structural Differentiae of Text Types. A Quantitative Model. Proceedings of the 31st Annual Conference of the German Classification Society on Data Analysis, Machine Learning, and Applications (GfKl), 655–662.
    BibTeX
    @inproceedings{Abramov:Mehler:2007:b,
      author    = {Abramov, Olga and Mehler, Alexander},
      title     = {Structural Differentiae of Text Types. A Quantitative Model},
      booktitle = {Proceedings of the 31st Annual Conference of the German Classification
                   Society on Data Analysis, Machine Learning, and Applications (GfKl)},
  pages     = {655-662},
      pdf       = {http://wwwhomes.uni-bielefeld.de/opustylnikov/pustylnikov/pdfs/gfkl.pdf},
      website   = {http://www.springerprofessional.de/077---structural-differentiae-of-text-types--a-quantitative-model/1957362.html},
      year      = {2007}
    }
Alexander Mehler and Reinhard Köhler. 2007. Aspects of Automatic Text Analysis: Festschrift in Honor of Burghard Rieger. Ed. by Alexander Mehler and Reinhard Köhler. Studies in Fuzziness and Soft Computing. Springer.
    BibTeX
    @book{Mehler:Koehler:2007:a,
      author    = {Mehler, Alexander and Köhler, Reinhard},
      editor    = {Mehler, Alexander and Köhler, Reinhard},
      title     = {Aspects of Automatic Text Analysis: Festschrift in Honor of Burghard Rieger},
      publisher = {Springer},
      series    = {Studies in Fuzziness and Soft Computing},
      address   = {Berlin/New York},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/AspectsOfAutomaticTextAnalysis.jpg},
      pagetotal = {464},
      review    = {http://www.degruyter.com/view/j/zrs.2011.3.issue-2/zrs.2011.050/zrs.2011.050.xml},
      review2   = {http://irsg.bcs.org/informer/Informer27.pdf},
      website   = {http://www.springer.com/de/book/9783540375203},
      year      = {2007}
    }
    Alexander Mehler and Angelika Storrer. 2007. What are Ontologies Good For? Evaluating Terminological Ontologies in the Framework of Text Graph Classification. Proceedings of OTT '06 – Ontologies in Text Technology: Approaches to Extract Semantic Knowledge from Structured Information, 11–18.
    BibTeX
    @inproceedings{Mehler:Storrer:2007,
      author    = {Mehler, Alexander and Storrer, Angelika},
      title     = {What are Ontologies Good For? Evaluating Terminological Ontologies
                   in the Framework of Text Graph Classification},
      booktitle = {Proceedings of OTT '06 – Ontologies in Text Technology: Approaches
                   to Extract Semantic Knowledge from Structured Information},
      editor    = {Mönnich, Uwe and Kühnberger, Kai-Uwe},
      series    = {Publications of the Institute of Cognitive Science
                       (PICS)},
      pages     = {11-18},
      address   = {Osnabrück},
      pdf       = {http://cogsci.uni-osnabrueck.de/~ott06/ott06-abstracts/Mehler_Storrer_abstract.pdf},
      website   = {http://citeseer.uark.edu:8080/citeseerx/viewdoc/summary?doi=10.1.1.91.2979},
      year      = {2007}
    }
    Maik Stührenberg, Daniela Goecke, Nils Diewald, Alexander Mehler and Irene Cramer. 2007. Web-based Annotation of Anaphoric Relations and Lexical Chains. Proceedings of the Linguistic Annotation Workshop, ACL 2007, 140–147.
    BibTeX
    @inproceedings{Stuehrenberg:Goecke:Diewald:Mehler:Cramer:2007:a,
      author    = {Stührenberg, Maik and Goecke, Daniela and Diewald, Nils and Mehler, Alexander
                   and Cramer, Irene},
      title     = {Web-based Annotation of Anaphoric Relations and Lexical Chains},
      booktitle = {Proceedings of the Linguistic Annotation Workshop, ACL 2007},
  pages     = {140-147},
      pdf       = {http://www.aclweb.org/anthology/W07-1523},
      website   = {https://www.researchgate.net/publication/234800610_Web-based_annotation_of_anaphoric_relations_and_lexical_chains},
      year      = {2007}
    }
    Ramon Ferrer i Cancho, Alexander Mehler, Olga Abramov and Albert Díaz-Guilera. 2007. Correlations in the organization of large-scale syntactic dependency networks. Proceedings of Graph-based Methods for Natural Language Processing (TextGraphs-2) at the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2007), Rochester, New York, 65–72.
    BibTeX
    @inproceedings{Ferrer:i:Cancho:Mehler:Pustylnikov:Diaz-Guilera:2007:a,
      author    = {Ferrer i Cancho, Ramon and Mehler, Alexander and Abramov, Olga
                   and Díaz-Guilera, Albert},
      title     = {Correlations in the organization of large-scale syntactic dependency networks},
      booktitle = {Proceedings of Graph-based Methods for Natural Language Processing
                   (TextGraphs-2) at the Annual Conference of the North American
                   Chapter of the Association for Computational Linguistics (NAACL-HLT
                   2007), Rochester, New York},
      pages     = {65-72},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/ferrer-i-cancho_mehler_pustylnikov_diaz-guilera_2007_a.pdf},
      year      = {2007}
    }
    Rüdiger Gleim, Alexander Mehler, Hans-Jürgen Eikmeyer and Hannes Rieser. 2007. Ein Ansatz zur Repräsentation und Verarbeitung großer Korpora multimodaler Daten. Data Structures for Linguistic Resources and Applications. Proceedings of the Biennial GLDV Conference 2007, 11.–13. April, Universität Tübingen, 275–284.
    BibTeX
    @inproceedings{Gleim:Mehler:Eikmeyer:Rieser:2007,
      author    = {Gleim, Rüdiger and Mehler, Alexander and Eikmeyer, Hans-Jürgen
                   and Rieser, Hannes},
      title     = {Ein Ansatz zur Repr{\"a}sentation und Verarbeitung gro{\ss}er
                   Korpora multimodaler Daten},
      booktitle = {Data Structures for Linguistic Resources and Applications. Proceedings
                   of the Biennial GLDV Conference 2007, 11.–13. April, Universit{\"a}t
                   Tübingen},
      editor    = {Rehm, Georg and Witt, Andreas and Lemnitzer, Lothar},
      pages     = {275-284},
      address   = {Tübingen},
      publisher = {Narr},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/gleim_mehler_eikmeyer_rieser_2007.pdf},
      year      = {2007}
    }
    Alexander Mehler. 2007. Aspectos Metodológicos da Semiótica Computacional. Computação, Cognição e Semiose, 145–157.
    BibTeX
    @incollection{Mehler:2004:2007,
      author    = {Mehler, Alexander},
      title     = {Aspectos Metodológicos da Semiótica Computacional},
      booktitle = {Computação, Cognição e Semiose},
      publisher = {EDUFBA},
      editor    = {Queiroz, João and Gudwin, Ricardo and Loula, Angelo},
      pages     = {145-157},
      address   = {Federal University of Bahia},
      year      = {2007}
    }
    Alexander Mehler. 2007. Compositionality in Quantitative Semantics. A Theoretical Perspective on Text Mining. Aspects of Automatic Text Analysis, 139–167.
    BibTeX
    @incollection{Mehler:2007:b,
      author    = {Mehler, Alexander},
      title     = {Compositionality in Quantitative Semantics. A Theoretical Perspective
                   on Text Mining},
      booktitle = {Aspects of Automatic Text Analysis},
      publisher = {Springer},
      editor    = {Mehler, Alexander and Köhler, Reinhard},
      series    = {Studies in Fuzziness and Soft Computing},
      pages     = {139-167},
      address   = {Berlin/New York},
      abstract  = {This chapter introduces a variant of the principle of compositionality
                   in quantitative text semantics as an alternative to the bag-of-features
                   approach. The variant includes effects of context-sensitive interpretation
                   as well as processes of meaning constitution and change in the
                   sense of usage-based semantics. Its starting point is a combination
                   of semantic space modeling and text structure analysis. The principle
                   is implemented by means of a hierarchical constraint satisfaction
                   process which utilizes the notion of hierarchical text structure
                   superimposed by graph-inducing coherence relations. The major
                   contribution of the chapter is a conceptualization and formalization
                   of the principle of compositionality in terms of semantic spaces
                   which tackles some well known deficits of existing approaches.
                   In particular this relates to the missing linguistic interpretability
                   of statistical meaning representations.},
      website   = {http://www.springerlink.com/content/x214w527g42x0116/},
      year      = {2007}
    }
    Matthias Dehmer and Alexander Mehler. 2007. A New Method of Measuring the Similarity for a Special Class of Directed Graphs. Tatra Mountains Mathematical Publications, 36:39–59.
    BibTeX
    @article{Dehmer:Mehler:2007:a,
      author    = {Dehmer, Matthias and Mehler, Alexander},
      title     = {A New Method of Measuring the Similarity for a Special Class of Directed Graphs},
      journal   = {Tatra Mountains Mathematical Publications},
      volume    = {36},
      pages     = {39-59},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/dehmer_mehler_2004_a.pdf},
      website   = {https://www.researchgate.net/publication/228905939_A_new_method_of_measuring_similarity_for_a_special_class_of_directed_graphs},
      year      = {2007}
    }
    Peter Geibel, Ulf Krumnack, Olga Abramov, Alexander Mehler, Helmar Gust and Kai-Uwe Kühnberger. 2007. Structure-Sensitive Learning of Text Types. Proceedings of AI 2007: Advances in Artificial Intelligence, 20th Australian Joint Conference on Artificial Intelligence, Gold Coast, Australia, December 2-6, 2007, 4830:642–646.
    BibTeX
    @inproceedings{Geibel:Krumnack:Pustylnikov:Mehler:Gust:Kuehnberger:2007,
      author    = {Geibel, Peter and Krumnack, Ulf and Abramov, Olga and Mehler, Alexander
                   and Gust, Helmar and Kühnberger, Kai-Uwe},
      title     = {Structure-Sensitive Learning of Text Types},
      booktitle = {Proceedings of AI 2007: Advances in Artificial Intelligence, 20th
                   Australian Joint Conference on Artificial Intelligence, Gold Coast,
                   Australia, December 2-6, 2007},
      editor    = {Orgun, Mehmet A. and Thornton, John},
      volume    = {4830},
      series    = {Lecture Notes in Computer Science},
      pages     = {642-646},
      publisher = {Springer},
      abstract  = {In this paper, we discuss the structure based classification of
                   documents based on their logical document structure, i.e., their
                   DOM trees. We describe a method using predefined structural features
                   and also four tree kernels suitable for such structures. We evaluate
                   the methods experimentally on a corpus containing the DOM trees
                   of newspaper articles, and on the well-known SUSANNE corpus. We
                   will demonstrate that, for the two corpora, many text types can
                   be learned based on structural features only.},
      website   = {http://www.springerlink.com/content/w574377ww1h6m212/},
      year      = {2007}
    }

    2006

    Alexander Mehler, Rüdiger Gleim and Matthias Dehmer. 2006. Towards Structure-Sensitive Hypertext Categorization. Proceedings of the 29th Annual Conference of the German Classification Society, March 9-11, 2005, Universität Magdeburg, 406–413.
    BibTeX
    @inproceedings{Mehler:Gleim:Dehmer:2006,
      author    = {Mehler, Alexander and Gleim, Rüdiger and Dehmer, Matthias},
      title     = {Towards Structure-Sensitive Hypertext Categorization},
      booktitle = {Proceedings of the 29th Annual Conference of the German Classification
                   Society, March 9-11, 2005, Universit{\"a}t Magdeburg},
      editor    = {Spiliopoulou, Myra and Kruse, Rudolf and Borgelt, Christian and Nürnberger, Andreas
                   and Gaul, Wolfgang},
      pages     = {406-413},
      address   = {Berlin/New York},
      publisher = {Springer},
      abstract  = {Hypertext categorization is the task of automatically assigning
                   category labels to hypertext units. Comparable to text categorization
                   it stays in the area of function learning based on the bag-of-features
                   approach. This scenario faces the problem of a many-to-many relation
                   between websites and their hidden logical document structure.
                   The paper argues that this relation is a prevalent characteristic
                   which interferes any effort of applying the classical apparatus
                   of categorization to web genres. This is confirmed by a threefold
                   experiment in hypertext categorization. In order to outline a
                   solution to this problem, the paper sketches an alternative method
                   of unsupervised learning which aims at bridging the gap between
                   statistical and structural pattern recognition (Bunke et al. 2001)
                   in the area of web mining.},
      website   = {http://www.springerlink.com/content/l7665tm3u241317l/},
      year      = {2006}
    }
    Alexander Mehler. 2006. A Network Perspective on Intertextuality. Exact Methods in the Study of Language and Text, 437–446.
    BibTeX
    @incollection{Mehler:2006:d,
      author    = {Mehler, Alexander},
      title     = {A Network Perspective on Intertextuality},
      booktitle = {Exact Methods in the Study of Language and Text},
      publisher = {De Gruyter},
      editor    = {Grzybek, Peter and Köhler, Reinhard},
      series    = {Quantitative Linguistics},
      pages     = {437-446},
      address   = {Berlin/New York},
      year      = {2006}
    }
    Matthias Dehmer, Frank Emmert-Streib, Alexander Mehler and Jürgen Kilian. 2006. Measuring the Structural Similarity of Web-based Documents: A Novel Approach. International Journal of Computational Intelligence, 3(1):1–7.
    BibTeX
    @article{Dehmer:Emmert:Streib:Mehler:Kilian:2006,
      author    = {Dehmer, Matthias and Emmert-Streib, Frank and Mehler, Alexander
                   and Kilian, Jürgen},
      title     = {Measuring the Structural Similarity of Web-based Documents: A Novel Approach},
      journal   = {International Journal of Computational Intelligence},
      volume    = {3},
      number    = {1},
      pages     = {1-7},
      abstract  = {Most known methods for measuring the structural similarity of
                   document structures are based on, e.g., tag measures, path metrics
                   and tree measures in terms of their DOM-Trees. Other methods measures
                   the similarity in the framework of the well known vector space
                   model. In contrast to these we present a new approach to measuring
                   the structural similarity of web-based documents represented by
                   so called generalized trees which are more general than DOM-Trees
                   which represent only directed rooted trees. We will design a new
                   similarity measure for graphs representing web-based hypertext
                   structures. Our similarity measure is mainly based on a novel
                   representation of a graph as strings of linear integers, whose
                   components represent structural properties of the graph. The similarity
                   of two graphs is then defined as the optimal alignment of the
                   underlying property strings. In this paper we apply the well known
                   technique of sequence alignments to solve a novel and challenging
                   problem: Measuring the structural similarity of generalized trees.
                   More precisely, we first transform our graphs considered as high
                   dimensional objects in linear structures. Then we derive similarity
                   values from the alignments of the property strings in order to
                   measure the structural similarity of generalized trees. Hence,
                   we transform a graph similarity problem to a string similarity
                   problem. We demonstrate that our similarity measure captures important
                   structural information by applying it to two different test sets
                   consisting of graphs representing web-based documents.},
      pdf       = {http://waset.org/publications/15928/measuring-the-structural-similarity-of-web-based-documents-a-novel-approach},
      website   = {http://connection.ebscohost.com/c/articles/24839145/measuring-structural-similarity-web-based-documents-novel-approach},
      year      = {2006}
    }
    Alexander Mehler and Rüdiger Gleim. 2006. The Net for the Graphs – Towards Webgenre Representation for Corpus Linguistic Studies. WaCky! Working Papers on the Web as Corpus, 191–224.
    BibTeX
    @incollection{Mehler:Gleim:2006:b,
      author    = {Mehler, Alexander and Gleim, Rüdiger},
      title     = {The Net for the Graphs – Towards Webgenre Representation for Corpus
                   Linguistic Studies},
      booktitle = {WaCky! Working Papers on the Web as Corpus},
      publisher = {Gedit},
      editor    = {Baroni, Marco and Bernardini, Silvia},
      pages     = {191-224},
      address   = {Bologna},
      website   = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.510.4125},
      year      = {2006}
    }
    Rüdiger Gleim, Alexander Mehler and Matthias Dehmer. 2006. Web Corpus Mining by Instance of Wikipedia. Proceedings of the EACL 2006 Workshop on Web as Corpus, April 3-7, 2006, Trento, Italy, 67–74.
    BibTeX
    @inproceedings{Gleim:Mehler:Dehmer:2006:a,
      author    = {Gleim, Rüdiger and Mehler, Alexander and Dehmer, Matthias},
      title     = {Web Corpus Mining by Instance of Wikipedia},
      booktitle = {Proceedings of the EACL 2006 Workshop on Web as Corpus, April
                   3-7, 2006, Trento, Italy},
      editor    = {Kilgarriff, Adam and Baroni, Marco},
      pages     = {67-74},
      pdf       = {http://www.aclweb.org/anthology/W06-1710},
      website   = {http://pub.uni-bielefeld.de/publication/1773538},
      year      = {2006}
    }
    Alexander Mehler. 2006. In Search of a Bridge Between Network Analysis in Computational Linguistics and Computational Biology – A Conceptual Note. BIOCOMP, 496–502.
    BibTeX
    @inproceedings{mehler:2006,
      author    = {Mehler, Alexander},
      title     = {In Search of a Bridge Between Network Analysis in Computational
                   Linguistics and Computational Biology – A Conceptual Note},
      booktitle = {BIOCOMP},
      pages     = {496--502},
      pdf       = {https://pdfs.semanticscholar.org/81aa/0b840ed413089d69908cff60628a92609ccd.pdf},
      year      = {2006}
    }
    Alexander Mehler. 2006. Text Linkage in the Wiki Medium – A Comparative Study. Proceedings of the EACL Workshop on New Text – Wikis and blogs and other dynamic text sources, April 3-7, 2006, Trento, Italy, 1–8.
    BibTeX
    @inproceedings{Mehler:2006:c,
      author    = {Mehler, Alexander},
      title     = {Text Linkage in the Wiki Medium – A Comparative Study},
      booktitle = {Proceedings of the EACL Workshop on New Text – Wikis and blogs
                   and other dynamic text sources, April 3-7, 2006, Trento, Italy},
      editor    = {Karlgren, Jussi},
      pages     = {1-8},
      pdf       = {http://www.aclweb.org/anthology/W06-2801},
      website   = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.165.6390},
      year      = {2006}
    }
    Alexander Mehler. 2006. Stratified Constraint Satisfaction Networks in Synergetic Multi-Agent Simulations of Language Evolution. Artificial Cognition Systems, 140–174.
    BibTeX
    @incollection{Mehler:2006:e,
      author    = {Mehler, Alexander},
      title     = {Stratified Constraint Satisfaction Networks in Synergetic Multi-Agent
                   Simulations of Language Evolution},
      booktitle = {Artificial Cognition Systems},
      publisher = {Idea Group Inc.},
      editor    = {Loula, Angelo and Gudwin, Ricardo and Queiroz, João},
      pages     = {140-174},
      address   = {Hershey},
      year      = {2006}
    }
    Alexander Mehler and Lorenz Sichelschmidt. 2006. Reconceptualizing Latent Semantic Analysis in Terms of Complex Network Theory. A Corpus-Linguistic Approach. 2nd International Conference of the German Cognitive Linguistics Association – Theme Session: Cognitive-Linguistic Approaches: What can we gain by computational treatment of data? 5.-7. Oktober 2006, Ludwig-Maximilians-Universität München, 23–26.
    BibTeX
    @inproceedings{Mehler:Sichelschmidt:2006,
      author    = {Mehler, Alexander and Sichelschmidt, Lorenz},
      title     = {Reconceptualizing Latent Semantic Analysis in Terms of Complex
                   Network Theory. A Corpus-Linguistic Approach},
      booktitle = {2nd International Conference of the German Cognitive Linguistics
                   Association – Theme Session: Cognitive-Linguistic Approaches:
                   What can we gain by computational treatment of data? 5.-7. Oktober
                   2006, Ludwig-Maximilians-Universit{\"a}t München},
      pages     = {23-26},
      editor    = {Alonge, Antonietta and Lönneker-Rodman, Birte},
      pdf       = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.87.5069&rep=rep1&type=pdf},
      year      = {2006}
    }
    Alexander Mehler, Matthias Dehmer and Rüdiger Gleim. 2006. Towards Logical Hypertext Structure - A Graph-Theoretic Perspective. Proceedings of the Fourth International Workshop on Innovative Internet Computing Systems (I2CS '04), 136–150.
    BibTeX
    @inproceedings{Mehler:Dehmer:Gleim:2006,
      author    = {Mehler, Alexander and Dehmer, Matthias and Gleim, Rüdiger},
      title     = {Towards Logical Hypertext Structure - A Graph-Theoretic Perspective},
      booktitle = {Proceedings of the Fourth International Workshop on Innovative
                   Internet Computing Systems (I2CS '04)},
      editor    = {Böhme, Thomas and Heyer, Gerhard},
      series    = {Lecture Notes in Computer Science 3473},
      pages     = {136-150},
      address   = {Berlin/New York},
      publisher = {Springer},
      abstract  = {Facing the retrieval problem according to the overwhelming set
                   of documents online the adaptation of text categorization to web
                   units has recently been pushed. The aim is to utilize categories
                   of web sites and pages as an additional retrieval criterion. In
                   this context, the bag-of-words model has been utilized just as
                   HTML tags and link structures. In spite of promising results this
                   adaptation stays in the framework of IR specific models since
                   it neglects the content-based structuring inherent to hypertext
                   units. This paper approaches hypertext modelling from the perspective
                   of graph-theory. It presents an XML-based format for representing
                   websites as hypergraphs. These hypergraphs are used to shed light
                   on the relation of hypertext structure types and their web-based
                   instances. We place emphasis on two characteristics of this relation:
                   In terms of realizational ambiguity we speak of functional equivalents
                   to the manifestation of the same structure type. In terms of polymorphism
                   we speak of a single web unit which manifests different structure
                   types. It is shown that polymorphism is a prevalent characteristic
                   of web-based units. This is done by means of a categorization
                   experiment which analyses a corpus of hypergraphs representing
                   the structure and content of pages of conference websites. On
                   this background we plead for a revision of text representation
                   models by means of hypergraphs which are sensitive to the manifold
                   structuring of web documents.},
      website   = {http://rd.springer.com/chapter/10.1007/11553762_14},
      year      = {2006}
    }
    Alexander Mehler. 2006. In Search of a Bridge between Network Analysis in Computational Linguistics and Computational Biology – A Conceptual Note. Proceedings of the 2006 International Conference on Bioinformatics & Computational Biology (BIOCOMP '06), June 26, 2006, Las Vegas, USA, 496–500.
    BibTeX
    @inproceedings{Mehler:2006:a,
      author    = {Mehler, Alexander},
      title     = {In Search of a Bridge between Network Analysis in Computational
                   Linguistics and Computational Biology – A Conceptual Note},
      booktitle = {Proceedings of the 2006 International Conference on Bioinformatics
                   \& Computational Biology (BIOCOMP '06), June 26, 2006, Las Vegas,
                   USA},
      editor    = {Arabnia, Hamid R. and Valafar, Homayoun},
      pages     = {496-500},
      pdf       = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.9842&rep=rep1&type=pdf},
      year      = {2006}
    }

    2005

    Matthias Dehmer, Frank Emmert-Streib, Alexander Mehler, Jürgen Kilian and Max Mühlhäuser. 2005. Application of a similarity measure for graphs to web-based document structures. Proceedings of VI. International Conference on Enformatika, Systems Sciences and Engineering, Budapest, Hungary, October 2005, International Academy of Sciences: Enformatika 8 (2005), 77–81.
    BibTeX
    @inproceedings{Dehmer:Emmert:Streib:Mehler:Kilian:Muehlhaeuser:2005,
      author    = {Dehmer, Matthias and Emmert-Streib, Frank and Mehler, Alexander
                   and Kilian, Jürgen and Mühlh{\"a}user, Max},
      title     = {Application of a similarity measure for graphs to web-based document structures},
      booktitle = {Proceedings of VI. International Conference on Enformatika, Systems
                   Sciences and Engineering, Budapest, Hungary, October 2005, International
                   Academy of Sciences: Enformatika 8 (2005)},
      pages     = {77-81},
      abstract  = {Due to the tremendous amount of information provided by the World
                   Wide Web (WWW) developing methods for mining the structure of
                   web-based documents is of considerable interest. In this paper
                   we present a similarity measure for graphs representing web-based
                   hypertext structures. Our similarity measure is mainly based on
                   a novel representation of a graph as linear integer strings, whose
                   components represent structural properties of the graph. The similarity
                   of two graphs is then defined as the optimal alignment of the
                   underlying property strings. In this paper we apply the well known
                   technique of sequence alignments for solving a novel and challenging
                   problem: Measuring the structural similarity of generalized trees.
                   In other words: We first transform our graphs considered as high
                   dimensional objects in linear structures. Then we derive similarity
                   values from the alignments of the property strings in order to
                   measure the structural similarity of generalized trees. Hence,
                   we transform a graph similarity problem to a string similarity
                   problem for developing a efficient graph similarity measure. We
                   demonstrate that our similarity measure captures important structural
                   information by applying it to two different test sets consisting
                   of graphs representing web-based document structures.},
      pdf       = {http://waset.org/publications/15299/application-of-a-similarity-measure-for-graphs-to-web-based-document-structures},
      website   = {https://www.researchgate.net/publication/238687277_Application_of_a_Similarity_Measure_for_Graphs_to_Web-based_Document_Structures},
      year      = {2005}
    }
    Alexander Mehler. 2005. Preliminaries to an Algebraic Treatment of Lexical Associations. Learning and Extending Lexical Ontologies. Proceedings of the Workshop at the 22nd International Conference on Machine Learning (ICML '05), August 7-11, 2005, Universität Bonn, Germany, 41–47.
    BibTeX
    @inproceedings{Mehler:2005:c,
      author    = {Mehler, Alexander},
      title     = {Preliminaries to an Algebraic Treatment of Lexical Associations},
      booktitle = {Learning and Extending Lexical Ontologies. Proceedings of the
                   Workshop at the 22nd International Conference on Machine Learning
                   (ICML '05), August 7-11, 2005, Universit{\"a}t Bonn, Germany},
      editor    = {Biemann, Chris and Paa{\ss}, Gerhard},
      pages     = {41-47},
      year      = {2005}
    }
    Alexander Mehler and Rüdiger Gleim. 2005. Polymorphism in Generic Web Units. A corpus linguistic study. Proceedings of Corpus Linguistics '05, July 14-17, 2005, University of Birmingham, Great Britain, Corpus Linguistics Conference Series 1(1).
    BibTeX
    @inproceedings{Mehler:Gleim:2005:a,
      author    = {Mehler, Alexander and Gleim, Rüdiger},
      title     = {Polymorphism in Generic Web Units. A corpus linguistic study},
      booktitle = {Proceedings of Corpus Linguistics '05, July 14-17, 2005, University
                   of Birmingham, Great Britain},
      volume    = {Corpus Linguistics Conference Series 1(1)},
      abstract  = {Corpus linguistics and related disciplines which focus on statistical
                   analyses of textual units have substantial need for large corpora.
                   More specifically, genre or register specific corpora are needed
                   which allow studying variations in language use. Along with the
                   incredible growth of the internet, the web became an important
                   source of linguistic data. Of course, web corpora face the same
                   problem of acquiring genre specific corpora. Amongst other things,
                   web mining is a framework of methods for automatically assigning
                   category labels to web units and thus may be seen as a solution
                   to this corpus acquisition problem as far as genre categories
                   are applied. The paper argues that this approach is faced with
                   the problem of a many-to-many relation between expression units
                   on the one hand and content or function units on the other hand.
                   A quantitative study is performed which supports the argumentation
                   that functions of web-based communication are very often concentrated
                   on single web pages and thus interfere any effort of directly
                   applying the classical apparatus of categorization on web page
                   level. The paper outlines a two-level algorithm as an alternative
                   approach to category assignment which is sensitive to genre specific
                   structures and thus may be used to tackle the problem of acquiring
                   genre specific corpora.},
      issn      = {1747-9398},
      pdf       = {http://www.birmingham.ac.uk/Documents/college-artslaw/corpus/conference-archives/2005-journal/Thewebasacorpus/AlexanderMehlerandRuedigerGleimCorpusLinguistics2005.pdf},
      year      = {2005}
    }
    Alexander Mehler and Christian Wolff. 2005. Einleitung: Perspektiven und Positionen des Text Mining. Journal for Language Technology and Computational Linguistics (JLCL), 20(1):1–18.
    BibTeX
    @article{Mehler:Wolff:2005:b,
      author    = {Mehler, Alexander and Wolff, Christian},
      title     = {Einleitung: Perspektiven und Positionen des Text Mining},
      journal   = {Journal for Language Technology and Computational
                       Linguistics (JLCL)},
      volume    = {20},
      number    = {1},
      pages     = {1-18},
      abstract  = {Beitr{\"a}ge zum Thema Text Mining beginnen vielfach mit dem Hinweis
                   auf die enorme Zunahme online verfügbarer Dokumente, ob nun im
                   Internet oder in Intranets (Losiewicz et al. 2000; Merkl 2000;
                   Feldman 2001; Mehler 2001; Joachims \& Leopold 2002). Der hiermit
                   einhergehenden „Informationsflut“ wird das Ungenügen des Information
                   Retrieval (IR) bzw. seiner g{\"a}ngigen Verfahren der Informationsaufbereitung
                   und Informationserschlie{\ss}ung gegenübergestellt. Es wird bem{\"a}ngelt,
                   dass sich das IR weitgehend darin erschöpft, Teilmengen von Textkollektionen
                   auf Suchanfragen hin aufzufinden und in der Regel blo{\ss} listenförmig
                   anzuordnen. Das auf diese Weise dargestellte Spannungsverh{\"a}ltnis
                   von Informationsexplosion und Defiziten bestehender IR-Verfahren
                   bildet den Hintergrund für die Entwicklung von Verfahren zur automatischen
                   Verarbeitung textueller Einheiten, die sich st{\"a}rker an den
                   Anforderungen von Informationssuchenden orientieren. Anders ausgedrückt:
                   Mit der Einführung der Neuen Medien w{\"a}chst die Bedeutung digitalisierter
                   Dokumente als Prim{\"a}rmedium für die Verarbeitung, Verbreitung
                   und Verwaltung von Information in öffentlichen und betrieblichen
                   Organisationen. Dabei steht wegen der Menge zu verarbeitender
                   Einheiten die Alternative einer intellektuellen Dokumenterschlie{\ss}ung
                   nicht zur Verfügung. Andererseits wachsen die Anforderung an eine
                   automatische Textanalyse, der das klassische IR nicht gerecht
                   wird. Der Mehrzahl der hiervon betroffenen textuellen Einheiten
                   fehlt die explizite Strukturiertheit formaler Datenstrukturen.
                   Vielmehr weisen sie je nach Text- bzw. Dokumenttyp ganz unterschiedliche
                   Strukturierungsgrade auf. Dabei korreliert die Flexibilit{\"a}t
                   der Organisationsziele negativ mit dem Grad an explizierter Strukturiertheit
                   und positiv mit der Anzahl jener Texte und Texttypen (E-Mails,
                   Memos, Expertisen, technische Dokumentationen etc.), die im Zuge
                   ihrer Realisierung produziert bzw. rezipiert werden. Vor diesem
                   Hintergrund entsteht ein Bedarf an Texttechnologien, die ihren
                   Benutzern nicht nur „intelligente“ Schnittstellen zur Textrezeption
                   anbieten, sondern zugleich auf inhaltsorientierte Textanalysen
                   zielen, um auf diese Weise aufgabenrelevante Daten explorieren
                   und kontextsensitiv aufbereiten zu helfen. Das Text Mining ist
                   mit dem Versprechen verbunden, eine solche Technologie darzustellen
                   bzw. sich als solche zu entwickeln. Dieser einheitlichen Problembeschreibung
                   stehen konkurrierende Textmining-Spezifikationen gegenüber, was
                   bereits die Vielfalt der Namensgebungen verdeutlicht. So finden
                   sich neben der Bezeichnung Text Mining (Joachims \& Leopold 2002;
                   Tan 1999) die Alternativen • Text Data Mining (Hearst 1999b; Merkl
                   2000), • Textual Data Mining (Losiewicz et al. 2000), • Text Knowledge
                   Engineering (Hahn \& Schnattinger 1998), Knowledge Discovery in
                   Texts (Kodratoff 1999) oder Knowledge Discovery in Textual Databases
                   (Feldman \& Dagan 1995). Dabei l{\"a}sst bereits die Namensgebung
                   erkennen, dass es sich um Analogiebildungen zu dem (nur unwesentlich
                   {\"a}lteren) Forschungsgebiet des Data Mining (DM; als Bestandteil
                   des Knowledge Discovery in Databases – KDD) handelt. Diese Namensvielfalt
                   findet ihre Entsprechung in widerstreitenden Aufgabenzuweisungen.
                   So setzt beispielsweise Sebastiani (2002) Informationsextraktion
                   und Text Mining weitgehend gleich, wobei er eine Schnittmenge
                   zwischen Text Mining und Textkategorisierung ausmacht (siehe auch
                   Dörre et al. 1999). Demgegenüber betrachten Kosala \& Blockeel
                   (2000) Informationsextraktion und Textkategorisierung lediglich
                   als Teilbereiche des ihrer Ansicht nach umfassenderen Text Mining,
                   w{\"a}hrend Hearst (1999a) im Gegensatz hierzu Informationsextraktion
                   und Textkategorisierung explizit aus dem Bereich des explorativen
                   Text Mining ausschlie{\ss}t.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_wolff_2005_b.pdf},
      website   = {http://epub.uni-regensburg.de/6844/},
      year      = {2005}
    }
    Alexander Mehler. 2005. Korpuslinguistik. Ed. by Alexander Mehler. Journal for Language Technology and Computational Linguistics (JLCL), 20(2).
    BibTeX
    @book{Mehler:2005:e,
      author    = {Mehler, Alexander},
      editor    = {Mehler, Alexander},
      title     = {Korpuslinguistik},
      volume    = {20(2)},
      series    = {Journal for Language Technology and Computational
                       Linguistics (JLCL)},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/Korpuslinguistik.png},
      pagetotal = {97},
      website   = {http://www.jlcl.org/2005_Heft2/LDV_Forum_Band_20_Heft_2.pdf},
      year      = {2005}
    }
    Alexander Mehler, Matthias Dehmer and Rüdiger Gleim. 2005. Zur Automatischen Klassifikation von Webgenres. Sprachtechnologie, mobile Kommunikation und linguistische Ressourcen. Beiträge zur GLDV-Frühjahrstagung '05, 30. März – 01. April 2005, Universität Bonn, 158–174.
    BibTeX
    @inproceedings{Mehler:Dehmer:Gleim:2005,
      author    = {Mehler, Alexander and Dehmer, Matthias and Gleim, Rüdiger},
      title     = {Zur Automatischen Klassifikation von Webgenres},
      booktitle = {Sprachtechnologie, mobile Kommunikation und linguistische Ressourcen.
                   Beitr{\"a}ge zur GLDV-Frühjahrstagung '05, 30. M{\"a}rz – 01.
                   April 2005, Universit{\"a}t Bonn},
      editor    = {Fisseni, Bernhard and Schmitz, Hans-Christian and Schröder, Bernhard
                   and Wagner, Petra},
      pages     = {158-174},
      address   = {Frankfurt a. M.},
      publisher = {Lang},
      year      = {2005}
    }
    Alexander Mehler and Christian Wolff. 2005. Text Mining. Ed. by Alexander Mehler and Christian Wolff. Journal for Language Technology and Computational Linguistics (JLCL), 20(1). GSCL.
    BibTeX
    @book{Mehler:Wolff:2005:a,
      author    = {Mehler, Alexander and Wolff, Christian},
      editor    = {Mehler, Alexander and Wolff, Christian},
      title     = {Text Mining},
      publisher = {GSCL},
      volume    = {20(1)},
      series    = {Journal for Language Technology and Computational
                       Linguistics (JLCL)},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/TextMining.png},
      pagetotal = {143},
      website   = {http://www.jlcl.org/2005_Heft1/LDV-Forum1.2005.pdf},
      year      = {2005}
    }
    Alexander Mehler. 2005. Eigenschaften der textuellen Einheiten und Systeme / Properties of Textual Units and Systems. Quantitative Linguistik. Ein internationales Handbuch / Quantitative Linguistics. An International Handbook, 325–348.
    BibTeX
    @incollection{Mehler:2005:b,
      author    = {Mehler, Alexander},
      title     = {Eigenschaften der textuellen Einheiten und Systeme / Properties
                   of Textual Units and Systems},
      booktitle = {Quantitative Linguistik. Ein internationales Handbuch / Quantitative
                   Linguistics. An International Handbook},
      publisher = {De Gruyter},
  editor    = {Köhler, Reinhard and Altmann, Gabriel and Piotrowski, Rajmund G.},
      pages     = {325-348},
      address   = {Berlin/New York},
      year      = {2005}
    }
    Alexander Mehler. 2005. Lexical Chaining as a Source of Text Chaining. Proceedings of the 1st Computational Systemic Functional Grammar Conference, University of Sydney, Australia, 12–21.
    BibTeX
    @inproceedings{Mehler:2005:d,
      author    = {Mehler, Alexander},
      title     = {Lexical Chaining as a Source of Text Chaining},
      booktitle = {Proceedings of the 1st Computational Systemic Functional Grammar
                   Conference, University of Sydney, Australia},
      editor    = {Patrick, Jon and Matthiessen, Christian},
      pages     = {12-21},
  note      = {July 16, 2005},
  pdf       = {http://www.texttechnologylab.org/media/pdf/CohesionTrees1.pdf},
      year      = {2005}
    }
    Alexander Mehler. 2005. Zur textlinguistischen Fundierung der Text- und Korpuskonversion. Sprache und Datenverarbeitung. International Journal for Language Data Processing, 1:29–53.
    BibTeX
    @article{Mehler:2005:a,
      author    = {Mehler, Alexander},
      title     = {Zur textlinguistischen Fundierung der Text- und Korpuskonversion},
      journal   = {Sprache und Datenverarbeitung. International Journal
                       for Language Data Processing},
      volume    = {1},
      pages     = {29-53},
      abstract  = {Die automatische Konversion von Texten in Hypertexte ist mit der
                   Erwartung verbunden, computerbasierte Rezeptionshilfen zu gewinnen.
                   Dies betrifft insbesondere die Bew{\"a}ltigung der ungeheuren
                   Menge an Fachliteratur im Rahmen der Wissenschaftskommunikation.
                   Von einem thematisch relevanten Text zu einem thematisch verwandten
                   Text per Hyperlink direkt gelangen zu können, stellt einen Anspruch
                   dar, dessen Erfüllung mittels digitaler Bibliotheken n{\"a}her
                   gerückt zu sein scheint. Doch wie lassen sich die Kriterien, nach
                   denen Texte automatisch verlinkt werden, genauer begründen? Dieser
                   Beitrag geht dieser Frage aus der Sicht textlinguistischer Modellbildungen
                   nach. Er zeigt, dass parallel zur Entwicklung der Textlinguistik,
                   wenn auch mit einer gewissen Verzögerung, Konversionsans{\"a}tze
                   entwickelt wurden, die sich jeweils an einer bestimmten Stufe
                   des Textbegriffs orientieren. Der Beitrag weist nicht nur das
                   diesen Ans{\"a}tzen gemeinsame Fundament in Form der so genannten
                   Explikationshypothese nach, sondern verweist zugleich auf grundlegende
                   Automatisierungsdefizite, die mit ihnen verbunden sind. Mit systemisch-funktionalen
                   Hypertexten wird schlie{\ss}lich ein Ansatz skizziert, der darauf
                   zielt, den Anspruch nach textlinguistischer Fundierung und Automatisierbarkeit
                   zu vereinen.},
      publisher = {GSCL},
      year      = {2005}
    }

    2004

    Alexander Mehler. 2004. Textmining. Texttechnologie. Perspektiven und Anwendungen, 329–352.
    BibTeX
    @incollection{Mehler:2004:h,
      author    = {Mehler, Alexander},
      title     = {Textmining},
      booktitle = {Texttechnologie. Perspektiven und Anwendungen},
      publisher = {Stauffenburg},
      editor    = {Lobin, Henning and Lemnitzer, Lothar},
      pages     = {329-352},
      address   = {Tübingen},
      year      = {2004}
    }
    Alexander Mehler and Henning Lobin. 2004. Aspekte der texttechnologischen Modellierung. Automatische Textanalyse: Systeme und Methoden zur Annotation und Analyse natürlichsprachlicher Texte, 1–21.
    BibTeX
    @incollection{Mehler:Lobin:2004:b,
      author    = {Mehler, Alexander and Lobin, Henning},
      title     = {Aspekte der texttechnologischen Modellierung},
      booktitle = {Automatische Textanalyse: Systeme und Methoden zur Annotation
                   und Analyse natürlichsprachlicher Texte},
      publisher = {Verlag für Sozialwissenschaften},
      editor    = {Mehler, Alexander and Lobin, Henning},
      pages     = {1-21},
      address   = {Wiesbaden},
      year      = {2004}
    }
    Alexander Mehler and Henning Lobin. 2004. Automatische Textanalyse. Systeme und Methoden zur Annotation und Analyse natürlichsprachlicher Texte. Ed. by Alexander Mehler and Henning Lobin. Verlag für Sozialwissenschaften.
    BibTeX
    @book{Mehler:Lobin:2004:a,
      author    = {Mehler, Alexander and Lobin, Henning},
      editor    = {Mehler, Alexander and Lobin, Henning},
      title     = {Automatische Textanalyse. Systeme und Methoden zur Annotation
                   und Analyse natürlichsprachlicher Texte},
      publisher = {Verlag für Sozialwissenschaften},
      address   = {Wiesbaden},
      pagetotal = {290},
      website   = {http://www.v-r.de/de/Mehler-Lobin-Automatische-Textanalyse/t/352526527/},
      year      = {2004}
    }
    Alexander Mehler. 2004. A Data-Oriented Model of Context in Hypertext Authoring. Proceedings of the 7th International Workshop on Organisational Semiotics (OS '04), July 19-20, 2004, Setúbal, Portugal, 24–45.
    BibTeX
    @inproceedings{Mehler:2004:c,
      author    = {Mehler, Alexander},
      title     = {A Data-Oriented Model of Context in Hypertext Authoring},
      booktitle = {Proceedings of the 7th International Workshop on Organisational
                   Semiotics (OS '04), July 19-20, 2004, Setúbal, Portugal},
      editor    = {Filipe, Joaquim and Liu, Kecheng},
      pages     = {24-45},
      address   = {Setúbal},
      publisher = {INSTICC},
      pdf       = {http://www.orgsem.org/papers/02.pdf},
      website   = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.121.7944},
      year      = {2004}
    }
    Alexander Mehler. 2004. Automatische Synthese Internet-basierter Links für digitale Bibliotheken. Osnabrücker Beiträge zur Sprachtheorie. Themenheft Internetbasierte Kommunikation, 68:31–53.
    BibTeX
    @article{Mehler:2004:b,
      author    = {Mehler, Alexander},
      title     = {Automatische Synthese Internet-basierter Links für digitale Bibliotheken},
      journal   = {Osnabrücker Beitr{\"a}ge zur Sprachtheorie.
                       Themenheft Internetbasierte Kommunikation},
      volume    = {68},
      pages     = {31-53},
      abstract  = {Dieser Beitrag behandelt Verfahren zur automatischen Erzeugung
                   von Hyperlinks, wie sie im WWW für die Informationssuche bereitstehen.
                   Dabei steht die Frage im Vordergrund, auf welche Weise bestehende
                   Verfahren suchrelevante Dokumente bestimmen und von diesen aus
                   inhaltsverwandte Dokumente verlinken. Dieser Gegenstand verbindet
                   den Bereich des klassischen Information Retrievals (IR) mit einem
                   Anwendungsgebiet, das in der Wissenschaftskommunikation unter
                   dem Stichwort der digitalen Bibliothek unter Nutzbarmachung des
                   Hyperlink-basierten Browsings firmiert. Ein Beispiel hierfür bildet
                   die digitale Bibliothek CiteSeer (Lawrence et al. 1999), welche
                   das Boolesche Retrieval dadurch erweitert, dass ausgehend von
                   Treffern einer Suche jene Dokumente per Link angesteuert werden
                   können, welche die aufgefundenen Dokumente zitieren oder von diesen
                   zitiert werden. CiteSeer ist also ein System, welches das Schlagwort-basierte
                   Querying im Rahmen des klassischen IRs mit dem Hypertext-basierten
                   Browsing von Zitaten verknüpft, und zwar zu dem Zweck, die Suche
                   wissenschaftlicher Dokumente zu erleichtern. Darüber hinaus verwendet
                   es die unter dem Stichwort des Vektorraummodells bekannt gewordene
                   Technologie für den wortbasierten Vergleich von Texten. Der Beitrag
                   setzt an dieser Stelle an. Er argumentiert, dass Verfahren bereitstehen,
                   welche die Anforderung nach inhaltsorientiertem Retrieval mit
                   dem inhaltsorientierten Browsing verbinden, mit der Forderung
                   also, dass Hyperlinks, die E-Texte als digitalisierte Versionen
                   von (wissenschaftlichen) Dokumenten verknüpfen (Storrer 2002),
                   Inhalts- und nicht nur Zitat-basiert sind.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_2004_b.pdf},
      year      = {2004}
    }
    Matthias Dehmer, Alexander Mehler and Rüdiger Gleim. 2004. Aspekte der Kategorisierung von Webseiten. INFORMATIK 2004 – Informatik verbindet, Band 2, Beiträge der 34. Jahrestagung der Gesellschaft für Informatik e.V. (GI). Workshop Multimedia-Informationssysteme, 51:39–43.
    BibTeX
    @inproceedings{Dehmer:Mehler:Gleim:2004,
      author    = {Dehmer, Matthias and Mehler, Alexander and Gleim, Rüdiger},
      title     = {Aspekte der Kategorisierung von Webseiten},
      booktitle = {INFORMATIK 2004 – Informatik verbindet, Band 2, Beitr{\"a}ge der
                   34. Jahrestagung der Gesellschaft für Informatik e.V. (GI). Workshop
                   Multimedia-Informationssysteme},
      editor    = {Dadam, Peter and Reichert, Manfred},
      volume    = {51},
      series    = {Lecture Notes in Informatics},
      pages     = {39-43},
      publisher = {GI},
      abstract  = {Im Zuge der Web-basierten Kommunikation tritt die Frage auf, inwiefern
                   Webpages zum Zwecke ihrer inhaltsorientierten Filterung kategorisiert
                   werden können. Diese Studie untersucht zwei Ph{\"a}nomene, welche
                   die Bedingung der Möglichkeit einer solchen Kategorisierung betreffen
                    (siehe [6]): Mit dem Begriff der funktionalen {\"A}quivalenz beziehen
                   wir uns auf das Ph{\"a}nomen, dass dieselbe Funktions- oder Inhaltskategorie
                   durch völlig verschiedene Bausteine Web-basierter Dokumente manifestiert
                   werden kann. Mit dem Begriff des Polymorphie beziehen wir uns
                   auf das Ph{\"a}nomen, dass dasselbe Dokument zugleich mehrere
                   Funktions- oder Inhaltskategorien manifestieren kann. Die zentrale
                   Hypothese lautet, dass beide Ph{\"a}nomene für Web-basierte Hypertextstrukturen
                   charakteristisch sind. Ist dies der Fall, so kann die automatische
                   Kategorisierung von Hypertexten [2, 10] nicht mehr als eindeutige
                   Zuordnung verstanden werden, bei der einem Dokument genau eine
                   Kategorie zugeordnet wird. In diesem Sinne thematisiert das Papier
                   die Frage nach der ad{\"a}quaten Modellierung multimedialer Dokumente.},
      pdf       = {http://subs.emis.de/LNI/Proceedings/Proceedings51/GI-Proceedings.51-11.pdf},
      website   = {https://www.researchgate.net/publication/221385316_Aspekte_der_Kategorisierung_von_Webseiten},
      year      = {2004}
    }
    Alexander Mehler. 2004. Textmodellierung: Mehrstufige Modellierung generischer Bausteine der Textähnlichkeitsmessung. Automatische Textanalyse: Systeme und Methoden zur Annotation und Analyse natürlichsprachlicher Texte, 101–120.
    BibTeX
    @incollection{Mehler:2003:d,
      author    = {Mehler, Alexander},
      title     = {Textmodellierung: Mehrstufige Modellierung generischer Bausteine
                   der Text{\"a}hnlichkeitsmessung},
      booktitle = {Automatische Textanalyse: Systeme und Methoden zur Annotation
                   und Analyse natürlichsprachlicher Texte},
      publisher = {Verlag für Sozialwissenschaften},
      editor    = {Mehler, Alexander and Lobin, Henning},
      pages     = {101-120},
      address   = {Wiesbaden},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/AutomatischeTextanalyse2.jpg},
      year      = {2004}
    }

    2003

    Alexander Mehler. 2003. Methodological Aspects of Computational Semiotics. SEED Journal, 3(3):71–80.
    BibTeX
    @article{Mehler:2003:b,
      author    = {Mehler, Alexander},
      title     = {Methodological Aspects of Computational Semiotics},
      journal   = {SEED Journal},
      volume    = {3},
      number    = {3},
      pages     = {71-80},
      abstract  = {In the following, elementary constituents of models in computational
                   semiotics are outlined. This is done by referring to computer
                   simulations as a framework which neither aims to describe artificial
                   sign systems (as done in computer semiotics), nor to realize semiotic
                   functions in “artificial worlds” (as proposed in “artificial semiosis”).
                   Rather, the framework referred to focuses on preconditions of
                   computer-based simulations of semiotic processes. Following this
                   approach, the paper focuses on methodological aspects of computational
                   semiotics.},
      year      = {2003}
    }
    Alexander Mehler. 2003. Konnotative Textbedeutungen: zur Modellierung struktureller Aspekte der Bedeutungen von Texten. Korpuslinguistische Untersuchungen zur quantitativen und systemtheoretischen Linguistik, 320–347.
    BibTeX
    @incollection{Mehler:2003,
      author    = {Mehler, Alexander},
      title     = {Konnotative Textbedeutungen: zur Modellierung struktureller Aspekte
                   der Bedeutungen von Texten},
      booktitle = {Korpuslinguistische Untersuchungen zur quantitativen und systemtheoretischen
                   Linguistik},
      publisher = {Gardez! Verlag},
      editor    = {Köhler, Reinhard},
      pages     = {320-347},
      address   = {Sankt Augustin},
      pdf       = {http://ubt.opus.hbz-nrw.de/volltexte/2004/279/pdf/10_mehler.pdf},
      year      = {2003}
    }
    Alexander Mehler and Siegfried Reich. 2003. Guided Tours + Trails := Guided Trails. Poster at the 14th ACM Conference on Hypertext and Hypermedia (Hypertext '03), Nottingham, August 26-30, 1–2.
    BibTeX
    @inproceedings{Mehler:Reich:2003,
      author    = {Mehler, Alexander and Reich, Siegfried},
      title     = {Guided Tours + Trails := Guided Trails},
      booktitle = {Poster at the 14th ACM Conference on Hypertext and Hypermedia
                   (Hypertext '03), Nottingham, August 26-30},
      pages     = {1-2},
      website   = {http://www.sigweb.org/Ht03posters},
      year      = {2003}
    }
    Alexander Mehler. 2003. Ein Kompositionalitätsprinzip für numerische Textsemantiken. Journal for Language Technology and Computational Linguistics (JLCL), 18(1-2):321–337.
    BibTeX
    @article{Mehler:2003:c,
      author    = {Mehler, Alexander},
      title     = {Ein Kompositionalit{\"a}tsprinzip für numerische Textsemantiken},
      journal   = {Journal for Language Technology and Computational
                       Linguistics (JLCL)},
      volume    = {18},
      number    = {1-2},
      pages     = {321-337},
      abstract  = {Der Beitrag beschreibt eine Variante des Kompositionalit{\"a}tsprinzips
                   der Bedeutung als Grundprinzip für die numerische Analyse unsystematischer
                   Sinnrelationen komplexer Zeichen, das über das Ph{\"a}nomen der
                   perspektivischen Interpretation hinaus gebrauchssemantische Bedeutungsaspekte
                   berücksichtigt. Ziel ist es, ein theoretisches Fundament für korpusanalytische
                   Ans{\"a}tze in der Semantik, die oftmals die linguistische Interpretierbarkeit
                   ihrer Analyseergebnisse vermissen lassen, zu umrei{\ss}en. Die
                   Spezifikation des Kompositionalit{\"a}tsprinzips erfolgt unter
                   Rekurs auf das Modell eines hierarchisch geordneten Constraint-Satisfaction-Prozesses.
                   Hiermit ist das l{\"a}ngerfristige Ziel verbunden, das Problem
                   einer defizit{\"a}ren numerischen Textrepr{\"a}sentation sowie
                   die mangelnde Integration von propositionaler und strukturaler
                   bzw. korpusanalytischer Semantik anzugehen. Die Erörterungen dieses
                   Beitrags sind prim{\"a}r konzeptioneller Natur; sie betreffen
                   die Konzeption einer numerischen Textsemantik zur Vermeidung von
                   Defiziten bestehender Ans{\"a}tze.},
      pdf       = {http://media.dwds.de/jlcl/2003_Doppelheft/321-337_Mehler.pdf},
      year      = {2003}
    }

    2002

    Alexander Mehler. 2002. Components of a Model of Context-Sensitive Hypertexts. Journal of Universal Computer Science (J.UCS), 8(10):924–943.
    BibTeX
    @article{Mehler:2002:l,
      author    = {Mehler, Alexander},
      title     = {Components of a Model of Context-Sensitive Hypertexts},
      journal   = {Journal of Universal Computer Science (J.UCS)},
      volume    = {8},
      number    = {10},
      pages     = {924-943},
      abstract  = {On the background of rising Intranet applications the automatic
                   generation of adaptable, context-sensitive hypertexts becomes
                   more and more important [El-Beltagy et al., 2001]. This observation
                   contradicts the literature on hypertext authoring, where Information
                   Retrieval techniques prevail, which disregard any linguistic and
                   context-theoretical underpinning. As a consequence, resulting
                   hypertexts do not manifest those schematic structures, which are
                   constitutive for the emergence of text types and the context-mediated
                   understanding of their instances, i.e. natural language texts.
                   This paper utilizes Systemic Functional Linguistics (SFL) and
                   its context model as a theoretical basis of hypertext authoring.
                   So called Systemic Functional Hypertexts (SFHT) are proposed,
                   which refer to a stratified context layer as the proper source
                   of text linkage. The purpose of this paper is twofold: First,
                   hypertexts are reconstructed from a linguistic point of view as
                   a kind of supersign, whose constituents are natural language texts
                   and whose structuring is due to intra- and intertextual coherence
                   relations and their context-sensitive interpretation. Second,
                   the paper prepares a formal notion of SFHTs as a first step towards
                   operationalization of fundamental text linguistic concepts. On
                   this background, SFHTs serve to overcome the theoretical poverty
                   of many approaches to link generation.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_components_2002.pdf},
      website   = {http://www.jucs.org/jucs_8_10/components_of_a_model},
      year      = {2002}
    }
    Alexander Mehler and Rodney Clarke. 2002. Systemic Functional Hypertexts. An Architecture for Socialsemiotic Hypertext Systems. New Directions in Humanities Computing. The 14th Joint International Conference of the Association for Literary and Linguistic Computing and the Association for Computers and the Humanities (ALLC/ACH '02), July 24-28, University of Tübingen, 68–69.
    BibTeX
    @inproceedings{Mehler:Clarke:2002,
      author    = {Mehler, Alexander and Clarke, Rodney},
      title     = {Systemic Functional Hypertexts. An Architecture for Socialsemiotic
                   Hypertext Systems},
      booktitle = {New Directions in Humanities Computing. The 14th Joint International
                   Conference of the Association for Literary and Linguistic Computing
                   and the Association for Computers and the Humanities (ALLC/ACH
                   '02), July 24-28, University of Tübingen},
      pages     = {68-69},
      year      = {2002}
    }
    Alexander Mehler. 2002. Text Mining with the Help of Cohesion Trees. Classification, Automation, and New Media. Proceedings of the 24th Annual Conference of the Gesellschaft für Klassifikation, March 15-17, 2000, Universität Passau, 199–206.
    BibTeX
    @inproceedings{Mehler:2002:e,
      author    = {Mehler, Alexander},
      title     = {Text Mining with the Help of Cohesion Trees},
      booktitle = {Classification, Automation, and New Media. Proceedings of the
                   24th Annual Conference of the Gesellschaft für Klassifikation,
                   March 15-17, 2000, Universit{\"a}t Passau},
      editor    = {Gaul, Wolfgang and Ritter, Gunter},
      pages     = {199-206},
      address   = {Berlin/New York},
      publisher = {Springer},
      abstract  = {In the framework of automatic text processing, semantic spaces
                   are used as a format for modeling similarities of natural language
                   texts represented as vectors. They prove to be efficient in divergent
                   areas, as information retrieval (Dumais 1995), computational psychology
                   (Landauer, Dumais 1997), and computational linguistics (Rieger
                   1995; Mehler 1998). In order to group semantically similar texts,
                   cluster analysis is used. A central problem of this method relates
                   to the difficulty to name clusters, whereas lists neglect the
                   polyhierarchical structure of semantic spaces. This paper introduces
                   the concept of cohesion tree as an alternative tool for exploring
                   similarity relations of texts represented in high dimensional
                   spaces. Cohesion trees allow the perspective evaluation of numerically
                   represented text similarities. They depart from minimal spanning
                   trees (MST) by context-sensitively optimizing path costs. This
                   central property underlies the linguistic interpretation of cohesion
                   trees: instead of manifesting context-free associations, they
                   model context priming effects.},
      website   = {http://www.springerlink.com/content/x484814744877078/},
      year      = {2002}
    }
    Alexander Mehler. 2002. Cohesive Paths: Applying the Concept of Cohesion to Hypertext. Sprachwissenschaft auf dem Weg in das dritte Jahrtausend. Proceedings of the 34th Linguistics Colloquium, September 7-10, 1999, Universität Mainz, 725–733.
    BibTeX
    @inproceedings{Mehler:2002:f,
      author    = {Mehler, Alexander},
      title     = {Cohesive Paths: Applying the Concept of Cohesion to Hypertext},
      booktitle = {Sprachwissenschaft auf dem Weg in das dritte Jahrtausend. Proceedings
                   of the 34th Linguistics Colloquium, September 7-10, 1999, Universit{\"a}t
                   Mainz},
      editor    = {Rapp, Reinhard},
      pages     = {725-733},
      address   = {Frankfurt a. M.},
      publisher = {Peter Lang},
      year      = {2002}
    }
    Alexander Mehler. 2002. Hierarchical Orderings of Textual Units. Proceedings of the 19th International Conference on Computational Linguistics (COLING '02), August 24 – September 1, 2002, Taipei, Taiwan, 646–652.
    BibTeX
    @inproceedings{Mehler:2002:k,
      author    = {Mehler, Alexander},
      title     = {Hierarchical Orderings of Textual Units},
      booktitle = {Proceedings of the 19th International Conference on Computational
                   Linguistics (COLING '02), August 24 – September 1, 2002, Taipei,
                   Taiwan},
      pages     = {646-652},
      address   = {San Francisco},
      publisher = {Morgan Kaufmann},
      abstract  = {Text representation is a central task for any approach to automatic
                   learning from texts. It requires a format which allows to interrelate
                   texts even if they do not share content words, but deal with similar
                   topics. Furthermore, measuring text similarities raises the question
                   of how to organize the resulting clusters. This paper presents
                   cohesion trees (CT) as a data structure for the perspective, hierarchical
                   organization of text corpora. CTs operate on alternative text
                   representation models taking lexical organization, quantitative
                   text characteristics, and text structure into account. It is shown
                   that CTs realize text linkages which are lexically more homogeneous
                   than those produced by minimal spanning trees.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_2002_k.pdf},
      year      = {2002}
    }
    Alexander Mehler. 2002. Hierarchical Analysis of Text Similarity Data. Künstliche Intelligenz (KI), 2:12–16.
    BibTeX
    @article{Mehler:2002:a,
      author    = {Mehler, Alexander},
      title     = {Hierarchical Analysis of Text Similarity Data},
      journal   = {Künstliche Intelligenz (KI)},
      volume    = {2},
      pages     = {12-16},
      abstract  = {Semantic spaces are used as a representational format for modeling
                   similarities of signs. As a multidimensional data structure they
                   are bound to the question of how to explore similarity relations
                   of signs mapped onto them. This paper introduces an abstract data
                   structure called dependency scheme as a formal format which encapsulates
                    two types of order relations, whose variable instantiation allows
                    to derive different classes of trees for the hierarchical analysis
                   of text similarity data derived from semantic spaces.},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/mehler_2002_a.pdf},
      year      = {2002}
    }
    Alexander Mehler. 2002. Textbedeutungsrekonstruktion. Grundzüge einer Architektur zur Modellierung der Bedeutungen von Texten. Prozesse der Bedeutungskonstruktion, 445–486.
    BibTeX
    @incollection{Mehler:2002:b,
      author    = {Mehler, Alexander},
      title     = {Textbedeutungsrekonstruktion. Grundzüge einer Architektur zur
                   Modellierung der Bedeutungen von Texten},
      booktitle = {Prozesse der Bedeutungskonstruktion},
      publisher = {Peter Lang},
      editor    = {Pohl, Inge},
      pages     = {445-486},
      address   = {Frankfurt a. M.},
      year      = {2002}
    }

    2001

    Alexander Mehler. 2001. Aspects of Text Mining. From Computational Semiotics to Systemic Functional Hypertexts. Australasian Journal of Information Systems (AJIS), 8(2):129–141.
    BibTeX
    @article{Mehler:2001:b,
      author    = {Mehler, Alexander},
      title     = {Aspects of Text Mining. From Computational Semiotics to Systemic
                   Functional Hypertexts},
      journal   = {Australasian Journal of Information Systems (AJIS)},
      volume    = {8},
      number    = {2},
      pages     = {129-141},
      abstract  = {The significance of natural language texts as the prime information
                   structure for the management and dissemination of knowledge in
                   organisations is still increasing. Making relevant documents available
                   depending on varying tasks in different contexts is of primary
                   importance for any efficient task completion. Implementing this
                   demand requires the content based processing of texts, which enables
                   to reconstruct or, if necessary, to explore the relationship of
                   task, context and document. Text mining is a technology that is
                   suitable for solving problems of this kind. In the following,
                   semiotic aspects of text mining are investigated. Based on the
                   primary object of text mining - natural language lexis - the specific
                   complexity of this class of signs is outlined and requirements
                   for the implementation of text mining procedures are derived.
                   This is done with reference to text linkage introduced as a special
                   task in text mining. Text linkage refers to the exploration of
                   implicit, content based relations of texts (and their annotation
                   as typed links in corpora possibly organised as hypertexts). In
                   this context, the term systemic functional hypertext is introduced,
                   which distinguishes genre and register layers for the management
                   of links in a poly-level hypertext system},
      pdf       = {https://www.texttechnologylab.org/wp-content/uploads/2015/08/Mehler_AJIS-2001.pdf},
      website   = {http://journal.acs.org.au/index.php/ajis/article/view/249/220},
      year      = {2001}
    }
    Alexander Mehler. 2001. Textbedeutung. Zur prozeduralen Analyse und Repräsentation struktureller Ähnlichkeiten von Texten / Text Meaning – Procedural Analysis and Representation of Structural Similarities of Texts. Computer Studies in Language and Speech, 5. Peter Lang. Zugl. Diss. Univ. Trier.
    BibTeX
    @book{Mehler:2001:a,
      author    = {Mehler, Alexander},
      title     = {Textbedeutung. Zur prozeduralen Analyse und Repr{\"a}sentation
                   struktureller {\"A}hnlichkeiten von Texten / Text Meaning – Procedural
                   Analysis and Representation of Structural Similarities of Texts},
      publisher = {Peter Lang},
      volume    = {5},
      series    = {Computer Studies in Language and Speech},
      address   = {Frankfurt a. M.},
      note      = {Zugl. Diss. Univ. Trier},
      image     = {https://www.texttechnologylab.org/wp-content/uploads/2015/09/38648_cover_front.jpg},
      pagetotal = {401},
      website   = {https://www.peterlang.com/view/product/39259?tab=toc&format=PBK},
      year      = {2001}
    }
    Alexander Mehler and Rodney Clarke. 2001. Systemic Functional Hypertexts (SFHT): Modeling Contexts in Hypertexts. Organizational Semiotics. Evolving a Science of Information Systems, 153–170.
    BibTeX
    @incollection{Mehler:Clarke:2001,
      author    = {Mehler, Alexander and Clarke, Rodney},
      title     = {Systemic Functional Hypertexts (SFHT): Modeling Contexts in Hypertexts},
      booktitle = {Organizational Semiotics. Evolving a Science of Information Systems},
      publisher = {Kluwer},
      editor    = {Liu, Kecheng and Clarke, Rodney J. and Andersen, Peter B. and Stamper, Ronald K.},
      pages     = {153-170},
      address   = {Boston},
      note      = {IFIP TC8 / WG8.1 Working Conference on Organizational Semiotics,
                   July 23-25, 2001, Montreal, Canada},
      website   = {http://link.springer.com/chapter/10.1007/978-0-387-35611-2_10},
      year      = {2001}
    }

    1999

    Rodney Clarke and Alexander Mehler. 1999. Theorising Print Media in Contexts: A Systemic Semiotic Contribution to Computational Semiotics. Proceedings of the 7th International Congress of the IASS-AIS: International Association for Semiotic Studies – Sign Processes in Complex Systems, Dresden, University of Technology, October 6-11.
    BibTeX
    @inproceedings{Clarke:Mehler:1999,
      author    = {Clarke, Rodney and Mehler, Alexander},
      title     = {Theorising Print Media in Contexts: A Systemic Semiotic Contribution
                   to Computational Semiotics},
      booktitle = {Proceedings of the 7th International Congress of the IASS-AIS:
                   International Association for Semiotic Studies -- Sign Processes
                   in Complex Systems, Dresden, University of Technology, October
                   6-11},
      year      = {1999}
    }
    Alexander Mehler. 1999. Aspects of Text Semantics in Hypertext. Returning to our Diverse Roots. Proceedings of the 10th ACM Conference on Hypertext and Hypermedia (Hypertext '99), February 21-25, 1999, Technische Universität Darmstadt, 25–26.
    BibTeX
    @inproceedings{Mehler:1999,
      author    = {Mehler, Alexander},
      title     = {Aspects of Text Semantics in Hypertext},
      booktitle = {Returning to our Diverse Roots. Proceedings of the 10th ACM Conference
                   on Hypertext and Hypermedia (Hypertext '99), February 21-25, 1999,
                   Technische Universit{\"a}t Darmstadt},
      editor    = {Tochtermann, Klaus and Westbomke, J{\"o}rg and Wiil, Uffe K. and Leggett, John J.},
      pages     = {25--26},
      address   = {New York},
      publisher = {ACM Press},
      pdf       = {http://dl.acm.org/ft_gateway.cfm?id=294477&ftid=30049&dwn=1&CFID=722943569&CFTOKEN=97409508},
      website   = {http://dl.acm.org/citation.cfm?id=294477},
      year      = {1999}
    }

    1998

    Alexander Mehler. 1998. Toward Computational Aspects of Text Semiotics. Proceedings of the 1998 Joint Conference of IEEE ISIC, IEEE CIRA, and ISAS on the Science and Technology of Intelligent Systems, September 14-17, 1998, NIST, Gaithersburg, USA, 807–813.
    BibTeX
    @inproceedings{Mehler:1998,
      author    = {Mehler, Alexander},
      title     = {Toward Computational Aspects of Text Semiotics},
      booktitle = {Proceedings of the 1998 Joint Conference of IEEE ISIC, IEEE CIRA,
                   and ISAS on the Science and Technology of Intelligent Systems,
                   September 14-17, 1998, NIST, Gaithersburg, USA},
      editor    = {Albus, James and Meystel, Alex},
      pages     = {807--813},
      address   = {Gaithersburg},
      publisher = {IEEE},
      website   = {http://www.researchgate.net/publication/3766784_Toward_computational_aspects_of_text_semiotics},
      year      = {1998}
    }

    1996

    Alexander Mehler. 1996. A Multiresolutional Approach to Fuzzy Text Meaning. Journal of Quantitative Linguistics, 3(2):113–127.
    BibTeX
    @article{Mehler:1996:b,
      author    = {Mehler, Alexander},
      title     = {A Multiresolutional Approach to Fuzzy Text Meaning},
      journal   = {Journal of Quantitative Linguistics},
      volume    = {3},
      number    = {2},
      pages     = {113--127},
      year      = {1996}
    }
    Alexander Mehler. 1996. A Multiresolutional Approach to Fuzzy Text Meaning – a First Attempt. Proceedings of the 1996 International Multidisciplinary Conference on Intelligent Systems: A Semiotic Perspective, Gaithersburg, Maryland, October 20-23, I:261–273.
    BibTeX
    @inproceedings{Mehler:1996:a,
      author    = {Mehler, Alexander},
      title     = {A Multiresolutional Approach to Fuzzy Text Meaning -- a First Attempt},
      booktitle = {Proceedings of the 1996 International Multidisciplinary Conference
                   on Intelligent Systems: A Semiotic Perspective, Gaithersburg,
                   Maryland, October 20-23},
      editor    = {Albus, James and Meystel, Alex and Quintero, Richard},
      volume    = {I},
      pages     = {261--273},
      address   = {Gaithersburg},
      publisher = {National Institute of Standards and Technology (NIST)},
      year      = {1996}
    }