Wikidition

Wikiditions are wiki-based online editions of text corpora and their associated lexica. Each token of a text is lemmatized, tagged, and linked to a syntactic word in a lexicon that is itself part of the Wikidition. In addition, several similarity measures are implemented that provide links to similar texts, sentences, lemmas, and syntactic words.
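
The linking described above can be pictured with a small data model. The following is only a minimal sketch under stated assumptions, not Wikidition's implementation: the class names `SyntacticWord` and `Token`, the helper `tok`, the tag strings, and the hand-annotated Latin sentences are all hypothetical, and Jaccard overlap of lemma sets stands in for the similarity measures, which the description does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntacticWord:
    """Hypothetical lexicon entry: a lemma in one grammatical reading."""
    lemma: str
    pos: str
    features: str


@dataclass(frozen=True)
class Token:
    """One running word of a text, linked to its lexicon entry."""
    form: str
    entry: SyntacticWord


def tok(form, lemma, pos, features):
    """Build a token together with the syntactic word it links to."""
    return Token(form, SyntacticWord(lemma, pos, features))


def jaccard(a, b):
    """Jaccard overlap of two sets (an assumed stand-in similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Two hand-annotated toy sentences (illustrative data, not Wikidition output).
s1 = [tok("rex", "rex", "NOUN", "nom.sg"),
      tok("legem", "lex", "NOUN", "acc.sg"),
      tok("dat", "do", "VERB", "3.sg.pres")]
s2 = [tok("reges", "rex", "NOUN", "nom.pl"),
      tok("leges", "lex", "NOUN", "acc.pl"),
      tok("dant", "do", "VERB", "3.pl.pres")]

# Compare the sentences once by lemma and once by surface wordform.
lemma_sim = jaccard({t.entry.lemma for t in s1}, {t.entry.lemma for t in s2})
form_sim = jaccard({t.form for t in s1}, {t.form for t in s2})
print(lemma_sim, form_sim)  # prints "1.0 0.0"
```

In this toy case the two sentences share no wordform but have identical lemma sets, which illustrates why linking tokens to lexicon entries helps when looking for similar sentences.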

Name               Language  Texts  Publication
Capitularies Wiki  Latin     307    [1]
Kafka Wiki         German    2      [2]
[1] A. Mehler, R. Gleim, T. vor der Brück, W. Hemati, T. Uslu, and S. Eger, “Wikidition: Automatic Lexiconization and Linkification of Text Corpora,” Information Technology, pp. 70-79, 2016. doi: 10.1515/itit-2015-0035
@Article{Mehler:et:al:2016,
  Author         = {Alexander Mehler and Rüdiger Gleim and Tim vor der
                   Brück and Wahed Hemati and Tolga Uslu and Steffen Eger},
  Title          = {Wikidition: Automatic Lexiconization and
                   Linkification of Text Corpora},
  Journal        = {Information Technology},
  Pages          = {70--79},
  abstract       = {We introduce a new text technology, called Wikidition,
                   which automatically generates large scale editions of
                   corpora of natural language texts. Wikidition combines
                   a wide range of text mining tools for automatically
                   linking lexical, sentential and textual units. This
                   includes the extraction of corpus-specific lexica down
                   to the level of syntactic words and their grammatical
                   categories. To this end, we introduce a novel measure
                   of text reuse and exemplify Wikidition by means of the
                   capitularies, that is, a corpus of Medieval Latin
                   texts.},
  doi            = {10.1515/itit-2015-0035},
  year           = 2016
}
[2] A. Mehler, B. Wagner, and R. Gleim, “Wikidition: Towards A Multi-layer Network Model of Intertextuality,” in Proceedings of DH 2016, 12-16 July, 2016.
@InProceedings{Mehler:Wagner:Gleim:2016,
  Author         = {Mehler, Alexander and Wagner, Benno and Gleim,
                   R\"{u}diger},
  Title          = {Wikidition: Towards A Multi-layer Network Model of
                   Intertextuality},
  BookTitle      = {Proceedings of DH 2016, 12-16 July},
  Series         = {DH 2016},
  abstract       = {The paper presents Wikidition, a novel text mining
                   tool for generating online editions of text corpora. It
                   explores lexical, sentential and textual relations to
                   span multi-layer networks (linkification) that allow
                   for browsing syntagmatic and paradigmatic relations
                   among the constituents of its input texts. In this way,
                   relations of text reuse can be explored together with
                   lexical relations within the same literary memory
                   information system. Beyond that, Wikidition contains a
                   module for automatic lexiconisation to extract author
                   specific vocabularies. Based on linkification and
                   lexiconisation, Wikidition does not only allow for
                   traversing input corpora on different (lexical,
                   sentential and textual) levels. Rather, its readers can
                   also study the vocabulary of authors on several levels
                   of resolution including superlemmas, lemmas, syntactic
                   words and wordforms. We exemplify Wikidition by a range
                   of literary texts and evaluate it by means of the
                   apparatus of quantitative network analysis.},
  location       = {Kraków},
  url            = {http://dh2016.adho.org/abstracts/250},
  year           = 2016
}