
Text analysis


Perl Beginners' Site.

PHI Latin Texts.

2013. Articles. Archival Liveness: Designing with Collections Before and During Cataloguing and Digitization. Tom Schofield, Culture Lab, Newcastle University; David Kirk, Digital Interactions Group, Newcastle University; Telmo Amaral, Digital Interactions Group, Newcastle University; Marian Dörk, Potsdam University of Applied Sciences, Institute for Urban Futures; Mitchell Whitelaw, Faculty of Arts and Design, University of Canberra; Guy Schofield, Digital Interactions Group, Newcastle University; Thomas Ploetz, Digital Interactions Group, Newcastle University. We present archival liveness as a concept in design and the Digital Humanities and describe its development within a Research Through Design process.


Working with a newly acquired archive of contemporary poetry, we produced designs that both manifested and "geared in to" the temporal rhythms of the work and infrastructure of archiving.

Jane, John … Leslie?

Correcteur Orthographique de Latin. Published Sunday, 24 March 2013, 20:46. Also titled Correttore Ortografico di Latino and Corrector Ortográfico de Latín. COL (Correcteur Orthographique pour le Latin) is a free tool that helps check the spelling of a Latin text. Available for Microsoft Word, LibreOffice, and AbiWord, it includes a dictionary of about 400,000 Latin forms (classical and medieval Latin). To accommodate differing practices, COL is configurable: in particular, the user can choose a specific spelling convention (how to represent diphthongs, or the vowels u and i when they have consonantal value).

Text Mechanic™ - Text Manipulation Tools.

Dtm-Vic / Lebart. Last modified on 08/19/2013 11:42:16. Software DtmVic: Exploratory statistical processing of complex data sets comprising both numerical and textual data.


Applications primarily concern the processing of responses to open-ended questions in socio-economic sample surveys. Special emphasis on the complementary use of visualization techniques (Principal Component Analysis, Two-way and Multiple Correspondence Analysis) and clustering techniques (a hybrid method using both hierarchical clustering and k-means; Self Organizing Maps (SOM)).

Alpheios Texts.

Digital Humanities 2012.

Computational stylistics.

z:perseus-annis [Klafil]. Nouns in nominative. Here is how I searched for nouns in nominative: case="nominative" & POS="noun" & #1 _=_ #2. There turn out to be 212 annotated nominative nouns in Cicero.
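As a rough illustration of what the query does: in AQL, `_=_` is the identical-coverage operator, so `#1 _=_ #2` requires the first and second annotation to cover exactly the same span of text. The Python sketch below mimics that join on a handful of invented annotations. The `Annotation` class, the `same_span_matches` helper, and the toy data are all hypothetical and are not part of ANNIS; this is only a model of the matching logic, not how ANNIS evaluates queries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One annotation layer value covering tokens [start, end)."""
    name: str
    value: str
    start: int
    end: int


def same_span_matches(annotations, first, second):
    """Return the spans carrying both (name, value) pairs.

    This mirrors `first & second & #1 _=_ #2`: two conditions that
    must hold over an identical span.
    """
    spans_first = {(a.start, a.end) for a in annotations
                   if (a.name, a.value) == first}
    spans_second = {(a.start, a.end) for a in annotations
                    if (a.name, a.value) == second}
    return sorted(spans_first & spans_second)


# Invented toy data: four single-token spans, two annotations each.
annotations = [
    Annotation("POS", "noun", 0, 1),
    Annotation("case", "nominative", 0, 1),
    Annotation("POS", "participle", 1, 2),
    Annotation("case", "accusative", 1, 2),
    Annotation("POS", "noun", 2, 3),
    Annotation("case", "accusative", 2, 3),
    Annotation("POS", "adjective", 3, 4),
    Annotation("case", "nominative", 3, 4),
]

nominative_nouns = same_span_matches(
    annotations, ("case", "nominative"), ("POS", "noun"))
accusative_participles = same_span_matches(
    annotations, ("case", "accusative"), ("POS", "participle"))

print(nominative_nouns)
print(accusative_participles)
```

Without the span join, the two conditions would be unrelated: a nominative adjective and an accusative noun elsewhere in the sentence would each satisfy one of them, which is presumably why the query needs the third clause.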


It seems that the clause & #1 _=_ #2 is obligatory (I tried first without it, to no avail), and it seems to mean that the first and the second condition both apply to the same word. There are 105 participles in accusative: case="accusative" & POS="participle" & #1 _=_ #2. How would you search for verbs? Adverbs modifying verbs: find all verbs modified by adverbs.

Annis² Corpus Search.

Overview - ANNIS2 - a Linguistic Database for Exploring Information Structure.

Pede certo.

Index Thomisticus Treebank.

Croatian Dependency Treebank: homepage. The Croatian Dependency Treebank is one of the tasks of project 0130418, "Development of Croatian Language Resources", supported by the Ministry of Science, Education and Sports of the Republic of Croatia. Goal: to build a syntactically annotated Croatian corpus of at least 100,000 tokens. Method: annotation will be based on dependency analysis of sentences from the corpus.


The model of syntactic description and annotation is taken from the Prague Dependency Treebank.

GOLDVARB 2001 Users' Manual.

Goldvarb X.

The Cultural Heritage Language Technologies Consortium. 1. Introduction. For the past three years, the Cultural Heritage Language Technologies consortium [1], situated at eight institutions in four countries [2], has received funding from the National Science Foundation and the European Commission International Digital Libraries program to research the most effective ways to apply technologies and techniques from computational linguistics, natural language processing, and information retrieval to challenges faced by students and scholars who are working with texts written in Greek, Latin, and Old Norse [3].

A Gentle Introduction to XML. As originally published in previous editions of the Guidelines, this chapter provided a gentle introduction to 'just enough' SGML for anyone to understand how the TEI used that standard.


Since then, the Gentle Guide seems to have taken on a life of its own independent of the Guidelines, having been widely distributed (and flatteringly imitated) on the web. In revising it for the present draft, the editors have therefore felt free to reduce considerably its discussion of SGML-specific matters, in favour of a simple presentation of how the TEI uses XML. The encoding scheme defined by these Guidelines may be formulated either as an application of the ISO Standard Generalized Markup Language (SGML) or of the more recently developed W3C Extensible Markup Language (XML).

Concordance software: MonoConc Pro MP2.2.

Tapor Tools Prototype.

Association for Computational Linguistics.

V. A Gentle Introduction to XML - TEI P5: Guidelines for Electronic Text Encoding and Interchange. Strictly speaking, XML is a metalanguage, that is, a language used to describe other languages, in this case, markup languages.
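To make the idea of a markup language concrete, here is a small sketch, assuming a TEI-style verse encoding in which `lg` marks a line group and `l` a verse line. The sample fragment and variable names are invented for illustration and are not taken from the Guidelines' own examples. It uses Python's standard `xml.etree.ElementTree` module to show that the markup describes the structure of the text separately from the text itself.

```python
import xml.etree.ElementTree as ET

# Invented sample: a line group holding two numbered verse lines.
sample = """<lg>
  <l n="1">Arma virumque cano, Troiae qui primus ab oris</l>
  <l n="2">Italiam fato profugus Laviniaque venit</l>
</lg>"""

# Parse the string into a tree of elements.
root = ET.fromstring(sample)

# Collect (line number, text) pairs for every <l> element under the root.
lines = [(l.get("n"), l.text) for l in root.iter("l")]

print(root.tag)
for number, text in lines:
    print(number, text)
```

The element names carry no formatting instructions; they only say what each piece of text is, and a program decides later how to process or display it.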


Historically, the word markup has been used to describe annotation or other marks within a text intended to instruct a compositor or typist how a particular passage should be printed or laid out. Examples include wavy underlining to indicate boldface, special symbols for passages to be omitted or printed in a particular font, and so forth. As the formatting and printing of texts was automated, the term was extended to cover all sorts of special codes inserted into electronic texts to govern formatting, printing, or other processing.

Digital Classicist: index.

Collatinus.

The Latin and Ancient Greek Dependency Treebanks. The Ancient Greek and Latin Dependency Treebanks are an attempt to create a linguistic genome: a large database of Classical texts where the morphological, syntactic, and lexical information for each sentence has been explicitly encoded.


The point? To put linguistic research in Greek and Latin on a new quantitative foundation. To help drive a new generation of computational analysis. And above all, to get students and faculty both involved in the production of data that can be useful to the wider scholarly community.

XQuery Introduction.

XPath Introduction.

TextSTAT.