
Stylometry
Stylometry is often used to attribute authorship to anonymous or disputed documents. It has legal as well as academic and literary applications, ranging from the question of the authorship of Shakespeare's works to forensic linguistics.

History: Stylometry grew out of earlier techniques for analyzing texts for evidence of authenticity, authorial identity, and other questions. The basics of stylometry were set out by the Polish philosopher Wincenty Lutosławski in Principes de stylométrie (1890).

Methods: Modern stylometry draws heavily on computers for statistical analysis and artificial intelligence, and on access to the growing corpus of texts available via the Internet. Whereas stylometry once emphasized the rarest or most striking elements of a text, contemporary techniques can isolate identifying patterns even in common parts of speech.

Writer invariant: The primary stylometric method is the writer invariant: a property of a text that stays constant across everything a given author writes.
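To make the idea concrete, here is a minimal sketch of one common family of stylometric features: relative frequencies of function words (the "common parts of speech" mentioned above), compared with cosine similarity. The word list, sample texts, and function names are illustrative, not drawn from any particular published system.

```python
# Toy "writer invariant": profile each text by the relative frequency of
# common function words, then compare the two profiles. Texts by the same
# author tend to score closer to 1.0, though real systems use far more
# features than this illustrative list.
import re
from collections import Counter
from math import sqrt

FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it",
                  "is", "was", "he", "for", "with", "as", "his", "on"]

def function_word_profile(text):
    """Relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = max(len(tokens), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine_similarity(a, b):
    """Cosine similarity between two frequency profiles."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

text_a = "It was the best of times, it was the worst of times."
text_b = "It is a truth universally acknowledged that a single man is in want of a wife."
print(cosine_similarity(function_word_profile(text_a),
                        function_word_profile(text_b)))
```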

Graphing the history of philosophy « Drunks&Lampposts

A close-up of ancient and medieval philosophy, ending at Descartes and Leibniz. If you are interested in this data set you might like my latest post, where I use it to make book recommendations. This one came about because I was searching for a data set on horror films (don't ask) and ended up with one describing the links between philosophers. To cut a long story very short, I've extracted the information in the "influenced by" section for every philosopher on Wikipedia and used it to construct a network, which I've then visualised using Gephi. It's an easy process to repeat; a sketch is given below. First I'll show why I think it's worked as a visualisation. Each philosopher is a node in the network, and the lines between them (or edges, in the terminology of graph theory) represent lines of influence. It gets more interesting when we use Gephi to identify communities (or modules) within the network. It has been fairly successful. The Continental Tradition: the graph is probably most insightful when you zoom in close.
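The sketch below mirrors the pipeline the post describes, assuming the scraped "influenced by" pairs are already in hand (the three pairs shown are stand-ins, and networkx is one library choice among several). GEXF is a format Gephi opens directly; layout and community detection then happen inside Gephi.

```python
# Turn (influencer, influenced) pairs into a directed graph and export it
# as GEXF for Gephi. The edge list is a placeholder for data scraped from
# the "Influenced by" fields of Wikipedia philosopher pages.
import networkx as nx

influence_pairs = [
    ("Plato", "Aristotle"),
    ("Aristotle", "Aquinas"),
    ("Descartes", "Leibniz"),
]

G = nx.DiGraph()
G.add_edges_from(influence_pairs)

# Gephi reads this file directly; apply a force-directed layout and
# modularity-based community detection there.
nx.write_gexf(G, "philosophers.gexf")
```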

The reclining female nude from antiquity to the present day (Achille della Ragione)

The prototype of the reclining female nude is generally traced back to Giorgione, although as early as the first century AD an unknown Roman artist had painted a marine Venus flanked by two cupids on a wall of the peristyle of a patrician house in Pompeii. Unfortunately, a ruinous eruption would erase the splendid goddess of love and her young naked body from human memory for roughly two thousand years and forbid her sight. By the time she was unearthed, artists had once again created that powerful image, capable of shaking off torpor and kindling the imagination, and they have never stopped since. fig. 00: Unknown Roman artist, Marine Venus, first century AD. The sixteenth century opens the spectacular series of nude Venuses with the most sensual and mysterious of Giorgione's creations, the Sleeping Venus (fig. 1); in 1509 he gives us the immortal image of a placid girl who dreams and makes us dream. fig. 03: Michelangelo, Night (detail).

Analysis: Jean Lievens: Wikinomics Model for Value of Open Data

A visual model showing the value of open data. I bought the book Business Model Generation: A Handbook for Visionaries, Game Changers, and Challengers [72 slides free online at SlideShare] by Alexander Osterwalder. The book itself also has a new business model. Value Model of Open Data.

Graphic: Four Forces After Next with IO Updated. Citation: Robert David Steele, "Graphic: Four Forces After Next with IO Updated," Phi Beta Iota Public Intelligence Blog (3 April 2013).

Linguistics and the Book of Mormon

According to most adherents of the Latter Day Saint movement, the Book of Mormon is a 19th-century translation of a record of ancient inhabitants of the American continent, written in a script the book refers to as "reformed Egyptian."[1][2][3][4][5] This claim, like virtually all claims to the historical authenticity of the Book of Mormon, is generally rejected by non–Latter Day Saint historians and scientists.[6][7][8][9][10] Linguistically based assertions are frequently cited and discussed in connection with the Book of Mormon, both for and against the book's claimed origins, and both critics and promoters have used linguistic methods to analyze the text. The difficulty with linguistic reviews of the Book of Mormon is that the claimed original text is either unavailable for study or never existed.

Native American language development: In 1922, LDS Church general authority B. …

Linguistic anachronisms: …

The Signature Stylometric System

The aim of this website is to highlight the many strong links between Philosophy and Computing, for the benefit of students of both disciplines:

- For students of Philosophy who are seeking ways into formal Computing: learning by discovery about programming, how computers work, language processing, artificial intelligence, and even conducting computerised thought experiments on philosophically interesting problems such as the evolution of co-operative behaviour.
- For students of Computing who are keen to see how their technical abilities can be applied to intellectually exciting and philosophically challenging problems.

The links along the top of these web pages lead to the main sections of the website. This website is under development by Peter Millican, Fellow and Professor of Philosophy at Hertford College, Oxford University, who previously taught both Philosophy and Computing for 20 years at the University of Leeds.

JGAAP

DiscoverText: a text-analytic toolkit for eDiscovery and research

Concordance: software for concordancing and text analysis

Lexico (downloadable app for the PC), by Cédric Lamalle, William Martinez, Serge Fleury, and André Salem. Lexico3 is produced by the SYLED-CLA2T university team and is distributed commercially. An English-language version, a tutorial, and a QuickTime animation demonstrating Lexico3 are available. Textometric explorations: we are currently collecting several reports of experiments carried out with the Lexico family of software in the course of numerous research projects and various collaborations. Reference work: Lebart, L. & Salem, A. (1994). A Glossaire de Statistiques Textuelles (glossary of textual statistics) is available on the wiki of the Groupe d'Analyse des Données Textuelles.
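For readers unfamiliar with the term, concordancing means listing every occurrence of a word together with its surrounding context (keyword in context, or KWIC). The sketch below is a minimal illustration of the idea, not the behavior of Concordance, Lexico3, or any other tool listed here.

```python
# Minimal keyword-in-context (KWIC) concordance: print each occurrence of
# a keyword with a few words of context on either side.
import re

def kwic(text, keyword, width=4):
    """List each occurrence of `keyword` with `width` words of context."""
    tokens = re.findall(r"\w+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left:>30} [{tok}] {right}")
    return lines

sample = "The law is named after the linguist who first proposed the law in 1935."
for line in kwic(sample, "law"):
    print(line)
```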

untitled (Ludovic Lebart and André Salem), with a preface by Christian Baudelot

Chapter 0: Preface, contents, foreword, introduction (PDF)

Chapter 1: Domains and problems (PDF). The first chapter covers the disciplinary domains involved (linguistics, statistics, computer science) together with the problems and approaches.

Chapter 2: The units of textual statistics (PDF). The second chapter is devoted to the statistical units that lexicometric programs must segment or recognize (forms, repeated segments); a toy illustration follows after this list.

Chapter 3: Correspondence analysis (PDF)

Chapter 4: Automatic classification of forms and texts (PDF)

Chapter 5: Typologies, visualizations (PDF). The fifth chapter applies the tools presented in chapters three and four to the description of associations between forms and between categories.
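To make Chapter 2's units concrete: a repeated segment is a word sequence that recurs verbatim in a corpus. The sketch below is a toy version under a crude whitespace-and-punctuation tokenization; Lexico-style software defines forms and segments much more carefully.

```python
# Toy repeated-segment finder: count n-word sequences that occur at least
# `min_count` times in the text.
import re
from collections import Counter

def repeated_segments(text, n=2, min_count=2):
    """Count n-word sequences occurring at least `min_count` times."""
    tokens = re.findall(r"\w+", text.lower())
    ngrams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(ngrams)
    return {seg: c for seg, c in counts.items() if c >= min_count}

print(repeated_segments("to be or not to be, that is the question", n=2))
# {('to', 'be'): 2}
```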

Semantic Search Engine and Text Analysis

BigSee (WikiTADA)

This page is for the SHARCNET and TAPoR text visualization project; note that it is a work in progress, as this is an ongoing project. At the University of Alberta we picked up the project and gave a paper at the Chicago Colloquium on Digital Humanities and Computer Science under the title The Big See: Large Scale Visualization. The Big See is an experiment in high-performance text visualization. We are looking at how a text or corpus of texts could be represented if processing and the resolution of the display were not an issue. Most text visualizations, like word clouds and distribution graphs, are designed for the personal computer screen.

Project goals: this project imagines possible paradigms for the visual representation of a text that could scale up to very high resolution displays (data walls), 3D displays, and animated displays.

Participants: Geoffrey Rockwell is a Professor of Philosophy and Humanities Computing at the University of Alberta.

Research direction: collocation graphs in 3D space.
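As a hint of what a collocation graph contains, the sketch below (an illustrative minimum, not the project's code) counts how often each pair of words co-occurs within a small window; the pairs become weighted edges that a 2D or 3D layout could then position. The window size is an arbitrary illustrative choice.

```python
# Minimal collocation-graph builder: words are nodes, and an edge weight is
# the number of times two words co-occur within `window` tokens of each other.
import re
from collections import Counter

def collocation_edges(text, window=5):
    tokens = re.findall(r"\w+", text.lower())
    edges = Counter()
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            if w != tokens[j]:
                edges[tuple(sorted((w, tokens[j])))] += 1
    return edges

sample = "the big see is an experiment in high performance text visualization"
print(collocation_edges(sample).most_common(5))
```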

Zipf's law

Zipf's law (/ˈzɪf/), an empirical law formulated using mathematical statistics, refers to the fact that many types of data studied in the physical and social sciences can be approximated with a Zipfian distribution, one of a family of related discrete power-law probability distributions. The law is named after the American linguist George Kingsley Zipf (1902–1950), who first proposed it (Zipf 1935, 1949), though the French stenographer Jean-Baptiste Estoup (1868–1950) appears to have noticed the regularity before Zipf.[1] It was also noted in 1913 by the German physicist Felix Auerbach (1856–1933).[2]

Motivation: Zipf's law states that given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table. Thus the most frequent word will occur approximately twice as often as the second most frequent word, three times as often as the third most frequent word, and so on.

Theoretical review: Formally, let N be the number of elements, k their rank, and s the value of the exponent characterizing the distribution (classically s = 1). Zipf's law then predicts that the normalized frequency of the element of rank k is

f(k; s, N) = \frac{1/k^{s}}{\sum_{n=1}^{N} 1/n^{s}}.
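This prediction is easy to check empirically. The sketch below (a rough illustration, not any standard tool) ranks the words of a corpus by frequency and prints the observed count next to f(1)/k, the count the law predicts for s = 1; "corpus.txt" is a placeholder path for any large plain-text file.

```python
# Rough empirical check of Zipf's law: rank words by frequency and compare
# each observed count with f(1)/k, the count predicted for exponent s = 1.
import re
from collections import Counter

def zipf_table(text, top=10):
    """Print observed vs. Zipf-predicted counts for the `top` ranked words."""
    tokens = re.findall(r"\w+", text.lower())
    ranked = Counter(tokens).most_common(top)
    f1 = ranked[0][1]  # count of the most frequent word
    for k, (word, count) in enumerate(ranked, start=1):
        print(f"{k:>3}  {word:<15} observed={count:<6} predicted={f1 / k:8.1f}")

# "corpus.txt" is a placeholder; the fit only emerges on large corpora.
with open("corpus.txt", encoding="utf-8") as fh:
    zipf_table(fh.read())
```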
