Text Analysis

Laurence Anthony's Software. WebParaNews: a web-based interface to the Japanese-English News Articles database of the National Institute of Information and Communications Technology (NICT).

Laurence Anthony's Software

WebParaNews is developed in collaboration with Kiyomi CHUJO (Nihon University, Japan). The WebParaNews Concordancer. A web-based interface to the ENEJE (English Native Edited Japanese Essays) Parallel Corpus. The ENEJE Parallel Corpus is developed by Laurence ANTHONY (Waseda University, Japan) in collaboration with Nozomi MIKI (Komazawa University, Japan). The ENEJE Concordancer. A web-based interface to the EXEMPRAES (Exemplary Empirical Research Articles in English and Spanish) Corpus.

To be released shortly. A (commercial) learning environment for teaching grammar and vocabulary to EFL learners in Japan. A Statistical Analysis of the Work of Bob Ross. Bob Ross was a consummate teacher.

A Statistical Analysis of the Work of Bob Ross

He guided fans along as he painted “happy trees,” “almighty mountains” and “fluffy clouds” over the course of his 11-year television career on his PBS show, “The Joy of Painting.” In total, Ross painted 381 works on the show, relying on a distinct set of elements, scenes and themes, and thereby providing thousands of data points. I decided to use that data to teach something myself: the important statistical concepts of conditional probability and clustering, as well as a lesson on the limitations of data. So let’s perm out our hair and get ready to create some happy spreadsheets! What I found, through data analysis and an interview with one of Ross’s closest collaborators, was a body of work that was defined by consistency and a fundamentally personal ideal. Stanford Literary Lab. Humanities Data in R.

Discourse analysis. Discourse analysis (DA), or discourse studies, is a general term for a number of approaches to analyze written, vocal, or sign language use, or any significant semiotic event.

Discourse analysis

Text analysis, wordcount, keyword density analyzer, prominence analysis. Statistical Methods for Studying Literature Using R. R is a powerful programming language for statistical analysis and visualization that can be broadly used for many applications in the digital humanities.

Statistical Methods for Studying Literature Using R

As with any programming language, getting started with R involves a steep initial learning curve in order to produce useful results. In its current form, this blog contains the notes from a hands-on workshop that I initially ran at the University of Kansas's Digital Humanities Forum/THATCamp Representing Knowledge in the Digital Humanities in September of 2011 and expanded with a more literary focus at the University of Kansas 2012 Digital Humanities Forum. It was further revised for an additional workshop at the University of Iowa Oberman Center for Advanced Study in the fall of 2014. The examples are based on three different data sets. Intro To Text Analysis With R. Guest post by Christopher Johnson from One of the most powerful aspects of using R is that you can download free packages for so many tools and types of analysis.

Intro To Text Analysis With R

Text analysis is still somewhat in its infancy, but is very promising. It is estimated that as much as 80% of the world’s data is unstructured, while most types of analysis only work with structured data. In this paper, we will explore the potential of R packages to analyze unstructured text. R provides two packages for working with unstructured text – TM and Sentiment. install.packages("devtools") require(devtools) install_url(" install_url(" install_url(" The remaining required packages can be installed as follows. install.packages("plyr") » Text Analysis with R for Students of Literature Matthew L. Jockers. Text Analysis with R for Students of Literature provides a practical introduction to computational text analysis using the open source programming language R.

» Text Analysis with R for Students of Literature Matthew L. Jockers

Readers begin working with text right away and each chapter works through a new technique or process such that readers gain a broad exposure to core R procedures and a basic understanding of the possibilities of computational text analysis at both the micro and macro scale. View the Book Flyer [pdf 1.4MB] Introduction to the RStudio Programming Environment [Video]. Text Analyzer - Text analysis Tool - Counts Frequencies of Words, Characters, Sentences and Syllables. TAPoR: Text Analysis Tools. Romancing the Novel: Large Scale Text Analysis in the Humanities (by Mark Algee-Hewitt) Large-Scale Text Analysis with R - HILT 2015. Text mining, the practice of using computational and statistical analysis on large collections of digitized text, is becoming an increasingly important way of extracting meaning from writing.

Large-Scale Text Analysis with R - HILT 2015

Whether working on survey data, medical records, political speeches or even digitized collections of historical writing, we are now able to use the power of computational algorithms to extract patterns from vast quantities of textual data. This technique gives us information we could never access by simply reading the texts. But determining which patterns have meaning and which answer key questions about our data is a difficult task, both conceptually and methodologically, particularly for those who work in the humanities, who stand to benefit the most from these methods. Searching; Visualized: “The Book History Bibliograph” With the increased interest in the material aspects of the book, the field of book history has seen rapid expansion in the past twenty years.

Searching; Visualized: “The Book History Bibliograph”

Because book history is an extremely broad area of research, the problem of finding sources in multiple languages and disciplines has been of continuing concern. “The Book History Bibliograph”, a bibliographic tool currently under development between Stanford and the University of Edinburgh, with input from McGill, proposes creative solutions to cross-disciplinary and multi-lingual searching. Under the direction of Dr. Tom Mole (University of Edinburgh), the SSHRC-funded project is one of many initiatives supported by the “Interacting with Print” group at McGill. Currently in “beta”, the Bibliograph database contains about 500 sources; Dr. Automated Data Collection with R: A Practical Guide to Web Scraping and Text ... - Simon Munzert, Christian Rubba, Peter Meißner, Dominic Nyhuis - Google Books.

Text Analysis with R for Students of Literature. From the book reviews: “This is a well written book on the topic of Text Analysis.

Text Analysis with R for Students of Literature

There is enough information to give you a good start using R. Followed by easy to understand details about text analysis. … This is a good book to have if you are doing text analysis.” (Mary Anne, Cats and Dogs with Data, August, 2014) “A remarkably well-crafted book that will allow students to get a quick start and progress toward quite sophisticated text mining tasks. … exercises provided at the end of each chapter, with solutions at the end of the book, should serve well to help students solidify their knowledge and gain more confidence in their text mining skills. … a great addition to the libraries of digital humanists and natural language enthusiasts who wish to expand their programming literacy … .”