Tools for Exploring Text: Natural Language Processing. Natural language processing (NLP), also known as computational linguistics, is a set of models and techniques for analyzing text computationally. In the context of the digital humanities, it can take a question that a literary scholar or historian might ask of a body of text and help turn it into a quantitative hypothesis. In a previous post, I talked about how visualization can be used to get a sense of text; this is the next post in the series. Throughout this post, we'll try to answer a hypothetical question a scholar in the humanities, perhaps a literary scholar or historian, might ask: "How is the character Mary talked about in this novel or historical text?"
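As a minimal sketch of how such a question becomes quantitative, the pure-Python snippet below counts the words that appear near "Mary" in a text. The sample sentences are invented for illustration; a real analysis would use an NLP library such as NLTK or spaCy and a full novel as input.

```python
import re
from collections import Counter

def cooccurrence_counts(text, target="mary", window=3):
    """Count words appearing within `window` tokens of the target word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for neighbor in tokens[lo:hi]:
                if neighbor != target:
                    counts[neighbor] += 1
    return counts

# Invented sample text, for illustration only.
sample = ("Mary spoke gently to the children. "
          "The children adored gentle Mary. "
          "Mary walked alone through the quiet garden.")
print(cooccurrence_counts(sample).most_common(3))
```

The most frequent neighbors give a first, rough answer to "how is Mary talked about"; from there one could restrict to adjectives or verbs using a part-of-speech tagger.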
Statistics, by Terry M. Therneau, Ph.D., Faculty, Mayo Clinic. About a year ago there was a query on the R help list about how to do "type 3" tests for a Cox model, which someone wanted because SAS does it.

HackNTU, the NTU hackathon annual meetup, on Accupass. Everyone has a hero inside and wants to use their own strength to make the world a better place! Hackers use technology to build products that enrich people's lives; entrepreneurs build business models that put good products into everyone's hands. And when Hacker meets Entrepreneur, the world is about to be turned upside down!

Data Visualizations: 11 Ways To Bring Analytics To Life. Used well, data visualizations can help people make sense of large, complex data. Learn how data visualizations are changing and the best practices for making the most of them. User demand for interactive data visualization is driving an evolution in the way that complex information is presented.
A gentle introduction to historical data analysis. It is surprisingly easy to use tools to explore texts, greatly improving research efficiency and opening new research doors. The following techniques are incredibly useful for small to intermediate amounts of text. They do not scale up to huge amounts of data, but then again most historians don't work with huge amounts of data. One example is using Voyant to explore a single text or a set of texts.

Our top 10 Data Science articles in 2014. 2014 has been a year of growth for us: we now get 10x the traffic we did 12 months ago. It gives us immense satisfaction to be able to create something that is helping more and more people every day.
San Francisco Bay Area Professional Chapter. Toyota InfoTechnology Center, U.S.A., Inc. (Toyota-ITC), together with ACM, announces sponsorship of the Toyota ACM Quantified-Car Hackathon. Come join this competition to create apps that use the API for vehicle data on either iOS or Android platforms. At these events, Toyota-ITC will provide qualified developers with access to Toyota's vehicle data streaming API, a first for the industry in the US. The event will be preceded by API roll-out and idea brainstorming sessions (Ideathons), where you can learn about and discuss the API, generate app ideas, and form teams. Step 1: Learn about the API
Open data summer showcase. The ODI commissions projects demonstrating the impact of open data. Do you have a great idea for an open data project but need help getting it off the ground? Are you working on a project that you would like to promote more widely? As part of its open data summer showcase, the ODI is looking for great examples of open data impact, to support their development and promote them to a global audience. Successful projects will showcase tangible economic, social, or environmental impacts.

Where to start with text mining. This post is an outline of discussion topics I'm proposing for a workshop at NASSR2012 (a conference of Romanticists). I'm putting it on the blog since some of the links might be useful to a broader audience. In the morning I'll give a few examples of concrete literary results produced by text mining.
Must-read books for Analysts (or people interested in Analytics). One of the ways I continue my learning is reading. I read for 30 minutes before going to bed every day. This not only makes sure that I learn something daily, but also ends my day in a fulfilling manner. Over the years, I have read a variety of books on various subjects. In this article, I will share a list of 7 must-read books which I think should be present on every Analyst's bookshelf. Each book listed below has helped me learn about Analytics, and I expect they will immensely help people who want to learn about this field.
Data.gov.uk To Go. Data.gov.uk proudly announces 'Data.gov.uk To Go', a package containing the software behind the well-known open data website Data.gov.uk. It allows other governments and open data communities to quickly install and customize a full open data website and develop it further in partnership with the worldwide community. Although the central components, CKAN, Drupal, and data.gov.uk's custom components, have each been open source for several years, few have used them in combination, due to the complexity of set-up. Data.gov.uk To Go provides an organized way to configure these components and quickly launch a fully featured open data portal along the lines of data.gov.uk, building on CKAN's data catalogue to provide extra features.
Designing Data-Driven Interfaces, from Truth Labs. "Dashboard", "Big Data", "Data visualization", "Analytics": there's been an explosion of people and companies looking to do interesting things with their data. I've been lucky to work on dozens of data-heavy interfaces throughout my career, and I wanted to share some thoughts on how to arrive at a distinct and meaningful product. Many people have already tackled this topic, so I'm going to try to stick to the parts of our process that have the most impact. 1. Different users, different data. Whenever you're designing complex systems, there will inevitably be multiple users or personas to design for.

Data mining. Data mining is the computational process of discovering patterns in large data sets, involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. It is an interdisciplinary subfield of computer science. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD. Etymology: in the 1960s, statisticians used terms like "data fishing" or "data dredging" to refer to what they considered the bad practice of analyzing data without an a priori hypothesis.
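As a minimal illustration of the pattern-discovery step described above, the following self-contained Python sketch counts frequent item pairs across a handful of invented transactions, a simplified version of market-basket analysis. The baskets are made up for illustration; real data mining work would use dedicated libraries and far larger data sets.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support=2):
    """Return item pairs that co-occur in at least `min_support` transactions."""
    counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Invented toy transactions, for illustration only.
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "eggs", "butter"],
]
print(frequent_pairs(baskets))
```

The surviving pairs are the "discovered patterns" in miniature; the pre-processing, interestingness metrics, and post-processing mentioned above are what turn such raw counts into usable knowledge on real data.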