
Twitter


A list of datasets for opinion mining in Twitter

Are there any Twitter corpora available on the web for text/opinion mining research?

Thanks, Finn.
Regards, Richard

I maintain a list here:

Here is another one which might be of use:

Hi Richard,
There are quite a few datasets manually annotated for sentiment analysis, such as:
- Stanford Twitter Corpus
- HCR and OMD datasets
- Sentiment Strength Corpora
- Sanders
- SemEval
Cheers

Thanks a lot, Mathias.

Hello Richard,

Twitter Sentiment Analysis Training Corpus (Dataset) | Thinknook

An essential part of creating a sentiment analysis algorithm (or any data mining algorithm, for that matter) is having a comprehensive dataset or corpus to learn from, as well as a separate test dataset to check that the accuracy of your algorithm meets the standard you expect. A test set also lets you tweak your algorithm and work out better (or more precise) natural-language features to extract from the text, features that contribute to stronger sentiment classification than a generic “word bag” approach. This post contains a corpus of tweets that have already been classified by sentiment. This Twitter sentiment dataset is by no means diverse and should not be used in a final product for sentiment analysis, at least not without diluting it with a much more diverse one.
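To make the train-then-test workflow concrete, here is a minimal sketch of the generic “word bag” baseline mentioned above. It is not the method used in the post: the classifier (Naive Bayes with add-one smoothing) is an assumed choice, and the tweets, labels, and function names are hypothetical, invented purely for illustration. In practice the examples would be loaded from a labelled corpus and split into training and test portions.

```python
import math
from collections import Counter

# Hypothetical toy tweets, invented for illustration only.
# Label 1 = positive, 0 = negative.
TRAIN = [
    ("i love my new phone", 1),
    ("what a great day", 1),
    ("love this great song", 1),
    ("i hate waiting in line", 0),
    ("this is a terrible day", 0),
    ("hate this awful traffic", 0),
]
# Held-out examples that the model never sees during training.
TEST = [
    ("i love this day", 1),
    ("awful terrible service", 0),
]


def tokenize(text):
    """Split on whitespace; a real tokenizer would handle hashtags, URLs, etc."""
    return text.lower().split()


def train(examples):
    """Count words per class (the bag of words) and documents per class."""
    word_counts = {}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the label with the highest Naive Bayes log-score."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(predict(model, text) == label for text, label in examples)
    return correct / len(examples)


model = train(TRAIN)
print("test accuracy:", accuracy(model, TEST))
```

Keeping the test tweets out of training is what makes the accuracy figure meaningful. A handful of examples like this proves nothing about real tweets, which is the same diversity concern the post raises about its own dataset.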

The dataset is based on data from the following two sources:

I really hate Apple and like Samsung.
Data < Sentiment Analysis in Twitter.
Twitter-as-a-Corpus-for-Sentiment-Analysis-and-Opinion-Mining.pdf
Public domain twitter sentiment corpus.