AWS Public Data Sets
A data set containing Google Books n-gram corpora. This data set is freely available on Amazon S3 in a Hadoop-friendly file format and is licensed under a Creative Commons Attribution 3.0 Unported License. The original dataset is available from
Last Modified: Jan 12, 2015 21:46 PM GMT
High-resolution climate data to help assess the impacts of climate change, primarily on agriculture. These open-access datasets of climate projections will help researchers make climate change impact assessments.
MAMA - Dev.Opera
By Brian Wilson. MAMA: What is the Web made of? The Web has search engines, many of them.

New York Times APIs
The Article Search API: Search Times articles from 1851 to today, retrieving headlines, abstracts and links to associated multimedia.
The Books API: Retrieve New York Times book reviews and get data from all best-seller lists.
The Campaign Finance API

Tf-Idf and Cosine similarity
In 1998, Google handled an average of 9,800 search queries per day. By 2012, this number had shot up to an average of 5.13 billion searches per day. The graph given below shows this astronomical growth.
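For the Tf-Idf and cosine similarity entry, a minimal sketch of the two techniques it names may help. It assumes one common weighting (raw term count times log(N / document frequency)), which is not necessarily the variant used in the linked article. The function names and the three toy documents are illustrative only.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build tf-idf vectors for a list of tokenized documents.

    Assumed weighting: tf(t, d) * log(N / df(t)), one common variant.
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(df)
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # One component per vocabulary term; absent terms get weight 0.
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # Convention chosen here for all-zero vectors.
    return dot / (norm_u * norm_v)


# Toy documents (made up for illustration).
docs = [
    "the web has search engines".split(),
    "search engines index the web".split(),
    "climate data for agriculture".split(),
]
vocab, vectors = tfidf_vectors(docs)
sim_01 = cosine_similarity(vectors[0], vectors[1])  # shares four terms
sim_02 = cosine_similarity(vectors[0], vectors[2])  # shares no terms
```

In this toy setup the first two documents share most of their terms and so score higher than the unrelated pair, whose similarity is zero because they have no terms in common.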
Data Repository List
From Open Access Directory. This list is part of the Open Access Directory. This is a list of repositories and databases for open data.

Dot products
We denote by V(d) the vector derived from document d, with one component in the vector for each dictionary term.

Purdue D2C2 - Projects
Scott Brandt. S. Brandt (w/ M. Witt and J. Carlson, and others) "Investigating Data Curation Profiles Across Multiple Research Disciplines."

Scientific publications
All publications from Proxem are available on HAL by Inria, the French open-access repository for scientific publishing.
Une approche paresseuse de l’analyse sémantique ou comment construire une interface syntaxe-sémantique à partir d’exemples. TALN 2010, Montréal. This article shows how to extract a syntax-semantics interface starting from an interchangeable dependency parser, many lexical resources, and samples associated with the semantic representations that one wishes to compute.
10 Awesome Twitter Analytics and Visualization Tools
Recently Twitter rolled out its native analytics platform for all users, and now you can get some quality data about your tweets directly from Twitter. While researching over a thousand Twitter tools for the Twitter Tools Book, I came across many Twitter analytics and visualization tools. These tools were designed to add value by presenting a different way to visualize or analyze your tweets, the people in your network, and the tweets from the people in your network. Many tools tried to add value and failed. At least they tried.

Funding Support (Grant) Information in MEDLINE/PubMed
An important aspect of scientific publication is the indication of funding support. The Medical Subject Headings (MeSH®), which is used by the National Library of Medicine® (NLM®) to describe the content of journal articles for MEDLINE®, includes Publication Types that identify financial support of the research behind a published paper when that support is mentioned in the article:
Research Support, Non-U.S. Gov't
Research Support, American Recovery & Reinvestment Act
Research Support, U.S.
24 Data Science Resources to Keep Your Finger on the Pulse
There are lots of resources out there to learn about, or to build upon what you already know about, data science. But where do you start? What are some of the best or most authoritative sources? Here are some websites, books, and other resources that we think are outstanding.