The Five Best Libraries For Building Data Visualizations An explosion in the number of available data sources and data-processing tools means that more people than ever are jumping into the world of data visualization. But with so much to learn, it can be intimidating to know just where to start. So which library is best, and what advice do the pros have? Read on and find out. Like telling the history of personal computers without mentioning Steve Jobs, it’s impossible to talk about data visualization without talking about D3.
GET search/tweets Returns a collection of relevant Tweets matching a specified query. Please note that Twitter’s search service and, by extension, the Search API is not meant to be an exhaustive source of Tweets. Not all Tweets will be indexed or made available via the search interface. In API v1.1, the response format of the Search API has been improved to return Tweet objects more similar to the objects you’ll find across the REST API and platform. However, perspectival attributes (fields that pertain to the perspective of the authenticating user) are not currently supported on this endpoint. To learn how to use Twitter Search effectively, consult our guide to Using the Twitter Search API.
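As a rough illustration of the GET search/tweets description above, the sketch below only builds a request URL and reads Tweet text out of an already-decoded response; it sends nothing over the network. The endpoint URL, the `q`, `count`, and `result_type` parameter names, and the top-level `statuses` key are assumptions based on the v1.1 Search API and do not appear in the text above. The helper names are hypothetical, and a real request would also need authentication, which is omitted here.

```python
from urllib.parse import urlencode

# Assumed v1.1 endpoint; not stated in the text above.
SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(query, count=15, result_type="mixed"):
    """Return the full GET URL for a search query (hypothetical helper)."""
    params = {"q": query, "count": count, "result_type": result_type}
    # urlencode percent-encodes characters such as '#' and turns spaces into '+'.
    return SEARCH_URL + "?" + urlencode(params)


def extract_texts(response):
    """Collect the text of each Tweet object from a decoded JSON response.

    Assumes the Tweets sit in a list under the "statuses" key.
    """
    return [status["text"] for status in response.get("statuses", [])]


url = build_search_url("#dataviz d3", count=10)

# A made-up response body, used only to exercise extract_texts.
sample_response = {"statuses": [{"text": "first"}, {"text": "second"}]}
texts = extract_texts(sample_response)
```

Because the search index is not exhaustive, an empty `statuses` list should be treated as "nothing indexed for this query" rather than "no such Tweets exist".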
Hadoop Gets a Boost - Cloudera Receives $25 Million Funding - ReadWriteCloud Cloudera is announcing a $25 million round of funding to further invest in product development and services to support its push into the enterprise. It's a clear sign that the Hadoop ecosystem is thriving as analytics become increasingly important and data more complex to manage. The Series C funding is being led by Meritech Capital Partners.
haloop - A modified version of Hadoop to support efficient iterative data processing on large commodity clusters Why do we develop the HaLoop project? The growing demand for large-scale data mining and data analysis applications has led both industry and academia to design new types of highly scalable data-intensive computing platforms. MapReduce and Dryad are two popular platforms in which the dataflow takes the form of a directed acyclic graph of operators. However, these new platforms do not have built-in support for iterative programs, which arise naturally in many applications including data mining, web ranking, graph processing, model fitting, and so on. What is HaLoop?
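To make the remark about iterative programs concrete, here is a minimal single-machine sketch of a web-ranking computation written as repeated map and reduce rounds. It is not HaLoop or Hadoop code and uses neither project's API; the function names, the damping factor, the tolerance, and the three-node graph are all assumptions chosen for illustration. The point is only that the loop and the convergence test live in a hand-written driver, outside the map and reduce steps.

```python
from collections import defaultdict


def map_phase(graph, ranks):
    """Emit (target, contribution) pairs: each node splits its rank over its out-links."""
    for node, targets in graph.items():
        for target in targets:
            yield target, ranks[node] / len(targets)


def reduce_phase(pairs, nodes, damping=0.85):
    """Sum the contributions per node and apply the damping factor."""
    sums = defaultdict(float)
    for target, contribution in pairs:
        sums[target] += contribution
    n = len(nodes)
    return {node: (1 - damping) / n + damping * sums[node] for node in nodes}


def iterate_until_converged(graph, tolerance=1e-6, max_rounds=200):
    """Driver loop: rerun map + reduce until the ranks stop changing."""
    ranks = {node: 1 / len(graph) for node in graph}
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        new_ranks = reduce_phase(map_phase(graph, ranks), graph)
        change = sum(abs(new_ranks[node] - ranks[node]) for node in graph)
        ranks = new_ranks
        if change < tolerance:
            break
    return ranks, rounds


# Hypothetical link graph: every node has at least one out-link.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks, rounds = iterate_until_converged(graph)
```

Note that `graph` is passed unchanged into every round while only `ranks` is updated, which is the shape of workload the paragraph above describes as lacking built-in support.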
The DataSift Platform Social data is noisy. Whether you’re trying to analyze trends within an industry, or mentions of your products or brands, you need a platform that can filter out the noise and allow you to focus on the data that’s most relevant to you. This is especially important when you are paying for the social data you receive. At the heart of the DataSift platform is a high-performance filtering engine with which you can find the exact content and conversations that are relevant to your business. Go beyond keywords and filter on more than 300 unique fields including author, location, language, and demographics. Filter with greater precision by using advanced search operations including regular expressions, text pattern matching, and substrings to get the most relevant data.
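The kind of field-based filtering described above can be sketched in plain Python as small predicates combined into one filter. This is not DataSift's query language or API; the helper names, the field names, and the sample posts are all made up for illustration.

```python
import re


def equals(field, value):
    """Match items whose field is exactly the given value."""
    return lambda item: item.get(field) == value


def substring(field, needle):
    """Case-insensitive substring match on a field."""
    return lambda item: needle.lower() in str(item.get(field, "")).lower()


def regex(field, pattern):
    """Case-insensitive regular-expression search on a field."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return lambda item: compiled.search(str(item.get(field, ""))) is not None


def all_of(*predicates):
    """Combine predicates so an item must satisfy every one of them."""
    return lambda item: all(predicate(item) for predicate in predicates)


# Hypothetical sample data.
posts = [
    {"author": "ana", "language": "en", "content": "Loving the new Acme X200 camera"},
    {"author": "ben", "language": "en", "content": "acme support was slow today"},
    {"author": "cho", "language": "fr", "content": "Acme X300 arrive demain"},
]

# English-language posts that mention a three-digit Acme X model.
wanted = all_of(equals("language", "en"), regex("content", r"\bacme\s+x\d{3}\b"))
matches = [post["author"] for post in posts if wanted(post)]

# A plain keyword match is much broader than the combined filter.
keyword_only = [post["author"] for post in posts if substring("content", "acme")(post)]
```

Comparing `matches` with `keyword_only` shows why filtering on several fields at once cuts more noise than a keyword alone.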
Datasets on Datavisualization.ch Wikileaks US Embassy Cables 29 Nov 2010 Datasets Infographic, Politics On Sunday, November 28th, Wikileaks began publishing 251,287 leaked United States embassy cables, the largest set of confidential documents ever to be released into the public domain.
Data Science Cheat Sheet I promised to write it long ago: here we go! Click on this link to see the most current version. I will update this article regularly.
From Big Data to Big Bicycles: 5 Must-See GigaOM TV Videos Summary: GigaOM TV rock-tobered this month, delivering a bevy of videos that will engage, entertain and quite possibly inspire. From business insights and optimal work habits, to cutting-edge inventions, see what you may have missed. Rocktober was a big month for GigaOM TV.
User Guide · OpenRefine/OpenRefine Wiki How to use OpenRefine If you haven't done so already, we strongly suggest you watch the screencasts first, as they will give you an idea of how to use OpenRefine. The Basics First, although OpenRefine might start out looking like a spreadsheet program (Microsoft Excel, Google Spreadsheets, etc.), don't expect it to work like a spreadsheet program.
About Kaggle and Crowdsourcing Data Modeling Kaggle is the world's largest community of data scientists. They compete with each other to solve complex data science problems, and the top competitors are invited to work on the most interesting and sensitive business problems from some of the world’s biggest companies through Masters competitions. Kaggle provides cutting-edge data science results to companies of all sizes.