
dataset
As useful as the Twitter API is, developers, designers, and researchers have long clamored for more than the trickle of data that the service currently allows. We agree: some of the sexiest uses of data require processing not just the current stream, but the vast historical record. Twitter doesn’t provide the only use case for this, but until now its historical bulk data has been hard to find. Today we are publishing a few items collected from our large scrape of Twitter’s API. The data was collected, cleaned, and packaged over twelve months and contains almost the entire history of Twitter: 35 million users, one billion relationships, and half a billion Tweets, reaching back to March 2006. The initial datasets are part of our Twitter Census collection.
Twitter Census: Publishing the First of Many Datasets | blog.inf
Bixo Labs has merged with Scale Unlimited and is now providing complete consulting and training services for a wide range of big data problems, including web crawling, data mining and search.
Why Did Bixo Labs Merge With Scale Unlimited?
During client engagements, we repeatedly saw the need for mentoring and training to ensure a smooth hand-off of projects to internal team members.

