Architecting the future of big data

Data Platform Not only open-source, but built in the open. HDP demonstrates our commitment to growing Hadoop and its sub-projects with the community and completely in the open. HDP is assembled entirely of projects built through the Apache Software Foundation. How is this different from open-source, and why is it so important?

About Kaggle and Crowdsourcing Data Modeling Kaggle is the world's largest community of data scientists. They compete with each other to solve complex data science problems, and the top competitors are invited to work on the most interesting and sensitive business problems from some of the world’s biggest companies through Masters competitions. Kaggle provides cutting-edge data science results to companies of all sizes. We have a proven track record of solving real-world problems across a diverse array of industries including life sciences, financial services, energy, information technology, and retail.

As I mentioned in my previous post, our collaboration with the Sabeti Lab is aimed at creating new visual exploration tools to help researchers, doctors, and clinicians discover patterns and associations in large health and epidemiological datasets. These tools will be the first step in a hypothesis-generation process, combining intuition from expert users with visualization techniques and automated algorithms, allowing users to quickly test hypotheses that are “suggested” by the data itself. Researchers and doctors have a deep familiarity with their data and can often tell immediately whether a new pattern is potentially interesting or simply the result of noise. Visualization techniques will help articulate their knowledge to a wider audience.

The DataSift Platform Social data is noisy. Whether you’re trying to analyze social trends within an industry, or mentions of your products or brands, you need a platform that can filter out the noise and allow you to focus on the data that’s most relevant to you. This is especially important when you are paying for the social data you receive. At the heart of the DataSift platform is a high-performance filtering engine with which you can find the exact content and conversations that are relevant to your business.

Infographic Of The Day: Bloomberg And Frog Turn Raw Data Into Branding Bloomberg is a sprawling, multibillion-dollar enterprise, which creates a distinct problem if you’re trying to explain what the company actually does. They do lots of things, ranging from law research to sports research for team managers to, of course, stock-market data crunching. "Many people have a single association with Bloomberg, as a wire service or a market-data provider," says Jen Walsh, Bloomberg’s head of digital marketing. "We wanted our website to shine a light on other aspects of the business."

Big Data Jobs at Jive Software Jive is on a singular mission to transform the way we work. We’re the first company to bring the social innovation of the consumer web to the enterprise. And in doing so, we’re making work great. We’re breaking down the barriers separating employees, customers, and partners, making it possible for the first time to engage socially and genuinely around what matters most to them. That’s a Big Data problem.

An Annual Report on One Man's Life (Photo: Nick Bilton/The New York Times. Nicholas Felton and his 2008 annual report.) At the end of 2005, Nicholas Felton decided to publish a report that would chronicle his life over the previous year. He looked through his music archives to see how many songs he had listened to.

The Visible Universe, Then and Now Before the telescope was invented in 1608, our picture of the universe consisted of six planets, our moon, the sun and any stars we could see in the Milky Way galaxy. But as our light-gathering capabilities have grown, so too have the boundaries of the visible universe. Our interactive map shows how the known universe has grown from 1950 to 2011.

Map « Open Government Data This map is a starting point for mapping information around Open Government Data: initiatives, catalogues, competitions and events. The map is a work in progress and is not complete or comprehensive. The original source is from our friends at the Semantic Web Company. Here is the original KML file.

BIG DATA: Top Australian Banks Launch Social Media Spending Comparison Sites CommBank’s Signals hub site collates customers’ debit and credit card spending data, comparing their everyday spending, personal loan payments, savings, and mortgages with those of others in their demographic. The bank also uses the data to analyse financial trends such as utility costs and the state of the real estate market. Every month, it publishes a new Signal that homes in on a particular issue.

How many pupils from your school go to Oxbridge? Four schools and one sixth-form college sent more pupils to Oxford and Cambridge between them over three years than 2,000 schools and colleges across the UK, according to a new study that analyses university admissions by individual schools. Westminster, Eton, St Paul's, St Paul's Girls' School and Hills Road Sixth Form College, a state school, produced 946 Oxbridge entrants from 2007-09. In the same period, 2,000 schools and colleges sent 927 pupils to Oxbridge. The difference in these schools' success rates is driven mainly by gaps in achievement at A-level, but some schools do better at gaining entry to university than others with similar exam results.

Hortonworks is the only vendor that uses 100% open source Apache Hadoop without its own (non-open) modifications. Hortonworks is the first vendor to use Apache HCatalog functionality for metadata services. In addition, its Stinger initiative substantially optimizes the Hive project. Hortonworks offers a very good, easy-to-use sandbox for getting started. Hortonworks developed and committed enhancements into the core trunk that make Apache Hadoop run natively on Microsoft Windows platforms, including Windows Server and Windows Azure.

by sergeykucherov Jul 15