Big Data

The election prediction business is one small aspect of a far-reaching change across industries that have increasingly become obsessed with data, the value of it and the potential to mine it for cost-saving and profit-making insights.

It is a behind-the-scenes technology that quietly drives everything from the ads that people see online to billion-dollar acquisition deals. Examples stretch from Silicon Valley to the industrial heartland. Microsoft, for example, is paying $26 billion for LinkedIn largely for its database of personal profiles and business connections on more than 400 million people. General Electric, the nation’s largest manufacturer, is betting big that data-generating sensors and software can increase the efficiency and profitability of its jet engines and other machinery. But data science is a technology advance with trade-offs.

How mathematics can fight the abuse of big data algorithms

“Is maths creating an unfair society?” That seems to be the question on many people’s lips. The rise of big data and the use of algorithms by organisations has left many blaming mathematics for modern society’s ills: refusing people cheap insurance, giving false credit ratings, or even deciding who to interview for a job. We have been here before. Following the banking crisis of 2008, some argued that it was a mathematical formula that felled Wall Street. The theory goes that the same model that was used to price sub-prime mortgages was used for years to price life assurance policies.

Small Data vs. Big Data: Back to the Basics

Small data is data in a volume and format that makes it accessible, informative and actionable. The Small Data Group offers the following explanation:

The Mathematical Shape Of Big Science Data

The Problem with Our Data Obsession

A contentious question on the California ballot in 2008 inspired a simple online innovation: a website called Eightmaps.com. The number in the name referred to Proposition 8, which called for the state’s constitution to be amended to prohibit gay marriage. Under California’s campaign finance laws, all donations greater than $100 to groups advocating for or against Proposition 8 were recorded in a publicly accessible database.

The Internet, peer-reviewed

It could be one of the most important innovations on the Internet since the browser. Imagine an open-source, crowd-sourced, community-moderated, distributed platform for sentence-level annotation of the Web. In other words, a way to cut through the babble and restore some sanity and trust. That’s the idea behind Hypothes.is.

False beliefs persist, even after instant online corrections

It seems like a great idea: provide instant corrections to web-surfers when they run across obviously false information on the Internet. But a new study suggests that this type of tool may not be a panacea for dispelling inaccurate beliefs, particularly among people who already want to believe the falsehood. “Real-time corrections do have some positive effect, but it is mostly with people who were predisposed to reject the false claim anyway,” said R. Kelly Garrett, lead author of the study and assistant professor of communication at Ohio State University. “The problem with trying to correct false information is that some people want to believe it, and simply telling them it is false won’t convince them.” For example, the rumor that President Obama was not born in the United States was widely believed during the past election season, even though it was thoroughly debunked.

Factual’s Gil Elbaz Wants to Gather the Data Universe

Factual sells data to corporations and independent software developers on a sliding scale, based on how much the information is used. Small data feeds for things like prototypes are free; contracts with its biggest customers run into the millions. Sometimes, Factual trades data with other companies, building its resources. Some current uses are for adding information like restaurant locations to cellphone maps, or for planning sales campaigns.

Snopes.com: Urban Legends Reference Pages

5D optical memory in glass could record the last evidence of civilization

Using nanostructured glass, scientists at the University of Southampton have, for the first time, experimentally demonstrated the recording and retrieval processes of five-dimensional digital data by femtosecond laser writing. The storage allows unprecedented parameters including 360 TB/disc data capacity, thermal stability up to 1000°C and practically unlimited lifetime. The glass memory has been dubbed the 'Superman' memory crystal, after the "memory crystals" used in the Superman films. The data is recorded via self-assembled nanostructures created in fused quartz, which is able to store vast quantities of data for over a million years. The information encoding is realised in five dimensions: the size and orientation in addition to the three-dimensional position of these nanostructures.

A 300 kb digital copy of a text file was successfully recorded in 5D using an ultrafast laser, which produces extremely short and intense pulses of light.

Scientists ‘freeze’ light for an entire minute

Million-Year Data Storage Disk Unveiled

Back in 1956, IBM introduced the world’s first commercial computer capable of storing data on a magnetic disk drive. The IBM 305 RAMAC used fifty 24-inch discs to store up to 5 MB, an impressive feat in those days. Today, however, it’s not difficult to find hard drives that can store 1 TB of data on a single 3.5-inch disk. But despite this huge increase in storage density and a similarly impressive improvement in power efficiency, one thing hasn’t changed. The lifetime over which data can be stored on magnetic discs is still about a decade. That raises an interesting problem.

How Quantum Computers and Machine Learning Will Revolutionize Big Data - Wired Science

When subatomic particles smash together at the Large Hadron Collider in Switzerland, they create showers of new particles whose signatures are recorded by four detectors.

Curation

Data Visualization / Infographics

The Mathematical Shape of Big Science Data

Simon DeDeo, a research fellow in applied mathematics and complex systems at the Santa Fe Institute, had a problem.

Data Scientist: The Sexiest Job of the 21st Century

Artwork: Tamar Cohen, Andrew J Buboltz, 2011, silk screen on a page from a high school yearbook, 8.5" x 12".

When Jonathan Goldman arrived for work in June 2006 at LinkedIn, the business networking site, the place still felt like a start-up. The company had just under 8 million accounts, and the number was growing quickly as existing members invited their friends and colleagues to join.

The Question to Ask Before Hiring a Data Scientist - Michael Li

By Michael Li | 10:00 AM August 6, 2014

When hiring data scientists, there’s nothing more frustrating than making the wrong hire. Data scientists are in notoriously high demand, hard to attract, and command large salaries, compounding the cost of a mistake. At The Data Incubator, we’ve talked to dozens of employers looking to hire data scientists from our training program, from large corporates like Pfizer and JPMorgan Chase to smaller tech startups like Foursquare and Upstart. Employers that didn’t have good hiring experiences in the past often failed to ask a key question:

For Start-Ups, Sorting the Data Cloud Is the Next Big Thing

“My smartphone produces a huge amount of data, my car produces ridiculous amounts of really valuable data, my house is throwing off data, everything is making data,” said Erik Swan, 47, co-founder of Splunk, a San Francisco-based start-up whose software indexes vast quantities of machine-generated data into searchable links. Companies search those links, as one searches Google, to analyze customer behavior in real time. Splunk is among a crop of enterprise software start-up companies that analyze big data and are establishing themselves in territory long controlled by giant business-technology vendors like Oracle and I.B.M. Founded in 2004, before the term “big data” had worked its way into the vocabulary of Silicon Valley, Splunk now has some 3,200 customers in more than 75 countries, including more than half the Fortune 100 companies. Macy’s uses Splunk’s software to observe its Web traffic in order to avoid costly down times, particularly during peak holiday shopping.

How Big Data Gets Real

The business of Big Data, which involves collecting large amounts of data and then searching it for patterns and new revelations, is the result of cheap storage, abundant sensors and new software. It has become a multibillion-dollar industry in less than a decade. With growth at that speed, it is easy to miss how much remains to be done before the industry has proven standards. Until then, lots of customers are probably wasting much of their money. There is essential work to be done training a core of people in very hard problems, like advanced statistics and software that ensures data quality and operational efficiency.

Why the world’s governments are interested in creating hubs for open data

The Limits of Big Data: A Review of Social Physics by Alex Pentland

Big Data, Trying to Build Better Workers

Tears in rain: how Snapchat showed me the glory of data death

"I've seen things you people wouldn't believe.

IBM's Watson wants to fix America's doctor shortage

Words by the Millions, Sorted by Software

Kris Snibbe/Harvard University.

Down in the Data Dumps: Researchers Inventory a World of Information

Data are the common currency that unites all fields of science. As science progresses data proliferate, providing points of reference, revealing trends, and offering evidence to substantiate hypotheses.

How Companies Learn Your Secrets

What are you revealing online? Much more than you think

What can be guessed about you from your online behavior? Two computer privacy experts, economist Alessandro Acquisti and computer scientist Jennifer Golbeck, on how little we know about how much others know.

What data is being collected on you? Some shocking info

On August 31, 2009, politician Malte Spitz traveled from Berlin to Erlangen, sending 29 text messages as he traveled.

Everything We Know About What Data Brokers Know About You

June 13, 2014: This story has been updated. It was originally published on March 7, 2013. We've spent a lot of time this past year trying to understand how the National Security Agency gathers and stores information about ordinary people.

How Facebook Uses Your Data to Target Ads, Even Offline