Big Data and data mining

Will Democracy Survive Big Data and Artificial Intelligence? Editor’s Note: This article first appeared in Spektrum der Wissenschaft, Scientific American’s sister publication, as “Digitale Demokratie statt Datendiktatur.”

“Enlightenment is man’s emergence from his self-imposed immaturity. Immaturity is the inability to use one’s understanding without guidance from another.” —Immanuel Kant, “What is Enlightenment?” (1784) The digital revolution is in full swing. Everything will become intelligent; soon we will not only have smart phones, but also smart homes, smart factories and smart cities. Artificial Intelligence Will Redesign Healthcare - The Medical Futurist. There are various thought leaders who believe that we are experiencing the Fourth Industrial Revolution, which is characterized by a range of new technologies that are fusing the physical, digital and biological worlds, impacting all disciplines, economies and industries, and even challenging ideas about what it means to be human.

Executive Summary - Future HR Trends. The key findings from The Economist Intelligence Unit’s research into workforce analytics are as follows: Investment in workforce analytics is on the increase Research and supporting evidence detailed in this report strongly suggest that the overwhelming majority of organizations will either begin or increase their use of Big Data1 in HR over the coming years.

There are several reasons for this reported increase. The use of data has become more evident in all functions, and the focus on workforce analytics simply reflects that overall trend. Book Review — The Traps of Big Data Revealed in “Weapons of Math Destruction” by Cathy O’Neil. “Weapons of Math Destruction,” by Cathy O’Neil. Cathy O’Neil is a mathematician and data scientist who writes a blog at

Her new book, “Weapons of Math Destruction,” is just out, and it’s a blistering journey through big data’s already dangerous effects on society, and a sobering preview of what could be coming if we’re not careful. Calling algorithms “an opinion formalized in code,” O’Neil starts by outlining her love of math, which led to a Ph.D. in mathematics from Harvard, a teaching stint at Barnard, then to a role as an analyst at the same hedge fund (D.E. Shaw) that spawned Jeff Bezos in the 1990s.

Arriving at D.E. Is an algorithm any less racist than a human? We would all like to fancy ourselves as eminently capable of impartiality, able to make decisions without prejudices – especially at work.

Unfortunately, the reality is that human bias, both conscious and unconscious, can’t help but come into play when it comes to who gets jobs and how much money candidates get offered. Managers often gravitate to people most like themselves, make gender-based assumptions about skills or salaries, or reject candidates who have non-white names – to name just a few examples – even if they don’t mean to. There’s an increasingly popular solution to this problem: why not let an intelligent algorithm make hiring decisions for you? Surely, the thinking goes, a computer is more able to be impartial than a person, and can simply look at the relevant data vectors to select the most qualified people from a heap of applications, removing human bias and making the process more efficient to boot. This Guy Trains Computers to Find Future Criminals.

When historians look back at the turmoil over prejudice and policing in the U.S. over the past few years, they’re unlikely to dwell on the case of Eric Loomis.

Police in La Crosse, Wis., arrested Loomis in February 2013 for driving a car that was used in a drive-by shooting. He had been arrested a dozen times before. Loomis took a plea, and was sentenced to six years in prison plus five years of probation.


Why we need a methodology for data science. Those who work in the domain of data science solve problems and answer questions through data analysis every day.

They build models to predict outcomes or discover underlying patterns, all to gain insights leading to actions that will improve future outcomes. And the tools and technologies used in data analysis are evolving rapidly, enhancing data scientists’ abilities to reach their goal. But a chronic inhibitor to success somehow remains even in such rapid growth. Although many business analysts are moving into data scientist roles, they and the businesspeople whose problems need solving lack sufficient understanding of how to go about solving problems using data science techniques. As a result, they sometimes arrive at solutions that fail to adequately address the problem at hand. Like traditional scientists, data scientists need a foundational methodology that will serve as a guiding strategy for solving problems. Technology and the future of work.

Mining public, private data loaded with large questions. CAMBRIDGE, Mass. — With the success of its free, open online-course system, called MITx, the Massachusetts Institute of Technology finds itself sitting on a wealth of student data that researchers might use to compare the efficacy of virtual teaching methods, and perhaps advance the field of Web-based instruction.

Since its inception several years ago, MITx has attracted more than 760,000 registered users from about 190 countries, university officials said. Feds Grapple With Big Data Vs. Privacy. Government study focuses on how privacy-enhancing technologies and large-scale analytics will shape the future of big data.

White House counselor John Podesta is leading a 90-day government study that explores the intersection of big data and privacy. Big Data Ethics: How Does It Affect Your Privacy? Big data makes it possible to check, control and know everything.

But knowing everything entails an obligation to act on behalf of the customer and to protect them. That obligation means an organization should do everything possible to protect (sensitive) data sets and to be open and clear about what is done with that data. Big data ethics also concerns the fact that although anything can be known, not everything should be known, and that only those people within an organization who are entitled to sensitive information should have access to it. In the Netherlands, it became clear that sensitive electronic health records could be accessed by anyone in the hospital, including administrative clerks who could check what their neighbor was doing in the hospital, or an intern who could see why a student was treated in a psychiatric institution. These serious privacy breaches should be prevented by having the right ethics in place within an organization. How Big Data Is Changing Medicine. Here’s how science usually works: Come up with a question or a hypothesis.

Develop an experiment to test it and create data. As any middle school student could tell you, it’s called the scientific method. Now, some researchers and entrepreneurs in the Bay Area say that method is being upended, especially when it comes to medicine. Big data, big business, Big Brother? With big data comes big responsibility, says Gerd Leonhard, CEO of The Futures Agency. More global policies are needed to manage and prevent future 'big data oil spills'. Big data has enormous potential, but not if information is shared with big business, Leonhard warns. Editor's note: Gerd Leonhard is a futurist, keynote speaker, strategist, author and CEO of The Futures Agency. Mobile World Congress is the world's largest mobile tech trade show, looking at the current state of mobile and where it might go next.

What Makes Big Data Projects Succeed - Tom Davenport. By Tom Davenport | 12:00 PM March 26, 2014 In conversations with executives, many of the same misconceptions about big data projects — and what makes them successful — keep coming up. To help clear the air and foster a better understanding of what makes big data initiatives succeed, here are some of the key things I’ve learned from companies that are realizing substantial business value with their big data initiatives. Google Flu Trends' Failure Shows Good Data > Big Data - Kaiser Fung. By Kaiser Fung | 8:00 AM March 25, 2014 In their best-selling 2013 book Big Data: A Revolution That Will Transform How We Live, Work and Think, authors Viktor Mayer-Schönberger and Kenneth Cukier selected Google Flu Trends (GFT) as the lede of chapter one. They explained how Google’s algorithm mined five years of web logs, containing hundreds of billions of searches, and created a predictive model utilizing 45 search terms that “proved to be a more useful and timely indicator [of flu] than government statistics with their natural reporting lags.”

Unfortunately, no. The first sign of trouble emerged in 2009, shortly after GFT launched, when it completely missed the swine flu pandemic. Last year, Nature reported that Flu Trends overestimated by 50% the peak Christmas season flu of 2012. Quantiphobia and the turning of morals into facts. When stats-wiz and political prognosticator Nate Silver’s new venture, FiveThirtyEight, launched last week, it punctuated the rise of “data journalism,” journalism that incorporates actual numerical data into reporting and storytelling! Silver’s star rose through his New York Times blog, which largely focused on political analysis, and through his correct prediction of 50 out of 50 states in the 2012 presidential election. As a standalone venture, FiveThirtyEight focuses on sports, science, economics, and lifestyle issues in addition to politics, and brings data and statistical analysis to bear on these topics. That Nate Silver can be heralded as a star, and that a site like FiveThirtyEight even exists, is indicative of a culture that has grown increasingly (and thankfully) enamored with data.

Yet at the same time, the launch of FiveThirtyEight was mostly met with negativity. An individual expressing fear, perhaps fear of numbers. Photo courtesy of Bantosh via Wikimedia Commons. Innovation: How your search queries can predict the future - tech - 30 April 2009. Innovation is our new column that highlights the latest emerging technological ideas and where they may lead.

Real-time web search – which scours only the latest updates to services like Twitter – is currently generating quite a buzz because it can provide a glimpse of what people around the world are thinking or doing at any given moment. Interest in this kind of search is so great that, according to recent leaks, Google is considering buying Twitter. The latest research from the internet search giant, though, suggests that real-time results could be even more powerful – they may reveal the future as well as the present. NSA Uses Google Ad-Tracking Cookies for Targeted Hacks: Report. Think all of those advertising cookies installed on your browser are being used by the government to track people and hack into their computers? You’re not paranoid – you’re right. In the latest revelation from the cache of documents leaked by Edward Snowden, the Washington Post reports that the National Security Agency and its sister spy organization in the UK, GCHQ, use Google Web cookies to identify specific users who are then hit with hacking software.

IBM's Next Big Thing: Psychic Twitter Bots. Startup Trifacta Embracing the Data Scientist in All of Us. Organizations drowning in big and small data will soon have a new way to wrangle, munge or transform it – however you want to describe the process – thanks to software from startup Trifacta that's now in beta tests. The San Francisco company, whose staff has grown from its three computer scientist founders a year ago to a robust 22 employees, today announced a second round of venture funds totaling $12 million led by Greylock Partners and Accel Partners. That brings overall funding to $16.3 million.

Beyond Data Mining. This article first appeared in IEEE Software magazine and is brought to you by InfoQ & IEEE Computer Society. Big Data Security, Privacy Concerns Remain Unanswered. Is Data Complexity Blinding Your IT Decision-Making? Why Did Google Pay $400 Million for DeepMind? How much are a dozen deep-learning researchers worth? Apparently, more than $400 million. Big data blues: The dangers of data mining. Open data: Unlocking innovation and performance with liquid information.