Big data

MarkLogic | The Operational Database for Big Data.

The Open Data Handbook.

Three New Tools Bring Machine Learning Insights to the Masses.

Over the past few years, machine learning has quickly become the "secret sauce" of large-scale web sites. Machine learning systems have historically been hand-crafted by the small armies of computer science and mathematics Ph.D.s employed at places like Google. With the growing popularity of machine learning and other statistical techniques, the demand for so-called "data scientists" (software developers and analysts with the skill to apply statistical techniques to large data sets) has exploded since 2010.

As a result, these rarefied skills have become extremely difficult to find and expensive to retain, driving up the cost of machine learning systems and making it difficult for enterprises and smaller web firms to apply the technology. The data scientist talent shortage is also an opportunity, however, and a new breed of software platform is rising to meet this need.

Skytree. This is not to diminish the Skytree value proposition in the least.

BigML.

Precog.

Big Data’s Impact in the World.

Integration as a foundation technology for BI and Analytics.

The value of business intelligence (BI) is well-proven. In any case, I'll leave it up to the slew of BI vendors to make that point. Data analytics, a distinctly different science, is increasingly a front-and-center topic. Both of these require data, and this discussion is about how integration acts as the foundation for enabling all the goodness that BI and data analytics bring to the enterprise. In order for any organization to leverage the operational benefits of business intelligence/analytics, to comply with governmental or industry regulations, or simply to make well-informed decisions, there must be a solid data foundation. The point of all this is that your business intelligence/analytics platform is only as good as the data it operates on. And the data it operates on is typically brought to the BI/analytics platform through some sort of data integration layer.

Sem apps

What a Tweet Can Tell You.

Imagine a tiny little sun, just bursting with heat and light, but trapped inside a hard metal cover with a few holes to let beams of energy stream out from inside. Now imagine there were millions of those little suns, maybe the size of basketballs or tennis balls, all rolling down an assembly line one after another, each with a unique pattern of holes and beams of light streaming out into the world.

That's what Twitter is. Inside every unborn tweet you can find infinite potential: someone will be in a place, with social context, and they will say something, anything, and give that potential a form. They will say something and it will be instantly available to anyone in the world who's subscribed.

"One could spend months mining Twitter using @DataSift," said Paul M. Years in the making, DataSift launched today as the second licensed reseller of tweets. DataSift just opened to the public today, so despite its best efforts there are still some kinks that need to be worked out.

Artificial intelligence: Difference Engine: Luddite legacy.

AN APOCRYPHAL tale is told about Henry Ford II showing Walter Reuther, the veteran leader of the United Automobile Workers, around a newly automated car plant. “Walter, how are you going to get those robots to pay your union dues,” gibed the boss of Ford Motor Company. Without skipping a beat, Reuther replied, “Henry, how are you going to get them to buy your cars?” Whether the exchange was true or not is irrelevant. The point was that any increase in productivity required a corresponding increase in the number of consumers capable of buying the product.

The original Henry Ford, committed to raising productivity and lowering prices remorselessly, appreciated this profoundly, and insisted on paying his workers twice the going rate, so they could afford to buy his cars. For the company, there was an added bonus. Economists see this as a classic example of how advancing technology, in the form of automation and innovation, increases productivity. Some did lose their jobs, of course.

Public Data Explorer.

New rules for big data.

Data, data everywhere.

WHEN the Sloan Digital Sky Survey started work in 2000, its telescope in New Mexico collected more data in its first few weeks than had been amassed in the entire history of astronomy. Now, a decade later, its archive contains a whopping 140 terabytes of information. A successor, the Large Synoptic Survey Telescope, due to come on stream in Chile in 2016, will acquire that quantity of data every five days.

Such astronomical amounts of information can be found closer to Earth too. Wal-Mart, a retail giant, handles more than 1m customer transactions every hour, feeding databases estimated at more than 2.5 petabytes—the equivalent of 167 times the books in America's Library of Congress (see article for an explanation of how data are quantified). Facebook, a social-networking website, is home to 40 billion photos.

All these examples tell the same story: that the world contains an unimaginably vast amount of digital information which is getting ever vaster ever more rapidly.

Dross into gold.

Dario Gil: The Next Era of Computing: Learning Systems.

When IBM's Watson defeated two past champions on TV's Jeopardy! game show last February, it awoke many people to the awesome power of computing. Watson demonstrates that computers are at last becoming learning systems, capable of consuming vast amounts of information about the world, learning from it and drawing conclusions that can help humans make better decisions.

At IBM Research, we believe that learning systems will shape the future of information science and the IT industry, and that Watson represents a very significant step on that journey. But every innovator needs a target to aim for, so, after the Jeopardy! challenge, we're searching for the next "grand challenge" that will drive the next advances in information technology. We want to throw a wider net, as well. The decision to focus on learning systems for this particular lab event emerged out of a year-long project that was connected to the IBM centennial.

Meaning mining from Wikipedia.

Background Briefing.

The NeuroCommons is a proving ground for the ideas behind Science Commons’ Data Project.

It is built on the legal opportunities created by Open Access to the scientific literature and the technical capabilities of the Semantic Web.

Executive Summary: The Neurocommons project, a collaboration between Science Commons and the Teranode Corporation, is building on Open Access scientific knowledge to build a Semantic Web for neuroscience research. The project has three distinct goals.

Background: Today’s life scientist faces a dizzying array of knowledge sources. The result is a “scalability problem” in life sciences: while methods for generating information have gone digital, methods for using that information remain stolidly analog. Unfortunately it is neither cheap nor easy to seize the moment and apply the technological advances to the human problem.

The technological opportunity: Semantic Web. The life sciences represent an ideal test case for the Semantic Web.

The legal and economic problem.

TIA: Genisys.

Program Objective: Program Genisys is a FY02 new-start program. The Genisys program's goal is to produce technology enabling ultra-large, all-source information repositories. To predict, track, and preempt terrorist attacks, the U.S. requires a full-coverage database containing all information relevant to identifying potential foreign terrorists and their possible supporters, their activities, prospective targets, and their operational plans. Current database technology is clearly insufficient to address this need.

Program Strategy: Much of database technology as it exists today is based on a paradigm defined in the 1980s. Today, computer processors, storage media, and networks are a thousand times more capable.

Planned Accomplishments: FY02: Genisys will produce several prototype designs consistent with program goals.
