Big Data challenge

DNA data storage landmark: Now it's 215 petabytes per gram or over 100 million movies. Researchers from Columbia University and the New York Genome Center have devised a new coding system, dubbed DNA Fountain, which is capable of stuffing 215 petabytes of data onto one gram of DNA. That's about 100 times more than previous researchers have stored on DNA, and was achieved by customizing an algorithm for streaming video on a smartphone, Science Daily reports. DNA holds promise for data storage because its density is superior to that of tape, disk, and optical media. It can also store information for thousands of years if it's kept in the right conditions. While information in computers is written as ones and zeros, researchers have devised different algorithms for encoding data to conform with DNA's four base nucleotides: adenine (A), guanine (G), cytosine (C), and thymine (T).
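To make the idea of writing ones and zeros as nucleotides concrete, here is a minimal sketch that maps every two bits to one base. The mapping table and the function names are assumptions chosen for illustration; this is not the DNA Fountain scheme described above, which adds its own coding on top of any such mapping.

```python
# Hypothetical 2-bits-per-base table (00=A, 01=C, 10=G, 11=T), for illustration only.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    strand = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            strand.append(BASES[(byte >> shift) & 0b11])
    return "".join(strand)


def decode(strand: str) -> bytes:
    """Invert encode(): every four bases become one byte."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError on anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)
```

With this table each base carries two bits, so `decode(encode(payload))` returns the original payload for any byte string. Real schemes also have to cope with synthesis and sequencing errors, which this sketch ignores.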

Using this method, Microsoft last year claimed a record by storing 200MB of data, including a music video, on synthetic DNA strands. "DNA storage is basically a communication channel," write Erlich and Zielinski. Book: Python Data Science Handbook. Digital Transformation: The Age of Innocence, Inertia or Innovation? Data enrichment records for 200 million people up for sale on the Darknet. Full data enrichment profiles for more than 200 million people have been placed up for sale on the Darknet. The person offering the files claims the data is from Experian, and is looking to get $600 for everything. Details of this incident came to Salted Hash via the secure drop at Peerlyst, where someone uploaded details surrounding the sale and the data.

The data was first vetted by the technical review board at Peerlyst, who confirmed its legitimacy. Once it was cleared by the technical team, a sample of the data was passed over to Salted Hash for additional verification and disclosure. Calls to individuals in the sample data went to voicemail and were not returned. Should any of them confirm their information, we’ll update this story. Salted Hash also reached out to Experian and one other firm, Acxiom, as sources have speculated the information that’s up for sale aligns with enrichment data made available by these companies.

Acxiom did not respond to questions. Impact: The State of Artificial Intelligence Technology, Per Nvidia's CEO. As industry embraces AI, computers aren’t the only ones that have to learn new tricks. Fortune’s Andrew Nusca talks with Nvidia CEO Jen-Hsun Huang. Fortune: What’s the current status of artificial intelligence? Jen-Hsun Huang: 2015 was a big year. Artificial intelligence is moving into the commercial world. There have been several recent advancements.

Yes, in an area of AI called “deep learning.” What are we seeing at the commercial level? Google, Microsoft, and Facebook are using AI, whether it’s the voice recognition on your phone or the items displayed in your news feed. Nvidia sees a lot of opportunities in an area of AI called “deep learning,” where a system learns by itself, including recognizing images. Is this just for image recognition?

We’re going to use it for everything. Where are we on the adoption curve? Gosh, I wish people knew. There was a moment when we realized, maybe we can actually pull this off. What does that mean for Nvidia? So many. Yeah. Data Lakes, The Internet of Things and Other Assorted Mysteries. “Those who do not heed history are bound to repeat it.” Not so long ago the world of IT was surrounded by a phenomenon known as “dot com.” We woke up one day and dot com was everywhere. It was on TV. It was in Silicon Valley. It was in the boardrooms of almost every corporation. It was on Wall Street. Everybody was declaiming old “brick and mortars” as being passe.

People were proclaiming themselves to be instant millionaires. Then one day the bubble burst. And the Stanford and Harvard MBAs that once had millions in startup stock options were driving taxis if they wanted to have an income and put food on the table. So what went wrong? The dot com bubble burst because of the failure to ask this most basic question. Something eerily similar is happening today.

Today we have Big Data and data lakes and the internet of things. Everywhere around us we find we are in a world where data is being created every minute of every day at a terrific rate. Let’s take a simple case. About the Author. Machine Learning. What is the potential of machine learning over the next 5-10 years? And how can we develop this technology in a way that benefits everyone? The Royal Society’s machine learning project has been investigating these questions, and has launched a report setting out the action needed to maintain the UK’s role in advancing this technology while ensuring careful stewardship of its development.

Machine learning is a form of artificial intelligence that allows computer systems to learn from examples, data, and experience. Through enabling computers to perform specific tasks intelligently, machine learning systems can carry out complex processes by learning from data, rather than following pre-programmed rules. Recent years have seen exciting advances in machine learning, which have raised its capabilities across a suite of applications. Why Every Manager Needs These Data Science Skills. The fact is, data is becoming an increasingly important business asset, and more and more decisions are based on analysis of data. It’s infiltrating every department of every business, and that means that the employees and managers who have the skills to deal with data will be in a better position to help their company and move up.

But don’t fret: you don’t need to go out and learn computer programming, database management, or advanced maths if you’re a manager in human resources or sales. What you do need is some basic data literacy so that when these subjects come up in meetings — and they will come up, sooner or later — you can be the one at the table who is not only keeping up, but adding value to the conversation. You must be ready and willing to conduct experiments.

A large portion of data analysis is devising and conducting experiments: if we do this, what will be the outcome? The question is not if it will happen; the question is: will you be prepared when it does? Analytics & Automation Blog. Where We’ve Been: When Judith Davis and I co-authored Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility in 2011, we identified the five most popular Data Virtualization usage patterns at that time. “Data virtualization is a versatile data integration solution that can be deployed to solve a wide range of data integration challenges. Based on nearly ten years of successful implementations, several common usage patterns have emerged to help guide your enterprise’s data virtualization adoption strategy.

BI data federation; data warehouse extension; enterprise data virtualization layer; big data integration; cloud data integration.” What’s Happening Today: Looking at how our customers are adopting data virtualization today, I am seeing similar use cases. Through a business lens, this makes sense. Responding to the Trend: New CRM and ERP Adapters; Marketing Automation Adapters; Social Media Adapters; Collaboration Adapters. NetAppVoice: How The Semantic Web Changes Everything. Again!

The “semantic Web” is hugely important to tomorrow’s business. Do not underestimate its significance: It truly changes everything. Embrace it, or risk extinction. But what is it? And what does it mean for your business? “Semantic” is the latest buzzword to hit the online world. It’s come to mean everything and nothing. From semantic search to the semantic Web; and from semantic marketing to semantic technologies, it seems like everyone wants to ride the semantic train.

But let’s take things from the beginning. So What? It marks the transition into a new phase of the Web, where we stop searching and start finding. In other words, we discover not just the information that matches the keywords we search for, but the information that we really wanted to find. This is exactly what is happening with Google’s semantic search, which finds content in direct response to the intent of our search query. New Products; New Services. The Age Of Checkbox Marketing Is Over. So how’s it achieved? Data to Value Data Management Consultancy. Introduction to open data. Today the world generates vast quantities of data each day that can be used to enhance the quality of living of virtually anyone in the world. Information is power but also a tool for supporting development, knowledge sharing and social initiatives. Tracking natural disasters, crowdsourcing rainfall data and mapping out the night sky are amongst a diverse range of open data initiatives.

Three key terms are used to describe how available data is to people who wish to access it: closed, shared and open data. Open data has the potential to create tremendous value and has started to be used on a wider scale. However, with this growth in access to data sources also comes the challenge of managing the growth in volume and variety. For more information about efficient and new ways of managing data please visit our website. Data for Policy | Reports. Enterprise-Grade Machine Learning for Big Data | Skytree. 72 Infographics about big data. Big Data, IoT, Wearables: A Connected World with Intelligence. At CES 2015, I was fascinated by all sorts of possible applications of IoT: socks with sensors, mattresses with sensors, smart watches, smart everything. It seems like a scene in sci-fi movies has just come true. People are eager to learn more about what’s happening around them and now they can. While I was there I attended a talk given by David Pogue, who is awesome.

He pointed out that the prevalence of smartphones is the key to the realization of the phenomenon called “Quantified Self.” I agreed with him. Smartphones play a vital role as a hub where all our personal data converge and are presented seamlessly. It’s all relevant: Big Data, IoT, Wearables, Cloud Computing… While most data is uploaded to the cloud, the client devices are generally powerful enough that the computing can be decentralized.

Marc Andreessen once said, “I think we are all underestimating the impact of aggregated big data across many domains of human behavior, surfaced by smartphone apps.” A Visual Introduction to Machine Learning. Finding better boundaries: Let's revisit the 73-m elevation boundary proposed previously to see how we can improve upon our intuition. Clearly, this requires a different perspective. By transforming our visualization into a histogram, we can better see how frequently homes appear at each elevation. While the highest home in New York is 73m, the majority of them seem to have far lower elevations. Your first fork: A decision tree uses if-then statements to define patterns in data.
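An if-then statement of this kind can be sketched in a few lines. The 73 m boundary comes from the excerpt; the sample homes, the function names, and the way alternative boundaries are compared are assumptions made up for illustration, not the article's own data or method.

```python
# Hypothetical (elevation in metres, city) pairs, invented for illustration.
HOMES = [(5, "NY"), (10, "NY"), (30, "SF"), (73, "NY"), (80, "SF"), (120, "SF")]


def classify(elevation_m, boundary_m):
    """If-then rule: a home above the boundary is guessed to be in San Francisco."""
    return "SF" if elevation_m > boundary_m else "NY"


def misclassified(homes, boundary_m):
    """Count the homes that the rule labels with the wrong city."""
    return sum(classify(elev, boundary_m) != city for elev, city in homes)


def best_boundary(homes):
    """Try each observed elevation as the boundary and keep the lowest one
    among those with the fewest misclassified homes."""
    candidates = sorted({elev for elev, _ in homes})
    return min(candidates, key=lambda b: misclassified(homes, b))
```

On this made-up sample, `misclassified(HOMES, 73)` counts the one San Francisco home that sits below 73 m, which is the kind of error a better-placed boundary tries to reduce.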

For example, if a home's elevation is above some number, then the home is probably in San Francisco. In machine learning, these statements are called forks, and they split the data into two branches based on some value. That value between the branches is called a split point. Tradeoffs: Picking a split point has tradeoffs. Look at that large slice of green in the left pie chart: those are all the San Francisco homes that are misclassified. The best split. Recursion. Data Scientist: The Sexiest Job of the 21st Century. When Jonathan Goldman arrived for work in June 2006 at LinkedIn, the business networking site, the place still felt like a start-up. The company had just under 8 million accounts, and the number was growing quickly as existing members invited their friends and colleagues to join. But users weren’t seeking out connections with the people who were already on the site at the rate executives had expected.

Something was apparently missing in the social experience. As one LinkedIn manager put it, “It was like arriving at a conference reception and realizing you don’t know anyone.” Luckily, Reid Hoffman, LinkedIn’s cofounder and CEO at the time (now its executive chairman), had faith in the power of analytics because of his experiences at PayPal, and he had granted Goldman a high degree of autonomy. The shortage of data scientists is becoming a serious constraint in some sectors. It didn’t take long for LinkedIn’s top managers to recognize a good idea and make it a standard feature.

A New Breed. Small Pieces Loosely Joined: How smarter use of technology and data can deliver real reform of local government. Local authorities could save up to £10 billion by 2020 through smarter and more collaborative use of technology and data. Small Pieces Loosely Joined highlights how every year councils lose more than £1 billion by failing to identify where fraud has taken place. The paper also sheds light on how a lack of data sharing and collaboration between many local authorities, as well as the use of bespoke IT systems, keeps the cost of providing public services unsustainably high.

The report sets out three ways in which local authorities could not only save billions of pounds, but also provide better, more coordinated public services: Using data to predict and prevent fraud. Each year councils lose in excess of £1.3 billion through Council Tax fraud, benefit fraud and housing tenancy fraud (such as illegal subletting). Testimonials Local Government Minister Kris Hopkins: Richard Copley, Corporate ICT Manager Rotherham Metropolitan Borough Council: "Local Government has spent 5 years cutting back. The Truth About Real-Time Analytics in the Era of Big Data - Mainframe Insights.

Every day we see the impact of real-time analytics and personalization on modern business. Mobile has been the ultimate catalyst, driving new customer experiences that are tailored to the individual, as data fuels a race to provide more value than competitors do. The emergence of the open data economy is enabling access to entirely new business opportunities, but there are misconceptions around what it means to truly implement a real-time solution. I wanted to use this post to take a brief look at what it really takes to compete in this landscape where milliseconds matter, and expose some of the myths around the use of the term “real-time.” The Strategy Let’s say you are looking at making your operational business processes smarter.

After researching the required capabilities, you decide to integrate predictive analytics into your core operational systems, with a vision of using these models to score transactions as they occur. The Illusion of Real-Time So Why Does This Happen? Big data challenges? Look at your people, not your technology. As anyone pursuing a big data initiative knows, every big data strategy really has two components: the technology and the people.

The technology part is actually very simple to solve, relative to the people. As long as you're not trying to crack big data problems with relational database technology from 2004, this piece of the equation shouldn't be a big scary beast. The first thing you should do is capture all the structured and unstructured information you can, even if you don't know what's going to be useful. Why? Because too many companies get so wrapped up in putting together a big data plan that six months can go by, while all the data they could have collected during this time is lost. So even if capturing all this data is an ugly path, I'd advise you to capture everything you can. Once you have the data, you have to figure out what you are going to do with it and how you're going to report on it.

The heart (and art) of data science Data science absolutely is an art.