
Big Data


Data Journalism (big data)

Top Business Intelligence predictions for 2012 (Part Three)

And we’re back. Parts one and two of this three-part blog series discussed six of the significant, and largely continuing, trends that will shape Business Intelligence (BI) product development and purchase decisions throughout 2012. And, in the words of iconic English rock band Queen, The Show Must Go On. Although, the final chapter of this BI predictions piece doesn’t contain nearly as much Innuendo…

7. The continued consumerization of BI – the inclusion and development of user-friendly features and functionality making reporting and analytics accessible to a wider array of users – will generate more widespread user adoption and better BI Return on Investment (ROI). Studies from the TDWI and BeyeNetwork have also demonstrated the link between pervasive BI and superior ROI for BI initiatives.

8. We know that business users are having a greater say when purchasing a BI solution, but how are usership figures changing in accordance?

Useful tools to review, refine, clean, analyze, visualize and publish data | Health Data Innovation.

How to Use Pivot Tables to Mine Your Data.

To succeed at Six Sigma or any process improvement effort, you'll often have to analyze and summarize text data. Most companies have lots of transaction data from "flat files" like the one shown below, but because the data consists of text and raw numbers, they sometimes have a hard time figuring out what to do with it.

These examples use Excel along with QI Macros for Excel. To summarize and analyze this data, you will want to learn how to use Excel's PivotTable tool. In past incarnations it was known as Crosstab (for cross tabulation). With Pivot Tables and the file above you could:

Count the number of deliveries all doctors performed.
Count the number of times each doctor had a "Complication" during delivery.
Sum or average the charges per delivery by doctor.
Count the number of deliveries for each diagnosis.

And do it easily.

Pivot Tables are a Great Tool, but the User Interface is Awkward

Step 1: Your Data Must Have Column Headings!

Avoid Mistakes: No Blanks In Column Headings!
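For readers who would rather script the summary than click through Excel, the same cross-tabulation idea can be sketched in plain Python. This is only an illustrative sketch, not the article's method: the column names (doctor, diagnosis, complication, charge), the helper function names, and the four sample rows are all invented here to stand in for the flat file described above.

```python
from collections import Counter, defaultdict

# Hypothetical flat-file rows; column names and values are invented for illustration.
ROWS = [
    {"doctor": "Adams", "diagnosis": "Vaginal", "complication": "No", "charge": 3000.0},
    {"doctor": "Adams", "diagnosis": "C-Section", "complication": "Yes", "charge": 7000.0},
    {"doctor": "Baker", "diagnosis": "Vaginal", "complication": "No", "charge": 3200.0},
    {"doctor": "Baker", "diagnosis": "Vaginal", "complication": "Yes", "charge": 3400.0},
]


def count_by(rows, key):
    """Count rows per distinct value of `key` (a one-field pivot)."""
    return Counter(row[key] for row in rows)


def crosstab(rows, row_key, col_key):
    """Count rows for each (row_key, col_key) pair, like a two-field pivot table."""
    table = defaultdict(Counter)
    for row in rows:
        table[row[row_key]][row[col_key]] += 1
    return table


def average_by(rows, key, value_key):
    """Average `value_key` per distinct value of `key`."""
    totals = defaultdict(float)
    counts = Counter()
    for row in rows:
        totals[row[key]] += row[value_key]
        counts[row[key]] += 1
    return {k: totals[k] / counts[k] for k in totals}


deliveries_per_doctor = count_by(ROWS, "doctor")
complications = crosstab(ROWS, "doctor", "complication")
average_charge = average_by(ROWS, "doctor", "charge")
deliveries_per_diagnosis = count_by(ROWS, "diagnosis")
```

Every helper looks fields up by name, so, as with Pivot Tables, each column needs a non-blank heading before any of this works.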

Home | MINE: Maximal Information-based Nonparametric Exploration.

Data on GitHub: The easy way to make your data available.

GitHub is designed for collaborating on coding projects. Nonetheless, it is also a potentially great resource for researchers to make their data publicly available. Specifically you can use it to: store data in the cloud for future use (for free), track changes, make data publicly available for replication, create a website to nicely present key information about the data, and, uniquely, benefit from error checking by the research community. This is an example of a data set that I’ve put up on GitHub.

How?

Taking advantage of these things through GitHub is pretty easy. In this post I’m going to give a brief overview of how to set up a GitHub data repository. Note: I’ll assume that you have already set up your GitHub account.

Store Data in the Cloud

A data set basically consists of two parts: the data itself and description files that explain what the data means and how we obtained it.

Track Changes

GitHub will now track every change you make to all files in the data repository each time you commit the changes.

Tertiary data: Big data's hidden layer.

Big data isn’t just about multi-terabyte datasets hidden inside eventually-concurrent distributed databases in the cloud, or enterprise-scale data warehousing, or even the emerging market in data. It’s also about the hidden data you carry with you all the time; the slowly growing datasets on your movements, contacts and social interactions. Until recently, most people’s understanding of what can actually be done with the data collected about us by our own cell phones was theoretical.

There were few real-world examples. But over the last couple of years, this has changed dramatically. Courting hubris perhaps, but I must admit it’s possible some of that was my fault, though I haven’t been alone.

The data you carry with you

You probably think you know how much data you carry around with you on your cell phone. We know about what I generally call primary data: calendars, address books, photographs, SMS messages and browser bookmarks. But there is also what I refer to as tertiary data.

How Much Data Do You Really Need?

One of the many things Malofiej 20 made me wonder about is how we present data and what we expect from such a presentation.

Very often, we essentially narrate the process of discovery, but is that really the best way? And how much data do we need to show when making a point? Just because we start out with lots of data does not mean we really need to show it all. So here is a simple experiment. Let’s look at minimum wage data in the U.S. over time. This was inspired by a very nice interactive infographic on Inequality in America that EJ Fox has put together.

One of his items shows the difference between the nominal minimum wage (the dollar amount) and the inflation-adjusted value (or buying power). Here is a simple visualization of that data. The minimum wage was established in 1938 and was raised at different times in different increments. This reduces the amount of data considerably, to around 50 data points on the blue line, or by a factor of 17.

Data Mining Map.

Make Data Work Throughout Your Organization - Thomas C. Redman and David Walker.

By Thomas C. Redman and David Walker | 3:28 PM January 25, 2012

Data-driven managers, departments, and organizations have always enjoyed distinct advantages. The data-driven have crafted the best strategies, uncovered wholly new markets, and kept operational costs low. Today, advances in predictive analytics and the potential for big data portend even greater opportunity.

Indeed, we think every organization must develop and execute an aggressive plan to put data to work. So where to begin? Improve the data. Build “data to discovery to dollars” processes. Invest in people. Strive to empower all with data. This last point is driven home over and over. She looked at him and said, “You know, before we had these measurements I never had any say in my work.

The excitement in her voice rose as she continued: “Now I have the facts. Later that night Tom ran into her boss and asked the same question. He continued, “People still come to me with problems.

Top 5 Myths About Big Data.

Brian Gentile is the CEO of Jaspersoft, a commercial open source business intelligence software company.

Follow him @BrianG_Jasper.

With the amount of hype around Big Data it’s easy to forget that we’re just in the first inning. More than three exabytes of new data are created each day, and market research firm IDC estimates that 1,200 exabytes of data will be generated this year alone. The expansion of digital data has been underway for more than a decade, and those who’ve done a little homework understand that Big Data references more than just Google, eBay, or Amazon-sized data sets. The opportunity for a company of any size to gain advantages from Big Data stems from data aggregation, data exhaust, and metadata, the fundamental building blocks of tomorrow’s business analytics.

Combined, these data forces present an unparalleled opportunity. Yet, despite how broadly Big Data is being discussed, it appears that it is still a very big mystery to many.

Big Data Sets.

One of the things about teaching Computer Studies is having data sets for students to run with their program to get the types of results that they should. During program development, I always found that it was desirable to talk to the students about how to create their own data sets to meet the specifications of the program. It shows that they can read and understand what is needed, and generating data that lets them work through a problem manually and compare their results to what their program generates is a great technique to master.
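A small, reproducible data set with a known answer makes that manual comparison concrete. The sketch below shows one possible way to build one in Python; it is an assumption about how this could be done, not the procedure from the post. The function names, the name,score file layout, and the seed value are all hypothetical.

```python
import csv
import io
import random


def make_dataset(n_rows, seed=2012):
    """Return n_rows of (name, score) pairs; a fixed seed makes the data reproducible."""
    rng = random.Random(seed)
    return [(f"student{i:03d}", rng.randint(0, 100)) for i in range(1, n_rows + 1)]


def to_csv(rows):
    """Serialize rows as CSV text with a header line, ready to save as a file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "score"])
    writer.writerows(rows)
    return buffer.getvalue()


def expected_average(rows):
    """The answer key: the average score a correct program should report."""
    return sum(score for _, score in rows) / len(rows)


rows = make_dataset(50)
csv_text = to_csv(rows)
answer = expected_average(rows)
```

Because the seed is fixed, calling make_dataset(50) again yields the same rows, so the answer key stays valid for every copy of the file handed out.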

Of course, when it comes to testing the accuracy of their program, you want to have your own data sets. I would make it available as a file that they would read and present the results. Since I know what the results should look like, it’s a quick and easy way to test their programming skills. Sometimes, generating these test data sets can be a real chore in itself. Note the options for output of your file. Your data is good to go.

Give a Little; Get Back a Lot.

Recently, I had blogged about how to create Big Data Sets. At the core of the post was a reference to the website. For Computer Science teachers, this can be a real timesaver. Rather than create significant test data files, use the utility here to generate data for you … lots of data. It comes back with big value for me!

It ended up being included in a Pearltree by drbazuk. Following the link opened up a huge collection of resources about big data! The point of this post is to pay it forward to my readers. If you’re looking for articles, resources, or discussion about big data, check out this Pearltree.

Real-Time Data: You're Doing It Wrong.

When it comes to predicting the future, Chartbeat's CEO Tony Haile thinks you're awful. At the Mashable Media Summit, Haile spoke about the importance of real-time data and what your business should be doing with that information. "The more we think we know, the more expert we believe ourselves to be," says Haile, "and the more likely we are to trust our judgment when we shouldn't and get things wrong."

Haile talks about replacing complex future predictions with simpler ones for right now, and looking at data as an environment instead of a generated report.

Big data has big implications for knowledge management.

A goal of knowledge management over the years has been the ability to integrate information from multiple perspectives to provide the insights required for valid decision-making.

Organizations do not make decisions just based on one factor, such as revenue, employee salaries or interest rates for commercial loans. The total picture is what should drive decisions, such as where to invest marketing dollars, how much to invest in R&D or whether to expand into a new geographic market. In the past, the cost of collecting and storing data limited the ability of enterprises to obtain the comprehensive information needed to create this holistic picture. However, automated collection of digital information and cheap storage have removed the barriers to making data accessible.

Volume, variety, velocity

New solutions have now emerged to deal with so-called "big data." Volume, however, is not the only dimension that defines big data.

Velocity is a third factor associated with big data.

Big data in travel.

Big data: The next frontier for innovation, competition, and productivity | McKinsey Global Institute | Technology & Innovation.

The amount of data in our world has been exploding, and analyzing large data sets (so-called big data) will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus, according to research by MGI and McKinsey's Business Technology Office.

Leaders in every sector will have to grapple with the implications of big data, not just a few data-oriented managers. The increasing volume and detail of information captured by enterprises, the rise of multimedia, social media, and the Internet of Things will fuel exponential growth in data for the foreseeable future.

MGI studied big data in five domains: healthcare in the United States, the public sector in Europe, retail in the United States, and manufacturing and personal-location data globally. Big data can generate value in each.

O'Brien: Big data and the coming revolution in health care.

Posted: 05/02/2012 10:35:42 PM PDT

4 tips for leveraging big data.

Cost savings are always key drivers of new initiatives.

And in today's healthcare industry, as priorities continue to shift and pressure is added to increase revenues and improve outcomes, one element could be a key player in making it all happen: big data. "We think it's going to separate winners from losers in many markets over the next five years," said Russ Richmond, MD, CEO of healthcare solutions and consulting company Objective Health. "The institutions that are capable of first understanding where the market is going … are going to have tremendous advantages over the ones who can't or won't do this. We believe that over time, it's going to become a core competency for hospitals, and it won't be something seen as extra or nice to have – it's going to become a core part of how they operate going forward."

Richmond outlines four tips for leveraging big data at hospitals.

What data can and cannot do | News.

In the early days of photography there was a great deal of optimism around its potential to present the public with an accurate, objective picture of the world. In the 19th century pioneering photographers (later to be called photojournalists) were heralded for their unprecedented documentary depictions of war scenes in Mexico, Crimea and across the US. Over a century and a half later – after decades of advertising, propaganda, and PR, compositing, enhancement and outright manipulation – we are more cautious about seeing photographs as impartial representations of reality.

Photography has lost its privileged position in relation to truth. Photographs are just a part of the universe of evidence that must be weighed up, analysed, and critically evaluated by the journalist, the analyst, the scholar, the critic, and the reader. Data can be an immensely powerful asset, if used in the right way. Data is not a force unto itself. Please let us know what you think in the comments below.

R Links for the Beginner on World Statistics Day.

Microsoft Fuzzy Lookup Add-in for Excel 2010 Walkthrough « Dan English's BI Blog.

Fuzzy Lookup Add-In for Excel.

Unstructured data is worth the effort when you've got the right tools.

Meet the Urban Datasexual | Endless Innovation.

What Is the Future of Knowledge in the Internet Age?

Connecting the Dots: Finding Patterns in Large Piles of Numbers - Rebecca J. Rosen - Technology.

Data mining by hospitals may be profitable, but not risk-free : Health IT Law Blog.

Dynamite plots: unmitigated evil? - Ecological Models and Data.