Background | Data Gravity. The purpose of this site is to explore Data Gravity and Data Physics. By explore, we mean engage with the community in open discussion, with the goal of everyone in Software, Networking, Data, and Compute benefiting in the long term. Data Gravity was a concept first described in this blog post. Since that post there has been a great deal of discussion about the concept: What Data Gravity means to your Data; How the Law Dictates Data Gravity in the Cloud; Data Gravity the reason for a cloud’s success; Data Gravity in a converged infrastructure; Does Data Gravity apply to a converged infrastructure?; PaaS meets Data Gravity (Podcast/Cloudcast); Data Gravity; SQL Server Central Forum Discussion on Data Gravity. Now the Data Gravity concept/analogy is being extended, in the hopes of a formulaic approach to calculating not only Data Gravity, but all sorts of Data Physics models, formulas, Application Mass and other new areas.
-Dave McCrory Manager and machine: The new leadership equation. In a 1967 McKinsey Quarterly article, “The manager and the moron,” Peter Drucker noted that “the computer makes no decisions; it only carries out orders. It’s a total moron, and therein lies its strength. It forces us to think, to set the criteria. The stupider the tool, the brighter the master has to be—and this is the dumbest tool we have ever had.” How things have changed. What would it take for algorithms to take over the C-suite?
Our argument is simple: the advances of brilliant machines will astound us, but they will transform the lives of senior executives only if managerial advances enable them to. If these two things happen—and they’re likely to, for the simple reason that leading-edge organizations will seize competitive advantage and be imitated—the role of the senior leader will evolve. Missing links Consider also the challenge posed by today’s real-time sales data, which can be sliced by location, product, team, and channel. The human edge Asking questions Attacking exceptions. Systems intelligence research group. Selecting the Right Machine Learning Approach - DZone Big Data. Fantasy Analytics. Sometimes it just amazes me what people think is computable given their actual observation space.
At times you have to look them in the eye and tell them they are living in fantasyland. Here is an example conversation: Me: “Tell me about your company.” Customer: “We are in the business of moving things through supply chains.” Me: “What do you want to achieve with analytics?” Customer: “We want to find bombs in the supply chain.” Me: “COOL!” Me: “Tell me about your available observation space.” Customer: “We have information on the shipper and receiver.” “We also know the owner of the plane, train, truck, car, etc.” “And the people who operate these vehicles too.” Me: “Nice. Customer: “We have the manifest – a statement about the contents.” Me: “Excellent. Customer: “That’s it.” Me: “WHAT?!” The problem is that the business objectives (e.g., finding a bomb) are often simply not possible given the proposed observation space (data sources).
Qualifying Observation Spaces Data Beats Math. Introduction to Market Basket Analysis » Loren on the Art of MATLAB. You probably heard about the "beer and diapers" story as the often quoted example of what data mining can achieve. It goes like this: some supermarket placed beer next to diapers and got more business because they mined their sales data and found that men often bought those two items together. Today's guest blogger, Toshi Takeuchi, who works in the web marketing team here at MathWorks, gives you a quick introduction to how such analysis is actually done, and will follow up with how you can scale it for larger dataset with MapReduce (new feature in R2014b) in a future post.
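The "bought together" idea behind the beer-and-diapers story can be sketched in a few lines. This is a minimal illustration in Python rather than the MATLAB used in the original post. The transactions are made-up toy data, and the names `support`, `confidence`, and `pair_counts` are hypothetical helpers, with support and confidence taken in their usual association-rule sense (an assumption, since the excerpt does not define them).

```python
from collections import Counter
from itertools import combinations

# Toy shopping baskets (illustrative data, not taken from the post).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def count_containing(itemset, baskets):
    """Number of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for basket in baskets if itemset <= basket)


def support(itemset, baskets):
    """Fraction of baskets that contain the whole itemset."""
    return count_containing(itemset, baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing antecedent, the fraction that also contain consequent."""
    both = count_containing(set(antecedent) | set(consequent), baskets)
    return both / count_containing(antecedent, baskets)


def pair_counts(baskets):
    """Count how often each unordered pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    print("support(beer, diapers):", support({"beer", "diapers"}, transactions))
    print("confidence(beer -> diapers):", confidence({"beer"}, {"diapers"}, transactions))
    print("confidence(diapers -> beer):", confidence({"diapers"}, {"beer"}, transactions))
```

Counting every pair by brute force is only workable for small baskets; larger datasets typically need pruning (for example, the Apriori approach) or a distributed job such as the MapReduce follow-up the post mentions.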
Contents Motivation: "Introduction to Data Mining" PDF I have been interested in Market Basket Analysis not because I work at a supermarket but because it can be used for web usage pattern mining among many applications. Fortunately, I came across a good introduction in Chapter 6 (sample chapter available for free download) of Introduction to Data Mining. Let's start by loading the example dataset used in the textbook. Data Intelligence and Analytics Resources. Financieel-management. A Clinician-Educator's Roadmap to Choosing and Interpreting Statistical Tests. How data science and A.I. can enhance the creative process | VentureBeat | Big Data | by VentureBeat Staff.
VentureBeat’s DataBeat conference is coming up on May 19-20, and we’re excited to announce a very cool session with Sean Gourley of Quid and Dan Buczaczer of VivaKi. Note: The room is filling up quickly, and ticket prices shoot up soon. Register now and save $200. Seats are limited. Sean Gourley, CTO, Quid Dan Buczaczer, EVP, Creative Partnerships, VivaKi Here’s a bit more about the session: Data and artistry have often been pitted against each other as bitter rivals in creative-driven industries like entertainment and advertising. When big data and AI are combined with the right forms of interactive visualization, they can drive unique forms of thinking — creating intuitive connections, bringing forward hunches, and changing the way humans see the world.
Stay tuned for more session announcements this week, and you can find event details — including our full lineup of data visionaries — here. The disruptive power of collaboration: An interview with Clay Shirky. From the invention of the printing press to the telephone, the radio, and the Internet, the ways people collaborate change frequently, and the effects of those changes often reverberate through generations. In this video interview, Clay Shirky, author, New York University professor, and leading thinker on the impact of social media, explains the disruptive impact of technology on how people live and work—and on the economics of what we make and consume.
This interview was conducted by McKinsey Global Institute partner Michael Chui, and an edited transcript of Shirky’s remarks follows. Interview transcript Sharing changes everything The thing I’ve always looked at, because it is long-term disruptive, is changes in the way people collaborate. The printing press was a sustaining technology for the scientific revolution, the spread of newspapers, the spread of democracy, just on down the list.
Upending supply and demand Creating success from failure. 10 Perspectives on “All Things Data” - DZone Big Data. Switching focus back to a series of technical blog posts, over the next 5/6 blog posts (there may be some Web Summit updates intertwined!) I aim to demystify “all things data”, including reporting, analytics, data science, and business intelligence, the key differences and dependencies between these terms, and an introduction to where machine learning fits into your company's data model. Governance, security and data management will also be covered. To begin, a short post with 10 perspectives that will get you thinking (hopefully!). Big Data is just a tool. Analytics is utilized by Data Science and Business Intelligence. Data is never clean.
Topics: data, big data, data science. Benford's law of first digits: a universal phenomenon. 1 Research School of Earth Sciences, Australian National University, Canberra, ACT 0200, Australia; 2 Institut für Geophysik, ETH Zurich, CH-8092 Zurich, Switzerland. More than 100 years ago it was predicted that the distribution of first digits of real world observations would not be uniform, but instead follow a trend where measurements with lower first digit (1,2,...) occur more frequently than those with higher first digits (...,8,9). This idea was first described by an astronomer, Simon Newcomb, in 1881.
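The non-uniform first-digit distribution described above is usually written as P(d) = log10(1 + 1/d) for d = 1..9. The sketch below compares that expectation with observed leading digits. The powers of two are an illustrative data set chosen here, not data from the study, and the function names are hypothetical.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of values whose first digit is `digit` under Benford's law."""
    if not 1 <= digit <= 9:
        raise ValueError("digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


def leading_digit(n):
    """First decimal digit of a positive integer."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return int(str(n)[0])


def observed_frequencies(values):
    """Share of values starting with each digit 1..9."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Illustrative data: the first 1000 powers of two (an assumption, not the study's data).
powers_of_two = [2 ** n for n in range(1, 1001)]
observed = observed_frequencies(powers_of_two)

if __name__ == "__main__":
    for d in range(1, 10):
        print(d, round(benford_expected(d), 3), round(observed[d], 3))
```

Restricting the helper to positive integers sidesteps the rounding problems that appear when extracting a leading digit from floating-point values.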
Newcomb noticed that the pages of logarithm tables were more thumbed for low digits than higher ones. He argued that this was because scientists had more need to look up logs of real numbers with smaller first digit than larger. He produced a mathematical formula predicting the distribution of first digits. Our new study shows that Benford's first digit rule is a natural phenomenon which is likely to hold universally. Troubleshooting a Chart in Excel. Eight (No, Nine!) Problems With Big Data. BIG data is suddenly everywhere. Everyone seems to be collecting it, analyzing it, making money from it and celebrating (or fearing) its powers. Whether we’re talking about analyzing zillions of Google search queries to predict flu outbreaks, or zillions of phone records to detect signs of terrorist activity, or zillions of airline stats to find the best time to buy plane tickets, big data is on the case.
By combining the power of modern computing with the plentiful data of the digital era, it promises to solve virtually any problem — crime, public health, the evolution of grammar, the perils of dating — just by crunching the numbers. Or so its champions allege. Is big data really all it’s cracked up to be? The first thing to note is that although big data is very good at detecting correlations, especially subtle correlations that an analysis of smaller data sets might miss, it never tells us which correlations are meaningful. A sixth worry is the risk of too many correlations. Behavior-driven development. In software engineering, behavior-driven development (BDD) is a software development process that emerged from test-driven development (TDD). Behavior-driven development combines the general techniques and principles of TDD with ideas from domain-driven design and object-oriented analysis and design to provide software development and management teams with shared tools and a shared process to collaborate on software development. Although BDD is principally an idea about how software development should be managed by both business interests and technical insight, the practice of BDD does assume the use of specialized software tools to support the development process. Although these tools are often developed specifically for use in BDD projects, they can be seen as specialized forms of the tooling that supports test-driven development.
The tools serve to add automation to the ubiquitous language that is a central theme of BDD. History BDD focuses on: What is the difference between a proof of concept and a prototype? Sanity check. A sanity test or sanity check is a basic test to quickly evaluate whether a claim or the result of a calculation can possibly be true. It is a simple check to see if the produced material is rational (that the material's creator was thinking rationally, applying sanity). The point of a sanity test is to rule out certain classes of obviously false results, not to catch every possible error. A rule-of-thumb may be checked to perform the test. The advantage of a sanity test, over performing a complete or rigorous test, is speed. In arithmetic, for example, when multiplying by 9, using the divisibility rule for 9 to verify that the sum of digits of the result is divisible by 9 is a sanity test—it will not catch every multiplication error, however it's a quick and simple method to discover many possible errors.
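The multiply-by-9 check described above can be sketched as follows. The function names are hypothetical, and the example numbers are chosen here for illustration. The last case shows the limitation the text mentions: a wrong answer can still pass.

```python
def digit_sum(n):
    """Sum of the decimal digits of an integer (sign ignored)."""
    return sum(int(ch) for ch in str(abs(n)))


def passes_nines_check(claimed_product):
    """Quick sanity test for a claimed result of 'something times 9'.

    Any true multiple of 9 has a digit sum divisible by 9, so a failure
    proves the claim is wrong. A pass does NOT prove the claim is right.
    """
    return digit_sum(claimed_product) % 9 == 0


if __name__ == "__main__":
    # 9 * 1234 is 11106.
    print(passes_nines_check(11106))  # correct product: check passes
    print(passes_nines_check(11116))  # one digit off: check fails, error caught
    print(passes_nines_check(11160))  # digits transposed: check passes, error missed
```

The check costs one pass over the digits, which is the speed advantage the text refers to; a full verification would redo the multiplication.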
Mathematical A sanity test can refer to various orders of magnitude and other simple rule-of-thumb devices applied to cross-check mathematical calculations. A "Hello, World! " A/B testing. Example of A/B testing on a website. By randomly serving visitors two versions of a website that differ only in the design of a single button element, the relative efficacy of the two designs can be measured.
A/B testing (bucket tests or split-run testing) is a randomized experiment with two variants, A and B. It includes application of statistical hypothesis testing or "two-sample hypothesis testing" as used in the field of statistics. A/B testing is a way to compare two versions of a single variable, typically by testing a subject's response to variant A against variant B, and determining which of the two variants is more effective. As the name implies, two versions (A and B) are compared, which are identical except for one variation that might affect a user's behavior.
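As a rough illustration of the two-sample hypothesis testing mentioned above, the sketch below compares the conversion rates of variants A and B with a pooled two-proportion z-test. The choice of this particular test, the function name, and the visitor counts are all assumptions made for the example; the excerpt does not say which statistic a given experiment should use.

```python
import math


def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    A positive z means variant B converted at a higher rate than variant A.
    """
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 200 of 2000 visitors convert on A, 250 of 2000 on B.
    z, p = two_proportion_z_test(200, 2000, 250, 2000)
    print(f"z = {z:.3f}, p = {p:.4f}")
```

The normal approximation used here is only reasonable when each variant has a fairly large number of conversions and non-conversions; for small samples an exact test would be a safer choice.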
Version A might be the currently used version (control), while version B is modified in some respect (treatment). Common test statistics History An emailing campaign example to Acceptance Calculategrowth. It's common to want to calculate period growth rates for historical figures. Surprisingly, there's no simple formula for doing it. The Growth formula in Excel is an array formula meaning that it takes several arrays of data as input and outputs an array of data which can be difficult to understand if your knowledge of statistics ain't what it used to be. We are going to look at several other methods for calculating growth including a manually-written formula, a charting method and one method using Goal Seek.
Finally, we'll look at series that begin with negative numbers. Mathematically, exponential growth calculated from such a series has no meaning, so what do we do? Formula Method What most of us want from the Growth formula is a simple number representing the period over period growth rate of a series of numbers. The formula for CAGR is not difficult.
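The article's own formula is not reproduced in this excerpt, so the sketch below uses the standard definition of compound annual growth rate, CAGR = (last / first)^(1 / periods) - 1, as an assumption. The function names are hypothetical, and the code is Python rather than an Excel formula. It refuses series that start at or below zero, in line with the remark above that growth from such a series has no mathematical meaning.

```python
def cagr(first_value, last_value, periods):
    """Compound growth rate per period between two values.

    Uses (last / first) ** (1 / periods) - 1, the standard CAGR definition.
    """
    if periods <= 0:
        raise ValueError("periods must be positive")
    if first_value <= 0 or last_value < 0:
        raise ValueError("CAGR is undefined for a non-positive start or a negative end")
    return (last_value / first_value) ** (1 / periods) - 1


def cagr_from_series(values):
    """CAGR of a series, where n values span n - 1 periods."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    return cagr(values[0], values[-1], len(values) - 1)


if __name__ == "__main__":
    sales = [100, 108, 115, 121]  # hypothetical yearly figures
    print(f"{cagr_from_series(sales):.4%}")
```

This measure only looks at the first and last values, so it ignores how the series moved in between; a regression-based approach would use every point.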
It also provides the opportunity for an organization to solicit internal feedback about a promising product or service, while reducing unnecessary risk and exposure and providing the opportunity for stakeholders to assess design choices early on in the development cycle. In some corporate cultures, proof of concept may be referred to as proof of principle. Analytics - Storytelling. Data observation space. Machine learning. Business value algorithm.