Analysis

IBM to set Watson loose on cancer genome data. NEW YORK—Earlier today, IBM announced that it would be using Watson, the system that famously wiped the floor with human Jeopardy champions, to tackle a somewhat more significant problem: choosing treatments for cancer.

In the process, the company hopes to help usher in the promised era of personalized medicine. The announcement was made at the headquarters of IBM's partner in this effort, the New York Genome Center; its CEO, Robert Darnell, called the program "not purely clinical and not purely research." VLOG - The Hype Train Derailed. JasperSnoek/spearmint. Hyperopt by hyperopt. Hyperopt: A Python library for optimizing machine learning algorithms; SciPy 2013. How Well Do Blindfolded Monkeys Play the Stock Market? On Wall Street, the term "random walk" is an obscenity.

The Best (and Worst) Airlines. CareerCast is out with their annual ranking of the 10 best and 10 worst jobs for 2014, and let's just say that math and science guys everywhere are about to high-five.

Nine out of 10 of the best jobs fell into the STEM career category (science, technology, engineering and math), with the "numbers guys," in particular, locking in 3 of the top 4 spots. "This absolutely verifies the importance of STEM careers," said Tony Lee, publisher of CareerCast.com and JobsRated.com. CareerCast looks at 200 of the most populated jobs and then ranks them on a variety of criteria that fall into four key categories: environment, income, outlook and stress. How the cops watch your tweets in real-time. Recent leaks about the NSA's Internet spy programs have sparked renewed interest in government surveillance, though the leaks touch largely on a single form of such surveillance—the covert one.

But so-called "open source intelligence" (OSINT) is also big business— and not just at the national/international level. New tools now mine everything from "the deep Web" to Facebook posts to tweets so that cops and corporations can see what locals are saying. Due to the sheer scale of social media posts, many tools don't even aim at providing a complete picture. Others do. For instance, consider BlueJay, the "Law Enforcement Twitter Crime Scanner," which provides real-time, geo-fenced access to every single public tweet so that local police can keep tabs on #gunfire, #meth, and #protest (yes, those are real examples) in their communities.

The congruence bias is why we all jump to conclusions and stay there. The whole gateway drug thing is so flawed in so many ways.

One being that though many people who are smack addicts started with pot, there are also many people who only ever smoke pot and never move on to heroin. What makes a lot more sense is that being a heroin addict would tend to suggest that you are open to trying drugs of all sorts, so it stands to reason that you've probably done all kinds of drugs, legal and otherwise, before you reached the end. But you know what, even if weed was a gateway drug, so what? Minority Rules: Why 10 Percent is All You Need.

Microsoft explains Xbox One’s new griefer-separating reputation system. On the Xbox 360, your Xbox Live reputation is a simple five-star rating that is often ignored by the community at large.

On the Xbox One, though, the reputation system will get a complete overhaul that will use more detailed monitoring and reporting tools to separate antisocial players from the rest of the community. Four Common Statistical Misconceptions You Should Avoid. Popularity versus similarity: A balance that predicts network growth.

(Phys.org)—Do you know who Michael Jackson or George Washington was?

You most likely do: they are what we call "household names" because these individuals were so ubiquitous. But what about Giuseppe Tartini or John Bachar? That's much less likely, unless you are a fan of Italian baroque music or free solo climbing. In that case, you would be just as likely to have heard of Bachar as of Washington. Parsing the difference between the Internet and the Web according to Alan Kay. This Q&A is part of a weekly series of posts highlighting common questions encountered by technophiles and answered by users at Stack Exchange, a free, community-powered network of 100+ Q&A sites.

What did digital pioneer Alan Kay mean by, “The Internet was done so well, but the Web, in comparison, is a joke. It was done by amateurs”? When Kay speaks, programmers listen. But like anyone who puts forward an opinion, he opens himself up to being misinterpreted. How to Burst the "Filter Bubble" that Protects Us from Opposing Views. The term “filter bubble” entered the public domain back in 2011 when the internet activist Eli Pariser coined it to refer to the way recommendation engines shield people from certain aspects of the real world.

Pariser used the example of two people who googled the term “BP”. One received links to investment news about BP while the other received links to the Deepwater Horizon oil spill, presumably as a result of some recommendation algorithm. This is an insidious problem. Much social research shows that people prefer to receive information that they agree with instead of information that challenges their beliefs. How to burst the "filter bubble" and reach those who do not think like us on the Internet? Using "data portraits" to burst the "filter bubble": that is the method recommended by a study dated November 19, 2013, carried out jointly by Pompeu Fabra University in Barcelona and Yahoo Labs, and summarized in an MIT Technology Review article published on November 29.

Concretely, the idea is to bring people with sharply divergent points of view into contact with one another, at a time when the Internet is said to be pushing them further and further apart. The idea that the growth of social networks has drawn people together around the opinions they share, and above all driven apart those who do not share them, was raised by the activist Eli Pariser, who in 2011 coined the expression "filter bubble" to describe it. Titiou Lecoq wrote about it on November 29 on Slate.fr. As Technology Review points out, the filter bubble is said to amplify a problem that already exists in the real world: Combinatorial analysis (review). Probability and Statistics. A new model for predicting the spread of epidemics.

During the H1N1 flu epidemic, which started in Mexico, health authorities could do no more than monitor cases and issue recommendations to travelers to curb the spread of the disease. The prediction models then in use had been overtaken by the modern, globalized, mobile world. Outbreak! Watch How Quickly An Epidemic Would Spread Across The World. In March of 2009, the Mexican government confirmed it: A four-year-old boy in eastern Mexico’s La Gloria village had swine flu, or H1N1. Sixty percent of the village had reported an unknown respiratory illness back in February, and since, it looked like the virus had jumped. A flu case later confirmed to be H1N1 popped up in California. Then, it spanned an ocean. This Computer Program Can Spot Hipsters. Fashion is a powerful tool.

Within a split second, a person’s glasses, jeans, and haircut can communicate quite a bit about them--everything from age to socioeconomic status to connectedness to trends. Now, courtesy of scientists at the University of California, computers are getting a similar ability. Researchers there are developing a new algorithm that can distinguish someone’s urban tribe--whether he or she is a "biker, country, goth, heavy metal, hip hop, hipster, raver and surfer"--just by looking at a photo.

And it’s accurate about 50% of the time. Identifying the distribution of data is key to analysis. Knowing the distribution of your data is essential to choosing the right statistical method. Suppose you need to assess the capability of your process. What your favorite drink says about your politics, in one chart. Graphic courtesy Jennifer Dube, National Media Research Planning and Placement LLC Former Mississippi governor and uber-Republican Haley Barbour loves bourbon.

Franklin Roosevelt mixed martinis. And, as it turns out, those two partisans have something in common with their base voters: Consumer data suggests Democrats prefer clear spirits, while Republicans like their brown liquor. Democratic drinkers are more likely to sip Absolut and Grey Goose vodkas, while Republican tipplers are more likely to savor Jim Beam, Canadian Club and Crown Royal. That research comes from consumer data supplied by GFK MRI, and analyzed by Jennifer Dube of National Media Research Planning and Placement, an Alexandria-based Republican consulting firm. All those likes and upvotes are bad news for democracy. Human beings have long been easily influenced by the opinions of others but the social media networks that have come to dominate our lives may be making this “social proof” a problem. A recent study in the journal Science, describing a randomised experiment on a social news aggregator platform, is testament to this phenomenon. The platform was set up to be similar to crowd-based sites such as Reddit and Digg, where content is displayed according to whether users vote it “up” or “down”.

The researchers found that earlier ratings strongly affected future rating behaviour. The study involved monitoring 101,281 comments made by users over a five-month period. How Google Converted Language Translation Into a Problem of Vector Space Mathematics. Computer science is changing the nature of the translation of words and sentences from one language to another. The 29 Stages Of A Twitterstorm. How Google Cracked House Number Identification in Street View. The Insatiable Demand for Billy Joel. Netflix has 76,897 movie categories. And that is not necessarily a good thing.

The DeathList 2013. You will stay single by going on dating sites. You have just been dumped and, there you go, someone has convinced you to put your trust in the super algorithms of dating sites, which Business Insider tries to decipher, hoping they will find you THE right person, the one you will spend the rest of your life with. Tough luck: it seems that is not the right strategy, explains the site Kernelmag: Community management & monitoring: what if we stopped just counting likes and RTs.

If we are in the "digital age", it is (also/above all/generally) because everything can be converted into numbers and units of measurement, everything gets calculated, and everything is a pretext for evaluation (reputation first and foremost). And many tools are there to offer us just that: volume of RTs, of Likes, "strength" of engagement, inbound links, ranking, etc. Map Your Representatives Finds Your Government Officials in a Flash. Shocking, Football, Tornado, Porn: Science Explains Why You’ll Read This Article. Don’t write about finance. Immersion, the MIT tool for seeing the connections between your Gmail contacts. Why humans value sensational news: An evolutionary perspective. Information mapping: gadget or business tool? Can A Graph Of A News Article's Words Tell You More Than Reading It?

I.stanford.edu/~julian/pdfs/icwsm13.pdf. About: GDELT: Global Database of Events, Language, and Tone. US Military Scientists Solve the Fundamental Problem of Viral Marketing. Vincent.etter.io/publications/etter2013cosn.pdf. The Rise of Online Dating. How Quantum Computers and Machine Learning Will Revolutionize Big Data - Wired Science. Staking out Twitter and Facebook, new service lets police poke perps.

Book/ch07.html. Health information management. Healthcare analytics market to exceed $10bn by 2017. Mcostalba/Stockfish. What the NSA can do with “big data”. Home - Stockfish - Powerful Open Source Chess Engine. Endgame tablebase. Data mining. Comparing Languages for Data Analysis: MATLAB vs. R vs. Julia. Predictive state representation. Clbustos/Rserve-Ruby-client. Cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf.

RSRuby: Project information. Rserve - Binary R server - RForge.net. RinRuby. Best way to use R in Ruby. Can Ruby interface with r. Exploring Everyday Things with R and Ruby. Ruby and R. Perform Qualitative Data Analysis. DataMapper - DataMapper. Self-organizing map. Category:Data clustering algorithms. 20 R Packages That Should Impact Every Data Scientist « Data Scientist Insights. The Baseball Analysts. Text Analysis Tools. Space telescopes and human genomes: How researchers share petabyte data sets.

Preparing for GSO (Graph Search Optimization). Graph Analysis in Journalism with Palantir.