
Culturomics
Culture & Meme
Googlewhack
A Googlewhack is a contest to find a Google search query of exactly two words, entered without quotation marks, that returns exactly one hit. Both words must be actual words found in a dictionary, and a Googlewhack is considered legitimate only if both searched-for words appear on the result page. Published Googlewhacks are short-lived: once one is published on a web site, the number of hits rises to at least two, the original hit and the publishing site. [1]
ggl img srch
Introduction: On December 17th, 2012, I got a nice letter from Mark Mayzner, a retired 85-year-old researcher who studied the frequency of letter combinations in English words in the early 1960s. His 1965 publication has been cited in hundreds of articles.
English Letter Frequency Counts: Mayzner Revisited or ETAOIN SRHLDCU
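A letter frequency count of the kind mentioned above can be sketched in a few lines. This is only an illustration, not Mayzner's procedure or the linked article's code: the function name `letter_frequencies` and the sample sentence are assumptions made up for this example, and only ASCII letters are counted.

```python
from collections import Counter


def letter_frequencies(text):
    """Return (letter, share) pairs sorted from most to least frequent.

    Hypothetical helper for illustration; not taken from the linked article.
    """
    # Keep ASCII letters only and fold case so 'e' and 'E' count together.
    letters = [ch.upper() for ch in text if ch.isascii() and ch.isalpha()]
    counts = Counter(letters)
    total = sum(counts.values())
    if total == 0:
        return []
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(letter, n / total) for letter, n in ranked]


# Assumed sample text; any string works.
sample = "the quick brown fox jumps over the lazy dog"
for letter, share in letter_frequencies(sample)[:5]:
    print(f"{letter}: {share:.3f}")
```

Run over a large enough body of English text, a ranking like this is where orderings such as the ETAOIN sequence in the title come from.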
Mike Kinde of Ideas Illustrated set out on an etymological visualization project to better understand how foreign words have shaped the English language.
Visualizing Our Word Origins
Ideas Illustrated » Blog Archive » Visualizing English Word Origins
Our adventures in culturomics
Peter Aldhous, Jim Giles and MacGregor Campbell, reporters (Image: Michael St. Maur Sheil/Corbis)
Here in New Scientist's San Francisco bureau we can't resist an invitation to participate in an entirely new field of research. So after reading about the first analyses of word usage over time in Google's mammoth database of 5 million digitised books, we were excited to learn that the search giant has provided a neat tool, the Books Ngram Viewer, to perform your own "culturomic" studies. Diving straight into the US culture war, this result made us exclaim, "Science be praised!"
Culturomics
Google: The largest linguistic corpus of all time
When I was a student, in the late 1970s, I would never have dared imagine, even in my wildest dreams, that the scientific community would one day have the means to analyze computerized text corpora of several hundred billion words. At the time, I was amazed by the Brown Corpus, which contained the extraordinary quantity of one million words of American English and which, after being used to compile the American Heritage Dictionary, had been made fairly widely available to researchers. This corpus, despite a size that now seems paltry, made possible an impressive number of studies and contributed greatly to the rise of language technologies...
In 500 Billion Words, a New Window on Culture
The digital storehouse, which comprises words and short phrases as well as a year-by-year count of how often they appear, represents the first time a data set of this magnitude and searching tools are at the disposal of Ph.D.’s, middle school students and anyone else who likes to spend time in front of a small screen. It consists of the 500 billion words contained in books published between 1500 and 2008 in English, French, Spanish, German, Chinese and Russian. The intended audience is scholarly, but a simple online tool allows anyone with a computer to plug in a string of up to five words and see a graph that charts the phrase’s use over time, a diversion that can quickly become as addictive as the habit-forming game Angry Birds. With a click you can see that “women,” in comparison with “men,” is rarely mentioned until the early 1970s, when feminism gained a foothold.
When Google Books helps us understand our cultural genome
For once, we are going to say something nice about Google in this reading of the week, by way of an article published on the Discover Magazine site in December 2010, written by Ed Yong. The title of the article: “The cultural genome: Google Books reveals traces of fame, censorship and changing languages”.
Search engine data visualisations | Search insights
I’ve decided I need a single place to put all of the search engine data visuals that I’ve been working on. The visuals are made up of thousands of actual queries put into search engines by UK users over the course of a year. This gives us an idea of ‘search demand’ which can/may/should equal actual, offline demand for a topic. Feel free to republish them; however, please link to this blog and also to James Webb, who helped to create them. They can be downloaded as PDFs at the bottom of this page.
The Google Alphabet: An Autocomplete Snapshot From A to Z
Google's "instant" results appear in the search box the moment you begin typing a search term. For every letter of the alphabet, Google autocompletes popular search terms. We thought it would be an interesting experiment to take a look at the current crop of default search results during an anonymous search session on Google.
3'-Sémiométrie
Quotes google fights...
by Avraham Roos
Googlefight.com: At first sight, Googlefight seems like a total waste of time and (because of the fighting) even completely uneducational. But think again. What you are looking at is actually one of the largest free web-based corpora.
Googlefight!
We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of ‘culturomics,’ focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. Culturomics extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.
Quantitative Analysis of Culture Using Millions of Digitized Books
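The year-by-year phrase counts that these culturomics pieces describe can be illustrated with a toy example. The sketch below is an assumption-laden illustration, not how Google's Books Ngram Viewer or the paper's pipeline is implemented: the helper names, the tiny three-document corpus, and the plain whitespace tokenization (no punctuation handling) are all invented for this example.

```python
from collections import defaultdict


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return zip(*(tokens[i:] for i in range(n)))


def yearly_relative_frequency(documents, phrase):
    """Map each year to the share of its n-grams that match `phrase`.

    `documents` is an iterable of (year, text) pairs (hypothetical format).
    Tokenization is a plain lowercase whitespace split, an assumption
    made to keep the sketch short.
    """
    target = tuple(phrase.lower().split())
    n = len(target)
    if n == 0:
        raise ValueError("phrase must contain at least one word")
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in documents:
        tokens = text.lower().split()
        for gram in ngrams(tokens, n):
            totals[year] += 1
            if gram == target:
                hits[year] += 1
    # Years whose texts are shorter than n words have no n-grams and are omitted.
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Made-up corpus, for illustration only.
toy_corpus = [
    (1900, "men and women"),
    (1900, "men at work"),
    (1970, "women and men and women"),
]
print(yearly_relative_frequency(toy_corpus, "women"))
```

Dividing by the number of n-grams in each year, rather than reporting raw counts, keeps years with many more published words from dominating the curve.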
Culturomics 2.0: Forecasting large-scale human behavior using global news media tone in time and space
Global geocoded tone of all Summary of World Broadcasts content January 1979–April 2011 mentioning “Bin Laden”. (Credit: UIC)
Computational analysis of large text archives can yield novel insights into the functioning of society, recent literature has suggested, including predicting future economic events, says Kalev Leetaru, Assistant Director for Text and Digital Media Analytics at the Institute for Computing in the Humanities, Arts, and Social Science at the University of Illinois and Center Affiliate of the National Center for Supercomputing Applications. The emerging field of “Culturomics” seeks to explore broad cultural trends through the computerized analysis of vast digital book archives, offering novel insights into the functioning of human society. Books, however, represent the “digested history” of humanity, written with the benefit of hindsight.
Three researchers at Cornell University have just put online a tool for monitoring and tracing quotations across media sites and blogs. The name memetracker raises a slight problem of definition, because memes cannot be reduced to excerpts of speech and fragments of sentences. (1) However, the application deserves better than a quarrel over words. It is a data visualization system, a cutting-edge technology in which the Americans are a good ten years ahead of Europe (2). Its creators are Jure Leskovec (his thesis on video, delivered with an unforgettable accent, is here), Lars Backstrom, and Jon Kleinberg. Memetracker inspects 900,000 stories spotted on one million news sites and blogs. From 17 million sentences, it extracts the quotations most frequently repeated over the hours and days across the "spectrum" of sites and blogs that make up the reference sample.
Visual tracking of quotations that replicate across the web
"A short saying oft contains much wisdom"
About 959,000 results (0.56 seconds) by Aug 27
"A fine quotation is a diamond in the hand of a man of wit and a pebble in the hand of a fool"
About 87,000 results (0.64 seconds) by Aug 27

