

English Letter Frequency Counts: Mayzner Revisited or ETAOIN SRHLDCU

Now we show the letter frequencies by position within word. That is, the frequencies for just the first letter in each word, just the second letter, and so on. We also show frequencies for positions relative to the end of the word: "-1" means the last letter, "-2" means the second to last, and so on. We can see that the frequencies vary quite a bit; for example, "e" is uncommon as the first letter (4 times less frequent than elsewhere); similarly "n" is 3 times less common as the first letter than it is overall.

[Table: frequency of each letter (e t a o i n s r h l d c u m f p g w y b v k x j q z) at word positions 2 through 7 and -7 through -1.]

Two-Letter Sequence (Bigram) Counts

Now we turn to sequences of letters: consecutive letters anywhere within a word.

BI  COUNT    PERCENT
TH  100.3 B  3.56%

Below is a table of all 26 × 26 = 676 bigrams; in each cell the orange bar is proportional to the frequency, and if you hover you can see the exact counts and percentage.
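The positional and bigram counts described above can be sketched in a few lines of Python. This is a minimal illustration, not the article's own method: the function names `positional_letter_counts` and `bigram_counts` are hypothetical, the sample sentence stands in for a real corpus, and each word counts once (the article's figures come from a large corpus in which words carry their own frequencies).

```python
from collections import Counter, defaultdict


def positional_letter_counts(words, positions=(1, 2, 3, -3, -2, -1)):
    """Count letters at given word positions.

    Positive positions count from the start (1 = first letter);
    negative positions count from the end (-1 = last letter).
    Words too short to have a given position are skipped for it.
    """
    counts = defaultdict(Counter)
    for word in words:
        w = word.lower()
        for pos in positions:
            idx = pos - 1 if pos > 0 else pos  # convert to a Python index
            if -len(w) <= idx < len(w):
                counts[pos][w[idx]] += 1
    return counts


def bigram_counts(words):
    """Count consecutive letter pairs within each word (never across words)."""
    counts = Counter()
    for word in words:
        w = word.lower()
        for a, b in zip(w, w[1:]):
            counts[a + b] += 1
    return counts


# Assumption: a toy sample, used only to exercise the functions.
sample = "the quick brown fox jumps over the lazy dog".split()
pos_counts = positional_letter_counts(sample)
bi_counts = bigram_counts(sample)
total_bigrams = sum(bi_counts.values())
th_percent = 100 * bi_counts["th"] / total_bigrams
```

On a real corpus, one would multiply each word's contribution by its corpus count instead of adding 1, which is how a bigram such as TH reaches a share like the 3.56% quoted above.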

Blog - ifttt: the beginning...

I’d like to humbly announce that the first beta invites for a project I’m incredibly excited about are out the door. The project is called ifttt, shorthand for “if this then that”. With this blog I hope to begin fleshing out some of the initial inspirations that led to the inception of ifttt and provide you with a taste of how ifttt can help put the internet to work for you. A few years ago I became passionate about visualizing data and began experimenting with small projects that filtered and presented information in interesting ways. One evening, waiting in line to order Indian food, I was lost in thought about an event-driven programming problem when something clicked in a funny way. Before I get back to addressing that funny click, there was another concept that I was enthralled with at the time. These types of creative adaptations are much easier in a physical world where the useful properties of an individual object can be understood quickly. Now for that click.

Jesse & Linden

Ideas Illustrated: Visualizing English Word Origins

I have been reading a book on the development of the English language recently and I’ve become fascinated with the idea of word etymology, the study of words and their origins. It’s no secret that English is a great borrower of foreign words but I’m not enough of an expert to really understand what that means for my day-to-day use of the language. Simply reading about word history didn’t help me, so I decided that I really needed to see some examples. Using Douglas Harper’s online dictionary of etymology, I paired up words from various passages I found online with entries in the dictionary. The results look like this:

The quick brown fox jumps over the lazy dog.

This simple sentence is constructed of eight distinct words and one word suffix. A second example shows more variety:

Supreme executive power derives from a mandate from the masses, not from some farcical aquatic ceremony.

What follows are five excerpts taken from a spectrum of written sources.

Passage #1: American Literature

Search engine data visualisations | Search insights

I’ve decided I need a single place to put all of the search engine data visuals that I’ve been working on. The visuals are made up of thousands of actual queries put into search engines by UK users over the course of a year. This gives us an idea of ‘search demand’, which can/may/should equal actual, offline demand for a topic. Feel free to republish; however, please link to this blog and also to James Webb, who helped to create them. They can be downloaded as PDFs at the bottom of this page. Click the links below to open the visuals in PDF format for better quality printing / viewing.

Overall
Gardening
Health
Science
Nature
History
Questions

How The World Spends Its Time Online

Have you ever wondered how people across the world spend their time online? Global research firm Nielsen periodically releases data from its studies of consumer behavior online. Here are the 2010 findings regarding social networking, branding and world net usage.

Total Time Online: The average person spends more than 60 hours a month online. This is the equivalent of 30 straight days a year in front of a computer monitor, smart phone or other internet-capable device. Social networking accounts for 22 percent of the time while 42 percent is spent viewing content, whether watching videos, reading articles or playing online games. Among people who use the Internet, each person visits 2,646 Web pages on 89 domains and logs in 57 times per month.

Most Popular Brands: The percentage of all online users that visit Google is 82.

Social Network Usage: The highest percentage of internet users who log onto social media is in Brazil, with 80 percent using social network sites.
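The "30 straight days a year" figure follows directly from the 60 hours a month quoted above, and the same arithmetic gives the hours behind the percentages. The short sketch below works through it; the variable names are illustrative, and 60 hours is treated as an exact value even though the source says "more than 60".

```python
# Figures quoted in the text above (60 is a lower bound in the source).
HOURS_PER_MONTH = 60
SOCIAL_PERCENT = 22
CONTENT_PERCENT = 42
PAGES_PER_MONTH = 2646
DOMAINS_PER_MONTH = 89

# 60 hours/month * 12 months = 720 hours/year; 720 / 24 = 30 full days.
hours_per_year = HOURS_PER_MONTH * 12
days_per_year = hours_per_year / 24

# Monthly hours implied by each percentage share.
social_hours = HOURS_PER_MONTH * SOCIAL_PERCENT / 100
content_hours = HOURS_PER_MONTH * CONTENT_PERCENT / 100

# Average number of pages viewed per domain visited.
pages_per_domain = PAGES_PER_MONTH / DOMAINS_PER_MONTH

print(f"{days_per_year:.0f} days a year online")
print(f"{social_hours:.1f} h/month social, {content_hours:.1f} h/month content")
print(f"{pages_per_domain:.1f} pages per domain")
```

So the social networking share works out to roughly 13 hours a month and content viewing to roughly 25, with about 30 pages viewed per domain.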
Googlewhack

A Googlewhack is a type of contest for finding a Google search query consisting of exactly two words without quotation marks, that returns exactly one hit. A Googlewhack must consist of two actual words found in a dictionary. A Googlewhack is considered legitimate if both of the searched-for words appear in the result page. Published googlewhacks are short-lived, since when published to a web site, the new number of hits will become at least two, one to the original hit found, and one to the publishing site.[1]

History

The term Googlewhack first appeared on the web at UnBlinking on 8 January 2002;[2] the term was coined by Gary Stock. Participants at discovered the sporadic "cleaner girl" bug in Google's search algorithm where "results 1-1 of thousands" were returned for two relatively common words[3] such as Anxiousness Scheduler[4] or Italianate Tablesides.[5] Googlewhack went offline in November 2009 after Google stopped providing definition links.