
The Cooperative Association for Internet Data Analysis

Related:  Data mining

Eureqa Eureqa is a breakthrough technology that uncovers the intrinsic relationships hidden within complex data. Traditional machine learning techniques like neural networks and regression trees are capable tools for prediction, but become impractical when "solving the problem" involves understanding how you arrive at the answer. Eureqa uses a breakthrough machine learning technique called Symbolic Regression to unravel the intrinsic relationships in data and explain them as simple math. Using Symbolic Regression, Eureqa can create incredibly accurate predictions that are easily explained and shared with others. Over 35,000 people have relied on Eureqa to answer their most challenging questions, in industries ranging from Oil & Gas through Life Sciences and Big Box Retail. Try Eureqa for yourself - it's free for 30 days.

HTTP Compression use by Alexa Top 1000 Yesterday, frontend madman and performance nut Paul Irish reached out to me asking if I had any stats on the use of HTTP compression. I’ve written a bunch about the benefits of HTTP compression, as well as the challenges in implementing it. Surprisingly, I realized that, no, I did not have any figures about HTTP compression usage by major sites. The most recent stats I had were from the talk I gave during Velocity 2011's Ignite sessions. The more I thought about it, the more I saw there were lots of interesting stats to gather beyond a raw “X number of sites use compression”. Methodology How could I get data about how HTTP compression is used by the top websites? My next thought was to use the awesome HTTP Archive. Another reason for needing the full responses, including headers and body bytes, is the type of analysis I needed to do. In the end, there was no getting around the fact that I needed to download the actual content from Alexa’s top websites. Big Questions To Answer The Findings More Testing
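To make the point about needing both headers and body bytes concrete, here is a minimal sketch of how a single downloaded response could be checked for compression. It is not the author's tooling: the function name `analyze_compression` and the shape of the returned dictionary are assumptions made for illustration, and only gzip and deflate bodies are decoded.

```python
import gzip
import zlib


def analyze_compression(headers, body):
    """Summarize compression for one HTTP response.

    headers: dict of response headers (hypothetical input format).
    body:    raw body bytes exactly as transferred.
    """
    # Header names are case-insensitive, so normalize them first.
    normalized = {k.lower(): v for k, v in headers.items()}
    encoding = normalized.get("content-encoding", "").strip().lower()

    decoded = None
    if encoding == "gzip":
        decoded = gzip.decompress(body)
    elif encoding == "deflate":
        try:
            decoded = zlib.decompress(body)  # zlib-wrapped deflate
        except zlib.error:
            # Some servers send raw deflate without the zlib wrapper.
            decoded = zlib.decompress(body, -zlib.MAX_WBITS)
    elif encoding == "":
        encoding = "identity"
        decoded = body
    # Any other encoding is reported but left undecoded in this sketch.

    transferred = len(body)
    decoded_bytes = len(decoded) if decoded is not None else None
    savings = 1 - transferred / decoded_bytes if decoded_bytes else 0.0
    return {
        "encoding": encoding,
        "transferred_bytes": transferred,
        "decoded_bytes": decoded_bytes,
        "savings": savings,
    }
```

The header alone says whether compression was applied; the body bytes are what allow the actual savings to be measured, which is why a headers-only data set would not be enough for this kind of analysis.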

Internet Malicious Activity Maps The next map below represents a summary of malicious activity seen on the Internet over the past 30 days combined. The IP space is mapped into this image using a Hilbert Curve. The numbers in the upper left-hand corner of each block of the map indicate the first octet of the IP addresses represented in that section, so, for example, the block labeled "24" represents all of the IP addresses in the netblock. Blocks with orange numbers and cross-hatching are full /8 networks that are bogons, unallocated space which should never be seen on the Internet. Non-bogon blocks are displayed with red numbers. The map below is a half-size version to avoid breaking the layout of the web page and making it impossible to read - click on the image to open the full-sized version of the map in a new browser window/tab. Each individual pixel of the full map represents 4096 IP addresses. Credit for the idea of this mapping concept goes to xkcd, and their Map of the Internet.
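The mapping described above can be sketched in a few lines. At 4096 addresses per pixel, the 2^32 IPv4 addresses need 2^20 pixels, i.e. a 1024 x 1024 image, and each /8 block covers 4096 pixels, a 64 x 64 square. The sketch below uses the standard distance-to-coordinate Hilbert conversion; the function names and the curve orientation are assumptions and may not match the orientation of the map on the page.

```python
import ipaddress

ADDRESSES_PER_PIXEL = 4096  # stated in the text above
SIDE = 1024                 # 2**32 / 4096 = 2**20 pixels = 1024 x 1024


def hilbert_d2xy(side, d):
    """Convert distance d along a Hilbert curve to (x, y) on a side x side grid.

    side must be a power of two. The orientation is one common convention,
    assumed here for illustration.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the quadrant built so far.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        # Move into the quadrant selected by this pair of bits.
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def ip_to_pixel(ip):
    """Return the (x, y) pixel that covers the given IPv4 address string."""
    index = int(ipaddress.IPv4Address(ip)) // ADDRESSES_PER_PIXEL
    return hilbert_d2xy(SIDE, index)
```

The reason for choosing a Hilbert curve is locality: numerically adjacent address ranges land on adjacent pixels, and every aligned block such as a /8 stays a compact square instead of being smeared across a scan line.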

Weka 3 - Data Mining with Open Source Machine Learning Software in Java Weka is tried and tested open source machine learning software that can be accessed through a graphical user interface, standard terminal applications, or a Java API. It is widely used for teaching, research, and industrial applications, contains a plethora of built-in tools for standard machine learning tasks, and additionally gives transparent access to well-known toolboxes such as scikit-learn, R, and Deeplearning4j.

Wolfram|Alpha Personal Analytics Connect with Facebook, sign in for free, and get unique, personalized information and analysis on your social data, computed by Wolfram|Alpha. Clustering of your friends What are the groups of friends that make up your network? How do these groups relate to each other? Where in the world are your friends? Where do your friends live? Your network's global reach Who lives farthest from you? How popular are your friends? How many friends do your friends have? What do you talk about on Facebook? The bigger the word, the more often it's used in your conversations. When do you use Facebook? When are you most active? Where are your friends at in life? Do your friends' ages reflect what kinds of relationships they're in? Explore the structure of your friend network How do your friends connect you to your other friends? Who plays the special roles in your network? How are your friends tied together? Your most popular photos What is your most liked photo? Get a new perspective on your friends

IRR - Internet Routing Registry

Data Mining, a useful tool in Business Intelligence | Ana María Orozco Zuluaga On many occasions we have heard about Data Mining, but what exactly is it and when should we use it? Well, I am going to start with some basic definitions that I have collected from different sources and authors and combined (nicely, from my point of view) to share in this post. What is it? Data Mining is an extraction activity whose objective is to discover facts that are in the database. In the same way, it enables you to deduce hidden knowledge by examining or training the data. When do we have to use it, or when is it useful? Data mining is very useful in many fields, such as marketing, government, medicine, sales and production. In the figure below I show general information on how each algorithm works, its characteristics, and the specific cases in which we use it.

Arbeitsgemeinschaft Online Forschung The AGOF was founded in December 2002. By its own account, it is an “organization of online marketers and advertising media that, independent of individual interests, ensures transparency and standards in digital advertising media research”. The AGOF emerged from all members of the AGIREV (Arbeitsgemeinschaft Internet Research e.V), the clients of the Online-Reichweiten-Monitor (ORM), and the sponsors and licensees of the Arbeitsgemeinschaft @facts. It works closely with the Arbeitsgemeinschaft Media-Analyse e. V. (agma) and is supported by measurement service providers and market research companies as well as the BVDW. Structure The AGOF has a total of 19 members, organized in the two sections Internet and Mobile. Members of the Internet section are (as of January 2014): Axel Springer Media Impact GmbH & Co. The Mobile section is formed by (as of January 2014) Axel Springer Media Impact GmbH & Co. Unique User Studies

Net-virtual Practice Labs ::: Cisco :: Global MPLS VPNv4 & VPNv6 MPLS VPN Inter-AS Option A or Back-to-Back VRF Described in RFC2547, section 10. What if two sites of a VPN are connected to different Autonomous Systems (e.g., because the sites are connected to different SPs)? There are a number of different ways of handling this case, which we present in order of increasing scalability. VRF-to-VRF connections at the AS (Autonomous System) border routers. In this procedure, a PE router in one AS attaches directly to a PE router in another. This is a procedure that "just works", and that does not require MPLS at the border between ASes. Case Scenario: Internet Primary Path (Solid Lines) Customers in VRF-A (SP-A) Internet primary path is via CSR19 in AS-1019 Customers in VRF-A (SP-B) Internet primary path is via XRV19 in AS-2019 Customers in VRF-B (SP-A) Internet primary path is via CSR18 in AS-1018 Customers in VRF-B (SP-B) Internet primary path is via XRV18 in AS-2018 Example #1: Inter-AS MPLS VPN All ODD Subnets are on same VRF-A.

Data Mining Image: Detail of sliced visualization of thirty video samples of Downfall remixes. See actual visualization below. As part of my postdoctoral research for The Department of Information Science and Media Studies at the University of Bergen, Norway, I am using cultural analytics techniques to analyze YouTube video remixes. My research is done in collaboration with the Software Studies Lab at the University of California, San Diego. A big thank you to CRCA at Calit2 for providing a space for daily work during my stays in San Diego. The following is an excerpt from an upcoming paper titled “Modular Complexity and Remix: The Collapse of Time and Space into Search,” to be published in the peer-reviewed journal AnthroVision, Vol 1.1. The following excerpt references sliced visualizations of the three case studies in order to analyze the patterns of remixing videos on YouTube. Image: this is a slice visualization of “The Charleston and Lindy Hop Dance Remix.”