Network

Reordering - Adjacency matrix from edge list (preferably in Matlab). Common neighbors and preferential attachment score matrices using igraph for python. Using Netvizz & Gephi to Analyze a Facebook Network | persuasion. This post was originally published on May 6th, 2010. Since the website will be relaunched and the post removed, I have relocated the tutorial to my personal page so that the Gephi community can continue to benefit from it. If a picture is worth a thousand words, then a graph must be worth a thousand spreadsheet rows, right?

Figure: A Facebook network rendered in Gephi.

Okay, maybe not, but for practitioners and researchers alike, data visualization can reveal insights that aren’t always obvious from looking at the raw data, no matter how well organized it may be. When we’re talking about social networks, data visualization takes the form of a “social graph,” and it can be a powerful tool to discover deeper meanings and applications behind the relationships and communities within a network.

The Alternatives Facebook: Twitter: Mention Map The great thing about these apps is that they do most of the work for you. Two quick notes about Netvizz:

Copyright (C) 1991, 1999 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. [This is the first released version of the Lesser GPL. It also counts as the successor of the GNU Library Public License, version 2, hence the version number 2.1.] Preamble The licenses for most software are designed to take away your freedom to share and change it. This license, the Lesser General Public License, applies to some specially designated software packages--typically libraries--of the Free Software Foundation and other authors who decide to use it.

When we speak of free software, we are referring to freedom of use, not price. To protect your rights, we need to make restrictions that forbid distributors to deny you these rights or to ask you to surrender these rights. The precise terms and conditions for copying, distribution and modification follow. Faster “for” Loops in R | Justin Leinaweaver. We all know that the apply functions exist to get us away from relying on “for” loops (and “for” loops within “for” loops). However, I’m rather attached to using them for some particular use cases and have thus been tweaking to try and speed them up. With a hat tip to Joseph Adler’s new book from O’Reilly, “R in a Nutshell”, I’ve found that defining the dimensions of the data object that you intend to rely on to collect the results of a loop speeds the process up dramatically!
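The post's own R snippets are not part of this excerpt. As a rough stand-in, here is a minimal Python sketch of the same idea: one loop rebuilds the result container on every pass, the other fills a container whose length was fixed up front. The function names `grow_by_copy` and `fill_preallocated` are hypothetical, and Python lists are not R vectors, so the size of the speed-up is only indicative.

```python
import timeit


def grow_by_copy(n):
    """Build the result by re-creating the container on every iteration."""
    data = []
    for i in range(n):
        data = data + [i * i]  # allocates a new list and copies the old one
    return data


def fill_preallocated(n):
    """Allocate the full-length container first, then fill it in place."""
    data = [None] * n  # length is fixed before the loop starts
    for i in range(n):
        data[i] = i * i
    return data


if __name__ == "__main__":
    n = 5000
    slow = timeit.timeit(lambda: grow_by_copy(n), number=3)
    fast = timeit.timeit(lambda: fill_preallocated(n), number=3)
    print(f"grow by copy: {slow:.4f} s")
    print(f"preallocated: {fast:.4f} s")
```

Both functions return the same values; only the allocation pattern differs.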

The only difference is that the final object “data” has had its length predefined. I am not ashamed to say that this discovery has made my day. R - PCA FactoMineR plot data. Cerebral Mastication » Blog Archive » Principal Component Analysis (PCA) vs Ordinary Least Squares (OLS): A Visual Explanation.

Over at stats.stackexchange.com recently, a really interesting question was raised about principal component analysis (PCA). The gist was “Thanks to my college class I can do the math, but what does it MEAN?” I have felt like this a number of times in my life. Many of my classes were so focused on the technical implementations that they kinda missed the section titled “Why I give a shit.” A perfect example was my Mathematics Principles of Economics class, which taught me how to manually calculate a bordered Hessian but, for the life of me, I have no idea why I would ever want to calculate such a monster.

OK, that’s a lie. Later in life I learned that bordered Hessian matrices are a second derivative test used in some optimizations. So back to PCA: as I was reading the aforementioned stats question I was reminded of a recent presentation that Paul Teetor gave at an August Chicago R User Group. Your Independent Variable Matters: You should get something that looks like this: Ok, so what about PCA? Tools for Linking NetLogo and R.

An Introduction to R. Table of Contents This is an introduction to R (“GNU S”), a language and environment for statistical computing and graphics. R is similar to the award-winning S system, which was developed at Bell Laboratories by John Chambers et al.

It provides a wide variety of statistical and graphical techniques (linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, ...). This manual provides information on data types, programming elements, statistical modelling and graphics. This manual is for R, version 3.1.0 (2014-04-10). Copyright © 1990 W. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Preface This introduction to R is derived from an original set of notes describing the S and S-PLUS environments written in 1990–2 by Bill Venables and David M. Comments and corrections are always welcome. Suggestions to the reader 1.1 The R environment Try ? Node-level Calculations - Daizaburo Shizuka. There are certain pre-packaged commands in statnet and igraph that allow you to calculate various node-level measures.

The statnet package seems to have a more comprehensive list, though igraph has a couple of measures that statnet does not have. The biggest problems (for my purposes) are that igraph does not have a command for calculating information centrality, and neither package seems to have commands for reach or distance-weighted reach. The latter two are pretty straightforward, so I am posting functions that will let you easily calculate those two measures. Here is a list of commands for node-level calculations included in the two packages.
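As a rough illustration of the reach measure mentioned above, here is a minimal Python sketch rather than the author's R functions. It takes k-reach to mean the proportion of the other nodes that can be reached from a given node in at most k steps; excluding the starting node from both the count and the denominator is an assumption, and the name `k_reach` and the dictionary-of-neighbours input are hypothetical.

```python
from collections import deque


def k_reach(adjacency, source, k):
    """Proportion of the other nodes reachable from `source` in at most k steps.

    `adjacency` maps each node to an iterable of its neighbours.
    """
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distance[node] == k:
            continue  # do not expand beyond k steps
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    others = len(adjacency) - 1
    # Assumption: the source itself is not counted as "reached".
    return (len(distance) - 1) / others if others else 0.0
```

For a five-node path a-b-c-d-e, `k_reach(path, "a", 2)` counts b and c out of four other nodes, while the middle node c reaches everyone within two steps.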

To find out what algorithms they use and how to use the commands, just look at the documentation with the help command for each index (?degree, for example). Reach and Distance-weighted Reach For igraph: Reach: 2-reach and 3-reach are simply the proportion of nodes you can reach within 2 steps or 3 steps, respectively. 2-reach: 3-reach: 5 functions to do Principal Components Analysis in R. Triangle Transitivity in dominance hierarchies & directed graphs - Daizaburo Shizuka. The study of dominance hierarchies dates back to the 1920s, when Schjelderup-Ebbe (1922) first described the emergence of linear dominance hierarchies in flocks of chickens (this is also the origin of the expression 'peck order' or 'pecking order'). Subsequently, mathematicians like Kendall & Babington Smith (1940) and Landau (1951) derived formulas to describe the structures of such hierarchies (for a good overview of these methods, see de Vries (1995--Animal Behaviour) as well as Appleby (1983--Animal Behaviour)).

The methods by Kendall and Landau, however, were derived in the context of tournaments--a specific type of networks in which all asymmetric dyadic relations (e.g., dominant-subordinate) are known. This contrasts with most data on animal dominance hierarchies. Most of the time, there are pairs that are not observed to interact. To deal with this problem, various methods have been used to 'fill in' these unknown relations.
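For the complete-tournament case, one of the formulas referred to above is Landau's linearity index h = 12 / (N^3 - N) * Σ (v_a - (N - 1) / 2)^2, where v_a is the number of individuals that a dominates. The Python sketch below is only an illustration of that formula, not the code from the post; the name `landau_h` and the dictionary-of-sets input are hypothetical, and it deliberately rejects data with unknown or tied pairs rather than filling them in.

```python
def landau_h(wins):
    """Landau's linearity index h for a complete tournament.

    `wins[a]` is the set of individuals that `a` dominates. Every pair
    must have exactly one dominant member (no ties, no unknown pairs).
    """
    n = len(wins)
    if n < 2:
        raise ValueError("need at least two individuals")
    for a in wins:
        for b in wins:
            if a != b and (b in wins[a]) == (a in wins[b]):
                raise ValueError(f"pair ({a}, {b}) is not a single asymmetric relation")
    mean_wins = (n - 1) / 2
    spread = sum((len(beaten) - mean_wins) ** 2 for beaten in wins.values())
    return 12 * spread / (n ** 3 - n)
```

A perfectly linear hierarchy gives h = 1 and a three-way cycle gives h = 0, which is the range the index is meant to span.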

Here is the code: R & Bioconductor - Manuals. R & Bioconductor Manual R Basics Introduction General Overview R is a comprehensive statistical environment and programming language for professional data analysis and graphical display. The associated Bioconductor project provides many additional R packages for statistical data analysis in different life science areas, such as tools for microarray, next generation sequence and genome analysis. Scope of this Manual This R tutorial provides a condensed introduction to the usage of the R environment and its utilities for general data analysis and clustering. Format of this Manual A not always very easy to read, but practical copy & paste format has been chosen throughout this manual.

Installation of the R Software and R Packages The installation instructions are provided in the Administrative Section of this manual. R working environments with syntax highlighting support and utilities to send code to the R console: R Projects and Interfaces Basic R Usage ? R Objects ! Handling Missing Values. Description A collection and description of functions for handling missing values in 'timeSeries' objects or in objects which can be transformed into a vector or a two dimensional matrix.

The functions are listed by topic.

Usage

## S3 method for class 'timeSeries':
na.omit(object, method = c("r", "s", "z", "ir", "iz", "ie"), interp = c("before", "linear", "after"), ...)
removeNA(x, ...)
substituteNA(x, type = c("zeros", "mean", "median"), ...)
interpNA(x, method = c("linear", "before", "after"), ...)

Arguments Details

Missing Values in Price and Index Series: Applied to timeSeries objects the function removeNA just removes rows with NAs from the series.

Missing Values in Return Series: For return series the function substituteNA may be useful. Note The functions removeNA , substituteNA and interpNA are older implementations. Author(s) Raphael Gottardo for the knn function, Diethelm Wuertz for the Rmetrics R -port. References Examples. Social networking and recommendation systems. Due: at 11pm on Thursday, July 12. Submit via Catalyst CollectIt (a.k.a. Dropbox). When you sign into Facebook, it suggests friends. In this assignment, you will write a program that reads Facebook data and makes friend recommendations. This assignment looks longer than it actually is. Part 1 is background material and explanations; there is nothing to turn in for this part. First, download and unzip the file homework4.zip. Contents: Recommendation systems Facebook suggests people you may be (or should be) friends with.
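To make the friend-recommendation task concrete, here is a minimal Python sketch. It assumes one simple scoring rule, ranking people who are not yet friends by the number of friends they share with the user; the excerpt does not say which rule the assignment requires, and the function name `recommend_by_common_friends` and the dictionary-of-sets graph are hypothetical.

```python
def recommend_by_common_friends(friends, user):
    """Rank non-friends of `user` by how many friends they share with `user`.

    `friends` maps each person to the set of their friends (symmetric).
    """
    scores = {}
    for friend in friends[user]:
        for candidate in friends[friend]:
            if candidate != user and candidate not in friends[user]:
                scores[candidate] = scores.get(candidate, 0) + 1
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda name: (-scores[name], name))


network = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "carol", "dave"},
    "carol": {"alice", "bob", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}
print(recommend_by_common_friends(network, "alice"))
```

In this toy network dave shares two friends with alice and erin shares one, so dave is ranked first.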

A computer system that makes suggestions is called a recommender system. Collaborative filtering says that, if your past behavior/preferences were similar to some other user's, then your future behavior may be as well. In this assignment, you will implement a collaborative filtering recommendation system for suggesting friends on Facebook. Representing a social network as a graph A graph or network represents relationships among things. Recommending friends You are almost done! Social and Economic Networks - Matthew O. Jackson. NodeXL Graph Gallery: Graph Details. The graph is directed. The graph's vertices were grouped by cluster using the Clauset-Newman-Moore cluster algorithm.

The graph was laid out using the Fruchterman-Reingold layout algorithm.

Overall Graph Metrics:
Vertices: 1257
Unique Edges: 1295
Edges With Duplicates: 182
Total Edges: 1477
Self-Loops: 678
Reciprocated Vertex Pair Ratio: 0.00391644908616188
Reciprocated Edge Ratio: 0.00780234070221066
Connected Components: 533
Single-Vertex Connected Components: 509
Maximum Vertices in a Connected Component: 677
Maximum Edges in a Connected Component: 825
Maximum Geodesic Distance (Diameter): 8
Average Geodesic Distance: 2.99675
Graph Density: 0.000487081262129527
Modularity: 0.583904
NodeXL Version: 1.0.1.229

Network models. Rtips. Revival 2012! Paul E. Johnson <pauljohn @ ku.edu> The original Rtips started in 1999.

It became difficult to update because of limitations in the software with which it was created. Now I know more about R, and have decided to wade in again. In January, 2012, I took the FaqManager HTML output and converted it to LaTeX with the excellent open source program pandoc, and from there I’ve been editing and updating it in LyX. You are reading the New Thing! The first chore is to cut out the old useless stuff that was no good to start with and correct mistakes in translation (the quotation mark translations are particularly dangerous, but there is also trouble with ~, $, and -). (I thought it was cute to call this “StatsRus” but the Toystore’s lawyer called and, well, you know…) If you need a tip sheet for R, here it is. This is not a substitute for R documentation, just a list of things I had trouble remembering when switching from SAS to R.

Heed the words of Brian D. 1.1 Bring raw numbers into R (05/22/2012) Step 1. How to use hadoop and r for big data parallel processing. Networks -> Centrality -> Influence. Contents - Index PURPOSE Calculate the influence measure between every pair of vertices using the models of Hubbell, Katz or Taylor. DESCRIPTION Successive powers of matrices provide measures of influence since they enumerate the number of possible walks of given length between all pairs of nodes.
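The walk-counting property just described can be checked directly. The Python sketch below is only an illustration, not the tool's implementation: it raises a small adjacency matrix to a power using plain lists, and the helper names `mat_mul` and `walk_counts` are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def walk_counts(adjacency, length):
    """Return A**length; entry [i][j] counts walks of that length from i to j."""
    n = len(adjacency)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(length):
        result = mat_mul(result, adjacency)
    return result


# Undirected triangle: every node is adjacent to the other two.
triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]
print(walk_counts(triangle, 2))
print(walk_counts(triangle, 3))
```

In the triangle, each node has two walks of length 2 back to itself (out to either neighbour and back) and one walk of length 2 to each other node, which is what the squared matrix shows.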

Since longer walks are assumed to contribute less in terms of influence, an attenuation factor is included and the sum of all walks is taken. Hubbell includes the identity matrix in the series whereas Katz does not. For Hubbell the influence matrix is I + Σ (bA)^i, summed over i = 1, 2, ..., which equals the inverse of (I - bA) under certain conditions. It follows that for Katz the influence matrix is the inverse of (I - bA), minus I, under the same conditions. PARAMETERS Input dataset: Name of file containing network to be analyzed. Computational Method: Choices are: Hubbell - influence matrix defined by inverse of (I - bA) where A is the adjacency matrix and b is the attenuation factor. LOG FILE Influence matrix. COMMENTS None. What is the difference between Unix, Dos and Operating System. The GNU Operating System. CentiBiN - Centralities in Biological Networks - Documentation.

"How Cells Work" At a microscopic level, we are all composed of cells. Look at yourself in a mirror -- what you see is about 10 trillion cells divided into about 200 different types. Our muscles are made of muscle cells, our livers of liver cells, and there are even very specialized types of cells that make the enamel for our teeth or the clear lenses in our eyes! If you want to understand how your body works, you need to understand cells. Everything from reproduction to infections to repairing a broken bone happens down at the cellular level. If you want to understand new frontiers like biotechnology and genetic engineering, you need to understand cells as well. Anyone who reads the paper or any of the scientific magazines (Scientific American, Discover, Popular Science) is aware that genes are BIG news these days: Biotechnology, Gene splicing, Human genome, Genetic engineering, Recombinant DNA, Genetic diseases, Gene therapy, DNA mutations, DNA fingerprinting or DNA profiling.

What Are Genes, DNA, and Chromosomes? RBGL. What is a Gene? What are Genes? What Is a Gene? What Are Proteins? What Is A Protein? How Much Protein Do I Need? Simple example: How to use foreach and doSNOW packages for parallel computation. Plyr - R Grouping functions: sapply vs. lapply vs. apply. vs. tapply vs. by vs. aggregate vs. R - Assigning results of a for loop to an empty matrix. MCL-edge - analysing networks with millions of nodes. R/parallel : An easy-to-use toolkit for Parallel Computing in R. Research » Matthias Dehmer. Frank Emmert-Streib. The igraph library for complex network research. #501 (Add subgraph centrality, betweenness, communicability) – NetworkX Developer Zone. R recipes - igraph. Power-law Distributions. Csgillespie/poweRlaw. Comp 140: lab 05: networkx and the analysis of facebook graphs. LexRank: Graph-based Lexical Centrality as Salience in Text Summarization.