
Network graphs


An Rstudio Addin for Network Analysis and Visualization · David Schoch. The ggraph package provides a ggplot-like grammar for plotting graphs, and as such you can produce very neat network visualizations. But as with ggplot, it takes a while to get used to the grammar. There are already a few amazing Rstudio Addins that assist you with ggplot (for example ggplotAssist and ggThemeAssist), but there has not been an equivalent tool for ggraph. Till now. This post introduces snahelper, an Rstudio Addin which provides a tiny GUI for visualizing and analysing networks. You can install the developer version with:
#install.packages("devtools")
devtools::install_github("schochastics/snahelper")
In order to work properly, the package also needs the smglr package, which adds a new layout algorithm:
devtools::install_github("schochastics/smglr")
To use the Addin, simply highlight a network in your script and select snahelper from the Addin dropdown menu.

The GUI has the following components: Layout, Node Attribute Manager, Nodes, Edges. Let’s go through them step-by-step.

Game of Friendship Paradox. People on average have fewer friends than their friends.
download.file(" GoT=read.csv("got.csv")
library(networkD3)
simpleNetwork(GoT[,1:2])
Because it is difficult for me to incorporate some d3js script in the blog, I will illustrate with a more basic graph. Consider a vertex v∈V in the undirected graph G=(V,E) (with classical graph notations), and let d(v) denote the number of edges touching it (i.e. v has d(v) friends).
M=(rbind(as.matrix(GoT[,1:2]),as.matrix(GoT[,2:1])))
nodes=unique(M[,1])
For each of them, we can get the list of friends, and the number of friends
friends = function(x) as.character(M[which(M[,1]==x),2])
nb_friends = Vectorize(function(x) length(friends(x)))
as well as the number of friends that each friend has, and the average of those numbers
friends_of_friends = function(y) (Vectorize(function(x) length(friends(x)))(friends(y)))
nb_friends_of_friends = Vectorize(function(x) mean(friends_of_friends(x)))
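To make the friendship paradox concrete, here is a minimal base-R sketch on a made-up five-person network. The edge list, and the names `edges`, `mean_degree` and `mean_friend_degree`, are assumptions for illustration only; this is not the GoT data from the post. It follows the same idea as the functions above: store each edge in both directions, then compare each person's number of friends with the average number of friends their friends have.

```r
# Toy undirected friendship network (hypothetical data, not the GoT file).
edges <- data.frame(
  from = c("A", "A", "A", "B", "C"),
  to   = c("B", "C", "D", "C", "E"),
  stringsAsFactors = FALSE
)

# Store every edge in both directions, as the post does with rbind().
M <- rbind(as.matrix(edges[, 1:2]), as.matrix(edges[, 2:1]))
nodes <- unique(M[, 1])

# Friends of x are the second-column entries of the rows that start with x.
friends <- function(x) as.character(M[M[, 1] == x, 2])

# d(v): number of friends of each node.
nb_friends <- vapply(nodes, function(x) length(friends(x)), numeric(1))

# For each node, the average number of friends its friends have.
nb_friends_of_friends <- vapply(
  nodes,
  function(x) mean(vapply(friends(x), function(f) length(friends(f)), numeric(1))),
  numeric(1)
)

mean_degree <- mean(nb_friends)
mean_friend_degree <- mean(nb_friends_of_friends)
print(c(own = mean_degree, friends = mean_friend_degree))
```

In this toy network the average person has fewer friends than their friends do on average, because well-connected people are counted once for every friendship they take part in.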

Flow charts in R | Insights of a PhD. Flow charts are an important part of a clinical trial report. Making them can be a pain though. One good way to do it seems to be with the grid and Gmisc packages in R. X and Y coordinates can be designated based on the center of the boxes in normalized device coordinates (proportions of the device space, where 0.5 is the middle), which saves a lot of messing around with corners of boxes and arrows. A very basic flow chart, based very roughly on the CONSORT version, can be generated as follows… Sections of code to make the boxes are wrapped in brackets to print them immediately. For detailed info, see the Gmisc vignette.

Networks with R. In order to practice with network data with R, we have been playing with the Padgett (1994) Florentine’s wedding dataset (discussed in the lecture).

The dataset is available from > library ( network ) > data(flo) > nflo plot(nflo, displaylabels = TRUE, + boxed.labels = + FALSE) The next step was to move from the network package to igraph. Since we have the adjacency matrix, we can use it > iflo=graph_from_adjacency_matrix(flo, + mode = "undirected") > plot(iflo) The good thing is that a lot of functions are available, for instance we can get shortest paths, between two specific nodes. > AP=all_shortest_paths(iflo, + from="Peruzzi", + to="Ginori") > L=AP$res[[1]] > V(iflo)$color="yellow" > V(iflo)$color[L[2:4]]="light blue" > V(iflo)$color[L[c(1,5)]]="blue" > plot(iflo) We can also visualize edges, but I found it slightly more complicated (to extract edges from the output) But it works.
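For readers who want to see what a shortest-path query does under the hood, here is a small base-R sketch using breadth-first search on a toy adjacency matrix. This is a swapped-in illustration, not igraph's implementation and not the Florentine data; the graph and the helper name `bfs_path` are hypothetical. Unlike `all_shortest_paths`, it returns only one shortest path.

```r
# Small undirected graph as an adjacency matrix (hypothetical toy data).
nodes <- c("a", "b", "c", "d", "e")
adj <- matrix(0, 5, 5, dimnames = list(nodes, nodes))
toy_edges <- rbind(c("a", "b"), c("b", "c"), c("c", "d"), c("a", "e"), c("e", "d"))
for (i in seq_len(nrow(toy_edges))) {
  adj[toy_edges[i, 1], toy_edges[i, 2]] <- 1
  adj[toy_edges[i, 2], toy_edges[i, 1]] <- 1
}

# Breadth-first search returning one shortest path between two nodes,
# or an empty character vector if 'to' cannot be reached.
bfs_path <- function(adj, from, to) {
  parent <- setNames(rep(NA_character_, nrow(adj)), rownames(adj))
  visited <- setNames(rep(FALSE, nrow(adj)), rownames(adj))
  queue <- from
  visited[from] <- TRUE
  while (length(queue) > 0) {
    v <- queue[1]
    queue <- queue[-1]
    if (v == to) break
    for (w in names(which(adj[v, ] == 1))) {
      if (!visited[w]) {
        visited[w] <- TRUE
        parent[w] <- v
        queue <- c(queue, w)
      }
    }
  }
  if (!visited[to]) return(character(0))
  # Walk back from the target to the source through the parent links.
  path <- to
  while (path[1] != from) path <- c(parent[[path[1]]], path)
  path
}

path <- bfs_path(adj, "a", "d")
print(path)
```

The nodes on the returned path are the ones that would be coloured differently in the plot, in the same way the post recolours `V(iflo)$color` along the path from Peruzzi to Ginori.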

> library( networkD3 ) > simpleNetwork (df) Then the next question was to add a vertice to the network. Ggnetwork: Network geometries for ggplot2. In-depth analysis of Twitter activity and sentiment, with R. Astronomer and budding data scientist Julia Silge has been using R for less than a year, but based on the posts using R on her blog has already become very proficient at using R to analyze some interesting data sets. She has posted detailed analyses of water consumption data and health care indicators from the Utah Open Data Catalog, religious affiliation data from the Association of Statisticians of American Religious Bodies, and demographic data from the American Community Survey (that's the same dataset we mentioned on Monday). In a two-part series, Julia analyzed another interesting dataset: her own archive of 10,000 tweets. (Julia provides all the R code for her analyses, so you can download your own Twitter archive and follow along.)

In part one, Julia uses just a few lines of R to import her Twitter archive into R; in fact, the import itself takes just one line of R code:
tweets <- read.csv("./tweets.csv", stringsAsFactors = FALSE)
mySentiment <- get_nrc_sentiment(tweets$text)

Static and dynamic network visualization with R. This is a comprehensive tutorial on network visualization with R. It covers data input and formats, visualization basics, parameters and layouts for one-mode and bipartite graphs; dealing with multiplex links, interactive and animated visualization for longitudinal networks; and visualizing networks on geographic maps. To follow the tutorial, download the code and data below and use R and RStudio.

You can also check out the most recent versions of all my tutorials here. [June 2018 update] The tutorial is continuously updated and expanded. If you want to see earlier versions, they are still available here: 2015, 2016, and 2017. If you find the tutorial useful, please cite it in your work – this helps me make the case that open publishing of digital materials like this is a meaningful academic contribution: Ognyanova, K. (2018) Network visualization with R. Visualizing Twitter history with streamgraphs in R. I was exploring ways to visualize my Twitter history, and ended up creating this interactive streamgraph of my 20 most used hashtags in Twitter: The graph shows how my Twitter activity has varied a lot. The top three hashtags are #datascience, #rstats and #opendata (no surprises there).

There are also event-related hashtags that show up only once, such as #tomorrow2015 and #iccss2015, and annually repeating ones, such as #apps4finland. Twitter has quite a strict policy for obtaining data, but they do allow one to download the full personal Twitter history, i.e. all tweets as a convenient csv file (instructions here), so that’s what I did. The visualization was created with the streamgraph R package that uses the great htmlwidgets framework for easy creation of javascript visualizations from R. Embedding the streamgraph htmlwidget into this Jekyll blog required a bit of hassle. Some problems: The size of the widget has to be fixed when creating, so it will not scale automatically.

Timely Portfolio: visNetwork, Currencies, and Minimum Spanning Trees.
# get MST using code from this post
currencies <- na.omit(currencies)
colnames(currencies) <- c("Korea", "Malaysia", "Singapore", "Taiwan", "China", "Japan", "Thailand", "Brazil", "Mexico", "India", "USDOther", "USDBroad")
# get daily percent changes
currencies <- currencies/lag(currencies)-1
currencies[1,] <- 0
cor.distance <- cor(currencies)
corrplot::corrplot(cor.distance)
library(igraph)
g1 <- graph.adjacency(cor.distance, weighted = T, mode = "undirected", add.colnames = "label")
mst <- minimum.spanning.tree(g1)
plot(mst)
library(visNetwork)
mst_df <- get.data.frame(mst, what = "both")
visNetwork(data.frame(id = 1:nrow(mst_df$vertices), label = mst_df$vertices), mst_df$edges) %>% visOptions(highlightNearest = TRUE, navigation = T)
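The pipeline in the excerpt (percent changes, correlation matrix, minimum spanning tree) can be sketched without any packages. The following is a rough illustration under stated assumptions: the price series are simulated, the asset names are made up, and `prim_mst` is a hypothetical helper implementing Prim's algorithm rather than igraph's routine. One deliberate difference: the excerpt passes raw correlations as edge weights, whereas this sketch uses 1 minus the correlation as a dissimilarity, so that strongly correlated series end up linked in the tree.

```r
set.seed(1)
# Simulated price series for four made-up assets (not the post's currency data).
prices <- apply(matrix(rnorm(400, sd = 0.01), ncol = 4), 2,
                function(r) 100 * cumprod(1 + r))
colnames(prices) <- c("W", "X", "Y", "Z")

# Daily percent changes: p_t / p_{t-1} - 1.
returns <- prices[-1, ] / prices[-nrow(prices), ] - 1

# Dissimilarity: 1 - correlation (an assumption, see the note above).
d <- 1 - cor(returns)

# Prim's algorithm on a dense dissimilarity matrix with row and column names.
prim_mst <- function(d) {
  n <- nrow(d)
  in_tree <- c(TRUE, rep(FALSE, n - 1))
  from <- to <- character(n - 1)
  weight <- numeric(n - 1)
  for (step in seq_len(n - 1)) {
    # Cheapest edge leaving the current tree.
    sub <- d[in_tree, !in_tree, drop = FALSE]
    k <- which(sub == min(sub), arr.ind = TRUE)[1, ]
    from[step] <- rownames(sub)[k[1]]
    to[step] <- colnames(sub)[k[2]]
    weight[step] <- sub[k[1], k[2]]
    in_tree[colnames(d) == to[step]] <- TRUE
  }
  data.frame(from = from, to = to, weight = weight, stringsAsFactors = FALSE)
}

mst_edges <- prim_mst(d)
print(mst_edges)
```

A tree over n nodes always has n - 1 edges, so the result here has three rows; a data frame of this shape (from, to, weight) is also the kind of edge table that network plotting packages typically accept.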

Mapping Flows in R. Last year I published the above graphic, which then got converted into the below for the book London: The Information Capital. I have had many requests for the code I used to create the plot so here it is! The data shown is the Office for National Statistics flow data. See here for the latest version. The file I used for the above can be downloaded here (it is >109 MB uncompressed so you need a decent computer to load/plot it all at once in R). You will also need this file of area (MSOA) codes and their co-ordinates. The code used is pasted below with comments above each segment. Load the flow data required: origin and destination points are needed. The UK Census file above didn't have coordinates, just area codes. Now for plotting with ggplot2. This first step removes the axes in the resulting plot.
xquiet <- scale_x_continuous("", breaks=NULL)
yquiet <- scale_y_continuous("", breaks=NULL)
quiet <- list(xquiet, yquiet)
Let's build the plot.

Nnet – R is my friend. I’ve made quite a few blog posts about neural networks and some of the diagnostic tools that can be used to ‘demystify’ the information contained in these models. Frankly, I’m kind of sick of writing about neural networks but I wanted to share one last tool I’ve implemented in R. I’m a strong believer that supervised neural networks can be used for much more than prediction, as is the common assumption by most researchers.

I hope that my collection of posts, including this one, has shown the versatility of these models to develop inference into causation. To date, I’ve authored posts on visualizing neural networks, animating neural networks, and determining importance of model inputs. This post will describe a function for a sensitivity analysis of a neural network. Specifically, I will describe an approach to evaluate the form of the relationship of a response variable with the explanatory variables used in the model.

Facebook data mining

Beautiful network diagrams with ggplot2. Visualizing neural networks from the nnet package – R is my friend. Neural networks have received a lot of attention for their abilities to ‘learn’ relationships among variables. They represent an innovative technique for model fitting that doesn’t rely on conventional assumptions necessary for standard models and they can also quite effectively handle multivariate response data. A neural network model is very similar to a non-linear regression model, with the exception that the former can handle an incredibly large amount of model parameters. For this reason, neural network models are said to have the ability to approximate any continuous function. I’ve been dabbling with neural network models for my ‘research’ over the last few months. I’ll admit that I was drawn to the approach given the incredible amount of hype and statistical voodoo that is attributed to these models.

R has a few packages for creating neural network models (neuralnet, nnet, RSNNS). In this blog I present a function for plotting neural networks from the nnet package.

Organizational Network visualization in R with the igraph package | Rules of Reason. In this post I showed a visualization of the organizational network of my department. Since several people asked for details on how the plot was produced, I will provide the code and some extensions below.

The plot has been done entirely in R (2.14.01) with the help of the igraph package. It is a great package but I found the documentation somewhat difficult to use, so hopefully this post can be a helpful introduction to network visualization with R. Here we go: # Load the igraph package (install if needed) require(igraph) # Data format.

The data is in 'edges' format meaning that each row records a relationship (edge) between two people (vertices). # Additional attributes can be included. Here is the result: Not very informative indeed. #Subset the data. Still not perfect, but much more informative and aesthetically pleasing. Additional information can be found on this guide to igraph which is in development, the examples here, and the official CRAN documentation of the package.
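As a small illustration of the 'edges' format described above, here is a base-R sketch with a made-up organizational edge list. The people, the `type` attribute and its values are hypothetical, not the author's department data. Each row is one relationship; extra columns carry edge attributes, which makes subsetting before plotting straightforward.

```r
# Hypothetical edge list: each row is one relationship between two people.
edges <- data.frame(
  from = c("Ann", "Ann", "Bob", "Cat"),
  to   = c("Bob", "Cat", "Cat", "Dan"),
  type = c("reports_to", "works_with", "works_with", "reports_to"),
  stringsAsFactors = FALSE
)

# Number of ties per person, counting both ends of every edge.
degree <- table(c(edges$from, edges$to))
print(degree)

# Subset the data: keep only one kind of relationship.
works_with <- edges[edges$type == "works_with", ]
print(works_with)
```

With igraph installed, a table shaped like this can be passed to `graph_from_data_frame(edges, directed = FALSE)` to obtain a graph object for plotting; the sketch stops short of that so it runs without extra packages.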