Chapter 11 Visualizing networks with igraph
In this chapter, we take a look at networks. R can handle networks, both for analysis and for visualization, and several packages help with this job. The methods in this tutorial are based on the igraph package. The igraph project also provides interfaces for several other programming languages, so you may encounter it in other contexts.
11.1 Setup
if(!require(igraph)){
install.packages("igraph",repos = "http://cran.us.r-project.org")
library(igraph)
}
## Loading required package: igraph
##
## Attaching package: 'igraph'
## The following object is masked from 'package:gtools':
##
## permute
## The following objects are masked from 'package:dplyr':
##
## as_data_frame, groups, union
## The following objects are masked from 'package:purrr':
##
## compose, simplify
## The following object is masked from 'package:tidyr':
##
## crossing
## The following object is masked from 'package:tibble':
##
## as_data_frame
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
11.2 Network representations
There are multiple ways of reading in or defining a network. The two most common ways to represent a network are edge lists and adjacency matrices.
11.2.1 Network formats: edgelist
In the first representation, you have a table with at least two columns. Every line represents one edge, and the two columns hold the two nodes that this edge connects. Directionality is straightforward here: you can define one column to contain only source nodes (where the arrow starts) and the other column to contain only sink or target nodes (where the arrow ends). Undirected networks can be defined (implicitly or explicitly) by giving every edge twice, with the source and sink nodes of one line being exchanged in the other line. Directionality and weights can also be defined in additional columns. A common input file format is the simple interaction format.
Here is the example of a simple network.
edgelist_circle1 <- matrix(c("A","B","B","C","C","D","D","E","E","F","F","A"),
                           byrow = TRUE,
                           ncol = 2,
                           dimnames = list(c(1:6), c("source","sink")))
edgelist_circle1
## source sink
## 1 "A" "B"
## 2 "B" "C"
## 3 "C" "D"
## 4 "D" "E"
## 5 "E" "F"
## 6 "F" "A"
testgraph_circle1 <- graph_from_edgelist(edgelist_circle1)
plot(testgraph_circle1, layout = layout_in_circle(testgraph_circle1))
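The description above mentions that direction and weights can be kept in additional columns. The following is a minimal sketch of that idea, not part of the chapter's own data: the data frame `edges_weighted` and its values are made up for illustration. It uses `graph_from_data_frame()`, which reads the first two columns as source and sink and stores any further columns as edge attributes.

```r
library(igraph)

# Hypothetical edge list with a third column holding edge weights
edges_weighted <- data.frame(source = c("A", "B", "C"),
                             sink   = c("B", "C", "A"),
                             weight = c(1, 2.5, 0.5),
                             stringsAsFactors = FALSE)

# The first two columns define the edges, further columns become edge attributes
g_directed   <- graph_from_data_frame(edges_weighted, directed = TRUE)
g_undirected <- graph_from_data_frame(edges_weighted, directed = FALSE)

is_directed(g_directed)     # the direction source -> sink is kept
is_directed(g_undirected)   # every line is read as an undirected edge
E(g_directed)$weight        # weights taken from the third column

# Show the weights as edge labels
plot(g_directed, edge.label = E(g_directed)$weight)
```

With `directed = FALSE`, each edge only needs to be listed once, so the "every edge twice" convention is not required in this case.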
11.2.2 Network formats: adjacency matrix
The second representation is a square matrix with one row and one column for every node. The fields that are spanned by this matrix contain information on whether an edge connects the node belonging to this row and the node belonging to this column. Usually, 0 indicates no connection and 1 indicates a connection (weights can be introduced, too, if necessary). Directionality is dealt with by using the two triangles of the matrix: the matrix is read such that the rows indicate source nodes and the columns sink nodes for the edges.
#make an empty matrix with as many rows and columns as there are nodes
adjMat_circle2 <- matrix(0,
                         nrow = 6,
                         ncol = 6,
                         dimnames = list(c("J","K","L","M","N","O"),
                                         c("J","K","L","M","N","O")))
#fill matrix by connecting each node to only the next
for(i in 1:(nrow(adjMat_circle2)-1)){
  adjMat_circle2[i,i+1] <- 1
}
#close the circle
adjMat_circle2[nrow(adjMat_circle2),1] <- 1
adjMat_circle2
## J K L M N O
## J 0 1 0 0 0 0
## K 0 0 1 0 0 0
## L 0 0 0 1 0 0
## M 0 0 0 0 1 0
## N 0 0 0 0 0 1
## O 1 0 0 0 0 0
testgraph_circle2 <- graph_from_adjacency_matrix(adjMat_circle2)
plot(testgraph_circle2, layout = layout_in_circle(testgraph_circle2))
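Since both representations describe the same kind of object, it can help to see one converted into the other. The sketch below assumes a small three-node matrix invented for this purpose (`adj_small` is not used elsewhere in the chapter) and uses igraph's `as_edgelist()` and `as_adjacency_matrix()` to go back and forth.

```r
library(igraph)

# Hypothetical 3-node adjacency matrix: J -> K -> L
nodes <- c("J", "K", "L")
adj_small <- matrix(0, nrow = 3, ncol = 3, dimnames = list(nodes, nodes))
adj_small["J", "K"] <- 1
adj_small["K", "L"] <- 1

# Adjacency matrix -> graph -> edge list
g_from_adj <- graph_from_adjacency_matrix(adj_small, mode = "directed")
el_small   <- as_edgelist(g_from_adj)
el_small

# Edge list -> graph -> adjacency matrix (as an ordinary, non-sparse matrix)
g_from_el <- graph_from_edgelist(el_small)
adj_back  <- as_adjacency_matrix(g_from_el, sparse = FALSE)

# Bring rows and columns into the original order before comparing
adj_back <- adj_back[nodes, nodes]
all(adj_back == adj_small)
```

Note that an edge list cannot represent isolated nodes, so this round trip only recovers the original matrix when every node takes part in at least one edge.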
Exercise: In the code below, change the edgelist in a way that breaks the circle and adds a new edge instead. Plot the result.
edgelist_new <- matrix(c("A","B","B","C","C","D","D","E","E","F","F","A"), #adapt this
                       byrow = TRUE,
                       ncol = 2,
                       dimnames = list(c(1:6), c("source","sink")))
edgelist_new
## source sink
## 1 "A" "B"
## 2 "B" "C"
## 3 "C" "D"
## 4 "D" "E"
## 5 "E" "F"
## 6 "F" "A"
testgraph_new1 <- graph_from_edgelist(edgelist_new)
plot(testgraph_new1, layout = layout_in_circle(testgraph_new1))
Exercise: In the code below, change the adjacency matrix in a way that breaks the circle and adds a new edge instead. Plot the result.
adjMat_new <- adjMat_circle2
#replace at least two fields by adjusting the following lines (you need to un-comment them by removing the hash sign):
# adjMat_new[x1,y1] <- 0
# adjMat_new[x2,y2] <- 1
adjMat_new
## J K L M N O
## J 0 1 0 0 0 0
## K 0 0 1 0 0 0
## L 0 0 0 1 0 0
## M 0 0 0 0 1 0
## N 0 0 0 0 0 1
## O 1 0 0 0 0 0
testgraph_new2 <- graph_from_adjacency_matrix(adjMat_new)
plot(testgraph_new2, layout = layout_in_circle(testgraph_new2))
11.3 Network visualization
Base R and igraph are not really made for visualizing large graphs or networks; there is better software out there for those tasks. What is nice, however, is that you can integrate your visualization with the functions for analysing graphs, with R’s statistical tools, and with all the visualizations of the results from those functions. So let’s take a look at igraph’s options for plotting graphs.
We’ll first create a simple graph from a built-in function. In real life, you’d be using an igraph object that you created from an adjacency matrix or an edge list, depending on your data.
g <- graph_from_atlas(501)
plot(g)
As you can see, this graph’s nodes don’t have names. Let’s give them some.
V(g)$name <- c("Maria","Johannes","Jan","Johanna","Cornelis","Hendrik","Anna")
plot(g)
11.3.1 Layouts
Layouts are what determine where the nodes sit on the canvas. By default, igraph guesses a good layout algorithm for your graph’s size. For a small graph, this is usually a Fruchterman-Reingold force-directed layout, which will put more connected nodes closer to each other. It can also be chosen explicitly:
plot(g,layout = layout_with_fr(g))
One important point to notice is that the layout does not always look the same. If you run the above chunk a couple of times, you will end up with slightly different positions of your nodes. This can be a problem if you want to plot the same graph twice but highlight different aspects. In this case, you can save a layout.
frg <- layout_with_fr(g)
Now you can re-use this layout. Run the following chunk a couple of times and you will see that nothing changes:
plot(g, layout=frg)
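Saving the layout matrix is one way to get stable positions. Another option, sketched here under the assumption that the layout function draws its random numbers from R's own generator, is to fix the seed with `set.seed()` before computing the layout. The seed value 42 is arbitrary.

```r
library(igraph)

g <- graph_from_atlas(501)

# Same seed, same starting positions, hence the same layout
set.seed(42)
layout_a <- layout_with_fr(g)
set.seed(42)
layout_b <- layout_with_fr(g)

# A layout is just a matrix with one row per node and two columns (x and y)
dim(layout_a)
all.equal(layout_a, layout_b)

plot(g, layout = layout_a)
```

Storing the matrix, as done with `frg` above, has the advantage that it does not depend on the seed being set at the right moment.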
You’ve already met the circle layout above:
plot(g,layout = layout_in_circle(g))
More layouts include stars:
plot(g,layout = layout_as_star(g))
… as trees (this one is nice for tree-like structures, e.g. pedigrees):
plot(g,layout = layout_as_tree(g))
… other algorithms that try to bring more connected nodes together while avoiding crowding, e.g.,
… using the Kamada-Kawai layout:
plot(g,layout = layout_with_kk(g))
… using the GEM layout:
plot(g,layout = layout_with_gem(g))
… or using multi-dimensional scaling of the distances between the nodes in the graph (problematic with equi-distant nodes like we have them here):
plot(g,layout = layout_with_mds(g))
… or just plain random:
plot(g,layout = layout_randomly(g))
You can even set all the coordinates manually by giving one row of x and y coordinates per node in an \(n\times2\) matrix (\(n\) being the number of nodes).
plot(g,layout = matrix(sample(1:7,14,replace=T),nrow=7,ncol=2))
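Random coordinates can place several nodes on the same spot. As a minimal sketch of a deliberate manual layout, the following puts the seven nodes in a zig-zag line; the coordinate values and the name `coords` are chosen only for illustration.

```r
library(igraph)

g <- graph_from_atlas(501)

# One row per node: x increases from left to right, y alternates between 0 and 1
coords <- cbind(x = 1:7,
                y = rep(c(0, 1), length.out = 7))
coords

plot(g, layout = coords)
```

By default, the plot rescales the coordinates to fit the plotting area, so only the relative positions matter.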
11.3.2 Node labels
Let’s keep our FR-layout from above and manipulate its look. Exercise: play around with the visualization arguments shown below to understand how they work. Let’s start with the labels. You can choose not to plot them:
plot(g, layout=frg,
vertex.label = NA) #no labels
In a very un-R-ish way, the default labels have serifs. You can change to a more usual sans serif font like so:
plot(g, layout=frg,
vertex.label.family = "sans") # sans serif labels
And change the color:
plot(g, layout=frg,
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black") #color
… and the size:
plot(g, layout=frg,
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", #color
vertex.label.cex = 0.7) # size
… also for the individual nodes:
plot(g, layout=frg,
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", #color
vertex.label.cex = 0.2*1:7) # size
With this, you could also take values from an analysis, e.g. the degree (i.e. number of connections) of the nodes:
plot(g, layout=frg,
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", #color
vertex.label.cex = 0.2*degree(g)) # size based on degree
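Multiplying the degree by a constant can make the labels of weakly connected nodes too small to read. One possible way around this, shown as a sketch, is to rescale the values into a fixed range first. The helper `rescale_to()` is hypothetical and defined here; it is not an igraph function.

```r
library(igraph)

g <- graph_from_atlas(501)
V(g)$name <- c("Maria","Johannes","Jan","Johanna","Cornelis","Hendrik","Anna")

# Hypothetical helper: linearly rescale x so that it spans lo to hi
rescale_to <- function(x, lo, hi) {
  rng <- range(x)
  if (rng[1] == rng[2]) {
    # all values equal: use the middle of the target range
    return(rep((lo + hi) / 2, length(x)))
  }
  lo + (x - rng[1]) / (rng[2] - rng[1]) * (hi - lo)
}

# Label sizes between 0.5 (lowest degree) and 1.2 (highest degree)
label_cex <- rescale_to(degree(g), lo = 0.5, hi = 1.2)

plot(g,
     vertex.label.family = "sans",
     vertex.label.color = "black",
     vertex.label.cex = label_cex)
```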
The labels don’t need to sit in the node:
plot(g, layout=frg,
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", #color
vertex.label.cex = 0.6, # size
vertex.label.dist = 3, # label distance to node centre
vertex.label.degree = 0) # label position 0=right, pi=left, pi/2=below etc
11.3.3 Node shapes and sizes
You have a choice of different shapes, e.g. circles, squares, rectangles, spheres.
plot(g, layout=frg,
vertex.shape="square",
vertex.size = 40,
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", # label color
vertex.label.cex = 0.6) # label size
As always, they can be set by node:
plot(g, layout=frg,
vertex.shape=c("square","circle")[1+as.numeric(grepl("nn",V(g)$name))],
vertex.size = 40,
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", # label color
vertex.label.cex = 0.6) # label size
Colours of both the node and its frame can be chosen freely, too:
plot(g, layout=frg,
vertex.shape=c("square","circle")[1+as.numeric(grepl("nn",V(g)$name))],
vertex.color = c("darkgreen","orange")[1+as.numeric(grepl("^J",V(g)$name))],
vertex.frame.color = c("black","grey"), # vectors are not recycled, any node without a value gets an NA
vertex.size = 40,
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", # label color
vertex.label.cex = 0.6) # label size
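Instead of passing everything to `plot()`, the same settings can be stored on the graph as vertex attributes, using the argument names without the `vertex.` prefix. The sketch below reproduces the shape and colour rules from above in this way; the variable `starts_with_j` is introduced only for readability.

```r
library(igraph)

g <- graph_from_atlas(501)
V(g)$name <- c("Maria","Johannes","Jan","Johanna","Cornelis","Hendrik","Anna")

# Same rules as above: names starting with "J" are orange, names containing "nn" are circles
starts_with_j <- grepl("^J", V(g)$name)
V(g)$color <- ifelse(starts_with_j, "orange", "darkgreen")
V(g)$shape <- ifelse(grepl("nn", V(g)$name), "circle", "square")
V(g)$size <- 40
V(g)$frame.color <- "black"

# plot() picks up the stored attributes
plot(g,
     vertex.label.family = "sans",
     vertex.label.color = "black",
     vertex.label.cex = 0.6)
```

Keeping the appearance on the graph object can be convenient when the same graph is plotted several times.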
A special case of nodes are pies:
plot(g, layout=frg,
vertex.shape= "pie",
vertex.size = 30,
vertex.pie = list(c(1,1,0,2), # one entry per node (the graph has 7 nodes)
c(2,1,0,1),
c(3,1,0,0),
c(4,0,1,0),
c(2,2,0,3),
c(1,1,0,1),
c(4,4,0,1)),
vertex.pie.color = list(rainbow(4)),
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", # label color
vertex.label.cex = 0.6, # label size
vertex.label.dist = 3, # label distance to node centre
vertex.label.degree = 0) # label position 0=right, pi=left, pi/2=below etc
Pies are especially effective when their size is set per node as well:
pieList <- list(c(1,1,0,2), # one entry per node (the graph has 7 nodes)
                c(2,1,0,1),
                c(3,1,0,0),
                c(4,0,1,0),
                c(2,2,0,3),
                c(1,1,0,1),
                c(4,4,0,1))
plot(g, layout=frg,
vertex.shape= "pie",
vertex.size = sapply(pieList,sum)*6,
vertex.pie = pieList,
vertex.pie.color = list(rainbow(4)),
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", # label color
vertex.label.cex = 0.6, # label size
vertex.label.dist = rep(3,7), # label distance to node centre
vertex.label.degree = rep(0,7)) # label position 0=right, pi=left, pi/2=below etc
11.3.4 Edge appearance
The appearance of the edges can be controlled as well, for example their colour:
plot(g, layout=frg,
edge.color = "black",
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", #color
vertex.label.cex = 0.6, # size
vertex.label.dist = 3, # label distance to node centre
vertex.label.degree = 0) # label position 0=right, pi=left, pi/2=below etc
… including the line type:
plot(g, layout=frg,
edge.lty = 3,
edge.color = "black",
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", #color
vertex.label.cex = 0.6, # size
vertex.label.dist = 3, # label distance to node centre
vertex.label.degree = 0) # label position 0=right, pi=left, pi/2=below etc
… and the thickness:
plot(g, layout=frg,
edge.width = 3,
edge.lty = 2,
edge.color = "black",
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", #color
vertex.label.cex = 0.6, # size
vertex.label.dist = 3, # label distance to node centre
vertex.label.degree = 0) # label position 0=right, pi=left, pi/2=below etc
… which can, of course, be adapted per edge:
plot(g, layout=frg,
edge.width = sqrt(edge_betweenness(g)),
edge.lty = 2,
edge.color = "black",
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", #color
vertex.label.cex = 0.6, # size
vertex.label.dist = 3, # label distance to node centre
vertex.label.degree = 0) # label position 0=right, pi=left, pi/2=below etc
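Edge settings can also be stored on the graph as edge attributes, using the argument names without the `edge.` prefix. The following sketch maps edge betweenness linearly onto widths between 1 and 5, as an alternative to the square root used above; the target range is an arbitrary choice.

```r
library(igraph)

g <- graph_from_atlas(501)

eb <- edge_betweenness(g)

# Map betweenness onto line widths from 1 (lowest) to 5 (highest)
if (diff(range(eb)) > 0) {
  E(g)$width <- 1 + 4 * (eb - min(eb)) / (max(eb) - min(eb))
} else {
  # all edges have the same betweenness: use one width for all
  E(g)$width <- 1
}
E(g)$color <- "black"
E(g)$lty <- 2

# plot() picks up width, color and lty from the edge attributes
plot(g, vertex.label = NA)
```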
Edges can be curved:
plot(g, layout=frg,
edge.curved = T,
edge.width = sqrt(edge_betweenness(g)),
edge.lty = 1,
edge.color = "black",
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", #color
vertex.label.cex = 0.6, # size
vertex.label.dist = 3, # label distance to node centre
vertex.label.degree = 0) # label position 0=right, pi=left, pi/2=below etc
… and more or less curved:
plot(g, layout=frg,
edge.curved = 0.1,
edge.width = sqrt(edge_betweenness(g)),
edge.lty = 1,
edge.color = "black",
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", #color
vertex.label.cex = 0.6, # size
vertex.label.dist = 3, # label distance to node centre
vertex.label.degree = 0) # label position 0=right, pi=left, pi/2=below etc
plot(g, layout=frg,
edge.curved = -2,
edge.width = sqrt(edge_betweenness(g)),
edge.lty = 1,
edge.color = "black",
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", #color
vertex.label.cex = 0.6, # size
vertex.label.dist = 3, # label distance to node centre
vertex.label.degree = 0) # label position 0=right, pi=left, pi/2=below etc
Of course, edges may have labels:
plot(g, layout=frg,
edge.label = E(g),
edge.label.family = "sans",
edge.label.cex = 0.6,
edge.label.color = "magenta",
edge.label.font = 3,
edge.curved = 0.1,
edge.width = sqrt(edge_betweenness(g)),
edge.lty = 1,
edge.color = "black",
vertex.label.family = "sans", # sans serif labels
vertex.label.color = "black", #color
vertex.label.cex = 0.6, # size
vertex.label.dist = 3, # label distance to node centre
vertex.label.degree = 0) # label position 0=right, pi=left, pi/2=below etc
You can also add a title, box and other things. See ?igraph.plotting for further tips.
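As a small sketch of these extras, the following adds a title with `main`, a box with `frame = TRUE`, and a legend with base R's `legend()`. The title text, the legend labels and the output file are made up for illustration; the plot is written to a temporary PDF so that the chunk also runs without a graphics window.

```r
library(igraph)

g <- graph_from_atlas(501)
V(g)$name <- c("Maria","Johannes","Jan","Johanna","Cornelis","Hendrik","Anna")
starts_with_j <- grepl("^J", V(g)$name)

# Hypothetical output file in the temporary directory
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

plot(g,
     vertex.color = ifelse(starts_with_j, "orange", "darkgreen"),
     vertex.label.family = "sans",
     vertex.label.color = "black",
     vertex.label.cex = 0.6,
     main = "Example graph",   # title
     frame = TRUE)             # box around the plot

# The legend is drawn with base graphics on top of the igraph plot
legend("bottomleft",
       legend = c("name starts with J", "other"),
       fill = c("orange", "darkgreen"),
       bty = "n")

dev.off()
```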