ssSankeyGrad
returns a ggplot object encoding the flow
between beginning and ending categorical data, i.e., a sankey
diagram, wherein each flow is encoded as a color gradient. Those color
gradients transition between a user-specified color for each category.
The parameters are constructed such that c1
,
c2
, col1
, col2
, and
values
are all vectors of the same length and corresponding
positions represent one flow between categories. The categories
c1
, c2
are vectors of type character string,
col1
, col2
are vectors of type character
string and include either named colors or hexidecimal representations of
color, and values
are of an integer or numeric type that
encode the amount of “flow” between each category.
To install the function from Github, simply run the following line of code:
devtools::install_github("ssp3nc3r/ggSankeyGrad", ref = "master")
In a first example, we setup some dummy data,
library(ggSankeyGrad)
c1 <- c("A", "A", "B", "B")
c2 <- c("C", "D", "C", "D")
values <- c(2L, 5L, 8L, 3L)
col1 <- c("red", "red", "green", "green")
col2 <- c("blue", "orange", "blue", "orange")
ggSankeyGrad(c1, c2, col1, col2, values, label = TRUE)
Let’s consider a second example. This data is inspired by Giorgia Lupi’s award-winning infographic, Nobels, No Degrees. We prepare the data in the same format as before, which encodes the categories of Nobel winners and from which of several schools they are associated, if any.
d5 <- read.csv(text = '
"Category","University","n","col1","col2"
"Chemistry","Berkeley",5,"#cc5b47","#003262"
"Chemistry","Caltech",4,"#cc5b47","#FF6C0C"
"Chemistry","Cambridge",3,"#cc5b47","#A3C1AD"
"Chemistry","Columbia",3,"#cc5b47","#B9D9EB"
"Chemistry","Harvard",7,"#cc5b47","#A51C30"
"Chemistry","MIT",2,"#cc5b47","#8A8B8C"
"Chemistry","Stanford",7,"#cc5b47","#B1040E"
"Economics","Berkeley",5,"#488595","#003262"
"Economics","Cambridge",3,"#488595","#A3C1AD"
"Economics","Columbia",4,"#488595","#B9D9EB"
"Economics","Harvard",7,"#488595","#A51C30"
"Economics","MIT",5,"#488595","#8A8B8C"
"Economics","Stanford",2,"#488595","#B1040E"
"Literature",NA,0,"#96c17c",NA
"Medicine","Berkeley",2,"#decd7c","#003262"
"Medicine","Caltech",5,"#decd7c","#FF6C0C"
"Medicine","Cambridge",4,"#decd7c","#A3C1AD"
"Medicine","Columbia",4,"#decd7c","#B9D9EB"
"Medicine","Harvard",12,"#decd7c","#A51C30"
"Medicine","MIT",5,"#decd7c","#8A8B8C"
"Medicine","Stanford",3,"#decd7c","#B1040E"
"Peace","Caltech",1,"#924855","#FF6C0C"
"Peace","Columbia",1,"#924855","#B9D9EB"
"Peace","Harvard",1,"#924855","#A51C30"
"Physics","Berkeley",8,"#e79275","#003262"
"Physics","Caltech",7,"#e79275","#FF6C0C"
"Physics","Cambridge",7,"#e79275","#A3C1AD"
"Physics","Columbia",6,"#e79275","#B9D9EB"
"Physics","Harvard",9,"#e79275","#A51C30"
"Physics","MIT",7,"#e79275","#8A8B8C"
"Physics","Stanford",10,"#e79275","#B1040E"
', stringsAsFactors = FALSE)
with(d5, ggSankeyGrad(c1 = Category,
c2 = University,
col1 = col1,
col2 = col2,
values = n,
label = TRUE))