R is one of the most widely used tools for statistical analysis of data, and I use it a lot in my work as a hydrogeologist/geochemist. R lets me build graphs exactly the way I like to present them. If you are not getting exactly what you want from Excel and you like to write your own code, R is for you.
This post shows how to use R to make stacked bar plots (stack plots) and place the legend for the stacks properly. I will also use the same data to make box plots, which give a different view of the data set. Let's first take a look at the data. The table below presents monthly rainfall at five different stations. Our goal is to compare the data visually and compute some statistics.
Table 1: Rainfall data in inches
As with most R applications, we read the data from a CSV file, build a data frame, and then run the statistical analysis. Please visit http://www.coalgeology.com/r-programming/ for many useful code examples if you are just starting with R.
You can download the example data set from the link StackBarPlotExample
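As a rough illustration of the read-and-select step described above, here is a minimal sketch. It does not use the downloadable file: the station names and rainfall values are invented placeholders, and the layout (one row per station, one column per month) is an assumption based on the commented `colnames(rf.num)` call later in this post.

```r
# Placeholder data only: station names and rainfall values are invented.
set.seed(42)
rf_demo <- data.frame(
  Station = paste("Station", 1:5),
  matrix(round(runif(60, min = 0, max = 6), 1), nrow = 5,
         dimnames = list(NULL, month.abb))
)

# Write the placeholder table to a temporary CSV, then read it back
# the same way you would read your own file.
csv_path <- tempfile(fileext = ".csv")
write.csv(rf_demo, csv_path, row.names = FALSE)

rf <- read.csv(csv_path, stringsAsFactors = FALSE)

# Keep only the numeric columns (the twelve monthly rainfall columns).
rf.num <- rf[, sapply(rf, is.numeric)]

# Check the structure of the selected data.
str(rf.num)
```

With your own data, replace `csv_path` with the path to your CSV file and skip the placeholder step.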
Let me now present the code I have for the stacked bar plot in R:
#Rainfall stackplot code in R
#Update the CSV file to update the graph.
#coded on 2/10/2014
#Code by Ankan Basu, CPG, Hydrogeologist
# Read the CSV File into a dataframe
#select numeric vectors for the rainfalls.
#check structure of the selected vector data
#Assign screen design / graphs per page
#Assign empty space around the graphs
#to draw the legend outside the axes, use the following command.
par(xpd = TRUE)
### By itself, changing the top margin does not help.
### It is also vital to set the xpd parameter to TRUE so that
### R will draw outside the main plot region
las = 3 # makes the x-axis labels vertical
6.2, 50, # x and y position of the legend
#beside = TRUE)
#legend.text = TRUE,
#args.legend = list(x = "topright", bty = "n")
#legend("topleft", inset=.0, legend="Box Plot for Pumps")
#axis(2, at = 0:5, labels = 0:5)
#legend("topright", colnames(rf.num), fill = colors, bty = "n")
If you run the code in R, this is the output you would get:
I have used twelve different colors to represent the twelve months.
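To show how the steps in the listing could fit together, here is a minimal, self-contained sketch of a stacked bar plot with the legend drawn outside the plot region. It is not the original script: the rainfall values are invented placeholders, the margin sizes and `rainbow(12)` palette are my own choices, and `rf.mat` and `bar_mid` are names introduced only for this example. The names `rf.num` and `colors` follow the commented `legend()` call in the listing.

```r
# Placeholder data only: five invented stations, twelve invented monthly totals.
set.seed(42)
rf.num <- as.data.frame(
  matrix(round(runif(60, min = 0, max = 6), 1), nrow = 5,
         dimnames = list(paste("Station", 1:5), month.abb))
)

# One color per month (any vector of twelve colors works).
colors <- rainbow(12)

# barplot() stacks the rows of a matrix within each bar, so transpose:
# rows become months (stack segments), columns become stations (bars).
rf.mat <- t(as.matrix(rf.num))

# Widen the right margin to make room for the legend, and set xpd = TRUE
# so R is allowed to draw outside the plot region.
old_par <- par(mar = c(7, 4, 4, 7), xpd = TRUE)

# barplot() returns the x position of each bar's midpoint.
bar_mid <- barplot(rf.mat, col = colors, las = 3,
                   ylab = "Rainfall (inches)",
                   main = "Monthly rainfall by station")

# Place the legend just right of the last bar, level with the tallest stack.
legend(x = max(bar_mid) + 0.7, y = max(colSums(rf.mat)),
       legend = rownames(rf.mat), fill = colors, bty = "n")

# Restore the previous graphics settings.
par(old_par)
```

Computing the legend position from the bar midpoints and stack heights avoids hard-coding coordinates, so the legend stays beside the bars if the number of stations or the rainfall totals change.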
BOXPLOT in R:
Now, let's use the same data set to create box plots that show the spread in our data and present the five-number summary (minimum, lower quartile, median, upper quartile, maximum) in graphical form.
Here is the R code:
#R code for Boxplot
#Coded by Ankan Basu, Hydrogeologist
#main="Pipe Flows at Various Pumps at Kimballton Mine",
#xlab="Number of Cylinders",
ylab="Gallons Per Minute (GPM)",
#you can add more colors here
The box plot code produces the following graph:
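For readers who want something to run directly, here is a minimal sketch of the box plot step together with the five-number summary as numbers. Again, this is not the original script: the rainfall values are invented placeholders, and drawing one box per month (rather than one per station) is an assumption. The axis label and title are my own, and `bp` and `five_num` are names introduced only for this example.

```r
# Placeholder data only: five invented stations, twelve invented monthly totals.
set.seed(42)
rf.num <- as.data.frame(
  matrix(round(runif(60, min = 0, max = 6), 1), nrow = 5,
         dimnames = list(paste("Station", 1:5), month.abb))
)

# boxplot() draws one box per column of a data frame, here one per month.
bp <- boxplot(rf.num, col = rainbow(12), las = 3,
              ylab = "Rainfall (inches)",
              main = "Spread of monthly rainfall across stations")

# The five-number summary for each month, as a 5 x 12 matrix.
five_num <- sapply(rf.num, fivenum)
rownames(five_num) <- c("Min", "Lower hinge", "Median", "Upper hinge", "Max")
print(five_num)
```

To get one box per station instead, pass the transposed data, for example `boxplot(as.data.frame(t(rf.num)))`. Note that the whiskers drawn by `boxplot()` stop at the most extreme point within 1.5 times the box length, so they can differ from the minimum and maximum reported by `fivenum()` when outliers are present.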