Boxplots are among the most common plots in statistics. This article describes a simple way to compare two or more groups in a boxplot and add significance markers.
By convention, a significant result is assigned one of four levels according to its p-value:
(1) 0.01 ≤ p < 0.05: *
(2) 0.001 ≤ p < 0.01: **
(3) 0.0001 ≤ p < 0.001: ***
(4) p < 0.0001: ****
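The thresholds above can be written as a small lookup function. This is only a minimal sketch in base R: the name p_to_stars and the "ns" label for p ≥ 0.05 are assumptions made for illustration and do not come from any package.

```r
# Hypothetical helper: map p-values to the four significance markers listed above.
# The "ns" label for p >= 0.05 is an assumption; the list above does not name it.
p_to_stars <- function(p) {
  as.character(cut(
    p,
    breaks = c(-Inf, 0.0001, 0.001, 0.01, 0.05, Inf),
    labels = c("****", "***", "**", "*", "ns"),
    right = FALSE  # intervals are closed on the left, e.g. 0.01 <= p < 0.05
  ))
}

p_to_stars(c(0.03, 0.004, 0.0002, 0.00001, 0.2))
# returns "*" "**" "***" "****" "ns"
```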
Three ways of adding significance markers are described next: drawing them manually, using the ggsignif package, and using the ggpubr package.
1: Load the data (the built-in iris dataset) and draw a boxplot.
library(ggplot2)
ggplot(iris, aes(x=Species, y=Sepal.Length))+geom_boxplot()
2: Manually add significance markers.
library(ggplot2)
### Set the four corner points of the bracket (x = 2 and x = 3 are the second and third boxes)
df2 <- data.frame(a = c(2, 2, 3, 3), b = c(8, 8.1, 8.1, 8))
ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  ### Draw the bracket through the four points
  geom_line(data = df2, aes(x = a, y = b)) +
  ### Add the significance marker above the bracket
  annotate("text", x = 2.5, y = 8.2, label = "***", size = 8)
The disadvantage of this method is that the p-value has to be computed separately beforehand, so that the correct marker can be chosen and placed by hand.
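One way to find out which marker to draw is to run the test first. The sketch below compares the same two species that the manual bracket spans. It assumes a Wilcoxon rank-sum test, since the article does not say which test the manual example is based on; t.test() could be substituted.

```r
# Keep only the two species joined by the manual bracket
two_groups <- droplevels(subset(iris, Species %in% c("versicolor", "virginica")))

# Assumption: Wilcoxon rank-sum test. exact = FALSE uses the normal
# approximation, which avoids the warning R gives when the data contain ties.
res <- wilcox.test(Sepal.Length ~ Species, data = two_groups, exact = FALSE)
res$p.value
```

The resulting p-value can then be matched against the four levels listed at the start of the article. If it falls below 0.0001, that scheme calls for "****" rather than "***" in the annotate() call.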
1: Download and install the ggsignif package.
install.packages("ggsignif")
2: Load the data and draw a boxplot.
library(ggplot2)
library(ggsignif)
ggplot(iris, aes(x=Species, y=Sepal.Length)) +
geom_boxplot()
3: Use the ggsignif R package to add significance markers.
library(ggplot2)
library(ggsignif)
ggplot(iris, aes(x=Species, y=Sepal.Length)) +
geom_boxplot() +
  ### Specify the pair of groups to compare
  geom_signif(comparisons = list(c("versicolor", "virginica")),
              ### Show significance stars instead of the numeric p-value
              map_signif_level = TRUE)
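With map_signif_level = TRUE, ggsignif uses its built-in mapping of p-values to stars. To follow the four levels listed at the start of the article instead, a named numeric vector can be passed to map_signif_level. The sketch below also compares all three pairs of species. The name star_levels is chosen here for illustration, and step_increase = 0.1 is an arbitrary spacing value.

```r
library(ggplot2)
library(ggsignif)

# Custom thresholds: each name is the label shown when p is below that value
star_levels <- c("****" = 0.0001, "***" = 0.001, "**" = 0.01, "*" = 0.05)

p <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(c("setosa", "versicolor"),
                       c("versicolor", "virginica"),
                       c("setosa", "virginica")),
    map_signif_level = star_levels,  # named vector instead of TRUE
    step_increase = 0.1              # stack the brackets so they do not overlap
  )
p
```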
1: Download and install the ggpubr package.
install.packages("ggpubr")
2: Load the data and draw a boxplot.
library(ggpubr)
### Add list of pairwise comparisons
my_comparisons <- list(c("setosa", "versicolor"), c("setosa", "virginica"), c("versicolor", "virginica"))
ggboxplot(iris, x="Species", y="Sepal.Length", color = "Species", palette = "jco", add = "jitter")
3: Use the ggpubr R package to add significance markers.
library(ggpubr)
my_comparisons <- list(c("setosa", "versicolor"), c("setosa", "virginica"), c("versicolor", "virginica"))
ggboxplot(iris, x="Species", y="Sepal.Length",color = "Species",palette = "jco", add = "jitter")+
  ### Add the global p-value across all groups (Kruskal-Wallis test by default for more than two groups)
  stat_compare_means(label.y = 9.5) +
  ### Add a significance marker for each pairwise comparison; label.y sets the height of each bracket
  stat_compare_means(comparisons = my_comparisons, label.y = c(7.6, 8.4, 8.0), label = "p.signif")
This R package can add the p-value of a global comparison across multiple groups, and can also add pairwise comparisons between specified groups.
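To inspect the numbers behind the markers, ggpubr also provides compare_means(), which returns the test results as a table instead of drawing them. The sketch below sets the methods explicitly to mirror the plot above. The variable names global and pairwise are chosen here for illustration.

```r
library(ggpubr)

# Global test across all three species
global <- compare_means(Sepal.Length ~ Species, data = iris, method = "kruskal.test")
global

# Pairwise tests between every pair of species
pairwise <- compare_means(Sepal.Length ~ Species, data = iris, method = "wilcox.test")
pairwise[, c("group1", "group2", "p", "p.adj", "p.signif")]
```

The p.signif column holds the star labels that label = "p.signif" displays on the plot.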