The box plot is one of the most common graphs in statistics. This article describes three ways to compare two or more groups on a box plot and add significance markers: manually, with the ggsignif package, and with the ggpubr package.
Significance is conventionally marked according to the p-value, in four levels:
0.01≤p<0.05, *
0.001≤p<0.01, **
0.0001≤p<0.001, ***
p<0.0001, ****
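As a minimal sketch of the thresholds above, the mapping from a p-value to its marker can be written with base R's cut(). The helper name p_to_stars is hypothetical (it does not come from any package), and the "ns" label for p ≥ 0.05 is an assumption added here, since the list above only covers significant values.

```r
# Hypothetical helper (not part of any package): map p-values to the
# star labels listed above. "ns" for p >= 0.05 is an assumed label.
p_to_stars <- function(p) {
  as.character(cut(
    p,
    breaks = c(-Inf, 0.0001, 0.001, 0.01, 0.05, Inf),
    labels = c("****", "***", "**", "*", "ns"),
    right  = FALSE  # intervals are [lower, upper), matching 0.01 <= p < 0.05
  ))
}

# One example value from each level
p_to_stars(c(0.03, 0.004, 0.0002, 0.00001, 0.2))
```

Setting right = FALSE makes each interval closed on the left, so a p-value of exactly 0.01 is labeled "*" rather than "**", as in the list above.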
1: Create data and draw a box plot.
library(ggplot2)
ggplot(iris, aes(x=Species, y=Sepal.Length))+geom_boxplot()
2: Manually add a significance mark.
library(ggplot2)
### Set the four corner points of the bracket
### (x = 2 and 3 are the positions of versicolor and virginica)
df2 <- data.frame(a = c(2, 2, 3, 3), b = c(8, 8.1, 8.1, 8))
ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  ### Draw the bracket through the points defined above
  geom_line(data = df2, aes(x = a, y = b)) +
  ### Add the significance mark above the bracket
  annotate("text", x = 2.5, y = 8.2, label = "***", size = 8)
The disadvantage of this method is that the p-value is not computed for you: before adding the mark manually, you have to run the test yourself and check which significance level the p-value falls into.
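One way to obtain that p-value is sketched below with base R. The choice of the Wilcoxon rank-sum test is an assumption made here, since the article does not say which test the manual mark is based on; t.test() accepts the same formula if a comparison of means is preferred.

```r
# Keep only the two species joined by the bracket (positions 2 and 3)
two_groups <- droplevels(
  subset(iris, Species %in% c("versicolor", "virginica"))
)

# Wilcoxon rank-sum test (an assumed choice of test);
# exact = FALSE avoids the warning about ties in the data
res <- wilcox.test(Sepal.Length ~ Species, data = two_groups, exact = FALSE)

# Compare this value with the thresholds listed at the top of the
# article to decide which marker to draw
res$p.value
```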
1: Download and install the ggsignif package.
install.packages("ggsignif")
2: Create data and draw box plots.
library(ggplot2)
library(ggsignif)
ggplot(iris, aes(x=Species, y=Sepal.Length)) +
geom_boxplot()
3: Use the ggsignif R package to add a significance mark.
library(ggplot2)
library(ggsignif)
ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  geom_signif(
    ### Specify the pair of groups to compare
    comparisons = list(c("versicolor", "virginica")),
    ### Show stars instead of the numeric p-value
    map_signif_level = TRUE
  )
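With map_signif_level = TRUE, ggsignif uses its own default thresholds, which stop at "***". The sketch below passes a named vector instead so that the labels follow the four levels listed at the start of the article, and compares all three pairs at once. The variable names star_levels and p are illustrative, and the step_increase value is an arbitrary choice.

```r
library(ggplot2)
library(ggsignif)

# Named thresholds: a p-value below a threshold gets that label.
# The "****" level is added to match the four levels in this article.
star_levels <- c("****" = 0.0001, "***" = 0.001, "**" = 0.01, "*" = 0.05)

p <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  geom_signif(
    comparisons = list(c("setosa", "versicolor"),
                       c("versicolor", "virginica"),
                       c("setosa", "virginica")),
    test = "wilcox.test",            # the package default, stated explicitly
    map_signif_level = star_levels,  # custom star mapping
    step_increase = 0.1              # raise each bracket to avoid overlap
  )
p
```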
1: Download and install the ggpubr package.
install.packages("ggpubr")
2: Create data and draw box plots.
library(ggpubr)
### Define the list of pairwise comparisons (used in step 3)
my_comparisons <- list(c("setosa", "versicolor"), c("setosa", "virginica"), c("versicolor", "virginica"))
ggboxplot(iris, x="Species", y="Sepal.Length",color = "Species",palette = "jco", add = "jitter")
3: Use the ggpubr R package to add significance marks.
library(ggpubr)
my_comparisons <- list(c("setosa", "versicolor"), c("setosa", "virginica"), c("versicolor", "virginica"))
ggboxplot(iris, x = "Species", y = "Sepal.Length", color = "Species", palette = "jco", add = "jitter") +
  ### Add the global p-value across all groups (Kruskal-Wallis test by default)
  stat_compare_means(label.y = 9.5) +
  ### Add a significance mark for each pairwise comparison at the given heights
  stat_compare_means(comparisons = my_comparisons, label.y = c(7.6, 8.4, 8.0), label = "p.signif")
The ggpubr package can add the overall p-value for a comparison across multiple groups, and can also add the results of comparisons between specified pairs of groups.
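To check the numbers behind the marks, ggpubr's compare_means() returns the pairwise results as a table instead of drawing them. The sketch below assumes the Wilcoxon test, which is the function's default method; the variable name res is illustrative.

```r
library(ggpubr)

# Pairwise comparisons between all species, returned as a data frame
res <- compare_means(Sepal.Length ~ Species, data = iris,
                     method = "wilcox.test")

# Show the compared groups, the raw and adjusted p-values, and the
# star label that label = "p.signif" would display on the plot
res[, c("group1", "group2", "p", "p.adj", "p.signif")]
```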