R data visualization: circlize genome drawing functions

created at 07-03-2021

Drawing function

The function that creates plot regions for genomic data is circos.genomicTrack(), or circos.genomicTrackPlotRegion().

It is used much like the circos.track() function: you pass a custom drawing function through panel.fun:

circos.genomicTrackPlotRegion(
  data, panel.fun = function(region, value, ...) {
    circos.genomicPoints(region, value, ...)
})

In the panel.fun function, you can add graphics with basic graphics functions. The function receives two parameters region and value:

  • region: a data frame with two columns, the start and end positions
  • value: a data frame holding the remaining columns, generally the data from the fourth column onward

The region data determines positions on the x-axis, and value determines positions on the y-axis.

panel.fun must also accept a third parameter, ..., which carries variables that are not visible to the user. These are handed on to the low-level genomic drawing functions, such as circos.genomicPoints(), which parse them internally.
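
To see why panel.fun has to accept and forward ..., here is a minimal base R sketch of the mechanism. run_panel() and draw_points() are hypothetical stand-ins invented for this illustration, not circlize functions; they only mimic how a caller can hand a hidden argument through ... to an inner drawing function.

```r
# Hypothetical stand-in for a low-level drawing function: it expects a hidden
# argument (hidden_index) that the user never sets directly.
draw_points <- function(region, value, hidden_index = NA, ...) {
  sprintf("%d points, data frame index %s", nrow(region), hidden_index)
}

# Hypothetical stand-in for the track function: it calls panel.fun and
# supplies the hidden argument itself.
run_panel <- function(region, value, panel.fun) {
  panel.fun(region, value, hidden_index = 1)
}

region <- data.frame(start = c(121306, 140866), end = c(127516, 680287))
value  <- data.frame(value1 = c(-0.51, 0.44))

# Forwarding `...` lets the hidden argument reach draw_points().
with_dots <- run_panel(region, value, function(region, value, ...) {
  draw_points(region, value, ...)
})

# Dropping `...` in the inner call silently loses the hidden argument.
without_dots <- run_panel(region, value, function(region, value, ...) {
  draw_points(region, value)
})

print(with_dots)
print(without_dots)
```

In the second call the inner function falls back to its default, which is the kind of silent failure that forwarding ... avoids.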

For example, we create data with two additional columns

> bed <- generateRandomBed(nc = 2)
> head(bed, n = 2)
   chr  start    end     value1     value2
1 chr1 121306 127516 -0.5083810 -0.7065109
2 chr1 140866 680287  0.4426682 -1.0817683

We can print out the region and value in the panel.fun function

circos.initializeWithIdeogram(plotType = NULL)
circos.genomicTrackPlotRegion(
  bed, panel.fun = function(region, value, ...) {
    if(CELL_META$sector.index == "chr1") {
      print(head(region, n = 2))
      print(head(value, n = 2))
    }
})

As you can see, region holds columns 2 and 3 of the data, and value holds columns 4 and 5.

   start    end
1 121306 127516
2 140866 680287
      value1     value2
1 -0.5083810 -0.7065109
2  0.4426682 -1.0817683

The numeric.column parameter specifies which columns hold the y-axis data. You can pass column names or column indices; the default is all numeric columns (starting from the fourth column). All of these columns share the same x-axis coordinates. You can use ylim to set the data range. For example:

circos.genomicTrackPlotRegion(
  data, ylim = c(0, 1),
  panel.fun = function(region, value, ...) {
    circos.genomicPoints(region, value, ...)
})
circos.genomicTrackPlotRegion(
  data, numeric.column = c("value1", "value2"), 
  panel.fun = function(region, value, ...) {
     circos.genomicPoints(region, value, ...)
})

If the input data is a list of data frames, numeric.column must be either a single value, applied to every data frame, or a vector with the same length as the list.
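
The relationship between column positions in the input data and in value can be checked with base R alone. This sketch assumes a hand-made BED-like data frame (numbers copied from the head(bed) output above) and mimics the region/value split by hand; it does not call circlize.

```r
bed <- data.frame(
  chr    = c("chr1", "chr1"),
  start  = c(121306, 140866),
  end    = c(127516, 680287),
  value1 = c(-0.5083810, 0.4426682),
  value2 = c(-0.7065109, -1.0817683)
)

# Mimic the split: columns 2-3 become region, the rest become value.
region <- bed[, 2:3]
value  <- bed[, -(1:3), drop = FALSE]

# Column 4 of the input (by index or by name) is column 1 of value.
same_by_index <- identical(bed[[4]], value[[1]])
same_by_name  <- identical(bed[["value1"]], value[["value1"]])
print(c(same_by_index, same_by_name))

# A ylim that covers every numeric column, as one might pass explicitly.
ylim <- range(unlist(value))
print(ylim)
```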

1. Basic graphics functions

circos.genomicPoints() draws points and is implemented with the circos.points() function. Its basic implementation looks like this:

circos.genomicPoints <- function(region, value, numeric.column = 1, ...) {
    x = (region[[2]] + region[[1]])/2
    for(i in numeric.column) {
        y = value[[i]]
        circos.points(x, y, ...)
    }
}

If you prefer not to use the circos.genomic*() functions, you can achieve the same result with the corresponding circos.*() functions.

Typical usages include:
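
As a rough illustration of the circos.*() route, the x positions that circos.genomicPoints() derives can be computed by hand as interval midpoints. The sketch below only does the base R arithmetic, using the two regions printed earlier; the circos.points() call is shown as a comment because it needs an initialized circular layout.

```r
region <- data.frame(start = c(121306, 140866), end = c(127516, 680287))
value  <- data.frame(value1 = c(-0.5083810, 0.4426682))

# Midpoint of each interval gives the x position of its point.
x <- (region[[1]] + region[[2]]) / 2
y <- value[[1]]
print(x)

# Inside panel.fun, these vectors could then be drawn with:
# circos.points(x, y, pch = 16, cex = 0.5)
```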

circos.genomicPoints(region, value, numeric.column = c(1, 2))
circos.genomicPoints(region, value, cex, pch)
circos.genomicPoints(region, value, sector.index, track.index)
circos.genomicTrack(data, numeric.column = 4, 
    panel.fun = function(region, value, ...) {
        circos.genomicPoints(region, value, ...)
})

The other genomic drawing functions are likewise implemented with the corresponding circos.*() functions:

  • circos.genomicLines
circos.genomicLines(region, value, ...)
circos.genomicLines(region, value, numeric.column = c(1, 2))
circos.genomicLines(region, value, area, baseline, border)
circos.genomicLines(region, value, sector.index, track.index)
  • circos.genomicText
circos.genomicText(region, value, ...)
circos.genomicText(region, value, y = 1, labels)
circos.genomicText(region, value, numeric.column, labels.column)
circos.genomicText(region, value, facing, niceFacing, adj)
circos.genomicText(region, value, sector.index, track.index)
  • circos.genomicRect

Because the left and right borders of each rectangle are fixed by the region on the x-axis, you only need to set the top and bottom borders. Use ytop and ybottom to give values directly, or ytop.column and ybottom.column to name the data columns that hold them.

circos.genomicRect(region, value, ytop = 1, ybottom = 0)
circos.genomicRect(region, value, ytop.column = 2, ybottom = 0)
circos.genomicRect(region, value, col, border)
  • circos.genomicLink

Two data frames are needed to define the linked regions; other parameters are passed on to circos.link(). For example:

bed1 <- generateRandomBed(nr = 100)
bed1 <- bed1[sample(nrow(bed1), 20), ]
bed2 <- generateRandomBed(nr = 100)
bed2 <- bed2[sample(nrow(bed2), 20), ]

circos.initializeWithIdeogram()
circos.genomicLink(
  bed1, bed2, border = NA,
  col = rand_color(nrow(bed1), transparency = 0.5)
)
circos.clear()

Figure: Basic graphics

2. Drawing mode

The circos.genomicTrack() function and its panel.fun argument behave differently depending on the input data and the mode.

2.1 Normal mode

2.1.1 Data frame

If the input data is a data frame, the drawing method is the same as before

circos.initializeWithIdeogram()
circos.genomicTrack(
  data, numeric.column = 4, 
  panel.fun = function(region, value, ...) {
    circos.genomicPoints(region, value, col = "blue")
    # Here numeric.column = 1 means the first column of value, that is, the fourth column of data
    circos.genomicPoints(region, value, numeric.column = 1, col = "red")
  }
)
circos.clear()

Figure: Normal mode

2.1.2 Data frame list

When the input is a list of data frames, panel.fun is called once for each data frame on the current chromosome. region and value then hold the rows of the current data frame that fall on the current chromosome.

Use getI(...) inside panel.fun to get the index of the current data frame. For example:

circos.initializeWithIdeogram()
circos.genomicTrack(
  bed_list,
  panel.fun = function(region, value, ...) {
    i = getI(...)
    circos.genomicPoints(region, value, col = rand_color(1), ...)
  })

# column 4 in the first bed and column 5 in the second bed
circos.genomicTrack(
  bed_list,
  numeric.column = c(4, 5),
  panel.fun = function(region, value, ...) {
    i = getI(...)
    circos.genomicPoints(region, value, col = rand_color(1), ...)
  }
)
circos.clear()

Figure: Data frame list
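
One way to make use of the index returned by getI(...) is to pick a fixed color per data frame instead of a random one. The sketch below assumes circlize is installed; the palette, the seed, and the seen vector (used only to record which indices were visited) are illustrative choices, and pdf(NULL) is used so that no file is written.

```r
library(circlize)

set.seed(123)
bed_list <- list(generateRandomBed(nr = 100), generateRandomBed(nr = 100))

palette_cols <- c("#ef8a62", "#67a9cf")  # one color per data frame
seen <- numeric(0)                       # indices reported by getI(...)

pdf(NULL)  # draw to a null device
circos.initializeWithIdeogram(plotType = NULL)
circos.genomicTrack(
  bed_list,
  panel.fun = function(region, value, ...) {
    i <- getI(...)
    seen <<- c(seen, i)
    # Index the palette with i so each data frame keeps its own color.
    circos.genomicPoints(region, value, col = palette_cols[i], pch = 16, cex = 0.5, ...)
  }
)
circos.clear()
invisible(dev.off())

print(sort(unique(seen)))
```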

2.2 Stack mode

Set stack = TRUE in circos.genomicTrack() to turn on stack mode.

In stack mode, ylim is redefined: the y-axis is divided into bins of equal height (y = 1, 2, ...), and one graphic is placed in each bin.

2.2.1 Data frame

If the input in stack mode is a data frame with multiple numeric columns, each numeric column specified by numeric.column is treated as one unit. ylim is set to (0.5, 0.5 + n), where n is the number of numeric columns, and the y values of the i-th column are replaced with y = i.

For example:

data <- generateRandomBed(nr = 100, nc = 2)
circos.initializeWithIdeogram()
circos.genomicTrack(
  data, stack = TRUE,
  panel.fun = function(region, value, ...) {
    i = getI(...)
    circos.genomicPoints(region, value, col = rand_color(1), ...)
  }
)
circos.clear()

Figure: Stack mode
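
The y-axis layout used in stack mode can be written down in a few lines of base R. stack_layout() is a hypothetical helper made up for this illustration, not part of circlize; it simply encodes the rule stated above for n numeric columns.

```r
stack_layout <- function(n) {
  list(
    ylim = c(0.5, n + 0.5),  # one unit-height bin per numeric column
    y    = seq_len(n)        # column i is drawn at y = i
  )
}

# Two numeric columns, as in generateRandomBed(nr = 100, nc = 2).
layout <- stack_layout(2)
print(layout$ylim)
print(layout$y)
```

Each bin is centered on an integer, which is why the later examples draw guide lines at c(i, i).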

2.2.2 Data frame list

If the input is a list of data frames, each data frame is treated as one unit. ylim is redefined as (0.5, 0.5 + n), where n is the length of the list.

panel.fun is applied to each data frame:

circos.initializeWithIdeogram()
circos.genomicTrack(
  bed_list, stack = TRUE,
  panel.fun = function(region, value, ...) {
    i = getI(...)
    circos.genomicPoints(region, value, col = rand_color(1), ...)
  })

circos.clear()

Figure: Data frame list (stack mode)

3. Application

3.1 Point graph

To make it easier to see the difference between the graphs, we only display one chromosome and draw it as a 1/4 circle

circos.par(
  "track.height" = 0.1, start.degree = 90,
  canvas.xlim = c(0, 1), canvas.ylim = c(0, 1), 
  gap.degree = 270
)
circos.initializeWithIdeogram(
  chromosome.index = "chr1", plotType = NULL
)

Add track A, which only draws points:

bed <- generateRandomBed(nr = 300)

circos.genomicTrack(
  bed, panel.fun = function(region, value, ...) {
    circos.genomicPoints(region, value, pch = 16, cex = 0.5, ...)
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'A', adj = c(1.05, 0.5)
    )
  }
)

Add track B, which arranges the points in stack mode and adds a dashed line:

circos.genomicTrack(
  bed, stack = TRUE, 
  panel.fun = function(region, value, ...) {
    circos.genomicPoints(region, value, pch = 16, cex = 0.5, ...)
    i = getI(...)
    circos.lines(CELL_META$cell.xlim, c(i, i), 
                 lty = 2, col = "#00000040")
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'B', adj = c(1.05, 0.5)
    )
  }
)

Add track C, which uses a list of data frames and sets different colors for the points of the two data frames:

bed1 <- generateRandomBed(nr = 300)
bed2 <- generateRandomBed(nr = 300)
bed_list <- list(bed1, bed2)
circos.genomicTrack(
  bed_list, 
  panel.fun = function(region, value, ...) {
    i = getI(...)
    circos.genomicPoints(
      region, value, pch = 16,
      cex = 0.5, col = rand_color(1), ...
    )
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'C', adj = c(1.05, 0.5)
    )
  }
)

Add track D, which uses stack mode for the list of data frames:

circos.genomicTrack(
  bed_list, stack = TRUE, 
  panel.fun = function(region, value, ...) {
    i = getI(...)
    circos.genomicPoints(
      region, value, pch = 16,
      cex = 0.5, col = rand_color(1), ...
    )
    circos.lines(
      CELL_META$cell.xlim, c(i, i), lty = 2, 
      col = "grey50"
    )
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'D', adj = c(1.05, 0.5)
    )
  }
)

Add track E. The data frame contains 4 numeric columns that share the same x-axis coordinates, and each column gets a different color:

bed <- generateRandomBed(nr = 300, nc = 4)
circos.genomicTrack(
  bed, panel.fun = function(region, value, ...) {
    circos.genomicPoints(
      region, value, pch = 16,
      cex = 0.5, col = 1:4, ...
    )
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'E', adj = c(1.05, 0.5)
    )
  }
)

Add track F, which displays the 4 columns in stack mode:

bed <- generateRandomBed(nr = 300, nc = 4)
circos.genomicTrack(
  bed, stack = TRUE, 
  panel.fun = function(region, value, ...) {
    i = getI(...)
    circos.genomicPoints(
      region, value, pch = 16,
      cex = 0.5, col = i, ...
    )
    circos.lines(
      CELL_META$cell.xlim, c(i, i), 
      lty = 2, col = "grey50"
    )
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'F', adj = c(1.05, 0.5)
    )
  })
circos.clear()

Figure: Point graph

3.2 Line

As with the point graph above, we draw only chr1, in a quarter circle:

circos.par(
  "track.height" = 0.08, start.degree = 90,
  canvas.xlim = c(0, 1), canvas.ylim = c(0, 1), 
  gap.degree = 270,
  cell.padding = c(0, 0, 0, 0)
)

circos.initializeWithIdeogram(
  chromosome.index = "chr1", plotType = NULL
)

Track A draws a simple polyline whose points are the midpoints of the intervals:

bed <- generateRandomBed(nr = 500)
circos.genomicTrack(
  bed, panel.fun = function(region, value, ...) {
    circos.genomicLines(region, value)
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'A', adj = c(1.05, 0.5)
    )
  }
)

Track B draws an area plot, and track C uses lines of type "h":

circos.genomicTrack(
  bed,  panel.fun = function(region, value, ...) {
    circos.genomicLines(region, value, area = TRUE)
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'B', adj = c(1.05, 0.5)
    )
  }
)
circos.genomicTrack(
  bed, panel.fun = function(region, value, ...) {
    circos.genomicLines(region, value, type = "h")
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'C', adj = c(1.05, 0.5)
    )
  }
)

Track D draws grouped lines from a list of data frames, where each group is one data frame:

bed1 <- generateRandomBed(nr = 500)
bed2 <- generateRandomBed(nr = 500)
bed_list <- list(bed1, bed2)
circos.genomicTrack(
  bed_list, 
  panel.fun = function(region, value, ...) {
    i = getI(...)
    circos.genomicLines(
      region, value, col = rand_color(1), ...
    )
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'D', adj = c(1.05, 0.5)
    )
  }
)

Track E draws the list of data frames in stack mode:

circos.genomicTrack(
  bed_list, stack = TRUE, 
  panel.fun = function(region, value, ...) {
    i = getI(...)
    circos.genomicLines(region, value, col = i, ...)
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'E', adj = c(1.05, 0.5)
    )
  }
)

Track F draws grouped lines from a data frame with 4 numeric columns, where each group is one column:

bed <- generateRandomBed(nr = 500, nc = 4)
circos.genomicTrack(
  bed, 
  panel.fun = function(region, value, ...) {
    circos.genomicLines(region, value, col = 1:4, ...)
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'F', adj = c(1.05, 0.5)
    )
  }
)

Track G draws the same data frame in stack mode:

circos.genomicTrack(
  bed, stack = TRUE, 
  panel.fun = function(region, value, ...) {
    i = getI(...)
    circos.genomicLines(region, value, col = i, ...)
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'G', adj = c(1.05, 0.5)
    )
  }
)

Track H draws lines of type "segment":

bed <- generateRandomBed(nr = 200)
circos.genomicTrack(
  bed, 
  panel.fun = function(region, value, ...) {
    circos.genomicLines(
      region, value, type = "segment",
      lwd = 2, col = rand_color(nrow(region)),
      ...
    )
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'H', adj = c(1.05, 0.5)
    )
  })
circos.clear()

Figure: Line

3.3 Rectangle

Because the color of each rectangle represents the size of its value, we define a continuous color mapping:

circos.par(
  "track.height" = 0.15, start.degree = 90,
  canvas.xlim = c(0, 1), canvas.ylim = c(0, 1), 
  gap.degree = 270
)
circos.initializeWithIdeogram(
  chromosome.index = "chr1", plotType = NULL
)
col_fun <- colorRamp2(
  breaks = c(-1, 0, 1), 
  colors = c("#ef8a62", "#f7f7f7", "#67a9cf")
)
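
To get a feel for what col_fun returns, it can be called directly on a few values. This is a small sketch that assumes circlize is installed; the test values are arbitrary. It also checks that values outside the range of breaks are mapped to the colors of the nearest break, which is the documented behavior of colorRamp2().

```r
library(circlize)

col_fun <- colorRamp2(
  breaks = c(-1, 0, 1),
  colors = c("#ef8a62", "#f7f7f7", "#67a9cf")
)

# One color string per input value, interpolated between the breaks.
cols <- col_fun(c(-1, -0.5, 0, 0.5, 1))
print(cols)

# Values beyond the breaks are clamped to the end colors.
clamped <- c(col_fun(-5) == col_fun(-1), col_fun(5) == col_fun(1))
print(clamped)
```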

If you want to draw a heat map, you can use stack mode (track A):

bed <- generateRandomBed(nr = 100, nc = 4)
circos.genomicTrack(
  bed, stack = TRUE, 
  panel.fun = function(region, value, ...) {
    circos.genomicRect(
      region, value,
      col = col_fun(value[[1]]),
      border = NA, ...
    )
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'A', adj = c(1.05, 0.5)
    )
  }
)

Track B uses stack mode with a list of data frames:

bed1 <- generateRandomBed(nr = 100)
bed2 <- generateRandomBed(nr = 100)
bed_list <- list(bed1, bed2)
circos.genomicTrack(
  bed_list, stack = TRUE, 
  panel.fun = function(region, value, ...) {
    i = getI(...)
    circos.genomicRect(
      region, value, ytop = i + 0.3,
      ybottom = i - 0.3,
      col = col_fun(value[[1]]),
      ...
    )
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'B', adj = c(1.05, 0.5)
    )
  }
)

In track C, we use normal mode to achieve a similar result:

circos.genomicTrack(
  bed_list, ylim = c(0.5, 2.5), 
  panel.fun = function(region, value, ...) {
    i = getI(...)
    circos.genomicRect(
      region, value, ytop = i + 0.3,
      ybottom = i - 0.3,
      col = col_fun(value[[1]]),
      ...
    )
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'C', adj = c(1.05, 0.5)
    )
  }
)

In track D, we set the height of each bar from the data; ytop.column = 1 refers to the first column of value:

bed <- generateRandomBed(nr = 200)
circos.genomicTrack(
  bed, 
  panel.fun = function(region, value, ...) {
    circos.genomicRect(
      region, value, ytop.column = 1,
      ybottom = 0,
      col = ifelse(value[[1]] > 0, "#ef8a62", "#67a9cf"),
      ...
    )
    circos.lines(CELL_META$cell.xlim, c(0, 0), lty = 2, col = "grey50")
    circos.text(
      CELL_META$cell.xlim[1], 
      mean(CELL_META$cell.ylim), 
      'D', adj = c(1.05, 0.5)
    )
  }
)
circos.clear()

Figure: Rectangle
