GRanges-class {GenomicRanges}    R Documentation

GRanges objects

Description

The GRanges class is a container for genomic locations and their associated annotations.

Details

The GRanges class stores a sequence of genomic locations and their associated annotations. Each element in the sequence consists of a sequence name, an interval, a strand, and optional element metadata (e.g. score, GC content, etc.). This information is stored in four slots:

seqnames
a 'factor' Rle object containing the sequence names.
ranges
an IRanges object containing the ranges.
strand
a 'factor' Rle object containing the strand information.
elementMetadata
a DataFrame object containing the annotation columns. Columns cannot be named "seqnames", "ranges", "strand", "seqlevels", "seqlengths", "isCircular", "start", "end", "width", or "element".
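
As a minimal sketch of how the four slots relate to one object, the snippet below builds a small GRanges from made-up data (the sequence names, coordinates, and the score column are illustrative only) and retrieves each component through its accessor rather than by reading the slots directly.

```r
library(GenomicRanges)

# Made-up data: three elements on two sequences, with one annotation column.
gr <- GRanges(seqnames = c("chr1", "chr1", "chr2"),
              ranges   = IRanges(start = c(1, 5, 10), width = 3),
              strand   = c("+", "-", "*"),
              score    = c(0.1, 0.5, 0.9))

seqnames(gr)         # 'factor' Rle of sequence names
ranges(gr)           # IRanges holding the intervals
strand(gr)           # 'factor' Rle of strand values
elementMetadata(gr)  # DataFrame with the 'score' column
```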

Constructor

GRanges(seqnames = Rle(), ranges = IRanges(), strand = Rle("*", length(seqnames)), ..., seqlengths = structure(rep(NA_integer_, length(levels(seqnames))), names = levels(seqnames))): Creates a GRanges object.
seqnames
Rle object, character vector, or factor containing the sequence names.
ranges
IRanges object containing the ranges.
strand
Rle object, character vector, or factor containing the strand information.
seqlengths
a named integer vector containing the sequence lengths, one for each level in levels(seqnames).
...
Optional annotation columns for the elementMetadata slot. These columns cannot be named "start", "end", "width", or "element".
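
The constructor arguments above can be sketched as follows. The data are invented for illustration; note that seqlengths comes after ... in the signature and therefore has to be passed by name, while any other named argument (here score) becomes an annotation column.

```r
library(GenomicRanges)

# Illustrative values only; names of 'seqlengths' follow levels(seqnames).
gr <- GRanges(seqnames   = c("chr1", "chr1", "chr2"),
              ranges     = IRanges(start = c(1, 20, 5), end = c(10, 30, 15)),
              strand     = c("+", "-", "-"),
              seqlengths = c(chr1 = 100L, chr2 = 50L),
              score      = c(3, 7, 9))

seqlengths(gr)             # named integer vector, one entry per sequence level
elementMetadata(gr)$score  # the annotation column supplied through '...'
```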

Coercion

In the code snippets below, x is a GRanges object and from is the object being coerced.

as(from, "GRanges"): Creates a GRanges object from a RangedData, RangesList or RleList object.
as(from, "RangedData"): Creates a RangedData object from a GRanges object. The strand and the values become columns in the result. The seqlengths(from) and isCircular(from) vectors are stored in the element metadata of ranges(rd).
as(from, "RangesList"): Creates a RangesList object from a GRanges object. The strand and values become element metadata on the ranges. The seqlengths(from) and isCircular(from) vectors are stored in the element metadata.
as.data.frame(x, row.names = NULL, optional = FALSE): Creates a data.frame with columns seqnames (factor), start (integer), end (integer), width (integer), strand (factor), as well as the additional columns stored in elementMetadata(x).
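
A small sketch of the as.data.frame coercion, using invented data: the five fixed columns come first, followed by the annotation columns.

```r
library(GenomicRanges)

# Illustrative two-element object with one annotation column.
gr <- GRanges(seqnames = c("chr1", "chr2"),
              ranges   = IRanges(start = c(1, 11), end = c(10, 15)),
              strand   = c("+", "-"),
              score    = c(5L, 8L))

df <- as.data.frame(gr)
colnames(df)  # seqnames, start, end, width, strand, then 'score'
```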

Accessors

In the following code snippets, x is a GRanges object.

length(x): Gets the number of elements.
seqnames(x), seqnames(x) <- value: Gets or sets the sequence names. value can be an Rle object, a character vector, or a factor.
ranges(x), ranges(x) <- value: Gets or sets the ranges. value can be a Ranges object.
names(x), names(x) <- value: Gets or sets the names of the elements.
strand(x), strand(x) <- value: Gets or sets the strand. value can be an Rle object, character vector, or factor.
elementMetadata(x), elementMetadata(x) <- value: Gets or sets the optional data columns. value can be a DataFrame, data.frame object, or NULL.
values(x), values(x) <- value: Alternative to elementMetadata functions.
seqinfo(x), seqinfo(x) <- value: Gets or sets the information about the underlying sequences. value must be a Seqinfo object.
seqlevels(x), seqlevels(x) <- value: Gets or sets the sequence levels. seqlevels(x) is equivalent to seqlevels(seqinfo(x)) or to levels(seqnames(x)); these two expressions are guaranteed to return identical character vectors on a GRanges object. value must be a character vector with no NAs.
seqlengths(x), seqlengths(x) <- value: Gets or sets the sequence lengths. seqlengths(x) is equivalent to seqlengths(seqinfo(x)). value can be a named non-negative integer or numeric vector, possibly with NAs.
isCircular(x), isCircular(x) <- value: Gets or sets the circularity flags. isCircular(x) is equivalent to isCircular(seqinfo(x)). value must be a named logical vector, possibly with NAs.
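
The getters and setters above might be exercised as in the sketch below. The object and all values are made up, and the names of the replacement vectors are given in the same order as seqlevels(gr) as a precaution.

```r
library(GenomicRanges)

# Illustrative object; element names are taken from the IRanges.
gr <- GRanges(seqnames = c("chr1", "chr1", "chr2"),
              ranges   = IRanges(start = c(1, 5, 10), width = 3,
                                 names = c("a", "b", "c")),
              strand   = c("+", "-", "*"))

length(gr)
names(gr)

# seqlevels(x) agrees with levels(seqnames(x)).
same_levels <- identical(seqlevels(gr), levels(seqnames(gr)))

# Set per-sequence information with named vectors.
seqlengths(gr) <- c(chr1 = 1000L, chr2 = 500L)
isCircular(gr) <- c(chr1 = FALSE, chr2 = NA)

# values() is the alternative spelling of elementMetadata().
values(gr) <- DataFrame(score = c(1, 2, 3))
```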

Ranges methods

In the following code snippets, x is a GRanges object.

start(x), start(x) <- value: Gets or sets start(ranges(x)).
end(x), end(x) <- value: Gets or sets end(ranges(x)).
width(x), width(x) <- value: Gets or sets width(ranges(x)).
flank(x, width, start = TRUE, both = FALSE, use.names = TRUE): Returns a new GRanges object containing intervals of width width that flank the intervals in x. The start argument takes a logical indicating whether x should be flanked at the "start" (TRUE) or the "end" (FALSE), which for strand(x) != "-" is start(x) and end(x) respectively and for strand(x) == "-" is end(x) and start(x) respectively. The both argument takes a single logical value indicating whether the flanking region extends width positions into the range. If both = TRUE, the resulting range thus straddles the end point, with width positions on either side.
resize(x, width, use.names = TRUE): Returns a new GRanges object containing intervals that have been resized to width width based on the strand(x) values. Elements where strand(x) == "+" are anchored at start(x), elements where strand(x) == "-" are anchored at end(x), and elements where strand(x) == "*" are anchored at (end(x) - start(x))%/%2. The use.names argument determines whether or not to keep the names on the ranges.
shift(x, shift, use.names = TRUE): Returns a new GRanges object containing intervals with start and end values that have been shifted by integer vector shift. The use.names argument determines whether or not to keep the names on the ranges.
disjoin(x): Returns a new GRanges object containing disjoint ranges for each distinct (seqname, strand) pairing. The names (names(x)) and the columns in x are dropped.
gaps(x, start = 1L, end = seqlengths(x)): Returns a new GRanges object containing complemented ranges for each distinct (seqname, strand) pairing. The names (names(x)) and the columns in x are dropped. See ?gaps for more information about range complements and for a description of the optional arguments.
range(x, ...): Returns a new GRanges object containing range bounds for each distinct (seqname, strand) pairing. The names (names(x)) and the columns in x are dropped.
reduce(x, drop.empty.ranges = FALSE, min.gapwidth = 1L): Returns a new GRanges object containing reduced ranges for each distinct (seqname, strand) pairing. The names (names(x)) and the columns in x are dropped. See ?reduce for more information about range reduction and for a description of the optional arguments.
coverage(x, shift=0L, width=NULL, weight=1L): Returns a named RleList object with one element ('integer' Rle) per underlying sequence in x representing how many times each position in the sequence is covered by the intervals in x.

See ?coverage for the role of optional arguments shift, width and weight.

Here is how those arguments are handled when x is a GRanges object:
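
Separately from coverage, the strand-aware behaviour of flank, resize, and shift, and the per-(seqname, strand) merging done by reduce, can be sketched with a few made-up ranges. Only "+" and "-" strands are used here; the coordinates in the comments follow from the descriptions above.

```r
library(GenomicRanges)

# Two identical intervals (11-20) on opposite strands.
gr <- GRanges(seqnames = c("chr1", "chr1"),
              ranges   = IRanges(start = c(11, 11), end = c(20, 20)),
              strand   = c("+", "-"))

fl <- flank(gr, 5)    # "+": 6-10 (before start(x)); "-": 21-25 (after end(x))
rs <- resize(gr, 5)   # "+": 11-15 (anchored at start); "-": 16-20 (anchored at end)
sh <- shift(gr, 1L)   # both elements become 12-21, regardless of strand

# Overlapping ranges are merged only within the same (seqname, strand) pairing.
ov <- GRanges(seqnames = c("chr1", "chr1", "chr1"),
              ranges   = IRanges(start = c(1, 3, 4), end = c(5, 8, 6)),
              strand   = c("+", "+", "-"))
rd <- reduce(ov)      # "+" ranges merge into 1-8; the "-" range stays 4-6
```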

Splitting and Combining

In the code snippets below, x is a GRanges object.

append(x, values, after = length(x)): Inserts the values into x at the position given by after, where x and values are of the same class.
c(x, ...): Combines x and the GRanges objects in ... together. Any object in ... must belong to the same class as x, or to one of its subclasses, or must be NULL. The result is an object of the same class as x.
c(x, ..., .ignoreElementMetadata=TRUE): If the GRanges objects have associated elementMetadata (also known as values), each such DataFrame must have the same columns in order to combine successfully. To circumvent this restriction, you can pass .ignoreElementMetadata=TRUE, which combines all the objects into one and drops all of their elementMetadata.
split(x, f = seq_len(length(x)), drop = FALSE): Splits x into a GRangesList, according to f, dropping elements corresponding to unrepresented levels if drop is TRUE. Split factor f defaults to splitting each element of x into a separate element in the resulting GRangesList object.
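
A minimal sketch of combining and splitting, on invented data. Both objects carry the same annotation column so that c() can combine them without dropping metadata, and the split factor is derived from the sequence names (one possible choice of f, not the default).

```r
library(GenomicRanges)

# Two illustrative objects with matching annotation columns.
gr1 <- GRanges(seqnames = c("chr1", "chr2"),
               ranges   = IRanges(start = c(1, 10), width = 5),
               strand   = c("+", "-"),
               score    = c(1, 2))
gr2 <- GRanges(seqnames = c("chr1", "chr2"),
               ranges   = IRanges(start = c(20, 30), width = 5),
               strand   = c("+", "+"),
               score    = c(3, 4))

both   <- c(gr1, gr2)                                # four elements, 'score' kept
by_seq <- split(both, as.character(seqnames(both)))  # GRangesList, one element per sequence
```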

Subsetting

In the code snippets below, x is a GRanges object.

x[i, j]: Gets elements i with optional elementMetadata columns elementMetadata(x)[,j], where i can be missing; an NA-free logical, numeric, or character vector; or a 'logical' Rle object.
x[i, j] <- value: Replaces elements i and optional elementMetadata columns j with value.
head(x, n = 6L): If n is non-negative, returns the first n elements of the GRanges object. If n is negative, returns all but the last abs(n) elements of the GRanges object.
rep(x, times, length.out, each): Repeats the values in x through one of the following conventions:
times
Vector giving the number of times to repeat each element if of length length(x), or to repeat the whole vector if of length 1.
length.out
Non-negative integer. The desired length of the output vector.
each
Non-negative integer. Each element of x is repeated each times.
seqselect(x, start=NULL, end=NULL, width=NULL): Similar to window, except that multiple consecutive subsequences can be requested for concatenation. As such, two of the three start, end, and width arguments can be used to specify the consecutive subsequences. Alternatively, start can take a Ranges object or something that can be converted to a Ranges object, such as an integer vector, logical vector, or logical Rle. If concatenating the consecutive subsequences is undesirable, consider using Views.
seqselect(x, start=NULL, end=NULL, width=NULL) <- value: Similar to window<-, except that multiple consecutive subsequences can be replaced with a value whose length is a divisor of the number of elements it is replacing. As such, two of the three start, end, and width arguments can be used to specify the consecutive subsequences. Alternatively, start can take a Ranges object or something that can be converted to a Ranges object, such as an integer vector, logical vector, or logical Rle.
subset(x, subset): Returns a new object of the same class as x made of the subset using logical vector subset, where missing values are taken as FALSE.
tail(x, n = 6L): If n is non-negative, returns the last n elements of the GRanges object. If n is negative, returns all but the first abs(n) elements of the GRanges object.
window(x, start = NA, end = NA, width = NA, frequency = NULL, delta = NULL, ...): Extracts the subsequence window from the GRanges object using:
start, end, width
The start, end, or width of the window. Two of the three are required.
frequency, delta
Optional arguments that specify the sampling frequency and increment within the window.
In general, this is more efficient than using the "[" operator.
window(x, start = NA, end = NA, width = NA, keepLength = TRUE) <- value: Replaces the subsequence window specified on the left (i.e. the subsequence in x specified by start, end and width) by value. value must either be of class class(x), belong to a subclass of class(x), be coercible to class(x), or be NULL. If keepLength is TRUE, the elements of value are repeated to create a GRanges object with the same number of elements as the width of the subsequence window it is replacing. If keepLength is FALSE, this replacement method can modify the length of x, depending on how the length of the left subsequence window compares to the length of value.
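
Some of the subsetting operations above can be sketched on a small, made-up object. The element names "a" to "d" and the score column are illustrative; seqselect and the replacement forms are left out of this sketch.

```r
library(GenomicRanges)

# Four illustrative elements, named "a" to "d".
gr <- GRanges(seqnames = c("chr1", "chr1", "chr2", "chr2"),
              ranges   = IRanges(start = c(1, 11, 21, 31), width = 5,
                                 names = c("a", "b", "c", "d")),
              strand   = c("+", "-", "+", "-"),
              score    = 1:4)

by_pos  <- gr[2:3]                   # numeric index: elements "b" and "c"
by_name <- gr[c("a", "d")]           # character index
plus    <- gr[strand(gr) == "+"]     # 'logical' Rle index: elements "a" and "c"
sub_col <- gr[1:2, "score"]          # elements 1-2, keeping only the 'score' column

first2  <- head(gr, 2)               # first two elements
no1     <- tail(gr, -1)              # all but the first element
twice   <- rep(gr, times = 2)        # whole object repeated twice
win     <- window(gr, start = 2, end = 3)  # elements 2 through 3
```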

Author(s)

P. Aboyoun

See Also

GRangesList-class, seqinfo, Seqinfo-class, Vector-class, Ranges-class, Rle-class, DataFrame-class

Examples

  gr <- GRanges(seqnames = Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
                ranges   = IRanges(1:10, width = 10:1, names = head(letters, 10)),
                strand   = Rle(strand(c("-", "+", "*", "+", "-")),
                               c(1, 2, 2, 3, 2)),
                score    = 1:10,
                GC       = seq(1, 0, length.out = 10))
  gr

  # Summarizing elements
  table(seqnames(gr))
  sum(width(gr))
  summary(elementMetadata(gr)[,"score"]) # or values(gr)
  coverage(gr)

  # Renaming the underlying sequences
  seqlevels(gr)
  seqlevels(gr) <- sub("chr", "Chrom", seqlevels(gr))
  gr

  # Intra-interval operations
  flank(gr, 10)
  resize(gr, 10)
  shift(gr, 1)

  # Inter-interval operations
  disjoin(gr)
  gaps(gr, start = 1, end = 10)
  range(gr)
  reduce(gr)
  
  # Combining objects
  gr2 <- GRanges(seqnames=Rle(c('Chrom1', 'Chrom2', 'Chrom3'), c(3, 3, 4)),
                 IRanges(1:10, width=5), strand='-',
                 score=101:110, GC = runif(10))
  gr3 <- GRanges(seqnames=Rle(c('Chrom1', 'Chrom2', 'Chrom3'), c(3, 4, 3)),
                 IRanges(101:110, width=10), strand='-',
                 score=21:30)
  some.gr <- c(gr, gr2)
  
  ## all.gr <- c(gr, gr2, gr3) ## (This would fail)
  all.gr <- c(gr, gr2, gr3, .ignoreElementMetadata=TRUE)

[Package GenomicRanges version 1.4.6 Index]