Sequence-class {IRanges}    R Documentation

Sequence objects

Description

The Sequence virtual class serves as the heart of the IRanges package and has over 90 subclasses. It plays a role similar to that of vector in base R. The Sequence class has three slots: elementType, metadata (inherited from the Annotated class), and elementMetadata. Their purposes are described below.

The elementType slot is the preferred location for Sequence subclasses to store the type of data represented in the sequence. It is designed to take a character vector of length 1 naming the class of the sequence elements. The Sequence class itself performs no validity checking based on elementType; a subclass that expects elements of a given type is expected to perform the necessary validity checking. For example, the subclass IntegerList has elementType = "integer" and its validity method checks that this condition holds.

The Sequence class supports the storage of global and element-wise metadata with its metadata and elementMetadata slots. The metadata slot can store a list of metadata pertaining to the whole object and the elementMetadata slot can store a DataTable (or NULL) for element-wise metadata with a row for each element and a column for each metadata variable.

To be functional, a class that inherits from Sequence must define length and names methods as well as one or both of the subscript methods "[[" and "[".
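As a rough illustration of the contract described above, the following sketch defines a small S4 class with an elementType slot, a validity method, and the length, names, "[" and "[[" methods. ToySeq and its values slot are hypothetical names, and the class deliberately does not extend Sequence so that the code runs with only the methods package; it is a sketch of the idea, not how IRanges implements its subclasses.

```r
library(methods)

# Hypothetical stand-in class; it does NOT extend the IRanges Sequence class.
setClass("ToySeq", representation(
    elementType = "character",  # class every element must derive from
    values = "list"             # the elements themselves (assumed storage)
))

# The subclass, not the parent, checks elements against elementType.
setValidity("ToySeq", function(object) {
    ok <- vapply(object@values, is, logical(1), class2 = object@elementType)
    if (all(ok)) TRUE else "all elements must derive from elementType"
})

# The minimum set of methods named in the text above.
setMethod("length", "ToySeq", function(x) length(x@values))
setMethod("names", "ToySeq", function(x) names(x@values))
setMethod("[", "ToySeq", function(x, i, j, ..., drop = TRUE) {
    new("ToySeq", elementType = x@elementType, values = x@values[i])
})
setMethod("[[", "ToySeq", function(x, i, j, ...) x@values[[i]])

s <- new("ToySeq", elementType = "integer", values = list(a = 1:3, b = 4:5))
length(s)       # number of elements
names(s)        # element names
s[["b"]]        # a single element
length(s["a"])  # a ToySeq holding one element
```

Constructing a ToySeq whose values include, say, a character vector should fail the validity check, which mirrors the division of responsibility between Sequence and its subclasses.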

Accessors

In the following code snippets, x is a Sequence object.

length(x): Get the number of elements in x.
NROW(x): Defined as length(x) for any Sequence object that is not a DataTable object. If x is a DataTable object, then it's defined as nrow(x).
names(x), names(x) <- value: Get or set the names of the elements in the Sequence.
elementType(x): Get the scalar string naming the class from which all elements must derive.
elementLengths(x): Get the 'length' of each of the elements.
isEmpty(x): Returns a logical value indicating whether the sequence has no elements or all of its elements are empty.
nlevels(x): Returns the number of factor levels.
metadata(x), metadata(x) <- value: Get or set the list holding arbitrary R objects as annotations. May be, and often is, empty.
elementMetadata(x), elementMetadata(x) <- value: Get or set the DataTable holding local metadata on each element. The rows are named according to the names of the elements. Optional, may be NULL.
values(x), values(x) <- value: Alternative to elementMetadata functions.
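To make the descriptions of elementLengths and isEmpty concrete, here is a sketch of the same computations on a plain R list. The variable names are illustrative, and treating a list as a stand-in for a list-like Sequence is an assumption; the real accessors dispatch on Sequence subclasses.

```r
# A plain list standing in for a list-like Sequence (assumption).
x <- list(a = 1:3, b = integer(0), c = c(2.5, 4))

# Analogue of elementLengths(x): the length of each element.
elt_lengths <- vapply(x, length, integer(1))

# Analogue of isEmpty(x): no elements at all, or every element empty.
is_empty <- length(x) == 0L || all(elt_lengths == 0L)

elt_lengths
is_empty
```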

Subsetting

In the code snippets below, x is a Sequence object or regular R vector object. The R vector object methods for window and seqselect are defined in this package and the remaining methods are defined in base R.

x[i, drop=TRUE]: If defined, returns a new Sequence object made of selected elements i, which can be missing; an NA-free logical, numeric, or character vector; or a logical Rle object. The drop argument specifies whether or not to coerce the returned sequence to a standard vector.
x[i] <- value: Equivalent to seqselect(x, i) <- value.
window(x, start = NA, end = NA, width = NA, frequency = NULL, delta = NULL, ...): Extract the subsequence window from the Sequence object using:
start, end, width
The start, end, or width of the window. Two of the three are required.
frequency, delta
Optional arguments that specify the sampling frequency and increment within the window.
In general, this is more efficient than using the "[" operator.
window(x, start = NA, end = NA, width = NA, keepLength = TRUE) <- value: Replace the subsequence window specified on the left (i.e. the subsequence in x specified by start, end and width) by value. value must either be of class class(x), belong to a subclass of class(x), be coercible to class(x), or be NULL. If keepLength is TRUE, the elements of value are repeated to create a Sequence with the same number of elements as the width of the subsequence window it is replacing. If keepLength is FALSE, this replacement method can modify the length of x, depending on how the length of the left subsequence window compares to the length of value.
seqselect(x, start=NULL, end=NULL, width=NULL): Similar to window, except that multiple consecutive subsequences can be requested for concatenation. As such, two of the three start, end, and width arguments can be used to specify the consecutive subsequences. Alternatively, start can take a Ranges object or something that can be converted to a Ranges object, such as an integer vector, logical vector or logical Rle. If the concatenation of the consecutive subsequences is undesirable, consider using Views.
seqselect(x, start=NULL, end=NULL, width=NULL) <- value: Similar to window<-, except that multiple consecutive subsequences can be replaced by a value whose length is a divisor of the number of elements it is replacing. As such, two of the three start, end, and width arguments can be used to specify the consecutive subsequences. Alternatively, start can take a Ranges object or something that can be converted to a Ranges object, such as an integer vector, logical vector or logical Rle.
head(x, n = 6L): If n is non-negative, returns the first n elements of the Sequence object. If n is negative, returns all but the last abs(n) elements of the Sequence object.
tail(x, n = 6L): If n is non-negative, returns the last n elements of the Sequence object. If n is negative, returns all but the first abs(n) elements of the Sequence object.
rev(x): Return a new Sequence object made of the original elements in the reverse order.
rep(x, times, length.out, each), rep.int(x, times): Repeats the values in x through one of the following conventions:
times
Vector giving the number of times to repeat each element if of length length(x), or to repeat the whole vector if of length 1.
length.out
Non-negative integer. The desired length of the output vector.
each
Non-negative integer. Each element of x is repeated each times.
subset(x, subset): Return a new Sequence object made of the subset using logical vector subset, where missing values are taken as FALSE.
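The window, window<- and seqselect semantics described above can be sketched on plain vectors as follows. The helpers win, win_replace and seq_select are hypothetical base-R stand-ins written for this illustration, not the IRanges methods; they ignore frequency and delta and assume start <= end. The head, tail and rep calls are the ordinary base R functions.

```r
# Hypothetical helpers on plain vectors; they are not the IRanges methods.

# Extract one window given exactly two of start, end, width.
win <- function(x, start = NA, end = NA, width = NA) {
    stopifnot(sum(is.na(c(start, end, width))) == 1L)
    if (is.na(start)) start <- end - width + 1L
    if (is.na(width)) width <- end - start + 1L
    x[seq.int(start, length.out = width)]
}

# Replace the window [start, end]; with keepLength = TRUE the value is
# recycled to the window width so that length(x) is unchanged.
win_replace <- function(x, start, end, value, keepLength = TRUE) {
    if (keepLength) value <- rep(value, length.out = end - start + 1L)
    c(x[seq_len(start - 1L)],
      value,
      x[seq.int(end + 1L, length.out = length(x) - end)])
}

# Concatenate several consecutive windows, as seqselect is described to do.
seq_select <- function(x, start, end) {
    x[unlist(Map(seq.int, start, end))]
}

x <- letters[1:10]
win(x, start = 3, width = 4)          # "c" "d" "e" "f"
win(x, end = 10, width = 2)           # "i" "j"
seq_select(x, c(1, 8), c(2, 10))      # "a" "b" "h" "i" "j"
win_replace(1:6, 2, 4, 0L)            # 1 0 0 0 5 6
win_replace(1:6, 2, 4, 0L, FALSE)     # 1 0 5 6 (length changes)

head(x, -8)                           # all but the last 8: "a" "b"
tail(x, -8)                           # all but the first 8: "i" "j"
rep(1:2, times = c(2, 1))             # 1 1 2
rep(1:2, each = 2)                    # 1 1 2 2
rep(1:2, length.out = 3)              # 1 2 1
```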

Element extraction (list style)

In the code snippets below, x is a Sequence object.

x[[i]]: If defined, return the selected element i, where i is a numeric or character vector of length 1.
x$name: Similar to x[[name]], but name is taken literally as an element name.

Combining

In the code snippets below, x is a Sequence object.

c(x, ...): Combine x and the Sequence objects in ... together. Any object in ... must belong to the same class as x, or to one of its subclasses, or must be NULL. The result is an object of the same class as x.
append(x, values, after = length(x)): Insert the Sequence values into x at the position given by after. values must have an elementType that extends that of x.
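The role of the after argument is easiest to see on ordinary vectors, where base R's c and append behave in the way described above for Sequence objects. This is only an analogy on plain integer vectors; the elementType and class constraints on Sequence objects are not modelled here.

```r
x <- 1:5
values <- c(0L, 0L)

combined <- c(x, values)                 # values go at the end
inserted <- append(x, values, after = 2) # values go after the 2nd element
at_front <- append(x, values, after = 0) # values go before everything

combined   # 1 2 3 4 5 0 0
inserted   # 1 2 0 0 3 4 5
at_front   # 0 0 1 2 3 4 5
```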

Looping

In the code snippets below, x is a Sequence object.

lapply(X, FUN, ...): Like the standard lapply function defined in the base package, the lapply method for Sequence objects returns a list of the same length as X, with each element being the result of applying FUN to the corresponding element of X.
sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE): Like the standard sapply function defined in the base package, the sapply method for Sequence objects is a user-friendly version of lapply that, by default, returns a vector or matrix if appropriate.
mapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE): Like the standard mapply function defined in the base package, the mapply method for Sequence objects is a multivariate version of sapply.
tapply(X, INDEX, FUN = NULL, ..., simplify = TRUE): Like the standard tapply function defined in the base package, the tapply method for Sequence objects applies a function to each cell of a ragged array, that is to each (non-empty) group of values given by a unique combination of the levels of certain factors.
endoapply(X, FUN, ...): Similar to lapply, but performs an endomorphism, i.e. returns an object of class(X).
mendoapply(FUN, ..., MoreArgs = NULL): Similar to mapply, but performs an endomorphism across multiple objects, i.e. returns an object of class(list(...)[[1]]).
shiftApply(SHIFT, X, Y, FUN, ..., OFFSET = 0L, simplify = TRUE, verbose = FALSE): Let i be the indices in SHIFT, X_i = window(X, 1 + OFFSET, length(X) - SHIFT[i]), and Y_i = window(Y, 1 + SHIFT[i], length(Y) - OFFSET). Calculates the set of FUN(X_i, Y_i, ...) values and returns the results in a convenient form:
SHIFT
A non-negative integer vector of shift values.
X, Y
The Sequence or R vector objects to shift.
FUN
The function, found via match.fun, to be applied to each set of shifted vectors.
...
Further arguments for FUN.
OFFSET
A non-negative integer offset to maintain throughout the shift operations.
simplify
A logical value specifying whether or not the result should be simplified to a vector or matrix if possible.
verbose
A logical value specifying whether or not to print the i indices to track the iterations.
aggregate(x, by, FUN, start = NULL, end = NULL, width = NULL, frequency = NULL, delta = NULL, ..., simplify = TRUE): Generates summaries on the specified windows and returns the result in a convenient form:
by
An object with start, end, and width methods.
FUN
The function, found via match.fun, to be applied to each window of x.
start, end, width
The start, end, or width of the window. If by is missing, two of the three must be supplied.
frequency, delta
Optional arguments that specify the sampling frequency and increment within the window.
...
Further arguments for FUN.
simplify
A logical value specifying whether or not the result should be simplified to a vector or matrix if possible.
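The definitions of endoapply, shiftApply and aggregate above can be sketched in base R on ordinary vectors. The functions shift_apply and window_aggregate are hypothetical helpers written for this illustration; they follow the X_i and Y_i formulas given for shiftApply and the start/end form of aggregate, omit the simplify, verbose, by, frequency and delta arguments, and are not the IRanges implementations. The data.frame example is only an analogy for the endomorphism idea.

```r
# Endomorphism analogy: lapply on a data.frame returns a plain list, while
# assigning into df[] keeps the original class, as endoapply is said to do.
df <- data.frame(a = 1:2, b = 3:4)
plain <- lapply(df, rev)
endo <- df
endo[] <- lapply(df, rev)
class(plain)  # "list"
class(endo)   # "data.frame"

# Hypothetical sketch of shiftApply on plain vectors.
shift_apply <- function(SHIFT, X, Y, FUN, ..., OFFSET = 0L) {
    FUN <- match.fun(FUN)
    sapply(SHIFT, function(s) {
        # X_i = window(X, 1 + OFFSET, length(X) - s)
        Xi <- X[seq.int(1L + OFFSET, length.out = length(X) - s - OFFSET)]
        # Y_i = window(Y, 1 + s, length(Y) - OFFSET)
        Yi <- Y[seq.int(1L + s, length.out = length(Y) - OFFSET - s)]
        FUN(Xi, Yi, ...)
    })
}

# Hypothetical sketch of aggregate over windows given by start and end.
window_aggregate <- function(x, start, end, FUN, ...) {
    FUN <- match.fun(FUN)
    mapply(function(s, e) FUN(x[s:e], ...), start, end)
}

X <- c(1, 2, 3, 4, 5)
lagged <- shift_apply(0:2, X, X, function(a, b) sum(a * b))
lagged    # 55 40 26

sums <- window_aggregate(1:10, start = c(1, 6), end = c(5, 10), FUN = sum)
sums      # 15 40
```

With X and Y equal, the shift_apply call computes unnormalized lagged cross-products, which is one typical use of a shifted apply.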

Coercion

In the code snippets below, x is a Sequence object.

as.env(x, enclos = parent.frame()): Creates an environment from x with a symbol for each name in names(x). The values are not actually copied into the environment. Rather, they are dynamically bound using makeActiveBinding. This prevents unnecessary copying of the data from the external vectors into R vectors. The values are cached, so that the data is not copied every time the symbol is accessed.
as.list(x, ...), as(from, "list"): Turns x into a standard list.
stack(x, indName = "space", valuesName = "values"): As with stack on a list, constructs a DataFrame with two columns: one for the unlisted values, the other indicating the name of the element from which each value was obtained. indName specifies the column name for the index (source name) column and valuesName specifies the column name for the values.
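The active-binding and caching behaviour described for as.env can be sketched with base R's makeActiveBinding. The function as_env_sketch and its internal cache environment are hypothetical and written for a plain named list; the actual as.env method may be organized differently. The stack call at the end is the standard list method from the utils package, shown for comparison with the description above.

```r
# Hypothetical sketch: bind each name lazily and cache the value on first use.
as_env_sketch <- function(x, enclos = parent.frame()) {
    env <- new.env(parent = enclos)
    cache <- new.env(parent = emptyenv())
    for (nm in names(x)) {
        local({
            name <- nm  # capture the current name for this binding
            makeActiveBinding(name, function() {
                if (!exists(name, envir = cache, inherits = FALSE))
                    assign(name, x[[name]], envir = cache)  # extract once
                get(name, envir = cache, inherits = FALSE)
            }, env)
        })
    }
    env
}

e <- as_env_sketch(list(a = 1:3, b = "z"))
bindingIsActive("a", e)      # TRUE: nothing was copied up front
total <- eval(quote(sum(a)), e)
total                        # 6

# stack on a plain list: one column of values, one naming the source element.
st <- stack(list(a = 1:2, b = 3L))
st
```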

Functional Programming

The R base package defines some higher-order functions that are commonly found in functional programming languages. See ?Reduce for the details, and, in particular, for a description of their arguments. The IRanges package provides methods for Sequence objects, so, in addition to being a vector, the x argument can also be a Sequence object.

Reduce(f, x, init, right = FALSE, accumulate = FALSE): Uses a binary function to successively combine the elements of x and a possibly given initial value. See ?Reduce (in the base package) for the details.
Filter(f, x): Extracts the elements of x for which function f is TRUE. See ?Filter (in the base package) for the details.
Find(f, x, right = FALSE, nomatch = NULL): Extracts the first or last element of x for which function f is TRUE. See ?Find (in the base package) for the details.
Map(f, ...): Applies a function to the corresponding elements of given Sequence objects. See ?Map (in the base package) for the details.
Position(f, x, right = FALSE, nomatch = NA_integer_): Returns the position of the first or last element of x for which function f is TRUE. See ?Position (in the base package) for the details.
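For reference, this is how the five functions behave on ordinary R vectors and lists; according to the text above, the Sequence methods accept a Sequence object in place of x. The example uses only base R and a plain list as a stand-in, so it does not exercise the IRanges methods themselves.

```r
x <- list(a = 1:3, b = integer(0), c = 4:6)
non_empty <- function(e) length(e) > 0

running <- Reduce(`+`, 1:4, accumulate = TRUE)  # running sums
kept <- Filter(non_empty, x)                    # drops the empty element
last_hit <- Find(non_empty, x, right = TRUE)    # last non-empty element
first_empty <- Position(function(e) length(e) == 0, x)
paired <- Map(function(u, v) u + v, list(1, 2), list(10, 20))

running       # 1 3 6 10
names(kept)   # "a" "c"
last_hit      # 4 5 6
first_empty   # 2
paired        # list(11, 22)
```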

Evaluating

In the code snippets below, envir and data are Sequence objects.

eval(expr, envir, enclos = parent.frame()): Converts the Sequence object specified in envir to an environment using as.env, with enclos as its parent, and then evaluates expr within that environment.
with(data, expr, ...): Equivalent to eval(quote(expr), data, ...).
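Base R already supports the same pattern for lists, which gives a feel for how eval and with are meant to be used with Sequence objects. The list below is only a stand-in; with a Sequence object, the conversion to an environment would go through as.env as described above.

```r
data <- list(a = 1:3, b = 2)

# Evaluate an expression using the elements of data as variables.
r1 <- eval(quote(a * b), data)

# with() is the more convenient spelling of the same evaluation.
r2 <- with(data, a * b)

r1   # 2 4 6
r2   # 2 4 6
```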

Author(s)

P. Aboyoun

See Also

Annotated, DataTable, SimpleList, Ranges, Rle, XVector for example implementations

Examples

  library(IRanges)       # Sequence is defined in this package
  showClass("Sequence")  # shows (some of) the known subclasses

[Package IRanges version 1.6.16 Index]