Hits-class {IRanges}    R Documentation

Set of hits between 2 vector-like objects

Description

The Hits class stores a set of "hits" between the elements in one vector-like object (called the "query") and the elements in another (called the "subject"). Currently, Hits are used to represent the result of a call to findOverlaps, though other operations producing "hits" are imaginable.

Details

The as.matrix and as.data.frame methods coerce a Hits object to a two-column matrix or data.frame with one row per hit. The first column holds the index of an element in the query and the second column holds the index of an element in the subject.

The as.table method counts the number of hits for each query element and outputs the counts as a table.

To transpose a Hits object x, so that the subject and query are interchanged, call t(x). This allows, for example, counting the number of hits for each subject element using as.table.
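The counting described above can be sketched in base R without IRanges. This is only an illustration under stated assumptions: the matrix hits is a hand-written stand-in for as.matrix(x), and query_length and subject_length are hypothetical names for the values that queryLength(x) and subjectLength(x) would return. It is not how the package implements as.table or t.

```r
# Hand-written stand-in for as.matrix(x): one row per hit (assumed data).
hits <- cbind(queryHits = c(1L, 1L, 3L), subjectHits = c(1L, 2L, 3L))
query_length <- 3L    # assumed number of query elements
subject_length <- 3L  # assumed number of subject elements

# Hits per query element; tabulate() keeps a zero for elements with no hit.
per_query <- tabulate(hits[, "queryHits"], nbins = query_length)

# Transposing swaps the two columns, so the same count is now per subject.
transposed <- hits[, c("subjectHits", "queryHits"), drop = FALSE]
per_subject <- tabulate(transposed[, 1], nbins = subject_length)

print(per_query)
print(per_subject)
```

Here the second query element has no hit, so its count is zero, while every subject element is hit exactly once.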

Coercion

In the code snippets below, x is a Hits object.

as.matrix(x): Coerces x to a two column integer matrix, with each row representing a hit between a query index (first column) and subject index (second column).

as(x, "DataFrame"): Creates a DataFrame by combining the result of as.matrix(x) with mcols(x).

as.data.frame(x): Attempts to coerce the result of as(x, "DataFrame") to a data.frame.

as.table(x): Counts the number of hits for each query element in x and outputs the counts as a table.

t(x): Interchanges the query and subject in x and returns the transposed Hits object.

as.list(x): Returns a list with an element for each query, where each element contains the indices of the subjects that have a hit with the corresponding query.

as(x, "List"): Like as.list, above.
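The grouping that as.list is described as performing can be approximated in base R. The sketch below assumes a hand-written matrix hits in place of as.matrix(x) and a hypothetical query_length in place of queryLength(x). It illustrates the shape of the result only and makes no claim about the names or class of what the package returns.

```r
# Hand-written stand-in for as.matrix(x): one row per hit (assumed data).
hits <- cbind(queryHits = c(1L, 1L, 3L), subjectHits = c(1L, 2L, 3L))
query_length <- 3L  # assumed number of query elements

# Use every query index as a factor level so that a query with no hit
# still gets an (empty) list element.
query_factor <- factor(hits[, "queryHits"], levels = seq_len(query_length))

# One list element per query, holding the subject indices it hits.
by_query <- split(hits[, "subjectHits"], query_factor)

str(by_query)
```

The second element is an empty integer vector because the second query has no hit.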

Extraction

x[i]: Extracts a subset of the hits. The index argument i may be logical or numeric. If numeric, be sure that i does not contain any duplicates, which would violate the set property of Hits.
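A numeric index can be checked for duplicates before it is used to subset. The helper below is hypothetical and not part of IRanges; it is a minimal base R sketch of such a guard.

```r
# Hypothetical helper, not part of IRanges: reject a numeric index that
# would select the same hit more than once.
check_hit_index <- function(i) {
  if (is.numeric(i) && anyDuplicated(i) > 0L) {
    stop("numeric index contains duplicates")
  }
  invisible(i)
}

# TRUE when the index passes the check, FALSE when it is rejected.
ok <- tryCatch({ check_hit_index(c(1, 3)); TRUE }, error = function(e) FALSE)
bad <- tryCatch({ check_hit_index(c(1, 3, 3)); TRUE }, error = function(e) FALSE)
```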

Accessors

queryHits(x): Equivalent to as.data.frame(x)[[1]].

subjectHits(x): Equivalent to as.data.frame(x)[[2]].

countQueryHits(x): Counts the number of hits for each query, returning an integer vector.

countSubjectHits(x): Counts the number of hits for each subject, returning an integer vector.

length(x): Gets the number of hits.

queryLength(x), nrow(x): Gets the number of elements in the query.

subjectLength(x), ncol(x): Gets the number of elements in the subject.

Other operations

remapHits(x, query.map=NULL, new.queryLength=NA, subject.map=NULL, new.subjectLength=NA): Remaps the hits in x through a "query map" and/or a "subject map". The query hits are remapped through the "query map", which is specified via the query.map and new.queryLength arguments. The subject hits are remapped through the "subject map", which is specified via the subject.map and new.subjectLength arguments.

The "query map" is conceptually a function (in the mathematical sense) and is also known as the "mapping function". It must be defined on the 1..N interval and take values in the 1..M interval, where N is queryLength(x) and M is the value specified by the user via the new.queryLength argument. Note that this mapping function doesn't need to be injective or surjective. Also, it is not represented by an R function but by an integer vector of length N with no NAs. More precisely, query.map can be NULL (identity map), or a vector of queryLength(x) non-NA integers that are >= 1 and <= new.queryLength, or a factor of length queryLength(x) with no NAs (a factor is treated as an integer vector and, if new.queryLength is missing, it is taken to be the factor's number of levels). Note that a factor will typically be used to represent a mapping function that is not injective.

The same applies to the "subject map".

remapHits returns a Hits object where all the query and subject hits (accessed with queryHits and subjectHits, respectively) have been remapped through the two specified maps. This remapping is only the first step of the transformation and is followed by two additional steps: (2) the removal of duplicated hits, and (3) the reordering of the hits (first by query hits, then by subject hits). Note that if the two maps are injective then the remapping won't introduce duplicated hits, so, in that case, step (2) is a no-op (but is still performed). Also, if the "query map" is strictly ascending and the "subject map" is ascending then the remapping will preserve the order of the hits, so, in that case, step (3) is also a no-op (but is still performed).
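The three steps can be sketched in base R on a plain two-column matrix. The function remap_hits_sketch is hypothetical and handles only a subject map; it illustrates the remap, deduplicate and reorder sequence and is not the package's implementation of remapHits.

```r
# Hypothetical base R sketch of the three steps, for a subject map only.
# `hits` is a two-column integer matrix (query index, subject index);
# `subject_map` is an integer vector or a factor with one entry per subject.
remap_hits_sketch <- function(hits, subject_map) {
  # Step 1: remap each subject index through the map.
  remapped <- cbind(hits[, 1], as.integer(subject_map)[hits[, 2]])
  # Step 2: remove duplicated hits (rows) introduced by a non-injective map.
  remapped <- remapped[!duplicated(remapped), , drop = FALSE]
  # Step 3: reorder the hits, first by query index, then by subject index.
  remapped[order(remapped[, 1], remapped[, 2]), , drop = FALSE]
}

# Hand-written hits (assumed data): (1,1), (1,2), (3,3).
hits <- cbind(c(1L, 1L, 3L), c(1L, 2L, 3L))

from_vector <- remap_hits_sketch(hits, c(5L, 5L, 4L))
from_factor <- remap_hits_sketch(hits, factor(c("e", "e", "d"), letters[1:5]))

print(from_vector)
```

Because subjects 1 and 2 both map to 5, the first two hits collapse into one in step (2). The factor form yields the same matrix because its integer codes are 5, 5 and 4.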

Author(s)

Michael Lawrence

See Also

findOverlaps, which generates an instance of this class. setops-methods for set operations on Hits objects.

Examples

query <- IRanges(c(1, 4, 9), c(5, 7, 10))
subject <- IRanges(c(2, 2, 10), c(2, 3, 12))
tree <- IntervalTree(subject)
overlaps <- findOverlaps(query, tree)

as.matrix(overlaps)
as.data.frame(overlaps)

as.table(overlaps) # hits per query
as.table(t(overlaps)) # hits per subject

hits1 <- remapHits(overlaps, subject.map=factor(c("e", "e", "d"), letters[1:5]))
hits1
hits2 <- remapHits(overlaps, subject.map=c(5, 5, 4), new.subjectLength=5)
hits2
stopifnot(identical(hits1, hits2))

[Package IRanges version 1.18.0 Index]