OverlapEncodings-class {IRanges} | R Documentation |
The OverlapEncodings class is a container for storing the
"overlap encodings" returned by the encodeOverlaps
function.
## OverlapEncodings accessors:

## S4 method for signature 'OverlapEncodings'
length(x)
## S4 method for signature 'OverlapEncodings'
Loffset(x)
## S4 method for signature 'OverlapEncodings'
Roffset(x)
## S4 method for signature 'OverlapEncodings'
encoding(x)
## S4 method for signature 'OverlapEncodings'
levels(x)
## S4 method for signature 'OverlapEncodings'
flippedQuery(x)
## S4 method for signature 'OverlapEncodings'
Lencoding(x)
## S4 method for signature 'OverlapEncodings'
Rencoding(x)
## S4 method for signature 'OverlapEncodings'
ngap(x)
## S4 method for signature 'OverlapEncodings'
Lngap(x)
## S4 method for signature 'OverlapEncodings'
Rngap(x)

## Coercing an OverlapEncodings object:

## S4 method for signature 'OverlapEncodings'
as.data.frame(x, row.names=NULL, optional=FALSE, ...)

## Low-level related utilities:

## S4 method for signature 'character'
Lencoding(x)
## S4 method for signature 'character'
Rencoding(x)
## S4 method for signature 'character'
ngap(x)
## S4 method for signature 'character'
Lngap(x)
## S4 method for signature 'character'
Rngap(x)
## S4 method for signature 'factor'
Lencoding(x)
## S4 method for signature 'factor'
Rencoding(x)
## S4 method for signature 'factor'
ngap(x)
## S4 method for signature 'factor'
Lngap(x)
## S4 method for signature 'factor'
Rngap(x)
x | An OverlapEncodings object. For the low-level utilities, x can also be a character vector or factor containing encodings. |
row.names | NULL or a character vector. |
optional, ... | Ignored. |
Given a query and a subject of the same length, both list-like objects whose top-level elements typically contain multiple ranges (e.g. RangesList objects), the "overlap encoding" of the i-th element in query and the i-th element in subject is a character string describing how the ranges in query[[i]] are qualitatively positioned relative to the ranges in subject[[i]].
The encodeOverlaps function computes those overlap encodings and returns them in an OverlapEncodings object of the same length as query and subject.
Working with overlap encodings is covered in detail in the "Overlap encodings" vignette in the GenomicRanges package.
In the following code snippets, x is an OverlapEncodings object typically obtained by a call to encodeOverlaps(query, subject).
length(x): Get the number of elements (i.e. encodings) in x. This is equal to length(query) and length(subject).
Loffset(x), Roffset(x): Get the "left offsets" and "right offsets" of the encodings, respectively. Both are integer vectors of the same length as x.
Let's denote Qi = query[[i]], Si = subject[[i]], and [q1,q2] the range covered by Qi, i.e. q1 = min(start(Qi)) and q2 = max(end(Qi)). Then Loffset(x)[i] is the number L of ranges at the head of Si that are strictly to the left of all the ranges in Qi, i.e. L is the greatest value such that end(Si)[k] < q1 - 1 for all k in seq_len(L).
Similarly, Roffset(x)[i] is the number R of ranges at the tail of Si that are strictly to the right of all the ranges in Qi, i.e. R is the greatest value such that start(Si)[length(Si) + 1 - k] > q2 + 1 for all k in seq_len(R).
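The definitions of L and R above can be illustrated with a minimal base R sketch. It works on plain integer vectors of starts and ends rather than on range objects, and offsets_sketch is a hypothetical helper written for this illustration, not a function provided by IRanges.

```r
# Hypothetical helper (not part of IRanges): count the ranges at the head and
# tail of a subject that lie strictly outside the span [q1, q2] of the query.
offsets_sketch <- function(q_start, q_end, s_start, s_end) {
  q1 <- min(q_start)
  q2 <- max(q_end)
  n <- length(s_start)
  # L: length of the leading run of subject ranges with end < q1 - 1
  left_ok <- s_end < q1 - 1
  L <- if (all(left_ok)) n else which(!left_ok)[1] - 1L
  # R: length of the trailing run of subject ranges with start > q2 + 1
  right_ok <- rev(s_start > q2 + 1)
  R <- if (all(right_ok)) n else which(!right_ok)[1] - 1L
  c(Loffset = L, Roffset = R)
}

# Made-up data: the query covers [20, 35]; the subject has 6 ranges.
res <- offsets_sketch(
  q_start = c(20L, 30L), q_end = c(25L, 35L),
  s_start = c(1L, 10L, 19L, 28L, 50L, 70L),
  s_end   = c(5L, 18L, 22L, 40L, 60L, 80L)
)
res
```

Here the first two subject ranges end before q1 - 1 = 19 and the last two start after q2 + 1 = 36, so both offsets are 2.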
encoding(x): Factor of the same length as x where the i-th element is the encoding obtained by comparing each range in Qi with all the ranges in tSi = Si[(1+L):(length(Si)-R)] (tSi stands for "trimmed Si"). More precisely, here is how this encoding is obtained:
All the ranges in Qi are compared with tSi[1], then with tSi[2], etc.
At each step (one step per range in tSi), comparing all the ranges in Qi with tSi[k] is done with rangeComparisonCodeToLetter(compare(Qi, tSi[k])). So at each step, we end up with a vector of M single letters (where M is length(Qi)).
Each vector obtained previously (1 vector per range in tSi, all of them of length M) is turned into a single string (called "encoding block") by pasting its individual letters together.
All the encoding blocks (1 per range in tSi) are pasted together into a single long string and separated by colons (":"). An additional colon is prepended to the long string and another one appended to it.
Finally, a special block containing the value of M is prepended to the long string. The final string is the encoding.
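The pasting steps above can be sketched in base R as follows. The helper assemble_encoding_sketch is hypothetical, and the letters used as input are made up for illustration; in practice each letter vector would come from rangeComparisonCodeToLetter(compare(Qi, tSi[k])).

```r
# Hypothetical helper (not part of IRanges): build an encoding string from
# the per-step letter vectors. 'steps' holds one character vector per range
# in tSi, each of length M.
assemble_encoding_sketch <- function(steps, M) {
  stopifnot(all(lengths(steps) == M))
  # One encoding block per step: paste the M letters together.
  blocks <- vapply(steps, paste, character(1), collapse = "")
  # Colon-separated blocks, wrapped in colons, with the special block M in front.
  paste0(M, ":", paste(blocks, collapse = ":"), ":")
}

# Illustrative input: M = 2 query ranges compared with 2 trimmed subject ranges.
enc <- assemble_encoding_sketch(list(c("j", "m"), c("a", "f")), M = 2L)
enc
```

With this input the two blocks are "jm" and "af", and the resulting string is "2:jm:af:".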
levels(x): Equivalent to levels(encoding(x)).
flippedQuery(x): Whether or not the top-level element in query used for computing the encoding was "flipped" before the encoding was computed. Note that this flipping generally affects the "left offset" and the "right offset", in addition to the encoding itself.
Lencoding(x), Rencoding(x): Extract the "left encodings" and "right encodings" of paired-end encodings.
Paired-end encodings are obtained by encoding paired-end overlaps, i.e. overlaps between paired-end reads and transcripts (typically). The difference between a single-end encoding and a paired-end encoding is that all the blocks in the latter contain a "--" separator to mark the separation between the "left encoding" and the "right encoding".
See the "Overlap encodings" vignette in the GenomicRanges package for examples of paired-end encodings.
ngap(x), Lngap(x), Rngap(x): Extract the number of gaps in each encoding by looking at its first block (aka special block). If an element xi in x is a paired-end encoding, then Lngap(xi), Rngap(xi), and ngap(xi) return ngap(Lencoding(xi)), ngap(Rencoding(xi)), and Lngap(xi) + Rngap(xi), respectively.
In the following code snippets, x is an OverlapEncodings object.
as.data.frame(x): Return x as a data frame with columns "Loffset", "Roffset" and "encoding".
H. Pages
The "Overlap encodings" vignette in the GenomicRanges package.
compare for the interpretation of the string returned by encoding.
The RangesList class.
example(encodeOverlaps)  # to make 'ovenc'

length(ovenc)
Loffset(ovenc)
Roffset(ovenc)
encoding(ovenc)
levels(ovenc)
nlevels(ovenc)
flippedQuery(ovenc)
ngap(ovenc)

as.data.frame(ovenc)

ngap(levels(ovenc))