basecontent {Biostrings}    R Documentation

Obtain the ATCG content of a gene

Description

WARNING: Both basecontent and countbases have been deprecated in favor of alphabetFrequency.

These functions accept a character vector of nucleotide sequences and count the occurrences of each base (A, C, G, T) in every sequence.

Usage

basecontent(seq)
countbases(seq, dna = TRUE)

Arguments

seq Character vector of nucleotide sequences.
dna Logical value indicating whether the sequences are DNA (TRUE) or RNA (FALSE).

Details

The base counts are calculated separately for each element of seq. The elements of seq can be in upper case, lower case, or mixed case.

Value

A matrix with 4 columns and length(seq) rows. The columns are named A, C, T, G, and the values in each column are the counts of the corresponding base in each element of seq. When dna=FALSE, the T column is replaced with a U column.
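
The behavior described under Details and Value can be illustrated with a small base-R sketch. This is not the Biostrings implementation; count_bases_sketch is a hypothetical helper written only to show the case-insensitive, per-element counting and the shape of the returned matrix. It assumes that characters outside the four bases are simply ignored.

```r
# Hypothetical helper, not part of Biostrings: a base-R illustration only.
count_bases_sketch <- function(seq, dna = TRUE) {
  # Column order follows the Value section; U replaces T for RNA.
  bases <- if (dna) c("A", "C", "T", "G") else c("A", "C", "U", "G")

  # Upper-case first so lower-case and mixed-case input is counted alike,
  # then split each sequence into single characters.
  chars <- strsplit(toupper(seq), "", fixed = TRUE)

  # One count vector per sequence; characters that are not in 'bases'
  # become NA in the factor and are dropped by table() (assumption).
  counts <- t(vapply(chars, function(ch) {
    as.integer(table(factor(ch, levels = bases)))
  }, integer(length(bases))))

  colnames(counts) <- bases
  counts
}

v <- c("AAACT", "GGGTT", "ggAtT")
count_bases_sketch(v)
count_bases_sketch("ACGU", dna = FALSE)
```

Each row of the result corresponds to one element of the input vector, in order. For real work, alphabetFrequency on a DNAStringSet or RNAStringSet remains the recommended route.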

Author(s)

R. Gentleman, W. Huber, S. Falcon

See Also

alphabetFrequency, reverseComplement

Examples

 v <- c("AAACT", "GGGTT", "ggAtT")

 ## Do not use these functions anymore:
 if (interactive()) {
   basecontent(v)
   countbases(v)
 }

 ## Use the more efficient alphabetFrequency() instead:
 v <- DNAStringSet(v)
 alphabetFrequency(v, baseOnly=TRUE)

 ## Comparing efficiencies:
 if (interactive()) {
   library(hgu95av2probe)
   system.time(y1 <- countbases(hgu95av2probe$sequence))
   x <- DNAStringSet(hgu95av2probe)
   system.time(y2 <- alphabetFrequency(x, baseOnly=TRUE))
 }

[Package Biostrings version 2.14.12 Index]