XStringSet-io {Biostrings} | R Documentation |
Functions to read/write an XStringSet or XStringViews object from/to a file.
## XStringSet object: read.BStringSet(file, format) read.DNAStringSet(file, format) read.RNAStringSet(file, format) read.AAStringSet(file, format) write.XStringSet(x, file="", append=FALSE, format, width=80) ## XStringViews object: read.XStringViews(file, format, subjectClass, collapse="") write.XStringViews(x, file="", append=FALSE, format, width=80) ## FASTQ utilities: fastq.geometry(file) ## Some related helper functions: FASTArecordsToCharacter(FASTArecs, use.names=TRUE) CharacterToFASTArecords(x) FASTArecordsToXStringViews(FASTArecs, subjectClass, collapse="") XStringSetToFASTArecords(x)
file |
A character vector with no NAs.
If "" (the default for write.XStringSet and
write.XStringViews ), then the functions write to the standard
output connection (the console) unless redirected by sink .
|
format |
Either "fasta" or "fastq" .
Note that write.XStringSet and write.XStringViews only
support "fasta" for now.
|
x |
For write.XStringSet and write.XStringViews , the object to
write to file .
For CharacterToFASTArecords , the (possibly named) character
vector to be converted to a list of FASTA records as one returned
by readFASTA .
For XStringSetToFASTArecords , the XStringSet object
to be converted to a list of FASTA records as one returned
by readFASTA .
|
append |
TRUE or FALSE . If TRUE output will be
appended to file ; otherwise, it will overwrite the contents
of file . See ?cat for the details.
|
width |
Only relevant if format is "fasta" .
The maximum number of letters per line of sequence.
|
subjectClass |
The class to be given to the subject of the XStringViews object
created and returned by the function.
Must be the name of one of the direct XString subclasses i.e.
"BString" , "DNAString" , "RNAString"
or "AAString" .
|
collapse |
An optional character string to be inserted between the views of the XStringViews object created and returned by the function. |
FASTArecs |
A list of FASTA records as one returned by readFASTA .
|
use.names |
Whether or not the description line preceding each FASTA records should be used to set the names of the returned object. |
Only FASTA and FASTQ files are supported for now. The identifiers and qualities stored in the FASTQ records are ignored (only the sequences are returned).
Reading functions read.BStringSet
, read.DNAStringSet
,
read.RNAStringSet
, read.AAStringSet
and read.XStringViews
load sequences from a file into an XStringSet or XStringViews
object. Only one FASTA file, but more than one FASTQ file, can be read at a
time (by passing a character vector of length > 1). In that case, all the
FASTQ files must have the same "width" (i.e. all their sequences must have
the same length) and the sequences from all the files are stored in the
returned object in the order they were read.
The fastq.geometry
utility returns an integer vector describing
the "geometry" of the FASTQ files i.e. a vector of length 2 where the
first element is the total number of sequences contained in the FASTQ
files and the second element the "width" of these files (this width is
NA
if the files have different "widths").
Writing functions write.XStringSet
and write.XStringViews
write an XStringSet or XStringViews object to a file or
connection. They only support the FASTA format for now.
FASTArecordsToCharacter
, CharacterToFASTArecords
,
FASTArecordsToXStringViews
and XStringSetToFASTArecords
are helper functions used internally by write.XStringSet
and
read.XStringViews
for switching between different
representations of the same object.
fasta.info
,
readFASTA
,
writeFASTA
,
XStringSet-class,
XStringViews-class,
BString-class,
DNAString-class,
RNAString-class,
AAString-class
## --------------------------------------------------------------------- ## A. READ/WRITE FASTA FILES ## --------------------------------------------------------------------- file <- system.file("extdata", "someORF.fa", package="Biostrings") x <- read.DNAStringSet(file, "fasta") x write.XStringSet(x, format="fasta") # writes to the console ## --------------------------------------------------------------------- ## B. READ FASTQ FILES ## --------------------------------------------------------------------- file <- system.file("extdata", "s_1_sequence.txt", package="Biostrings") fastq.geometry(file) read.DNAStringSet(file, "fastq") # only the FASTQ sequences are returned # (identifiers and qualities are dropped) ## --------------------------------------------------------------------- ## C. SOME RELATED HELPER FUNCTIONS ## --------------------------------------------------------------------- ## Converting 'x'... ## ... to a list of FASTA records (as one returned by the "readFASTA" function) x1 <- XStringSetToFASTArecords(x) ## ... to a named character vector x2 <- FASTArecordsToCharacter(x1) # same as 'as.character(x)'