matchPWM {Biostrings}                R Documentation

A simple PWM matching function and related utilities

Description

A function implementing a simple algorithm for matching a set of patterns represented by a Position Weight Matrix (PWM) to a DNA sequence. PWMs for amino acid sequences are not supported.

Usage

  matchPWM(pwm, subject, min.score="80%")
  countPWM(pwm, subject, min.score="80%")
  PWMscoreStartingAt(pwm, subject, starting.at=1)

  ## Utility functions for basic manipulation of the Position Weight Matrix
  maxWeights(pwm)
  maxScore(pwm)
  ## S4 method for signature 'matrix':
  reverseComplement(x, ...)

Arguments

pwm, x        A Position Weight Matrix (numeric matrix with row names A, C, G and T).
subject       A DNAString object containing the subject sequence.
min.score     The minimum score for counting a match. Can be given either as a character string containing a percentage of the highest possible score (e.g. "85%") or as a single number.
starting.at   An integer vector specifying the starting positions of the Position Weight Matrix relative to the subject.
...           Additional arguments; currently ignored by the reverseComplement method for matrix objects.
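
As a rough illustration of how the two forms of min.score relate, the base R sketch below converts a percentage string into an absolute threshold. It assumes that the percentage is applied to the sum of the per-column maxima of pwm (the quantity described for maxScore); the helper name threshold_from_min_score is hypothetical and is not part of Biostrings.

```r
# Hypothetical helper, not a Biostrings function: turn min.score into a number.
# Assumption: "NN%" means NN percent of the highest possible PWM score.
threshold_from_min_score <- function(pwm, min.score = "80%") {
  if (is.character(min.score)) {
    pct <- as.numeric(sub("%$", "", min.score))
    # Highest possible score: sum of the largest weight in each column.
    return(pct * sum(apply(pwm, 2, max)) / 100)
  }
  min.score  # already an absolute score, returned unchanged
}

pwm <- rbind(A=c( 1,  0, 19, 20, 18,  1, 20,  7),
             C=c( 1,  0,  1,  0,  1, 18,  0,  2),
             G=c(17,  0,  0,  0,  1,  0,  0,  3),
             T=c( 1, 20,  0,  0,  0,  1,  0,  8))

pct_threshold <- threshold_from_min_score(pwm, "80%")
abs_threshold <- threshold_from_min_score(pwm, 100)
```

Under that assumption, a percentage makes the cutoff scale with the matrix, whereas a plain number fixes the cutoff regardless of the weights in pwm.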

Value

matchPWM returns an XStringViews object.
countPWM returns a single integer.
PWMscoreStartingAt returns a numeric vector containing the Position Weight Matrix-based scores.
maxWeights returns a vector containing the maximum weight for each position in pwm.
maxScore returns the highest possible score for a given Position Weight Matrix.
reverseComplement returns a PWM obtained by reversing the column order of PWM x and reassigning each row to its complementary nucleotide.
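
The values described for the utility functions can be reproduced with plain matrix operations. The following base R sketch is only an approximation of what maxWeights, maxScore and reverseComplement are documented to return, not the package's implementation; the variable names max_weights, max_score and rc are illustrative.

```r
pwm <- rbind(A=c( 1,  0, 19, 20, 18,  1, 20,  7),
             C=c( 1,  0,  1,  0,  1, 18,  0,  2),
             G=c(17,  0,  0,  0,  1,  0,  0,  3),
             T=c( 1, 20,  0,  0,  0,  1,  0,  8))

# Largest weight in each column (one value per position of the PWM).
max_weights <- apply(pwm, 2, max)

# Highest attainable score: pick the best letter at every position.
max_score <- sum(max_weights)

# Reverse complement: reverse the column order, then give each row the
# name of its complementary nucleotide (A <-> T, C <-> G).
rc <- pwm[c("T", "G", "C", "A"), ncol(pwm):1]
rownames(rc) <- c("A", "C", "G", "T")
```

In this sketch the A row of rc holds the weights of the original T row read from right to left, which is why matching rc against the plus strand corresponds to matching pwm against the minus strand.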

See Also

matchPattern, reverseComplement, DNAString-class, XStringViews-class

Examples

  pwm <- rbind(A=c( 1,  0, 19, 20, 18,  1, 20,  7),
               C=c( 1,  0,  1,  0,  1, 18,  0,  2),
               G=c(17,  0,  0,  0,  1,  0,  0,  3),
               T=c( 1, 20,  0,  0,  0,  1,  0,  8))
  maxWeights(pwm)
  maxScore(pwm)
  reverseComplement(pwm)

  subject <- DNAString("AGTAAACAA")
  PWMscoreStartingAt(pwm, subject, starting.at=c(2:1, NA))

  library(BSgenome.Dmelanogaster.UCSC.dm3)
  chr3R <- unmasked(Dmelanogaster$chr3R)
  chr3R

  ## Match the plus strand
  matchPWM(pwm, chr3R)
  countPWM(pwm, chr3R)

  ## Match the minus strand
  matchPWM(reverseComplement(pwm), chr3R)
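
The scoring behind the calls above can be pictured as a sliding-window sum: at each starting position, the weight of the observed letter is read from every column of the PWM and the weights are added up. The base R sketch below works on a plain character string rather than a DNAString. The helper score_at is hypothetical, and returning NA for windows that do not fit inside the subject is an assumption of this sketch, not documented Biostrings behaviour.

```r
pwm <- rbind(A=c( 1,  0, 19, 20, 18,  1, 20,  7),
             C=c( 1,  0,  1,  0,  1, 18,  0,  2),
             G=c(17,  0,  0,  0,  1,  0,  0,  3),
             T=c( 1, 20,  0,  0,  0,  1,  0,  8))

# Hypothetical helper: score of the PWM placed at position 'start'.
score_at <- function(pwm, subject, start) {
  width <- ncol(pwm)
  # Assumption of this sketch: NA or out-of-range starts give NA.
  if (is.na(start) || start < 1 || start + width - 1L > nchar(subject))
    return(NA_real_)
  bases <- strsplit(substr(subject, start, start + width - 1L), "")[[1]]
  # Pick one weight per column: row = observed base, column = position.
  sum(pwm[cbind(match(bases, rownames(pwm)), seq_len(width))])
}

subject <- "AGTAAACAA"
scores <- vapply(c(2, 1, NA), function(s) score_at(pwm, subject, s), numeric(1))

# Scan every window that fits and keep the starts reaching 80% of the
# highest possible score.
starts <- seq_len(nchar(subject) - ncol(pwm) + 1L)
all_scores <- vapply(starts, function(s) score_at(pwm, subject, s), numeric(1))
min_score <- 80 * sum(apply(pwm, 2, max)) / 100
hits <- starts[all_scores >= min_score]
```

A loop like this is far too slow for a whole chromosome; it is meant only to show which quantity is compared against min.score.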

[Package Biostrings version 2.12.9 Index]