reflex::Matcher Class Reference

updated Thu Jan 26 2017
 

RE/flex matcher engine class that implements the reflex::PatternMatcher pattern matching interface with scan, find, and split functors and iterators.

#include <matcher.h>

[Inheritance diagram for reflex::Matcher]
[Collaboration diagram for reflex::Matcher]

Classes

struct  FSM
 FSM data for FSM code.
 

Public Member Functions

 Matcher ()
 Default constructor.

template<typename P >
 Matcher (const P *pat, const Input &inp=Input(), const char *opt=NULL)
 Construct matcher engine from a pattern or a string regex, and an input character sequence.

template<typename P >
 Matcher (const P &pat, const Input &inp=Input(), const char *opt=NULL)
 Construct matcher engine from a pattern or a string regex, and an input character sequence.

virtual void reset (const char *opt=NULL)
 Reset this matcher's state to the initial state.

const std::vector< size_t > & stops (void) const
 Returns vector of tab stops.

void clear_stops (void)
 Clear tab stops.

void push_stops (void)
 Push stops and clear stops.

void pop_stops (void)
 Pop stops.
 
void FSM_INIT (int &c1)
 FSM code INIT.

int FSM_CHAR (void)
 FSM code CHAR.

void FSM_HALT (int c1)
 FSM code HALT.

void FSM_TAKE (Pattern::Index cap)
 FSM code TAKE.

void FSM_TAKE (Pattern::Index cap, int c1)
 FSM code TAKE.

void FSM_REDO (void)
 FSM code REDO.

void FSM_REDO (int c1)
 FSM code REDO.

void FSM_HEAD (Pattern::Index la)
 FSM code HEAD.

void FSM_TAIL (Pattern::Index la)
 FSM code TAIL.

bool FSM_DENT (void)
 FSM code DENT.

bool FSM_META_DED (void)
 FSM code META DED.

bool FSM_META_IND (void)
 FSM code META IND.

bool FSM_META_EOB (int c1)
 FSM code META EOB.

bool FSM_META_BOB (void)
 FSM code META BOB.

bool FSM_META_EOL (int c1)
 FSM code META EOL.

bool FSM_META_BOL (void)
 FSM code META BOL.

bool FSM_META_EWE (int c0, int c1)
 FSM code META EWE.

bool FSM_META_BWE (int c0, int c1)
 FSM code META BWE.

bool FSM_META_EWB (void)
 FSM code META EWB.

bool FSM_META_BWB (void)
 FSM code META BWB.

bool FSM_META_NWE (int c0, int c1)
 FSM code META NWE.

bool FSM_META_NWB (void)
 FSM code META NWB.
 
- Public Member Functions inherited from reflex::PatternMatcher< reflex::Pattern >
 PatternMatcher (const PatternMatcher &matcher)
 Copy constructor, the underlying pattern object is shared (not deep copied).

virtual ~PatternMatcher ()
 Delete matcher, deletes pattern when owned, deletes this matcher's internal buffer.

virtual PatternMatcher & pattern (const PatternMatcher &matcher)
 Set the pattern to use with this matcher as a shared pointer to another matcher pattern.

virtual PatternMatcher & pattern (const Pattern &pat)
 Set the pattern to use with this matcher (the given pattern is shared and must be persistent).

virtual PatternMatcher & pattern (const Pattern *pat)
 Set the pattern to use with this matcher (the given pattern is shared and must be persistent).

virtual PatternMatcher & pattern (const char *pat)
 Set the pattern from a regex string to use with this matcher.

virtual PatternMatcher & pattern (const std::string &pat)
 Set the pattern from a regex string to use with this matcher.

const Pattern & pattern (void) const
 Returns the pattern object associated with this matcher.

bool has_pattern (void) const
 Returns true if this matcher has a pattern.

bool own_pattern (void) const
 Returns true if this matcher owns its pattern, that is, the pattern was not received from another matcher and this matcher is responsible for deleting it.
 
- Public Member Functions inherited from reflex::AbstractMatcher
bool buffer (size_t blk=0)
 Set buffer block size for reading: use 1 for interactive input, or 0 (or omit the argument) to buffer all input, in which case it returns true if all the data could be read and false if a read error occurred.

void interactive (void)
 Set buffer to 1 for interactive input.

void flush (void)
 Flush the buffer's remaining content.

virtual AbstractMatcher & input (const Input &inp)
 Set the input character sequence for this matcher and reset the matcher.

size_t matches (void)
 Returns true if the entire input matches this matcher's pattern (and internally caches the true/false result for repeat invocations).

size_t accept (void) const
 Returns a positive integer (true) indicating the capture index of the matched text in the pattern or zero (false) for a mismatch.

const char * text (void) const
 Returns string with the text matched.

size_t size (void) const
 Returns the length of the matched text in number of bytes.

size_t wsize (void) const
 Returns the length of the matched text in number of (wide) characters.

size_t lineno (void) const
 Returns the line number of the match in the input character sequence.

size_t columno (void) const
 Returns the column number of matched text, counting wide characters (unless compiled with WITH_BYTE_COLUMNO).

std::pair< size_t, std::string > pair () const
 Returns a pair of size_t accept() and std::string text(), useful for tokenizing input into containers of pairs.

size_t first (void) const
 Returns the position of the first character starting the match in the input character sequence.

size_t last (void) const
 Returns the position one past the last character of the match in the input character sequence.

bool at_bob (void) const
 Returns true if this matcher is at the start of an input character sequence. Use reset() to restart input.

bool at_end (void)
 Returns true if this matcher has no more input to read from the input character sequence.

bool hit_end (void) const
 Returns true if this matcher hit the end of the input character sequence.
 
void set_end (bool eof)
 Set and force the end of input state.

bool at_bol (void) const
 Returns true if this matcher reached the beginning of a new line.

void set_bol (bool bol)
 Set the beginning-of-line state.

int input (void)
 Returns the next character from the input character sequence while preserving the current text match.

void unput (char c)
 Put back one character on the input character sequence for matching, invalidating the current match info and text.

const char * rest (void)
 Fetch the rest of the input as text, useful for searching/splitting up to n times after which the rest is needed.

void more (void)
 Append the next match to the currently matched text returned by AbstractMatcher::text, when the next match found is adjacent to the current match.

void less (size_t n)
 Truncate the matched text returned by AbstractMatcher::text to n characters and reposition for the next match.

 operator size_t () const
 Cast this matcher to positive integer indicating the nonzero capture index of the matched text in the pattern, same as AbstractMatcher::accept.

 operator std::string () const
 Cast this matcher to a std::string of the text matched by this matcher.

 operator std::pair< size_t, std::string > () const
 Cast this matcher to a pair of size_t accept() and std::string text(), useful for tokenization into containers.
 
bool operator== (const char *rhs) const
 Returns true if matched text is equal to a string, useful for std::algorithm.

bool operator== (const std::string &rhs) const
 Returns true if matched text is equal to a string, useful for std::algorithm.

bool operator== (size_t rhs) const
 Returns true if capture index is equal to a given size_t value, useful for std::algorithm.

bool operator== (int rhs) const
 Returns true if capture index is equal to a given int value, useful for std::algorithm.

bool operator!= (const char *rhs) const
 Returns true if matched text is not equal to a string, useful for std::algorithm.

bool operator!= (const std::string &rhs) const
 Returns true if matched text is not equal to a string, useful for std::algorithm.

bool operator!= (size_t rhs) const
 Returns true if capture index is not equal to a given size_t value, useful for std::algorithm.

bool operator!= (int rhs) const
 Returns true if capture index is not equal to a given int value, useful for std::algorithm.
 

Protected Types

typedef std::vector< size_t > Stops
 indent margin/tab stops
 
- Protected Types inherited from reflex::AbstractMatcher
typedef int Method
 

Protected Member Functions

virtual size_t match (Method method)
 Returns true if input matched the pattern using method Const::SCAN, Const::FIND, Const::SPLIT, or Const::MATCH.

void newline (size_t &col)
 Update indentation column counter for indent() and dedent().

bool indent (size_t &col)
 Returns true if looking at indent.

bool dedent (size_t &col)
 Returns true if looking at dedent.

- Protected Member Functions inherited from reflex::PatternMatcher< reflex::Pattern >
 PatternMatcher (const Pattern *pat=NULL, const Input &inp=Input(), const char *opt=NULL)
 Construct a base abstract matcher from a pointer to a persistent pattern object (that is shared with this class) and an input character sequence.

 PatternMatcher (const Pattern &pat, const Input &inp=Input(), const char *opt=NULL)

 PatternMatcher (const char *pat, const Input &inp=Input(), const char *opt=NULL)
 Construct a base abstract matcher from a regex pattern string and an input character sequence.

 PatternMatcher (const std::string &pat, const Input &inp=Input(), const char *opt=NULL)
 Construct a base abstract matcher from a regex pattern string and an input character sequence.

- Protected Member Functions inherited from reflex::AbstractMatcher
 AbstractMatcher (const Input &inp, const char *opt)
 Construct a base abstract matcher.

 AbstractMatcher (const Input &inp, const Option &opt)
 Construct a base abstract matcher.

void init (const char *opt=NULL)
 Initialize the base abstract matcher at construction.

virtual size_t get (char *s, size_t n)
 Returns more input (method can be overridden, as reflex::FlexLexer::get does to invoke reflex::FlexLexer::LexerInput).

virtual bool wrap (void)
 Returns true if wrapping of input after EOF is supported.

bool grow (size_t need=Const::BLOCK)
 Shift or expand the internal buffer when it is too small to accommodate more input, where the buffer size is doubled when needed.

int get (void)
 Returns the next character from the buffered input character sequence.

int peek (void)
 Peek at the next character in the buffered input without consuming it.

void set_current (size_t loc)
 Set the current position to advance to the next match.
 

Protected Attributes

size_t ded_
 dedent count

Stops tab_
 tab stops set by detecting indent margins

std::vector< int > lap_
 lookahead position in input that heads a lookahead match (indexed by lookahead number)

std::stack< Stops > stk_
 stack to push/pop stops

FSM fsm_
 local state for FSM code

- Protected Attributes inherited from reflex::PatternMatcher< reflex::Pattern >
bool own_
 true if PatternMatcher::pat_ was internally allocated

const Pattern * pat_
 points to the pattern object used by the matcher

- Protected Attributes inherited from reflex::AbstractMatcher
Option opt_
 options for matcher engines

char * buf_
 input character sequence buffer

const char * txt_
 points to the matched text in buffer AbstractMatcher::buf_

size_t len_
 size of the matched text

size_t cap_
 nonzero capture index of an accepted match or zero

size_t cur_
 next position in AbstractMatcher::buf_ to assign to AbstractMatcher::txt_

size_t pos_
 position in AbstractMatcher::buf_ after AbstractMatcher::txt_

size_t end_
 ending position of the input buffered in AbstractMatcher::buf_

size_t max_
 total buffer size and max position + 1 to fill

size_t ind_
 current indent position

size_t blk_
 block size for block-based input reading, as set by AbstractMatcher::buffer

int got_
 last unsigned character we looked at (to determine anchors and boundaries)

int chr_
 the character located at AbstractMatcher::buf_[AbstractMatcher::pos_]

size_t lno_
 line number count (prior to this buffered input)

size_t cno_
 column number count (prior to this buffered input)

size_t num_
 character count (number of characters flushed prior to this buffered input)

bool eof_
 input has reached EOF

bool mat_
 true if AbstractMatcher::matches() was successful
 

Additional Inherited Members

- Public Types inherited from reflex::PatternMatcher< reflex::Pattern >
typedef reflex::Pattern Pattern

- Public Types inherited from reflex::AbstractMatcher
typedef AbstractMatcher::Iterator< AbstractMatcher > iterator
 std::input_iterator for scanning, searching, and splitting input character sequences

typedef AbstractMatcher::Iterator< const AbstractMatcher > const_iterator

- Public Attributes inherited from reflex::AbstractMatcher
Operation scan
 functor to scan input (to tokenize input)

Operation find
 functor to search input

Operation split
 functor to split input

Input in
 input character sequence being matched by this matcher
 

Detailed Description

RE/flex matcher engine class, implements reflex::PatternMatcher pattern matching interface with scan, find, split functors and iterators.

More info TODO
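As a rough illustration of what scanning, searching, and splitting mean, the sketch below imitates the three operations with std::regex rather than RE/flex. The helper names scan_all, find_all, and split_all are hypothetical and are not part of the reflex API. RE/flex's own matching semantics may differ in detail, for example in how alternatives are chosen.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical std::regex stand-ins, NOT the RE/flex implementation.

// scan: tokenize by matching repeatedly at the current position; stops at
// the first position where the pattern does not match.
std::vector<std::string> scan_all(const std::string& input, const std::regex& re)
{
  std::vector<std::string> tokens;
  std::string::const_iterator pos = input.begin();
  std::smatch m;
  while (pos != input.end() &&
         std::regex_search(pos, input.end(), m, re,
                           std::regex_constants::match_continuous) &&
         m.length(0) > 0) // guard against looping on empty matches
  {
    tokens.push_back(m.str(0));
    pos = m[0].second;
  }
  return tokens;
}

// find: search for every match, skipping over text that does not match.
std::vector<std::string> find_all(const std::string& input, const std::regex& re)
{
  std::vector<std::string> hits;
  for (std::sregex_iterator it(input.begin(), input.end(), re), end; it != end; ++it)
    hits.push_back(it->str());
  return hits;
}

// split: return the text between matches.
std::vector<std::string> split_all(const std::string& input, const std::regex& re)
{
  std::vector<std::string> parts;
  for (std::sregex_token_iterator it(input.begin(), input.end(), re, -1), end; it != end; ++it)
    parts.push_back(it->str());
  return parts;
}
```

With the pattern [a-z]+|[0-9]+, scanning "ab 12" stops after "ab" because the space cannot be tokenized, whereas finding skips the space and also reports "12".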

Member Typedef Documentation

typedef std::vector<size_t> reflex::Matcher::Stops
protected

indent margin/tab stops

Constructor & Destructor Documentation

reflex::Matcher::Matcher ( )
inline

Default constructor.

template<typename P >
reflex::Matcher::Matcher ( const P *  pat,
const Input &  inp = Input(),
const char *  opt = NULL 
)
inline

Construct matcher engine from a pattern or a string regex, and an input character sequence.

Template Parameters
P  a reflex::Pattern or a string regex
Parameters
pat  points to a reflex::Pattern or a string regex for this matcher
inp  input character sequence for this matcher
opt  option string of the form (A|N|T(=[[:digit:]])?|;)*

template<typename P >
reflex::Matcher::Matcher ( const P &  pat,
const Input &  inp = Input(),
const char *  opt = NULL 
)
inline

Construct matcher engine from a pattern or a string regex, and an input character sequence.

Template Parameters
P  a reflex::Pattern or a string regex
Parameters
pat  a reflex::Pattern or a string regex for this matcher
inp  input character sequence for this matcher
opt  option string of the form (A|N|T(=[[:digit:]])?|;)*
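The documented form of the opt parameter is itself a regular expression, so it can be checked mechanically. The sketch below does that with std::regex. The function is_valid_option_string is a hypothetical helper, not part of RE/flex, and it only checks the shape of the string; it says nothing about what each option letter does.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper, not part of RE/flex: returns true if opt has the
// documented form (A|N|T(=[[:digit:]])?|;)*
bool is_valid_option_string(const std::string& opt)
{
  static const std::regex form("(A|N|T(=[[:digit:]])?|;)*");
  return std::regex_match(opt, form); // the whole string must conform
}
```

Under this form, "A;N;T=8" and the empty string are accepted, while "T=12" is rejected because T takes at most one digit.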

Member Function Documentation

void reflex::Matcher::clear_stops ( void  )
inline

Clear tab stops.

bool reflex::Matcher::dedent ( size_t &  col)
inline, protected

Returns true if looking at dedent.

Returns
true if dedent.
Parameters
col  indent column counter

int reflex::Matcher::FSM_CHAR ( void  )
inline

FSM code CHAR.

bool reflex::Matcher::FSM_DENT ( void  )
inline

FSM code DENT.

void reflex::Matcher::FSM_HALT ( int  c1)
inline

FSM code HALT.

void reflex::Matcher::FSM_HEAD ( Pattern::Index  la)
inline

FSM code HEAD.

void reflex::Matcher::FSM_INIT ( int &  c1)
inline

FSM code INIT.

bool reflex::Matcher::FSM_META_BOB ( void  )
inline

FSM code META BOB.

bool reflex::Matcher::FSM_META_BOL ( void  )
inline

FSM code META BOL.

bool reflex::Matcher::FSM_META_BWB ( void  )
inline

FSM code META BWB.

bool reflex::Matcher::FSM_META_BWE ( int  c0,
int  c1 
)
inline

FSM code META BWE.

bool reflex::Matcher::FSM_META_DED ( void  )
inline

FSM code META DED.

bool reflex::Matcher::FSM_META_EOB ( int  c1)
inline

FSM code META EOB.

bool reflex::Matcher::FSM_META_EOL ( int  c1)
inline

FSM code META EOL.

bool reflex::Matcher::FSM_META_EWB ( void  )
inline

FSM code META EWB.

bool reflex::Matcher::FSM_META_EWE ( int  c0,
int  c1 
)
inline

FSM code META EWE.

bool reflex::Matcher::FSM_META_IND ( void  )
inline

FSM code META IND.

bool reflex::Matcher::FSM_META_NWB ( void  )
inline

FSM code META NWB.

bool reflex::Matcher::FSM_META_NWE ( int  c0,
int  c1 
)
inline

FSM code META NWE.

void reflex::Matcher::FSM_REDO ( void  )
inline

FSM code REDO.

void reflex::Matcher::FSM_REDO ( int  c1)
inline

FSM code REDO.

void reflex::Matcher::FSM_TAIL ( Pattern::Index  la)
inline

FSM code TAIL.

void reflex::Matcher::FSM_TAKE ( Pattern::Index  cap)
inline

FSM code TAKE.

void reflex::Matcher::FSM_TAKE ( Pattern::Index  cap,
int  c1 
)
inline

FSM code TAKE.

bool reflex::Matcher::indent ( size_t &  col)
inline, protected

Returns true if looking at indent.

Returns
true if indent.
Parameters
col  indent column counter

virtual size_t reflex::Matcher::match ( Method  method)
protected, virtual

Returns true if input matched the pattern using method Const::SCAN, Const::FIND, Const::SPLIT, or Const::MATCH.

Returns
nonzero if input matched the pattern.
Parameters
method  Const::SCAN, Const::FIND, Const::SPLIT, or Const::MATCH

Implements reflex::AbstractMatcher.
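The nonzero return value corresponds to the capture index that AbstractMatcher::accept reports: a positive index for the part of the pattern that matched, or zero for a mismatch. The sketch below shows the idea with std::regex, assuming the pattern is written as a top-level alternation of groups such as (alt1)|(alt2). The function accept_index is a hypothetical helper and not the RE/flex implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper, not the RE/flex implementation: returns the 1-based
// index of the first capture group that took part in a whole-input match,
// or 0 on a mismatch. Assumes a pattern of the form (alt1)|(alt2)|...
std::size_t accept_index(const std::string& input, const std::regex& re)
{
  std::smatch m;
  if (!std::regex_match(input, m, re))
    return 0;
  for (std::size_t i = 1; i < m.size(); ++i)
    if (m[i].matched)
      return i;
  return 0; // matched, but the pattern has no capture groups
}
```

Because zero signals a mismatch, the result can be used directly in a condition or as a token kind in a switch statement.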

void reflex::Matcher::newline ( size_t &  col)
inline, protected

Update indentation column counter for indent() and dedent().

Parameters
col  indent column counter
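The interplay of an indent column counter, a vector of stops, and a dedent count can be pictured with a small stand-alone model. This is a generic sketch of indent/dedent detection, not the RE/flex source: the IndentTracker type is hypothetical, it takes the column by value rather than by reference, and the rule that an indent is a column beyond the last stop is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model, NOT the RE/flex implementation: indent margins are
// kept as ascending column stops, loosely mirroring tab_ and ded_.
struct IndentTracker
{
  std::vector<std::size_t> stops; // indent margins seen so far, ascending
  std::size_t dedents = 0;        // stops closed by the last dedent() call

  // Returns true and records a new stop if col lies beyond the last stop.
  bool indent(std::size_t col)
  {
    if (col == 0 || (!stops.empty() && col <= stops.back()))
      return false;
    stops.push_back(col);
    return true;
  }

  // Returns true if col lies before the last stop; removes every stop
  // beyond col and counts the removed stops in dedents.
  bool dedent(std::size_t col)
  {
    dedents = 0;
    while (!stops.empty() && col < stops.back())
    {
      stops.pop_back();
      ++dedents;
    }
    return dedents > 0;
  }
};
```

In this model a line that returns to column 0 after two nested indents closes both margins at once, which is why a dedent count is kept in addition to the true/false result.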
void reflex::Matcher::pop_stops ( void  )
inline

Pop stops.

void reflex::Matcher::push_stops ( void  )
inline

Push stops and clear stops.

virtual void reflex::Matcher::reset ( const char *  opt = NULL)
inline, virtual

Reset this matcher's state to the initial state.

Reimplemented from reflex::AbstractMatcher.

const std::vector<size_t>& reflex::Matcher::stops ( void  ) const
inline

Returns vector of tab stops.

Returns
vector of size_t.
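The stop-related functions (stops, clear_stops, push_stops, pop_stops) can be read together with the tab_ and stk_ attributes as a save/restore mechanism. The sketch below models that reading with a stand-alone class. StopsModel and add_stop are hypothetical names for illustration, and treating pop_stops on an empty stack as a no-op is an assumption of this model, not documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>
#include <vector>

// Hypothetical model, not the RE/flex source: mirrors the documented
// members Stops tab_ and std::stack<Stops> stk_.
class StopsModel
{
 public:
  typedef std::vector<std::size_t> Stops;

  const Stops& stops() const { return tab_; }
  void add_stop(std::size_t col) { tab_.push_back(col); } // illustration only
  void clear_stops() { tab_.clear(); }

  // Push stops and clear stops: save the current stops, continue with none.
  void push_stops()
  {
    stk_.push(Stops());
    stk_.top().swap(tab_);
  }

  // Pop stops: restore the most recently saved stops.
  // Assumption: with nothing saved, this model leaves the stops unchanged.
  void pop_stops()
  {
    if (stk_.empty())
      return;
    tab_.swap(stk_.top());
    stk_.pop();
  }

 private:
  Stops tab_;             // current tab stops
  std::stack<Stops> stk_; // saved tab stops
};
```

A typical use of such a pair is to suspend indentation tracking inside a nested construct and resume it with the earlier margins afterwards.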

Member Data Documentation

size_t reflex::Matcher::ded_
protected

dedent count

FSM reflex::Matcher::fsm_
protected

local state for FSM code

std::vector<int> reflex::Matcher::lap_
protected

lookahead position in input that heads a lookahead match (indexed by lookahead number)

std::stack<Stops> reflex::Matcher::stk_
protected

stack to push/pop stops

Stops reflex::Matcher::tab_
protected

tab stops set by detecting indent margins


The documentation for this class was generated from the following file: