weka.filters.unsupervised.instance
Class SubsetByExpression

java.lang.Object
  extended by weka.filters.Filter
      extended by weka.filters.SimpleFilter
          extended by weka.filters.SimpleBatchFilter
              extended by weka.filters.unsupervised.instance.SubsetByExpression
All Implemented Interfaces:
java.io.Serializable, CapabilitiesHandler, OptionHandler, RevisionHandler

public class SubsetByExpression
extends SimpleBatchFilter

Filters instances according to a user-specified expression.

Grammar:

boolexpr_list ::= boolexpr_list boolexpr_part | boolexpr_part;

boolexpr_part ::= boolexpr:e {: parser.setResult(e); :} ;

boolexpr ::= BOOLEAN
| true
| false
| expr < expr
| expr <= expr
| expr > expr
| expr >= expr
| expr = expr
| ( boolexpr )
| not boolexpr
| boolexpr and boolexpr
| boolexpr or boolexpr
| ATTRIBUTE is STRING
;

expr ::= NUMBER
| ATTRIBUTE
| ( expr )
| opexpr
| funcexpr
;

opexpr ::= expr + expr
| expr - expr
| expr * expr
| expr / expr
;

funcexpr ::= abs ( expr )
| sqrt ( expr )
| log ( expr )
| exp ( expr )
| sin ( expr )
| cos ( expr )
| tan ( expr )
| rint ( expr )
| floor ( expr )
| pow ( expr for base , expr for exponent )
| ceil ( expr )
;

Notes:
- NUMBER
any integer or floating point number
(but not in scientific notation!)
- STRING
any string surrounded by single quotes;
the string may not contain a single quote though.
- ATTRIBUTE
the following placeholders are recognized for
attribute values:
- CLASS for the class value in case a class attribute is set.
- ATTxyz, where xyz is a number from 1 to the number of attributes in
the dataset, representing the value of the attribute at that index.
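
The NUMBER and ATTRIBUTE notes above matter when an expression is assembled in code: Java's default double formatting can emit scientific notation, and attribute indices held in Java code are often 0-based while ATTxyz is 1-based. The following is a minimal sketch of two helpers for this. The class ExpressionTokens and its methods are hypothetical and not part of Weka.

```java
// Hypothetical helpers (not part of Weka) for building expression tokens.
final class ExpressionTokens {

    // NUMBER may not use scientific notation, so render the value in plain decimal form.
    // Whether the parser accepts a leading minus sign is not stated in the notes above.
    static String number(double value) {
        if (Double.isNaN(value) || Double.isInfinite(value)) {
            throw new IllegalArgumentException("not a finite number: " + value);
        }
        return java.math.BigDecimal.valueOf(value).stripTrailingZeros().toPlainString();
    }

    // ATTxyz placeholders are 1-based; this converts from a 0-based index.
    static String attribute(int zeroBasedIndex) {
        if (zeroBasedIndex < 0) {
            throw new IllegalArgumentException("index must be >= 0: " + zeroBasedIndex);
        }
        return "ATT" + (zeroBasedIndex + 1);
    }

    public static void main(String[] args) {
        // Builds an expression with a plain-decimal threshold instead of "1.0E-5".
        System.out.println(attribute(13) + " >= " + number(1e-5));
    }
}
```

The helper does not check the upper bound of the attribute index, because that depends on the dataset the filter is applied to.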

Examples:
- extracting only mammals and birds from the 'zoo' UCI dataset:
(CLASS is 'mammal') or (CLASS is 'bird')
- extracting only animals with at least 2 legs from the 'zoo' UCI dataset:
(ATT14 >= 2)
- extracting only instances with non-missing 'wage-increase-second-year'
from the 'labor' UCI dataset:
not ismissing(ATT3)
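
The first example above generalizes to any set of class labels. The sketch below builds such a disjunction and rejects labels containing a single quote, since the STRING note forbids them. The class ClassLabelExpression and the method classIsAnyOf are hypothetical names introduced for illustration, not part of Weka.

```java
// Hypothetical helper (not part of Weka) that builds "(CLASS is 'a') or (CLASS is 'b') ...".
final class ClassLabelExpression {

    static String classIsAnyOf(String... labels) {
        if (labels.length == 0) {
            throw new IllegalArgumentException("at least one label is required");
        }
        StringBuilder sb = new StringBuilder();
        for (String label : labels) {
            // STRING tokens are delimited by single quotes and may not contain one.
            if (label.indexOf('\'') >= 0) {
                throw new IllegalArgumentException("label contains a single quote: " + label);
            }
            if (sb.length() > 0) {
                sb.append(" or ");
            }
            sb.append("(CLASS is '").append(label).append("')");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Reproduces the 'zoo' example expression for mammals and birds.
        System.out.println(classIsAnyOf("mammal", "bird"));
    }
}
```

The resulting string is what one would pass to setExpression(java.lang.String) or to the -E option.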

Valid options are:

 -E <expr>
  The expression to use for filtering
  (default: true).
 -F
  Apply the filter to instances that arrive after the first
  (training) batch. By default the filter is not applied to
  them (i.e. each such instance is always returned).
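
As an illustration of the two options above, the following sketch assembles an option array in which -E is followed by its expression argument and -F appears only as a bare flag. The class SubsetOptions is a hypothetical helper, not part of Weka; the array it returns is in the form expected by setOptions(java.lang.String[]).

```java
// Hypothetical helper (not part of Weka) that assembles the option array described above.
final class SubsetOptions {

    static String[] build(String expression, boolean filterAfterFirstBatch) {
        java.util.List<String> opts = new java.util.ArrayList<>();
        opts.add("-E");
        opts.add(expression);      // the whole expression is a single array element
        if (filterAfterFirstBatch) {
            opts.add("-F");        // flag without an argument
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = build("(ATT14 >= 2)", true);
        System.out.println(String.join(" ", options));
    }
}
```

Because the expression travels as one array element, no extra quoting is needed around its spaces or single quotes when the options are set programmatically.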

Version:
$Revision: 7599 $
Author:
fracpete (fracpete at waikato dot ac dot nz)
See Also:
Serialized Form

Constructor Summary
SubsetByExpression()
           
 
Method Summary
 java.lang.String expressionTipText()
          Returns the tip text for this property.
 java.lang.String filterAfterFirstBatchTipText()
          Returns the tip text for this property.
 Capabilities getCapabilities()
          Returns the Capabilities of this filter.
 java.lang.String getExpression()
          Returns the expression used for filtering.
 boolean getFilterAfterFirstBatch()
          Get whether to apply the filter to instances that arrive once the first (training) batch has been seen.
 java.lang.String[] getOptions()
          Gets the current settings of the filter.
 java.lang.String getRevision()
          Returns the revision string.
 java.lang.String globalInfo()
          Returns a string describing this filter.
 boolean input(Instance instance)
          Input an instance for filtering.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
 static void main(java.lang.String[] args)
          Main method for running this filter.
 void setExpression(java.lang.String value)
          Sets the expression used for filtering.
 void setFilterAfterFirstBatch(boolean b)
          Set whether to apply the filter to instances that arrive once the first (training) batch has been seen.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 
Methods inherited from class weka.filters.SimpleBatchFilter
batchFinished
 
Methods inherited from class weka.filters.SimpleFilter
debugTipText, getDebug, setDebug, setInputFormat
 
Methods inherited from class weka.filters.Filter
batchFilterFile, filterFile, getCapabilities, getOutputFormat, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, numPendingOutput, output, outputPeek, toString, useFilter, wekaStaticWrapper
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

SubsetByExpression

public SubsetByExpression()
Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this filter.

Specified by:
globalInfo in class SimpleFilter
Returns:
a description of the filter suitable for displaying in the explorer/experimenter gui

input

public boolean input(Instance instance)
              throws java.lang.Exception
Input an instance for filtering. The filter requires all training instances to be read before producing output (calling the method batchFinished() makes the data available). If this instance is part of a new batch, m_NewBatch is set to false.

Overrides:
input in class SimpleBatchFilter
Parameters:
instance - the input instance
Returns:
true if the filtered instance may now be collected with output().
Throws:
java.lang.IllegalStateException - if no input structure has been defined
java.lang.Exception - if something goes wrong
See Also:
SimpleBatchFilter.batchFinished()

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class SimpleFilter
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options.

Valid options are:

 -E <expr>
  The expression to use for filtering
  (default: true).
 -F
  Apply the filter to instances that arrive after the first
  (training) batch. By default the filter is not applied to
  them (i.e. each such instance is always returned).

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class SimpleFilter
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
See Also:
SimpleFilter.reset()

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the filter.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class SimpleFilter
Returns:
an array of strings suitable for passing to setOptions

getCapabilities

public Capabilities getCapabilities()
Returns the Capabilities of this filter.

Specified by:
getCapabilities in interface CapabilitiesHandler
Overrides:
getCapabilities in class Filter
Returns:
the capabilities of this object
See Also:
Capabilities

setExpression

public void setExpression(java.lang.String value)
Sets the expression used for filtering.

Parameters:
value - the expression

getExpression

public java.lang.String getExpression()
Returns the expression used for filtering.

Returns:
the expression

expressionTipText

public java.lang.String expressionTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setFilterAfterFirstBatch

public void setFilterAfterFirstBatch(boolean b)
Set whether to apply the filter to instances that arrive once the first (training) batch has been seen. The default is not to apply the filter and simply to return each input instance unchanged. This is so that, when used in the FilteredClassifier, a test instance does not get "consumed" by the filter and a prediction is always generated.

Parameters:
b - true if the filter should be applied to instances that arrive after the first (training) batch has been processed.
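
The behaviour described above can be modelled without Weka. The sketch below is a simplified stand-in, not the filter's actual implementation: the class FirstBatchGate is hypothetical, and a java.util.function.Predicate plays the role of the expression. The first batch is always filtered; later batches are filtered only when the flag is set.

```java
// Simplified model (not Weka code) of the "filter after first batch" switch.
final class FirstBatchGate<T> {
    private final java.util.function.Predicate<T> keep;  // stands in for the expression
    private final boolean filterAfterFirstBatch;          // corresponds to the -F option
    private boolean firstBatchDone = false;

    FirstBatchGate(java.util.function.Predicate<T> keep, boolean filterAfterFirstBatch) {
        this.keep = keep;
        this.filterAfterFirstBatch = filterAfterFirstBatch;
    }

    // Returns the instances that survive this batch.
    java.util.List<T> processBatch(java.util.List<T> batch) {
        // The predicate applies to the first batch, and to later ones only if the flag is set.
        boolean apply = !firstBatchDone || filterAfterFirstBatch;
        java.util.List<T> out = new java.util.ArrayList<>();
        for (T item : batch) {
            if (!apply || keep.test(item)) {
                out.add(item);
            }
        }
        firstBatchDone = true;
        return out;
    }

    public static void main(String[] args) {
        // Integers stand in for a "number of legs" attribute; keep values >= 2.
        FirstBatchGate<Integer> gate = new FirstBatchGate<>(legs -> legs >= 2, false);
        System.out.println(gate.processBatch(java.util.Arrays.asList(0, 2, 4))); // training batch: filtered
        System.out.println(gate.processBatch(java.util.Arrays.asList(0, 4)));    // later batch: passed through
    }
}
```

With the flag left at false, every instance of a later batch comes back out, which is the property that lets a wrapping classifier produce a prediction for each test instance.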

getFilterAfterFirstBatch

public boolean getFilterAfterFirstBatch()
Get whether to apply the filter to instances that arrive once the first (training) batch has been seen. The default is not to apply the filter and simply to return each input instance unchanged. This is so that, when used in the FilteredClassifier, a test instance does not get "consumed" by the filter and a prediction is always generated.

Returns:
true if the filter should be applied to instances that arrive after the first (training) batch has been processed.

filterAfterFirstBatchTipText

public java.lang.String filterAfterFirstBatchTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getRevision

public java.lang.String getRevision()
Returns the revision string.

Specified by:
getRevision in interface RevisionHandler
Overrides:
getRevision in class Filter
Returns:
the revision

main

public static void main(java.lang.String[] args)
Main method for running this filter.

Parameters:
args - arguments for the filter: use -h for help