public class CheckAttributeSelection extends CheckScheme
java weka.attributeSelection.CheckAttributeSelection -W ASscheme_name -- ASscheme_options
CheckAttributeSelection reports on the following:
weka.attributeSelection.AbstractAttributeSelectionTest uses this class to test all the schemes.
Any changes made here have to be checked in that abstract test class, too.
Valid options are:
-D Turn on debugging output.
-S Silent mode - prints nothing to stdout.
-N <num> The number of instances in the datasets (default 20).
-nominal <num> The number of nominal attributes (default 2).
-nominal-values <num> The number of values for nominal attributes (default 1).
-numeric <num> The number of numeric attributes (default 1).
-string <num> The number of string attributes (default 1).
-date <num> The number of date attributes (default 1).
-relational <num> The number of relational attributes (default 1).
-num-instances-relational <num> The number of instances in relational/bag attributes (default 10).
-words <comma-separated-list> The words to use in string attributes.
-word-separators <chars> The word separators to use in string attributes.
-eval name [options] Full name and options of the evaluator analyzed. E.g.: weka.attributeSelection.CfsSubsetEval
-search name [options] Full name and options of the search method analyzed. E.g.: weka.attributeSelection.Ranker
-test <eval|search> The scheme to test, either the evaluator or the search method. (Default: eval)
Options specific to evaluator weka.attributeSelection.CfsSubsetEval:
-M Treat missing values as a separate value.
-L Don't include locally predictive attributes.
Options specific to search method weka.attributeSelection.Ranker:
-P <start set> Specify a starting set of attributes, e.g. 1,3,5-7. Any starting attributes specified are ignored during the ranking.
-T <threshold> Specify a threshold by which attributes may be discarded from the ranking.
-N <num to select> Specify the number of attributes to select.
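The -P start set uses a compact range notation such as 1,3,5-7. As a rough illustration only, the sketch below expands such a string into explicit indices. StartSetSketch and expand are hypothetical names, not Weka API. The sketch assumes 1-based indices and handles only plain numbers and a-b ranges; Weka's own parsing may accept more forms.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Weka: expands a range string such as "1,3,5-7". */
final class StartSetSketch {

    /** Returns the indices named by the range string, in the order they appear. */
    static List<Integer> expand(String startSet) {
        List<Integer> indices = new ArrayList<>();
        for (String part : startSet.split(",")) {
            String token = part.trim();
            if (token.isEmpty()) {
                continue; // tolerate empty input and stray commas
            }
            int dash = token.indexOf('-');
            if (dash < 0) {
                // A single index, e.g. "3".
                indices.add(Integer.parseInt(token));
            } else {
                // An inclusive range, e.g. "5-7".
                int from = Integer.parseInt(token.substring(0, dash).trim());
                int to = Integer.parseInt(token.substring(dash + 1).trim());
                if (from > to) {
                    throw new IllegalArgumentException("Descending range: " + token);
                }
                for (int i = from; i <= to; i++) {
                    indices.add(i);
                }
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        System.out.println(expand("1,3,5-7")); // prints [1, 3, 5, 6, 7]
    }
}
```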
TestInstances
CheckScheme.PostProcessor
Modifier and Type | Field and Description |
---|---|
protected ASEvaluation | m_Evaluator - The evaluator to be examined. |
protected ASSearch | m_Search - The search method to be used. |
protected boolean | m_TestEvaluator - Whether to test the evaluator (default) or the search method. |
m_ClasspathProblems, m_NumDate, m_NumInstances, m_NumInstancesRelational, m_NumNominal, m_NumNumeric, m_NumRelational, m_NumString, m_PostProcessor, m_Words, m_WordSeparators
Constructor and Description |
---|
CheckAttributeSelection() |
Modifier and Type | Method and Description |
---|---|
protected boolean[] | canHandleClassAsNthAttribute(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, int classIndex) - Checks whether the scheme can handle class attributes as Nth attribute. |
protected boolean[] | canHandleMissing(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, boolean predictorMissing, boolean classMissing, int missingLevel) - Checks basic missing value handling of the scheme. |
protected boolean[] | canHandleNClasses(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int numClasses) - Checks whether nominal schemes can handle more than two classes. |
protected boolean[] | canHandleZeroTraining(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType) - Checks whether the scheme can handle zero training instances. |
protected boolean[] | canPredict(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType) - Checks basic prediction of the scheme, for simple non-troublesome datasets. |
protected boolean[] | canTakeOptions() - Checks whether the scheme can take command line options. |
protected boolean[] | correctSearchInitialisation(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType) - Checks whether the scheme correctly initialises models when ASSearch.search is called. |
protected boolean[] | datasetIntegrity(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, boolean predictorMissing, boolean classMissing) - Checks whether the scheme alters the training dataset during training. |
protected boolean[] | declaresSerialVersionUID() - Tests for a serialVersionUID. |
void | doTests() - Begin the tests, reporting results to System.out. |
ASEvaluation | getEvaluator() - Get the current evaluator. |
String[] | getOptions() - Gets the current settings of the CheckAttributeSelection. |
String | getRevision() - Returns the revision string. |
ASSearch | getSearch() - Get the current search method. |
boolean | getTestEvaluator() - Gets whether the evaluator is being tested or the search method. |
protected Object | getTestObject() - Returns either the evaluator or the search method. |
protected boolean[] | instanceWeights(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType) - Checks whether the scheme can handle instance weights. |
Enumeration | listOptions() - Returns an enumeration describing the available options. |
static void | main(String[] args) - Test method for this class. |
protected Object[] | makeCopies(Object obj, int num) - Returns deep copies of the given object. |
protected Instances | makeTestDataset(int seed, int numInstances, int numNominal, int numNumeric, int numString, int numDate, int numRelational, int numClasses, int classType, boolean multiInstance) - Make a simple set of instances, which can later be modified for use in specific tests. |
protected Instances | makeTestDataset(int seed, int numInstances, int numNominal, int numNumeric, int numString, int numDate, int numRelational, int numClasses, int classType, int classIndex, boolean multiInstance) - Make a simple set of instances with variable position of the class attribute, which can later be modified for use in specific tests. |
protected boolean[] | multiInstanceHandler() - Checks whether the scheme handles multi-instance data. |
protected void | printAttributeSummary(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType) - Print out a short summary string for the dataset characteristics. |
protected boolean[] | runBasicTest(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, int missingLevel, boolean predictorMissing, boolean classMissing, int numTrain, int numClasses, FastVector accepts) - Runs a test on the datasets with the given characteristics. |
protected boolean[] | runBasicTest(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, int classIndex, int missingLevel, boolean predictorMissing, boolean classMissing, int numTrain, int numClasses, FastVector accepts) - Runs a test on the datasets with the given characteristics. |
protected AttributeSelection | search(ASSearch search, ASEvaluation eval, Instances data) - Performs an attribute selection with the given search and evaluation scheme on the provided data. |
void | setEvaluator(ASEvaluation value) - Set the evaluator to test. |
void | setOptions(String[] options) - Parses a given list of options. |
void | setSearch(ASSearch value) - Set the search method to test. |
void | setTestEvaluator(boolean value) - Sets whether the evaluator or the search method is being tested. |
protected void | testsPerClassType(int classType, boolean weighted, boolean multiInstance) - Run a battery of tests for a given class attribute type. |
protected boolean[] | weightedInstancesHandler() - Checks whether the scheme says it can handle instance weights. |
addMissing, arrayToList, attributeTypeToString, compareDatasets, getNumDate, getNumInstances, getNumInstancesRelational, getNumNominal, getNumNumeric, getNumRelational, getNumString, getPostProcessor, getWords, getWordSeparators, hasClasspathProblems, listToArray, process, setNumDate, setNumInstances, setNumInstancesRelational, setNumNominal, setNumNumeric, setNumRelational, setNumString, setPostProcessor, setWords, setWordSeparators
protected ASEvaluation m_Evaluator
protected ASSearch m_Search
protected boolean m_TestEvaluator
public Enumeration listOptions()
listOptions in interface OptionHandler
listOptions in class CheckScheme
public void setOptions(String[] options) throws Exception
-D Turn on debugging output.
-S Silent mode - prints nothing to stdout.
-N <num> The number of instances in the datasets (default 20).
-nominal <num> The number of nominal attributes (default 2).
-nominal-values <num> The number of values for nominal attributes (default 1).
-numeric <num> The number of numeric attributes (default 1).
-string <num> The number of string attributes (default 1).
-date <num> The number of date attributes (default 1).
-relational <num> The number of relational attributes (default 1).
-num-instances-relational <num> The number of instances in relational/bag attributes (default 10).
-words <comma-separated-list> The words to use in string attributes.
-word-separators <chars> The word separators to use in string attributes.
-eval name [options] Full name and options of the evaluator analyzed. E.g.: weka.attributeSelection.CfsSubsetEval
-search name [options] Full name and options of the search method analyzed. E.g.: weka.attributeSelection.Ranker
-test <eval|search> The scheme to test, either the evaluator or the search method. (Default: eval)
Options specific to evaluator weka.attributeSelection.CfsSubsetEval:
-M Treat missing values as a separate value.
-L Don't include locally predictive attributes.
Options specific to search method weka.attributeSelection.Ranker:
-P <start set> Specify a starting set of attributes, e.g. 1,3,5-7. Any starting attributes specified are ignored during the ranking.
-T <threshold> Specify a threshold by which attributes may be discarded from the ranking.
-N <num to select> Specify the number of attributes to select.
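As a hedged illustration of the option array that setOptions(String[]) and main(String[]) accept, the sketch below assembles the -S, -eval, -search and -test options listed above. CheckArgsSketch and buildArgs are hypothetical names, not part of Weka. The sketch assumes that a scheme name and its own options travel together as a single array element, which is what "name [options]" appears to describe.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Weka: builds an option array for CheckAttributeSelection. */
final class CheckArgsSketch {

    /**
     * Builds the option array.
     * evalSpec and searchSpec are assumed to hold the full class name followed by
     * that scheme's own options, e.g. "weka.attributeSelection.CfsSubsetEval -M".
     */
    static String[] buildArgs(String evalSpec, String searchSpec,
                              boolean testEvaluator, boolean silent) {
        List<String> args = new ArrayList<>();
        if (silent) {
            args.add("-S"); // silent mode: prints nothing to stdout
        }
        args.add("-eval");
        args.add(evalSpec);
        args.add("-search");
        args.add(searchSpec);
        args.add("-test");
        args.add(testEvaluator ? "eval" : "search");
        return args.toArray(new String[0]);
    }

    public static void main(String[] ignored) {
        String[] args = buildArgs(
            "weka.attributeSelection.CfsSubsetEval -M",
            "weka.attributeSelection.Ranker",
            true,
            false);
        // For display only: joining with spaces drops the quoting a shell would need.
        System.out.println(String.join(" ", args));
    }
}
```

With weka.jar on the classpath, the resulting array could then be handed to CheckAttributeSelection.main or to setOptions on an instance.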
setOptions in interface OptionHandler
setOptions in class CheckScheme
options - the list of options as an array of strings
Exception - if an option is not supported
public String[] getOptions()
getOptions in interface OptionHandler
getOptions in class CheckScheme
public void doTests()
doTests in class CheckScheme
public void setEvaluator(ASEvaluation value)
value - the evaluator to use.
public ASEvaluation getEvaluator()
public void setSearch(ASSearch value)
value - the search method to use.
public ASSearch getSearch()
public void setTestEvaluator(boolean value)
value - if true then the evaluator will be tested
public boolean getTestEvaluator()
protected Object getTestObject()
m_TestEvaluator
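The relationship between the -test option, the m_TestEvaluator flag and getTestObject() can be pictured with a small stand-in. This is a sketch with hypothetical names (TestTargetSketch, setTestOption) and generic placeholder types, not the Weka implementation. It assumes that only the value "search" switches away from the documented default of testing the evaluator.

```java
/** Stand-in sketch, not Weka code: picks the object under test from a -test value. */
final class TestTargetSketch<E, S> {
    private final E evaluator;
    private final S search;
    private boolean testEvaluator = true; // mirrors the documented default (eval)

    TestTargetSketch(E evaluator, S search) {
        this.evaluator = evaluator;
        this.search = search;
    }

    /** Assumed parsing rule: "search" selects the search method, anything else the evaluator. */
    void setTestOption(String value) {
        testEvaluator = !"search".equalsIgnoreCase(value);
    }

    boolean getTestEvaluator() {
        return testEvaluator;
    }

    /** Returns either the evaluator or the search method, depending on the flag. */
    Object getTestObject() {
        return testEvaluator ? evaluator : search;
    }

    public static void main(String[] args) {
        TestTargetSketch<String, String> target =
            new TestTargetSketch<>("evaluator", "search method");
        System.out.println(target.getTestObject()); // prints evaluator
        target.setTestOption("search");
        System.out.println(target.getTestObject()); // prints search method
    }
}
```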
protected Object[] makeCopies(Object obj, int num) throws Exception
obj - the object to copy
num - the number of copies
Exception - if copying fails
protected AttributeSelection search(ASSearch search, ASEvaluation eval, Instances data) throws Exception
search - the search scheme to use
eval - the evaluator to use
data - the data to work on
Exception - if the attribute selection fails
protected void testsPerClassType(int classType, boolean weighted, boolean multiInstance)
classType - the class type (NUMERIC, NOMINAL, etc.)
weighted - true if the scheme says it handles weights
multiInstance - true if the scheme handles multi-instance data
protected boolean[] canTakeOptions()
protected boolean[] weightedInstancesHandler()
protected boolean[] multiInstanceHandler()
protected boolean[] declaresSerialVersionUID()
protected boolean[] canPredict(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType)
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NOMINAL, NUMERIC, etc.)
protected boolean[] canHandleNClasses(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int numClasses)
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
numClasses - the number of classes to test
protected boolean[] canHandleClassAsNthAttribute(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, int classIndex)
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
classIndex - the index of the class attribute (0-based, -1 means last attribute)
TestInstances.CLASS_IS_LAST
protected boolean[] canHandleZeroTraining(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType)
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
protected boolean[] correctSearchInitialisation(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType)
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
protected boolean[] canHandleMissing(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, boolean predictorMissing, boolean classMissing, int missingLevel)
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
predictorMissing - true if the missing values may be in the predictors
classMissing - true if the missing values may be in the class
missingLevel - the percentage of missing values
protected boolean[] instanceWeights(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType)
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
protected boolean[] datasetIntegrity(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, boolean predictorMissing, boolean classMissing)
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
predictorMissing - true if we know the scheme can handle (at least) moderate missing predictor values
classMissing - true if we know the scheme can handle (at least) moderate missing class values
protected boolean[] runBasicTest(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, int missingLevel, boolean predictorMissing, boolean classMissing, int numTrain, int numClasses, FastVector accepts)
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
missingLevel - the percentage of missing values
predictorMissing - true if the missing values may be in the predictors
classMissing - true if the missing values may be in the class
numTrain - the number of instances in the training set
numClasses - the number of classes
accepts - the acceptable string in an exception
protected boolean[] runBasicTest(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, int classIndex, int missingLevel, boolean predictorMissing, boolean classMissing, int numTrain, int numClasses, FastVector accepts)
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
classIndex - the attribute index of the class
missingLevel - the percentage of missing values
predictorMissing - true if the missing values may be in the predictors
classMissing - true if the missing values may be in the class
numTrain - the number of instances in the training set
numClasses - the number of classes
accepts - the acceptable string in an exception
protected Instances makeTestDataset(int seed, int numInstances, int numNominal, int numNumeric, int numString, int numDate, int numRelational, int numClasses, int classType, boolean multiInstance) throws Exception
seed - the random number seed
numInstances - the number of instances to generate
numNominal - the number of nominal attributes
numNumeric - the number of numeric attributes
numString - the number of string attributes
numDate - the number of date attributes
numRelational - the number of relational attributes
numClasses - the number of classes (if nominal class)
classType - the class type (NUMERIC, NOMINAL, etc.)
multiInstance - whether the dataset should be a multi-instance dataset
Exception - if the dataset couldn't be generated
CheckScheme.process(Instances)
protected Instances makeTestDataset(int seed, int numInstances, int numNominal, int numNumeric, int numString, int numDate, int numRelational, int numClasses, int classType, int classIndex, boolean multiInstance) throws Exception
seed - the random number seed
numInstances - the number of instances to generate
numNominal - the number of nominal attributes
numNumeric - the number of numeric attributes
numString - the number of string attributes
numDate - the number of date attributes
numRelational - the number of relational attributes
numClasses - the number of classes (if nominal class)
classType - the class type (NUMERIC, NOMINAL, etc.)
classIndex - the index of the class (0-based, -1 as last)
multiInstance - whether the dataset should be a multi-instance dataset
Exception - if the dataset couldn't be generated
TestInstances.CLASS_IS_LAST, CheckScheme.process(Instances)
protected void printAttributeSummary(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType)
nominalPredictor - true if nominal predictor attributes are present
numericPredictor - true if numeric predictor attributes are present
stringPredictor - true if string predictor attributes are present
datePredictor - true if date predictor attributes are present
relationalPredictor - true if relational predictor attributes are present
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
public String getRevision()
public static void main(String[] args)
args - the command-line parameters
Copyright © 2015 University of Waikato, Hamilton, NZ. All rights reserved.