public class Apriori extends AbstractAssociator implements OptionHandler, CARuleMiner, TechnicalInformationHandler
@inproceedings{Agrawal1994,
  author = {R. Agrawal and R. Srikant},
  booktitle = {20th International Conference on Very Large Data Bases},
  pages = {478-499},
  publisher = {Morgan Kaufmann, Los Altos, CA},
  title = {Fast Algorithms for Mining Association Rules in Large Databases},
  year = {1994}
}

@inproceedings{Liu1998,
  author = {Bing Liu and Wynne Hsu and Yiming Ma},
  booktitle = {Fourth International Conference on Knowledge Discovery and Data Mining},
  pages = {80-86},
  publisher = {AAAI Press},
  title = {Integrating Classification and Association Rule Mining},
  year = {1998}
}

Valid options are:
-N <required number of rules output> The required number of rules. (default = 10)
-T <0=confidence | 1=lift | 2=leverage | 3=Conviction> The metric type by which to rank rules. (default = confidence)
-C <minimum metric score of a rule> The minimum confidence of a rule. (default = 0.9)
-D <delta for minimum support> The delta by which the minimum support is decreased in each iteration. (default = 0.05)
-U <upper bound for minimum support> Upper bound for minimum support. (default = 1.0)
-M <lower bound for minimum support> The lower bound for the minimum support. (default = 0.1)
-S <significance level> If used, rules are tested for significance at the given level. Slower. (default = no significance testing)
-I If set the itemsets found are also output. (default = no)
-R Remove columns that contain all missing values (default = no)
-V Report progress iteratively. (default = no)
-A If set class association rules are mined. (default = no)
-c <the class index> The class index. (default = last)
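The `-U`, `-M`, `-D` and `-N` options together describe a search over minimum-support thresholds. The sketch below illustrates one plausible reading: start at the upper bound, lower the threshold by delta each cycle, and stop once the required number of rules is found or the lower bound is reached. `SupportScheduleSketch`, `Outcome` and the `rulesAtSupport` callback are hypothetical names introduced for illustration; the exact starting point and stopping conditions used by `Apriori` itself may differ.

```java
import java.util.function.DoubleToIntFunction;

/** Hypothetical helper (not part of Weka) sketching a support search driven by -U, -M, -D and -N. */
final class SupportScheduleSketch {

    /** Where the search stopped and how many cycles it used. */
    static final class Outcome {
        final double minSupport;
        final int cycles;

        Outcome(double minSupport, int cycles) {
            this.minSupport = minSupport;
            this.cycles = cycles;
        }
    }

    /**
     * Lowers the threshold from upper towards lower in steps of delta.
     * rulesAtSupport is an assumed callback that reports how many rules
     * qualify at a given minimum support.
     */
    static Outcome search(double upper, double lower, double delta,
                          int requiredRules, DoubleToIntFunction rulesAtSupport) {
        if (delta <= 0.0 || upper < lower) {
            throw new IllegalArgumentException("need delta > 0 and upper >= lower");
        }
        double support = upper;
        int cycles = 0;
        // Derive each threshold from the step index so rounding error does not accumulate.
        for (int step = 0; upper - step * delta >= lower - 1e-9; step++) {
            support = upper - step * delta;
            cycles++;
            if (rulesAtSupport.applyAsInt(support) >= requiredRules) {
                break; // enough rules at this threshold
            }
        }
        return new Outcome(support, cycles);
    }

    public static void main(String[] args) {
        // Toy rule counter (assumption): 10 rules appear once support drops to 0.8.
        Outcome o = search(1.0, 0.1, 0.05, 10, s -> s <= 0.8 + 1e-9 ? 10 : 0);
        System.out.printf("stopped at %.2f after %d cycles%n", o.minSupport, o.cycles);
    }
}
```

With the documented defaults (upper bound 1.0, lower bound 0.1, delta 0.05), this schedule visits at most 19 thresholds before giving up.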
Modifier and Type | Field and Description |
---|---|
protected static int |
CONFIDENCE
Metric type: Confidence
|
protected static int |
CONVICTION
Metric type: Conviction
|
protected static int |
LEVERAGE
Metric type: Leverage
|
protected static int |
LIFT
Metric type: Lift
|
protected FastVector[] |
m_allTheRules
The list of all generated rules.
|
protected boolean |
m_car
Flag indicating whether class association rules are mined.
|
protected int |
m_classIndex
The class index.
|
protected int |
m_cycles
Number of cycles used before the required number of rules was found.
|
protected double |
m_delta
Delta by which m_minSupport is decreased in each iteration.
|
protected FastVector |
m_hashtables
The same information stored in hash tables.
|
protected Instances |
m_instances
The instances (transactions) to be used for generating
the association rules.
|
protected double |
m_lowerBoundMinSupport
The lower bound for the minimum support.
|
protected FastVector |
m_Ls
The set of all sets of itemsets L.
|
protected int |
m_metricType
The selected metric type.
|
protected double |
m_minMetric
The minimum metric score.
|
protected double |
m_minSupport
The minimum support.
|
protected int |
m_numRules
The maximum number of rules that are output.
|
protected Instances |
m_onlyClass
Only the class attribute of all Instances.
|
protected boolean |
m_outputItemSets
Output itemsets found?
|
protected boolean |
m_removeMissingCols
Remove columns with all missing values
|
protected double |
m_significanceLevel
Significance level for optional significance test.
|
protected double |
m_upperBoundMinSupport
The upper bound on the support
|
protected boolean |
m_verbose
Report progress iteratively
|
static Tag[] |
TAGS_SELECTION
Metric types.
|
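The four metric-type constants above correspond to the `-T` codes 0 to 3. As a rough illustration of what each metric measures, the sketch below uses the textbook definitions for a rule A => B, expressed in relative supports. `RuleMetricsSketch` and its method names are hypothetical, and Weka's own computation may differ in details such as working from raw counts or smoothing.

```java
/** Hypothetical helper (not part of Weka): textbook definitions of the four ranking metrics. */
final class RuleMetricsSketch {

    // Codes follow the documented -T option: 0=confidence, 1=lift, 2=leverage, 3=conviction.
    static final int CONFIDENCE = 0;
    static final int LIFT = 1;
    static final int LEVERAGE = 2;
    static final int CONVICTION = 3;

    /**
     * Scores a rule A => B. All arguments are relative supports in [0, 1]:
     * suppA = P(A), suppB = P(B), suppAB = P(A and B).
     */
    static double score(int metricType, double suppA, double suppB, double suppAB) {
        double confidence = suppAB / suppA; // P(B | A)
        switch (metricType) {
            case CONFIDENCE:
                return confidence;
            case LIFT:
                // How much more often B occurs with A than expected under independence.
                return confidence / suppB;
            case LEVERAGE:
                // Difference between observed and expected joint support.
                return suppAB - suppA * suppB;
            case CONVICTION:
                // Ratio of expected to observed frequency of "A without B".
                return confidence >= 1.0
                        ? Double.POSITIVE_INFINITY
                        : (1.0 - suppB) / (1.0 - confidence);
            default:
                throw new IllegalArgumentException("unknown metric type: " + metricType);
        }
    }

    public static void main(String[] args) {
        // Example supports (made up for illustration): P(A)=0.5, P(B)=0.4, P(A and B)=0.3.
        for (int type = CONFIDENCE; type <= CONVICTION; type++) {
            System.out.println(type + " -> " + score(type, 0.5, 0.4, 0.3));
        }
    }
}
```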
Constructor and Description |
---|
Apriori()
Constructor that sets default values for the
minimum confidence and the maximum number of rules.
|
Modifier and Type | Method and Description |
---|---|
void |
buildAssociations(Instances instances)
Method that generates all large itemsets with a minimum support, and from
these all association rules with a minimum confidence.
|
String |
carTipText()
Returns the tip text for this property
|
String |
classIndexTipText()
Returns the tip text for this property
|
String |
deltaTipText()
Returns the tip text for this property
|
FastVector[] |
getAllTheRules()
Returns all the rules.
|
Capabilities |
getCapabilities()
Returns default capabilities of the associator.
|
boolean |
getCar()
Gets whether class association rules are mined
|
int |
getClassIndex()
Gets the class index
|
double |
getDelta()
Get the value of delta.
|
Instances |
getInstancesNoClass()
Gets the instances without the class attribute.
|
Instances |
getInstancesOnlyClass()
Gets only the class attribute of the instances.
|
double |
getLowerBoundMinSupport()
Get the value of lowerBoundMinSupport.
|
SelectedTag |
getMetricType()
Get the metric type
|
double |
getMinMetric()
Get the value of minConfidence.
|
int |
getNumRules()
Get the value of numRules.
|
String[] |
getOptions()
Gets the current settings of the Apriori object.
|
boolean |
getOutputItemSets()
Gets whether itemsets are output as well
|
boolean |
getRemoveAllMissingCols()
Returns whether columns containing all missing values are to be removed
|
String |
getRevision()
Returns the revision string.
|
double |
getSignificanceLevel()
Get the value of significanceLevel.
|
TechnicalInformation |
getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing
detailed information about the technical background of this class,
e.g., paper reference or book this class is based on.
|
double |
getUpperBoundMinSupport()
Get the value of upperBoundMinSupport.
|
boolean |
getVerbose()
Gets whether algorithm is run in verbose mode
|
String |
globalInfo()
Returns a string describing this associator
|
Enumeration |
listOptions()
Returns an enumeration describing the available options.
|
String |
lowerBoundMinSupportTipText()
Returns the tip text for this property
|
static void |
main(String[] args)
Main method.
|
String |
metricString()
Returns the metric string for the chosen metric type
|
String |
metricTypeTipText()
Returns the tip text for this property
|
FastVector[] |
mineCARs(Instances data)
Method that mines all class association rules with minimum support and
with a minimum confidence.
|
String |
minMetricTipText()
Returns the tip text for this property
|
String |
numRulesTipText()
Returns the tip text for this property
|
String |
outputItemSetsTipText()
Returns the tip text for this property
|
String |
removeAllMissingColsTipText()
Returns the tip text for this property
|
protected Instances |
removeMissingColumns(Instances instances)
Removes columns that are all missing from the data
|
void |
resetOptions()
Resets the options to the default values.
|
void |
setCar(boolean flag)
Sets class association rule mining
|
void |
setClassIndex(int index)
Sets the class index
|
void |
setDelta(double v)
Set the value of delta.
|
void |
setLowerBoundMinSupport(double v)
Set the value of lowerBoundMinSupport.
|
void |
setMetricType(SelectedTag d)
Set the metric type for ranking rules
|
void |
setMinMetric(double v)
Set the value of minConfidence.
|
void |
setNumRules(int v)
Set the value of numRules.
|
void |
setOptions(String[] options)
Parses a given list of options.
|
void |
setOutputItemSets(boolean flag)
Sets whether itemsets are output as well
|
void |
setRemoveAllMissingCols(boolean r)
Remove columns containing all missing values.
|
void |
setSignificanceLevel(double v)
Set the value of significanceLevel.
|
void |
setUpperBoundMinSupport(double v)
Set the value of upperBoundMinSupport.
|
void |
setVerbose(boolean flag)
Sets verbose mode
|
String |
significanceLevelTipText()
Returns the tip text for this property
|
String |
toString()
Outputs the size of all the generated sets of itemsets and the rules.
|
String |
upperBoundMinSupportTipText()
Returns the tip text for this property
|
String |
verboseTipText()
Returns the tip text for this property
|
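The `buildAssociations` entry above describes two phases: find all large (frequent) itemsets at a minimum support, then derive rules that meet a minimum confidence. A minimal sketch of that idea on plain string transactions follows. `AprioriSketch` is a hypothetical class that does not use Weka's `Instances` or `FastVector` types, it omits the subset-pruning step of the published algorithm, and it only emits rules with a single-item consequent.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical, simplified level-wise miner (not Weka's implementation). */
final class AprioriSketch {

    /** Fraction of transactions that contain every item of the itemset. */
    static double support(List<Set<String>> transactions, Set<String> itemset) {
        int hits = 0;
        for (Set<String> t : transactions) {
            if (t.containsAll(itemset)) {
                hits++;
            }
        }
        return (double) hits / transactions.size();
    }

    /** Phase 1: grow frequent k-itemsets into (k+1)-itemsets until none survive. */
    static List<Set<String>> frequentItemsets(List<Set<String>> transactions, double minSupport) {
        Set<Set<String>> level = new HashSet<>();
        for (Set<String> t : transactions) {
            for (String item : t) {
                Set<String> single = new TreeSet<>();
                single.add(item);
                if (support(transactions, single) >= minSupport) {
                    level.add(single);
                }
            }
        }
        List<Set<String>> all = new ArrayList<>();
        while (!level.isEmpty()) {
            all.addAll(level);
            Set<Set<String>> next = new HashSet<>();
            for (Set<String> a : level) {
                for (Set<String> b : level) {
                    Set<String> union = new TreeSet<>(a);
                    union.addAll(b);
                    // Keep only joins that add exactly one item and are still frequent.
                    if (union.size() == a.size() + 1
                            && support(transactions, union) >= minSupport) {
                        next.add(union);
                    }
                }
            }
            level = next;
        }
        return all;
    }

    /** Phase 2: rules "premise => item" whose confidence reaches minConfidence. */
    static List<String> rules(List<Set<String>> transactions,
                              double minSupport, double minConfidence) {
        List<String> out = new ArrayList<>();
        for (Set<String> itemset : frequentItemsets(transactions, minSupport)) {
            if (itemset.size() < 2) {
                continue;
            }
            for (String consequent : itemset) {
                Set<String> premise = new TreeSet<>(itemset);
                premise.remove(consequent);
                double confidence =
                        support(transactions, itemset) / support(transactions, premise);
                if (confidence >= minConfidence) {
                    out.add(premise + " => " + consequent);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy transactions, made up for illustration.
        List<Set<String>> tx = List.of(
                Set.of("bread", "milk"), Set.of("bread", "butter"),
                Set.of("bread", "milk", "butter"), Set.of("milk"));
        System.out.println(rules(tx, 0.5, 0.9));
    }
}
```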
Methods inherited from class AbstractAssociator: forName, makeCopies, makeCopy, runAssociator
protected double m_minSupport
protected double m_upperBoundMinSupport
protected double m_lowerBoundMinSupport
protected static final int CONFIDENCE
protected static final int LIFT
protected static final int LEVERAGE
protected static final int CONVICTION
public static final Tag[] TAGS_SELECTION
protected int m_metricType
protected double m_minMetric
protected int m_numRules
protected double m_delta
protected double m_significanceLevel
protected int m_cycles
protected FastVector m_Ls
protected FastVector m_hashtables
protected FastVector[] m_allTheRules
protected Instances m_instances
protected boolean m_outputItemSets
protected boolean m_removeMissingCols
protected boolean m_verbose
protected Instances m_onlyClass
protected int m_classIndex
protected boolean m_car
public Apriori()
public String globalInfo()
public TechnicalInformation getTechnicalInformation()
getTechnicalInformation
in interface TechnicalInformationHandler
public void resetOptions()
protected Instances removeMissingColumns(Instances instances) throws Exception
instances
- the instances
Exception
- if something goes wrong
public Capabilities getCapabilities()
getCapabilities
in interface Associator
getCapabilities
in interface CapabilitiesHandler
getCapabilities
in class AbstractAssociator
Capabilities
public void buildAssociations(Instances instances) throws Exception
buildAssociations
in interface Associator
instances
- the instances to be used for generating the associations
Exception
- if rules can't be built successfully
public FastVector[] mineCARs(Instances data) throws Exception
mineCARs
in interface CARuleMiner
data
- the instances for which class association rules should be mined
Exception
- if rules can't be built successfully
public Instances getInstancesNoClass()
getInstancesNoClass
in interface CARuleMiner
public Instances getInstancesOnlyClass()
getInstancesOnlyClass
in interface CARuleMiner
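`getInstancesNoClass()` and `getInstancesOnlyClass()` expose the two halves of the data used for class association rule mining: the attributes without the class column, and the class column on its own. The sketch below shows that split on a plain list-of-rows table. `ClassSplitSketch` is a hypothetical helper; the 0-based indexing and the convention that a negative index means "last column" (echoing the documented `-c` default) are assumptions of this sketch, not statements about Weka's behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Weka) that separates a class column from the rest. */
final class ClassSplitSketch {

    /** Assumption: 0-based index, and any negative value selects the last column. */
    static int resolve(int classIndex, int numColumns) {
        return classIndex < 0 ? numColumns - 1 : classIndex;
    }

    /** Copies every row with the class column dropped. */
    static List<List<String>> withoutClass(List<List<String>> rows, int classIndex) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : rows) {
            List<String> kept = new ArrayList<>(row);
            kept.remove(resolve(classIndex, row.size())); // int argument: removes by position
            out.add(kept);
        }
        return out;
    }

    /** Collects only the class value of every row. */
    static List<String> onlyClass(List<List<String>> rows, int classIndex) {
        List<String> out = new ArrayList<>();
        for (List<String> row : rows) {
            out.add(row.get(resolve(classIndex, row.size())));
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy rows, made up for illustration; the last column plays the class role.
        List<List<String>> rows = List.of(
                List.of("sunny", "hot", "no"),
                List.of("rainy", "mild", "yes"));
        System.out.println(withoutClass(rows, -1));
        System.out.println(onlyClass(rows, -1));
    }
}
```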
public Enumeration listOptions()
listOptions
in interface OptionHandler
public void setOptions(String[] options) throws Exception
-N <required number of rules output> The required number of rules. (default = 10)
-T <0=confidence | 1=lift | 2=leverage | 3=Conviction> The metric type by which to rank rules. (default = confidence)
-C <minimum metric score of a rule> The minimum confidence of a rule. (default = 0.9)
-D <delta for minimum support> The delta by which the minimum support is decreased in each iteration. (default = 0.05)
-U <upper bound for minimum support> Upper bound for minimum support. (default = 1.0)
-M <lower bound for minimum support> The lower bound for the minimum support. (default = 0.1)
-S <significance level> If used, rules are tested for significance at the given level. Slower. (default = no significance testing)
-I If set the itemsets found are also output. (default = no)
-R Remove columns that contain all missing values (default = no)
-V Report progress iteratively. (default = no)
-A If set class association rules are mined. (default = no)
-c <the class index> The class index. (default = last)
setOptions
in interface OptionHandler
options
- the list of options as an array of strings
Exception
- if an option is not supported
public String[] getOptions()
getOptions
in interface OptionHandler
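`setOptions` and `getOptions` exchange settings as a flat array of strings, where value options are followed by their argument and switches stand alone. The sketch below illustrates that array shape using the flags documented for this class. `OptionArraySketch.parse` is a hypothetical stand-in written for this example; it is not how Weka parses options, and it rejects unknown flags simply to mirror the documented "if an option is not supported" condition.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical parser (not part of Weka) for the documented option array shape. */
final class OptionArraySketch {

    // Flags that take a value, per the option list of this class.
    private static final Set<String> VALUED =
            Set.of("-N", "-T", "-C", "-D", "-U", "-M", "-S", "-c");
    // Flags that act as on/off switches.
    private static final Set<String> SWITCHES = Set.of("-I", "-R", "-V", "-A");

    /** Maps each flag to its value; switches map to "true". */
    static Map<String, String> parse(String[] options) {
        Map<String, String> parsed = new LinkedHashMap<>();
        for (int i = 0; i < options.length; i++) {
            String flag = options[i];
            if (VALUED.contains(flag)) {
                if (i + 1 >= options.length) {
                    throw new IllegalArgumentException("missing value for " + flag);
                }
                parsed.put(flag, options[++i]); // consume the flag's argument
            } else if (SWITCHES.contains(flag)) {
                parsed.put(flag, "true");
            } else {
                throw new IllegalArgumentException("unsupported option: " + flag);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        // Ask for 20 rules at minimum confidence 0.8 and also output the itemsets.
        System.out.println(parse(new String[] {"-N", "20", "-C", "0.8", "-I"}));
    }
}
```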
public String toString()
public String metricString()
metricString
in interface CARuleMiner
public String removeAllMissingColsTipText()
public void setRemoveAllMissingCols(boolean r)
r
- true if cols are to be removed.
public boolean getRemoveAllMissingCols()
public String upperBoundMinSupportTipText()
public double getUpperBoundMinSupport()
public void setUpperBoundMinSupport(double v)
v
- Value to assign to upperBoundMinSupport.
public void setClassIndex(int index)
setClassIndex
in interface CARuleMiner
index
- the class index
public int getClassIndex()
public String classIndexTipText()
public void setCar(boolean flag)
flag
- true if class association rules are mined, false otherwise
public boolean getCar()
public String carTipText()
public String lowerBoundMinSupportTipText()
public double getLowerBoundMinSupport()
public void setLowerBoundMinSupport(double v)
v
- Value to assign to lowerBoundMinSupport.
public SelectedTag getMetricType()
public String metricTypeTipText()
public void setMetricType(SelectedTag d)
d
- the type of metric
public String minMetricTipText()
public double getMinMetric()
public void setMinMetric(double v)
v
- Value to assign to minConfidence.
public String numRulesTipText()
public int getNumRules()
public void setNumRules(int v)
v
- Value to assign to numRules.
public String deltaTipText()
public double getDelta()
public void setDelta(double v)
v
- Value to assign to delta.
public String significanceLevelTipText()
public double getSignificanceLevel()
public void setSignificanceLevel(double v)
v
- Value to assign to significanceLevel.
public void setOutputItemSets(boolean flag)
flag
- true if itemsets are to be output as well
public boolean getOutputItemSets()
public String outputItemSetsTipText()
public void setVerbose(boolean flag)
flag
- true if algorithm should be run in verbose mode
public boolean getVerbose()
public String verboseTipText()
public FastVector[] getAllTheRules()
m_allTheRules
public String getRevision()
getRevision
in interface RevisionHandler
getRevision
in class AbstractAssociator
public static void main(String[] args)
args
- the commandline options

Copyright © 2015 University of Waikato, Hamilton, NZ. All rights reserved.