public class FPGrowth extends AbstractAssociator implements OptionHandler, TechnicalInformationHandler
@inproceedings{Han2000,
   author = {J. Han and J. Pei and Y. Yin},
   booktitle = {Proceedings of the 2000 ACM-SIGMOD International Conference on Management of Data},
   pages = {1-12},
   title = {Mining frequent patterns without candidate generation},
   year = {2000}
}
Valid options are:
-P <attribute index of positive value> Set the index of the attribute value to consider as 'positive' for binary attributes in normal dense instances. Index 2 is always used for sparse instances. (default = 2)
-I <max items> The maximum number of items to include in large item sets (and rules). (default = -1, i.e. no limit)
-N <require number of rules> The required number of rules. (default = 10)
-T <0=confidence | 1=lift | 2=leverage | 3=Conviction> The metric by which to rank rules. (default = confidence)
-C <minimum metric score of a rule> The minimum metric score of a rule. (default = 0.9)
-U <upper bound for minimum support> Upper bound for minimum support. (default = 1.0)
-M <lower bound for minimum support> The lower bound for the minimum support. (default = 0.1)
-D <delta for minimum support> The delta by which the minimum support is decreased in each iteration. (default = 0.05)
-S Find all rules that meet the lower bound on minimum support and the minimum metric constraint. Turning this mode on will disable the iterative support reduction procedure to find the specified number of rules.
-transactions <comma separated list of attribute names> Only consider transactions that contain these items (default = no restriction)
-rules <comma separated list of attribute names> Only print rules that contain these items. (default = no restriction)
-use-or Use OR instead of AND for must contain list(s). Use in conjunction with -transactions and/or -rules
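The four metrics selectable with -T can be illustrated with their textbook definitions. The sketch below is not taken from the FPGrowth source: the class name RuleMetrics and its method signatures are hypothetical, and the actual implementation may differ in details such as rounding or the handling of edge cases. It computes each metric for a rule A => B from raw transaction counts.

```java
/** Textbook rule metrics for a rule A => B (hypothetical helper, not Weka code). */
final class RuleMetrics {
    private RuleMetrics() {}

    /** confidence = count(A and B) / count(A) */
    static double confidence(int countAB, int countA) {
        return (double) countAB / countA;
    }

    /** lift = confidence / P(B) */
    static double lift(int countAB, int countA, int countB, int total) {
        return confidence(countAB, countA) / ((double) countB / total);
    }

    /** leverage = P(A and B) - P(A) * P(B) */
    static double leverage(int countAB, int countA, int countB, int total) {
        double pAB = (double) countAB / total;
        return pAB - ((double) countA / total) * ((double) countB / total);
    }

    /** conviction = (1 - P(B)) / (1 - confidence); infinite when confidence is 1 */
    static double conviction(int countAB, int countA, int countB, int total) {
        double conf = confidence(countAB, countA);
        if (conf == 1.0) {
            return Double.POSITIVE_INFINITY;
        }
        return (1.0 - (double) countB / total) / (1.0 - conf);
    }
}
```

With 100 transactions, of which 40 contain A, 50 contain B and 30 contain both, these definitions give a confidence of 0.75, a lift of 1.5, a leverage of about 0.1 and a conviction of 2.0. A rule like this would be rejected by the default -C 0.9 when ranked by confidence.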
Modifier and Type | Class and Description |
---|---|
static class |
FPGrowth.AssociationRule |
static class |
FPGrowth.BinaryItem
Inner class that handles a single binary item
|
protected static class |
FPGrowth.FPTreeNode
A node in the FP-tree.
|
protected static class |
FPGrowth.FrequentBinaryItemSet
Class for maintaining a frequent item set.
|
protected static class |
FPGrowth.FrequentItemSets
Maintains a list of frequent item sets.
|
protected static class |
FPGrowth.ShadowCounts
This class holds the counts for projected tree nodes
and header lists.
|
Modifier and Type | Field and Description |
---|---|
protected double |
m_delta
The amount by which to decrease the support in each iteration
|
protected boolean |
m_findAllRulesForSupportLevel
If true, find all rules that meet the lower bound on the minimum
support and the minimum metric threshold.
|
protected FPGrowth.FrequentItemSets |
m_largeItemSets
Holds the large item sets found
|
protected double |
m_lowerBoundMinSupport
The lower bound on minimum support
|
protected int |
m_maxItems |
protected FPGrowth.AssociationRule.METRIC_TYPE |
m_metric |
protected double |
m_metricThreshold |
protected boolean |
m_mustContainOR
Use OR rather than AND when considering must contain lists
|
protected int |
m_numRulesToFind
The number of rules to find
|
protected int |
m_positiveIndex
The index (1 based) of binary attributes to treat as the positive value
|
protected List<FPGrowth.AssociationRule> |
m_rules
Holds the rules
|
protected String |
m_rulesMustContain
If set, then only output rules containing these items
|
protected String |
m_transactionsMustContain
If set, limit the transactions (instances) input to the
algorithm to those that contain these items
|
protected double |
m_upperBoundMinSupport
The upper bound on the minimum support
|
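The fields m_upperBoundMinSupport, m_lowerBoundMinSupport, m_delta and m_numRulesToFind together drive the iterative support reduction described for the -U, -M, -D and -N options. The following is a minimal sketch of that idea, not the class's actual code: SupportSchedule, findSupport and the rulesFoundAt callback are hypothetical names, and it is an assumption that the search starts exactly at the upper bound.

```java
import java.util.function.DoubleToIntFunction;

/** Hypothetical sketch of lowering the minimum support until enough rules are found. */
final class SupportSchedule {
    private static final double EPS = 1e-9;

    private SupportSchedule() {}

    /**
     * Tries support levels upper, upper - delta, upper - 2 * delta, ... and stops
     * at the first level where the callback reports at least numRules rules, or
     * at the last level that is still at or above the lower bound.
     */
    static double findSupport(double upper, double lower, double delta,
                              int numRules, DoubleToIntFunction rulesFoundAt) {
        if (delta <= 0) {
            throw new IllegalArgumentException("delta must be positive");
        }
        double support = upper;
        for (int step = 0; upper - step * delta >= lower - EPS; step++) {
            support = upper - step * delta;
            if (rulesFoundAt.applyAsInt(support) >= numRules) {
                break; // enough rules at this support level
            }
        }
        return support;
    }
}
```

With the documented defaults (upper bound 1.0, lower bound 0.1, delta 0.05, 10 rules), a miner that first yields 10 or more rules at a support of 0.8 would stop there after five attempts. Setting m_findAllRulesForSupportLevel skips this loop entirely.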
Constructor and Description |
---|
FPGrowth()
Construct a new FPGrowth object.
|
Modifier and Type | Method and Description |
---|---|
void |
buildAssociations(Instances data)
Method that generates all large item sets with a minimum support, and from
these all association rules with a minimum metric (i.e.
|
protected weka.associations.FPGrowth.FPTreeRoot |
buildFPTree(ArrayList<FPGrowth.BinaryItem> singletons,
Instances data,
int minSupport)
Construct the frequent pattern tree by inserting each transaction
in the data into the tree.
|
String |
deltaTipText()
Returns the tip text for this property
|
String |
findAllRulesForSupportLevelTipText()
Tip text for this property suitable for displaying
in the GUI.
|
List<FPGrowth.AssociationRule> |
getAssociationRules()
Gets the list of mined association rules.
|
Capabilities |
getCapabilities()
Returns the default capabilities of this associator.
|
double |
getDelta()
Get the value of delta.
|
boolean |
getFindAllRulesForSupportLevel()
Get whether all rules meeting the lower bound on min support
and the minimum metric threshold are to be found.
|
double |
getLowerBoundMinSupport()
Get the value of lowerBoundMinSupport.
|
int |
getMaxNumberOfItems()
Gets the maximum number of items to be included in large item sets.
|
SelectedTag |
getMetricType()
Get the metric type to use.
|
double |
getMinMetric()
Get the minimum metric score a rule must reach.
|
int |
getNumRulesToFind()
Get the number of rules to find.
|
String[] |
getOptions()
Gets the current settings of the associator.
|
int |
getPositiveIndex()
Get the index of the attribute value to consider as positive
for binary attributes in normal dense instances.
|
String |
getRevision()
Returns the revision string.
|
String |
getRulesMustContain()
Get the comma separated list of items that
rules must contain in order to be output.
|
protected ArrayList<FPGrowth.BinaryItem> |
getSingletons(Instances data)
Get the singleton items in the data
|
TechnicalInformation |
getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing
detailed information about the technical background of this class,
e.g., paper reference or book this class is based on.
|
String |
getTransactionsMustContain()
Gets the comma separated list of items that
transactions must contain in order to be considered
for large item sets and rules.
|
double |
getUpperBoundMinSupport()
Get the value of upperBoundMinSupport.
|
boolean |
getUseORForMustContainList()
Gets whether OR is to be used rather than AND when
considering must contain lists.
|
String |
globalInfo()
Returns a string describing this associator
|
String |
graph(weka.associations.FPGrowth.FPTreeRoot tree)
Assemble a dot graph representation of the FP-tree.
|
Enumeration<Option> |
listOptions()
Returns an enumeration describing the available options.
|
String |
lowerBoundMinSupportTipText()
Returns the tip text for this property
|
static void |
main(String[] args)
Main method.
|
String |
maxNumberOfItemsTipText()
Tip text for this property suitable for displaying
in the GUI.
|
String |
metricTypeTipText()
Tip text for this property suitable for displaying
in the GUI.
|
protected void |
mineTree(weka.associations.FPGrowth.FPTreeRoot tree,
FPGrowth.FrequentItemSets largeItemSets,
int recursionLevel,
FPGrowth.FrequentBinaryItemSet conditionalItems,
int minSupport)
Find large item sets in the FP-tree.
|
String |
minMetricTipText()
Returns the tip text for this property
|
String |
numRulesToFindTipText()
Tip text for this property suitable for displaying
in the GUI.
|
String |
positiveIndexTipText()
Tip text for this property suitable for displaying
in the GUI.
|
void |
resetOptions()
Reset all options to their default values.
|
String |
rulesMustContainTipText()
Returns the tip text for this property
|
void |
setDelta(double v)
Set the value of delta.
|
void |
setFindAllRulesForSupportLevel(boolean s)
If true then turn off the iterative support reduction method
of finding x rules that meet the minimum support and metric
thresholds and just return all the rules that meet the
lower bound on minimum support and the minimum metric.
|
void |
setLowerBoundMinSupport(double v)
Set the value of lowerBoundMinSupport.
|
void |
setMaxNumberOfItems(int max)
Set the maximum number of items to include in large item sets.
|
void |
setMetricType(SelectedTag d)
Set the metric type to use.
|
void |
setMinMetric(double v)
Set the minimum metric score a rule must reach.
|
void |
setNumRulesToFind(int numR)
Set the desired number of rules to find.
|
void |
setOptions(String[] options)
Parses a given list of options.
|
void |
setPositiveIndex(int index)
Set the index of the attribute value to consider as positive
for binary attributes in normal dense instances.
|
void |
setRulesMustContain(String list)
Set the comma separated list of items that rules
must contain in order to be output.
|
void |
setTransactionsMustContain(String list)
Set the comma separated list of items that transactions
must contain in order to be considered for large
item sets and rules.
|
void |
setUpperBoundMinSupport(double v)
Set the value of upperBoundMinSupport.
|
void |
setUseORForMustContainList(boolean b)
Set whether to use OR rather than AND when considering
must contain lists.
|
String |
toString()
Output the association rules.
|
String |
transactionsMustContainTipText()
Returns the tip text for this property
|
String |
upperBoundMinSupportTipText()
Returns the tip text for this property
|
String |
useORForMustContainListTipText()
Returns the tip text for this property
|
String |
xmlRules() |
Methods inherited from class AbstractAssociator: forName, makeCopies, makeCopy, runAssociator
protected int m_numRulesToFind
protected double m_upperBoundMinSupport
protected double m_lowerBoundMinSupport
protected double m_delta
protected boolean m_findAllRulesForSupportLevel
protected int m_positiveIndex
protected FPGrowth.AssociationRule.METRIC_TYPE m_metric
protected double m_metricThreshold
protected FPGrowth.FrequentItemSets m_largeItemSets
protected List<FPGrowth.AssociationRule> m_rules
protected int m_maxItems
protected String m_transactionsMustContain
protected boolean m_mustContainOR
protected String m_rulesMustContain
public Capabilities getCapabilities()
Specified by: getCapabilities in interface Associator
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class AbstractAssociator
Capabilities
public String globalInfo()
public TechnicalInformation getTechnicalInformation()
Specified by: getTechnicalInformation in interface TechnicalInformationHandler
protected ArrayList<FPGrowth.BinaryItem> getSingletons(Instances data) throws Exception
Parameters:
data - the Instances to process
Throws:
Exception - if the singletons can't be found for some reason
protected weka.associations.FPGrowth.FPTreeRoot buildFPTree(ArrayList<FPGrowth.BinaryItem> singletons, Instances data, int minSupport)
Parameters:
singletons - the singleton item sets
data - the Instances containing the transactions
minSupport - the minimum support
protected void mineTree(weka.associations.FPGrowth.FPTreeRoot tree, FPGrowth.FrequentItemSets largeItemSets, int recursionLevel, FPGrowth.FrequentBinaryItemSet conditionalItems, int minSupport)
Parameters:
tree - the root of the tree to mine
largeItemSets - holds the large item sets found
recursionLevel - the recursion level for the current projected counts
conditionalItems - the current set of items that the current (projected) tree is conditional on
minSupport - the minimum acceptable support
public void resetOptions()
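The way buildFPTree inserts each transaction into the tree can be pictured with a small standalone sketch. It follows the general FP-tree construction from the Han, Pei and Yin paper rather than this class's internals: FPTreeSketch, Node, tx and build are hypothetical names, transactions are plain sets of strings instead of Instances, and header lists and shadow counts are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified FP-tree construction (not the Weka implementation). */
final class FPTreeSketch {
    /** One tree node: an item label, a count, and children keyed by item. */
    static final class Node {
        final String item;
        int count;
        final Map<String, Node> children = new HashMap<>();
        Node(String item) { this.item = item; }
    }

    private FPTreeSketch() {}

    /** Convenience factory for a transaction. */
    static Set<String> tx(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    /** Builds the tree from items that occur in at least minSupport transactions. */
    static Node build(List<Set<String>> transactions, int minSupport) {
        // First pass: count how many transactions contain each item.
        Map<String, Integer> counts = new HashMap<>();
        for (Set<String> t : transactions) {
            for (String item : t) {
                counts.merge(item, 1, Integer::sum);
            }
        }
        // Descending frequency; ties broken alphabetically so the result is deterministic.
        Comparator<String> order = (a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        };
        // Second pass: insert each filtered, sorted transaction as a path from the root.
        Node root = new Node(null);
        for (Set<String> t : transactions) {
            List<String> items = new ArrayList<>();
            for (String item : t) {
                if (counts.get(item) >= minSupport) {
                    items.add(item);
                }
            }
            items.sort(order);
            Node current = root;
            for (String item : items) {
                current = current.children.computeIfAbsent(item, Node::new);
                current.count++;
            }
        }
        return root;
    }
}
```

Because frequent items are placed closest to the root, transactions that share them also share a prefix of the tree, which is what keeps the structure compact. Here minSupport is an absolute transaction count, matching the int parameter of buildFPTree.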
public String positiveIndexTipText()
public void setPositiveIndex(int index)
Parameters:
index - the index to use for positive values in binary attributes.
public int getPositiveIndex()
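The role of the positive index can be shown with a small sketch that turns one dense row of binary attributes into a set of items. The representation is an assumption made for illustration: PositiveIndexSketch and toItems are hypothetical names, and each attribute value is given as a 0-based value index, while positiveIndex is 1-based as documented for m_positiveIndex.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical illustration of the 1-based positive index (not Weka code). */
final class PositiveIndexSketch {
    private PositiveIndexSketch() {}

    /**
     * Returns the names of the attributes whose value in this row is the
     * "positive" one. valueIndices holds one 0-based value index per attribute.
     */
    static Set<String> toItems(String[] attributeNames, int[] valueIndices, int positiveIndex) {
        Set<String> items = new LinkedHashSet<>();
        for (int a = 0; a < attributeNames.length; a++) {
            // Convert the 1-based option value to the 0-based value index.
            if (valueIndices[a] == positiveIndex - 1) {
                items.add(attributeNames[a]);
            }
        }
        return items;
    }
}
```

With the default index of 2, the second declared value of each binary attribute marks the item as present, so an attribute declared with values {f, t} contributes an item whenever its value is t.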
public void setNumRulesToFind(int numR)
Parameters:
numR - the number of rules to find.
public int getNumRulesToFind()
public String numRulesToFindTipText()
public void setMetricType(SelectedTag d)
Parameters:
d - the metric type
public void setMaxNumberOfItems(int max)
Parameters:
max - the maximum number of items to include in large item sets.
public int getMaxNumberOfItems()
public String maxNumberOfItemsTipText()
public SelectedTag getMetricType()
public String metricTypeTipText()
public String minMetricTipText()
public double getMinMetric()
public void setMinMetric(double v)
Parameters:
v - the minimum metric score to assign.
public String transactionsMustContainTipText()
public void setTransactionsMustContain(String list)
Parameters:
list - a comma separated list of items (empty string indicates no restriction on the transactions).
public String getTransactionsMustContain()
public String rulesMustContainTipText()
public void setRulesMustContain(String list)
Parameters:
list - a comma separated list of items (empty string indicates no restriction on the rules).
public String getRulesMustContain()
public String useORForMustContainListTipText()
public void setUseORForMustContainList(boolean b)
Parameters:
b - true if OR should be used instead of AND when considering transaction and rules must contain lists.
public boolean getUseORForMustContainList()
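The interaction between the must-contain lists and the OR switch can be sketched as a simple predicate. This is an illustration under stated assumptions, not the class's code: MustContainFilter, parseList and passes are hypothetical names, and item names are matched exactly after trimming whitespace, which the original documentation does not specify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical AND/OR must-contain check for a transaction or rule (not Weka code). */
final class MustContainFilter {
    private MustContainFilter() {}

    /** Splits a comma separated list into trimmed, non-empty item names. */
    static List<String> parseList(String list) {
        List<String> out = new ArrayList<>();
        if (list == null) {
            return out;
        }
        for (String part : list.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                out.add(name);
            }
        }
        return out;
    }

    /** True if the items satisfy the list: all names by default, any name if useOr. */
    static boolean passes(Set<String> items, String mustContain, boolean useOr) {
        List<String> required = parseList(mustContain);
        if (required.isEmpty()) {
            return true; // empty string means no restriction
        }
        if (useOr) {
            for (String name : required) {
                if (items.contains(name)) {
                    return true;
                }
            }
            return false;
        }
        return items.containsAll(required);
    }
}
```

A transaction containing bread and milk fails the list "bread,eggs" under the default AND semantics but passes once OR is enabled.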
public String deltaTipText()
public double getDelta()
public void setDelta(double v)
Parameters:
v - Value to assign to delta.
public String lowerBoundMinSupportTipText()
public double getLowerBoundMinSupport()
public void setLowerBoundMinSupport(double v)
Parameters:
v - Value to assign to lowerBoundMinSupport.
public String upperBoundMinSupportTipText()
public double getUpperBoundMinSupport()
public void setUpperBoundMinSupport(double v)
Parameters:
v - Value to assign to upperBoundMinSupport.
public String findAllRulesForSupportLevelTipText()
public void setFindAllRulesForSupportLevel(boolean s)
Parameters:
s - true if all rules meeting the lower bound on the support and minimum metric thresholds are to be found.
public boolean getFindAllRulesForSupportLevel()
public List<FPGrowth.AssociationRule> getAssociationRules()
public Enumeration<Option> listOptions()
Specified by: listOptions in interface OptionHandler
public void setOptions(String[] options) throws Exception
Specified by: setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
Exception - if an option is not supported
public String[] getOptions()
Specified by: getOptions in interface OptionHandler
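The option array accepted by setOptions and returned by getOptions is a flat list of switches and values. The sketch below shows how such an array can be read. It is a deliberately simple stand-in and not the parser the class uses: OptionSketch, value and flag are hypothetical helpers, and the defaults in the usage note are the ones listed in the option descriptions.

```java
/** Hypothetical reader for a flat option array such as {"-N", "20", "-S"} (not Weka code). */
final class OptionSketch {
    private OptionSketch() {}

    /** Returns the value that follows "-name", or def if the option is absent. */
    static String value(String[] options, String name, String def) {
        for (int i = 0; i < options.length - 1; i++) {
            if (options[i].equals("-" + name)) {
                return options[i + 1];
            }
        }
        return def;
    }

    /** Returns true if the bare switch "-name" is present. */
    static boolean flag(String[] options, String name) {
        for (String option : options) {
            if (option.equals("-" + name)) {
                return true;
            }
        }
        return false;
    }
}
```

For the array {"-N", "20", "-T", "1", "-C", "1.5", "-S", "-use-or"}, value(options, "N", "10") yields "20", value(options, "M", "0.1") falls back to "0.1", and both flag(options, "S") and flag(options, "use-or") are true.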
public void buildAssociations(Instances data) throws Exception
Specified by: buildAssociations in interface Associator
Parameters:
data - the instances to be used for generating the associations
Throws:
Exception - if rules can't be built successfully
public String toString()
public String graph(weka.associations.FPGrowth.FPTreeRoot tree)
Parameters:
tree - the root of the FP-tree
public String xmlRules()
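What a dot graph representation of an FP-tree might look like can be shown with a short standalone sketch. The exact text produced by graph(...) is not documented here, so the node naming and label layout below are assumptions, and DotSketch, Node and toDot are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical Graphviz DOT writer for a small item tree (not the Weka output format). */
final class DotSketch {
    /** Minimal tree node holding an item label and a count. */
    static final class Node {
        final String label;
        final int count;
        final List<Node> children = new ArrayList<>();
        Node(String label, int count) { this.label = label; this.count = count; }
        Node add(Node child) { children.add(child); return this; }
    }

    private DotSketch() {}

    /** Renders the tree as a digraph, numbering nodes N0, N1, ... in pre-order. */
    static String toDot(Node root) {
        StringBuilder sb = new StringBuilder("digraph FPTree {\n");
        emit(root, new int[] {0}, sb);
        return sb.append("}\n").toString();
    }

    // Writes this node, then each child subtree followed by the edge to that child.
    // Labels are written as-is; quotes inside labels are not escaped in this sketch.
    private static int emit(Node node, int[] nextId, StringBuilder sb) {
        int id = nextId[0]++;
        sb.append("  N").append(id).append(" [label=\"")
          .append(node.label).append(" (").append(node.count).append(")\"];\n");
        for (Node child : node.children) {
            int childId = emit(child, nextId, sb);
            sb.append("  N").append(id).append(" -> N").append(childId).append(";\n");
        }
        return id;
    }
}
```

The resulting string can be passed to any Graphviz-compatible viewer to inspect the shape of a small tree.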
public String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class AbstractAssociator
public static void main(String[] args)
Parameters:
args - the commandline options
Copyright © 2015 University of Waikato, Hamilton, NZ. All rights reserved.