public class SimpleKMeans extends RandomizableClusterer implements NumberOfClustersRequestable, WeightedInstancesHandler
-N <num> Number of clusters (default 2).
-V Display std. deviations for centroids.
-M Don't replace missing values with mean/mode.
-S <num> Random number seed. (default 10)
-A <classname and options> Distance function to be used for instance comparison (default weka.core.EuclideanDistance)
-I <num> Maximum number of iterations.
-O Preserve order of instances.
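The flags above are passed to setOptions as an array of strings, one token per element. The sketch below only illustrates the shape of such an array and how a bare flag and a valued option could be read back from it. `OptionsSketch`, `hasFlag`, and `valueOf` are hypothetical helpers written for this illustration; they are not part of Weka and do not reproduce its option parsing.

```java
import java.util.Arrays;
import java.util.List;

public class OptionsSketch {

    // Hypothetical helper: true if the bare flag (e.g. "-O") is present.
    static boolean hasFlag(String[] options, String flag) {
        return Arrays.asList(options).contains(flag);
    }

    // Hypothetical helper: returns the token following the flag, or the
    // fallback when the flag is absent or has no token after it.
    static String valueOf(String[] options, String flag, String fallback) {
        List<String> tokens = Arrays.asList(options);
        int i = tokens.indexOf(flag);
        return (i >= 0 && i + 1 < tokens.size()) ? tokens.get(i + 1) : fallback;
    }

    public static void main(String[] args) {
        // Three clusters, seed 42, at most 100 iterations, keep instance order.
        String[] options = {"-N", "3", "-S", "42", "-I", "100", "-O"};

        // Fallbacks mirror the documented defaults: 2 clusters, seed 10.
        int numClusters = Integer.parseInt(valueOf(options, "-N", "2"));
        int seed = Integer.parseInt(valueOf(options, "-S", "10"));
        boolean preserveOrder = hasFlag(options, "-O");

        System.out.println(numClusters + " clusters, seed " + seed
                + ", preserve order: " + preserveOrder);
    }
}
```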
See Also: RandomizableClusterer
Modifier and Type | Field and Description |
---|---|
protected int[] |
m_Assignments
Assignments obtained
|
protected DistanceFunction |
m_DistanceFunction
The distance function used.
|
Fields inherited from class RandomizableClusterer: m_Seed, m_SeedDefault
Constructor and Description |
---|
SimpleKMeans()
The default constructor.
|
Modifier and Type | Method and Description |
---|---|
void |
buildClusterer(Instances data)
Generates a clusterer.
|
int |
clusterInstance(Instance instance)
Assigns a given instance to a cluster and returns the cluster index.
|
String |
displayStdDevsTipText()
Returns the tip text for this property
|
String |
distanceFunctionTipText()
Returns the tip text for this property.
|
String |
dontReplaceMissingValuesTipText()
Returns the tip text for this property
|
int[] |
getAssignments()
Gets the assignments for each instance
|
Capabilities |
getCapabilities()
Returns default capabilities of the clusterer.
|
Instances |
getClusterCentroids()
Gets the cluster centroids
|
int[][][] |
getClusterNominalCounts()
Returns for each cluster the frequency counts for the values of each
nominal attribute
|
int[] |
getClusterSizes()
Gets the number of instances in each cluster
|
Instances |
getClusterStandardDevs()
Gets the standard deviations of the numeric attributes in each cluster
|
boolean |
getDisplayStdDevs()
Gets whether standard deviations and nominal counts
should be displayed in the clustering output
|
DistanceFunction |
getDistanceFunction()
Returns the distance function currently in use.
|
boolean |
getDontReplaceMissingValues()
Gets whether replacement of missing values is disabled
|
int |
getMaxIterations()
Gets the maximum number of iterations to be executed
|
int |
getNumClusters()
Gets the number of clusters to generate
|
String[] |
getOptions()
Gets the current settings of SimpleKMeans
|
boolean |
getPreserveInstancesOrder()
Gets whether order of instances must be preserved
|
String |
getRevision()
Returns the revision string.
|
double |
getSquaredError()
Gets the squared error for all clusters
|
String |
globalInfo()
Returns a string describing this clusterer
|
Enumeration |
listOptions()
Returns an enumeration describing the available options.
|
static void |
main(String[] argv)
Main method for testing this class.
|
String |
maxIterationsTipText()
Returns the tip text for this property
|
protected double[] |
moveCentroid(int centroidIndex,
Instances members,
boolean updateClusterInfo)
Moves the centroid to its new coordinates.
|
int |
numberOfClusters()
Returns the number of clusters.
|
String |
numClustersTipText()
Returns the tip text for this property
|
String |
preserveInstancesOrderTipText()
Returns the tip text for this property
|
void |
setDisplayStdDevs(boolean stdD)
Sets whether standard deviations and nominal counts
should be displayed in the clustering output
|
void |
setDistanceFunction(DistanceFunction df)
Sets the distance function to use for instance comparison.
|
void |
setDontReplaceMissingValues(boolean r)
Sets whether replacement of missing values is disabled
|
void |
setMaxIterations(int n)
Sets the maximum number of iterations to be executed
|
void |
setNumClusters(int n)
Sets the number of clusters to generate
|
void |
setOptions(String[] options)
Parses a given list of options.
|
void |
setPreserveInstancesOrder(boolean r)
Sets whether order of instances must be preserved
|
String |
toString()
Returns a string describing this clusterer
|
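Methods inherited from class RandomizableClusterer: getSeed, seedTipText, setSeed
Methods inherited from class AbstractClusterer: distributionForInstance, forName, makeCopies, makeCopy, runClusterer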
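The method summary lists buildClusterer, clusterInstance, and moveCentroid without describing how they interact. As a rough illustration, the textbook k-means loop alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its members. The sketch below is plain Java on `double[]` points with unnormalized Euclidean distance and fixed starting centroids; all names in it are hypothetical, and it makes no claim to match Weka's implementation (which also handles nominal attributes, missing values, instance weights, and seeded initialization).

```java
import java.util.Arrays;

public class KMeansStepSketch {

    // Squared Euclidean distance between two points of equal length.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Index of the centroid closest to the point (the role clusterInstance plays).
    static int nearestCentroid(double[] point, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(point, centroids[c]) < squaredDistance(point, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    // Mean of the points assigned to one cluster (the role moveCentroid plays).
    // Returns null when the cluster has no members.
    static double[] mean(double[][] points, int[] assignments, int cluster) {
        double[] sum = new double[points[0].length];
        int count = 0;
        for (int i = 0; i < points.length; i++) {
            if (assignments[i] == cluster) {
                for (int j = 0; j < sum.length; j++) sum[j] += points[i][j];
                count++;
            }
        }
        if (count == 0) return null;
        for (int j = 0; j < sum.length; j++) sum[j] /= count;
        return sum;
    }

    // Repeats assignment and centroid update until no assignment changes or
    // maxIterations is reached. Updates centroids in place, returns assignments.
    static int[] cluster(double[][] points, double[][] centroids, int maxIterations) {
        int[] assignments = new int[points.length];
        Arrays.fill(assignments, -1);
        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int c = nearestCentroid(points[i], centroids);
                if (c != assignments[i]) {
                    assignments[i] = c;
                    changed = true;
                }
            }
            if (!changed) break;
            for (int c = 0; c < centroids.length; c++) {
                double[] m = mean(points, assignments, c);
                if (m != null) centroids[c] = m; // keep the old centroid if the cluster is empty
            }
        }
        return assignments;
    }

    public static void main(String[] args) {
        double[][] points = {{1, 1}, {1, 2}, {9, 9}, {10, 9}};
        double[][] centroids = {{0, 0}, {10, 10}}; // fixed starting centroids for the demo
        int[] assignments = cluster(points, centroids, 10);
        System.out.println(Arrays.toString(assignments));   // prints [0, 0, 1, 1]
        System.out.println(Arrays.deepToString(centroids)); // prints [[1.0, 1.5], [9.5, 9.0]]
    }
}
```

The iteration cap in the sketch corresponds conceptually to the -I option, and the returned array to what getAssignments exposes when instance order is preserved.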
protected DistanceFunction m_DistanceFunction
protected int[] m_Assignments
public String globalInfo()
public Capabilities getCapabilities()
Specified by: getCapabilities in interface Clusterer
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class AbstractClusterer
See Also: Capabilities
public void buildClusterer(Instances data) throws Exception
Specified by: buildClusterer in interface Clusterer
Specified by: buildClusterer in class AbstractClusterer
Parameters:
data - set of instances serving as training data
Throws:
Exception - if the clusterer has not been generated successfully
protected double[] moveCentroid(int centroidIndex, Instances members, boolean updateClusterInfo)
Parameters:
centroidIndex - index of the centroid whose coordinates will be computed
members - the objects that are assigned to the cluster of this centroid
updateClusterInfo - if the method is supposed to update the m_Cluster arrays
public int clusterInstance(Instance instance) throws Exception
Specified by: clusterInstance in interface Clusterer
Overrides: clusterInstance in class AbstractClusterer
Parameters:
instance - the instance to be assigned to a cluster
Throws:
Exception - if instance could not be classified successfully
public int numberOfClusters() throws Exception
Specified by: numberOfClusters in interface Clusterer
Specified by: numberOfClusters in class AbstractClusterer
Throws:
Exception - if number of clusters could not be returned successfully
public Enumeration listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class RandomizableClusterer
public String numClustersTipText()
public void setNumClusters(int n) throws Exception
Specified by: setNumClusters in interface NumberOfClustersRequestable
Parameters:
n - the number of clusters to generate
Throws:
Exception - if number of clusters is negative
public int getNumClusters()
public String maxIterationsTipText()
public void setMaxIterations(int n) throws Exception
Parameters:
n - the maximum number of iterations
Throws:
Exception - if maximum number of iterations is smaller than 1
public int getMaxIterations()
public String displayStdDevsTipText()
public void setDisplayStdDevs(boolean stdD)
Parameters:
stdD - true if std. devs and counts should be displayed
public boolean getDisplayStdDevs()
public String dontReplaceMissingValuesTipText()
public void setDontReplaceMissingValues(boolean r)
Parameters:
r - true if missing values are not to be replaced
public boolean getDontReplaceMissingValues()
public String distanceFunctionTipText()
public DistanceFunction getDistanceFunction()
public void setDistanceFunction(DistanceFunction df) throws Exception
Parameters:
df - the new distance function to use
Throws:
Exception - if instances cannot be processed
public String preserveInstancesOrderTipText()
public void setPreserveInstancesOrder(boolean r)
Parameters:
r - true if the order of instances is to be preserved
public boolean getPreserveInstancesOrder()
public void setOptions(String[] options) throws Exception
-N <num> Number of clusters (default 2).
-V Display std. deviations for centroids.
-M Don't replace missing values with mean/mode.
-S <num> Random number seed. (default 10)
-A <classname and options> Distance function to be used for instance comparison (default weka.core.EuclideanDistance)
-I <num> Maximum number of iterations.
-O Preserve order of instances.
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class RandomizableClusterer
Parameters:
options - the list of options as an array of strings
Throws:
Exception - if an option is not supported
public String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class RandomizableClusterer
public String toString()
public Instances getClusterCentroids()
public Instances getClusterStandardDevs()
public int[][][] getClusterNominalCounts()
public double getSquaredError()
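getSquaredError is documented only as "the squared error for all clusters". One common reading, assumed here, is the within-cluster sum of squared distances: for every point, the squared distance to the centroid of its assigned cluster, summed over all points. The sketch below computes that quantity with plain, unnormalized Euclidean distance. `SquaredErrorSketch` is a hypothetical name, and the value Weka reports may differ because it depends on the configured distance function.

```java
public class SquaredErrorSketch {

    // Sum over all points of the squared Euclidean distance to the centroid
    // of the cluster the point is assigned to.
    static double squaredError(double[][] points, double[][] centroids, int[] assignments) {
        double total = 0.0;
        for (int i = 0; i < points.length; i++) {
            double[] c = centroids[assignments[i]];
            for (int j = 0; j < c.length; j++) {
                double d = points[i][j] - c[j];
                total += d * d;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] points = {{1, 1}, {1, 2}, {9, 9}, {10, 9}};
        double[][] centroids = {{1, 1.5}, {9.5, 9}};
        int[] assignments = {0, 0, 1, 1};
        // Each point lies 0.5 from its centroid along one axis: 4 * 0.25.
        System.out.println(squaredError(points, centroids, assignments)); // prints 1.0
    }
}
```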
public int[] getClusterSizes()
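The per-cluster sizes and the per-instance assignments describe the same partition from two angles: counting how often each cluster index occurs in an assignment array yields the cluster sizes. The sketch below shows that relationship on plain arrays. `ClusterSizesSketch` and `sizes` are hypothetical names, and the sketch ignores instance weights.

```java
import java.util.Arrays;

public class ClusterSizesSketch {

    // Counts how many instances were assigned to each cluster index.
    static int[] sizes(int[] assignments, int numClusters) {
        int[] counts = new int[numClusters];
        for (int cluster : assignments) {
            counts[cluster]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] assignments = {0, 0, 1, 1, 1};
        System.out.println(Arrays.toString(sizes(assignments, 2))); // prints [2, 3]
    }
}
```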
public int[] getAssignments() throws Exception
Throws:
Exception - if order of instances wasn't preserved or no assignments were made
public String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class AbstractClusterer
public static void main(String[] argv)
Parameters:
argv - should contain the following arguments: -t training file [-N number of clusters]
Copyright © 2015 University of Waikato, Hamilton, NZ. All rights reserved.