By default, kmeans begins the clustering process using a randomly selected set of initial centroid locations. The kmeans algorithm can converge to a solution that is a local minimum; that is, kmeans can partition the data such that moving any single point to a different cluster increases the total sum of distances. However, as with many other types of numerical minimizations, the solution that kmeans reaches sometimes depends on the starting points. Therefore, other solutions (local minima) that have a lower total sum of distances can exist for the data. You can use the optional 'Replicates' name-value pair argument to test different solutions. When you specify more than one replicate, kmeans repeats the clustering process starting from different randomly selected centroids for each replicate. kmeans then returns the solution with the lowest total sum of distances among all the replicates.

[cidx3,cmeans3,sumd3] = kmeans(meas,3,'replicates',5,'display','final');
The output shows that, even for this relatively simple problem, non-global minima do exist. Each of these five replicates began from a different set of initial centroids. Depending on where it started from, kmeans reached one of two different solutions. However, the final solution that kmeans returns is the one with the lowest total sum of distances, over all replicates. The third output argument contains the sum of distances within each cluster for that best solution.
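Because the replicate starts are chosen at random, repeated runs of the command above can visit the local minima in a different order. A minimal sketch (not part of the original example) of how to make the replicate results reproducible is to fix the random number seed before calling kmeans:

```matlab
% Sketch: fixing the seed makes the randomly selected starting centroids,
% and therefore the replicate-by-replicate results, reproducible.
rng(1)  % any fixed seed works; 1 is an arbitrary choice
[cidx3,cmeans3,sumd3] = kmeans(meas,3,'replicates',5,'display','final');
```

With the seed fixed, every run of this snippet reports the same sequence of per-replicate sums and returns the same best solution.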
Clustering Fisher's Iris Data Using K-Means Clustering

sum(sumd3)

ans =

   78.8514

A silhouette plot for this three-cluster solution indicates that there is one cluster that is well separated, but that the other two clusters are not very distinct.

[silh3,h] = silhouette(meas,cidx3,'sqeuclidean');

Again, you can plot the raw data to see how kmeans has assigned the points to clusters.
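One way to do this is sketched below; the original tutorial's plotting code may differ. The sketch plots the first two measurement columns of meas (sepal length and sepal width), colors each point by its cluster index, and marks the returned centroids:

```matlab
% Sketch: visualize cluster assignments in two of the four dimensions.
% Assumes cidx3 and cmeans3 come from the kmeans call above.
gscatter(meas(:,1),meas(:,2),cidx3)          % points colored by cluster
hold on
plot(cmeans3(:,1),cmeans3(:,2),'kx', ...     % centroids as black crosses
    'MarkerSize',12,'LineWidth',2)
xlabel('Sepal length')
ylabel('Sepal width')
hold off
```

Note that this view shows only two of the four measurement dimensions, so clusters that kmeans separates using petal measurements can still appear to overlap in this projection.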