Clustering Fisher's Iris Data Using K-Means Clustering

for i = 1:3
    clust = find(cidx3==i);
    plot3(meas(clust,1),meas(clust,2),meas(clust,3),ptsymb{i});
    hold on
end
plot3(cmeans3(:,1),cmeans3(:,2),cmeans3(:,3),'ko');
plot3(cmeans3(:,1),cmeans3(:,2),cmeans3(:,3),'kx');
hold off
xlabel('Sepal Length');
ylabel('Sepal Width');
zlabel('Petal Length');
view(-137,10);
grid on

You can see that kmeans has split the upper cluster from the two-cluster solution, and that those two clusters are very close to each other. Depending on what you intend to do with these data after clustering them, this three-cluster solution may be more or less useful than the previous two-cluster solution. The first output argument from silhouette contains the silhouette value for each point, which you can use to compare the two solutions quantitatively. The average silhouette value is larger for the two-cluster solution, indicating that it is the better answer purely from the point of view of creating distinct clusters.

[mean(silh2) mean(silh3)]

You can also cluster these data using a different distance measure. The cosine distance might make sense for these data because it ignores the absolute sizes of the measurements and considers only their relative sizes. Thus, two flowers of different sizes, but with similarly shaped petals and sepals, might not be close with respect to squared Euclidean distance, but would be close with respect to cosine distance.

[cidxCos,cmeansCos] = kmeans(meas,3,'dist','cos');

[silhCos,h] = silhouette(meas,cidxCos,'cos');
[mean(silh2) mean(silh3) mean(silhCos)]

From the silhouette plot, these clusters appear to be only slightly better separated than those found using squared Euclidean distance. Notice that the order of the clusters differs from the previous silhouette plot; this is because kmeans chooses the initial cluster assignments at random.
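To make the contrast between the two distance measures concrete, here is a small sketch (in Python with NumPy, outside the MATLAB workflow above) using hypothetical measurements for two flowers where one is simply a scaled-up copy of the other: the squared Euclidean distance between them is large, while the cosine distance is essentially zero.

```python
import numpy as np

# Hypothetical measurements [sepal length, sepal width,
# petal length, petal width] for two flowers with the same
# shape (proportions) but different overall size: b = 2*a.
a = np.array([5.0, 3.5, 1.4, 0.2])
b = 2.0 * a

# Squared Euclidean distance grows with the size difference.
sq_euclidean = np.sum((a - b) ** 2)

# Cosine distance, 1 minus the cosine of the angle between the
# vectors, depends only on relative proportions, so it is ~0 here.
cosine = 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(sq_euclidean)  # 39.25 -- the flowers look far apart
print(cosine)        # ~0    -- the flowers look identical in shape
```

This is why, for data where overall size is a nuisance variable, clustering with the cosine distance can group observations that Euclidean-based clustering would separate.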