Introduction to Machine Learning. Nosirov Xabibullo Xikmatullo o‘g‘li, Doctor of Philosophy (PhD), Head of the TRET Department


Clustering Fisher's Iris Data Using Hierarchical Clustering



Based on the results from K-Means clustering, cosine might also be a good choice of distance measure. The resulting hierarchical tree is quite different, suggesting a very different way to look at group structure in the iris data.

cosD = pdist(meas,'cosine');             % pairwise cosine distances
clustTreeCos = linkage(cosD,'average');  % average-linkage hierarchical tree
cophenet(clustTreeCos,cosD)              % cophenetic correlation coefficient

[h,nodes] = dendrogram(clustTreeCos,0);  % 0 displays all 150 leaves
h_gca = gca;
h_gca.TickDir = 'out';
h_gca.TickLength = [.002 0];
h_gca.XTickLabel = [];                   % suppress the crowded leaf labels


The highest level of this tree separates iris specimens into two very distinct groups. The dendrogram shows that, with respect to cosine distance, the within-group differences are much smaller relative to the between-group differences than was the case for Euclidean distance. This is exactly what you would expect for these data, since the cosine distance computes a zero pairwise distance for objects that are in the same "direction" from the origin.
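
To make this concrete, here is a minimal check (x and y are hypothetical vectors, not iris measurements): two observations pointing in the same direction from the origin have zero cosine distance, whatever their magnitudes.

x = [1 2 3 4];
y = 2*x;                        % same direction, twice the magnitude
1 - (x*y')/(norm(x)*norm(y))    % cosine distance by its definition: 0
pdist([x; y],'cosine')          % pdist returns the same value (up to round-off)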

With 150 observations, the plot is cluttered, but you can make a simplified dendrogram that does not display the very lowest levels of the tree.

[h,nodes] = dendrogram(clustTreeCos,12); % collapse the tree to 12 leaf nodes


The three highest nodes in this tree separate out three equally sized groups, plus a single specimen (labeled as leaf node 5) that is not near any others.

% count the observations falling under each group of leaf nodes
[sum(ismember(nodes,[11 12 9 10])) sum(ismember(nodes,[6 7 8])) ...
 sum(ismember(nodes,[1 2 4 3])) sum(nodes==5)]


For many purposes, the dendrogram might be a sufficient result. However, you can go one step further and use the cluster function to cut the tree, explicitly partitioning observations into specific clusters, just as with K-Means. Using the hierarchy from the cosine distance, specify a linkage height that cuts the tree below the three highest nodes to create four clusters, and then plot the clustered raw data.
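
A minimal sketch of that step follows; the cutoff height of 0.006 and the marker styles are illustrative assumptions chosen so the cut falls below the three highest merges, not values given in the lecture.

% cut the cosine-distance tree at an assumed height, yielding four clusters
hidx = cluster(clustTreeCos,'criterion','distance','cutoff',0.006);

ptsymb = {'bs','r^','md','go'};   % hypothetical marker styles, one per cluster
for i = 1:4
    clust = find(hidx == i);
    plot3(meas(clust,1),meas(clust,2),meas(clust,3),ptsymb{i});
    hold on
end
hold off
xlabel('Sepal Length'); ylabel('Sepal Width'); zlabel('Petal Length');
grid on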

