Partitioning methods:
Use the squared Euclidean distance from the (dis)similarity menu.
|
Deciding on the number of clusters
|
Hierarchical clustering:
|
Examine the dendrogram:
|
Statistics ► Multivariate analysis ► Cluster analysis ►
Postclustering ► Dendrogram
|
cluster dendrogram wards_linkage, cutnumber(10) showcount
|
Examine the VRC and Duda-Hart indices:
|
Statistics ► Multivariate analysis ► Cluster analysis ►
Postclustering ► Cluster analysis stopping rules.
|
For VRC: cluster stop wards_linkage, rule(calinski) groups(2/11)
|
For Duda-Hart: cluster stop wards_linkage, rule(duda) groups(1/10)
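For orientation, the VRC (the Calinski-Harabasz pseudo-F that cluster stop reports under rule(calinski)) is commonly written as below. The symbols are notation introduced here and are not taken from the table: n is the number of objects, k the number of clusters, and SS_B and SS_W are the between-cluster and within-cluster sums of squares.

```latex
\mathrm{VRC}_k = \frac{SS_B / (k - 1)}{SS_W / (n - k)}
```

The ratio is undefined for k = 1, which is why the VRC command starts at two groups. Larger VRC values suggest more distinct clusters. For the Duda-Hart rule, a large Je(2)/Je(1) value together with a small pseudo-T-squared value points in the same direction.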
|
Include practical considerations in your decision.
|
Partitioning methods:
|
Run a hierarchical cluster analysis and decide on the number of segments based on the dendrogram, the VRC, and the Duda-Hart indices; use the resulting partition as the starting partition.
|
Statistics ► Multivariate analysis ► Cluster analysis ►
Postclustering ► Cluster analysis stopping rules.
|
cluster kmeans e1 e5 e9 e21 e22, k(3) measure(L2squared) name(kmeans) start(group(cluster_wl))
|
Include practical considerations in your decision.
|
Stability
|
Re-run the analysis using different clustering procedures, linkage algorithms, or distance measures. For example, generate a cluster membership variable from the hierarchical solution and use this grouping as the starting partition for k-means clustering.
|
cluster generate cluster_wl = groups(3), name(wards_linkage) ties(error)
|
cluster kmeans e1 e5 e9 e21 e22, k(3) measure(L2squared) name(kmeans) start(group(cluster_wl))
|
Examine the overlap between the clustering solutions. If more than 20% of the cluster affiliations change from one technique to the other, you should reconsider the setup.
|
tabulate cluster_wl kmeans
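The 20% rule can also be checked numerically. The following is a minimal sketch that assumes the k-means group numbers line up with the Ward's groups used as the starting partition; confirm this in the cross-tabulation first, because a relabeled but otherwise identical solution would be counted as changed.

```stata
* Count observations whose membership differs between the two solutions
count if cluster_wl != kmeans & !missing(cluster_wl, kmeans)
local changed = r(N)

* Count observations that have a membership in both solutions
count if !missing(cluster_wl, kmeans)
local share = 100 * `changed' / r(N)

* Report the share of changed affiliations in percent
display "Share of changed affiliations: " %5.1f `share' "%"
```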
|
Change the order of objects in the dataset (hierarchical clustering only).
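One way to change the order of objects is to sort on a random number and repeat the hierarchical analysis. The sketch below assumes Ward's linkage on the same clustering variables as above; the names u, wards_reordered, and cluster_wl2 as well as the seed value are hypothetical and chosen only for illustration.

```stata
* Shuffle the observations reproducibly
set seed 12345
generate double u = runiform()
sort u

* Repeat the hierarchical analysis on the reordered data
cluster wardslinkage e1 e5 e9 e21 e22, measure(L2squared) name(wards_reordered)
cluster generate cluster_wl2 = groups(3), name(wards_reordered) ties(error)

* Compare the new partition with the original one
tabulate cluster_wl cluster_wl2
```

If the solution is stable, each row of the cross-tabulation should have nearly all of its observations in a single column.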
|
Differentiation of the data
|
Compare the cluster centroids across the different clusters for significant differences.
|
mean e1 e5 e9 e21 e22, over(cluster_wl)
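The mean command reports the centroids with confidence intervals but no overall test. As one possible complement, and not a step prescribed by the table, a one-way ANOVA per clustering variable tests whether the cluster means differ:

```stata
* One-way ANOVA of each clustering variable across the three clusters
foreach v of varlist e1 e5 e9 e21 e22 {
    oneway `v' cluster_wl, tabulate
}
```

Because the clusters were formed to differ on exactly these variables, significant F-tests are to be expected and should be read as descriptive rather than confirmatory.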
|
If possible, assess the solution’s criterion validity.
|
Profiling
|
Identify observable variables (e.g., demographics) that best mirror the partition of the objects based on the clustering variables.
|
tabulate cluster_wl flight_purpose, chi2 V
|
Interpretation of the cluster solution
|
Identify names or labels for each cluster and characterize each cluster by means of observable variables.
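Once names have been chosen, they can be attached to the membership variable as value labels so that later tables are easier to read. The label texts and the label name segment_lbl below are placeholders, not results from the analysis.

```stata
* Attach descriptive names to the cluster membership variable
* (replace the placeholder texts with the labels you decide on)
label define segment_lbl 1 "Segment 1" 2 "Segment 2" 3 "Segment 3"
label values cluster_wl segment_lbl

* Profiling tables now show the labels instead of the group numbers
tabulate cluster_wl flight_purpose, chi2 V
```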
|