Hierarchical clustering
Hierarchical clustering, also known as hierarchical cluster analysis, is a clustering method that groups similar objects into nested clusters, forming a hierarchy.[1] Hierarchical clustering is often associated with heatmaps.[2]
Strategies
Strategies for hierarchical clustering generally fall into two types[3]:
- Divisive (top-down): This type starts with all the feature vectors in a single cluster and then recursively splits it into smaller clusters.[4]
- Agglomerative (bottom-up): This type starts with each point as its own cluster and repeatedly merges the closest pair, so that nearby points end up in the same cluster.[5] The resulting sequence of partitions is visualized using a tree structure called a dendrogram.[4]
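The agglomerative strategy above can be sketched in plain Python. This is a minimal illustration using single linkage (cluster distance = closest pair of points); the `agglomerative` function and the toy data are assumptions for illustration, not taken from the references:

```python
def agglomerative(points, n_clusters):
    """Minimal single-linkage agglomerative clustering (illustrative sketch)."""
    # Every point starts in its own cluster.
    clusters = [[i] for i in range(len(points))]

    def dist(p, q):
        # Euclidean distance between two points.
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    def cluster_dist(c1, c2):
        # Single linkage: distance between the closest pair of points.
        return min(dist(points[i], points[j]) for i in c1 for j in c2)

    # Repeatedly merge the two closest clusters until n_clusters remain.
    while len(clusters) > n_clusters:
        a, b = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: cluster_dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters


# Two tight groups far apart: the first two points merge, then the last two.
data = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 5.2)]
result = agglomerative(data, n_clusters=2)
print(result)  # [[0, 1], [2, 3]]
```

Recording the order and distance of each merge, rather than only the final partition, is what produces the dendrogram.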
Advantages and disadvantages[6]
Advantages
- Hierarchical clustering does not require the number of clusters to be specified.
- It is easy to implement.
- Hierarchical clustering produces a dendrogram, which helps with understanding the data.
Disadvantages
- The hierarchical algorithm can never undo a previous step: once clusters are merged or split, the decision is final.
- Its time complexity (typically O(n³) for a naive agglomerative implementation) can result in very long computation times in comparison with efficient algorithms such as k-means.
- With a large data set, it can be difficult to determine the correct number of clusters from the dendrogram.
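One common way to extract flat clusters is to cut the dendrogram at a chosen height, so the number of clusters follows from the threshold rather than being fixed in advance. A minimal sketch using SciPy's `scipy.cluster.hierarchy` (the toy data and the threshold `t=1.0` are assumptions for illustration):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data: two tight groups far apart.
points = np.array([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0], [5.1, 5.1]])

# Build the full merge tree (dendrogram) with complete linkage.
Z = linkage(points, method="complete")

# Cutting the tree at height t=1.0 keeps only merges cheaper than 1.0,
# so the two distant groups remain separate flat clusters.
labels = fcluster(Z, t=1.0, criterion="distance")
print(labels)
```

On real data, picking the cut height is exactly the judgment call the disadvantage above describes: small changes in the threshold can change the number of clusters.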
References
- [1] What is Hierarchical Clustering? (displayr.com)
- [2] StatQuest: Hierarchical Clustering (youtube.com)
- [3] Intro to Hierarchical Clustering (Coursera)
- [4] Divisive Clustering (sciencedirect.com)
- [5] Agglomerative Clustering: how it works (youtube.com)
- [6] Hierarchical Clustering (Coursera)