Hierarchical clustering

Latest revision as of 15:22, 14 April 2020

'''Hierarchical clustering''', also known as hierarchical cluster analysis, is a [[clustering]] algorithm that groups similar objects into groups called clusters.<ref>[https://www.displayr.com/what-is-hierarchical-clustering/ What is Hierarchical Clustering?] displayr.com</ref> Hierarchical clustering is often associated with heatmaps.<ref>[https://www.youtube.com/watch?v=7xHsRkOdVwo StatQuest: Hierarchical Clustering] youtube.com</ref>
   
   
== Strategies ==
Strategies for hierarchical clustering generally fall into two types<ref name="intro">[https://www.coursera.org/learn/machine-learning-with-python/lecture/cHku3/intro-to-hierarchical-clustering Intro to Hierarchical Clustering] Coursera</ref>:
* Divisive: This type starts from the assumption that all the feature vectors form a single set, then hierarchically divides this group into different sets.<ref name="ssa">[https://www.sciencedirect.com/topics/computer-science/divisive-clustering Divisive Clustering] sciencedirect.com</ref>
* Agglomerative: The idea is to ensure that nearby points end up in the same cluster.<ref>[https://www.youtube.com/watch?v=XJ3194AmH40 Agglomerative Clustering: how it works] youtube.com</ref> In this type, partitions are visualized using a tree structure called a [[dendrogram]].<ref name="ssa"/>
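As an illustration of the agglomerative strategy, here is a minimal sketch using SciPy (an assumption; the article does not prescribe any library): each point starts in its own cluster, the two closest clusters are merged repeatedly, and the resulting dendrogram is cut so that two clusters remain.

```python
# Minimal agglomerative-clustering sketch (assumes numpy and scipy).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data: two well-separated groups of 2-D points.
points = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],   # group near the origin
    [5.0, 5.0], [5.1, 5.2], [5.2, 5.1],   # group far away
])

# Agglomerative clustering: start with each point as its own cluster
# and repeatedly merge the two closest clusters ("single" linkage).
Z = linkage(points, method="single")

# Cut the resulting dendrogram so that two clusters remain.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

The linkage matrix `Z` encodes the full merge hierarchy, so the same tree can be re-cut at any level without re-running the clustering.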


== Advantages vs disadvantages<ref>[https://www.coursera.org/learn/machine-learning-with-python/lecture/h2iAD/more-on-hierarchical-clustering Hierarchical Clustering] Coursera</ref> ==


=== Advantages ===

* Hierarchical clustering does not require the number of clusters to be specified in advance.
* It is easy to implement.
* Hierarchical clustering produces a dendrogram, which helps with understanding the data.
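The dendrogram can also be inspected programmatically; a small sketch, assuming SciPy is available (`no_plot=True` returns the tree layout instead of drawing it):

```python
# Inspect a dendrogram's structure without plotting (assumes numpy/scipy).
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Four 1-D points forming two obvious pairs.
data = np.array([[0.0], [0.5], [10.0], [10.5]])

# Build the merge tree ("average" linkage is an arbitrary choice here).
Z = linkage(data, method="average")

# no_plot=True computes the tree layout (leaf order, merge heights)
# without requiring matplotlib, which is enough to inspect the structure.
tree = dendrogram(Z, no_plot=True)
print(tree["ivl"])  # leaf labels in left-to-right dendrogram order
```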

=== Disadvantages ===

* The algorithm can never undo a previous step: once clusters have been merged (or split), the decision is final.
* The time complexity (O(n³) for the naive agglomerative algorithm) can result in very long computation times in comparison with efficient algorithms such as K-means.
* For a large data set, it can become difficult to determine the correct number of clusters from the dendrogram.
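The first two drawbacks are visible in a from-scratch sketch of single-linkage agglomeration (illustrative Python, not an optimized implementation): every pass scans all cluster pairs, which is why the naive algorithm costs roughly O(n³) time, and a merge, once made, is never revisited.

```python
# Naive single-linkage agglomerative clustering (illustrative sketch).
import math

def single_linkage(points, k):
    """Merge clusters until only k remain; returns a list of clusters."""
    clusters = [[p] for p in points]          # start: one cluster per point
    while len(clusters) > k:
        best = (math.inf, 0, 1)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)        # the merge is permanent
    return clusters

pts = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 5.2)]
print(single_linkage(pts, 2))
```

Because the merge in each pass is permanent, a bad early merge propagates to the final clustering; this is the irreversibility noted above.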

== References ==
<references />