Hierarchical clustering

Hierarchical clustering is a branch of cluster analysis which treats clusters hierarchically, i.e. as a set of levels. The construction of the hierarchy can be performed using two major approaches, or combinations thereof: In agglomerative hierarchical clustering, existing clusters are merged iteratively, while divisive hierarchical clustering starts out with all data in one cluster that is then split iteratively. At each step of the process, a mathematical measure of distance or similarity between (agglomerative) or within clusters (divisive) is being computed to determine how to split or merge. Several different distance and similarity measures can be used, which generally result in different hierarchies, thus complicating their interpretation. Nonetheless, hierarchical clustering is more intuitively understandable than flat clustering, and so it enjoys considerable popularity for multivariate analysis of data, e.g. of gene or protein sequences.