Tag Archives: clustering

Clustering Analysis in R – Part 1

The last article provided a brief introduction to clustering. This one demonstrates how to conduct a basic clustering analysis in the statistical computing environment R (I have actually split it into 2 parts as it got rather long!). For demos like this it is easiest to use a small data set, ideally with few features relative to instances. The one used in this example is the Acidosis Patients data set, available from this collection of clustering data sets. It has 40 instances, each corresponding to a patient, and 6 features, each corresponding to a measurement of blood or cerebrospinal fluid.

A Quick Introduction to Clustering

Cluster analysis, more usually referred to as clustering, is a common data mining task. In clustering, the goal is to divide the data set into groups so that objects in the same group are similar to one another while objects in different groups are dissimilar. In other words, the goal is to minimize the intra-cluster distance while maximizing the inter-cluster distance.
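To make the intra-cluster versus inter-cluster idea concrete, here is a minimal sketch in Python using only the standard library. The helper names, the toy points, and the choice of mean pairwise Euclidean distance are all assumptions made for illustration; many other distance measures and cluster-quality criteria exist.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def mean_intra_cluster_distance(clusters):
    """Mean distance over all pairs of points that share a cluster."""
    pairs = [dist(a, b) for pts in clusters for a, b in combinations(pts, 2)]
    return sum(pairs) / len(pairs)


def mean_inter_cluster_distance(clusters):
    """Mean distance over all pairs of points from different clusters."""
    pairs = [
        dist(a, b)
        for c1, c2 in combinations(clusters, 2)
        for a in c1
        for b in c2
    ]
    return sum(pairs) / len(pairs)


# Made-up toy data: two visually obvious groups of 2-D points.
good_grouping = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)],
]
# The same six points, deliberately mixed across the two groups.
bad_grouping = [
    [(0.0, 0.0), (10.0, 10.0), (1.0, 0.0)],
    [(0.0, 1.0), (10.0, 11.0), (11.0, 10.0)],
]

for name, grouping in [("good", good_grouping), ("bad", bad_grouping)]:
    intra = mean_intra_cluster_distance(grouping)
    inter = mean_inter_cluster_distance(grouping)
    print(f"{name}: intra={intra:.2f} inter={inter:.2f}")
```

On this toy data the sensible grouping has a smaller intra-cluster distance and a larger inter-cluster distance than the mixed-up one, which is exactly what a clustering aims for.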
