
K-Medoids Clustering

Watch the K-medoids algorithm iteratively assign data points to clusters and update medoid positions. Based on my SC2300 HW2 implementation with support for L1, L2, and Cosine distance metrics.

[Interactive scatter plot of the data on axes from -1 to 1, colored by Cluster 1, Cluster 2, and Cluster 3, stepped through iterations 0 to 3]

Ringed points are medoids (cluster representatives)

Clustering Stats

Iteration: 0
Objective: 6.50
Converged: No
Points: 45

[Chart: objective value by iteration]

Cluster Sizes

Cluster 1: 19
Cluster 2: 12
Cluster 3: 14


The Algorithm

K-medoids is a variant of K-means that uses actual data points (medoids) as cluster centers instead of computed centroids. Because a medoid is always a real point and is chosen by minimizing summed distances rather than averaging, the algorithm is more robust to outliers and works with any distance metric.

# K-Medoids Algorithm

repeat until convergence:

1. Assign each point to its nearest medoid

2. Update each medoid to the cluster member minimizing total within-cluster distance
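The two alternating steps above can be sketched in plain Python. This is a minimal illustration with hypothetical names, not the HW2 code; it assumes initial medoids are given as indices and that no cluster ever goes empty.

```python
import math

def kmedoids(points, medoid_idx, dist=math.dist, max_iter=100):
    """Alternate assignment and medoid-update until medoids stop changing."""
    labels = []
    for _ in range(max_iter):
        # 1. Assign each point to its nearest medoid
        labels = [min(range(len(medoid_idx)),
                      key=lambda m: dist(p, points[medoid_idx[m]]))
                  for p in points]
        # 2. For each cluster, pick the member minimizing total
        #    within-cluster distance as the new medoid
        new_idx = []
        for m in range(len(medoid_idx)):
            members = [i for i, lab in enumerate(labels) if lab == m]
            new_idx.append(min(members,
                               key=lambda i: sum(dist(points[i], points[j])
                                                 for j in members)))
        if new_idx == medoid_idx:  # converged: medoid set unchanged
            break
        medoid_idx = new_idx
    return labels, medoid_idx
```

Convergence is detected when the medoid set is unchanged, which also means the assignment step cannot change anything further.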

Key Concepts

Medoids vs Centroids

Unlike K-means which computes mean centroids, K-medoids selects actual data points as representatives. This makes the algorithm work with any distance metric, not just Euclidean.
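A tiny example (toy data, not from the demo) shows the difference: a single outlier drags the mean centroid far from the dense group, while the medoid remains one of the actual points.

```python
import math

cluster = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (100.0, 100.0)]  # last point is an outlier

# centroid: coordinate-wise mean, as K-means would compute it
centroid = tuple(sum(c) / len(cluster) for c in zip(*cluster))

# medoid: the member with the smallest total distance to all members
medoid = min(cluster, key=lambda p: sum(math.dist(p, q) for q in cluster))

print(centroid)  # pulled far toward the outlier
print(medoid)    # one of the three points near the origin
```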

K-Medoids++ Init

Similar to K-means++, initial medoids are selected with probability proportional to squared distance from existing medoids. This spreads initial centers and typically leads to faster convergence.
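That seeding rule can be sketched as follows (assumed details, not the HW2 code): the first medoid is chosen uniformly at random, and each subsequent medoid is drawn with probability proportional to the squared distance to the nearest already-chosen medoid.

```python
import math
import random

def plus_plus_init(points, k, dist=math.dist, rng=random.Random(0)):
    """K-medoids++-style seeding: spread initial medoids apart."""
    medoids = [rng.choice(points)]
    while len(medoids) < k:
        # squared distance from each point to its nearest chosen medoid;
        # already-chosen points get weight 0 and cannot be picked again
        d2 = [min(dist(p, m) for m in medoids) ** 2 for p in points]
        medoids.append(rng.choices(points, weights=d2, k=1)[0])
    return medoids
```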

Distance Metrics

L2 (Euclidean) measures straight-line distance. L1 (Manhattan) sums absolute differences. Cosine measures angular similarity, useful for text data where direction matters more than magnitude.
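The three metrics are easy to write down directly; these are illustrative helpers (not the HW2 code), with cosine expressed as a distance, 1 minus the cosine similarity:

```python
import math

def l2(a, b):
    """Euclidean (straight-line) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def l1(a, b):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine(a, b):
    """Cosine distance: 0 when vectors point the same way, up to 2."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)
```

Note that cosine ignores magnitude: `(1, 0)` and `(5, 0)` are at cosine distance 0, which is why it suits text vectors where only direction matters.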

Multi-Trial Selection

Since K-medoids can converge to local optima, the HW2 implementation runs multiple independent trials and selects the result with the lowest objective value (best-of-T strategy).
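The best-of-T strategy reduces to a simple loop; this sketch assumes a generic interface where each trial, given a seed, returns a `(labels, objective)` pair (the interface is illustrative, not the HW2 code):

```python
def best_of_t(run_trial, trials=10):
    """Run independent trials and keep the one with the lowest objective."""
    best = None
    for seed in range(trials):
        labels, obj = run_trial(seed)  # one full clustering run per seed
        if best is None or obj < best[1]:
            best = (labels, obj)
    return best
```

Because each trial only differs in its random seed, the trials are independent and could also be run in parallel.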

Implementation Details

Distance Metrics: L1, L2, Cosine
Initialization: Random, K-Medoids++
Datasets (HW2): Wine, Digits, 20 Newsgroups
Evaluation: NMI, Elbow Method
Course: SC2300 Introduction to ML