Regarding the discussion on the K-medoids algorithm: the standard idea is to use one of the data points in the cluster as the medoid, so that the cluster is represented by its most representative data point. The way this point is chosen may vary, but the most reasonable choice (and one that is resistant to outliers) is the point with the lowest cumulative distance to all other points in the cluster. Since this is a costly operation, it is sometimes done only on a sample of the points in the cluster.
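As a minimal sketch of this selection rule (the function name `medoid` and the use of Euclidean distance are my assumptions; any distance measure could be substituted):

```python
import numpy as np

def medoid(points):
    """Return the point with the lowest cumulative distance to all others.

    `points` is an (n, d) array; distance here is Euclidean (an assumption).
    """
    # Pairwise Euclidean distances, as an (n, n) matrix.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # The medoid minimizes the sum of distances to all other points.
    return points[dists.sum(axis=1).argmin()]
```

Restricting the rows of the distance matrix to a random sample of the cluster gives the sampled variant mentioned above.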
Here is the algorithm:
Basic K-medoid Algorithm:
1. Select K points as the initial medoids.
2. Assign all points to the closest medoid.
3. See if any other point is a "better" medoid (i.e., has a lower cumulative distance to all other points in its cluster).
Finding a better medoid involves comparing every pair of medoid and non-medoid points and is relatively inefficient, so sampling may be used.
4. Repeat steps 2 and 3 until the medoids don't change.
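The steps above can be sketched as follows (an illustrative implementation under my own assumptions: Euclidean distance, random initial medoids, and no handling of the rare empty-cluster case; this is the basic scheme described here, not the full PAM algorithm):

```python
import numpy as np

def k_medoids(points, k, max_iter=100, rng=None):
    """Basic K-medoids on an (n, d) array; returns (medoid indices, labels)."""
    rng = np.random.default_rng(rng)
    # Pairwise distance matrix, computed once up front.
    diffs = points[:, None, :] - points[None, :, :]
    d = np.sqrt((diffs ** 2).sum(axis=2))
    # Step 1: select K points as the initial medoids.
    medoids = rng.choice(len(points), size=k, replace=False)
    for _ in range(max_iter):
        # Step 2: assign every point to its closest medoid.
        labels = d[:, medoids].argmin(axis=1)
        # Step 3: within each cluster, pick the point with the lowest
        # cumulative distance to the rest of the cluster.
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            within = d[np.ix_(members, members)]
            new_medoids[j] = members[within.sum(axis=1).argmin()]
        # Step 4: stop when the medoids no longer change.
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids, labels
```

On well-separated data the loop typically converges in a handful of iterations; the inner medoid update is the costly part, which is where the sampling mentioned in step 3 would be applied.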
P.S.: Here is a paper that uses this type of clustering: