The K-means algorithm:

requires the dimension of the feature space to be no bigger than the number of samples
has the smallest value of the objective function when k = 1
minimizes the within class variance for a given number of clusters
converges to the global optimum if and only if the initial means are chosen as some of the samples themselves

The correct answer is: C. minimizes the within class variance for a given number of clusters.

The K-means algorithm is a clustering algorithm that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.

The within-cluster sum of squares (WCSS) is a measure of the compactness of each cluster. It is defined as the sum of the squared distances between each data point and the cluster center of its assigned cluster. The K-means algorithm minimizes the WCSS, which is equivalent to minimizing the within-class variance.

Option A is incorrect because the dimension of the feature space does not need to be no bigger than the number of samples. In fact, the dimension of the feature space can be much larger than the number of samples.

Option B is incorrect because the smallest value of the objective function is not necessarily when k = 1. The smallest value of the objective function depends on the data set and the number of clusters.

Option D is incorrect because the K-means algorithm does not always converge to the global optimum. The K-means algorithm is an iterative algorithm, and it can converge to a local optimum instead of the global optimum.

Exit mobile version