Suppose your model is demonstrating high variance across the different training sets. Which of the following is NOT valid way to try and reduce the variance?

increase the amount of traning data in each traning set
improve the optimization algorithm being used for error minimization.
decrease the model complexity
reduce the noise in the training data

The correct answer is D.

High variance is a problem in machine learning when a model’s predictions are very sensitive to small changes in the training data. This can happen when the model is too complex or when the training data is noisy.

There are a few things that can be done to reduce variance:

  • Increase the amount of training data. This will help to reduce the noise in the training data and make the model more robust to small changes.
  • Improve the optimization algorithm. This will help to find a better minimum for the loss function, which will reduce the variance of the model’s predictions.
  • Decrease the model complexity. This will help to prevent the model from overfitting the training data and making predictions that are not generalizable to new data.
  • Reduce the noise in the training data. This can be done by using techniques such as data cleaning and data augmentation.

Option D, reduce the noise in the training data, is not a valid way to reduce variance because it does not address the root cause of the problem. The noise in the training data is a symptom of the model being too complex or the training data being too small. Reducing the noise in the training data will not make the model more robust to small changes in the training data.

The other options, increase the amount of training data, improve the optimization algorithm, and decrease the model complexity, are all valid ways to reduce variance.

Exit mobile version