Selecting data so as to ensure that each class is properly represented in both the training and test sets.

A. cross validation
B. stratification
C. verification
D. bootstrapping

The correct answer is B. stratification.

Stratification is a sampling technique that keeps the training and test sets representative of the original data set. The data set is divided into groups (strata) based on the target variable, and samples are then drawn at random from each group in proportion to its size, so that every class appears in both sets at roughly the same rate as in the full data. This matters most for imbalanced data, where a purely random split can leave a rare class underrepresented or missing from one of the sets, making the performance estimate unreliable.
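As a rough illustration of the idea, the following Python sketch splits a list of labels so that each class contributes the same fraction of its samples to the test set. The function name `stratified_split`, the 25% test fraction, and the toy labels are assumptions made for this example, not part of the original question.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) with class proportions roughly preserved.

    Hypothetical helper for illustration. A class with a single sample
    ends up entirely in the test set.
    """
    rng = random.Random(seed)

    # Group sample indices by class label (the strata).
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Take the same fraction from every class, at least one sample.
        n_test = max(1, round(len(idxs) * test_fraction))
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy imbalanced data: 80 samples of class "a", 20 of class "b".
labels = ["a"] * 80 + ["b"] * 20
train_idx, test_idx = stratified_split(labels)
train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print(train_counts, test_counts)
```

Both sets keep the 4:1 class ratio of the full data. Libraries such as scikit-learn offer the same behavior through the `stratify` argument of `train_test_split`.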

Cross validation is a technique used to evaluate the performance of a machine learning model. The data set is divided into several subsets (folds); each fold in turn is held out for evaluation while the model is trained on the remaining folds, and the results are averaged. This gives a more stable performance estimate than a single train/test split, but by itself it does not guarantee that each class is properly represented in every fold.
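A minimal sketch of how the folds can be generated is shown below. The generator name `k_fold_indices` is hypothetical, and the sketch assumes the data has already been shuffled; it produces index lists only and leaves training and scoring to the caller.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each sample is held out exactly once.

    Hypothetical helper for illustration; assumes the data is already shuffled.
    """
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]

    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validate:", val_idx)
```

Note that this plain split ignores class labels. Combining it with per-class grouping gives stratified cross validation.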

Verification is a general term for checking that a model behaves as intended, for example by comparing its predictions with the actual values in the test set to identify errors. It describes a check on the model, not a way of selecting data.

Bootstrapping is a resampling technique used to estimate the uncertainty in a model’s performance. It repeatedly draws samples with replacement from the data set and trains or evaluates the model on each one. The resulting distribution of performance values can be used to estimate a confidence interval. Because the samples are drawn at random, class proportions can vary from one sample to the next.
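The sketch below shows a simplified variant of this idea: instead of retraining a model on every resample, it resamples per-example test outcomes (1 for a correct prediction, 0 for an incorrect one) and reads a percentile interval off the resulting accuracies. The function name `bootstrap_interval`, the 1000 resamples, and the toy outcomes are assumptions made for the example.

```python
import random
import statistics


def bootstrap_interval(scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `scores`.

    Hypothetical helper for illustration; resamples evaluation outcomes
    rather than retraining a model.
    """
    rng = random.Random(seed)
    n = len(scores)
    # Each resample draws n values with replacement and records their mean.
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Toy outcomes: 80 correct predictions and 20 incorrect ones (accuracy 0.8).
scores = [1] * 80 + [0] * 20
lo, hi = bootstrap_interval(scores)
print(f"approximate 95% interval for accuracy: [{lo:.2f}, {hi:.2f}]")
```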
