In a linear regression problem, we use R-squared to measure goodness of fit. We add a feature to the linear regression model and retrain it. Which of the following options is true?

A. If R-squared increases, this variable is significant.
B. If R-squared decreases, this variable is not significant.
C. Individually, R-squared cannot tell about variable importance. We can't say anything about it right now.
D. None of these.

The correct answer is: C. Individually R squared cannot tell about variable importance. We can’t say anything about it right now.

R-squared is a measure of the goodness of fit of a linear regression model. It is calculated as one minus the ratio of the sum of squares of the residuals (SSR) to the total sum of squares (SST): R-squared = 1 - SSR/SST. The closer R-squared is to 1, the better the model fits the data it was trained on.
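As a small worked example of the definition, the sketch below computes R-squared as 1 minus the residual sum of squares over the total sum of squares. The data values and the function name `r_squared` are made up for illustration and do not come from the question.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SS_res / SS_tot for two equal-length sequences."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: squared gaps between observations and predictions.
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Total sum of squares: squared gaps between observations and their mean.
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observations and fitted values.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

# SS_res = 0.10 and SS_tot = 5.0, so R-squared is about 0.98.
r2 = r_squared(y_true, y_pred)
print(round(r2, 4))
```

Note that the value depends only on the overall residuals, not on which variable produced the improvement.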

However, R-squared is not a good measure of variable importance. It reflects the fit of the whole model, not the contribution of the one variable you are interested in. In ordinary least squares, R-squared on the training data never decreases when a variable is added, so it can increase even when the new variable is pure noise. An increase alone therefore says nothing about whether the new variable is significant.
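A minimal sketch of this behavior, assuming ordinary least squares with an intercept fitted through NumPy's `np.linalg.lstsq`. The synthetic data, the seed, and the helper name `ols_r_squared` are illustrative assumptions. The second feature is random noise unrelated to the response, yet training R-squared does not go down when it is added.

```python
import numpy as np

def ols_r_squared(X, y):
    """Fit OLS with an intercept and return R-squared on the training data."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
n = 200
x1 = rng.normal(size=n)
y = 3.0 * x1 + rng.normal(size=n)  # y depends on x1 only
noise = rng.normal(size=n)         # unrelated to y by construction

r2_base = ols_r_squared(x1.reshape(-1, 1), y)
r2_more = ols_r_squared(np.column_stack([x1, noise]), y)

# The second value is at least as large as the first, up to rounding error.
print(r2_base, r2_more)
```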

To determine the importance of a variable, you need a tool built for that question, such as the t-test on its coefficient, an F-test comparing the models with and without the variable, or adjusted R-squared, which penalizes extra variables. Selection procedures such as stepwise regression add and remove variables and compare the resulting fits using criteria like these. Ridge regression shrinks coefficients to control overfitting, but it does not by itself test whether a variable is significant.
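One way to look at a single variable, rather than the whole model, is sketched below. It uses adjusted R-squared and coefficient t-statistics, which are common choices but are assumptions here and not something the question prescribes. The helper name `ols_summary` and the synthetic data are hypothetical.

```python
import numpy as np

def ols_summary(X, y):
    """Return (r2, adjusted_r2, t_stats) for OLS with an intercept.

    t_stats[0] belongs to the intercept, t_stats[1:] to the columns of X.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    p = design.shape[1]  # number of coefficients, intercept included
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R-squared penalizes each extra coefficient.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    # Standard errors from the usual estimate sigma^2 * (X'X)^-1.
    sigma2 = ss_res / (n - p)
    cov = sigma2 * np.linalg.inv(design.T @ design)
    t_stats = coef / np.sqrt(np.diag(cov))
    return r2, adj_r2, t_stats

rng = np.random.default_rng(0)  # arbitrary seed
n = 200
x1 = rng.normal(size=n)
noise = rng.normal(size=n)         # candidate feature with no real effect
y = 3.0 * x1 + rng.normal(size=n)

r2, adj_r2, t_stats = ols_summary(np.column_stack([x1, noise]), y)
print(r2, adj_r2)
print(t_stats)  # compare |t| for x1 with |t| for the noise column
```

A large absolute t-statistic (roughly above 2 for samples of this size) is the usual sign that a coefficient differs from zero. The t-statistic for `x1` is far beyond that threshold, and the noise column can be judged on its own value, not on the change in R-squared.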

Here is a brief explanation of each option:

  • Option A: If R-squared increases, this variable is significant. This is not necessarily true. R-squared reflects all of the variables in the model, and in ordinary least squares it never decreases on the training data when a variable is added. It may therefore increase even though the new variable is not significant.
  • Option B: If R-squared decreases, this variable is not significant. This is also not a reliable rule. In ordinary least squares with an intercept, R-squared on the training data cannot decrease when a variable is added, so this case does not arise there. Where a decrease is observed (for example, when R-squared is measured on held-out data), the change still reflects the whole model and is not a significance test for the new variable.
  • Option C: Individually, R-squared cannot tell about variable importance. This is the correct answer. A change in R-squared after adding a feature does not, on its own, show whether that feature is significant.
  • Option D: None of these. This is incorrect, because Option C holds.