Learn Before
Concept
Multicollinearity
Multicollinearity = Very strong correlation between two or more predictor variables.
-
As a result the posterior distribution will appear to suggest that none of the variables are reliably associated with the outcome - when in fact they might be strongly associated with the outcome.
-
The model will still have accurate predictions, but interpreting it will lead to confusion. You wont be able to determine the actual effect of the variables on the outcome.
-
See next node for an example
Solutions:
- Identifying highly correlated variables, eliminating all but one
- Combining them into one variable
- However, it is best to take a causal approach - to determine conditional associations, and not just correlations.
0
1
Updated 2021-07-20
Tags
Bayesian Statistics
Statistics
Data Science