What is the bias-variance tradeoff in machine learning
Category: AI & Machine Learning
Short answer
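Bias is error from a model that is too simple to capture the underlying pattern; variance is error from a model that is overly sensitive to the particular training set it saw. High bias causes underfitting, with poor performance even on the training data, while high variance causes overfitting, with good training performance but poor generalization. For squared-error loss, expected test error decomposes into bias squared plus variance plus irreducible noise, so the tradeoff is choosing a model complexity that minimizes their sum. For evaluating where your model sits, see how to evaluate machine learning model performance. For improving data quality, see what is feature engineering and why is it important.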
Steps
- Check training error: high error indicates high bias
- Check test error: a test error much higher than the training error indicates high variance
- Reduce bias by increasing model complexity or adding features
- Reduce variance with regularization, more data, or simpler models
- Use validation curves (training and validation error plotted against model complexity) to find the sweet spot where validation error is lowest
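The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a recommended pipeline: the synthetic target `sin(3x)`, the helper names (`make_data`, `complexity_sweep`, `diagnose`), and the thresholds `acceptable_err` and `max_gap` are all assumptions made for the example, and polynomial degree stands in for "model complexity".

```python
import numpy as np


def make_data(n, rng, noise=0.2):
    """Draw n noisy samples of a made-up target, sin(3x), on [-1, 1]."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(3.0 * x) + rng.normal(0.0, noise, size=n)
    return x, y


def mse(coeffs, x, y):
    """Mean squared error of a polynomial (np.polyfit coefficients) on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


def complexity_sweep(degrees, n_train=30, n_test=500, seed=0):
    """Fit one polynomial per degree; return (degree, train_mse, test_mse) rows."""
    rng = np.random.default_rng(seed)
    x_train, y_train = make_data(n_train, rng)
    x_test, y_test = make_data(n_test, rng)
    rows = []
    for degree in degrees:
        coeffs = np.polyfit(x_train, y_train, deg=degree)
        rows.append((degree, mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test)))
    return rows


def diagnose(train_err, test_err, acceptable_err, max_gap):
    """Rule of thumb from the steps above; both thresholds are problem-specific."""
    if train_err > acceptable_err:
        return "high bias"  # the model cannot even fit the training data
    if test_err - train_err > max_gap:
        return "high variance"  # fits training data but does not generalize
    return "balanced"


rows = complexity_sweep(degrees=range(1, 11))
best_degree = min(rows, key=lambda row: row[2])[0]

for degree, train_err, test_err in rows:
    label = diagnose(train_err, test_err, acceptable_err=0.1, max_gap=0.05)
    print(f"degree={degree:2d}  train={train_err:.3f}  test={test_err:.3f}  {label}")
print("lowest held-out error at degree", best_degree)
```

Training error can only fall as the degree rises, so the held-out column is what reveals the sweet spot. In a real project the sweep would normally be scored on a separate validation split, with the test set kept untouched until the final evaluation.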
Tips
- Ensemble methods that average many models, such as bagging and random forests, can reduce variance without adding much bias
- Cross-validation helps estimate true generalization error
- For data preparation strategies, see how to preprocess data for machine learning models
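The first two tips can be sketched with NumPy alone. The functions `kfold_mse` and `bagged_predict` are hypothetical helpers written for this example, the data is synthetic, and polynomial regression is used only as a convenient stand-in for whatever model you are tuning.

```python
import numpy as np


def kfold_mse(x, y, degree, k=5, seed=0):
    """Estimate generalization error as the mean held-out MSE over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)
    fold_errors = []
    for i, val_idx in enumerate(folds):
        # Train on every fold except the i-th, then score on the i-th.
        train_idx = np.concatenate([fold for j, fold in enumerate(folds) if j != i])
        coeffs = np.polyfit(x[train_idx], y[train_idx], deg=degree)
        residuals = np.polyval(coeffs, x[val_idx]) - y[val_idx]
        fold_errors.append(np.mean(residuals ** 2))
    return float(np.mean(fold_errors))


def bagged_predict(x_train, y_train, x_new, degree, n_models=50, seed=0):
    """Bagging: average the predictions of models fit on bootstrap resamples."""
    rng = np.random.default_rng(seed)
    predictions = []
    for _ in range(n_models):
        # Sample the training set with replacement (one bootstrap resample).
        idx = rng.integers(0, len(x_train), size=len(x_train))
        coeffs = np.polyfit(x_train[idx], y_train[idx], deg=degree)
        predictions.append(np.polyval(coeffs, x_new))
    return np.mean(predictions, axis=0)


# Synthetic data, assumed for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=60)
y = np.sin(3.0 * x) + rng.normal(0.0, 0.2, size=60)

cv_scores = {degree: kfold_mse(x, y, degree) for degree in (1, 5, 9)}
x_grid = np.linspace(-1.0, 1.0, 5)
bagged = bagged_predict(x, y, x_grid, degree=5)

print(cv_scores)
print(bagged)
```

Comparing the cross-validated scores across degrees gives a less optimistic picture than training error alone, and averaging over bootstrap resamples smooths out the fit-to-fit fluctuation that a single flexible model would show.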