Guide for Over-fitting vs Under-fitting

Hi everyone!! This is my first article on this website, hope it helps you all! :)

The terms over-fitting and under-fitting are very common among people who work in the Machine Learning and Data Science field. In this article, we will look at these two terms, and a few related ones, to understand them better.

What is Over-fitting?

Over-fitting can be described as the case where the model has high variance and low bias.

What is Under-fitting?

Under-fitting can be described as the case where the model has high bias and low variance.

Oh ho, what are Bias and Variance now?

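  • Bias: In simple language, this is the error that comes from our model making too simple an assumption about the data, so it misses the real pattern.
  • Variance: In contrast to bias, this is the error that comes from our model being too complex for the training data, so it follows the noise in that data and changes a lot when the training data changes.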
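To make the two terms a bit more concrete, here is a minimal sketch that estimates bias and variance by refitting a model many times on freshly generated noisy data. Everything in it is an assumption made for illustration and not from the article: the sine curve as the "true" pattern, the noise level, the fixed training inputs, the polynomial degrees and the helper name `bias_variance`.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the run is repeatable

def true_fn(x):
    # Hypothetical "real pattern" behind the data
    return np.sin(2 * np.pi * x)

x_train = np.linspace(0.0, 1.0, 20)  # fixed training inputs (simplifying assumption)
x_grid = np.linspace(0.0, 1.0, 50)   # points where predictions are compared

def bias_variance(degree, n_repeats=200, noise=0.2):
    """Estimate squared bias and variance of a polynomial fit of the given degree."""
    preds = np.empty((n_repeats, x_grid.size))
    for i in range(n_repeats):
        # New noisy training targets on every repeat
        y_train = true_fn(x_train) + rng.normal(0.0, noise, x_train.size)
        model = np.polynomial.Polynomial.fit(x_train, y_train, degree)
        preds[i] = model(x_grid)
    mean_pred = preds.mean(axis=0)
    # Squared bias: how far the average prediction is from the true pattern
    bias_sq = float(np.mean((mean_pred - true_fn(x_grid)) ** 2))
    # Variance: how much the predictions move from one training set to the next
    variance = float(np.mean(preds.var(axis=0)))
    return bias_sq, variance

simple = bias_variance(degree=1)       # a straight line: very simple assumption
complex_model = bias_variance(degree=9)  # a wiggly curve: complex model
print("degree 1: bias^2=%.4f variance=%.4f" % simple)
print("degree 9: bias^2=%.4f variance=%.4f" % complex_model)
```

In this toy setup the straight line should show the larger bias and the degree-9 curve the larger variance, which is the trade-off the two bullets describe.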

image.png

As we can see in the above image, the first one is an example of high bias (under-fitting) and the last one is an example of high variance (over-fitting).

Okay, now tell us, how would we know whether our model is over-fitting (high variance) or under-fitting (high bias)?

One way is to look at the error in the model's predictions.

  • In case of Under-fitting: We have high error on the training data as well as on the testing data.

  • In case of Over-fitting: We have low error on the training data but high error on the testing data.
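The check above can be sketched in a few lines of code. This is only an illustration under assumptions that are not from the article: the toy sine data, the noise level, the sample sizes, and the polynomial degrees 1, 3 and 12 standing in for a too-simple, a reasonable and a too-complex model.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

def make_data(n):
    # Hypothetical toy data: a sine curve plus Gaussian noise
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y

def mse(y_true, y_pred):
    # Mean squared error, used here as "the error in model predictions"
    return float(np.mean((y_true - y_pred) ** 2))

x_train, y_train = make_data(20)   # small training set
x_test, y_test = make_data(200)    # separate testing set

errors = {}
for degree in (1, 3, 12):
    # Fit a polynomial of this degree on the training data only
    model = np.polynomial.Polynomial.fit(x_train, y_train, degree)
    errors[degree] = (mse(y_train, model(x_train)), mse(y_test, model(x_test)))
    print("degree %2d: train error=%.4f test error=%.4f" % (degree, *errors[degree]))
```

Reading the printed numbers the way the two bullets suggest: a model whose error is high on both sets is under-fitting, and a model whose training error is low while its testing error is much higher is over-fitting.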

Now that we know all the required terms, let's conclude and define Under-fitting and Over-fitting again.

Conclusion:

A model is said to be under-fit when it has high bias and low variance, which can be verified by checking that the model gives high error on both the training and the test dataset.

On the other hand, a model is said to be over-fit when it has high variance and low bias. To verify this, check that the model has high accuracy (low error) on the training data but high error on the test data.
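As a rough sketch, the two rules in this conclusion can be written as a small helper. The function name `diagnose` and the `high_error` threshold are hypothetical; what counts as "high" error depends on the problem and on the error metric, so treat the default value as a placeholder and not as a recommendation.

```python
def diagnose(train_error, test_error, high_error=0.1):
    """Label a model from its training and test error.

    high_error is a hypothetical, problem-specific cut-off for "high" error.
    """
    train_high = train_error >= high_error
    test_high = test_error >= high_error
    if train_high and test_high:
        return "under-fit"   # high error on both training and test data
    if not train_high and test_high:
        return "over-fit"    # low training error but high test error
    return "neither"         # no sign of either problem from these two numbers

print(diagnose(0.30, 0.35))
print(diagnose(0.01, 0.40))
print(diagnose(0.02, 0.03))
```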

Author: Satyampd