Cross Validation and Hyperparameter Tuning

Himanshu Sharma
4 min read · Apr 8, 2021

Hello everyone! In this blog post, I want to focus on the importance of cross validation and hyperparameter tuning. I will give a short overview of the topic.

Cross validation is a technique used to assess how well our model performs: we need to test the model's accuracy to verify that it is well trained on the data, with neither overfitting nor underfitting. This validation step is performed only after the model has been trained on the data.

First, let us understand the terms overfitting and underfitting.

When using statistical methods (such as logistic regression or linear regression) on our data, we generally split the data into training and testing samples, fit the model on the training samples, and make predictions on the test samples. At that point, there is a possibility of overfitting or underfitting the data.
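To make this concrete, here is a minimal sketch of that split-fit-predict workflow with scikit-learn; the dataset and the choice of logistic regression are illustrative assumptions on my part, not something prescribed above.

```python
# Split the data, fit on the training sample, evaluate on the test sample.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Hold out 25% of the data as a test sample.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)            # fit on the training sample
print(model.score(X_test, y_test))     # accuracy on the held-out test sample
```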

Overfitting

In statistics, overfitting means our model fits the training data too closely. The fitted line can end up passing almost exactly through every point on the graph, and such a model may fail to make reliable predictions on future data.

To lessen the chance of, or amount of, overfitting, several techniques are available (e.g. model comparison, cross-validation, regularization, early stopping, pruning, Bayesian priors, or dropout).
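As a small illustration of one of those techniques, regularization, the sketch below compares a high-degree polynomial fit with and without an L2 (ridge) penalty. The synthetic data, the polynomial degree, and the alpha value are arbitrary choices of mine for demonstration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(-3, 3, size=(40, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=40)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A high-degree polynomial with no regularization tends to overfit ...
overfit = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
# ... while an L2 penalty (Ridge) constrains the coefficients.
regularized = make_pipeline(PolynomialFeatures(degree=15), Ridge(alpha=1.0))

for name, model in [("no regularization", overfit), ("ridge", regularized)]:
    model.fit(X_train, y_train)
    print(name,
          "train:", round(model.score(X_train, y_train), 3),
          "test:", round(model.score(X_test, y_test), 3))
```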

Underfitting

Underfitting means our model does not fit the data well: the statistical model or machine learning algorithm cannot adequately capture the underlying trend or structure of the data, which hurts the model's accuracy.

K-fold Cross Validation

Our dataset should be as large as possible for training, and setting aside a considerable part of it for validation means losing a valuable portion of data that we would prefer to train on. To address this issue, we use the K-fold cross validation technique.

In K-fold cross validation, the data is divided into k subsets (folds). We train our model on k-1 of them and hold out the remaining one. This process is repeated k times, so that each of the k subsets is used exactly once as the validation set while the other k-1 subsets together form the training set. We then average the model's performance across the folds and finalize the model, and after that we test it against the test set.
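Here is a short sketch of this procedure using scikit-learn's cross_val_score; k=5, the dataset, and the logistic regression model are my own illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
model = LogisticRegression(max_iter=5000)

# Each of the k=5 folds is used once as the validation set while the
# remaining 4 folds form the training set.
scores = cross_val_score(model, X, y, cv=5)
print(scores)          # one score per fold
print(scores.mean())   # performance averaged across the folds
```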

The more folds we use, the lower the error due to bias, but the error due to variance increases; more folds also take longer to compute and require more memory. With a lower number of folds, we reduce the error due to variance, but the error due to bias is bigger, and it is also computationally cheaper. Therefore, for big datasets, k=3 is usually advised.

Hyperparameter Tuning

Hyperparameters are hugely important in getting good performance with models. In order to understand this process, we first need to understand the difference between a model parameter and a model hyperparameter.

Model parameters are internal to the model; their values can be estimated from the data, and we are usually trying to estimate them as well as possible. Hyperparameters, on the other hand, are external to the model and cannot be learned directly from the regular training process. They express "higher-level" properties of the model, such as its complexity or how fast it should learn, and they are fixed before you even train and test your model on data.
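A quick way to see the distinction is with scikit-learn's LogisticRegression (my own example, not from the discussion above): the regularization strength C is a hyperparameter we fix before training, while the coefficients and intercept are model parameters estimated from the data.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

model = LogisticRegression(C=0.5, max_iter=5000)  # C: hyperparameter, set by us
model.fit(X, y)

print(model.coef_[:, :3])   # model parameters: learned from the data
print(model.intercept_)
```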

The process for finding the right hyperparameters is still somewhat of a dark art, and it currently involves either random search or grid search across Cartesian products of sets of hyperparameters.

There are a bunch of methods available for tuning hyperparameters. In this blog post, I chose to demonstrate grid search.

Grid Search

Grid search takes a dictionary of all the different hyperparameter values that you want to test, feeds every combination through the algorithm for you, and then reports back which combination had the highest accuracy.
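Below is a hedged sketch of this idea using scikit-learn's GridSearchCV; the estimator (an SVC), the parameter grid, and the dataset are assumptions for illustration rather than part of the original example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Dictionary of hyperparameter values to try; GridSearchCV evaluates every
# combination using cross-validation.
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
}

search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)  # combination with the highest cross-validated score
print(search.best_score_)
```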
