Taseenreazniloy
4 min read · Apr 21, 2021


I have wanted to write something about model evaluation for a long time, but I could not manage my time. Many people find Model Evaluation in Machine Learning difficult and cannot grasp the main purpose and the ways of evaluating a model's performance. So, I have tried to write about it from my little knowledge. Hopefully, it will help others.

Machine Learning MoDeL EvaLuaTiOn:

Making a judgement about the amount or value of something, or about the result of an assessment, is known as Evaluation. Okay, coming to the main point: Model Evaluation in Machine Learning (ML).

In Machine Learning, we generally follow a few steps, which are given below (a quick code sketch of this flow appears after the list):👇
1: Taking the dataset as input
2: Cleaning the dataset (handling null values)
3: Exploratory Data Analysis
4: Image Augmentation (if the dataset consists of image data)
5: Building a Machine Learning Model
6: Model Evaluation
7: Deployment
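
As a rough illustration, here is a minimal sketch of this flow using scikit-learn. The file name "my_dataset.csv", the "label" column, and the choice of Logistic Regression are just assumptions for the example, not part of any specific project:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Steps 1-2: take the dataset as input and drop rows with null values
df = pd.read_csv("my_dataset.csv")   # hypothetical file
df = df.dropna()

# Step 3: a tiny bit of exploratory data analysis
print(df.describe())

# Step 5: build a simple model (assuming a column named "label" holds the target)
X = df.drop(columns=["label"])
y = df["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Step 6: model evaluation
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))
```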

Well, we all know that if an individual wants to be the best at dance, there is no alternative to a lot of practice. In the end, an assessment's result can evaluate that individual's level of skill in dance. This example is very similar to Machine Learning Model Evaluation, because in Machine Learning, when we build a model, we must measure its performance to judge whether it is a poor model or a good one.

Now that we know the importance of Model Evaluation, let me clarify how to evaluate a model's performance. Okay, here they are, the main Model Evaluation approaches, which can also be considered Model Evaluation indicators: the Confusion Matrix, the ROC-AUC curve, and K-fold cross-validation.

Aww, wait, wait …. be patient; it's not a bolt from the blue; trust me and go through the explanation. 👀👇

Confusion Matrix:

The confusion matrix summarizes the performance of a classification algorithm. It consists of four components: True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN).

Okay, let me make it clear….💁

True Positive (TP): The model predicted positive, and that is correct, such as predicting a pregnant woman as pregnant.

False Positive (FP): The model predicted positive, but that is wrong, such as predicting a man as pregnant.

True Negative (TN): The model predicted negative, and that is correct, such as predicting a man as not pregnant.

False Negative (FN): The model predicted negative, but that is wrong, such as predicting a pregnant woman as not pregnant.
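
To make this concrete, here is a minimal sketch of computing these four values with scikit-learn. The labels here are made up purely for illustration (1 = pregnant, 0 = not pregnant):

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 0, 1, 0, 0, 1]   # actual labels (made-up example)
y_pred = [1, 0, 0, 1, 1, 0, 0, 1]   # the model's predictions

# For binary labels [0, 1], sklearn returns the matrix as [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")

# Accuracy can be read directly from the four components
accuracy = (tp + tn) / (tp + tn + fp + fn)
print("Accuracy:", accuracy)
```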

ROC-AUC Curve:

Here, the other approach is the ROC-AUC curve (ROC: Receiver Operating Characteristic, AUC: Area Under the Curve). To extend ROC curves and ROC areas to multi-label classification, it is necessary to binarize the output. Wow!!!!!!!! How complex… 👿 ...

Let me explain…..😏

This indicator is used to measure how well different classification models perform. In other words, the ROC curve evaluates the output quality of a classifier and is generally applied in binary classification to judge a classifier's output.

In ROC curves, the True-Positive (TP) rate is plotted on the Y-axis and the False-Positive (FP) rate on the X-axis, which means that the top-left corner of the plot is the “ideal” point, with a True-Positive rate of one and a False-Positive rate of zero. Although this ideal is rarely reached in practice, it shows that a greater Area Under the Curve (AUC) usually indicates better model performance.

Have a look at the figure above, ☝

We can see that the higher the AUC value, the closer the curve gets to the top-left corner (towards 1), which indicates a better model.
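
As a small sketch of how this is computed in practice, the example below uses a synthetic dataset and Logistic Regression (both just assumptions for illustration). Note that the ROC curve works on predicted scores or probabilities, not on hard class labels:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score

# A synthetic binary classification problem, only for illustration
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]   # probability of the positive class

fpr, tpr, thresholds = roc_curve(y_test, scores)   # FP rate (X-axis), TP rate (Y-axis)
print("AUC:", roc_auc_score(y_test, scores))       # closer to 1.0 means a better model
```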

Here is the last model evaluation indicator……. 😉

K-Fold Cross-Validation:

It is a method that creates multiple subsets of a given dataset, and the model's prediction accuracy is measured on each subset in turn. Based on those results, the overall model performance can be estimated. Suppose we create 5 subsets of a dataset, each of equal size. These subsets are called folds, meaning we have divided our dataset into 5 similar folds; the number of folds is expressed by ‘K’. K-fold cross-validation helps reduce overfitting.

Please read the example below if you find the description of K-fold cross-validation tough…..👇

If we carry a jug full of water and run, some water may spill from the jug, which creates a loss. On the other hand, if we first pour some water from the jug into another pot, leaving some empty space at the top of the jug, and then run, the chance of spilling water is lower than before, which creates a smaller loss. This can be thought of as reducing overfitting.
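
For completeness, here is a minimal sketch of 5-fold cross-validation with scikit-learn. The built-in iris dataset and the Logistic Regression model are only placeholders for the example:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)          # a small built-in dataset, just for illustration
model = LogisticRegression(max_iter=1000)

# K = 5: the data is split into 5 equal folds; each fold takes a turn as the test set
kfold = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=kfold)

print("Accuracy per fold:", scores)
print("Mean accuracy:", scores.mean())
```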
