Figure 4-7. Evaluation metrics for regression and classification
Let us first look at the evaluation metrics for supervised regression.
Mean absolute error
The mean absolute error (MAE) is the average of the absolute differences between predictions and actual values. The MAE is a linear score, which means that all the individual differences are weighted equally in the average. It gives an idea of the magnitude of the error, but no idea of its direction (e.g., over- or underpredicting).
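As a minimal sketch, MAE can be computed with scikit-learn's mean_absolute_error; the toy arrays below are illustrative only and not from the text:

import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical actual and predicted values, for illustration only
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# MAE: average of |y_true - y_pred|; the sign (direction) of each error is lost
mae = mean_absolute_error(y_true, y_pred)
print(mae)  # 0.5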
Mean squared error
The mean squared error (MSE) is the average of the squared differences between predicted values and observed values (called residuals). Like the mean absolute error, it provides a gross idea of the magnitude of the error, although squaring means larger errors are penalized more heavily. Taking the square root of the mean squared error converts the units back to the original units of the output variable and can be meaningful for description and presentation. This is called the root mean squared error (RMSE).
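A minimal sketch of MSE and RMSE, reusing the hypothetical arrays above (np.sqrt is used for the root so the snippet does not depend on a particular scikit-learn version):

import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

mse = mean_squared_error(y_true, y_pred)  # average of squared residuals
rmse = np.sqrt(mse)                       # back in the units of the target variable
print(mse, rmse)  # 0.375 0.6123...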
R2 metric
The R2 metric provides an indication of the "goodness of fit" of the predictions to the actual values. In statistical literature this measure is called the coefficient of determination. It typically takes a value between zero and one, for no fit and perfect fit, respectively.
Adjusted R2 metric
Just like R2, adjusted R2 also shows how well terms fit a curve or line, but adjusts for the number of terms in a model. It is given by the following formula:
$R^2_{\mathrm{adj}} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$
where n is the total number of observations and k is the number of predictors. Adjusted R2 will always be less than or equal to R2.
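Since scikit-learn's r2_score reports only plain R2, the adjusted version can be sketched directly from the formula above; n, k, and the arrays are hypothetical:

import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

r2 = r2_score(y_true, y_pred)

n = len(y_true)  # total number of observations
k = 2            # hypothetical number of predictors in the model

# Adjusted R^2 penalizes extra predictors; it is always <= R^2
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(r2, r2_adj)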
Selecting an evaluation metric for supervised regression
In terms of a preference among these evaluation metrics, if the main goal is predictive accuracy, then RMSE is best: it is computationally simple, easily differentiable, and symmetric, with larger errors weighing more in the calculation. MAE is also symmetric but does not weigh larger errors more. R2 and adjusted R2 are often used for explanatory purposes, indicating how well the selected independent variable(s) explain the variability in the dependent variable.
Now let us look at the evaluation metrics for supervised classification.
Classification
For simplicity, we will mostly discuss things in terms of a binary classification problem (i.e., only two outcomes, such as true or false); the common terms, computed in the code sketch after these definitions, are:
True positives (TP)
Instances predicted positive that are actually positive.
False positives (FP)
Instances predicted positive that are actually negative.
True negatives (TN)
Instances predicted negative that are actually negative.
False negatives (FN)
Instances predicted negative that are actually positive.
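A minimal sketch of these four counts using scikit-learn's confusion_matrix; the binary labels are made up for illustration:

from sklearn.metrics import confusion_matrix

# Hypothetical binary labels: 1 = positive, 0 = negative
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary labels, ravel() returns the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp, fp, tn, fn)  # 3 1 3 1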
The difference between three commonly used evaluation metrics for classification (accuracy, precision, and recall) is illustrated in Figure 4-8.

Figure 4-8. Computation of accuracy, precision, and recall
Precision = True positives / (True positives + False positives)
Recall = True positives / (True positives + False negatives)
Accuracy = (True positives + True negatives) / Total
(The figure shows these quantities overlaid on a confusion matrix of predicted versus actual results.)
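Either the counts above or scikit-learn's built-in scorers give these metrics directly; a minimal sketch, reusing the hypothetical labels from the previous snippet:

from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Accuracy = (TP + TN) / total; precision = TP / (TP + FP); recall = TP / (TP + FN)
print(accuracy_score(y_true, y_pred))   # 6 / 8 = 0.75
print(precision_score(y_true, y_pred))  # 3 / (3 + 1) = 0.75
print(recall_score(y_true, y_pred))     # 3 / (3 + 1) = 0.75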
