Model goodness of fit

Diagnosing model goodness of fit

Whenever we train or calibrate a model, we can generate predictions from it. The natural question is: how good are those predictions? Generally, we answer it by comparing observed values with their corresponding predictions.

Some of the common goodness of fit measures are the root mean square error (RMSE), bias, the coefficient of determination (commonly called the R² value), and concordance.

You will also find in the digital soil mapping and general statistical literature various other model evaluation tests.

The RMSE is defined as:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(obs_i - pred_i\right)^2}$$

where obs is the observed soil property, pred is the predicted soil property from a given model, and n is the number of observations, indexed by i.

Bias, also called the mean error of prediction, is defined as:

$$bias = \frac{1}{n}\sum_{i=1}^{n}\left(pred_i - obs_i\right)$$
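
As a rough illustration of the two definitions, the sketch below computes RMSE and bias in base R. The vectors obs and pred are toy values invented for this example (they are not from the soil data), and bias is assumed to be taken as prediction minus observation, so a negative value means under-prediction on average.

```r
# Toy vectors, invented purely for illustration (not from the soil data set)
obs  <- c(10, 12, 15, 20)
pred <- c(11, 11, 16, 18)

# RMSE: square root of the mean squared difference between obs and pred
rmse <- sqrt(mean((obs - pred)^2))

# Bias (mean error), assumed here to be prediction minus observation
bias <- mean(pred - obs)

print(rmse)
print(bias)
```

Note that the RMSE is always non-negative, whereas the bias keeps its sign, so large errors in opposite directions can cancel in the bias but not in the RMSE.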

The R² is the proportion of the variation in the dependent variable (observed data) that is predictable from the independent variable (predicted values). The most general definition of R² is:

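$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$

where SS_res and SS_tot are the sum of squares of the residuals (also called the residual sum of squares) and the total sum of squares (proportional to the variance of the data), respectively:

$$SS_{res} = \sum_{i=1}^{n}\left(obs_i - pred_i\right)^2$$

$$SS_{tot} = \sum_{i=1}^{n}\left(obs_i - \overline{obs}\right)^2$$

Here $\overline{obs}$ is the mean of the observed values.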

The R² measures the precision of the relationship (between observed and predicted).
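
A minimal base R sketch of the sum-of-squares definition follows. The obs and pred vectors are toy values invented for illustration. The squared correlation is also computed, because some tools report that quantity as R²; the two agree only in particular cases (for example, fitted values from a least squares model with an intercept), so treat the comparison as an aside rather than as a statement about any specific package.

```r
# Toy vectors, invented purely for illustration
obs  <- c(10, 12, 15, 20)
pred <- c(11, 11, 16, 18)

# Residual and total sums of squares
ss_res <- sum((obs - pred)^2)
ss_tot <- sum((obs - mean(obs))^2)

# R2 from the general definition
r2 <- 1 - ss_res / ss_tot

# Squared Pearson correlation, which can differ from r2 in general
r2_cor <- cor(obs, pred)^2

print(c(r2 = r2, r2_cor = r2_cor))
```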

Concordance, or more formally Lin’s concordance correlation coefficient (Lin 1989), is by contrast a single statistic that evaluates both the accuracy and the precision of the relationship. It is often referred to as the goodness of fit along a 45 degree line. Thus it is probably a more useful statistic than the R² alone. Concordance ρ_c is defined as:

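$$\rho_c = \frac{2\rho\,\sigma_{pred}\,\sigma_{obs}}{\sigma_{pred}^2 + \sigma_{obs}^2 + \left(\mu_{pred} - \mu_{obs}\right)^2}$$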

where μ_pred and μ_obs are the means of the predicted and observed values respectively. σ_pred² and σ_obs² are the corresponding variances. ρ is the correlation coefficient between the predictions and observations.
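
The concordance formula can be sketched in base R as below. The vectors are toy values invented for illustration, and the variances are assumed to be divide-by-n (population) moments; an implementation that uses R's var(), which divides by n - 1, would give a slightly different value.

```r
# Toy vectors, invented purely for illustration
obs  <- c(10, 12, 15, 20)
pred <- c(11, 11, 16, 18)

# Means of predictions and observations
mu_pred <- mean(pred)
mu_obs  <- mean(obs)

# Divide-by-n variances (an assumption of this sketch)
var_pred <- mean((pred - mu_pred)^2)
var_obs  <- mean((obs - mu_obs)^2)

# Pearson correlation between predictions and observations
rho <- cor(pred, obs)

# Concordance: the correlation penalised for differences in scale and location
ccc <- (2 * rho * sqrt(var_pred) * sqrt(var_obs)) /
  (var_pred + var_obs + (mu_pred - mu_obs)^2)

print(ccc)
```

Because the denominator grows when the means or variances of the two vectors differ, the concordance can never exceed the absolute correlation; it equals the correlation only when predictions and observations share the same mean and variance.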

Example usage

So let's fit a simple linear model. We will use the USYD_soil1 data set from the ithir package, stored here as soil.data. First load the data in. We then want to regress CEC content on clay (also be sure to remove any NAs).

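library(ithir)
library(MASS)  # needed by goof when plot.it = TRUE

# load the example soil data
data(USYD_soil1)
soil.data <- USYD_soil1

# keep clay and CEC only, dropping rows with missing values
mod.data <- na.omit(soil.data[, c("clay", "CEC")])

# regress CEC on clay; x = TRUE and y = TRUE keep the model matrix and response
mod.1 <- lm(CEC ~ clay, data = mod.data, y = TRUE, x = TRUE)
mod.1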

## 
## Call:
## lm(formula = CEC ~ clay, data = mod.data, x = TRUE, y = TRUE)
## 
## Coefficients:
## (Intercept)         clay  
##      3.7791       0.2053

You may recall that this is the same model that was fitted during the introduction to R chapter. What we now want to do is evaluate some of the model goodness of fit statistics that were described above.

Conveniently, these are available in the goof function in the ithir package. We will use this function a lot when doing digital soil mapping, so it might be useful to describe it.

goof takes four inputs: a vector of observed values, a vector of predicted values, a logical choice of whether an output plot is required, and a character input of what type of output is required.

There are a number of possible goodness of fit statistics that can be requested, with only some being used frequently in digital soil mapping projects. Therefore, setting the type parameter to "DSM" will output only the R², RMSE, MSE, bias and concordance statistics, as these are the most relevant to DSM.

Additional statistics can be returned if "spec" is specified for the type parameter.

ithir::goof(observed = mod.data$CEC,
            predicted = mod.1$fitted.values, type = "DSM")

##          R2 concordance      MSE     RMSE bias
## 1 0.4213764   0.5888521 14.11304 3.756733    0

You may wish to generate a plot in which case you would set the plot.it logical to TRUE. Note that the MASS package also needs to be installed and loaded if you want to use the plot.it parameter.

This model mod.1 does not seem to be too bad. On average, the predictions are about 3.76 cmol (+)/kg off the true value (the RMSE). The bias of 0 indicates that the model is, on average, neither over- nor under-predictive, but we can see that a few high CEC values are influencing the concordance and R². This outcome may mean that other factors, such as mineralogy, also influence the CEC.
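
A zero bias is expected here rather than informative: an ordinary least squares model with an intercept always has residuals that sum to zero on the data it was fitted to. The sketch below illustrates this with simulated data. All names and values (x, y, the 35/15 split) are hypothetical and unrelated to the USYD_soil1 data; it also computes the bias on held-out rows, which is generally not zero.

```r
# Simulated data, invented purely for illustration
set.seed(1)
x <- runif(50, min = 0, max = 60)        # hypothetical predictor
y <- 4 + 0.2 * x + rnorm(50, sd = 3)     # hypothetical response
dat <- data.frame(x = x, y = y)

# Bias on the calibration data: zero up to floating point error
fit_all  <- lm(y ~ x, data = dat)
bias_cal <- mean(fitted(fit_all) - dat$y)

# Bias on held-out data: fit on the first 35 rows, predict the remaining 15
train <- dat[1:35, ]
test  <- dat[36:50, ]
fit_train <- lm(y ~ x, data = train)
bias_val  <- mean(predict(fit_train, newdata = test) - test$y)

print(c(calibration = bias_cal, validation = bias_val))
```

For this reason, the bias says little when it is computed on the same data used to fit a linear model; it becomes more telling when predictions are compared against data the model has not seen.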

References

Lin, L I. 1989. “A Concordance Correlation Coefficient to Evaluate Reproducibility.” Biometrics 45: 255–68.