Universal Kriging 101
- Data preparation
- Perform the covariate intersection
- Model fitting
- External validation
The universal kriging function in
R is found in the
gstat package. It is useful from the view that both the regression model and variogram modeling of the residuals are handled in one modelling system.
Using universal kriging, one can efficiently derive prediction uncertainties by way of the kriging variance. A limitation of universal kriging in the true sense of the model parameter fitting is that the model is linear.
The general preference is DSM studies is to used non-linear and recursive models that do not require strict model assumptions and assume a linear relationship between target variable and covariates.
So lets firstly get the data organized. Recall from before in the data preparatory exercises that we were working with the soil point data and environmental covariates for the Hunter Valley area. These data are stored in the
HV_subsoilpH object together with associated rasters supplied with the
For the succession of models examined in these various pages, we will concentrate on modelling and mapping the soil pH for the 60-100cm depth interval. To refresh, lets load the data in, then intersect the data with the available covariates.
# libraries library(ithir) library(MASS) library(terra) library(sf) library(gstat) # point data data(HV_subsoilpH) # Start afresh round pH data to 2 decimal places HV_subsoilpH$pH60_100cm <- round(HV_subsoilpH$pH60_100cm, 2) # remove already intersected data HV_subsoilpH <- HV_subsoilpH[, 1:3] # add an id column HV_subsoilpH$id <- seq(1, nrow(HV_subsoilpH), by = 1) # re-arrange order of columns HV_subsoilpH <- HV_subsoilpH[, c(4, 1, 2, 3)] # Change names of coordinate columns names(HV_subsoilpH)[2:3] <- c("x", "y") # save a copy of coordinates HV_subsoilpH$x2 <- HV_subsoilpH$x HV_subsoilpH$y2 <- HV_subsoilpH$y # convert data to sf object HV_subsoilpH <- sf::st_as_sf(x = HV_subsoilpH, coords = c("x", "y")) # grids (covariate rasters from ithir package) hv.sub.rasters <- list.files(path = system.file("extdata/", package = "ithir"), pattern = "hunterCovariates_sub", full.names = TRUE) # read them into R as SpatRaster objects hv.sub.rasters <- terra::rast(hv.sub.rasters)
# extract covariate data DSM_data <- terra::extract(x = hv.sub.rasters, y = HV_subsoilpH, bind = T, method = "simple") DSM_data <- as.data.frame(DSM_data)
Often it is handy to check to see whether there are missing values both in the target variable and of the covariates. It is possible that a point location does not fit within the extent of the available covariates. In these cases the data should be excluded. A quick way to assess whether there are missing or
NA values in the data is to use the
which(!complete.cases(DSM_data)) ## integer(0) DSM_data <- DSM_data[complete.cases(DSM_data), ]
There do not appear to be any missing data as indicated by the
integer(0) output above i.e there are zero rows with missing information.
First we will take a subset (using the random hold back sampling strategy) of the data to use for out-of-bag model evaluation.
# model calibration dataset set.seed(123) training <- sample(nrow(DSM_data), 0.7 * nrow(DSM_data)) cDat <- DSM_data[training, ] names(cDat)[3:4] <- c("x", "y")
Now lets parametise the universal kriging model, and we will use selected covariates that were used in the multiple linear regression example.
# calculate residual variogram vgm1 <- variogram(pH60_100cm ~ AACN + Landsat_Band1 + Elevation + Hillshading + Mid_Slope_Positon + MRVBF + NDVI + TWI, ~x + y, data = cDat, width = 200, cutoff = 3000) # establish some intital variogram model parameters mod <- vgm(psill = var(cDat$pH60_100cm), "Exp", range = 3000, nugget = 0) # fit model to residual variogram model_1 <- fit.variogram(vgm1, mod) model_1 ## model psill range ## 1 Nug 0.5386983 0.0000 ## 2 Exp 0.8326639 221.5831 # plot variogram and fitted model plot(vgm1, model = model_1) # Universal kriging model object gUK <- gstat(NULL, "pH", pH60_100cm ~ AACN + Landsat_Band1 + Elevation + Hillshading + Mid_Slope_Positon + MRVBF + NDVI + TWI, data = cDat, locations = ~x + y, model = model_1) gUK ## data: ## pH : formula = pH60_100cm`~`AACN + Landsat_Band1 + Elevation + Hillshading + Mid_Slope_Positon + MRVBF + NDVI + TWI ; data dim = 354 x 13 ## variograms: ## model psill range ## pH Nug 0.5386983 0.0000 ## pH Exp 0.8326639 221.5831 ## ~x + y
Using the out-of-bag data we can evaluate the performance of universal kriging using the
## out-of-bag data vDat <- DSM_data[-training, ] names(vDat)[3:4] <- c("x", "y") # make the predictions on out-of-bag data UK.preds.V <- krige(pH60_100cm ~ AACN + Landsat_Band1 + Elevation + Hillshading + Mid_Slope_Positon + MRVBF + NDVI + TWI, cDat, locations = ~x + y, model = model_1, newdata = vDat) ## [using universal kriging] # evaluate UK predictions goof(observed = DSM_data$pH60_100cm[-training], predicted = c(UK.preds.V[, 3]), plot.it = T) ## R2 concordance MSE RMSE bias ## 1 0.4381711 0.5779432 1.077 1.037786 0.0287364
The universal kriging model performs a little better than the MLR that was fitted which did not include additional spatial structure through the autocorrelation of model residuals. In the absence of additional covariate data that may help describe better the spatial properties of a given soil target variable, spatial modelling of the residuals from a deterministic model can help mop up that source of uncertainty to some extent.
Applying the universal kriging model spatially is facilitated through the
Kriging results in two main outputs: the prediction and the prediction variance.
## Apply model spatially par(mfrow = c(1, 2)) ## # prediction and prediction variance UK.P.map <- terra::interpolate(object = hv.sub.rasters, model = gUK, xyOnly = FALSE) plot(UK.P.map[], main = "Universal kriging predictions") plot(UK.P.map[], main = "prediction variance")