Universal Kriging

CODE:

Universal Kriging 101

Introduction
Data preparation
Perform the covariate intersection
Model fitting
External validation
Interpolation

Introduction

The universal kriging function in R is found in the gstat package. It is useful from the view that both the regression model and variogram modeling of the residuals are handled in one modelling system.

Using universal kriging, one can efficiently derive prediction uncertainties by way of the kriging variance. A limitation of universal kriging in the true sense of the model parameter fitting is that the model is linear.

The general preference is DSM studies is to used non-linear and recursive models that do not require strict model assumptions and assume a linear relationship between target variable and covariates.

Data preparation

So lets firstly get the data organized. Recall from before in the data preparatory exercises that we were working with the soil point data and environmental covariates for the Hunter Valley area. These data are stored in the HV_subsoilpH object together with associated rasters supplied with the ithir package.

For the succession of models examined in these various pages, we will concentrate on modelling and mapping the soil pH for the 60-100cm depth interval. To refresh, lets load the data in, then intersect the data with the available covariates.

# libraries
library(ithir)
library(MASS)
library(terra)
library(sf)
library(gstat)

# point data
data(HV_subsoilpH)

# Start afresh round pH data to 2 decimal places
HV_subsoilpH$pH60_100cm <- round(HV_subsoilpH$pH60_100cm, 2)

# remove already intersected data
HV_subsoilpH <- HV_subsoilpH[, 1:3]

# add an id column
HV_subsoilpH$id <- seq(1, nrow(HV_subsoilpH), by = 1)

# re-arrange order of columns
HV_subsoilpH <- HV_subsoilpH[, c(4, 1, 2, 3)]

# Change names of coordinate columns
names(HV_subsoilpH)[2:3] <- c("x", "y")
# save a copy of coordinates
HV_subsoilpH$x2 <- HV_subsoilpH$x
HV_subsoilpH$y2 <- HV_subsoilpH$y

# convert data to sf object
HV_subsoilpH <- sf::st_as_sf(x = HV_subsoilpH, coords = c("x", "y"))

# grids (covariate rasters from ithir package)
hv.sub.rasters <- list.files(path = system.file("extdata/", package = "ithir"), pattern = "hunterCovariates_sub", full.names = TRUE)

# read them into R as SpatRaster objects
hv.sub.rasters <- terra::rast(hv.sub.rasters)

Perform the covariate intersection

# extract covariate data
DSM_data <- terra::extract(x = hv.sub.rasters, y = HV_subsoilpH, bind = T, method = "simple")
DSM_data <- as.data.frame(DSM_data)

Often it is handy to check to see whether there are missing values both in the target variable and of the covariates. It is possible that a point location does not fit within the extent of the available covariates. In these cases the data should be excluded. A quick way to assess whether there are missing or NA values in the data is to use the complete.cases function.

which(!complete.cases(DSM_data))

## integer(0)

DSM_data <- DSM_data[complete.cases(DSM_data), ]

There do not appear to be any missing data as indicated by the integer(0) output above i.e there are zero rows with missing information.

Model fitting

First we will take a subset (using the random hold back sampling strategy) of the data to use for out-of-bag model evaluation.

# model calibration dataset
set.seed(123)
training <- sample(nrow(DSM_data), 0.7 * nrow(DSM_data))
cDat <- DSM_data[training, ]
names(cDat)[3:4] <- c("x", "y")

Now lets parametise the universal kriging model, and we will use selected covariates that were used in the multiple linear regression example.

# calculate residual variogram
vgm1 <- variogram(pH60_100cm ~ AACN + Landsat_Band1 + Elevation + Hillshading + Mid_Slope_Positon +
MRVBF + NDVI + TWI, ~x + y, data = cDat, width = 200, cutoff = 3000)

# establish some intital variogram model parameters
mod <- vgm(psill = var(cDat$pH60_100cm), "Exp", range = 3000, nugget = 0)

# fit model to residual variogram
model_1 <- fit.variogram(vgm1, mod)
model_1

##   model     psill    range
## 1   Nug 0.5386983   0.0000
## 2   Exp 0.8326639 221.5831

# plot variogram and fitted model 
plot(vgm1, model = model_1)

# Universal kriging model object
gUK <- gstat(NULL, "pH", pH60_100cm ~ AACN + Landsat_Band1 + Elevation + Hillshading + Mid_Slope_Positon + MRVBF + NDVI + TWI, data = cDat, locations = ~x + y, model = model_1)
gUK

## data:
## pH : formula = pH60_100cm`~`AACN + Landsat_Band1 + Elevation + Hillshading + Mid_Slope_Positon + MRVBF + NDVI + TWI ; data dim = 354 x 13
## variograms:
##       model     psill    range
## pH[1]   Nug 0.5386983   0.0000
## pH[2]   Exp 0.8326639 221.5831
## ~x + y

External validation

Using the out-of-bag data we can evaluate the performance of universal kriging using the ithir::goof function.

## out-of-bag data
vDat <- DSM_data[-training, ]
names(vDat)[3:4] <- c("x", "y")

# make the predictions on out-of-bag data
UK.preds.V <- krige(pH60_100cm ~ AACN + Landsat_Band1 + Elevation + Hillshading +
Mid_Slope_Positon + MRVBF + NDVI + TWI, cDat, locations = ~x + y, model = model_1, newdata = vDat)

## [using universal kriging]

# evaluate UK predictions
goof(observed = DSM_data$pH60_100cm[-training], predicted = c(UK.preds.V[, 3]), plot.it = T)

##          R2 concordance   MSE     RMSE      bias
## 1 0.4381711   0.5779432 1.077 1.037786 0.0287364

The universal kriging model performs a little better than the MLR that was fitted which did not include additional spatial structure through the autocorrelation of model residuals. In the absence of additional covariate data that may help describe better the spatial properties of a given soil target variable, spatial modelling of the residuals from a deterministic model can help mop up that source of uncertainty to some extent.

Interpolation

Applying the universal kriging model spatially is facilitated through the terra::interpolate function.

Kriging results in two main outputs: the prediction and the prediction variance.

## Apply model spatially
par(mfrow = c(1, 2))
## # prediction and prediction variance
UK.P.map <- terra::interpolate(object = hv.sub.rasters, model = gUK, xyOnly = FALSE)
plot(UK.P.map[[1]], main = "Universal kriging predictions")
plot(UK.P.map[[2]], main = "prediction variance")

rconsole — Universal kriging prediction and prediction variance of Hunter Valley soil pH (60-100cm).