## Reprojections, resampling and rasterisation

A useful thing to do in R in the GIS context are the procedures around resampling and re-projections. Such tasks are useful in the situation of aligning spatial data to common resolutions and extents. For digital soil mapping this is an important and regular task to perform. Similarly rasterisation of polygon data is regularly performed. This page will cover some know-how on how to do these varied tasks.

Code is available here

The specific data with which to do the exercises is found here

Initialize the libraries that are needed

library(raster);library(rgdal);library(sp)


### Data preparation

In this first part we will be working with three raster layers, each covering the same geographical area but have different projection systems, resolutions and spatial extents. After you have downloaded the data, you can use the following code to establish where they are located and retrieve the metadata of each raster.

files <- list.files("~", pattern = "tif$", full.names = T) files ##  "~/20140509.B3.tif" ##  "~/elevation.tif" ##  "~/GR_Th.tif" # RASTER 1: remote sensing data rs.raster <- raster(files) rs.raster ## class : RasterLayer ## dimensions : 3026, 4037, 12215962 (nrow, ncol, ncell) ## resolution : 0.0002245788, 0.0002245788 (x, y) ## extent : 149.863, 150.7696, -31.64765, -30.96807 (xmin, xmax, ymin, ymax) ## crs : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 ## source : /~/20140509.B3.tif ## names : X20140509.B3 # RASTER 2: digital elevation model elev.raster <- raster(files) elev.raster ## class : RasterLayer ## dimensions : 54, 81, 4374 (nrow, ncol, ncell) ## resolution : 100, 100 (x, y) ## extent : 1510462, 1518562, -3636821, -3631421 (xmin, xmax, ymin, ymax) ## crs : +proj=lcc +lat_1=-18 +lat_2=-36 +lat_0=0 +lon_0=134 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs ## source : /~/elevation.tif ## names : elevation ## values : 312.4171, 590.6609 (min, max) # RASTER 3: Very granular gamma radiometrics data gamma.raster <- raster(files) gamma.raster ## class : RasterLayer ## dimensions : 606, 865, 524190 (nrow, ncol, ncell) ## resolution : 10, 10 (x, y) ## extent : 1510418, 1519068, -3637349, -3631289 (xmin, xmax, ymin, ymax) ## crs : +proj=lcc +lat_1=-18 +lat_2=-36 +lat_0=0 +lon_0=134 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs ## source : /~/GR_Th.tif ## names : GR_Th ## values : 1.815541, 8.94882 (min, max)  Back to top ### Raster resampling The function to use here is resample. Here we want to get the elev.raster to be the same resolution and extent to the gamma.raster. Essentially going from a 100m grid cell to a 10m grid cell. The resampling method used in bi-linear interpolation. The alternative is nearest neighbor interpolation which is generally not appropriate for numerical data, dependent on the context and use-case of course. Stacking rasters is a nice feature of raster as it confirms to the user that a collection of rasters all have the same CRS, resolution and extent. This is really important for doing digital soil mapping. When rasters don’t stack, it means there is a problem, and one will have difficulty in subsequent steps in the DSM workflow. The stack function is the tool that we need. Also the brick function appears to do the same sort of thing as stack too, but seems to be used more for temporal data, or data stacks of the same provenance. stack appears to be a generic function. # resample rs.grid <- resample(x = elev.raster, y = gamma.raster, method = "bilinear") rs.grid ## class : RasterLayer ## dimensions : 606, 865, 524190 (nrow, ncol, ncell) ## resolution : 10, 10 (x, y) ## extent : 1510418, 1519068, -3637349, -3631289 (xmin, xmax, ymin, ymax) ## crs : +proj=lcc +lat_1=-18 +lat_2=-36 +lat_0=0 +lon_0=134 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs ## source : memory ## names : elevation ## values : 312.4438, 590.6609 (min, max) # stack r3 <- stack(rs.grid, gamma.raster) r3 ## class : RasterStack ## dimensions : 606, 865, 524190, 2 (nrow, ncol, ncell, nlayers) ## resolution : 10, 10 (x, y) ## extent : 1510418, 1519068, -3637349, -3631289 (xmin, xmax, ymin, ymax) ## crs : +proj=lcc +lat_1=-18 +lat_2=-36 +lat_0=0 +lon_0=134 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs ## names : elevation, GR_Th ## min values : 312.443825, 1.815541 ## max values : 590.66095, 8.94882  Back to top ### Simultaneous reprojection and resampling The function projectRaster is an incredibly useful function from the raster package. As the name suggests it is used for changing data from one projection system to another. This function also usefully does resampling to by default where the raster (or stack of rasters) to be processed is given the same CRS, resolution and extent as the target raster (or stack of rasters). First, the default use case in demonstrated, followed by a change to the target CRS but with a resolution change. If you recall from earlier, the rs.grid has the WGS1984 geographic CRS: +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0, where the grid cell have a resolution of about 22m, albeit expressed in decimal degrees. The target raster is the elev.raster which has the GDA94/ Geoscience Australia Lambert projection where the units are expressed in meters. The resolution of this raster is 100m. Therefore in this instance we do a reprojection and resample all in one step with the following code. Note the use of nearest neighbor interpolation. # reproject remote sensing grid pl.grid <- projectRaster(from = rs.raster, to = elev.raster, method = "ngb") # stack r4 <- stack(pl.grid, elev.raster) r4 ## class : RasterStack ## dimensions : 54, 81, 4374, 2 (nrow, ncol, ncell, nlayers) ## resolution : 100, 100 (x, y) ## extent : 1510462, 1518562, -3636821, -3631421 (xmin, xmax, ymin, ymax) ## crs : +proj=lcc +lat_1=-18 +lat_2=-36 +lat_0=0 +lon_0=134 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs ## names : X20140509.B3, elevation ## min values : 0.05698202, 312.41711426 ## max values : 0.2888759, 590.6609497  Although both rasters above stack together you will note by plotting each raster that the remote sensing raster appears has a rectangular extent, yet the elev.raster is clipped to some boundary. Both rasters actually have the same extent, but for the elev.raster, part of this data extent has no data. This is because this data is just concerned with a particular landholding and there is not any need to have the data surrounding it. In the exercise concerned with raster clipping, one can, if there is a requirement, to clip the pl.grid to the bounds of the data in elev.raster. Back to top ### Reprojection and resolution change Rather than set the target raster to the same properties as a source raster, one can re-project to a common CRS, but set the resolution to what is desired. Below the rs.raster to reprojected to the the same CRS as the elev.raster, and outputted to 50m resolution (rather than 100m as is the source raster). # reproject and resample (50m pixels) pl.grid50 <- projectRaster(from = rs.raster, crs = crs(elev.raster), method = "ngb", res = 50) pl.grid50 ## class : RasterLayer ## dimensions : 1711, 1898, 3247478 (nrow, ncol, ncell) ## resolution : 50, 50 (x, y) ## extent : 1486689, 1581589, -3675366, -3589816 (xmin, xmax, ymin, ymax) ## crs : +proj=lcc +lat_1=-18 +lat_2=-36 +lat_0=0 +lon_0=134 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs ## source : memory ## names : X20140509.B3 ## values : -Inf, 0.5445645 (min, max)  Back to top ### Polygon reprojection Back in the work with point patterns data you were shown how to change the CRS of SpatialPointsDataFrames. This type of transformation can also be readily extended to spatial lines and polygons too. Below, we will be working with a shapefile where the data is given as polygons, and is in fact a legacy soil map for the same geographic area as the raster data we were working with above. What we will do is load the data, do a reprojection, and a rasterisation procedure on this data. Ultimately for a digital soil mapping context we might want to use polygon data as a predictive covariate. Therefore these data need to be processed accordingly. Now to load the data we use the readOGR function form the sp package. You will note after import of the shapefile this data is encoded as a SpatialPolygonsDataFrame. The data frame elements of this polygon data contain the pertinent details of each polygon which in our case is descriptive information of the soil type and landscape. poly.dat <- readOGR("/~/Soils_Curlewis_GDA94.shp") ## OGR data source with driver: ESRI Shapefile ## Source: "/~/Soils_Curlewis_GDA94.shp", layer: "Soils_Curlewis_GDA94" ## with 354 features ## It has 6 fields poly.dat ## class : SpatialPolygonsDataFrame ## features : 354 ## extent : 150.0011, 150.5011, -31.49843, -30.99843 (xmin, xmax, ymin, ymax) ## crs : +proj=longlat +ellps=GRS80 +no_defs ## variables : 6 ## names : NAME, LANDSCAPE, CODE, LENGTH, SHAPE_AREA, PROCESS ## min values : BATTERY HILL, AElo, bh, 0.004701920115403, 3.82996751e-07, AEOLIAN ## max values : YARRAMAN, TRwfa, ya, 2.62094635880552, 0.031703676998096, TRANSFERRAL  Lets find out a bit more information about the poly.dat object i.e. the soil map. # Attribute table head(poly.dat@data) ## NAME LANDSCAPE CODE LENGTH SHAPE_AREA PROCESS ## 0 KILPHYSICS ROAD TRkr kr 0.06258586 1.763738e-04 TRANSFERRAL ## 1 EAST LYNNE COel el 0.03153234 1.838048e-05 COLLUVIAL ## 2 LONG MOUNTAIN RElm lm 0.03033508 1.855672e-05 RESIDUAL ## 3 QUIRINDI CREEK ALqc qc 0.05039743 8.478940e-05 ALLUVIAL ## 4 QUIRINDI CREEK ALqc qc 0.02419802 2.233679e-05 ALLUVIAL ## 5 CONADILLY ALcd cd 0.03408230 3.911834e-05 ALLUVIAL # soil type code and frequency summary(poly.dat@data$CODE)

##  bh  bj  bo  ca  cd  cr  cu  dh  el  fr  gi  gl  go  gy  ha  kr  lg lga  lm  lo
##  18  26   9  10  16  12   1   2  12  12   6   4  16   3   2   1   2   2  13  14
##  lr  ma  md  mm  mn  mo  mt  mw  nj  ph  pn  po  qc  sg  ta  tf  tu  wa wfa  xx
##   3   1   1  17   2   4   6   2   9  12  10   8  13  29   5   7  11  10   1  21
##  ya
##   1

summary(as.factor(as.numeric(poly.dat@data$CODE))) # numeric factor ## 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 ## 18 26 9 10 16 12 1 2 12 12 6 4 16 3 2 1 2 2 13 14 3 1 1 17 2 4 ## 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 ## 6 2 9 12 10 8 13 29 5 7 11 10 1 21 1 # Basic plot plot(poly.dat) invisible(text(getSpPPolygonsLabptSlots(poly.dat), # labels = as.character(poly.dat@data$CODE), cex = 0.4))

# coordinate reference system
crs(poly.dat)

## CRS arguments: +proj=longlat +ellps=GRS80 +no_defs


Currently the data is in GDA94 CRS: +proj=longlat +ellps=GRS80 +no_defs, and we want to have it in the project GDA94 Lambert system as for the rasters that were processed above.

## Reproject to new projection (newProj)
newProj <- "+proj=lcc +lat_1=-18 +lat_2=-36 +lat_0=0 +lon_0=134 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs"
poly.dat.T <- spTransform(poly.dat, CRSobj = newProj)
class(poly.dat.T)

##  "SpatialPolygonsDataFrame"
## attr(,"package")
##  "sp"


To confirm this worked we can plot the remote sensing grid (pl.grid50) and overlay the polygon data.

plot(pl.grid)
invisible(text(getSpPPolygonsLabptSlots(poly.dat.T), labels = as.character(poly.dat.T@data$CODE), cex = 1.2)) Gridded remote sensing data overlayed with soil map polygons. Each data source have the same CRS. Back to top ### Rasterisation The function we need here is rasterize from the raster package. Here we will rasterise the CODE variable of the polygon data, and fit it to the same resolution and extent as the pl.grid50 remote sensing data. # rasterise poly.raster <- rasterize(poly.dat.T, pl.grid50, field = "CODE") names(poly.raster) <- "soil_map_CODE" # stack r5 <- stack(pl.grid50, poly.raster) r5 ## class : RasterStack ## dimensions : 1711, 1898, 3247478, 2 (nrow, ncol, ncell, nlayers) ## resolution : 50, 50 (x, y) ## extent : 1486689, 1581589, -3675366, -3589816 (xmin, xmax, ymin, ymax) ## crs : +proj=lcc +lat_1=-18 +lat_2=-36 +lat_0=0 +lon_0=134 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs ## names : X20140509.B3, soil_map_CODE ## min values : -Inf, 1 ## max values : 0.5445645, 41.0000000  Plotting rasters in R is easily facilitated with the plot() function, and is hackable in terms of customising legends, titles and other graphical features. Experience of using this function for categorical data like we have just created in poly.raster does not result in very nice looking maps. The approach to map categorical variables in R nicely can be done via levelplot from the rasterVis package. The below code is a generic one to use to display categorical maps. # plot rasterised polygon r2 <- as.factor(r2) rat <- levels(r2)[] # Match raster levels to polygon code m1 <- c(as.matrix(levels(r2)[])) m2 <- levels(poly.dat@data$CODE)[m1] Rasterised soil map created using the levelplot function. This function is more suitable to use for raster mapping in particular in instances of working with categorical data.