Using R to implement Kriging – A Spatial Interpolation technique for Geostatistics data

0
6477

The Kriging interpolation technique is being increasingly used in geostatistics these days. But how does Kriging work to create a prediction, after all?

To start with, Kriging is a method where the distance and direction between the sample data points indicate a spatial correlation. This correlation is then used to explain the different variations in the surface.

In cases where the distance and direction give appropriate spatial correlation, Kriging will be able to predict surface variations in the most effective way. As such, we often see Kriging being used in Geology and Soil Sciences.

Kriging generates an optimal output surface for prediction which it estimates based on a scattered set with z-values. The procedure involves investigating the z-values’ spatial behavior in an ‘interactive’ manner where advanced statistical relationships are measured (autocorrelation). Mathematically speaking, Kriging is somewhat similar to regression analysis and its whole idea is to predict the unknown value of a function at a given point by calculating the weighted average of all known functional values in the neighborhood. To get the output value for a location, we take the weighted sum of already measured values in the surrounding (all the points that we intend to consider around a specific radius), using a  formula such as the following: In a regression equation, λi would represent the weights of how far the points are from the prediction location. However, in Kriging, λi represent not just the weights of how far the measured points are from prediction location, but also how the measured points are arranged spatially around the prediction location. First, the variograms and covariance functions are generated to create the spatial autocorrelation of data. Then, that data is used to make predictions.

Thus, unlike the deterministic interpolation techniques like Inverse Distance Weighted (IDW) and Spline interpolation tools, Kriging goes beyond just estimating a prediction surface. Here, it brings an element of certainty in that prediction surface.

That is why experts rate kriging so highly for a strong prediction. Instead of a weather report forecasting a 2 mm rain on a certain Saturday, Kriging also tells you what is the “probability” of a 2 mm rain on that Saturday.

We hope you enjoy this simple R tutorial on Kriging by Berry Boessenkool.

Geostatistics: Kriging – spatial interpolation between points, using semivariance

We will be covering following sections in our tutorial with supported illustrations:

• Packages
• Variogram
• Kriging
• Plotting

Kriging: packages

install.packages("rgeos")
install.packages("sf")
install.packages("geoR")

# st_centroid,  st_area, st_union
library(geoR)  # as.geodata, variog,  variofit,
# krige.control,  krige.conv, legend.krige

## Warning: 	package ’sf’ was built under R version 3.4.1

Kriging: read shapefile / few points for demonstration

x <-  c(1,1,2,2,3,3,3,4,4,5,6,6,6)
y <-  c(4,7,3,6,2,4,6,2,6,5,1,5,7)
z <-  c(5,9,2,6,3,5,9,4,8,8,3,6,7)
plot(x,y, pch="+",  cex=z/4) GEODATA  <-  as.geodata(cbind(x,y,z))
plot(GEODATA)

Kriging: Variogram I

EMP_VARIOGRAM  <-  variog(GEODATA)

## variog: computing  omnidirectional  variogram

FIT_VARIOGRAM  <-  variofit(EMP_VARIOGRAM)

## variofit: covariance model used is  matern
## variofit: weights   used:  npairs
## variofit: minimisation function  used:  optim

## Warning in variofit(EMP_VARIOGRAM): initial values  not  provided - running the default  search

## variofit: searching for  best  initial value  ... selected values:
##                        	sigmasq phi   	tausq  kappa
## initial.value "9.19"   "3.65" "0"        	"0.5"
## status          	"est"  	"est"   "est"  "fix"
## loss value: 401.578968904954

Kriging: Variogram II

plot(EMP_VARIOGRAM)
lines(FIT_VARIOGRAM)

Kriging: Kriging

res  <-  0.1
grid <-  expand.grid(seq(min(x),max(x),res),
seq(min(y),max(y),res))
krico <-  krige.control(type.krige="OK",
obj.model=FIT_VARIOGRAM)
krobj <-  krige.conv(GEODATA,
locations=grid, krige=krico)

## krige.conv: model with  constant mean
## krige.conv: Kriging performed  using  global  neighbourhood

# KRigingObjekt

Kriging: Plotting I

image(krobj, col=rainbow2(100))
legend.krige(col=rainbow2(100),
x.leg=c(6.2,6.7),  y.leg=c(2,6),
vert=T, off=-0.5,
values=krobj\$predict)
colPoints(x,y,z, col=rainbow2(100),  legend=F)
points(x,y)

Kriging: Plotting II

library("berryFunctions")  # scatterpoints by color

Kriging: Plotting III

colPoints(grid[ ,1],  grid[ ,2],  krobj\$predict,  add=F,
cex=2,  col2=NA,  legargs=list(y1=0.8,y2=1))

Time for a real dataset

Precipitation from ca 250 gauges in Brandenburg  as Thiessen Polygons with steep gradients at edges: Exercise 41: Kriging

2. With colPoints in the package berryFunctions, add the precipitation  values at the centroids of the polygons.
3. Calculate the variogram and fit a semivariance curve.
4. Perform kriging on a grid with a useful resolution (keep in mind that computing time rises exponentially  with grid size).
5. Plot the interpolated  values with image or an equivalent (Rclick 4.15)
7. What went wrong? (if you used the defaults, the result will be dissatisfying.) How can you fix it?

Solution for exercise 41.1-2: Kriging Data

# Shapefile:
quiet=TRUE)
# Plot  prep
pcol  <-  colorRampPalette(c("red","yellow","blue"))(50)
clss <-  berryFunctions::classify(p\$P1,  breaks=50)\$index
# Plot
par(mar  = c(0,0,1.2,0))
plot(p, col=pcol[clss], max.plot=1)  # P1:  Precipitation
# kriging  coordinates
cent  <-  sf::st_centroid(p)
legargs=list(y1=0.8,y2=1),  col=pcol)
points(cent\$x,  cent\$y,  cex=0.7)

Solution for exercise 41.3: Variogram

library(geoR)
# Semivariance:
geoprec  <-  as.geodata(cbind(cent\$x,cent\$y,p\$P1))
vario <-  variog(geoprec,  max.dist=130000)

## variog: computing  omnidirectional variogram

fit <-variofit(vario)

## Warning in variofit(vario):  initial values not provided - running the  default search

## variofit: searching for  best  initial value  ... selected values:
##                        	sigmasq	phi           	tausq  kappa
## initial.value "1326.72"  "19999.93" "0"       	"0.5"
## status          	"est"      	"est"        	"est" "fix"
## loss value: 107266266.76371

plot(vario) ; lines(fit)
# distance to  closest other   point:
d <-  sapply(1:nrow(cent),  function(i)  min(berryFunctions::distance(
cent\$x[i],  cent\$y[i],  cent\$x[-i],  cent\$y[-i])))
hist(d/1000,  breaks=20,  main="distance to  closest  gauge [km]")
mean(d)  # 8 km

##   8165.633

Solution for exercise 41.4-5: Kriging

# Kriging:
res  <-  1000 # 1 km, since  stations are  8 km  apart   on average
grid <-  expand.grid(seq(min(cent\$x),max(cent\$x),res),
seq(min(cent\$y),max(cent\$y),res))
krico <-  krige.control(type.krige="OK",  obj.model=fit)
krobj <-  krige.conv(geoprec, locations=grid, krige=krico)

## krige.conv: model with  constant mean
## krige.conv: Kriging performed  using  global  neighbourhood

# Set  values  outside of  Brandenburg to  NA:
grid_sf <-  sf::st_as_sf(grid,  coords=1:2,  crs=sf::st_crs(p))
isinp <-  sapply(sf::st_within(grid_sf,  p), length) > 0
krobj2 <-  krobj
krobj2\$predict[!isinp]  <-  NA

Solution for exercise 41.5: Kriging Visualization

geoR:::image.kriging(krobj2,  col=pcol)
colPoints(cent\$x,  cent\$y,  p\$P1, col=pcol, zlab="Prec",  cex=0.7,
legargs=list(y1=0.1,y2=0.8,  x1=0.78,  x2=0.87,  horiz=F))
plot(p, add=T, col=NA,  border=8)#;  points(cent\$x,cent\$y,  cex=0.7)   Berry started working with R in 2010 during his studies of Geoecology at Potsdam University, Germany. He has since then given a number of R programming workshops and tutorials, including full-week workshops in Kyrgyzstan and Kazachstan. He has left the department for environmental science in summer 2017 to focus more on software development and teaching in the data science industry. Please follow the Github link for detailed explanations on Berry’s R courses.