Categories: TutorialsData

Using R to implement Kriging – A Spatial Interpolation technique for Geostatistics data

6 min read

The Kriging interpolation technique is being increasingly used in geostatistics these days. But how does Kriging work to create a prediction, after all?

To start with, Kriging is a method where the distance and direction between the sample data points indicate a spatial correlation. This correlation is then used to explain the different variations in the surface.

In cases where the distance and direction give appropriate spatial correlation, Kriging will be able to predict surface variations in the most effective way. As such, we often see Kriging being used in Geology and Soil Sciences.

Kriging generates an optimal output surface for prediction which it estimates based on a scattered set with z-values. The procedure involves investigating the z-values’ spatial behavior in an ‘interactive’ manner where advanced statistical relationships are measured (autocorrelation).

Mathematically speaking, Kriging is somewhat similar to regression analysis and its whole idea is to predict the unknown value of a function at a given point by calculating the weighted average of all known functional values in the neighborhood. To get the output value for a location, we take the weighted sum of already measured values in the surrounding (all the points that we intend to consider around a specific radius), using a  formula such as the following:

In a regression equation, λi would represent the weights of how far the points are from the prediction location. However, in Kriging, λi represent not just the weights of how far the measured points are from prediction location, but also how the measured points are arranged spatially around the prediction location. First, the variograms and covariance functions are generated to create the spatial autocorrelation of data. Then, that data is used to make predictions.

Thus, unlike the deterministic interpolation techniques like Inverse Distance Weighted (IDW) and Spline interpolation tools, Kriging goes beyond just estimating a prediction surface. Here, it brings an element of certainty in that prediction surface.

That is why experts rate kriging so highly for a strong prediction. Instead of a weather report forecasting a 2 mm rain on a certain Saturday, Kriging also tells you what is the “probability” of a 2 mm rain on that Saturday.

We hope you enjoy this simple R tutorial on Kriging by Berry Boessenkool.

Geostatistics: Kriging – spatial interpolation between points, using semivariance

We will be covering following sections in our tutorial with supported illustrations:

  • Packages
  • read shapefile
  • Variogram
  • Kriging
  • Plotting

Kriging: packages

install.packages("rgeos") 
install.packages("sf") 
install.packages("geoR")

library(sf)  # for st_read  (read  shapefiles),
                       # st_centroid,  st_area, st_union
library(geoR)  # as.geodata, variog,  variofit,
                      # krige.control,  krige.conv, legend.krige

## Warning:  package ’sf’ was built under R version 3.4.1

Kriging: read shapefile / few points for demonstration

x <-  c(1,1,2,2,3,3,3,4,4,5,6,6,6)
y <-  c(4,7,3,6,2,4,6,2,6,5,1,5,7) 
z <-  c(5,9,2,6,3,5,9,4,8,8,3,6,7) 
plot(x,y, pch="+",  cex=z/4)

Kriging: read shapefile II

GEODATA  <-  as.geodata(cbind(x,y,z))
plot(GEODATA)

Kriging: Variogram I

EMP_VARIOGRAM  <-  variog(GEODATA)
 
## variog: computing  omnidirectional  variogram
 
FIT_VARIOGRAM  <-  variofit(EMP_VARIOGRAM)
 
## variofit: covariance model used is  matern
## variofit: weights   used:  npairs
## variofit: minimisation function  used:  optim
 
## Warning in variofit(EMP_VARIOGRAM): initial values  not  provided - running the default  search
 
## variofit: searching for  best  initial value  ... selected values:
##                         sigmasq phi    tausq  kappa
## initial.value "9.19"   "3.65" "0"         "0.5"
## status           "est"   "est"   "est"  "fix"
## loss value: 401.578968904954

Kriging: Variogram II

plot(EMP_VARIOGRAM)
lines(FIT_VARIOGRAM)

Kriging: Kriging

res  <-  0.1
grid <-  expand.grid(seq(min(x),max(x),res),
                                seq(min(y),max(y),res))
krico <-  krige.control(type.krige="OK",
                                   obj.model=FIT_VARIOGRAM)
krobj <-  krige.conv(GEODATA,
                                locations=grid, krige=krico)
 
## krige.conv: model with  constant mean
## krige.conv: Kriging performed  using  global  neighbourhood
 
# KRigingObjekt

Kriging: Plotting I

image(krobj, col=rainbow2(100))
legend.krige(col=rainbow2(100),
                     x.leg=c(6.2,6.7),  y.leg=c(2,6),
                     vert=T, off=-0.5,
                     values=krobj$predict)
contour(krobj, add=T)
colPoints(x,y,z, col=rainbow2(100),  legend=F)
points(x,y)

Kriging: Plotting II

library("berryFunctions")  # scatterpoints by color
colPoints(x,y,z, add=F, cex=2,  legargs=list(y1=0.8,y2=1))

Kriging: Plotting III

colPoints(grid[ ,1],  grid[ ,2],  krobj$predict,  add=F,
                cex=2,  col2=NA,  legargs=list(y1=0.8,y2=1))

Time for a real dataset

Precipitation from ca 250 gauges in Brandenburg  as Thiessen Polygons with steep gradients at edges:

Exercise 41: Kriging

  1. Load and plot the shapefile in PrecBrandenburg.zip with sf::st_read.
  2. With colPoints in the package berryFunctions, add the precipitation  values at the centroids of the polygons.
  3. Calculate the variogram and fit a semivariance curve.
  4. Perform kriging on a grid with a useful resolution (keep in mind that computing time rises exponentially  with grid size).
  5. Plot the interpolated  values with image or an equivalent (Rclick 4.15)
  6. and add contour lines.
  7. What went wrong? (if you used the defaults, the result will be dissatisfying.) How can you fix it?

Solution for exercise 41.1-2: Kriging Data

# Shapefile:
p <-  sf::st_read("data/PrecBrandenburg/niederschlag.shp",
                           quiet=TRUE)
# Plot  prep
pcol  <-  colorRampPalette(c("red","yellow","blue"))(50)
clss <-  berryFunctions::classify(p$P1,  breaks=50)$index
# Plot
par(mar  = c(0,0,1.2,0))
plot(p, col=pcol[clss], max.plot=1)  # P1:  Precipitation
# kriging  coordinates
cent  <-  sf::st_centroid(p)
berryFunctions::colPoints(cent$x,  cent$y,  p$P1, add=T, cex=0.7,
                                          legargs=list(y1=0.8,y2=1),  col=pcol)
points(cent$x,  cent$y,  cex=0.7)

Solution for exercise 41.3: Variogram

library(geoR)
# Semivariance:
geoprec  <-  as.geodata(cbind(cent$x,cent$y,p$P1))
vario <-  variog(geoprec,  max.dist=130000)
 
## variog: computing  omnidirectional variogram

fit <-variofit(vario)

## Warning in variofit(vario):  initial values not provided - running the  default search

## variofit: searching for  best  initial value  ... selected values:
##                         sigmasq phi            tausq  kappa
## initial.value "1326.72"  "19999.93" "0"        "0.5"
## status           "est"       "est"         "est" "fix"
## loss value: 107266266.76371

plot(vario) ; lines(fit)
# distance to  closest other   point:
d <-  sapply(1:nrow(cent),  function(i)  min(berryFunctions::distance(
                                           cent$x[i],  cent$y[i],  cent$x[-i],  cent$y[-i])))
hist(d/1000,  breaks=20,  main="distance to  closest  gauge [km]")
mean(d)  # 8 km

## [1]  8165.633

Solution for exercise 41.4-5: Kriging

# Kriging:
res  <-  1000 # 1 km, since  stations are  8 km  apart   on average
grid <-  expand.grid(seq(min(cent$x),max(cent$x),res),
                                 seq(min(cent$y),max(cent$y),res))
krico <-  krige.control(type.krige="OK",  obj.model=fit)
krobj <-  krige.conv(geoprec, locations=grid, krige=krico)
 
## krige.conv: model with  constant mean
## krige.conv: Kriging performed  using  global  neighbourhood
 
# Set  values  outside of  Brandenburg to  NA:
grid_sf <-  sf::st_as_sf(grid,  coords=1:2,  crs=sf::st_crs(p))
isinp <-  sapply(sf::st_within(grid_sf,  p), length) > 0
krobj2 <-  krobj
krobj2$predict[!isinp]  <-  NA

Solution for exercise 41.5: Kriging Visualization

geoR:::image.kriging(krobj2,  col=pcol)
colPoints(cent$x,  cent$y,  p$P1, col=pcol, zlab="Prec",  cex=0.7,
                legargs=list(y1=0.1,y2=0.8,  x1=0.78,  x2=0.87,  horiz=F))
plot(p, add=T, col=NA,  border=8)#;  points(cent$x,cent$y,  cex=0.7)

[author title=”About the author”]Berry started working with R in 2010 during his studies of Geoecology at Potsdam University, Germany. He has since then given number of R programming workshops and tutorials, including full-week workshops in Kyrgyzstan and Kazachstan. He has left the department for environmental science in summer 2017 to focus more on software development and teaching in the data science industry. Please follow the Github link for detailed explanations on Berry’s R courses. [/author]

Guest Contributor

Share
Published by
Guest Contributor

Recent Posts

Top life hacks for prepping for your IT certification exam

I remember deciding to pursue my first IT certification, the CompTIA A+. I had signed…

3 years ago

Learn Transformers for Natural Language Processing with Denis Rothman

Key takeaways The transformer architecture has proved to be revolutionary in outperforming the classical RNN…

3 years ago

Learning Essential Linux Commands for Navigating the Shell Effectively

Once we learn how to deploy an Ubuntu server, how to manage users, and how…

3 years ago

Clean Coding in Python with Mariano Anaya

Key-takeaways:   Clean code isn’t just a nice thing to have or a luxury in software projects; it's a necessity. If we…

3 years ago

Exploring Forms in Angular – types, benefits and differences   

While developing a web application, or setting dynamic pages and meta tags we need to deal with…

3 years ago

Gain Practical Expertise with the Latest Edition of Software Architecture with C# 9 and .NET 5

Software architecture is one of the most discussed topics in the software industry today, and…

3 years ago