• AGENDA
• ARTICLES
• PUBLICATIONS
• Your current location:
Tomislav Hengl
SCIENCE.UVA.NL
MAIL: UvA, Nieuwe Achtergracht 166, 1018 WV Amsterdam, NL
TEL: +31-(0)20-5257379
FAX: +31-(0)20-5257431
• DEM parameterization
• Fuzzy classification in GIS
• Grid size calculator
• Regression-kriging
• Supervised landform classification
• Visualization of uncertainty
• Writing Research Articles

Pedometrics.org

ONCE AN ITC STUDENT, ALWAYS AN ITC ALUMNI!

CROATIAN ENGLISH

by Tomislav Hengl

Post-doctoral researcher [SCIENCE.UVA.NL]

REGRESSION-KRIGING

RK is a spatial interpolation technique that combines a regression of the dependent variable on auxiliary variables (such as terrain parameters, remote sensing imagery and thematic maps) with kriging of the regression residuals. It is mathematically equivalent to interpolation method variously called "Universal Kriging" and "Kriging with External Drift" (the proofs are given down-below), where auxiliary predictors are used directly to solve the kriging weights. Such algorithms will play more and more important role because the number of possible covariates is today increasing dramatically. DEMs are now available from a number of sources. Detailed and accurate images of topography can now be ordered from remote sensing systems such as SPOT and ASTER; SPOT5 offers the High Resolution Stereoscopic (HRS) scanner, which can give DEMs at resolutions of up to 5 m. Finer differences in elevation can also be obtained with airborne laser-scanners. The cost of data is either free or dropping in price as technology advances. NASA recorded most of the world's topography in the Shuttle Radar Topographic Mission in 2000. From summer of 2004, these data has been available for almost whole globe at resolution of about 90 m (for the North American continent at resolution of about 30 m). Likewise, MODIS multispectral images are freely available for download at resolutions of 250 m.

See also: the practical guide to Regression-Kriging


A Practical Guide to Geostatistical Mapping of Environmental Variables, pp. 143. EUR 22904 EN Scientific and Technical Research series (9 MB)
Comparison of kriging with external drift and regression-kriging, pp. 18. Technical note with example (325 KB)
Comparison of KED and RK using an example with 20 points (123 KB)

GENERIC FRAMEWORK FOR REGRESSION-KRIGING

On this page you can also find the step-by-step instructions to run the regression-kriging with your own dataset in a statistically sound manner. This is what we called the generic framework for regression-kriging. Unfortunatelly, there is still no GIS package where it could be run yet. As far as I am aware of, only in Gstat (see also the practical guide to Regression-Kriging), you should be able to run it in a statistically optimal manner. We used ILWIS together with VESPER and S-Plus to run regression-kriging, but this was far from semi-automated data processing. For more details see the complete reference:

Please cite as: Hengl T., Heuvelink G.B.M., Rossiter D.G., 2007. About regression-kriging: from equations to case studies. Computers and Geosciences, 33(10): 1301-1315.

 


Figure: Regression-kriging in theory: fitting a vertical cross-section with assumed distribution of a soil variable in horizontal space.

 

DO-IT-YOURSELF:

INPUTS:
  • Interpolation set (point map) - z(si) i=1,...,n; at primary locations
  • Minimum and maximum expected values (Eq 8) and measurement precision (dz)
  • Validation set (point map) - z*(sj) j=1,...,l
  • Continuous predictors (raster map) - q(s); at new unvisited locations
  • Discrete predictors (polygon map)
  • Threshold values for predictions (z1, z2) and uncertainty (u1, u2)
  • Lag spacing, limiting distance and variogram model
 


 

STEP 1.
Data preparation
Transform the target variable using the logit-transfomation (Eq 1).
Transform the polyon map to indicator maps. Each class becomes a separate variable (map) - qc(s).
Create point map by using the table to point map operation showing the interpolation set.
Rasterize point map and polygon map to same georeference.
   
STEP 2.
Regression analysis
Extract the transformed target variable and the predictors to a table sheet.
Run step-wise regression to find the optimal subset of predictors. This gives you the OLS coefficients.
   
STEP 3.
Analysis of residuals

Analyse residuals for spatial autocorrelation.
Determine the semivariance type and parameters - C0, C1, R.

TIPS:
#1 First analyse the residuals for normality using e.g. the correlation test for normality.
#2 Use automated variogram modelling and the exponential model to fit the variogram.

   
STEP 4.
Estimation of GLS coefficients
Use the covariance matrix to derive the GLS coefficients (Eq 4).
Make predictions (drift) at primary locations and derive residuals e(si).
   
STEP 6.
Predictions

Interpolate the residuals and sum them to the drift estimation map (Eq 3).

TIP:
#1 Make sure you calculate the residuals using Eq 3, so that the signs stay the same.

   
STEP 7.
Prediction uncertainty

Calculate variance of the prediction error using Eq 7.

TIP:
#1 Derive this error with the range of variation of the original variable to obtain the relative prediction error Eq 10.

   
STEP 8.
Back-transformation to original scale
Back-transform predictions using Eq 8.
   
STEP 9.
Visualisation
Use the prediction map and prediction uncertainty map, select the threshold levels for predictions and uncertainty and produce an RGB image.
   
OUTPUTS: 
  • Map of predictions and relative prediction error
  • Best subset of predictors and correlation significance (R2)
  • Variogram model parameters (C0, C1, R)
  • GLS drift model coefficients
  • Coefficient of correlation between the ordered residuals
  • HSI coded image
  • Accuracy of prediction at validation points: mean prediction error (MPE), root mean square prediction error (RMSPE)

Overview of formulas:

The target variable is transform to logit variable by:

( 1. )

The original target variable is first standardised to the 0 to 1 range, which can be done via:

( 2. )

where min and max are the physical minimum and maximum.

A soil property at an new, unvisited location (s0) is predicted by (Christensen, 1990; p. 269):

( 3. )

where q0 is a matrix of predictors at new locations, beta is a vector of regression coefficients, lamda is vector of kriging weights at new locations and qare the predictors at primary locations.

The drift model coefficients are estimated using the generalized least squares (GLS) to account for spatial correlation of residuals (Cressie, 1993; p. 166):

( 4. )

where q is the matrix of predictors (nxp+1), z++is the vector of sampled (transformed) observations at primary locations and C is the covariance matrix of residuals:

( 5. )

The covariances C(si,sj), also writen as C(h), are typically estimated by modelling a variogram (Isaaks and Srivastava, 1989). A flexible variogram model is the exponential:

( 6. )

where γ(h) is the semivariance function, C0, C1 and R are variogram parameters and h is matrix of distances between the points.

The variogram is commonly modelled using semivariances and then, for the reasons of computational efficiency, covariances are used. Note that the estimation of the residuals is an iterative process: first the drift model is estimated using the ordinary least squares (OLS) estimation, then the covariance function of the residuals is used to obtain the GLS coefficients. The variance of the predict error is calculated by (Cressie, 1993; p. 155):

( 7. )

where c0 is the vector of covariances of residuals at unvisited location.

The predictions are back-transformed to original scale by:

( 8. )

The variance of prediction error can be used, though, to derive confidence limits:

( 9. )

or to derive the normalized prediction error:

( 10. )

where sz++ is the standard deviation of the transformed variable.

 
spatial-analyst.net/regkriging visited:  6461  times
Comment
 

 

 

LAST UPDATE: January 31, 2008
 Total unique visitors: 68859   this month: 1120 #1 hengl by Google © Tomislav Hengl, 1995-2008