20th Century Droughts

The Global Precipitation Climatology Project (GPCP)[1] of Deutscher Wetterdienst created a global dataset of monthly rainfall covering the years 1901-2011 at 0.5o resolution. It is based on rain gauge observations and was first published in 2011.

Rainfall patterns and variability vary greatly from one location to another. Moreover, the human eye is easily fooled by randomness in time-series data.  A convenient way to handle these issues is to replace the observed (highly non-normal and seasonally varying) rainfall amounts by a standardized index such that an index  value of say, -2, represents an extremely dry period (-2 standard deviations from the mean implies that rainfall is lower than the observed value only 2.1% of the time). Another advantage of using a Standardized Precipitation Index (SPI) is that a separate index can be defined with respect to each cumulative rainfall time-scale of interest. For instance, low values of the 3-month index means dry soil, whereas low values of the 12-month index could be consistent with either wet or dry soils, but may mean empty reservoirs and dry streams .

Plotting SPI as a function of both time and time-scale can give insight into nature of droughts at a particular location. The graphic highlights 20th century droughts (actually 1901-2012) for some agriculturally important places, extracted from the GPCP dataset.

The graphic includes some truly terrible historical events. For example, the Ukraine and Volga drought of 1920 and extensive period of drought in the Volga during the 1930s cost millions of lives.

The Russian droughts of 1935-39 coincided with the dust bowl years visible in the index data for Kansas. The 1950s drought in Kansas was at least as severe but it’s impacts were limited by improved agricultural practices and disaster response. The recent drought in Texas and Chihuahua province of Northern Mexico are the most extreme there for at least 50 years. The 2008 drought in Buenos Aires province of Argentina (which helped fuel food-price inflation globally) is without precedent in the GPCP record.

While droughts are an inevitable fact of life, it is also possible that the agricultural revolution which began in the 1960s took place during an unusually benign period. I hope to look into this in a future post.

[1] Schneider, Udo; Becker, Andreas; Finger, Peter; Meyer-Christoffer, Anja; Rudolf, Bruno; Ziese, Markus (2011): GPCC Full Data Reanalysis Version 6.0 at 0.5°: Monthly Land-Surface Precipitation from Rain-Gauges built on GTS-based and Historic Data.

SPI algorithm

There are many alternative ways to define a standardized precipitation index. In the approach used above , no functional form for the distribution of rainfall was assumed. SPI was computed based on the empirical cumulative distribution function( ecdf) only. ecdf can be defined from low  to high values values of precipitation P(>x) or from high to low values P(<x). The algorithm (R function)  given below symmetrises over these two alternatives. I think this is an improvement over the algorithm given in a previous post.

getSPI <- function(y){
 #empirical, symmetrized Standard Precipitation Index
 fit.cdf <- ecdf(y)
 fit.rcdf <- ecdf(-y)
 cdfs <- fit.cdf(y)
 rcdfs <- 1 -fit.rcdf(-y)
 #invert normal
 spi.t <- qnorm(cdfs)
 spi.tp <- na.omit(spi.t[ spi.t != Inf ]) #drop Inf
 rspi.t <- qnorm(rcdfs)
 rspi.tp <- na.omit(rspi.t[ rspi.t != -Inf ]) #drop Inf
 spi.sym <- (spi.t+rspi.t)/2
 spi.sym[which(spi.t == Inf)] <- rspi.t[which(spi.t==Inf)]
 spi.sym[which(rspi.t == -Inf)] <- spi.t[which(rspi.t==-Inf)]



  1. Sure. Put your spi index data in “melted” form with columns “time”,”timescale”,”state”,”spi”.

    ggplot(spi.melt,aes(time,timescale,fill=spi)) + geom_tile()+scale_fill_gradientn(name=”index”,colours=brewer.pal(9,”RdBu”)) + facet_grid(state~.)

  2. Hi,

    How did you exacted the SPI value per State (is it by averaging or some other method)

  3. Hi Joe,

    My data is in .csv (or .xls).
    Kind help me write a script that can generate SPI and plot it.
    Have check this script you uploaded but I was not able to use it.

  4. Hi,
    Thank you very much for sharing the R codes.
    As I saw in your script, SPI value at different time scale is calculated for one grid point. And I wonder how can we creat the global SPI with the output as Netcdf file (.nc)
    I’m a beginner and just I just started use R to compute SPI value for global land during 1901-2014 period.
    Would you like to guide me
    Thank you very much

  5. Hi,
    should be straightforward using the raster package. A global monthly rainfall dataset in netcdf format can be read into R as a raster “stack” (or brick). The rainfall stack can be transformed into an SPIs stack using the functions calc and getSPI above. Output as an .nc file using writeRaster( … , format=”CDF”,…).
    Joe Wheatley

  6. Hi Joe

    I’m sorry to bombard you with many questions, but I’m very keen on learning R script for computing SPI so I’m actually reading through a number of pages of your posts. Thank you very much again for sharing!

    I tried to use getSPI function you provided in this post, and was wondering how many months of timescale doesn the returned value represents? (i.e. 1,3,6,9,12 months SPI?)



Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.