Ensemble Prediction

Weather is inherently hard to predict. Small differences in initial conditions can grow into large differences in the circulation pattern and in the timing and location of cyclones, rainfall, and so on. This is true no matter how good the initial observing system is.

The approach taken by organisations such as ECMWF or NCEP is to re-run numerical forecast models with a range of carefully chosen initial conditions. The collection of runs is called the ensemble. Ensemble prediction systems (EPS) give probabilistic forecasts for variables such as rainfall and temperature. Current operational EPS have 20 (GFS) or 51 (ECMWF) ensemble members from which the probability distributions are derived. ECMWF give an overview of their system here. The probability distributions capture part of the intrinsic uncertainty in weather or climate.
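As a rough illustration of how a probability can be read off an ensemble, the sketch below takes the fraction of members below a threshold. The member values and the helper name `prob_below` are made up for illustration; operational systems may calibrate or weight members rather than simply counting them.

```r
# Hypothetical 2 m temperatures (deg C) from a 20-member ensemble at one grid point.
# These numbers are invented for illustration; they are not real forecast data.
members <- c(-1.2, 0.4, 1.1, 2.3, 3.0, -0.5, 1.8, 2.9, 4.1, 0.9,
              1.5, -2.0, 2.2, 3.6, 0.1, 1.0, 2.7, -0.3, 1.9, 2.4)

# Empirical probability: the fraction of members below a threshold.
# prob_below is a hypothetical helper, not part of any package.
prob_below <- function(x, threshold) {
  mean(x < threshold)
}

# Estimated probability of frost (temperature below 0 deg C) at this point.
p_frost <- prob_below(members, 0)
print(p_frost)
```

With only 20 members the estimate moves in steps of 0.05, which is one reason larger ensembles give smoother probability distributions.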

The graph below shows histograms of 20 ensemble member temperatures near some major cities. The data were extracted from the NCEP GENS 16-day 2m temperature forecast produced at 00 UTC 2 Feb 2010 (i.e. GFS forecasts for 18 Feb).


The maps below show some corresponding ensemble statistics for the entire globe (1° resolution, equal area cylindrical projection).


The upper map indicates that forecast uncertainty (standard error) is high between 40° and 60° in both hemispheres, which is related to the chaotic behaviour of jet streams. Currently, 16-day temperatures north of Lake Baikal in Siberia are very uncertain, for example. The contours indicate ensemble median temperatures.

Skewness in ensemble temperatures is shown in the lower map. For example, large negative skewness is found in the north-central US, the eastern Mediterranean, and Paraguay/Mato Grosso. This suggests a tail risk of low temperatures relative to the ensemble mean in these areas.
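Per-grid-cell statistics of this kind can be sketched in base R as follows. This is not the code behind the maps. The array layout (lon x lat x member), the synthetic values, and the moment-based `skewness` helper are all assumptions, and the spread here is the plain standard deviation across members, which may differ from the estimator used for the upper map.

```r
# Moment-based sample skewness (an assumed estimator; base R has none built in).
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Hypothetical ensemble on a tiny 3 x 2 grid with 20 members,
# filled with synthetic temperatures in Kelvin.
set.seed(42)
ens <- array(rnorm(3 * 2 * 20, mean = 275, sd = 4), dim = c(3, 2, 20))

# Collapse the member dimension: one value per grid cell.
ens_sd     <- apply(ens, c(1, 2), sd)        # spread across members
ens_median <- apply(ens, c(1, 2), median)    # ensemble median
ens_skew   <- apply(ens, c(1, 2), skewness)  # asymmetry of the member distribution

print(dim(ens_sd))
```

A negative value in `ens_skew` means the members have a longer tail on the cold side of the mean, which is the situation described for the regions above.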




EPS is the future of weather and climate forecasting. These systems produce huge amounts of data. Building useful applications of EPS is both a challenge and an opportunity.

For anyone interested, the R code used to produce these graphs is given here.



  1. A related Windows question: for the line:

    shell(paste("wgrib2 -s ",t," | grep \"TMP:2 m\" | wgrib2 -i ",t," -netcdf ",tmp,sep=""),intern=T)

    how did you get wgrib2 to run in Windows? I can't find where I might download a Windows-ready version.

  2. Thanks. I could not get to that site with my browser but got in via FileZilla. I managed to get your script to run. I had to download GNU grep, FWTools (up to version 4.2.7 now) and the RColorBrewer package as well.

    One more item to point out: I had to add a specific reference to the “RColorBrewer” package, which your version did not include.

    Thanks for making this available. There is some R based processing I’ve been thinking of doing with the GFS ensemble for some time. This gets me started.


  3. I am a very new user of R and the weather files, so my apologies if this is a dumb question. After some work I got wgrib2 and grep installed and working, but I run into an error when pulling the 2m temps. It seems that the following line …

    t2m.en[[i]] <- get.var.ncdf(t2m,"TMP_2maboveground")

    …should be changed to:

    t2m.en[[i]] <- get.var.ncdf(t2m,"TMP__1_2maboveground")

    Is this correct?

  4. This is killing me! Is there any reason why the TMP_2maboveground variable name would be prefixed by …



    for each of the ensemble runs?

    I tested my wgrib2 statement manually in the command window and it seems to produce the same prefixed variable names listed above, which prevent the following line from working:

    t2m.en[[i]] <- get.var.ncdf(t2m,"TMP_2maboveground")

    I tried modifying the above line to the following to dynamically create the variable name:

    t2m.en[[i]] <- get.var.ncdf(t2m,nm)

    While this does seem to produce the charts, the results don't seem reasonable: the histograms show temps in the -30 range (Kelvin?).

    I'm so close but can't seem to crack it, and I have become obsessed with figuring this out. Any assistance would be greatly appreciated.

  5. Mike,

    As a sanity check, try making a map of the ensemble mean. It should be obvious what the problem is, e.g. the map is inverted.

    By the way, I recommend R’s raster package. It gives a nice concise way of handling weather ensembles.

  6. Thank you very much for the reply. However, I don’t understand. Are you saying my map is inverted? How do I uninvert it?

    Why does the following line from your code return a “variable not found” error?

    t2m.en[[i]] <- get.var.ncdf(t2m,"TMP_2maboveground")

    When I look at the variable names for the t2m variable (i.e. print(t2m)), it suggests (I think) that somewhere in the process the variable name has been modified in the following way:

    from TMP_2maboveground to TMP__1_2maboveground
    from TMP_2maboveground to TMP__2_2maboveground
    from TMP_2maboveground to TMP__3_2maboveground

    from TMP_2maboveground to TMP__20_2maboveground

    Will look at the raster documentation, but any assistance you could provide for this new user would be very much appreciated.

  7. The line should be changed to read:

    t2m.en[[i]] <- get.var.ncdf(t2m,paste("TMP__",i,"_2maboveground",sep=""))

    I tested the script with that change and it runs correctly.

  8. Thanks. That’s what it looked like to me, but as a new user (of R, GRIB2) I wasn’t sure whether I was somehow modifying the names or doing something wrong. At some point, I guess NCEP changed the naming convention of the field to include the ensemble number as part of the variable name. Thanks again.
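The naming fix discussed in the comments can be checked without any GRIB or NetCDF file by building the per-member names on their own. This is only a sketch: `member_var` is a hypothetical helper, and the name pattern is the one reported in the comments, which may differ in other versions of the data or of wgrib2.

```r
# Build the per-member NetCDF variable name reported in the comments,
# e.g. "TMP__1_2maboveground" for member 1.
# member_var is a hypothetical helper, not part of the original script.
member_var <- function(i, field = "2maboveground") {
  paste("TMP__", i, "_", field, sep = "")
}

# Names for all 20 ensemble members.
vars <- sapply(1:20, member_var)
print(head(vars, 3))
```

Comparing `vars` with the names printed by print(t2m) before reading any data shows quickly whether the file follows this convention.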

