Black Swan White Mountain

Everyone knows that tree rings mark annual variations in tree growth. Biomass production is influenced by climatic variability (precipitation, temperature, sunlight) as well as by disease, pests, fire and so on. Tree rings can be measured very accurately and used to recover information about past climate over thousands of years.

The World Data Center for Paleoclimate (NOAA) maintains a tree ring database. As well as raw data on growth rings for individual trees, the database contains “chronology” files labelled ***.crn. These files contain annual growth data averaged over small stands of trees and indexed relative to the mean growth. Thus an index value of 1000 is average growth, an index lower than 1000 is below-average growth, and so on. A simple R script loads the chronology file “ca535.crn” into a time series, treeRing.ts. ca535.crn describes annual growth in an ancient Bristlecone Pine forest in the White Mountains of California from 6000 BC to 1979 AD. The world’s oldest known (non-clonal) tree is a member of the stand of Bristlecone Pines making up this index. The area is known as Methuselah Grove.
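To make the indexing and the time-series idea concrete, here is a minimal sketch in R. It does not read ca535.crn; the ring widths are simulated, and the names ring.width, growth.index and sim.ts are made up for illustration. It also ignores the detrending that real chronologies apply to each tree before averaging.

```r
# Hypothetical raw ring widths (mm), one per year; simulated, NOT the ca535 data.
set.seed(1)
n.years <- 7980                      # 6000 BC to 1979 AD, counting a year 0
ring.width <- rgamma(n.years, shape = 4, rate = 10)

# Index relative to mean growth, scaled so that 1000 means average growth.
growth.index <- 1000 * ring.width / mean(ring.width)

# Annual time series whose first observation is year -6000 (6000 BC).
sim.ts <- ts(growth.index, start = -6000, frequency = 1)

end(sim.ts)[1]                              # last year covered by the series
window(sim.ts, start = 1970, end = 1979)    # the final decade only
```

Storing the index as a ts object means that subsets can be selected by calendar year with window(), rather than by position in the vector.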

Most of the volatility in the growth of Bristlecone Pines in the White Mountains is due to drought. The investigator Edmund Schulman wrote in 1958: “There is something a little fantastic in the persistent ability of a 4,000-year-old tree to shut up shop almost everywhere throughout its stem in a very dry year, and faithfully to reawaken to add many new cells in a favorable year.” With so much high-quality data (7980 points), many interesting questions can be asked about climatic variability at this location. For example, a statistical model can be built relating recent tree ring growth to observed precipitation and temperature. Such models are used to extract information about historical climate.
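As a rough illustration of such a model, the sketch below fits an ordinary linear regression of growth index on precipitation and temperature. Everything in it is an assumption: the data are simulated, the variable names (precip, temp, growth, calib) are invented, and a plain linear model is only the simplest possible choice, not necessarily what climate researchers use.

```r
set.seed(42)
n <- 80                                     # hypothetical instrumental period (years)
precip <- rnorm(n, mean = 300, sd = 60)     # simulated annual precipitation (mm)
temp   <- rnorm(n, mean = 5, sd = 1)        # simulated mean temperature (deg C)

# Simulated growth index that depends on both variables, plus noise.
growth <- 1000 + 3 * (precip - 300) + 40 * (temp - 5) + rnorm(n, sd = 50)

calib <- data.frame(growth, precip, temp)
fit <- lm(growth ~ precip + temp, data = calib)

# Estimated coefficients, standard errors and t-statistics.
summary(fit)$coefficients
```

With real data, the same call would be run on the years where both ring measurements and weather records exist.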

The following R commands produce a histogram of tree ring growth values (probability distribution) as well as a smooth curve fit:

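library(MASS);  # truehist() is provided by the MASS package
truehist(treeRing.ts,nbins=50,main="Distribution of Growth Index at Methuselah Grove",font.main=2,font.axis=2,font.lab=3,xlab="Growth Index",ylab="Density");
lines(density(treeRing.ts));  # one possible smooth curve: a kernel density estimate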


The distribution approximates a bell-shaped curve, but there is a pronounced “fat tail” on the low-growth side. For example, there is only one year with growth index > 1900, but there are 44 years with growth index < 100.
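Counts like these are easy to reproduce. The helper below, tailSummary, is a hypothetical function written for this post, and the short vector it is applied to is a made-up stand-in rather than the Methuselah Grove data. It counts the values beyond a low and a high threshold and computes the sample skewness, which is negative when the long tail is on the low side.

```r
# Count values beyond the two thresholds and compute the sample skewness.
tailSummary <- function(x, low = 100, high = 1900) {
  x <- as.numeric(x)
  z <- (x - mean(x)) / sd(x)               # standardised values
  c(n.low = sum(x < low), n.high = sum(x > high), skewness = mean(z^3))
}

# Made-up stand-in series; replace with treeRing.ts once it is loaded.
example <- c(20, 60, 850, 900, 950, 1000, 1050, 1100, 1150, 1200)
tailSummary(example)
```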

If you only ever read one book about statistics, make sure it is Nassim Taleb’s book The Black Swan. Fat-tailed distributions (such as drought impact on Bristlecone Pines) are the norm in complex systems. Many of Taleb’s examples are from finance and economics. However, geophysical time series, such as precipitation levels, often have this property too. When making inferences from data, it is important to decide in advance what kind of process you are likely to be dealing with. A process whose outcome is not clearly constrained, for instance by a physical conservation law, is likely to be subject to fat-tail events.
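To get a feel for how much a fat tail matters, the sketch below compares the probability of an observation more than 5 units from the centre under a standard normal distribution and under a Student t distribution with 3 degrees of freedom. The t distribution is just a convenient textbook example of a fat-tailed law, chosen here as an assumption; it is not a model of the tree ring data, and its units are not standard deviations.

```r
threshold <- 5

p.normal <- 2 * pnorm(-threshold)        # thin tails: standard normal
p.fat    <- 2 * pt(-threshold, df = 3)   # fat tails: Student t, 3 degrees of freedom

# Tail probabilities and how many times likelier the extreme is under fat tails.
c(normal = p.normal, fat = p.fat, ratio = p.fat / p.normal)
```

An event that a bell-curve model treats as practically impossible is merely uncommon under the fat-tailed alternative.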

The asymmetry in growth distribution in White Mountain Bristlecone Pines is due to the fact that higher-than-normal precipitation levels cannot boost growth beyond the limits set by temperature, sunlight and plant physiology. Drought years, however, virtually eliminate growth.
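The mechanism can be mimicked with a toy simulation. In the sketch below, moisture supply is drawn from a symmetric distribution, and growth follows moisture until it hits a cap. All the numbers, the cap value growth.cap and the simple "minimum of supply and cap" rule are assumptions made for illustration, not a physiological model.

```r
set.seed(7)
n <- 10000
moisture <- rnorm(n, mean = 1, sd = 0.35)  # symmetric, hypothetical moisture supply
moisture <- pmax(moisture, 0)              # supply cannot be negative

growth.cap <- 1.3                          # hypothetical limit from temperature, light, physiology
growth <- pmin(moisture, growth.cap)       # water limits growth until the cap is reached
index <- 1000 * growth / mean(growth)      # same indexing convention as the chronology

skew <- function(x) mean(((x - mean(x)) / sd(x))^3)
c(moisture = skew(moisture), growth = skew(index))
```

Even though the input is symmetric, the capped output has negative skewness: wet years are clipped while dry years are passed through in full.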

Traditional farmers in drought-prone parts of the world are acutely aware of the fat-tailed distribution. Farming practices which appear sub-optimal to some may in fact be adapted to the existence of fat tails. Nassim Taleb points out that risk managers in banks tend to behave as though Black Swans do not exist.

In that sense, bankers know less about real-world statistics than subsistence farmers.