Murphy-Epstein’s Law

Predicting the value of a continuous real variable from historical observations, covariates, etc. is a routine problem, and it has never been easier to build sophisticated statistical models from data. Sadly, however, it often turns out that the predictions of the fancy model are not much better than a simple mean of the historical observations.

The output of the best predictive models (selected by cross-validation, for example) shows less variance than the observations themselves. This phenomenon is called shrinkage.

Shrinkage can be understood from an identity known in weather forecasting as Murphy-Epstein decomposition[*].

    \[\text{forecast skill} = \rho^2 - \left( \rho - {\sigma_f \over \sigma_o} \right)^2  \]

Here \(\rho\) is the correlation between forecasts and observations, and \(\sigma_o\) and \(\sigma_f\) are the standard deviations of the observations and forecasts, respectively.
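As a quick numerical check, the decomposition can be verified on simulated data. Everything below (the signal-plus-noise setup, the 0.8 scaling) is invented for illustration; the skill score is taken as \(1 - \text{MSE}/\sigma_o^2\), which matches the decomposition exactly once the forecast bias term is removed by centring:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
# hypothetical setup: observations = signal + noise; the forecast sees only the signal
signal = rng.normal(size=n)
obs = signal + rng.normal(size=n)
forecast = 0.8 * signal            # an arbitrary imperfect forecast

# work with anomalies (means removed) so the bias term vanishes
f = forecast - forecast.mean()
o = obs - obs.mean()

rho = np.corrcoef(f, o)[0, 1]
sf, so = f.std(), o.std()

skill_direct = 1 - np.mean((f - o) ** 2) / o.var()   # 1 - MSE / var(obs)
skill_decomp = rho**2 - (rho - sf / so) ** 2         # Murphy-Epstein form

assert np.isclose(skill_direct, skill_decomp)
```

The two expressions agree to machine precision, since for centred series the identity is algebraic, not just asymptotic.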

To maximise skill, the second term must be made as small as possible; it vanishes exactly when \(\sigma_f = \rho\,\sigma_o\). In particular, \(\rho \ll 1\) requires \(\sigma_f \ll \sigma_o\).

Having lower variance than the observations may seem strange: it makes the predictive model look like a less realistic description of reality. Yet shrinkage is a feature of any imperfect (\(\rho < 1\)) but optimised predictive model.
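One familiar optimised predictor makes this concrete: an ordinary least-squares fit. For OLS, the ratio \(\sigma_f/\sigma_o\) equals the forecast-observation correlation in-sample, so the shrinkage is exactly the optimal amount. A minimal sketch with invented data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
x = rng.normal(size=n)
y = 2.0 * x + 3.0 * rng.normal(size=n)   # hypothetical noisy target

# least-squares fit of y on x: the "optimised" predictor
slope, intercept = np.polyfit(x, y, 1)
pred = slope * x + intercept

rho = np.corrcoef(pred, y)[0, 1]
ratio = pred.std() / y.std()             # sigma_f / sigma_o

# for a least-squares fit, sigma_f / sigma_o equals rho in-sample
assert np.isclose(ratio, rho)
```

Because the noise here is large relative to the signal, \(\rho\) is well below 1 and the predictions are visibly compressed relative to the observations, exactly as the decomposition dictates.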