date: Wed, 20 Nov 2002 09:05:08 +0000
from: Phil Jones
subject: Fwd: Re: PDSI low-frequency issues
to: k.briffa@uea.ac.uk
Date: Tue, 19 Nov 2002 11:16:56 -0500
From: "Thomas R Karl"
To: Ed Cook
CC: Jones, Mann,
Christopher D Miller,
Bradley,
Trenberth,
David Easterling,
Mark Eakin,
Sharon Leduc,
Connie Woodhouse,
David M Anderson
Subject: Re: PDSI low-frequency issues
Thanks Ed and Mike,
This is an important issue, and as Mike indicated it affects all of our tree ring (and
others) chronologies. I think we have a substantial amount of work to do to make sure
we use all this data appropriately. Ed, your description is very thorough and
admirable. The problem I want to avoid, when we link these data up with the
instrumental record, is making false statements about long-term trends and return
periods. I am requesting that our Paleo group work with you and other experts to make
sure we get this right. I think that is a heavy responsibility we bear if we are using
these in monitoring products.
Thanks again,
Tom
Ed Cook wrote:
Hi Tom,
Phil Jones sent me an email concerning some discussions you had with him, Mike Mann, and
others at a CCDD Panel Meeting about the low-frequency characteristics of the extended
PDSI reconstructions that I have generated. Mark Eakin showed you some of these I
believe and has put them on the NCDC/NGDC web site. I was in Bhutan at the time when I
received Phil's email (yes, Bhutan does have the odd cyber café in its bigger towns!),
so I was not able to reply until now. I have also cc'd this message to a few members of
the review panel, including Phil and Mike.
In Phil's email, he indicated that you were concerned about what you perceived to be an
unnatural lack of low-frequency (i.e. <1/100 years) variability in the extended PDSI
reconstructions. There are several issues that need to be considered in understanding
why the observed low-frequency variability in the PDSI reconstructions is expressed as
it is. I will describe them below in some detail.
A) In terms of low-frequency variability, what should we expect in the PDSI
reconstructions?
As we all know, monthly PDSI is largely determined by current monthly precipitation and
antecedent conditions, with current temperature acting as a lesser demand function
during the warm-season months through its transformation into units of
evapotranspiration. So while temperature (and its generally greater low-frequency
information) may have a measurable effect on estimates of PDSI, I do not think that it
will be all that large. I say this because of my experiments in reconstructing US
gridded summer (JJA) PDSI and Ned Guttman's 12-month running sum Standardized
Precipitation Index (SPI; Ned's suggestion for an SPI most similar to PDSI) from tree
rings. The calibration/verification results were extremely similar, as were the
reconstructions themselves, with PDSI doing slightly better on average. Given that SPI
contains no explicit temperature information, this indicates to me that the large
majority of the summer PDSI variability in the reconstructions is driven by variations
in precipitation alone.
This being the case, how much low-frequency variance (i.e. <1/100 years) should we
expect from precipitation records on local and regional spatial scales? In my
experience not much. Compared to temperature, precipitation is much more dominated by
high-frequency variability that often behaves in a short-lag persistence sense as a
white-noise process. This being the case, I suggest that one ought not expect to find
much centennial-timescale variability in PDSI reconstructions from tree rings. Indeed,
Phil has indicated to me that he also does not find much low-frequency variability in
PDSI series based directly on long European instrumental climate records. Consequently,
I think that the relative lack of centennial timescale variability in PDSI
reconstructions is, at least partly, a natural reflection of the way that local
precipitation varies as a (nearly) white noise process. However, as we both know, the
way in which the tree-ring chronologies have been processed can affect how much
low-frequency variance can be realized in any climate reconstruction. So ...
B) How much low-frequency variance might be missing in the PDSI reconstructions due to
the way in which the tree-ring chronologies were created and processed?
Conceptually, and even theoretically, we have a pretty good idea what is going on here.
There are basically two ways in which low-frequency variance is lost during the process
of tree-ring chronology development. The first relates to what I have coined the
"segment length curse" (Cook et al., 1995). In that paper, I (with four co-authors)
described how the theoretical resolvability limit of low-frequency variance in a given
time series of length n is O(1/n). Any variability at timescales >n cannot necessarily
be differentiated from trend. This is the basis for Granger (1966)'s "trend in mean"
concept.
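
The effect described above can be illustrated with a small synthetic experiment. This is only a sketch under stated assumptions, not the analysis from Cook et al. (1995): a pure sinusoid stands in for climate, the segments are laid end to end rather than overlapping, each segment is detrended with a straight line, and the function name `retained_variance_fraction` is hypothetical.

```python
import numpy as np

def retained_variance_fraction(period, segment_length, total_length=1200):
    """Fraction of a sinusoid's variance that survives when each
    non-overlapping segment is linearly detrended on its own."""
    t = np.arange(total_length)
    signal = np.sin(2 * np.pi * t / period)
    residuals = []
    for start in range(0, total_length, segment_length):
        seg_t = t[start:start + segment_length]
        seg = signal[start:start + segment_length]
        # Fit and remove a straight line from this segment only.
        slope, intercept = np.polyfit(seg_t, seg, 1)
        residuals.append(seg - (slope * seg_t + intercept))
    detrended = np.concatenate(residuals)
    return detrended.var() / signal.var()

# A 400-year cycle seen through 100-year segments looks like trend and is removed;
# a 20-year cycle is barely touched by the same detrending.
print(retained_variance_fraction(period=400, segment_length=100))
print(retained_variance_fraction(period=20, segment_length=100))
```

In this toy setting the long-period signal is almost entirely absorbed by the fitted lines, while the short-period signal passes through nearly intact.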
Now in classical tree-ring chronology development, the chronology is a mean-value
function of length N, composed of m overlapping, (typically) shorter, length-n series
extracted from a stand of living trees. Note that n is usually quite variable from tree
to tree in the m-series ensemble, depending on the age structure of the sampled trees,
with the worst-case scenario (from a low-frequency preservation perspective) being that
n<100 years if each series is independently detrended first. There are ways that this
limit might be circumvented
(e.g., via RCS or age-banding methods), but I won't get into these issues here because
all of the tree-ring chronologies used in reconstructing continental-scale gridded PDSI
are based on classical tree-ring chronology development methods. Therefore, a
reasonable diagnostic for determining the lowest frequency that might be preserved in a
length-N tree-ring chronology might be something like avg(1/n). I actually prefer
med(1/n) because of the greater robustness of the median compared to the mean.
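
As a rough sketch of how the two diagnostics could be tabulated for one chronology: the segment lengths below are made-up values for illustration only, not data from any real site.

```python
import numpy as np

# Hypothetical segment lengths n (years) for the m series in one chronology.
segment_lengths = np.array([180, 195, 210, 220, 230, 260, 310])

inverse_lengths = 1.0 / segment_lengths
mean_freq = inverse_lengths.mean()        # avg(1/n)
median_freq = np.median(inverse_lengths)  # med(1/n), less sensitive to a few unusual series

print(f"avg(1/n) ~ 1/{1 / mean_freq:.0f} years")
print(f"med(1/n) ~ 1/{1 / median_freq:.0f} years")
```

The median depends only on the middle of the segment-length distribution, so a handful of very short or very long series shifts it less than it shifts the mean.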
So, how does this translate to the tree-ring network used to reconstruct PDSI over North
America? I can't give you the exact med(1/n) information because it has not been
formally tabulated for all chronologies. However, I can provide reasonably accurate
estimates based on what I know about many of the chronologies. First, consider the
eastern US where I developed a tree-ring chronology network in the early 1980s. Over 20
years ago, I recognized the existence of the "segment length curse" as part of my
dissertation research. Consequently, from many chronologies I purposely deleted
individual tree-ring series that began after 1800. This means that the minimum segment
length was ~180 years for the large majority of the ~60 chronologies that I developed
and have used in the PDSI reconstructions for the eastern US. The med(1/n) is probably
more like 1/220 for most chronologies. Many of the tree-ring chronologies developed by
Dave Stahle in other parts of eastern North America have comparable median segment
lengths. In western North America, the situation will be generally better because the
ages of the sampled trees are often older than those sampled in eastern North America.
Consequently, I suggest that med(1/n) is <1/300 in many cases. This estimate suggests
that, in principle, we ought to be able to reconstruct low-frequency PDSI variability
<1/200 years from the North American tree-ring network.
Of course, there are good reasons why <1/200 is overly optimistic because it does not
take into account the method(s) of detrending used to "standardize" the m individual
tree-ring series prior to averaging them together into the final chronology used to
reconstruct PDSI. As is widely described in the dendrochronology literature, there are
many different ways in which the tree-ring series may be detrended. The simplest fitted
growth curves used for detrending are monotonic, either linear or negative exponential
in form. These growth curves are commonly used to standardize western North American
tree-ring series from open-canopy forests with minimal stand dynamics effects. Such
detrending will have relatively little impact on the med(1/n) estimate for the
preservation of low-frequency variance (e.g., 1/200 years). However, in closed-canopy
forests typical of eastern North America and more mesic forests in western North
America, stand dynamics effects can perturb the trajectory of radial growth (i.e., the
ring-width series) away from that which can be reasonably fitted by monotonic growth
curves. Consequently, more flexible and locally adaptive growth curves are often used
to detrend such series. The most commonly used "flexible and locally adaptive" method
is probably the cubic smoothing spline. This particular cubic spline is especially
attractive because its exact theoretical properties as a digital filter have been
derived. Therefore, one knows the 50% frequency response cutoff in years for any
given cubic smoothing spline.
So, in cases where the smoothing spline is used for detrending tree-ring series, how
does its use affect the realizable minimum low-frequency variance preserved in the
chronology? For some tree-ring chronologies in the network that are based on spline
detrending, this information is not formally known. However, in the case of my
chronologies from eastern North America (and many of Dave Stahle's chronologies), the
50% frequency response cutoff was set (in most cases) to 2/3 the length of the series
being detrended. This translates to an adjusted med(1/n) of ~1/150 years, assuming an
initial median segment length of ~220 years. If we wish to be even more conservative
(pessimistic?) in our estimate by taking into account the transition bandwidth of the
spline frequency response function, the realizable minimum low-frequency variance that
is usefully preserved in the chronology could be more like ~1/120 years on average. So
even in regions where spline detrending is used (mostly eastern North America), it is
likely that century-scale PDSI variability can be reconstructed, to the degree that it
exists in local precipitation variability over time. In western North America, the
potential recoverable low-frequency PDSI variability ought to exceed 1/200 years, again
to the degree that it exists in local precipitation variability over time.
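
The arithmetic behind the ~1/150 figure can be written out as follows. This is a minimal sketch: the helper name `spline_adjusted_period` is hypothetical, and it covers only the 2/3 rule, not the further allowance for the transition bandwidth of the spline, which is not quantified here.

```python
def spline_adjusted_period(median_segment_length, cutoff_fraction=2 / 3):
    """Period (years) of the spline's 50% frequency-response cutoff when the
    cutoff is set to a fixed fraction of the series length."""
    return cutoff_fraction * median_segment_length

# Median segment length of ~220 years, cutoff at 2/3 of the series length.
period = spline_adjusted_period(220)
print(f"adjusted med(1/n) ~ 1/{period:.0f} years")
```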
C) Are there other data processing issues that might affect the preservation of
low-frequency variance in the PDSI reconstructions?
Phil indicated to me that Mike is concerned about the effects of prewhitening procedures
used by me in my "Point-by-Point Regression" (PPR) procedure used to calibrate
tree-rings into estimates of PDSI. This is a legitimate concern. However, I do not
regard it to be nearly as important as the effects of "segment length" and "detrending"
on the preservation of low-frequency variance in the PDSI reconstructions.
First, let me explain the rationale for applying Box-Jenkins style prewhitening to the
PDSI calibration problem. It has long been recognized in dendroclimatology that annual
tree-ring chronologies frequently have a persistence structure (order and magnitude)
that exceeds that associated with the climate variable thought to be well related to
ring width (either causally or statistically). There are a number of physiological
reasons for expecting this to be so, and many such processes can be thought of as
operating in a causal feedback sense, i.e. the tree has a physiological memory that
preconditions the potential for new radial growth driven by the arrival of new climate
influences in any given year.
Now, it turns out that causal feedback filters can be described mathematically as
autoregressive (AR) processes, hence the usefulness of Box-Jenkins (B-J) modeling in
dendroclimatology. However, the application of B-J modeling to the
calibration/reconstruction problem is not necessarily straightforward because the
climate variable being reconstructed may have its own persistence structure that needs
to be preserved in the tree-ring reconstruction. Therefore, two persistence models must
be considered: one for the climate variable to be reconstructed and one for each
tree-ring chronology used for reconstruction. Knowledge of both models can be used to
"correct" the tree-ring persistence to better reflect that due to climate alone. That
this is necessary can be appreciated by realizing that the typical AR model for the
instrumental summer PDSI series is AR(1-2), with a range of coefficients that explain
from near-zero to 20% of the time series variance, depending on the geographic
location. In contrast, the tree-ring AR models are typically AR(1-3) and the
coefficients can cumulatively account for 2-5 times as much variance, depending on the
location and tree species. This illustrates the need to adjust the persistence in
tree-ring series as part of the climate reconstruction procedure.
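
To make "variance explained by the AR coefficients" concrete, here is a minimal sketch that fits an AR(p) model by the Yule-Walker equations and reports one minus the ratio of innovation variance to series variance. The series is simulated, the function name `ar_variance_explained` is hypothetical, and Yule-Walker is used here only for brevity.

```python
import numpy as np

def ar_variance_explained(x, order):
    """Fit AR(order) by Yule-Walker; return (coefficients, fraction of variance explained)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    # Sample autocovariances at lags 0..order.
    acov = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    toeplitz = np.array([[acov[abs(i - j)] for j in range(order)] for i in range(order)])
    phi = np.linalg.solve(toeplitz, acov[1:])
    innovation_var = acov[0] - phi @ acov[1:]
    return phi, 1.0 - innovation_var / acov[0]

# Simulated AR(1) series with coefficient 0.45 (an arbitrary illustrative value).
rng = np.random.default_rng(0)
shocks = rng.standard_normal(20000)
series = np.zeros_like(shocks)
for t in range(1, len(series)):
    series[t] = 0.45 * series[t - 1] + shocks[t]

phi, explained = ar_variance_explained(series, order=1)
print(phi, explained)
```

For an AR(1) process the explained fraction is the square of the coefficient, so a coefficient near 0.45 corresponds to roughly 20% of the variance.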
There are a variety of ways that this may be approached. Dave Meko investigated two
methods in his PhD dissertation (Meko, 1981) for developing precipitation
reconstructions from arid-site conifers in western North America. The first used the
classic B-J transfer function model and the second used a method devised by Dave, which
he called the "random shock" model. Both methods worked well, with each having certain
advantages over the other. In particular, the "random shock" model was found to produce
an overall flat cross-spectral gain between actual and estimated precipitation. This
means that the statistical model used to develop the reconstruction was generally
unbiased as a function of frequency. In contrast, the B-J transfer function model
approach produced a somewhat "redder" cross-spectral gain, which means that the model
tended to emphasize low-frequency variability. This difference does not mean that
either method is necessarily superior to the other. However, the "random shock" model
is easier to implement in an automatic way by using the minimum Akaike Information
Criterion (AIC) to estimate the order of each AR model. This is the reason why I have
chosen the prewhitening/postreddening procedures of Meko's "random shock" model.
The implementation of the "random shock" model in my PPR program is based first on
prewhitening the climate series and tree-ring chronologies independently using the
minimum AIC to estimate the order of the model and the maximum entropy method to
estimate the AR coefficients themselves. The correlations between PDSI and tree rings
for years t and t+1 (tree rings lag PDSI) are estimated, and only those tree-ring
variables that are significantly correlated (p<0.10) are retained and used in principal
components regression. Note that the determination of "significance" is reasonably
straightforward here because the series being compared are serially random in a
short-lag sense. The resulting reconstruction, based on prewhitened tree rings and
PDSI, is then "reddened" by adding the AR model persistence of the PDSI data into the
tree-ring reconstruction. This is all very straightforward to do.
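
The prewhiten, regress, and redden sequence described above can be sketched as follows. This is an illustration only and not the PPR program: it assumes Yule-Walker estimation in place of the maximum entropy method, a single tree-ring predictor with a simple regression in place of principal components regression, lag 0 only, and simulated data. All function names are hypothetical.

```python
import numpy as np

def fit_ar(x, order):
    """Yule-Walker AR fit; returns (coefficients, innovation variance)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    acov = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    if order == 0:
        return np.array([]), acov[0]
    toeplitz = np.array([[acov[abs(i - j)] for j in range(order)] for i in range(order)])
    phi = np.linalg.solve(toeplitz, acov[1:])
    return phi, acov[0] - phi @ acov[1:]

def min_aic_ar(x, max_order=3):
    """Coefficients of the AR model (order 0..max_order) with the smallest AIC."""
    n = len(x)
    fits = [fit_ar(x, p) for p in range(max_order + 1)]
    aics = [n * np.log(s2) + 2 * p for p, (_, s2) in enumerate(fits)]
    return fits[int(np.argmin(aics))][0]

def prewhiten(x, phi):
    """AR residuals e_t = x_t - sum_k phi_k x_{t-k}; the first len(phi) years are dropped."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    p = len(phi)
    resid = x[p:].copy()
    for k in range(1, p + 1):
        resid -= phi[k - 1] * x[p - k:len(x) - k]
    return resid

def redden(e, phi):
    """Add AR persistence back: y_t = e_t + sum_k phi_k y_{t-k}."""
    y = np.zeros(len(e))
    for t in range(len(e)):
        y[t] = e[t]
        for k in range(1, len(phi) + 1):
            if t - k >= 0:
                y[t] += phi[k - 1] * y[t - k]
    return y

def lag1(x):
    """Lag-1 autocorrelation."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

# Simulated data: shared climate shocks, with more persistence in the tree rings.
rng = np.random.default_rng(42)
n = 2000
shocks, noise = rng.standard_normal(n), rng.standard_normal(n)
climate, tree = np.zeros(n), np.zeros(n)
for t in range(1, n):
    climate[t] = 0.3 * climate[t - 1] + shocks[t]
    tree[t] = 0.6 * tree[t - 1] + shocks[t] + 0.5 * noise[t]

# 1) Prewhiten each series with its own minimum-AIC AR model.
phi_climate, phi_tree = min_aic_ar(climate), min_aic_ar(tree)
white_climate, white_tree = prewhiten(climate, phi_climate), prewhiten(tree, phi_tree)
m = min(len(white_climate), len(white_tree))  # align on the most recent years
white_climate, white_tree = white_climate[-m:], white_tree[-m:]

# 2) Regress white climate on white tree rings (through the origin; both are near zero mean).
slope = np.dot(white_tree, white_climate) / np.dot(white_tree, white_tree)

# 3) Redden the estimate with the climate AR model, not the tree-ring one.
reconstruction = redden(slope * white_tree, phi_climate)

print(lag1(tree), lag1(climate), lag1(reconstruction))
```

In this toy case the reconstruction ends up with persistence close to that of the simulated climate series rather than that of the tree-ring series, which is the purpose of the persistence adjustment.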
So, now back to Mike's concerns. Does this procedure result in a significant loss of
low-frequency variance in the PDSI reconstructions? I do not think so. Dave Meko's
results suggest that the "random shock" model produces reasonably unbiased
reconstructions w.r.t. cross-spectral gain. I have done similar tests of the method and
agree with Dave's finding.
Therefore, the prewhitening used in my PPR program should not be regarded as a flaw in
the calibration/reconstruction procedure. Indeed, if PDSI reconstructions are generated
without the use of prewhitening, the verification statistics are much worse on average.
This is because the presence of autocorrelation in both the tree-ring and climate series
makes the identification of "true" lead-lag relationships between series very
difficult. So, prewhitening is clearly doing some good here and is clearly better than
not doing any persistence modeling at all. This being said, are there better, more
elegant ways of adjusting for the differences in persistence structure between
tree-rings and PDSI? Perhaps, but it is not clear what they might be.
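
One way to see why autocorrelation complicates significance testing is the commonly used effective-sample-size approximation for the correlation between two AR(1) series. It is shown here purely as an illustration: it is not part of the PPR procedure, the function name is hypothetical, and the lag-1 values are made up.

```python
def effective_sample_size(n, r1_x, r1_y):
    """Approximate number of independent samples when correlating two
    AR(1)-like series with lag-1 autocorrelations r1_x and r1_y."""
    return n * (1 - r1_x * r1_y) / (1 + r1_x * r1_y)

# 100 years of data; illustrative lag-1 autocorrelations for tree rings and PDSI.
print(effective_sample_size(100, 0.6, 0.3))
# After prewhitening both series are nearly serially random, so no adjustment is needed.
print(effective_sample_size(100, 0.0, 0.0))
```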
I hope that what I have written here helps clarify the issues concerning the nature of
low-frequency variance in my extended PDSI records. Are they missing some amount of
centennial-timescale variability? Almost certainly based on what I have told you. Is
the amount of missing variability large? Based on the nature of precipitation
variability and the experience of Phil Jones in calculating PDSI from long instrumental
European records, I do not think so. Is the prewhitening method used in my PPR program
responsible for a loss of low-frequency variance? Tests performed by Dave Meko and myself
on the method that I use do not indicate that this is a problem.
If you have any questions about what I have written, please do not hesitate to ask.
Cheers,
Ed
References
Cook, E.R., Briffa, K.R., Meko, D.M., Graybill, D.A. and Funkhouser, G. 1995. The
segment length curse in long tree-ring chronology development for paleoclimatic studies.
The Holocene 5(2):229-237.
Granger, C.W. 1966. On the typical shape of an econometric variable. Econometrica
34:150-161.
Meko, D.M. 1981. Applications of Box-Jenkins methods of time series analysis to the
reconstruction of drought from tree rings. Ph.D. dissertation, University of Arizona,
Tucson.
--
==================================
Dr. Edward R. Cook
Doherty Senior Scholar and
Director, Tree-Ring Laboratory
Lamont-Doherty Earth Observatory
Palisades, New York 10964 USA
Email: drdendro@ldeo.columbia.edu
Phone: 845-365-8618
Fax: 845-365-8152
==================================
Prof. Phil Jones
Climatic Research Unit Telephone +44 (0) 1603 592090
School of Environmental Sciences Fax +44 (0) 1603 507784
University of East Anglia
Norwich Email p.jones@uea.ac.uk
NR4 7TJ
UK