date: Wed, 20 Nov 2002 09:05:08 +0000
from: Phil Jones <p.jones@uea.ac.uk>
subject: Fwd: Re: PDSI low-frequency issues
to: k.briffa@uea.ac.uk

     Date: Tue, 19 Nov 2002 11:16:56 -0500
     From: "Thomas R Karl" <Thomas.R.Karl@noaa.gov>
     To: Ed Cook <drdendro@ldeo.columbia.edu>
     CC: Jones <p.jones@uea.ac.uk>, Mann <mann@virginia.edu>,
             Christopher D Miller <Christopher.D.Miller@noaa.gov>,
             Bradley <rbradley@geo.umass.edu>,
             Trenberth <trenbert@ncar.ucar.edu>,
             David Easterling <David.Easterling@noaa.gov>,
               Mark Eakin <Mark.Eakin@noaa.gov>,
             Sharon Leduc <Sharon.Leduc@noaa.gov>,
               Connie Woodhouse <Connie.Woodhouse@noaa.gov>,
             David M Anderson <David.M.Anderson@noaa.gov>
     Subject: Re: PDSI low-frequency issues
     Thanks Ed and Mike,
     This is an important issue, and as Mike indicated it affects all of our tree ring (and
     other) chronologies.   I think we have a substantial amount of work to do to make sure
     we use all this data appropriately.  Ed, your description is very thorough and
     admirable.  The problem I want to avoid, when we link these data up with the
     instrumental record, is making false statements about long term trends and return
     periods.  I am requesting that our Paleo group work with you and other experts to make
     sure we get this right.  I think that is a heavy responsibility we bear if we are using
     these in monitoring products.
     Thanks again,
     Tom
     Ed Cook wrote:

     Hi Tom,
     Phil Jones sent me an email concerning some discussions you had with him, Mike Mann, and
     others at a CCDD Panel Meeting about the low-frequency characteristics of the extended
     PDSI reconstructions that I have generated.  Mark Eakin showed you some of these I
     believe and has put them on the NCDC/NGDC web site. I was in Bhutan at the time when I
     received Phil's email (yes, Bhutan does have the odd cyber café in its bigger towns!),
     so I was not able to reply until now.  I have also cc'd this message to a few members of
     the review panel, including Phil and Mike.
     In Phil's email, he indicated that you were concerned about what you perceived to be an
     unnatural lack of low-frequency (i.e. <1/100 years) variability in the extended PDSI
     reconstructions.  There are several issues that need to be considered in understanding
     why the observed low-frequency variability in the PDSI reconstructions is expressed as
     it is.  I will describe them below in some detail.
     A) In terms of low-frequency variability, what should we expect in the PDSI
     reconstructions?
     As we all know, monthly PDSI is largely determined by current monthly precipitation and
     antecedent conditions, with current temperature acting as a lesser demand function
     during the warm-season months through its transformation into units of
     evapotranspiration.  So while temperature (and its generally greater low-frequency
     information) may have a measurable effect on estimates of PDSI, I do not think that it
     will be all that large.  I say this because of my experiments in reconstructing US
     gridded summer (JJA) PDSI and Ned Guttman's 12-month running sum Standardized
     Precipitation Index (SPI; Ned's suggestion for an SPI most similar to PDSI) from tree
     rings.  The calibration/verification results were extremely similar, as were the
     reconstructions themselves, with PDSI doing slightly better on average.  Given that SPI
     contains no explicit temperature information, this indicates to me that the large
     majority of the summer PDSI variability in the reconstructions is driven by variations
     in precipitation alone.
     This being the case, how much low-frequency variance (i.e. <1/100 years) should we
     expect from precipitation records on local and regional spatial scales?  In my
     experience not much.  Compared to temperature, precipitation is much more dominated by
     high-frequency variability that often behaves in a short-lag persistence sense as a
     white-noise process.  This being the case, I suggest that one ought not expect to find
     much centennial-timescale variability in PDSI reconstructions from tree rings.  Indeed,
     Phil has indicated to me that he also does not find much low-frequency variability in
     PDSI series based directly on long European instrumental climate records.  Consequently,
     I think that the relative lack of centennial timescale variability in PDSI
     reconstructions is, at least partly, a natural reflection of the way that local
     precipitation varies as a (nearly) white noise process.  However, as we both know, the
     way in which the tree-ring chronologies have been processed can affect how much
     low-frequency variance can be realized in any climate reconstruction.  So ...
     B) How much low-frequency variance might be missing in the PDSI reconstructions due to
     the way in which the tree-ring chronologies were created and processed?
     Conceptually, and even theoretically, we have a pretty good idea what is going on here.
     There are basically two ways in which low-frequency variance is lost during the process
     of tree-ring chronology development.  The first relates to what I have coined the
     "segment length curse" (Cook et al., 1995).  In that paper, I (with four co-authors)
     described how the theoretical resolvability limit of low-frequency variance in a given
     time series of length n is O(1/n).  Any variability at timescales >n cannot necessarily
     be differentiated from trend.  This is the basis for Granger (1966)'s "trend in mean"
     concept.
     Now in classical tree-ring chronology development, the chronology is a mean-value
     function of length N, composed of m overlapping, (typically) shorter, length-n series
     extracted from a stand of living trees.  Note that n is usually quite variable from tree
     to tree in the m-series ensemble, depending on the age structure of the sampled trees,
     with the worst-case scenario (from a low-frequency preservation perspective) being that
     n<<N for all m.  The "segment length curse" refers to the fact that the maximum recoverable
     low-frequency variance in a length N chronology is O(avg(1/n)) for the m series if each
     series is de-trended (or even only de-meaned) independently of all other series.  What
     this means is that if we have a tree-ring chronology N=1000 years long made up of m
     series each n=100 years long (the worst-case scenario above), we cannot preserve
     variance in the length-N chronology at timescales >100 years if each series is
     independently detrended first.  There are ways that this limit might be circumvented
     (e.g., via RCS or age-banding methods), but I won't get into these issues here because
     all of the tree-ring chronologies used in reconstructing continental-scale gridded PDSI
     are based on classical tree-ring chronology development methods.  Therefore, a
     reasonable diagnostic for determining the lowest frequency that might be preserved in a
     length-N tree-ring chronology might be something like avg(1/n).  I actually prefer
     med(1/n) because of the greater robustness of the median compared to the mean.
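
     The effect described above can be illustrated with a small synthetic experiment.
     This is only a sketch under stated assumptions: the 500-year and 20-year sinusoids,
     the noise level, the segment spacing, and all function names are hypothetical and
     are not taken from the chronologies discussed in this letter.  Segments are only
     de-meaned (the mildest case mentioned above), then averaged into a mean-value
     chronology.

```python
import numpy as np

def mean_chronology(signal, seg_len, step, noise_sd, rng):
    """Average overlapping segments, each de-meaned independently of the others."""
    n_years = len(signal)
    total = np.zeros(n_years)
    count = np.zeros(n_years)
    for start in range(0, n_years - seg_len + 1, step):
        stop = start + seg_len
        series = signal[start:stop] + rng.normal(0.0, noise_sd, seg_len)
        total[start:stop] += series - series.mean()  # independent de-meaning
        count[start:stop] += 1
    return total / count

def retained_amplitude(chronology, component):
    """Regression slope of the chronology on a known signal component."""
    c = component - component.mean()
    return float(np.dot(chronology - chronology.mean(), c) / np.dot(c, c))

def median_inverse_length(segment_lengths):
    """The med(1/n) diagnostic for a set of segment lengths (years)."""
    return float(np.median(1.0 / np.asarray(segment_lengths, dtype=float)))

rng = np.random.default_rng(0)
years = np.arange(1000)                      # N = 1000
low = np.sin(2 * np.pi * years / 500.0)      # hypothetical 500-year signal
high = np.sin(2 * np.pi * years / 20.0)      # hypothetical 20-year signal
signal = low + high

# Worst case: every segment is n = 100 years long (n << N).
short = mean_chronology(signal, seg_len=100, step=25, noise_sd=0.3, rng=rng)
# Best case: a single segment spanning the whole chronology (n = N).
full = mean_chronology(signal, seg_len=1000, step=25, noise_sd=0.3, rng=rng)

low_short = retained_amplitude(short, low)
low_full = retained_amplitude(full, low)
high_short = retained_amplitude(short, high)
print("500-yr amplitude retained, n=100 :", round(low_short, 2))
print("500-yr amplitude retained, n=1000:", round(low_full, 2))
print("20-yr amplitude retained,  n=100 :", round(high_short, 2))

# Hypothetical segment lengths; the diagnostic period is 1 / med(1/n).
diag_period = 1.0 / median_inverse_length([180, 220, 260])
print("1 / med(1/n) =", round(diag_period), "years")
```

     With short segments, most of the 500-year amplitude should be lost while the 20-year
     signal survives; with a single full-length segment the 500-year signal is kept.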
     So, how does this translate to the tree-ring network used to reconstruct PDSI over North
     America?  I can't give you the exact med(1/n) information because it has not been
     formally tabulated for all chronologies.  However, I can provide reasonably accurate
     estimates based on what I know about many of the chronologies.  First, consider the
     eastern US where I developed a tree-ring chronology network in the early 1980s.  Over 20
     years ago, I recognized the existence of the "segment length curse" as part of my
     dissertation research.  Consequently, from many chronologies I purposely deleted
     individual tree-ring series that began after 1800.  This means that the minimum segment
     length was ~180 years for the large majority of the ~60 chronologies that I developed
     and have used in the PDSI reconstructions for the eastern US.  The med(1/n) is probably
     more like 1/220 for most chronologies.  Many of the tree-ring chronologies developed by
     Dave Stahle in other parts of eastern North America have comparable median segment
     lengths.  In western North America, the situation will be generally better because the
     ages of the sampled trees are often older than those sampled in eastern North America.
     Consequently, I suggest that med(1/n) is <1/300 in many cases.  This estimate suggests
     that, in principle, we ought to be able to reconstruct low-frequency PDSI variability
     <1/200 years from the North American tree-ring network.
     Of course, there are good reasons why <1/200 is overly optimistic because it does not
     take into account the method(s) of detrending used to "standardize" the m individual
     tree-ring series prior to averaging them together into the final chronology used to
     reconstruct PDSI.  As is widely described in the dendrochronology literature, there are
     many different ways in which the tree-ring series may be detrended.  The simplest fitted
     growth curves used for detrending are monotonic, either linear or negative exponential
     in form.  These growth curves are commonly used to standardize western North American
     tree-ring series from open-canopy forests with minimal stand dynamics effects.  Such
     detrending will have relatively little impact on the med(1/n) estimate for the
     preservation of low-frequency variance (e.g., 1/200 years).  However, in closed-canopy
     forests typical of eastern North America and more mesic forests in western North
     America, stand dynamics effects can perturb the trajectory of radial growth (i.e., the
     ring-width series) away from that which can be reasonably fitted by monotonic growth
     curves.  Consequently, more flexible and locally adaptive growth curves are often used
     to detrend such series.  The most commonly used "flexible and locally adaptive" method
     is probably the cubic smoothing spline.  This particular cubic spline is especially
     attractive because its exact theoretical properties as a digital filter have been
     derived.  Therefore, one knows the 50% frequency response cutoff in years for any
     given cubic smoothing spline.
     So, in cases where the smoothing spline is used for detrending tree-ring series, how
     does its use affect the realizable minimum low-frequency variance preserved in the
     chronology?  For some tree-ring chronologies in the network that are based on spline
     detrending, this information is not formally known.  However, in the case of my
     chronologies from eastern North America (and many of Dave Stahle's chronologies), the
     50% frequency response cutoff was set (in most cases) to 2/3 the length of the series
     being detrended.  This translates to an adjusted med(1/n) of ~1/150 years, assuming an
     initial median segment length of ~220 years.  If we wish to be even more conservative
     (pessimistic?) in our estimate by taking into account the transition bandwidth of the
     spline frequency response function, the realizable minimum low-frequency variance that
     is usefully preserved in the chronology could be more like ~1/120 years on average.  So
     even in regions where spline detrending is used (mostly eastern North America), it is
     likely that century-scale PDSI variability can be reconstructed, to the degree that it
     exists in local precipitation variability over time.  In western North America, the
     potential recoverable low-frequency PDSI variability ought to exceed 1/200 years, again
     to the degree that it exists in local precipitation variability over time.
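
     The arithmetic behind the ~1/150 figure can be checked in a couple of lines.  This is
     a sketch only: the function name is hypothetical, and it covers just the 2/3 rule
     quoted above, not the further allowance for the spline's transition bandwidth that
     gives the more conservative ~1/120.

```python
def spline_cutoff_years(segment_length_years, fraction=2.0 / 3.0):
    """50% frequency-response cutoff (years) for a spline set to a fixed
    fraction of the length of the series being detrended."""
    return fraction * segment_length_years

median_segment = 220.0                      # median segment length from the text
cutoff = spline_cutoff_years(median_segment)
print(f"cutoff ~ {cutoff:.0f} years, i.e. adjusted med(1/n) ~ 1/{cutoff:.0f}")
```

     The result, about 147 years, is what the text rounds to ~1/150.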
     C) Are there other data processing issues that might affect the preservation of
     low-frequency variance in the PDSI reconstructions?
     Phil indicated to me that Mike is concerned about the effects of prewhitening procedures
     used by me in my "Point-by-Point Regression" (PPR) procedure used to calibrate
     tree-rings into estimates of PDSI.  This is a legitimate concern.  However, I do not
     regard it to be nearly as important as the effects of "segment length" and "detrending"
     on the preservation of low-frequency variance in the PDSI reconstructions.
     First, let me explain the rationale for applying Box-Jenkins style prewhitening to the
     PDSI calibration problem.  It has long been recognized in dendroclimatology that annual
     tree-ring chronologies frequently have a persistence structure (order and magnitude)
     that exceeds that associated with the climate variable thought to be well related to
     ring width (either causally or statistically).  There are a number of physiological
     reasons for expecting this to be so, and many such processes can be thought of as
     operating in a causal feedback sense, i.e. the tree has a physiological memory that
     preconditions the potential for new radial growth driven by the arrival of new climate
     influences in any given year.
     Now, it turns out that causal feedback filters can be described mathematically as
     autoregressive (AR) processes, hence the usefulness of Box-Jenkins (B-J) modeling in
     dendroclimatology.  However, the application of B-J modeling to the
     calibration/reconstruction problem is not necessarily straightforward because the
     climate variable being reconstructed may have its own persistence structure that needs
     to be preserved in the tree-ring reconstruction.  Therefore, two persistence models must
     be considered: one for the climate variable to be reconstructed and one for each
     tree-ring chronology used for reconstruction.  Knowledge of both models can be used to
     "correct" the tree-ring persistence to better reflect that due to climate alone.  That
     this is necessary can be appreciated by realizing that the typical AR model for the
     instrumental summer PDSI series is AR(1-2), with a range of coefficients that explain
     from near-zero to 20% of the time series variance, depending on the geographic
     location.  In contrast, the tree-ring AR models are typically AR(1-3) and the
     coefficients can cumulatively account for 2-5 times as much variance, depending on the
     location and tree species.  This illustrates the need to adjust the persistence in
     tree-ring series as part of the climate reconstruction procedure.
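
     To make "variance explained by the AR coefficients" concrete, the sketch below fits
     an AR model by the Yule-Walker equations and reports the fraction of variance
     attributable to persistence.  The two simulated series and their AR(1) coefficients
     (0.45 and 0.70) are hypothetical stand-ins chosen only to resemble the contrast
     described above; they are not estimates from real PDSI or tree-ring data.

```python
import numpy as np

def ar_variance_explained(x, order):
    """Fraction of variance explained by an AR(order) model (Yule-Walker fit)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    # Biased autocovariance estimates at lags 0..order.
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    coefs = np.linalg.solve(R, r[1:])
    return float(coefs @ r[1:] / r[0])

def simulate_ar1(phi, n, rng, burn_in=200):
    """Simulate x[t] = phi * x[t-1] + e[t] with unit-variance white noise e."""
    e = rng.normal(size=n + burn_in)
    x = np.zeros(n + burn_in)
    for t in range(1, n + burn_in):
        x[t] = phi * x[t - 1] + e[t]
    return x[burn_in:]

rng = np.random.default_rng(42)
pdsi_like = simulate_ar1(0.45, 5000, rng)   # hypothetical climate-like persistence
tree_like = simulate_ar1(0.70, 5000, rng)   # hypothetical tree-ring-like persistence

r2_pdsi = ar_variance_explained(pdsi_like, order=2)
r2_tree = ar_variance_explained(tree_like, order=2)
print(f"PDSI-like:      {100 * r2_pdsi:.0f}% of variance from persistence")
print(f"tree-ring-like: {100 * r2_tree:.0f}% of variance from persistence")
```

     For an AR(1) process the explained fraction is simply the squared coefficient, so
     these settings correspond to roughly 20% versus roughly 50%.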
     There are a variety of ways that this may be approached.  Dave Meko investigated two
     methods in his PhD dissertation (Meko, 1981) for developing precipitation
     reconstructions from arid-site conifers in western North America.  The first used the
     classic B-J transfer function model and the second used a method devised by Dave, which
     he called the "random shock" model.  Both methods worked well, with each having certain
     advantages over the other.  In particular, the "random shock" model was found to produce
     an overall flat cross-spectral gain between actual and estimated precipitation.  This
     means that the statistical model used to develop the reconstruction was generally
     unbiased as a function of frequency.  In contrast, the B-J transfer function model
     approach produced a somewhat "redder" cross-spectral gain, which means that the model
     tended to emphasize low-frequency variability.  This difference does not mean that
     either method is necessarily superior to the other.  However, the "random shock" model
     is easier to implement in an automatic way by using the minimum Akaike Information
     Criterion (AIC) to estimate the order of each AR model.  This is the reason why I have
     chosen the prewhitening/postreddening procedures of Meko's "random shock" model.
     The implementation of the "random shock" model in my PPR program is based first on
     prewhitening the climate series and tree-ring chronologies independently using the
     minimum AIC to estimate the order of the model and the maximum entropy method to
     estimate the AR coefficients themselves.  The correlations between PDSI and tree rings
     for years t and t+1 (tree rings lag PDSI) are estimated and only those tree-ring
     variables that are significantly correlated (p<0.10) are retained and used in principal
     components regression.  Note that the determination of "significance" is reasonably
     straightforward here because the series being compared are serially random in a
     short-lag sense.  The resulting reconstruction, based on prewhitened tree rings and
     PDSI, is then "reddened" by adding the AR model persistence of the PDSI data into the
     tree-ring reconstruction.  This is all very straightforward to do.
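
     The prewhiten / calibrate / redden sequence described above can be sketched as
     follows.  This is not the PPR program.  It makes several simplifying assumptions
     that should be read as such: Yule-Walker estimation stands in for the maximum
     entropy method, a single synthetic chronology with ordinary least squares replaces
     the p<0.10 screening and principal components regression, and all names and
     simulation parameters are hypothetical.

```python
import numpy as np

def autocov(x, max_lag):
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_lag + 1)])

def fit_ar_min_aic(x, max_order=3):
    """Choose the AR order by minimum AIC; coefficients via Yule-Walker
    (an assumption here, substituting for maximum entropy estimation)."""
    n = len(x)
    r = autocov(x, max_order)
    best_aic, best_coefs = n * np.log(r[0]), np.array([])  # order 0
    for p in range(1, max_order + 1):
        R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
        coefs = np.linalg.solve(R, r[1:p + 1])
        sigma2 = r[0] - coefs @ r[1:p + 1]
        aic = n * np.log(sigma2) + 2 * p
        if aic < best_aic:
            best_aic, best_coefs = aic, coefs
    return best_coefs

def prewhiten(x, coefs):
    """Remove AR persistence: e[t] = x[t] - sum_k a_k * x[t-k] (x centred)."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    e = xc.copy()
    for k, a in enumerate(coefs, start=1):
        e[k:] -= a * xc[:-k]
    return e

def redden(e, coefs):
    """Add AR persistence back: y[t] = sum_k a_k * y[t-k] + e[t]."""
    y = np.zeros(len(e))
    for t in range(len(e)):
        y[t] = e[t] + sum(a * y[t - k]
                          for k, a in enumerate(coefs, start=1) if t - k >= 0)
    return y

def random_shock_reconstruction(pdsi, tree, max_order=3):
    a_pdsi = fit_ar_min_aic(pdsi, max_order)
    a_tree = fit_ar_min_aic(tree, max_order)
    pw, tw = prewhiten(pdsi, a_pdsi), prewhiten(tree, a_tree)
    # Predictors for PDSI in year t: prewhitened tree rings in years t and t+1.
    X = np.column_stack([np.ones(len(tw) - 1), tw[:-1], tw[1:]])
    cal = slice(max_order, None)  # skip the filter start-up years
    beta, *_ = np.linalg.lstsq(X[cal], pw[:-1][cal], rcond=None)
    # "Redden" the estimated shocks with the PDSI persistence model.
    return redden(X @ beta, a_pdsi) + np.mean(pdsi)

# Synthetic demonstration: trees see the climate shocks but carry more persistence.
rng = np.random.default_rng(1)
n = 1000
shocks = rng.normal(size=n)
pdsi = redden(shocks, np.array([0.3]))
tree = redden(0.8 * shocks + 0.6 * rng.normal(size=n), np.array([0.6]))
recon = random_shock_reconstruction(pdsi, tree)   # covers years 0 .. n-2

def lag1(x):
    r = autocov(x, 1)
    return r[1] / r[0]

corr = np.corrcoef(recon, pdsi[:-1])[0, 1]
ac_pdsi, ac_tree, ac_recon = lag1(pdsi), lag1(tree), lag1(recon)
print(f"r(recon, PDSI) = {corr:.2f}")
print(f"lag-1 autocorrelation: PDSI {ac_pdsi:.2f}, tree {ac_tree:.2f}, recon {ac_recon:.2f}")
```

     In this toy setting the reconstruction should inherit the persistence of the PDSI
     series rather than the stronger persistence of the tree-ring series, which is the
     purpose of the postreddening step.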
     So, now back to Mike's concerns.  Does this procedure result in a significant loss of
     low-frequency variance in the PDSI reconstructions?  I do not think so.  Dave Meko's
     results suggest that the "random shock" model produces reasonably unbiased
     reconstructions w.r.t. cross-spectral gain.  I have done similar tests of the method and
     agree with Dave's finding.
     Therefore, the prewhitening used in my PPR program should not be regarded as a flaw in
     the calibration/reconstruction procedure.  Indeed, if PDSI reconstructions are generated
     without the use of prewhitening, the verification statistics are much worse on average.
     This is because the presence of autocorrelation in both the tree-ring and climate series
     makes the identification of "true" lead-lag relationships between series very
     difficult.  So, prewhitening is clearly doing some good here and is clearly better than
     not doing any persistence modeling at all.  This being said, are there better, more
     elegant ways of adjusting for the differences in persistence structure between
     tree-rings and PDSI?  Perhaps, but it is not clear what they might be.
     I hope that what I have written here helps clarify the issues concerning the nature of
     low-frequency variance in my extended PDSI records.  Are they missing some amount of
     centennial-timescale variability?  Almost certainly, based on what I have told you.  Is
     the amount of missing variability large?  Based on the nature of precipitation
     variability and the experience of Phil Jones in calculating PDSI from long instrumental
     European records, I do not think so.  Is the prewhitening method used in my PPR program
     responsible for a loss of low-frequency variance?  Tests performed by Dave Meko and myself
     on the method that I use do not indicate that this is a problem.
     If you have any questions about what I have written, please do not hesitate to ask.
     Cheers,
     Ed
     References
     Cook, E.R., Briffa, K.R., Meko, D.M., Graybill, D.A. and Funkhouser, G. 1995. The
     segment length curse in long tree-ring chronology development for paleoclimatic studies.
     The Holocene 5(2):229-237.
     Granger, C.W. 1966. On the typical shape of an econometric variable. Econometrica
     34:150-161.
     Meko, D.M. 1981. Applications of Box-Jenkins methods of time series analysis to the
     reconstruction of drought from tree rings. Ph.D. dissertation, University of Arizona,
     Tucson.
--


     ==================================
     Dr. Edward R. Cook
     Doherty Senior Scholar and
     Director, Tree-Ring Laboratory
     Lamont-Doherty Earth Observatory
     Palisades, New York 10964  USA
     Email:  drdendro@ldeo.columbia.edu
     Phone:  845-365-8618
     Fax:    845-365-8152
     ==================================

   Prof. Phil Jones
   Climatic Research Unit        Telephone +44 (0) 1603 592090
   School of Environmental Sciences    Fax +44 (0) 1603 507784
   University of East Anglia
   Norwich                          Email    p.jones@uea.ac.uk
   NR4 7TJ
   UK
   ----------------------------------------------------------------------------