date: Wed, 20 Nov 2002 09:05:08 +0000
from: Phil Jones
subject: Fwd: Re: PDSI low-frequency issues
to: k.briffa@uea.ac.uk

Date: Tue, 19 Nov 2002 11:16:56 -0500
From: "Thomas R Karl"
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0
X-Accept-Language: en-us, en
To: Ed Cook
CC: Jones, Mann, Christopher D Miller, Bradley, Trenberth, David Easterling, Mark Eakin, Sharon Leduc, Connie Woodhouse, David M Anderson
Subject: Re: PDSI low-frequency issues

Thanks Ed and Mike,

This is an important issue, and as Mike indicated it affects all of our tree-ring (and other) chronologies. I think we have a substantial amount of work to do to make sure we use all this data appropriately. Ed, your description is very thorough and admirable. The problem I want to avoid, when we link these data up with the instrumental record, is making false statements about long-term trends and return periods. I am requesting that our Paleo group work with you and other experts to make sure we get this right. I think that is a heavy responsibility we bear if we are using these in monitoring products.

Thanks again,
Tom

Ed Cook wrote:

Hi Tom,

Phil Jones sent me an email concerning some discussions you had with him, Mike Mann, and others at a CCDD Panel Meeting about the low-frequency characteristics of the extended PDSI reconstructions that I have generated. Mark Eakin showed you some of these, I believe, and has put them on the NCDC/NGDC web site. I was in Bhutan at the time when I received Phil's email (yes, Bhutan does have the odd cyber café in its bigger towns!), so I was not able to reply until now. I have also cc'd this message to a few members of the review panel, including Phil and Mike.

In Phil's email, he indicated that you were concerned about what you perceived to be an unnatural lack of low-frequency (i.e. <1/100 years) variability in the extended PDSI reconstructions.
There are several issues that need to be considered in understanding why the observed low-frequency variability in the PDSI reconstructions is expressed as it is. I will describe them below in some detail.

A) In terms of low-frequency variability, what should we expect in the PDSI reconstructions?

As we all know, monthly PDSI is largely determined by current monthly precipitation and antecedent conditions, with current temperature acting as a lesser demand function during the warm-season months through its transformation into units of evapotranspiration. So while temperature (and its generally greater low-frequency information) may have a measurable effect on estimates of PDSI, I do not think that it will be all that large. I say this because of my experiments in reconstructing US gridded summer (JJA) PDSI and Ned Guttman's 12-month running sum Standardized Precipitation Index (SPI; Ned's suggestion for an SPI most similar to PDSI) from tree rings. The calibration/verification results were extremely similar, as were the reconstructions themselves, with PDSI doing slightly better on average. Given that SPI contains no explicit temperature information, this indicates to me that the large majority of the summer PDSI variability in the reconstructions is driven by variations in precipitation alone.

This being the case, how much low-frequency variance (i.e. <1/100 years) should we expect from precipitation records on local and regional spatial scales? In my experience not much. Compared to temperature, precipitation is much more dominated by high-frequency variability that often behaves in a short-lag persistence sense as a white-noise process. This being the case, I suggest that one ought not expect to find much centennial-timescale variability in PDSI reconstructions from tree rings. Indeed, Phil has indicated to me that he also does not find much low-frequency variability in PDSI series based directly on long European instrumental climate records.
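To put a rough number on the white-noise argument above, here is a minimal sketch. It is an illustration under stated assumptions, not part of the PDSI analysis itself: annual sampling (Nyquist frequency of 0.5 cycles/year), an idealized AR(1) process, and a hypothetical function name and hypothetical coefficient values. For pure white noise (phi = 0) the spectrum is flat, so the share of variance at periods longer than 100 years is exactly 2 x (1/100) = 2%. The sketch uses the closed-form integral of the AR(1) spectrum to show how modest short-lag persistence changes that share.

```python
import math

def low_freq_variance_fraction(phi, f_cut):
    """Fraction of AR(1) variance at frequencies below f_cut (cycles/year).

    Assumes annual sampling (Nyquist = 0.5 cycles/year) and |phi| < 1.
    phi = 0 is white noise. Uses the closed-form integral of the AR(1)
    spectrum, which is proportional to 1 / (1 - 2*phi*cos(2*pi*f) + phi**2).
    """
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must satisfy |phi| < 1")
    if not 0.0 <= f_cut <= 0.5:
        raise ValueError("f_cut must lie in [0, 0.5]")
    ratio = (1.0 + phi) / (1.0 - phi)
    return (2.0 / math.pi) * math.atan(ratio * math.tan(math.pi * f_cut))

# Hypothetical persistence values, chosen only for illustration.
for phi in (0.0, 0.2, 0.45):
    share = low_freq_variance_fraction(phi, 1.0 / 100.0)
    print(f"phi = {phi:.2f}: {100 * share:.1f}% of variance at periods > 100 years")
```

These figures describe the underlying process, not any particular finite record, so a real series of a few hundred years will scatter around them.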
Consequently, I think that the relative lack of centennial-timescale variability in PDSI reconstructions is, at least partly, a natural reflection of the way that local precipitation varies as a (nearly) white-noise process. However, as we both know, the way in which the tree-ring chronologies have been processed can affect how much low-frequency variance can be realized in any climate reconstruction. So ...

B) How much low-frequency variance might be missing in the PDSI reconstructions due to the way in which the tree-ring chronologies were created and processed?

Conceptually, and even theoretically, we have a pretty good idea what is going on here. There are basically two ways in which low-frequency variance is lost during the process of tree-ring chronology development. The first relates to what I have coined the "segment length curse" (Cook et al., 1995). In that paper, I (with four co-authors) described how the theoretical resolvability limit of low-frequency variance in a given time series of length n is O(1/n). Any variability at timescales >n cannot necessarily be differentiated from trend. This is the basis for Granger's (1966) "trend in mean" concept.

Now in classical tree-ring chronology development, the chronology is a mean-value function of length N, composed of m overlapping, (typically) shorter, length-n series extracted from a stand of living trees. Note that n is usually quite variable from tree to tree in the m-series ensemble, depending on the age structure of the sampled trees, with the worst-case scenario (from a low-frequency preservation perspective) being that n<100 years if each series is independently detrended first. There are ways that this limit might be circumvented (e.g., via RCS or age-banding methods), but I won't get into these issues here because all of the tree-ring chronologies used in reconstructing continental-scale gridded PDSI are based on classical tree-ring chronology development methods.
Therefore, a reasonable diagnostic for determining the lowest frequency that might be preserved in a length-N tree-ring chronology might be something like avg(1/n). I actually prefer med(1/n) because of the greater robustness of the median compared to the mean. So, how does this translate to the tree-ring network used to reconstruct PDSI over North America? I can't give you the exact med(1/n) information because it has not been formally tabulated for all chronologies. However, I can provide reasonably accurate estimates based on what I know about many of the chronologies.

First, consider the eastern US where I developed a tree-ring chronology network in the early 1980s. Over 20 years ago, I recognized the existence of the "segment length curse" as part of my dissertation research. Consequently, from many chronologies I purposely deleted individual tree-ring series that began after 1800. This means that the minimum segment length was ~180 years for the large majority of the ~60 chronologies that I developed and have used in the PDSI reconstructions for the eastern US. The med(1/n) is probably more like 1/220 for most chronologies. Many of the tree-ring chronologies developed by Dave Stahle in other parts of eastern North America have comparable median segment lengths.

In western North America, the situation will be generally better because the ages of the sampled trees are often older than those sampled in eastern North America. Consequently, I suggest that med(1/n) is <1/300 in many cases. This estimate suggests that, in principle, we ought to be able to reconstruct low-frequency PDSI variability <1/200 years from the North American tree-ring network. Of course, there are good reasons why <1/200 is overly optimistic because it does not take into account the method(s) of detrending used to "standardize" the m individual tree-ring series prior to averaging them together into the final chronology used to reconstruct PDSI.
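The med(1/n) diagnostic described above can be sketched in a few lines. This is only an illustration: the function name is hypothetical, and the segment lengths are made-up values in the range discussed for the eastern US, not tabulated data from any chronology.

```python
from statistics import median

def lowest_resolvable_frequency(segment_lengths):
    """Return med(1/n) in cycles per year for segment lengths given in years."""
    if not segment_lengths:
        raise ValueError("need at least one segment length")
    return median(1.0 / n for n in segment_lengths)

# Hypothetical segment lengths (years) for one chronology; not real data.
segments = [180, 200, 220, 240, 310]
f_min = lowest_resolvable_frequency(segments)
print(f"med(1/n) = 1/{1.0 / f_min:.0f} years")
```

Using the median rather than the mean keeps one unusually short or long series from dominating the estimate, which is the robustness argument made above.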
As is widely described in the dendrochronology literature, there are many different ways in which the tree-ring series may be detrended. The simplest fitted growth curves used for detrending are monotonic, either linear or negative exponential in form. These growth curves are commonly used to standardize western North American tree-ring series from open-canopy forests with minimal stand dynamics effects. Such detrending will have relatively little impact on the med(1/n) estimate for the preservation of low-frequency variance (e.g., 1/200 years).

However, in the closed-canopy forests typical of eastern North America and the more mesic forests of western North America, stand dynamics effects can perturb the trajectory of radial growth (i.e., the ring-width series) away from that which can be reasonably fitted by monotonic growth curves. Consequently, more flexible and locally adaptive growth curves are often used to detrend such series. The most commonly used "flexible and locally adaptive" method is probably the cubic smoothing spline. This particular cubic spline is especially attractive because its exact theoretical properties as a digital filter have been derived. Therefore, one knows the 50% frequency response cutoff, in years, for any given cubic smoothing spline.

So, in cases where the smoothing spline is used for detrending tree-ring series, how does its use affect the realizable minimum low-frequency variance preserved in the chronology? For some tree-ring chronologies in the network that are based on spline detrending, this information is not formally known. However, in the case of my chronologies from eastern North America (and many of Dave Stahle's chronologies), the 50% frequency response cutoff was set (in most cases) to 2/3 the length of the series being detrended. This translates to an adjusted med(1/n) of ~1/150 years, assuming an initial median segment length of ~220 years. If we wish to be even more conservative (pessimistic?) in our estimate by taking into account the transition bandwidth of the spline frequency response function, the realizable minimum low-frequency variance that is usefully preserved in the chronology could be more like ~1/120 years on average.

So even in regions where spline detrending is used (mostly eastern North America), it is likely that century-scale PDSI variability can be reconstructed, to the degree that it exists in local precipitation variability over time. In western North America, the potential recoverable low-frequency PDSI variability ought to exceed 1/200 years, again to the degree that it exists in local precipitation variability over time.

C) Are there other data processing issues that might affect the preservation of low-frequency variance in the PDSI reconstructions?

Phil indicated to me that Mike is concerned about the effects of prewhitening procedures used by me in my "Point-by-Point Regression" (PPR) procedure used to calibrate tree rings into estimates of PDSI. This is a legitimate concern. However, I do not regard it to be nearly as important as the effects of "segment length" and "detrending" on the preservation of low-frequency variance in the PDSI reconstructions.

First, let me explain the rationale for applying Box-Jenkins style prewhitening to the PDSI calibration problem. It has long been recognized in dendroclimatology that annual tree-ring chronologies frequently have a persistence structure (order and magnitude) that exceeds that associated with the climate variable thought to be well related to ring width (either causally or statistically). There are a number of physiological reasons for expecting this to be so, and many such processes can be thought of as operating in a causal feedback sense, i.e. the tree has a physiological memory that preconditions the potential for new radial growth driven by the arrival of new climate influences in any given year.
Now, it turns out that causal feedback filters can be described mathematically as autoregressive (AR) processes, hence the usefulness of Box-Jenkins (B-J) modeling in dendroclimatology. However, the application of B-J modeling to the calibration/reconstruction problem is not necessarily straightforward because the climate variable being reconstructed may have its own persistence structure that needs to be preserved in the tree-ring reconstruction. Therefore, two persistence models must be considered: one for the climate variable to be reconstructed and one for each tree-ring chronology used for reconstruction. Knowledge of both models can be used to "correct" the tree-ring persistence to better reflect that due to climate alone.

That this is necessary can be appreciated by realizing that the typical AR model for the instrumental summer PDSI series is AR(1-2), with a range of coefficients that explain from near-zero to 20% of the time series variance, depending on the geographic location. In contrast, the tree-ring AR models are typically AR(1-3) and the coefficients can cumulatively account for 2-5 times as much variance, depending on the location and tree species. This illustrates the need to adjust the persistence in tree-ring series as part of the climate reconstruction procedure.

There are a variety of ways that this may be approached. Dave Meko investigated two methods in his PhD dissertation (Meko, 1981) for developing precipitation reconstructions from arid-site conifers in western North America. The first used the classic B-J transfer function model and the second used a method devised by Dave, which he called the "random shock" model. Both methods worked well, with each having certain advantages over the other. In particular, the "random shock" model was found to produce an overall flat cross-spectral gain between actual and estimated precipitation.
This means that the statistical model used to develop the reconstruction was generally unbiased as a function of frequency. In contrast, the B-J transfer function model approach produced a somewhat "redder" cross-spectral gain, which means that the model tended to emphasize low-frequency variability. This difference does not mean that either method is necessarily superior to the other. However, the "random shock" model is easier to implement in an automatic way by using the minimum Akaike Information Criterion (AIC) to estimate the order of each AR model. This is the reason why I have chosen the prewhitening/postreddening procedures of Meko's "random shock" model.

The implementation of the "random shock" model in my PPR program is based first on prewhitening the climate series and tree-ring chronologies independently, using the minimum AIC to estimate the order of the model and the maximum entropy method to estimate the AR coefficients themselves. The correlations between PDSI and tree rings for years t and t+1 (tree rings lag PDSI) are estimated, and only those tree-ring variables that are significantly correlated (p<0.10) are retained and used in principal components regression. Note that the determination of "significance" is reasonably straightforward here because the series being compared are serially random in a short-lag sense. The resulting reconstruction, based on prewhitened tree rings and PDSI, is then "reddened" by adding the AR model persistence of the PDSI data into the tree-ring reconstruction. This is all very straightforward to do.

So, now back to Mike's concerns. Does this procedure result in a significant loss of low-frequency variance in the PDSI reconstructions? I do not think so. Dave Meko's results suggest that the "random shock" model produces reasonably unbiased reconstructions w.r.t. cross-spectral gain. I have done similar tests of the method and agree with Dave's finding.
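The prewhitening and reddening steps described above can be sketched as follows. This is a deliberately simplified illustration, not the PPR program: it assumes AR(1) models fitted from the lag-1 autocorrelation (in place of minimum-AIC order selection and maximum entropy coefficient estimates), a single tree-ring predictor with regression through the origin (in place of screening at lags t and t+1 followed by principal components regression), and synthetic data. All function names are hypothetical.

```python
import numpy as np

def ar1_coef(x):
    """Lag-1 autocorrelation, used here as a simple AR(1) coefficient estimate."""
    d = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(d[1:], d[:-1]) / np.dot(d, d))

def prewhiten(x, phi):
    """Remove AR(1) persistence: e[t] = d[t] - phi * d[t-1], with d = x - mean."""
    d = np.asarray(x, dtype=float) - np.mean(x)
    return d[1:] - phi * d[:-1]

def redden(e, phi):
    """Add AR(1) persistence back: y[t] = phi * y[t-1] + e[t], starting from 0."""
    y = np.empty(len(e))
    prev = 0.0
    for t, shock in enumerate(e):
        prev = phi * prev + shock
        y[t] = prev
    return y

def reconstruct(pdsi, rings):
    """Prewhiten both series, regress, then redden with the PDSI AR(1) model."""
    phi_c, phi_t = ar1_coef(pdsi), ar1_coef(rings)
    pdsi_w, rings_w = prewhiten(pdsi, phi_c), prewhiten(rings, phi_t)
    # Least-squares slope through the origin on the prewhitened series.
    slope = np.dot(rings_w, pdsi_w) / np.dot(rings_w, rings_w)
    return redden(slope * rings_w, phi_c) + np.mean(pdsi)

# Synthetic example: tree rings carry more persistence than the climate series.
rng = np.random.default_rng(0)
n = 500
shocks = rng.standard_normal(n)
noise = 0.5 * rng.standard_normal(n)
pdsi = redden(shocks, 0.3)            # hypothetical climate persistence
rings = redden(shocks + noise, 0.6)   # hypothetical, stronger tree persistence
recon = reconstruct(pdsi, rings)

print("AR(1) coefficient, PDSI:          ", round(ar1_coef(pdsi), 2))
print("AR(1) coefficient, tree rings:    ", round(ar1_coef(rings), 2))
print("AR(1) coefficient, reconstruction:", round(ar1_coef(recon), 2))
```

In this toy setting the reconstruction ends up with persistence close to that of the climate series rather than that of the tree rings, which is the purpose of prewhitening followed by reddening.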
Therefore, the prewhitening used in my PPR program should not be regarded as a flaw in the calibration/reconstruction procedure. Indeed, if PDSI reconstructions are generated without the use of prewhitening, the verification statistics are much worse on average. This is because the presence of autocorrelation in both the tree-ring and climate series makes the identification of "true" lead-lag relationships between series very difficult. So, prewhitening is clearly doing some good here and is clearly better than not doing any persistence modeling at all. This being said, are there better, more elegant ways of adjusting for the differences in persistence structure between tree rings and PDSI? Perhaps, but it is not clear what they might be.

I hope that what I have written here helps clarify the issues concerning the nature of low-frequency variance in my extended PDSI records. Are they missing some amount of centennial-timescale variability? Almost certainly, based on what I have told you. Is the amount of missing variability large? Based on the nature of precipitation variability and the experience of Phil Jones in calculating PDSI from long instrumental European records, I do not think so. Is the prewhitening method used in my PPR program responsible for a loss of low-frequency variance? Tests performed by Dave Meko and myself on the method that I use do not indicate that this is a problem.

If you have any questions about what I have written, please do not hesitate to ask.

Cheers,
Ed

References

Cook, E.R., Briffa, K.R., Meko, D.M., Graybill, D.A. and Funkhouser, G. 1995. The segment length curse in long tree-ring chronology development for paleoclimatic studies. The Holocene 5(2):229-237.

Granger, C.W. 1966. On the typical shape of an econometric variable. Econometrica 34:150-161.

Meko, D.M. 1981. Applications of Box-Jenkins methods of time series analysis to the reconstruction of drought from tree rings. Ph.D. dissertation, University of Arizona, Tucson.
--
==================================
Dr. Edward R. Cook
Doherty Senior Scholar and Director, Tree-Ring Laboratory
Lamont-Doherty Earth Observatory
Palisades, New York 10964 USA
Email: drdendro@ldeo.columbia.edu
Phone: 845-365-8618
Fax: 845-365-8152
==================================

Prof. Phil Jones
Climatic Research Unit
School of Environmental Sciences
University of East Anglia
Norwich NR4 7TJ
UK
Telephone +44 (0) 1603 592090
Fax +44 (0) 1603 507784
Email p.jones@uea.ac.uk
----------------------------------------------------------------------------