cc: carl mears, Karl Taylor, Tom Wigley, "Thorne, Peter", Steven Sherwood, John Lanzante, "'Dian J. Seidel'", Melissa Free, Frank Wentz, Steve Klein, Leopold Haimberger, peter gleckler
date: Wed, 05 Dec 2007 14:19:17 -0800
from: Ben Santer
subject: Re: [Fwd: sorry to take your time up, but really do need a scrub
to: Phil Jones

Dear Phil,

Just a quick response to the issue of "model weighting" which you and Carl raised in your emails.

We recently published a paper dealing with the identification of an anthropogenic fingerprint in SSM/I-based estimates of total column water vapor changes. This was a true multi-model detection and attribution ("D&A") study, which made use of results from 22 different A/OGCMs for fingerprint and noise estimation. Together with Peter Gleckler and Karl Taylor, I'm now in the process of repeating our water vapor D&A study using a subset of the original 22 models. This subset will comprise 10-12 models which are demonstrably more successful in capturing features of the observed mean state and variability of water vapor and SST - particularly features crucial to the D&A problem (such as the low-frequency variability). We've had fun computing a whole range of metrics that might be used to define such a subset of "better" models. The ultimate goal is to determine the sensitivity of our water vapor D&A results to model quality. I think that this kind of analysis will be unavoidable in the multi-model world in which we now live. Given substantial inter-model differences in simulation quality, "one model, one vote" is probably not the best policy for D&A work!

Once we've used Carl's method to calculate synthetic MSU temperatures from the IPCC AR4 20c3m data (as described in my previous email), it should be relatively easy to do a similar "model culling" exercise with MSU T2, T4, and TLT. In fact, this is what we had already planned to do in collaboration with Carl and Frank.

One key point in any model weighting or selection strategy is to avoid circularity. In the D&A context, it would be impermissible to include information on trend behavior as a criterion for selecting "better" models. Likewise, if our interest is in assessing the statistical significance of model-versus-observed trend differences, we can't use model performance in simulating "observed" tropospheric or stratospheric trends (whatever those might be!) as a means of identifying more credible models.

A further issue, of course, is that we are relying on results from fully coupled A/OGCMs, and are making trend comparisons over relatively short periods (several decades). On these short timescales, estimates of the "true" trend in response to the applied 20c3m forcings are quite sensitive to natural variability noise (as Peter Thorne's 2007 GRL paper clearly illustrates). Because of such chaotic variability, even a hypothetical model with perfect physics and forcings would yield a distribution of tropospheric temperature trends over 1979 to 1999, some larger and some smaller than observed. This is why it's illogical to stratify model results according to the correspondence between modeled and observed surface warming - something which John Christy is very fond of doing. What we've done (in the new water vapor work described above) is to evaluate the fidelity with which the AR4 models simulate the observed mean state and variability of precipitable water and SST - not the trends in these quantities.
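[A minimal numpy sketch of the kind of mean-state/variability ranking metric described above, deliberately using no trend information so that the culling stays non-circular with respect to the D&A test. The function names, the log ratio of standard deviations, the median normalization, and the equal weighting of the two error terms are all illustrative assumptions, not the actual PCMDI metrics.]

import numpy as np

def skill_errors(model_fields, obs_field, area_weights):
    # model_fields: dict of model name -> array (time, lat, lon);
    # obs_field: observations on the same grid; area_weights: (lat, lon),
    # e.g. cos(latitude). Returns per-model (mean-state error, variability
    # error); no trend information enters the calculation.
    w = area_weights / area_weights.sum()
    obs_mean = obs_field.mean(axis=0)
    obs_std = obs_field.std(axis=0)
    errors = {}
    for name, fld in model_fields.items():
        # Mean state: area-weighted RMS error of the time-mean field.
        e_mean = np.sqrt(np.sum(w * (fld.mean(axis=0) - obs_mean) ** 2))
        # Variability: area-weighted RMS of the log ratio of standard
        # deviations, penalizing too-weak and too-strong variability
        # symmetrically.
        e_var = np.sqrt(np.sum(w * np.log(fld.std(axis=0) / obs_std) ** 2))
        errors[name] = (e_mean, e_var)
    return errors

def cull(errors, keep=12):
    # Normalize each error by its inter-model median (so the two terms are
    # comparable), sum them, and keep the `keep` lowest-scoring models.
    m0 = np.median([e[0] for e in errors.values()])
    m1 = np.median([e[1] for e in errors.values()])
    return sorted(errors,
                  key=lambda m: errors[m][0] / m0 + errors[m][1] / m1)[:keep]

[A real study would of course compute many such metrics, in different regions and on different timescales, and test how sensitive the resulting subset is to the choices made.]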
We've looked at model performance in a variety of different regions, and on multiple timescales. The results are fascinating, and show (at least for water vapor and SST) that every model has its own individual strengths and weaknesses. It is difficult to identify a subset of models that CONSISTENTLY does well in many different regions and over a range of different timescales. My guess is that we would obtain somewhat different results for MSU temperatures - particularly for comparisons involving variability. Clearly, the absence of volcanic forcing in roughly half of the 20c3m experiments will have a large impact on the estimated variability of synthetic T4 temperatures (and perhaps even on T2), and hence on model-versus-data variability comparisons. It's also quite possible that the inclusion or absence of volcanic forcing has an impact not only on the amplitude of the variability of global-mean T4 anomalies, but also on the pattern of T4 variability. So model ranking exercises based on performance in simulating the mean state and variability of T4 and T2 may show some connection to the presence or absence of volcanic/ozone forcing.

The sad thing is that we are being distracted from doing this fun stuff by the need to respond to Douglass et al. That's a real shame.

With best regards,

Ben

Phil Jones wrote:
> All,
> IJC do have comments but only very rarely. I see little point in
> doing this, as there is likely to be a word limit, and if the system
> works properly Douglass et al would get the final say. There is also a
> large backlog of papers waiting to appear, so even if the comment were
> accepted it would be some time after Douglass et al that it would appear.
> Better would be a submission to another journal (JGR?), which
> would be quicker. This could go in before Douglass et al appeared in
> print - it should be in the IJC early online view fairly soon based on
> recent experience.
> A paper pointing out the issues of trying to weight models in some way
> would be very beneficial to the community. AR5 will have to go down this
> route at some point. How models simulate the recent trends at the
> surface and in the troposphere/stratosphere, and how they might be
> ranked, is a possibility. This could bring in the new work Peter
> alludes to with the sondes.
> There are also some aspects of recent surface T changes that could be
> discussed as well. These relate to the growing dominance of buoy SSTs
> (now 70% of the total) vs conventional ships. There is a paper from
> Smith/Reynolds et al at NCDC, accepted in J. Climate, which shows that
> buoys could conceivably be cooler than ship-based SSTs by about 0.1C -
> meaning that the last 5-10 years are being gradually underestimated
> over the oceans.
> Overlap is still too short to be confident about this, but it highlights a
> major systematic change occurring in surface ocean measurements. As the
> buoys are presumably better for absolute SSTs, this means models
> driven with fixed SSTs should be using fields that are marginally cooler.
>
> And then there is the continual reference to Kalnay and Cai, when
> Simmons et al (2004) have shown the problems with NCEP. It is possible
> to add in the ERA-Interim analyses and operational analyses to
> bring results from ERA-40 up to date.
>
> Cheers
> Phil
>
>
> At 23:40 04/12/2007, carl mears wrote:
>> Karl -- thanks for clarifying what I was trying to say
>>
>> Some further comments.....
>>
>> At 02:53 PM 12/4/2007, Karl Taylor wrote:
>>> Dear all,
>>> 2) unforced variability hasn't dominated the observations.
>>
>> But on this short time scale, we strongly suspect that it has
>> dominated. For example, the 2-sigma error bars from table 3.4 of the
>> CCSP report for satellite TLT are 0.18 (UAH) or 0.19 (RSS), larger
>> than either group's trends (0.05, 0.15) for 1979-2004. These were
>> calculated using a "goodness of linear fit" criterion, corrected for
>> autocorrelation. This is probably a reasonable estimate of the
>> contribution of unforced variability to trend uncertainty.
>>
>>> Douglass et al. have *not* shown that every individual model is in
>>> fact inconsistent with the observations. If the spread of individual
>>> model results is large enough and at least 1 model overlaps the
>>> observations, then one cannot claim that all models are wrong, just
>>> that the mean is biased.
>>
>> Given the magnitude of the unforced variability, I would say "the mean
>> *may* be biased." You can't prove this with only one universe, as Tom
>> alluded. All we can say is that the observed trend cannot be proven to
>> be inconsistent with the model results, since it is inside their range.
>>
>> It will be interesting to see if we can say anything more, when we
>> start culling out the less realistic models, as Ben has suggested.
>>
>> -Carl
>
> Prof. Phil Jones
> Climatic Research Unit                  Telephone +44 (0) 1603 592090
> School of Environmental Sciences        Fax +44 (0) 1603 507784
> University of East Anglia
> Norwich                                 Email p.jones@uea.ac.uk
> NR4 7TJ
> UK
> ----------------------------------------------------------------------------

----------------------------------------------------------------------------
Benjamin D. Santer
Program for Climate Model Diagnosis and Intercomparison
Lawrence Livermore National Laboratory
P.O. Box 808, Mail Stop L-103
Livermore, CA 94550, U.S.A.
Tel: (925) 422-2486
FAX: (925) 422-7675
email: santer1@llnl.gov
----------------------------------------------------------------------------
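[A minimal sketch of the autocorrelation-corrected trend uncertainty Carl quotes above: fit an OLS trend, then inflate the slope's standard error via an effective sample size based on the lag-1 autocorrelation of the residuals - presumably the adjustment of Santer et al. (2000). The synthetic data and all numbers below are illustrative, not the actual CCSP calculation.]

import numpy as np

def trend_with_2sigma(y, dt=1.0 / 12.0):
    # OLS trend of a monthly time series y (dt in years), with the 2-sigma
    # trend uncertainty inflated for lag-1 autocorrelation of the residuals
    # via an effective sample size n_eff = n * (1 - r1) / (1 + r1).
    n = y.size
    t = np.arange(n) * dt
    b, a = np.polyfit(t, y, 1)                      # slope b, intercept a
    resid = y - (a + b * t)
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]   # lag-1 autocorrelation
    n_eff = n * (1.0 - r1) / (1.0 + r1)
    s2 = np.sum(resid ** 2) / (n_eff - 2.0)         # adjusted residual variance
    se_b = np.sqrt(s2 / np.sum((t - t.mean()) ** 2))
    return b, 2.0 * se_b

# Synthetic red-noise example spanning 1979-2004 (312 months):
rng = np.random.default_rng(0)
noise = np.zeros(312)
for i in range(1, 312):
    noise[i] = 0.7 * noise[i - 1] + rng.normal(scale=0.1)
t_years = np.arange(312) / 12.0
y = 0.015 * t_years + noise                         # ~0.15 K/decade trend
b, ci = trend_with_2sigma(y)
print(f"trend = {10 * b:+.2f} K/decade, 2-sigma = {10 * ci:.2f} K/decade")

[With strongly autocorrelated noise, n_eff here is a few dozen rather than 312, so the 2-sigma bar can exceed the fitted trend itself - which is exactly Carl's point about the UAH and RSS TLT trends.]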