cc: "Thomas.R.Karl" , carl mears , "David C. Bader" , "'Dian J. Seidel'" , "'Francis W. Zwiers'" , Frank Wentz , Karl Taylor , Melissa Free , "Michael C. MacCracken" , "'Philip D. Jones'" , santer1@llnl.gov, Sherwood Steven , Steve Klein , 'Susan Solomon' , "Thorne, Peter" , Tim Osborn , Tom Wigley
date: Sun, 16 Dec 2007 14:39:09 +0100
from: Leopold Haimberger
subject: Re: [Fwd: sorry to take your time up, but really do need a scrub
to: John.Lanzante@noaa.gov

Hello John and colleagues,

My colleagues from Vienna and I have a paper under review in J. Climate; we submitted the second revision last week. The reviews of the first revision were quite positive. It tries to explain why versions 1.3 and 1.4 in particular are better than 1.2. It also explains a method that uses the breakpoint dates gained from RAOBCORE as metadata for a neighbor composite homogenization method similar to HadAT. This second method is much less dependent on possible inhomogeneities in the ERA-40 background.

I sent it to Ben already, but those interested may download it from ftp://srvx6.img.univie.ac.at/pub/RSMSUBG_rev2.pdf but please keep it confidential.

The paper shows that radiosonde homogenization is making progress, but it is of course ongoing work. At least it shows that there seem to be ways to remove much of the pervasive bias in radiosonde temperatures.

Time series from RAOBCORE v1.4 are already published in "Arguez et al. (2007) Supplement to State of the Climate in 2006 BAMS 88 Nr 6, s1-s135" (I believe the plot numbers are 2.4 and 2.5). These plots were collected by J. Christy about the same time last year.

I have to leave now but can give you details later.

Regards,
Leo

John Lanzante wrote:

Ben,

Perhaps a resampling test would be appropriate. The tests you have performed consist of pairing an observed time series (UAH or RSS MSU) with each one of 49 GCM time series from your "ensemble of opportunity".
Significance of the difference between each pair of obs/GCM trends yields a certain number of "hits". To determine a baseline for judging how likely it would be to obtain the given number of hits, one could perform a set of resampling trials by treating one of the ensemble members as a surrogate observation. For each trial, select at random one of the 49 GCM members to be the "observation". From the remaining 48 members draw a bootstrap sample of 49, and perform 49 tests, yielding a certain number of "hits". Repeat this many times to generate a distribution of "hits". The actual number of hits, based on the real observations, could then be referenced to the Monte Carlo distribution to yield a probability that this could have occurred by chance. The basic idea is to see if the observed trend is inconsistent with the GCM ensemble of trends.

There are a couple of additional tweaks that could be applied to your method. You are currently computing trends for each of the two time series in the pair and assessing the significance of their differences. Why not first create a difference time series and assess the significance of its trend? The advantage of this is that you would reduce somewhat the autocorrelation in the time series and hence the effect of the "degrees of freedom" adjustment. Since the GCM runs are based on coupled model runs, this differencing would help remove the common externally forced variability, but not internally forced variability, so the adjustment would still be needed.

Another tweak would be to alter the significance level used to assess differences in trends. Currently you are using the 5% level, which yields only a small number of hits. If you made this less stringent you would potentially get more, weaker hits. But it would all come out in the wash, so to speak, since the number of hits in the Monte Carlo simulations would increase as well.
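The resampling trials described above can be sketched in a few lines. This is only an illustration under stated assumptions, not code used by anyone in this thread: it assumes each ensemble member has already been reduced to a trend and an autocorrelation-adjusted standard error (hypothetical inputs `trends` and `ses`), and it uses a simple normal-approximation test on the trend difference as a stand-in for the actual pairwise test. All function names are invented for the example.

```python
import numpy as np
from statistics import NormalDist


def count_hits(obs_trend, obs_se, trends, ses, alpha=0.05):
    """Count members whose trend differs significantly from the 'observation'.

    Assumption: a two-sided normal-approximation test on the trend difference,
    with standard errors already adjusted for autocorrelation.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z = np.abs(obs_trend - trends) / np.sqrt(obs_se**2 + ses**2)
    return int(np.sum(z > z_crit))


def resample_hits(trends, ses, n_trials=1000, alpha=0.05, seed=0):
    """Monte Carlo distribution of hit counts with a surrogate observation."""
    rng = np.random.default_rng(seed)
    n = len(trends)
    hits = np.empty(n_trials, dtype=int)
    for t in range(n_trials):
        # Pick one ensemble member at random to play the "observation".
        k = rng.integers(n)
        # Bootstrap n members (with replacement) from the remaining n - 1.
        rest = np.delete(np.arange(n), k)
        sample = rng.choice(rest, size=n, replace=True)
        hits[t] = count_hits(trends[k], ses[k], trends[sample], ses[sample], alpha)
    return hits


def chance_probability(actual_hits, hits):
    """Fraction of trials with at least as many hits as actually observed."""
    return float(np.mean(hits >= actual_hits))


if __name__ == "__main__":
    # Synthetic placeholder values, NOT real model output.
    rng = np.random.default_rng(42)
    trends = rng.normal(0.2, 0.05, size=49)
    ses = np.full(49, 0.05)
    dist = resample_hits(trends, ses, n_trials=2000)
    print(chance_probability(3, dist))
```

Changing `alpha` in both the real test and the resampling trials corresponds to the less stringent significance level discussed here; the hit counts rise in both, so the comparison stays like for like.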
I suspect that increasing the number of expected hits would make the whole procedure more powerful/efficient in a statistical sense, since you would no longer be dealing with a "rare event". In the current scheme, using a 5% level with 49 pairings, you have an expected hit rate of 0.05 × 49 = 2.45. For example, if instead you used a 20% significance level you would have an expected hit rate of 0.20 × 49 = 9.8.

I hope this helps.

On an unrelated matter, I'm wondering a bit about the different versions of Leo's new radiosonde dataset (RAOBCORE). I was surprised to see that the latest version has considerably more tropospheric warming than I recalled from an earlier version that was written up in JCLI in 2007. I have a couple of questions that I'd like to ask Leo. One concern is this: if we use the latest version of RAOBCORE, is there a paper that we can reference? If it is not in a peer-reviewed journal, is there a paper in submission? The other question is: could you briefly comment on the differences in methodology used to generate the latest version of RAOBCORE as compared to the version used in JCLI 2007, and what/when/where did changes occur to yield a stronger warming trend?

Best regards,
______John

On Saturday 15 December 2007 12:21 pm, Thomas.R.Karl wrote:

Thanks Ben,

You have the makings of a nice article. I note that we would expect about 10 cases that are significantly different by chance (based on the 196 tests at the .05 sig level). You found 3. With appropriately corrected Leopold I suspect you will find there is indeed stat sig. similar trends incl. amplification. Setting up the statistical testing should be interesting with this many combinations.

Regards,
Tom

--
Ao. Univ. Prof. Dr. Leopold Haimberger
Institut für Meteorologie und Geophysik, Universität Wien
Althanstraße 14, A - 1090 Wien
Tel.: +43 1 4277 53712 Fax.: +43 1 4277 9537
http://mailbox.univie.ac.at/~haimbel7/