date: Thu, 17 Aug 2006 12:31:28 +0200
from: Eduardo Zorita
subject: comments to mitrie manuscript
to: m.n.juckes@rl.ac.uk, "Moberg; Anders", Gabi Hegerl, esper@wsl.ch, "Briffa; Keith", "Osborn; Tim", m.allen1@physics.ox.ac.uk, weber@knmi.nl

Due to the ongoing debate, this has turned into an even more difficult manuscript. In general, I think Martin did a very good job in the review of the literature. Concerning the new reconstructions and the evaluation of McIntyre's work, I would not fully agree with some of the conclusions, which I think do not follow from the material presented in the text. I have some remarks on this which you may find useful. But I think that I am not the one who should give the manuscript its final shape, as Martin is the person in charge of the project. Please consider the following comments as suggestions.

eduardo

Consensus: I would tend to avoid the word 'consensus', since it is not a well-defined concept. Depending on the meaning of consensus, each of us would agree with it to a certain degree. I would prefer to refer to a particular IPCC conclusion, or something similar. I think this review of the literature is very well written and informative, but I am not sure that each one of us will agree with each one of the conclusions of each of the papers.

Page 12, section 2.8. I think the text is somewhat vague here, and it could be misunderstood. Mann et al (2005) tested the RegEM method, not the original MBH98 method. It is true that, applied to the real proxies, both methods yield very similar results according to Mann. But strictly speaking, Mann did not test the MBH98 method in the CSM simulation. The MBH98 method is thereby tested only by implication. I tested the sensitivity of the MBH98 method, and not of RegEM, to the length of the calibration period. It may be that RegEM is less sensitive, or not sensitive at all. Figures 4 and 5, if I understood well, support this dependency of MBH on the calibration period.
Am I correct to interpret the large differences between the original MBH reconstruction (dashed red) and the black curve as due to the different calibration period (1901-1980 versus 1856-1980) and to the use of the leading PC or NHT as calibration target? At least in the period prior to 1600 I think these are the only methodological differences between both curves (?). My interpretation of this figure is also somewhat different. If the final reconstructions differ so strongly when a longer calibration period is used (in general yielding stronger decadal variability in the reconstruction), I would tend to think that the method based on these proxies is quite unstable. What would happen if the calibration period could have been extended to 1800, for instance?

Page 15: top. The role of forcing on the global or NHT is also recognized in the correlation between the NHT simulated by ECHO-G and CSM for the millennium. For the case of a second ECHO-G simulation (Gonzalez-Rouco et al.) the agreement is very close at the 30-year timescale.

Section 3, beginning. In my opinion, MM05 stress the inadequacies and uncertainties in the MBH work, but they do not put forward their own reconstruction implying a warmer-than-today MWP. They may believe that this is true, but in their work so far, at least to my knowledge, they do not assert that the MWP was warmer than present, only that the uncertainties are too large for such a claim.

Section 3: Consensus. This paragraph may be problematic. Again, what is the consensus? If we look at the recent NAS report, which again not everyone would agree with, the 'consensus' is reduced to the past 400 years in comparison to IPCC, leaving ample space for speculation before this period. Does the NAS report belong to the consensus? Perhaps partially, but I am not sure to what extent.

Section 3, discussion of MM05 and hockey-stick index. I have here a certain level of disagreement with these paragraphs.
The issue raised by MM05 would be that the de-centering of the proxies prior to the calculation of the principal components tends to produce a hockey-stick-shaped leading PC. I think this effect is real, at least with spatially uncorrelated red-noise series. It can be easily verified and it has been recognized in the NAS report, in the Wegman report and by Francis Zwiers. To be fair, this issue is followed by the problem of truncation: whether to keep just the leading PC or also further PCs down the hierarchy. If further PCs are kept, the differences in the final reconstructions could probably be minor. But the paragraph implies, in my opinion, that this criticism by MM05 has no grounds, which as I said is problematic and could expose the manuscript to criticism based on these recent reports.

I think that the calculation shown in Figure 3 is very useful, as it boils down to the issue raised by MM05: how relevant are the de-centering and standardization with real proxies? Apparently, I get a different message from Figure 3 (although I may have misinterpreted the text). I see quite large differences in the 20th century between the original MBH leading PC and the 'correct' calculation (whole-period centering and standardization, blue line). Only the original MBH PC shows a positive trend in the 20th century. The blue line even seems to show a negative trend or no trend at all. If these PCs were to be used in the MBH regression model (with trend included in the calibration) the results could be quite different. I would tend to think that this figure actually supports the MM05 criticism, since the hockey-stick shape of the leading PC disappears.

Section 3, end, bristlecone pines. I am also worried by this paragraph. The recent NAS report clearly states that the bristlecone pines should not be used for reconstructions in view of their potential problems. They cite previous analyses on this issue.
I think that referring to just one study indicating no fertilization effect may not be enough. However, I am not a dendroclimatologist. This could open the door to potential problems.

Section 4, end: the years 1997 and onwards were the warmest in the millennium. I see here also potential problems with this claim, and I do not see the need to make our lives more complicated. The NAS report expressed that the uncertainties are too large for this type of conclusion, and certainly this conclusion would attract some attention from the reader. I see two lines of criticism on this.

One is that the standard errors have been calculated with the calibration residuals, and these are an underestimation of the true uncertainties. A reviewer may require that the uncertainty range be calculated by cross-calibration or bootstrapping. In the case of CVM perhaps this effect is not very important, as there is just one free parameter, but in the case of inverse regression there are many more free parameters and the true uncertainties can be quite different from those estimated from the calibration residuals. This potential criticism could be exacerbated by the fact that the new reconstruction has not been tested in a validation period.

The other line of criticism could be that the calibration period has been, as in all reconstructions, truncated a priori: data after 1980 are not considered, as the proxies are known not to follow the temperature. Strictly speaking, this truncation can only be justified by a credible physical explanation of the cause of this divergence. Statistically, I think it is not correct to ignore some data a priori because they do not fit. If one does so, I think the uncertainty range should be enlarged to encompass the possibility that this divergence could have happened in the past, i.e.
an additional standard deviation of the instrumental NHT in the period 1980-2000 (or, perhaps more correctly, the square root of the sum of the error variance and the NHT variance in 1980-2000). Alternatively, one could include the period 1980-2000 in the calibration, and due to the divergence the standard errors would grow, but perhaps this is not possible in practice, as the proxy time series may not have been archived for the last 20 years.

Section 5, conclusions. I share the worry of Anders Moberg about the wording 'serious flaws' in the analysis of MM05. This sentence would be based on Figure 3, if I understood properly, but as I said I think Figure 3 actually does not support this conclusion.

Finally, I think it would be strategically better to avoid conflicts on the particular point of whether some particular year was the warmest of the millennium or not, and to stress the fact that all reconstructions, including the new ones presented in the manuscript (with one exception), show MWP temperatures lower than late 20th century temperatures. Another conclusion could be, in my view, that the average temperature in the cold centuries of the millennium seems to be still quite uncertain. The new reconstructions, or the calculation of the leading PCs of the proxies, seem to be still quite sensitive to particular choices in the statistical set-up.
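
A minimal sketch of the enlarged uncertainty range suggested above for Section 4 (the square root of the sum of the error variance and the NHT variance in 1980-2000). The function name, the argument names and the use of the sample variance for both terms are assumptions made for illustration only; none of them comes from the manuscript.

```python
import math
import statistics


def enlarged_standard_error(calibration_residuals, nht_1980_2000):
    """Combine two variances in quadrature (hypothetical helper).

    calibration_residuals: residuals of the reconstruction in the
        calibration period (assumed input).
    nht_1980_2000: instrumental NH temperature values for the
        excluded period 1980-2000 (assumed input).
    """
    # Error variance estimated from the calibration residuals.
    error_variance = statistics.variance(calibration_residuals)
    # Variance of the instrumental NHT in the excluded period.
    nht_variance = statistics.variance(nht_1980_2000)
    # Square root of the sum of both variances.
    return math.sqrt(error_variance + nht_variance)
```

By construction the result is never smaller than the standard error obtained from the calibration residuals alone, which is the point of the suggestion: the range widens to allow for a past divergence of the same size.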