cc: Eduardo Zorita, anders.moberg@natgeo.su.se, hegerl@duke.edu, weber@knmi.nl, myles.allen@physics.ox.ac.uk, k.briffa@uea.ac.uk, jan.esper@wsl.ch
date: Wed, 1 Nov 2006 16:47:24 +0000
from: Martin Juckes
subject: Re: CPD submission
to: Tim Osborn

Firstly, there really is no way of getting a PC with non-zero mean out of centred data. There is no need for any discussion on trivial mathematical identities. That part is not complicated.

The Huybers comment deals with a different McIntyre and McKitrick paper, so it is not directly relevant.

I've used the code provided by McIntyre, data from his website (which is just the MBH 1998 data in a slightly different format), and the result is a graph which is indistinguishable from that published in his Energy and Environment paper. The software does not centre the data; if it did centre the data it could not possibly produce the published graph.

If you think there is anything complicated here it would be helpful to know what it is; it all looks blindingly obvious to me (which is not to say that there aren't closely related issues which are complicated).

cheers,
Martin

On Wednesday 01 November 2006 16:18, Tim Osborn wrote:
> Hi Martin,
>
> I have only had time for a quick read of your comments plus a quick
> think about this possibly complex issue.
>
> I agree with Eduardo that we should be careful about claiming coding
> errors (or failure to implement exactly the process that is described
> in the text), and would suggest that you contact McIntyre with a
> brief question about the centering that he thinks he is doing etc.
> before posting to the CPD site. I expect that you won't want to get
> into a detailed and lengthy interaction over this, but a short query
> to the effect that you are concerned that his code does not appear to
> be centering the data might be sufficient to elucidate whether your
> concern is correct or not.
> For example, even if he does not centre
> the data in his code, if the input data are already centred then this
> does not matter (sorry, I have no time to examine his input data files today!).
>
> I also draw your attention to the Huybers comment (and the MM
> response to it), PDFs of both are attached. Does this have any
> relevance to your concerns over the MM code? First, because Huybers
> seems happy that MM results are reproducible. Second, because he
> points out that MM and MBH differ not only in centering period, but
> also in standardisation (i.e. correlation vs. covariance) -- does
> that cover part of your concerns already?
>
> Sorry for the rushed and not completely-thought-through reply, but I
> have to leave now.
>
> Cheers
>
> Tim
>
> At 15:40 01/11/2006, Martin Juckes wrote:
> >I've attached the document I intend to put on the MITRIE web site. Following
> >Eduardo's comments, I've only put myself as author, but I'm happy to include
> >anyone else who would like to endorse it.
> >
> >It is important to emphasise that figure 2 of MM2005 (Energy and Environment)
> >which shows a line with clearly non-zero mean and claims it is a principal
> >component of centred data cannot be correct: principal components of centred
> >data have zero mean. It is slightly embarrassing to have missed this rather
> >obvious point until now, but it is nevertheless true. Studying their code,
> >and getting it to run so that I am not dependent on assuming that routines
> >are platform independent, allows the source of this error to be determined.
> >
> >I've also attached the MM2005 paper, so you can check that their figure is
> >properly reproduced.
> >
> >cheers,
> >Martin
> >
> >On Wednesday 01 November 2006 14:25, Eduardo Zorita wrote:
> > >
> > > dear co-authors,
> > >
> > > On the question of data and code sharing, I am not sure whether
> > > Climate of the Past is the adequate forum, but I have
> > > in principle nothing against it.
> > > I see however the risk that the possible discussion drifts from
> > > the manuscript itself towards those general questions.
> > >
> > > Concerning the more particular question of the errors in the code by
> > > MM05-ee, again I would tend to be very cautious. I have tried to look
> > > a little bit into the R routines that may be used to calculate the
> > > principal components, prcomp and princomp. There are several methods
> > > to do it, and apparently even those R routines do not produce the same
> > > results with the same data. I am not an expert in the R language and I
> > > feel completely unsure as to what those routines do internally, e.g.
> > > whether the data are indeed centered or not in any internal steps.
> > >
> > > However, I recall that when this issue was raised by MM, Mann himself
> > > recognized that the calculation by MM was correct, i.e. the leading PC
> > > was dependent on the centering period, but that when choosing the
> > > correct truncation (i.e. keeping more PCs than just the leading one)
> > > the final results were insensitive to this step.
> > >
> > > Wegman also went through the code and apparently he found it to be ok.
> > > Of course, it is possible that both were wrong.
> > >
> > > This, together with the fact that it is quite easy to overlook aspects
> > > of the code written by others, guards me against making any definitive
> > > assertions on a code written in a language that I do not command, the
> > > results of which I do not have the chance to test with my own software.
> > > Of course, you are free to do as you think is correct, but please not
> > > under my endorsement.
> > >
> > > eduardo
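
The identity at the centre of this thread, that principal components of centred data have zero mean, can be checked numerically. The following is a minimal sketch on synthetic data, not the MM2005 or MBH code or data: the helper names (`centre`, `leading_pc`, `mean`), the use of power iteration to find the leading component, and the random series are all assumptions made for illustration.

```python
import random


def centre(rows):
    """Subtract each column's mean over the full record (hypothetical helper)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def leading_pc(rows, iterations=500):
    """Leading PC scores of `rows`, used exactly as given: no centring is done here."""
    p = len(rows[0])
    # Cross-product matrix X^T X; proportional to the covariance only if rows are centred.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    # Power iteration towards the leading eigenvector of X^T X.
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(xtx[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Scores are the projection of each row onto that eigenvector.
    return [sum(x * vi for x, vi in zip(r, v)) for r in rows]


def mean(values):
    return sum(values) / len(values)


random.seed(0)
# Synthetic stand-in data: 200 time steps x 5 series, each with a non-zero offset.
data = [[random.gauss(0.0, 1.0) + 2.0 + 0.5 * k for k in range(5)] for _ in range(200)]

pc_centred = leading_pc(centre(data))
pc_uncentred = leading_pc(data)

print(abs(mean(pc_centred)))    # zero up to rounding error
print(abs(mean(pc_uncentred)))  # clearly non-zero
```

The centred case holds for any weight vector, not only the leading eigenvector: the scores are a linear combination of columns that each average to zero. It also bears on Tim's caveat. Centring data that are already centred changes nothing, so whether the code centres only matters when the input files are themselves uncentred.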