date: Wed, 29 Oct 2003 08:35:33 +0000
from: Phil Jones
subject: Fwd: Re: STOP THE PRESS!
to: k.briffa@uea.ac.uk

>Date: Tue, 28 Oct 2003 19:10:17 -0800 (PST)
>From: Stephen H Schneider
>To: "Michael E. Mann"
>cc: Richard Kerr, Andy Revkin, David Appell, Mike MacCracken, Michael Oppenheimer, "Socci.Tony-epamail.epa.gov", Jonathan Overpeck, Phil Jones, Scott Rutherford, Gabi Hegerl, Tom Crowley, Tom Wigley, Tim Osborn, Stefan Rahmstorf, Gavin Schmidt, Rob Dunbar, Ross Gelbspan, Ben Santer
>Subject: Re: STOP THE PRESS!
>
>Hello all. Interesting tale--why we have competent peer review at competent journals, and why professional courtesy is always to run heterodox results by the orthodox for private comments before going public--unless the motivation isn't science, but a big splash. Too bad for them--the wrong guys will belly-flop (couldn't have happened to a nicer bunch of prevaricators!). By the way, I give it a 50% (Bayesian priors) subjective probability that they will accuse you of deliberately misleading them or deliberately preventing replication by "independent" scientists, and that the only reason they did this was to smoke you out. From them, expect anything. Can you explain this to Senator McCain's folks so they understand the complexities and the professional courtesy/peer review issues? This stuff is not very sound-bite friendly and needs some prethinking to put it simply and clearly so it can be useful in the debate held by non-scientist debaters. Good luck, Steve
>
>On Tue, 28 Oct 2003, Michael E. Mann wrote:
>
> > Dear Friends and Colleagues,
> >
> > I've got a story with a very happy ending to tell. It will take a bit of patience to get through the details of the story, but I think it's worth it.
> >
> > By the way, please keep this information confidential for about the next day or so.
> >
> > OK, well, it's about 48 hours since I first had the chance to review the E&E paper by M&M. I haven't had a lot of sleep, but I have had a lot of coffee, and my wife Lorraine has been kind enough to allow me to stay perpetually glued to the terminal. So what has this effort produced?
> >
> > Well, upon first looking at what the authors had done, I realized that they had used the wrong CRU surface temperature dataset (the post-1995 version) to calculate the standard deviations used to un-normalize the Mann et al (1998) EOF patterns; the normalization factors for those patterns were based on Phil's older dataset. The clues to them should have been that (a) our dataset goes back to 1854 and theirs only back to 1856, and (b) why are 4 of the 1082 Mann et al (1998) gridpoints missing? [It's because the reference periods are different in the two datasets, which leads to a different spatial pattern of missing values.] So they had used the wrong temperature standard deviations to un-normalize our EOFs in the process of forming the surface temperature reconstruction. And I thought to myself, hmm--this could lead to some minor problems, but I don't see how they get this divergence from the Mann et al (1998) estimate that increases so much back in time and becomes huge before 1500 or so. That can't be it, can it?
> >
> > Then I uncovered that they had used the standard deviations of the raw gridpoint temperature series to un-normalize the EOFs, while we had normalized the data by the detrended standard deviations. Either convention can be justified, but you can't mix and match--which is what they effectively did by adopting our EOFs and PCs and using their own standard deviations. And I thought, hmm--this could certainly lead to an artificial inflation of the variance in the reconstruction in general, and it could give an interesting spatial pattern of bias as well (which might have an interesting influence on the areally weighted hemispheric mean). But I thought, hmm, this can't really lead to that tremendous divergence before 1500 that the authors find. I was still scratching my head a bit at this point.
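To make the contrast concrete, here is a minimal sketch in Python of the two normalization conventions described above, using a synthetic gridpoint series (illustrative only; the function names and toy data are not from MBH98 or M&M):

import numpy as np

def raw_std(series):
    # standard deviation of the raw gridpoint temperature series
    return np.std(series, ddof=1)

def detrended_std(series):
    # standard deviation of the residuals about a linear trend,
    # i.e. the detrended convention described above
    t = np.arange(len(series))
    trend = np.polyval(np.polyfit(t, series, 1), t)
    return np.std(series - trend, ddof=1)

# Un-normalizing a reconstructed gridpoint series multiplies the normalized
# values back by a standard deviation.  If the normalization used
# detrended_std() but the rescaling uses raw_std(), the restored variance is
# inflated wherever the series carries a strong trend.
rng = np.random.default_rng(1)
series = 0.03 * np.arange(100) + rng.standard_normal(100)  # trend plus noise
print(raw_std(series), detrended_std(series))              # raw exceeds detrended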
> > Then I read about the various transcription errors, values being shifted, etc. that the authors describe as existing in the dataset. And I thought, hmm, that sounds like an Excel spreadsheet problem, not a problem w/ the MBH98 proxy dataset. It started to occur to me at this point that there might be some problems w/ the Excel spreadsheet data that my colleague Scott Rutherford had kindly provided the authors at their request. But these problems sounded pretty minor from the authors' description, and the authors described a procedure to try to fix any obvious transcription errors, shifted cell values, etc. So I thought, hmm, they might not have fixed things perfectly, and that could also lead to some problems. But I still don't see how they get that huge divergence back in time from this sort of error...
> >
> > Still scratching my head at this point... Then finally, this afternoon, some clues. After looking at their on-line description one more time, I became disturbed at something I read. The data matrix they're using has 112 columns! Well, that can't be right! That can't constitute the Mann et al (1998) dataset. There are considerably more than that number of independent proxy indicators necessary to reproduce the stepwise Mann et al reconstruction. Something is amiss!
> >
> > Well, 112 is the number of proxy indicators used back to 1820. But some of these indicators are principal components of regional sub-networks (e.g. the Western U.S. ITRDB tree-ring data), used to make the dataset more manageable in size, and those principal components (PCs) are unique to the time interval analyzed. So there is one set of PC series for the 1820-1980 period. Farther back in time--say, back to 1650--there are fewer data series in the regional sub-networks. So we recalculate a completely different EOF/PC basis set for that period, and that constitutes an additional, unique set of proxy indicators that are appropriate for a reconstruction of the 1650-1980 period. PC #1 from one interval is not equivalent to PC #1 from a different interval. This turns out to be the essential detail. A reconstruction back to 1820 calibrated against the 20th century needs to make use of the unique set of proxy PCs available for the 1820-1980 period. A reconstruction back to 1650 calibrated against the 20th century needs to make use of the independent (smaller) set of PC series available for the 1650-1980 period, and so on, back to 1400.
> >
> > So there have to be significantly more than 112 series available to perform the iterative, stepwise reconstruction approach of Mann et al (1998), because each sub-interval actually has a unique set of PC series representations of the various proxy sub-networks.
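As a sketch of the stepwise bookkeeping just described, here is a toy Python illustration (the function, proxy names, and step years are hypothetical stand-ins, not the actual MBH98 code or data): each reconstruction step gets its own leading PC, computed from only those sub-network members that reach back to that step's start year, over that step's full window.

import numpy as np

def stepwise_pc1(proxies, first_year, steps, end_year=1980):
    # proxies:    {name: 1-D annual array covering first_year[name]..end_year}
    # first_year: {name: first year with data}
    # steps:      start years of the reconstruction steps
    pc1 = {}
    for step in steps:
        members = [n for n in proxies if first_year[n] <= step]
        window = end_year - step + 1
        # truncate every member to the common window step..end_year
        X = np.column_stack([proxies[n][-window:] for n in members])
        X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize each series
        u, s, _ = np.linalg.svd(X, full_matrices=False)
        pc1[step] = u[:, 0] * s[0]   # leading PC for this step's window only
    return pc1

# toy usage with random stand-ins for the sub-network members
rng = np.random.default_rng(0)
first_year = {"p1": 1400, "p2": 1400, "p3": 1450, "p4": 1650, "p5": 1820, "p6": 1820}
proxies = {n: rng.standard_normal(1980 - fy + 1) for n, fy in first_year.items()}
pcs = stepwise_pc1(proxies, first_year, steps=[1820, 1650, 1450, 1400])
# pcs[1820] and pcs[1400] come from different networks over different windows:
# they are different series, not two segments of a single column.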
> > Then it started to hit me. The PC #1 series calculated for networks of similar size (say, the network available back to 1820 and that available back to 1750) should be similar. But as the sub-network gets sparser back in time, its PC #1 series will resemble less and less the PC #1 series of the denser networks available at later times. PC #1 of the western ITRDB tree-ring data calculated for the 1400-1980 period will bear almost no resemblance to the PC #1 series of the western North American ITRDB data calculated for the 1820-1980 period, even during their interval (1820-1980) of mutual overlap.
> >
> > Then it really hit me. What--just what--if the proxy data had been pigeonholed into a 112-column matrix by the following (completely inappropriate!) procedure: what if it had been decided that there would be only one column for "PC #1 of the Western ITRDB tree-ring data", even though that PC reflects something completely different over each sub-interval? Well, that can't be done in a reasonable way. But it can be done in an *unreasonable* way: by successively overprinting the data in that column as one stores the PCs from later and later intervals. So a given column would reflect PC #1 of the 1400-1980 data from 1400-1450, PC #1 of the 1450-1980 data from 1450-1500, PC #1 of the 1500-1980 data from 1500-1650, PC #1 of the 1650-1980 data from 1650-1750, and so on. In this process, the information necessary to calibrate the early PCs would be obliterated with each successive overprint. The resulting 'series' corresponding to that column of the data matrix, an amalgam of increasingly unrelated information down the column, would be completely useless for calibration of the earlier data. A reconstruction back to AD 1400 would be reconstructing the PC #1 values occupying the 1400-1450 segment based on calibration against the almost entirely unrelated PC #1 of the 1820-1980 interval. The reconstruction of the earliest centuries would be based on a completely spurious calibration of an unrelated PC of a much later proxy sub-network. And I thought, gee, what if Scott (sorry, Scott) had *happened* to do this in preparing the Excel file that the authors used. Well, it would mean that, progressively in earlier centuries, one would be reconstructing an apple based on a calibration against an orange. It would yield completely meaningless results more than a few centuries ago. And then came the true epiphany--ahhh, this could lead to the kind of result the authors produced. In fact, it seemed to me that this would almost *ensure* the result that the authors get--an increasing divergence back in time, and total nonsense prior to 1500 or so. At this point, I knew that's what Scott must have done. But I had to confirm.
> >
> > I simply had to contact Scott and ask him: Scott, when you prepared that Excel file for these guys, you don't suppose by any chance that you might have....
> >
> > And, well, I think you know the answer.
> >
> > So the proxy data back to AD 1820 used by the authors may, by and large, be correct (aside from the apparent transcription/cell-shift errors, which they purport to have caught and fixed anyway). The data become progressively corrupted in earlier centuries. By the time one goes back to AD 1400, the 1400-1980 data series are, in many cases, entirely meaningless combinations of early and late information, and have no relation to the actual proxy series used by Mann et al (1998).
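A toy Python demonstration of the single-column overprinting described above (the PC series here are random stand-ins, not the real proxy PCs), showing why the surviving column is an amalgam that cannot be calibrated for the early centuries:

import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1400, 1981)
steps = [1400, 1450, 1500, 1650, 1750, 1820]   # step start years, longest window first

# random stand-ins for "PC #1 of the network available back to each step"
pc1_by_step = {s: rng.standard_normal(1980 - s + 1) for s in steps}

# one spreadsheet column reused for every step's PC #1: each later
# (shorter-window) PC overprints the overlap of the earlier ones
column = np.full(years.shape, np.nan)
for s in steps:
    column[years >= s] = pc1_by_step[s]

# Surviving segments: 1400-1449 holds PC #1 of the 1400-1980 network,
# 1450-1499 holds PC #1 of the 1450-1980 network, ..., and 1820-1980 holds
# PC #1 of the 1820-1980 network.  Calibrating this column against the 20th
# century calibrates only the 1820-1980 PC, which says nothing about the
# unrelated segments occupying the earlier centuries.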
> > And so, the authors' results are wrong/meaningless/useless. The mistake made ensures, especially, that the estimates during the 15th and 16th centuries are entirely spurious.
> >
> > So whose fault is this? Well, the full, raw ASCII proxy dataset has been available on our anonymous ftp site, ftp://holocene.evsc.virginia.edu/pub/MBH98/, and the authors were informed of this in email correspondence. But they specifically requested that the data be provided to them in Excel format. And Scott prepared it for them in that format, in good faith--but overlooked the fact that all of the required information couldn't possibly fit into a 112-column format. So the file Scott produced was a complete corruption of the actual Mann et al proxy dataset, and essentially useless, transcription errors etc. aside. The authors had full access to the uncorrupted dataset. We therefore take no responsibility for their use of corrupted data.
> >
> > One would have thought that the authors might have tried to reconcile their completely inconsistent result prior to publication. One might have thought that it would at least occur to them as odd that the Mann et al (1998) reconstruction is remarkably similar to entirely independent estimates, for example by Crowley and Lowery (2000). Could both have made the same supposed mistake, even though the data and methods are entirely unrelated? Or might M&M have made a mistake? Just possibly, perhaps???
> >
> > Of course, a legitimate peer-review process would have caught this problem. In fact, it would have been caught in about 48 hours if I (or, probably, many of my colleagues) had been given the opportunity to review the paper. But that isn't quite the way things work at "E&E", I guess. I guess there may just be some corruption of scientific objectivity when a journal editor seems more interested in politics than science.
> >
> > The long and short of this: I think it is morally incumbent upon E&E to publish a full retraction of the M&M article immediately. It's unlikely that they'll do this, but it's reasonable to assert that it would be irresponsible for them not to if the issue arises.
> >
> > I think that's the end of the story. Please, again, keep this information under wraps for the next day or two. Then, by all means, feel free to disseminate this information as widely as you like...
> >
> > Mike
> >
> > ______________________________________________________________
> > Professor Michael E. Mann
> > Department of Environmental Sciences, Clark Hall
> > University of Virginia
> > Charlottesville, VA 22903
> > _______________________________________________________________________
> > e-mail: mann@virginia.edu   Phone: (434) 924-7770   FAX: (434) 982-2137
> > http://www.evsc.virginia.edu/faculty/people/mann.shtml
>
>------
>Stephen H. Schneider, Professor
>Dept. of Biological Sciences
>Stanford University
>Stanford, CA 94305-5020 U.S.A.
>
>Tel: (650) 725-9978
>Fax: (650) 725-4387
>shs@stanford.edu

Prof. Phil Jones
Climatic Research Unit
School of Environmental Sciences
University of East Anglia
Norwich NR4 7TJ, UK
Telephone: +44 (0) 1603 592090
Fax: +44 (0) 1603 507784
Email: p.jones@uea.ac.uk
----------------------------------------------------------------------------