date: Wed, 29 Oct 2003 08:35:33 +0000
from: Phil Jones
subject: Fwd: Re: STOP THE PRESS!
to: k.briffa@uea.ac.uk
>Date: Tue, 28 Oct 2003 19:10:17 -0800 (PST)
>From: Stephen H Schneider
>To: "Michael E. Mann"
>cc: Richard Kerr , Andy Revkin ,
> David Appell ,
> ,
> Mike MacCracken ,
> Michael Oppenheimer ,
> "Socci.Tony-epamail.epa.gov" ,
> , ,
> , Jonathan Overpeck ,
> Phil Jones , Scott Rutherford
> ,
> Gabi Hegerl , tom crowley ,
> Tom Wigley , Tim Osborn ,
> Stefan Rahmstorf ,
> Gavin Schmidt ,
> Rob Dunbar , ,
> Ross Gelbspan , Ben Santer ,
> ,
>Subject: Re: STOP THE PRESS!
>
>Hello all. Interesting tale--why we have competent peer review at
>competent journals, and why professional courtesy is always to run
>heterodox results by the orthodox for private comments before going
>public--unless the motivation isn't science, but a big spalsh. Too bad for
>them--the wrong guys will belly-flop (couldn't have happened to a nicer
>bunch of prevaricators!). By the way, I give it a 50% (Bayesian priors)
>subjective probability they will accuse you of deliberately misleading
>them or deliberately preventing replication by "independent" scientists
>and the only reason they did this was to smoke you out. From them, expect
>anything. Can you explain this to Senator McCain's folks so they
>understand the complexities and professional courtesy/peer review issues?
>This stuff is not very sound bite friendly and needs some prethinking to
>put it simply and clearly so it can be useful in the debate held by
>non-scientist debaters. Good luck, Steve
>
>On Tue, 28 Oct 2003, Michael E. Mann wrote:
>
> > Dear Friends and Colleagues,
> >
> > I've got a story with a very happy ending to tell. I't will take a bit
> > of patience to get through the details of the story, but I think its
> > worth it.
> >
> > By the way, please keep this information confidential for about the next
> > day or so.
> >
> > OK, well its about 48 hours since I first had the chance to review the
> > E&E paper by M&M. Haven't had a lot of sleep, but I have had a lot of
> > coffee, and my wife Lorraine has been kind enough to allow me to stay
> > perpetually glued to the terminal. So what has this effort produced?
> >
> > Well, upon first looking at what the authors had done, I realized that
> > they had used the wrong CRU surface temperature dataset (post 1995
> > version) to calculate the standard deviations for use in un-normalizing
> > the Mann et al (1998) EOF patterns. Their normalization factors were
> > based on Phil's older dataset. The clues to them should have been that a)
> > our data set goes back to 1854 and theirs only back to 1856 and (b) why
> > are 4 of the 1082 Mann et al (1998) gridpoints missing?? [its because
> > the reference periods are different in the two datasets, which leads to a
> > different spatial pattern of missing values]. So they had used the wrong
> > temperature standard deviations to un-normalize our EOFs in the process
> > of forming the surface temperature reconstruction. And I thought to
> > myself, hmm--this could lead to some minor problems, but I don't see how
> > they get this divergence from the Mann et al (1998) estimate that
> > increases so much back in time, and becomes huge before 1500 or so. That
> > can't be it, can it?
> >
> > Then I uncovered that they had used standard deviations of the raw
> > gridpoint temperature series to un-normalize the EOFs, while we had
> > normalized the data by the detrended standard deviations. Either
> > convention can be justified, but you can't mix and match--which is what
> > they effectively did by adopting our EOFs and PCs, and using their
> > standard deviations. And I thought, hmm--this could certainly lead to an
> > artificial inflation of the variance in the reconstruction in general,
> > and this could give an interesting spatial pattern of bias as well (which
> > might have an interesting influence on the areally-weighted hemispheric
> > mean). But I thought, hmm, this can't really lead to that tremendous
> > divergence before 1500 that the authors find. I was still scratching my
> > head a bit at this point.
> >
> > Then I read about the various transcription errors, values being shifted,
> > etc. that the authors describe as existing in the dataset. And I thought,
> > hmm, that sounds like an excel spread sheet problem, not a problem w/ the
> > MBH98 proxy data set. It started to occur to me at this point that there
> > might be some problems w/ the excel spreadsheet data that my colleague
> > Scott Rutherford had kindly provided the authors at their request. But
> > these problems sounded pretty minor from the authors' description, and
> > the authors described a procedure to try to fix any obvious
> > transcription errors, shifted cell values, etc. So I thought, hmm, they
> > might not have fixed things perfectly, and that could also lead to some
> > problems. But I still don't see how they get that huge divergence back in
> > time from this sort of error...
> >
> > Still scratching my head at this point...Then finally this afternoon,
> > some clues. After looking at their on-line description one more time, I
> > became disturbed at something I read. The data matrix they're using has
> > 112 columns! Well that can't be right! That's can't constitute the Mann
> > et al (1998) dataset. There are considerably more than that number of
> > independent proxy indicators necessary to reproduce the stepwise Mann et
> > al reconstruction. Something is amiss!
> >
> > Well, 112 is the number of proxy indicators used back to 1820. But some
> > of these indicators are principal components of regional sub-networks
> > (e.g. the Western U.S. ITRDB tree-ring data) to make the dataset more
> > managable in size, and those principal components (PCs) are unique to the
> > time interval analyzed. So there is some set of PC series for the
> > 1820-1980 period. Farther back in time, say, back to 1650 there are fewer
> > data series the regional sub-networks. So we recalculate a completely
> > different EOF/PC basis set for that period, and that constitutes an
> > additional, unique set of proxy indicators that are appropriate for a
> > reconstruction of the 1650-1980 period. PC #1 from one interval is not
> > equivalent to PC#1 from a different interval. This turns out to be the
> > essential detail. A reconstruction back to 1820 calibrated against the
> > 20th century needs to make use of the unique set of proxy PCs available
> > for the 1820-1980 period. A reconstruction back to 1650 calibrated
> > against the 20th century needs to make use of the independent (smaller)
> > set of PC series available for the 1650-1980 period, and so on, back to
> > 1400.
> >
> > So there have to be significantly more than 112 series available to
> > perform the iterative,stepwise reconstruction approach of Mann et al
> > (1998), because each sub interval actually has a unique set of PC series
> > representations of various proxy sub-networks. Then it started to hit
> > me. The PC#1 series calculated for networks of similar size (say, the
> > network available back to 1820 and that available back to 1750) should be
> > similar. But as the sub-network gets sparser back in time, the PC#1
> > series will resemble less and less the PC#1 series of the denser networks
> > available at later times. PC#1 of the western ITRDB tree-ring calculated
> > for the 1400-1980 period will bear almost no resemblance to the PC#1
> > series of the western N.Amer ITRDB data calculated for the 1820-1980
> > period during their interval (1820-1980) of mutual overlap.
> >
> > Then it really hit me. What--just what--if the proxy data had been
> > pigeonholed into a 112 column matrix by the following (completely
> > inappropriate!) procedure: What if it had been decided that there would
> > only be 1 column for "PC #1 of the Western ITRDB tree ring data", even
> > though that PC reflects something completely different over each
> > sub-interval. Well, that can't be done in a reasonable way. But it can be
> > done in an *unreasonable* way: by successively overprinting the data in
> > that column as one stores the PCs from later and later intervals. So a
> > given column would reflect PC#1 of the 1400-1980 data from 1400-1450,
> > PC#1 of the 1450-1980 from 1450-1500, PC#1 of the 1500-1980 data for
> > 1500-1650, PC#1 of the 1650-1980 data for 1650-1750, etc. and so on. In
> > this process, the information necessary to calibrate the early PCs would
> > be obliterated with each successive overprint. The resulting 'series'
> > corresponding to that column of the data matrix, an amalgam of
> > increasingly unrelated information down the column, would be completely
> > useless for calibration of the earlier data. A reconstruction back to AD
> > 1400 would be reconstructing the PC#1 of the 1400-1450 interval based on
> > calibration against the almost entirely unrelated PC#1 of the 1820-1980
> > interval. The reconstruction of the earliest centuries would be based on
> > a completely spurious calibration of an unrelated PC of a much later
> > proxy sub network. And I thought, gee, what if Scott (sorry Scott), had
> > *happened* to do this in preparing the excel file that the authors used.
> > Well it would mean that, progressively in earlier centuries, one would
> > be reconstructing an apple, based on calibration against an orange. It
> > would yield completely meaningless results more than a few centuries ago.
> > And then came the true epiphany--ahhh, this could lead to the kind of
> > result the authors produced. In fact, it seemed to me that this would
> > almost *insure* the result that the authors get--an increasing divergence
> > back in time, and total nonsense prior to 1500 or so. At this point, I
> > knew that's what Scott must have done. But I had to confirm.
> >
> > I simply had to contact Scott, and ask him: Scott, when you prepared that
> > excel file for these guys, you don't suppose by any chance that you might
> > have....
> >
> > And, well, I think you know the answer.
> >
> > So the proxy data back to AD 1820 used by the authors may by-in-large be
> > correct (aside from the apparent transcription/cell shift errors which
> > they purport to have caught, and fixed, anyway). The data become
> > progressively corrupted in earlier centuries. By the time one goes back
> > to AD 1400, the 1400-1980 data series are, in many cases, entirely
> > meaningless combinations of early and late information, and have no
> > relation to the actual proxy series used by Mann et al (1998).
> >
> > And so, the authors results are wrong/meaningless/useless. The mistake
> > made insures, especially, that the estimates during the 15th and 16th
> > centuries are entirely spurious.
> >
> > So whose fault is this? Well, the full, raw ascii proxy data set has been
> > available on our anonymous ftp site
> > ftp://holocene.evsc.virginia.edu/pub/MBH98/
> > and the authors were informed of this in email correspondence. But they
> > specifically requested that the data be provided to them in excel format.
> > And Scott prepared it for them in that format, in good faith--but
> > overlooked the fact that all of the required information couldn't
> > possibly be fit into a 112 column format. So the file Scott produced was
> > a complete corruption of the actual Mann et al proxy data set, and
> > essentially useless, transcription errors, etc. aside. The authors had
> > full access to the uncorrupted data set. We therefore take no
> > reasonability for their use of corrupted data.
> >
> > One would have thought that the authors might have tried to reconcile
> > their completely inconsistent result prior to publication. One might have
> > thought that it would at least occur to them as odd that the Mann et al
> > (1998) reconstruction is remarkably similar to entirely independent
> > estimates, for example, by Crowley and Lowery (2000). Could both have
> > made the same supposed mistake, even though the data and method are
> > entirely unrelated. Or might M&M have made a mistake? Just possibly,
> > perhaps???
> >
> > Of course, a legitimate peer-review process would have caught this
> > problem. In fact, in about 48 hours if I (or probably, many of my
> > colleagues) had been given the opportunity to review the paper. But that
> > isn't quite the way things work at "E&E" I guess. I guess there may just
> > be some corruption of scientific objectivity when a journal editor seems
> > more interested in politics than science.
> >
> > The long and short of this. I think it is morally incumbent upon E&E to
> > publish a full retraction of the M&M article immediately. Its unlikely
> > that they'll do this, but its reasonable to assert that it would be
> > irresponsible for them not to if the issue arises.
> >
> > I think that's the end of the story. Please, again, keep this information
> > under wraps for next day or two. Then, by all means, feel free to
> > disseminate this information as widely as you like...
> >
> > Mike
> >
> > ______________________________________________________________
> > Professor Michael E. Mann
> > Department of Environmental Sciences, Clark Hall
> > University of Virginia
> > Charlottesville, VA 22903
> > _______________________________________________________________________
> > e-mail: mann@virginia.edu Phone: (434) 924-7770 FAX: (434) 982-2137
> > http://www.evsc.virginia.edu/faculty/people/mann.shtml
> >
>
>------
>Stephen H. Schneider, Professor
>Dept. of Biological Sciences
>Stanford University
>Stanford, CA 94305-5020 U.S.A.
>
>Tel: (650)725-9978
>Fax: (650)725-4387
>shs@stanford.edu
Prof. Phil Jones
Climatic Research Unit Telephone +44 (0) 1603 592090
School of Environmental Sciences Fax +44 (0) 1603 507784
University of East Anglia
Norwich Email p.jones@uea.ac.uk
NR4 7TJ
UK
----------------------------------------------------------------------------