cc: "Bai, Zhanguo"
date: Fri, 9 Mar 2007 07:32:39 +0100
from: "Dent, David"
subject: RE:
to: "Phil Jones"

Dear Phil

Thank you for this helpful advice. Please let me know how we can get access to your updated record.

Kind regards
David

______________________________________________________________________________________

From: Phil Jones [mailto:p.jones@uea.ac.uk]
Sent: Thursday 8 March 2007 17:48
To: Dent, David
Cc: Bai, Zhanguo
Subject: RE:

David,

The GPCC dataset is likely a better product than ours, so I would go with that one. They have tried to use a consistent set of stations through time. Their dataset is not updated in real time, and ours is.

There are a few gripes in his email that I've not heard before. I didn't realize he was now at FAO. DWD did get all our raw station data and according to Bruno Rudolf they were merged in. Jurgen's gripes seem to go quite deep.

We do have a slightly decreasing # of stations, but at least we get to the near present. At large scales it will not make that much difference. I have some examples from the upcoming 4th IPCC Assessment Report that I can send you after May 4. CRU data compares well with most US datasets and with the two DWD datasets.

Here is one - don't pass on. It shows a number of different datasets. It isn't clear which is right if any are. Some contain satellite estimates, some don't.

Cheers
Phil

At 15:37 08/03/2007, Dent, David wrote:

Dear Phil

Thank you for your kind reply.

ISRIC - World Soil Information is undertaking a Global Assessment of Land Degradation and Improvement within an FAO project: Land Degradation in Drylands.

Initially, we are identifying black spots of degradation by trend analysis of the GIMMS data set of fortnightly NDVI at 8km resolution since 1981, as a proxy for net primary productivity. Because production depends upon climate as well as land, we are looking at both NDVI trend and rain-use efficiency.
For calculation of rain-use efficiency, we have two climatic datasets - yours and VASClimO, GPCC Germany (http://www.dwd.de/en/FundE/Klima/KLIS/int/GPCC/GPCC.htm). We shall use the data at 0.5 degree resolution.

We have been using the CRU TS 2.1 dataset in our preliminary case studies in North China and Kenya to calculate the trends of rain-use efficiency from 1981 onwards. However, we have, since then, looked at the VASClimO dataset: 50 years global precipitation data 1951-2000. Not being climatologists, we cannot make an informed choice, except that VASClimO is not being updated. The GPCC Full Data Reanalysis Products might be considered - from 2001 onwards but it does not use the same stations as VASClimO - so this does not seem to be a good idea!

One of the VASClimO creators, Dr Juergen Grieser, employed by FAO now, has comments on the datasets:

'My objections with respect to the CRU data are that they use record fragments of various lengths and have a continuously decreasing number of stations during both the last decades of the 20th century...

'GPCC usually provides gridded data on 1 or 2.5 degree resolution. This means 17,689 and 3,355 gridpoints for global land surfaces, respectively. In the latter case one would have roughly 2-3 stations per gridbox if the stations were equally distributed in space. And they definitely are not.

'With 0.5 degree resolution of the VASClimO dataset we got 65,617 grid points with 9,343 stations! Now if you go for 8 km grid which I suppose is 5'x5' you would have 36 times as many grid cells, which is about 2.3 million locations but still fewer than 10,000 stations. What I am trying to stress is that the input information is much sparser than the resolution you are interested in.

'The VASClimO dataset was generated with special emphasis on data quality, homogeneity and shortness of gaps in the records.
Our goal was to make about the same error for each month and not to provide best estimates for each month individually (as is usually done). The latter case leads to datasets which mix climate trends and data-availability trends. Therefore VASClimO is a unique dataset particularly suitable for precipitation change investigations.

'Unfortunately, the routines had to be programmed and installed on private computers since the subproject leader Dr. Bruno Rudolf refused to provide necessary programmable computer power to the project employees. Therefore, at the end of the project, the programs and the knowledge of how to test and interpolate the data (beyond stupidly and suboptimally applying donated software) left the Met Service (DWD) with the employees. No extension of the VASClimO dataset is available.

'However, the GPCC full-data product just uses all available data which means that whole country-specific subsets may be switched on in one year and off in another year, making time series analysis useless. No quality control at all is applied to the data used for the full-data product. Finally, it is not relative precipitation that is interpolated (as should be the case in order to get realistic values and to keep the local long-term average in good agreement with the observations); instead, precip totals are interpolated. This leads to "interpolated" funny rain amounts especially in dry areas. Here at FAO, however, we have lots of fun with the GPCC's results.

'Also note that an extension of the VASClimO dataset would mean a new calculation, quality control, homogeneity testing and station selection, since e.g. the condition of only a certain fraction of gaps would be fulfilled for another subset of stations.

'As I said before the problem with the Full-Data Product is, as with all the other data sets except VASClimO (Does that include CRU?) that it uses a considerably different number of stations for each month. Each month gets the highest number of stations available.

'The paper you attached clearly shows the jump in data availability in 1986 (the traditional start year of GPCC). Therefore I personally doubt that you could use the data for time series analysis, i.e. trend estimation, since trends at least in some regions may just result from the fact that in part of the time series the nearest station is a different one than in other parts of the time series. Since precipitation is highly variable in space this makes quite a difference. Note that GPCC also does not use a base precipitation field (long-term means) and interpolate relative deviations, as is recommended by all the groups which have some understanding of precipitation and interpolation (although they could do so), but simply interpolates rainfall observations as they are, thereby transferring observations over hundreds of kilometers and across climate regimes, and simply does not care whether or not the resulting values are reasonable.

'Therefore, if you are not interested in a snapshot I very much recommend not using the Full-Data Product. If you are very desperately looking for data you may wish to correlate the time series of the Full-Data Product and the VASClimO Product gridpoint by gridpoint and learn about the use of a regression to fit them. Maybe in the region you are interested in and for the ratio you are investigating this leads to usable results.

'However, our experience of correlating the records of the CRU grid with the ones of the Full-Data Product is bad. And we gave up merging these data for water balance calculations.'

What to do? I seek your advice because we are about to undertake the global analysis of land degradation - a task of 30 months - and need to start now.

Kind regards
David

Prof. Phil Jones
Climatic Research Unit              Telephone +44 (0) 1603 592090
School of Environmental Sciences    Fax +44 (0) 1603 507784
University of East Anglia
Norwich                             Email p.jones@uea.ac.uk
NR4 7TJ
UK
----------------------------------------------------------------------------