cc: Mike Hulme <m.hulme@uea.ac.uk>
date: Wed Mar 31 10:18:16 2004
from: Phil Jones <p.jones@uea.ac.uk>
subject: Re: ATEAM climate data
to: Timothy Carter <tim.carter@ymparisto.fi>, Tim Mitchell <tim.mitchell@surrey.ac.uk>

    Tim C.
         Quickly reading your response to Tim M., I think you're defending impacts analysts
    far too much. Whenever I meet some of these people, I have to bite my lip to avoid
    saying something I'll regret. Impacts people need to be made aware of the limitations
    of observed data, and even more so of model data. What Tim has done is likely the best
    that can be done given the limitations of what we can get hold of, while still trying
    to maintain the weak correlations between variables.
         At many meetings, impacts people ask for model futures for variables and time
    intervals we just don't have in the real world. How then do they test their models?
    Chris Kilsby is working to derive 5-minute rainfall scenarios for an EPSRC project,
    because the hydrologists on one project want this. There is one raingauge in the UK
    with 5-minute rainfall for 20 years. They want it for urban catchments in northern
    England, yet the long record is from Farnborough. When pushed on this, they gave us
    one year's data for a site near Bradford. They said they had techniques for making
    1000 years of records from one year of data. Despite this being a climate change
    project, they simply assumed that high-frequency rainfall variations would change
    according to the mean.
       To show them at our next meeting, we're going through HadAM3P/H and HadRM3P/H
    looking at convective/total precip and large-scale/total precip ratios and A2 scenario
    changes. I've never seen these sorts of plots before. The results are frightening. In
    winter over the Mediterranean, 90% of the rainfall over the sea is convective, but on
    land less than 10% is convective. I've never seen a variable delineate the coastline
    so well. How does large-scale rainfall that falls on the land manage not to fall into
    the sea?
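    The ratio plots described above can be sketched as follows. This is a rough
    illustration only, using synthetic stand-in arrays rather than actual HadRM3P
    output; the field names and grid size are hypothetical.

```python
import numpy as np

# Hypothetical monthly-mean precipitation components on a lat-lon grid
# (mm/day); in practice these would come from HadAM3P/HadRM3P output.
rng = np.random.default_rng(0)
conv_precip = rng.uniform(0.0, 2.0, size=(50, 60))   # convective component
large_scale = rng.uniform(0.0, 3.0, size=(50, 60))   # large-scale component
total = conv_precip + large_scale

# Convective fraction at each grid box; guard against (near-)dry boxes.
ratio = np.where(total > 1e-6, conv_precip / total, np.nan)

# A land-sea mask (a random stand-in here) lets us compare the two
# regimes directly, e.g. the sea/land contrast over the Mediterranean.
land_mask = rng.random((50, 60)) > 0.5
print("mean convective fraction over sea :", np.nanmean(ratio[~land_mask]))
print("mean convective fraction over land:", np.nanmean(ratio[land_mask]))
```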

        Tim may not have said, but we already have one review of the J. Climate paper
    (from Tom Wigley) which is, by Tom's standards, good. I'm dreading getting the reviews
    back, as I think it will be me who has to respond to them. I know I'm not going to have
    much time to respond, so the first thing I'll do will be to ask for an extension of the
    likely one month that we'll be given - if the other reviews are as favourable as Tom's.
    Cheers
    Phil
   At 18:35 30/03/2004 +0300, Timothy Carter wrote:

     Hello Tim,
     Thanks for the clarifications. I recognise the enormous amount of thought and effort
     that has gone in to developing these data sets and agree that this is probably the best
     available climatological data set at this resolution. However, therein lies the problem
     encountered in ATEAM, and it really is a problem. This is a climatological data set; it
     is not a data set that is immediately applicable in impact assessment. If the data set
     is to be used, it is essential, at the minimum, to understand where data are present and
     where they are absent. This information is not provided here. I also think that there is
     a difference between information on presence/absence and information on unreliability
     (e.g. due to interpolation). You seem to be arguing that there is a continuum between
     complete absence of data (relax to zero anomaly) and fully reliable data. I would at
     least distinguish first between some data and no data.
     The researchers applying these data are not climatologists, and I think there is a
     perception among most that the data sets are comprehensive in time and space (that word
     is even used in the title of the submitted paper!). Yes, they are comprehensive in that
     they offer values for each grid box and month for all variables. However, in cases where
     there are missing or sparse data, these "values" are simply equivalent to 1961-1990
     means. This makes them unusable in most impact assessments where inter-annual
     variability is of importance.
     So I wonder why we decided to provide the data in this format (I was part of that
     decision process, of course), especially since no detailed information is provided to
     describe those grid boxes/years in which data are missing. I don't think it is
     sufficient to refer to New et al. (2000) for more (and by no means complete)
     information. Nor is it fair to the impact analysts to expect them to "allow for this
     feature in their experimental design". The "feature" is hardly made clear in the
     documentation, and is extremely difficult to avoid, considering that the climate data
     were provided to partners as full 200-year pre-processed data sets. The problem is
     confounded by repeating the historical inter-annual variability into the future. This
     procedure is fine if there is historical variability to repeat, but this wasn't the case
     here for at least half of the 20th century for cloud, VP and DTR.
     I wonder if there is something that can be done to assist those partners who need
     realistic inter-annual data? One method would be to attempt to predict
     cloudiness and DTR from temperature and/or precipitation using regression relationships
     developed for periods with more reliable data. The correlations are not always very
     high, but at least this would provide annually varying surrogate series that are related
     in some way to the variables (T and P) for which we do (I assume) have full coverage.
     My colleague has been looking at this possibility with the detrended anomalies. The idea
     would be to create a surrogate series for e.g. 1901-1950, and then to repeat this series
     in 2001-2050, superimposed on the GCM-based trend that is already included in the data
     set.
     Do you have any comments on this approach? It is quick and dirty, and would require some
     documentation. But it would then offer at least the possibility for partners requiring
     these data to run their models for time series that are comparable across the project
     for 1901-2100. This would not be the case if, for example, the 1951-2000 data were used
     twice historically and twice in the future. Nor would it make much sense to apply
     1951-2000 inter-annual variability in cloud alongside 1901-1950 temperature and
     precipitation.
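
     The surrogate-series idea sketched above could look something like the
     following. This is a toy illustration under stated assumptions: all series
     are synthetic stand-ins for the real grid-box anomalies, and the regression
     coefficients and trend are invented for the example.

```python
import numpy as np

# Regress detrended cloud anomalies on temperature and precipitation
# anomalies over the well-observed half-century, then use the fitted
# relationship to synthesise annually varying anomalies for the
# data-sparse period. All series here are synthetic stand-ins.
rng = np.random.default_rng(1)
n = 50                                   # years, e.g. 1951-2000
temp_anom = rng.normal(0, 0.5, n)
prec_anom = rng.normal(0, 1.0, n)
cloud_anom = 0.8 * temp_anom - 0.3 * prec_anom + rng.normal(0, 0.2, n)

# Fit cloud = a*T + b*P + c by ordinary least squares.
X = np.column_stack([temp_anom, prec_anom, np.ones(n)])
coeffs, *_ = np.linalg.lstsq(X, cloud_anom, rcond=None)

# Predict a surrogate series for the earlier period (e.g. 1901-1950)
# from the T and P anomalies assumed available for those years...
temp_early = rng.normal(0, 0.5, n)
prec_early = rng.normal(0, 1.0, n)
surrogate = np.column_stack([temp_early, prec_early, np.ones(n)]) @ coeffs

# ...and repeat it in 2001-2050, superimposed on the GCM-based trend
# already included in the scenario data set (a linear stand-in here).
gcm_trend = np.linspace(0.0, 1.5, n)
future_cloud = surrogate + gcm_trend
```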
     Sorry for prolonging the agony of this debate. I don't think this in any way invalidates
     the data sets, or the paper describing them. But it does require us to highlight when
     they can be applied and when they cannot.
     Best regards,
     Tim
     At 13:27 30/03/04, Tim Mitchell wrote:

     Tim,
     I'll deal with the issues you raise below, but I think it is important to
     emphasise that:
     (a) these data-sets, warts and all, are already in the public domain and
     cannot be withdrawn, but can be improved and updated;
     (b) the J Clim paper submitted last July should be published ASAP, and
     certainly without undue delay on the part of the authors.
     The data-sets have been publicly available for over a year and the proper
     documentation (in the peer-reviewed literature) ought to be available. Also,
     we have a revised version of the observed 0.5deg grids, based on a complete
     overhaul of the underlying databases, which extends the period covered to
     2002. Thus far we have not felt able to release either these data or the
     accompanying paper (Mitchell and Jones, 2004) into the public domain until
     the previous version of the data-set has been accepted for publication.
     I would recommend that you inspect not just the J Clim paper submitted last
     July, but also Mark New's 2000 paper on the 0.5deg gridded time-series. This
     gives more helpful background on the methods used in the gridding, and might
     clear up some misunderstandings.
     You appear to have the impression that a time-series is calculated
     independently for each grid-box. That is not the case. A smooth surface of
     anomalies is calculated for each time-step, and the grid of values is
     derived from the smooth surface. See the New et al 2000 paper. We -
     including Mark New - have always presented these spatially complete
     time-series as best-estimates, with data quality varying in space and time.
     We will never have complete records of inter-annual variability, so if we
     were to wait until we did, you would never have a valuable - but imperfect -
     data-set to use.
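     A minimal one-dimensional sketch of the "relaxation to zero anomaly" idea is
     below. It is a simplified stand-in for the thin-plate-spline surfaces of
     New et al. (2000): station positions, anomalies, and the decay distance are
     all hypothetical, and exponential distance weighting replaces the real
     interpolation.

```python
import numpy as np

# Grid-box anomalies are interpolated from nearby station anomalies;
# where no station lies within reach, the estimate falls back to zero
# anomaly, i.e. the 1961-1990 climatological mean.
station_x = np.array([2.0, 3.0, 9.0])       # hypothetical station positions
station_anom = np.array([0.5, -0.2, 1.0])   # anomalies for one time-step
cdd = 1.5                                    # assumed decay distance

grid_x = np.arange(0.0, 20.0, 1.0)
grid_anom = np.zeros_like(grid_x)
for i, x in enumerate(grid_x):
    d = np.abs(x - station_x)
    if d.min() <= 3 * cdd:                   # some station influence
        w = np.exp(-d / cdd)                 # weight decays with distance
        grid_anom[i] = (w * station_anom).sum() / w.sum()
    # else: left at zero anomaly -> relaxed to the climatological mean
```

     Boxes far from every station thus carry a flat series equal to the
     climatology, which is exactly the behaviour the impacts groups noticed.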
     Regarding your numbered questions:
     1. Yes it has. See New et al, 2000. There is also some discussion of this in
     Mitchell and Jones, 2004.
     2. I would dispute your labelling of this as a 'problem'. It is actually a
     feature that was specifically allowed and controlled in the design of the
     data-sets. It only becomes a 'problem' in experiments that use these
     data-sets and that have not - for whatever reason - allowed for this feature
     in their experimental design. In some senses this feature is present for all
     boxes and at all periods of time, because the interpolated surface is always
     based on an imperfect representation of the true climate variability. DTR,
     and hence vapour pressure and cloud cover, are likely to be less well
     represented than temperature and precipitation.
     3. If we had such information then it would already be included! The 0.5deg
     grids are based on exactly the same underlying databases, so will offer no
     improvement. There is no substitute for the long-term painstaking
     improvement of the underlying databases. There are no quick fixes.
     Best wishes
     Tim
     On 24/3/04 6:47 pm, "Timothy Carter" <tim.carter@ymparisto.fi> wrote:
     > Dear Tim,
     >
     > I have just talked with some ATEAM colleagues who are applying the climate
     > scenarios in long-term simulations of forest growth over Europe. These
     > simulations have exposed some important problems with the gridded data.
     > This concerns the representation of inter-annual variability in the
     > historical and scenario time series. As I understand it, in some data
     > sparse regions for some periods (early in the historical record) the annual
     > anomalies have been "relaxed to zero". Checking the J. Climate paper, I see
     > that this is indeed reported, but the implications of this procedure have
     > not been apparent to me until now. The text on page 21 of the paper is as
     > follows ....
     >
     > "....If reflected in the time series of c, an abrupt transition in variability
     > would be introduced from one century to the next. This problem is
     > relatively small in Europe, so for TYN SC 1.0 advantage was taken of the
     > larger sample of interannual variability available from the entire 20th
     > century. ....."
     >
     > I'm in France at present, so can't check the data sets. However, it seems
     > that for some (all?) regions of Europe, the 10' cloudiness and DTR time
     > series at individual grid boxes is "flat" for the first half-century, and
     > inter-annual variability only begins in the second half of the century.
     > Moreover, this sequence then repeats into the 21st century, giving a sharp
     > discontinuity at 2001 from variable to flat and then variable again in the
     > second half of the 21st century.
     >
     > This procedure may well be justifiable from the climatological point of
     > view (lack of stations to interpolate between), but perhaps we should have
     > supplied only those parts of the time series for which inter-annual
     > variability could be defined. As it is, people are applying the full series
     > and noticing major effects when alternating between zero variability and
     > realistic variability.
     >
     > I also wonder about the advisability of making these data available and
     > reporting them in the paper until the time series of inter-annual
     > variability are complete for ALL grid boxes.
     >
     > Moreover, in the submitted paper, there is mention of a different procedure
     > that was used for the 0.5 degree global data set involving repeating the
     > 1951-2000 series in 1901-1951.
     >
     > I wonder if you could clarify:
     >
     > 1. For what regions is inter-annual variability information lacking? Has
     > this been mapped/summarised somewhere?
     > 2. For which climatic variables is this a problem? Note that even 1 grid
     > box could be a problem if people happen to be working in that area!
     > 3. Do you have suggestions for providing inter-annual variability
     > information in the periods currently lacking such data? Could we substitute
     > information from the 0.5 degree grid?
     >
     > Sorry to bring this up at this late stage, but some people are having real
     > problems with these data and I need to understand what has been done and
     > how to advise the ATEAM groups.
     >
     > Best regards,
     >
     > Tim
     >
     >
     ____________________________________
     Dr. T. D. Mitchell --- 07906 922 489
     tim.mitchell@surrey.ac.uk

   Prof. Phil Jones
   Climatic Research Unit        Telephone +44 (0) 1603 592090
   School of Environmental Sciences    Fax +44 (0) 1603 507784
   University of East Anglia
   Norwich                          Email    p.jones@uea.ac.uk
   NR4 7TJ
   UK
   ----------------------------------------------------------------------------
