cc: " Moberg; Anders " <anders.moberg@natgeo.su.se>, Gabi Hegerl <hegerl@duke.edu>, esper@wsl.ch, " Briffa; Keith " <k.briffa@uea.ac.uk>, " Osborn; Tim " <t.osborn@uea.ac.uk>, m.allen1@physics.ox.ac.uk, weber@knmi.nl
date: Thu, 17 Aug 2006 18:38:17 +0100
from: Martin Juckes <m.n.juckes@rl.ac.uk>
subject: mitrie -- response to comments from Eduardo
to: Eduardo Zorita <Eduardo.Zorita@gkss.de>

On Thursday 17 August 2006 11:31, Eduardo Zorita wrote:
> 
> 
> Due to the ongoing debate, this has turned into an even more difficult 
manuscript. In general, I think Martin did a very good job in the review of 
the literature. Concerning the new reconstructions and the evaluation of 
McIntyre's work, I would not fully agree with some of the conclusions, which I 
think do not follow from the material presented in the text. I have some 
remarks on this which you may consider useful. But I think that I am not the 
one that should give the manuscript the final shape, as Martin is the person 
in charge of the project. Please consider the following comments as 
suggestions.
> 
> eduardo
> 
> 
> 
> Consensus: I would tend to avoid the word 'consensus', since it is not a 
well defined concept. 
> Depending on the meaning of consensus, each would agree with it to a certain 
degree. I would prefer to refer to a particular IPCC conclusion, or 
something similar. I think this review of the literature is very well written 
and informative, but I am not sure that each one of us will agree with each 
one of the conclusions of each of the papers.
> 
I've removed a couple of uses of `consensus' and tried to make the text 
clearer. There is an IPCC consensus (i.e. something members of the IPCC 
agreed on) -- and I think it is worth making a distinction between this and 
other peer reviewed results such as MBH1999 conclusions etc. I've now said 
that the papers reviewed in section 2 support the IPCC conclusion [that the 
1990s are likely to have been the warmest decade of the millennium in the 
Northern Hemisphere.]

Later I've referred to a "general consensus": this is shorthand for saying 
that almost everyone agrees and, where there is disagreement, there are good 
grounds for dismissing it with regard to the specific point under discussion, 
but not by any means implying that people agree on other issues.

> Page 12, section 2.8. I think the text is somewhat vague here, and it could 
be misunderstood.
> Mann et al (2005) tested the RegEM method, not the original MBH98 method. It 
is true that applied to the
> real proxies both methods, according to Mann, yield very similar results. 
But strictly speaking, Mann did not test the MBH98 method in the CSM 
simulation. The MBH98 method is thereby tested only by implication.

Text corrected.
> 
> I tested the sensitivity of MBH98, and not of RegEM, to the length 
of the calibration period. It may be that RegEM is less sensitive, or not 
sensitive at all. Figures 4 and 5, if I understood well, support this 
dependency of MBH on the calibration period. Am I correct to interpret the 
large differences between the original MBH reconstruction (dashed red) and 
the black curve as due to the different calibration period (1901-1980 versus 
1856-1980) and to the use of the leading PC or NHT as calibration target? At 
least in the period prior to 1600 I think these are the only methodological 
differences between both curves (?).

I don't think so: the main difference is that the MBH1999 reconstruction uses 
more data for the more recent period, and also reconstructs more degrees of 
freedom. This should be stated in the text -- I'll check. I've tidied up this 
figure a bit. (Now figure 3, as I've omitted the previous figure 3).

> My interpretation of this figure is also somewhat different. If the final 
reconstruction differs so strongly when using a longer calibration period (in 
general yielding stronger decadal variability in the reconstruction) I would 
tend to think that the method based on these proxies is quite unstable. What 
would happen if the calibration period could have been extended to 1800, for 
instance?

The main sensitivity which is clearly defined by the calculations I've done is 
that the adjustment of the North American tree-ring proxy 1 in MBH1999 shifts 
the AD1000 to AD1800 reconstruction up roughly 0.2K. This is now commented 
on. I'd like to look more closely at the 15th to 18th centuries, but I think 
this is best achieved by bringing in more proxies -- and I don't want to 
extend the scope of this study that far. I agree with you that there is an 
interesting and challenging issue about the 15th to 18th centuries, and hope 
to follow that up later (i.e. after submitting this).

> 
> 
> Page 15: top. The role of forcing on the global or NH T is also recognized 
in the correlation between the NHT simulated by ECHO-G and CSM for the 
millennium. For the case of a second ECHO-G simulation (Gonzalez-Rouco et 
al.) the agreement is very close at the 30-year timescale.
>
OK, I'll add the citation.
> Section 3, beginning.
> In my opinion, MM05 stress the inadequacies and uncertainties in the MBH 
work, but they do not put forward their own reconstruction implying a 
warmer-than-today MWP. They believe that this is true, but in their works so 
far, at least to my knowledge, they do not assert that the MWP was warmer 
than present, only that the uncertainties are too large for such a claim.

Figure 8 of McIntyre and McKitrick (2003) says "Corrected version: 20th 
century no longer highest". Their 2005 paper does not reproduce this, so 
their published statement implying they have reproduced the results is false.
They left out most of the data by mistake and got garbage -- it is fairly 
clear.
> 
> Section 3: Consensus. This paragraph may be problematic. Again, what is the 
consensus? If we look at the recent NAS report, which again not everyone 
would agree with, the 'consensus' is reduced to the past 400 years in 
comparison to IPCC, leaving ample space for speculation before this period. 
Does the NAS report belong to the consensus? Perhaps partially, but I am not 
sure to what extent.
> 
The fact that there is ample space for speculation does not mean there cannot 
also be a consensus. I don't think a report should "belong" to the consensus; 
the consensus is the body of statements which are agreed on.

> Section 3, discussion of MM05 and hockey-stick index. I have here a certain 
level of disagreement with these paragraphs. The issue raised by MM05 would 
be that the de-centering of the proxies prior to the calculation of the 
principal components tends to produce a hockey-stick-shaped leading PC. I 
think this effect is true, at least with spatially uncorrelated red-noise 
series. It can be easily verified and it has been recognized in the NAS 
report, the Wegman report and by Francis Zwiers. To be fair, following this 
issue is the problem of the truncation: whether to keep just the leading PC 
or further PCs down the hierarchy. If this is done, the differences in the 
final reconstructions could probably be minor. But the paragraph implies, in 
my opinion, that this criticism by MM05 has no grounds, which as I said is 
problematic and could open the manuscript to criticisms based on these 
recent reports.
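
The red-noise check mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the calculation used in the manuscript or in MM05: the AR(1) noise model, the series count, the record length, the calibration window and the "hockey-stick index" definition are all arbitrary choices made for the example, and the function names are hypothetical.

```python
import numpy as np


def red_noise(n_years, n_series, phi, rng):
    """Independent AR(1) series, one per column (no climate signal at all)."""
    eps = rng.standard_normal((n_years, n_series))
    x = np.empty_like(eps)
    x[0] = eps[0]
    for t in range(1, n_years):
        x[t] = phi * x[t - 1] + eps[t]
    return x


def leading_pc(data, ref_rows):
    """Leading PC after centring and scaling each column with statistics
    taken from `ref_rows` only (a slice along the time axis)."""
    ref = data[ref_rows]
    z = (data - ref.mean(axis=0)) / ref.std(axis=0)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, 0] * s[0]


def hockey_stick_index(pc, n_cal):
    """|mean of the last n_cal years - mean of all years| in units of the
    full-period standard deviation (absolute value: the PC sign is arbitrary)."""
    return abs(pc[-n_cal:].mean() - pc.mean()) / pc.std()


# Illustrative values only, not taken from any of the papers discussed.
n_years, n_series, n_cal, phi = 600, 50, 80, 0.9
rng = np.random.default_rng(0)
proxies = red_noise(n_years, n_series, phi, rng)

pc_full = leading_pc(proxies, slice(None))           # whole-period centring
pc_short = leading_pc(proxies, slice(-n_cal, None))  # calibration-only centring

print("index, whole-period centring:", hockey_stick_index(pc_full, n_cal))
print("index, short centring:       ", hockey_stick_index(pc_short, n_cal))
```

Comparing the two printed indices over several seeds and values of phi gives a feel for how strongly the centring choice shapes the leading PC of pure noise. It says nothing by itself about how much of that difference survives into a final reconstruction once further PCs are retained.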

It's a theoretical possibility in certain parameter regimes, yes, but it's not 
relevant here. I have no problem with MM05 raising this issue; the problem is 
the inaccurate and misleading material they put in their papers.
> 
> I think that the calculation shown in Figure 3 is very useful, as it boils 
down to the issue raised by MM05: how relevant is the de-centering and 
standardization with real proxies? Apparently, I get a different message 
from Figure 3 (although I may have misinterpreted the text). I see quite large 
differences in the 20th century between the original MBH leading PC and the 
'correct' calculation (whole period centering and standardization, blue line). 
Only the original MBH PC shows a positive trend in the 20th century. The blue 
line even seems to show a negative trend or no trend at all. If these PCs 
were to be used in the MBH regression model (with trend included in the 
calibration) the results could be quite different. I would tend to think that 
this figure actually supports the MM05 criticism, since the hockey-stick 
shape of the leading PC disappears.

I've moved this, and the associated reconstructions, into "supplementary 
material", mainly to avoid having to discuss all the issues around the AD1400 
to present proxies, and also the difference between reconstructing multiple 
temperature PCs and then evaluating their mean and direct reconstruction of 
the mean temperature. There is some sensitivity to the principal components, 
but very little in the reconstruction. 
> 
> Section 3, end, bristlecone pines. I am also worried by this paragraph. The 
recent NAS report clearly states that the bristlecone pines should not be 
used for reconstructions in view of their potential problems. They cite 
previous analyses on this issue. I think that referring to just one study 
indicating no fertilization effect might not be enough. However, I am not a 
dendroclimatologist. This could open the door to potential problems.
> 
I spoke to Ed Cook last year and he didn't know of any specific evidence 
of CO2 fertilization in mature trees. I haven't seen the NAS report (what's 
its title?) -- it would be interesting to see what they base their argument 
on. As far as I know the one report I've cited covers the only study on 
mature trees in controlled conditions (it's not easy to keep large trees in an 
enhanced CO2 environment).

> Section 4, end. Years 1997 and onwards were the warmest in the millennium. 
I see here also potential problems with this claim, and I do not see the 
need to make our lives more complicated. The NAS report expressed that the 
uncertainties are too large for this type of conclusion and certainly this 
conclusion would attract some attention from the reader. I see two lines of 
criticism on this: one is that the standard errors have been calculated with 
the calibration residuals and these are an underestimation of the true 
uncertainties. A reviewer may require that the uncertainty range be 
calculated by cross-calibration or bootstrapping. In the case of CVM perhaps 
this effect is not very important, as there is just one free parameter, but 
in the case of inverse regression there are many more free parameters 
and the true uncertainties can be quite different from those estimated from 
the calibration residuals. This potential criticism could be exacerbated by 
the fact that the new reconstruction has not been tested in a validation 
period.

I haven't quoted any uncertainties for the inverse regression result for this 
reason. The statements in the text should be simple statements of factual 
results: the maximum temperature in the preindustrial time is x and the 
highest temperature in the instrumental record is more than 4 sigma greater,
where sigma is ....

> 
> The other line of criticism could be that the calibration period has been, 
as in all reconstructions, a priori truncated: data after 1980 are not 
considered as the proxies are known to not follow the temperature. Strictly 
speaking this truncation can be only justified by a credible physical 
explanation about the cause of this divergence. Statistically, I think it is 
not correct to a priori ignore some data because they do not fit. If one does 
so, I think the uncertainty range should be enlarged to encompass the 
possibility that this divergence could have happened in the past, i.e. an 
additional standard deviation of the instrumental NH T in the period 
1980-2000 (or perhaps more correctly, the square root of the sum of the error 
variance and the NHT variance in 1980-2000). Alternatively, one could include 
the period 1980-2000 in the calibration and due to the divergence the 
standard errors would grow, but perhaps this is practically not possible as 
the proxy time series may not have been archived for the last 20 years.
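
Written out, the enlarged uncertainty suggested in the parenthesis above would look something like the expression below. The symbol names are only illustrative labels, not notation from the manuscript: sigma_err stands for the reconstruction standard error (here assumed to come from the calibration residuals) and sigma_NHT for the standard deviation of the instrumental NH temperature over 1980-2000.

```latex
\sigma_{\mathrm{total}}
  = \sqrt{\sigma_{\mathrm{err}}^{2} + \sigma_{\mathrm{NHT},\,1980\text{--}2000}^{2}}
```

Adding the variances in this way assumes the two error sources are independent, which is itself an assumption that would need checking.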

Which data do you think I'm ignoring "because they don't fit"? This is a 
rather unnecessary accusation. The problem is that many of the proxies are 
not annually updated, so that, with the methods used in this study, extending 
the calibration period reduces the number of proxies. I'll do a couple of 
sensitivity studies, but ideally we need to develop a means of exploiting 
proxies which do not cover the whole calibration period. RegEM might do this, 
but I'm not entirely convinced as yet -- mainly because the complexity makes 
it difficult to know exactly what is going on.

> 
> Section 5, conclusions.
> 
> I share the worry of Anders Moberg about the wording 'serious flaws' in the 
analysis of MM05. This sentence would be based on Figure 3, if I understood 
properly, but as I said I think Figure 3 actually does not support this 
conclusion.
> 
As far as MM05 goes, they make a clearly inaccurate claim about reproducing 
their earlier results. I only want to say the paper has "serious flaws", not 
that everything in it is wrong.
> 
> Finally, I think it would be strategically better to avoid conflicts on the 
particular point of whether some particular year was the warmest of the 
millennium or not, and to stress the fact that all reconstructions, also the 
new ones presented in the manuscript (with one exception), show MWP 
temperatures lower than late 20th century temperatures.
> 
Up to a point (the year of the maximum is only given for information, to 
describe the reconstruction). 
> 
> Another conclusion could be, in my view, that the average temperature in the 
cold centuries in the millennium seems to be still quite uncertain. The new 
reconstructions, or the calculation of the leading PCs of the proxies, seem 
to be still quite sensitive to particular choices in the statistical set-up.
> 
>
Yes, I'll try to emphasise this -- it is now in the first paragraph of the 
conclusions.
> 
> 
