Response to John Christy’s blog post regarding ‘Klotzbach Revisited’


Guest blog by Jos Hagelaars

Dr. John Christy wrote an extensive blog post in response to my Dutch ‘Klotzbach Revisited’ post (English version here); it is published on “Staat van het Klimaat” and WUWT. I would like to thank Dr. Christy for his interest in my writings.

I have some remarks regarding Dr. Christy’s post, which are addressed in this ‘response-post’ and are built upon some quotes taken from Dr. Christy’s response.
For reference, the original Klotzbach et al 2009 paper (K-2009 in the text) can be found here and the correction paper (K-2010) can be found here.

“Klotzbach et al.’s main point was that a direct comparison of the relationship of the magnitude of surface temperature trends vs. temperature trends of the troposphere revealed an inconsistency with model projections of the same quantities.”

This ‘main point’ is not present at all in the K-2009 paper. The only reference in the paper to actual data from a climate model is the amplification factor, which was ‘sort of obtained’ by Ross McKitrick from the GISS-ER model. The abstract gives a short conclusion: “These findings strongly suggest that there remain important inconsistencies between surface and satellite records.” No word about models.

In my opinion the main point of K-2009 is the suggestion that the surface temperature record is biased. One third of the paper is taken up by paragraph 2, titled “Recent Evidence of Biases in the Surface Temperature Record”. K-2009 explicitly state:
“In our current paper, we consider the possible existence of a warm bias in the surface temperature trend analyses …”

“It appears Hagelaars’ key point is that when the data from Klotzbach et al. are extended beyond 2008 to include data through 2012, the discrepancies, i.e. the observed difference between surface and tropospheric trends relative to what models project, are reduced somewhat.”

My key point is that if K-2009 were correct, the absolute temperature difference between surface and troposphere would be expected to increase over time, because the presumed bias in the surface temperature data has not magically disappeared (see e.g. this paper by Watts et al). Instead, this temperature difference has decreased by about 33% for the ‘NCDC minus UAH’ data (which showed the largest discrepancy). This absolute difference was used by Marcel Crok in his book “De Staat van het Klimaat” to suggest that the surface temperature record is biased.
Why this large difference with only 13% (4 years) more data? If anything, it casts doubt on the robustness of the K-2009 results.

The other point I wanted to make is that the apparent discrepancies could also, perhaps even more likely, be due to biases present in the satellite data, as indicated by Santer et al 2005. New biases are also still being discovered; see the Po-Chedley & Fu paper.
Why is there no attention given to the potential presence of biases in the satellite data in Dr. Christy’s blog post?

“.. there have been many studies which have looked at the relationship between the magnitude of the surface temperature trend relative to that of the tropospheric layer as defined above (e.g. Douglass et al. 2007.)”

Indeed, and I gave some references in “Klotzbach Revisited”, for instance Santer et al 2005, Karl et al 2006 and a review by Thorne et al 2011.
Douglass et al contains serious flaws, as indicated by Santer et al 2008. I would recommend this YouTube video of a presentation by Santer to people who are interested in the controversy around this Douglass-2007 paper.

“As noted however, several additional calculations confirm the value of 1.1 utilized by Klotzbach et al. 2010.”

The K-2009 paper uses 1.2 as the amplification factor over land and the correction paper K-2010 uses 1.1. I never found this very plausible. Gavin Schmidt, who is responsible for the GISS-ER model, gave Phil Klotzbach a value of 0.95 by e-mail; still, K-2010 uses the value of 1.1 as calculated on McIntyre’s blog. On the same blog Gavin Schmidt gives the following amplification factor:

“A range of [0.784,1.234]… and a mean (if you think that is sensible) of 0.9708 . Lest anyone think that volcanoes or something are affecting this, the same calculation for 2010-2100 is a range of [0.914,1.097] and a mean of 0.9897.”
This 0.97 is close to his previous 0.95.

The selection of an amplification factor in K-2009 and K-2010 is arbitrary: why use the factor 1.1 and not the factor calculated by the person responsible for the GISS-ER model?
In my opinion neither of these factors should be used in scientific papers, like the Klotzbach et al papers, since they are obtained from blogs and are not backed up by peer-reviewed science.

“It is true that these differences are a little closer to zero than shown in Klotzbach et al., but that is due to the fact that there has been no warming in the past 10 years in both types of data”.

I’m glad that Dr. Christy endorses my results, but I do not agree with the ‘but’ part. I do not think conclusions regarding a (change in) climatologically relevant trend can be drawn based upon 10 years of data. However, since Dr. Christy explicitly refers to this very short timescale I will present some trend data regarding the last 10 years (Feb. 2003 – Jan. 2013). Values in °C/decade, first global, then over land, followed by the ocean part:
UAH: +0.055 / +0.159 / -0.003
NCDC: -0.038 / +0.048 / -0.068
NCDC minus UAH: -0.092 / -0.111 / -0.064
It is quite clear that over the last 10 years the UAH dataset (the lower troposphere) gives a positive global warming trend and NCDC does not. Both datasets give a positive warming trend over land and that is not the same as ‘no warming’. The trend over land for UAH is 3.3 times higher than the trend over land by NCDC and this certainly does not reflect the amplification factors 0.95 or 1.1 mentioned in the blog posts.
The land area on earth has warmed during the past 10 years and the relationship between the surface data and the satellite data has turned upside down: over 34 years the trend difference (NCDC minus UAH) was +0.10 °C/decade, while over the last 10 years it has reversed to -0.11 °C/decade. Why?
I did not use amplification factors in calculating the trend differences (see figure 5 in the original blogpost) due to the fact that these factors over land and ocean are, as far as I can tell, not backed up by peer-reviewed science.
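The arithmetic behind the 10-year comparison can be re-checked from the rounded trends listed above. The following is only a minimal sketch: the dictionary layout and function names are my own choices for illustration, and the inputs are the three-decimal values quoted in this post, not the underlying unrounded trends.

```python
# Rounded 10-year trends quoted above (Feb. 2003 - Jan. 2013), in degC/decade.
# Each tuple is ordered as (global, land, ocean).
TRENDS = {
    "UAH": (0.055, 0.159, -0.003),
    "NCDC": (-0.038, 0.048, -0.068),
}
# "NCDC minus UAH" differences as quoted above.
QUOTED_DIFF = (-0.092, -0.111, -0.064)


def trend_differences(surface, troposphere):
    """Return the surface-minus-troposphere trend for each region."""
    return tuple(round(s - t, 3) for s, t in zip(surface, troposphere))


def land_trend_ratio(troposphere, surface):
    """Return the troposphere/surface trend ratio over land (index 1)."""
    return troposphere[1] / surface[1]


diffs = trend_differences(TRENDS["NCDC"], TRENDS["UAH"])
ratio = land_trend_ratio(TRENDS["UAH"], TRENDS["NCDC"])

print("NCDC minus UAH:", diffs)
print("UAH/NCDC over land:", round(ratio, 1))
```

With the rounded inputs the differences come out as -0.093, -0.111 and -0.065, each within 0.001 of the quoted values (the small gap presumably reflects rounding of the underlying trends), and the land ratio is about 3.3.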

“Therefore models, on average, depict the last 34 years as warming about 1.5 times what actually occurred.”

Dr. Christy is comparing trends based upon averages from climate models with the trend in the observations. Averaging model runs also averages out the natural variability present in each model simulation. For instance, the simulated influence of ENSO is no longer visible in the averaged data (e.g. see this graph and note the absence of any El Niño variability in the averaged model data). It is not realistic to expect that the observations will nicely follow the model average. Comparing the trend of a model average with the observational surface temperature trend and concluding from such a (simplistic and incomplete) comparison that “the climate sensitivity of models is too high” is, in my opinion, jumping to conclusions.
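The averaging effect described above can be illustrated with a toy example. This is a sketch with purely synthetic data, not output from any climate model: the trend, the oscillation period and amplitude, the noise level and the number of runs are all hypothetical values chosen for illustration.

```python
import math
import random
import statistics

TREND_PER_YEAR = 0.02  # hypothetical forced trend, degC/year


def simulate_run(rng, n_years=34, noise_sd=0.15):
    """One synthetic 'model run': trend + random-phase oscillation + noise."""
    phase = rng.uniform(0.0, 2.0 * math.pi)  # each run has its own "ENSO" timing
    return [
        TREND_PER_YEAR * year
        + 0.15 * math.sin(2.0 * math.pi * year / 4.0 + phase)
        + rng.gauss(0.0, noise_sd)
        for year in range(n_years)
    ]


def detrended_spread(series):
    """Standard deviation of a series after removing the known linear trend."""
    residuals = [value - TREND_PER_YEAR * year for year, value in enumerate(series)]
    return statistics.pstdev(residuals)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
runs = [simulate_run(rng) for _ in range(20)]
# Average the runs year by year to get the "ensemble mean".
ensemble_mean = [statistics.mean(values) for values in zip(*runs)]

single_spread = detrended_spread(runs[0])
mean_spread = detrended_spread(ensemble_mean)
print("variability in one run:      ", round(single_spread, 3))
print("variability in ensemble mean:", round(mean_spread, 3))
```

Because the oscillation has a different phase in every run, it largely cancels in the average, just like the random noise. A single realisation, and the real world is a single realisation, keeps its full variability around the trend.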

In the animation below the temperature observations are compared with two simulations from a climate model. The blue simulation matches the observed trend over the most recent decade and the red does not. In the long run the expected warming is roughly the same. The original (created by Ed Hawkins) can be found at:


“Since this increased warming in the upper layers is a signature of greenhouse gas forcing in models, and it is not observed, this raises questions about the ability of models to represent the true vertical heat flux processes of the atmosphere and thus to represent the climate impact of the extra greenhouses gases we are putting into the atmosphere.”

The increased warming in the upper layers of the troposphere is due to the lapse rate feedback and is not a signature restricted to the influence of greenhouse gases. Since the lapse rate feedback is a negative feedback, a smaller lapse rate feedback would in fact result in a larger climate sensitivity as obtained from models. More info regarding this subject can be found on Skeptical Science and RealClimate, for Dutch readers the subject is also covered at Klimaatportaal.
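The direction of this effect can be made concrete with the standard linear feedback relation, in which the equilibrium warming equals the forcing divided by minus the sum of the feedback parameters. The sketch below is only an illustration: every number in it is a round, hypothetical value and not taken from any specific model or paper, and holding the other feedbacks fixed is a simplification, since in real models the water vapour and lapse rate feedbacks tend to co-vary.

```python
def equilibrium_sensitivity(forcing_2xco2, feedbacks):
    """Equilibrium warming (K) for doubled CO2.

    `feedbacks` maps names to feedback parameters in W/m^2/K; negative
    values damp the warming, positive values amplify it.
    """
    net = sum(feedbacks.values())
    if net >= 0:
        raise ValueError("net feedback must be negative for a stable climate")
    return forcing_2xco2 / -net


FORCING_2XCO2 = 3.7  # W/m^2, illustrative round number

# Hypothetical feedback parameters; only the lapse rate term differs.
strong_lapse_rate = {"planck": -3.2, "lapse_rate": -0.6, "other": 1.8}
weak_lapse_rate = {"planck": -3.2, "lapse_rate": -0.3, "other": 1.8}

sens_strong = equilibrium_sensitivity(FORCING_2XCO2, strong_lapse_rate)
sens_weak = equilibrium_sensitivity(FORCING_2XCO2, weak_lapse_rate)
print(round(sens_strong, 2), round(sens_weak, 2))
```

With these assumed values the weaker (less negative) lapse rate feedback gives the larger sensitivity, which is the point made above.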

This statement of Dr. Christy is quite an extraordinary claim, and according to Carl Sagan “extraordinary claims require extraordinary evidence”. Judging by Dr. Christy’s blog post, the claim is based on satellite and surface data that could contain biases, on amplification factors taken from blogs rather than from peer-reviewed papers, and on comparing observations with model averages. I would not classify this evidence as extraordinary, especially since many other lines of evidence contradict Dr. Christy’s claim (see e.g. here or here).

I’m left with a lot of questions regarding this topic:

  • What explains these large differences in the comparison of satellite and surface temperatures with only 13% more data?
  • Are there potential biases in the satellite data, e.g. due to a change in satellites or other factors?
  • Is the (autocorrelated) noise in the data playing tricks on me/us?
  • What is the skill of climate models in simulating the vertical temperature structure and the influence of the planetary boundary layer?
  • What are the actual amplification factors for land and ocean over the satellite period?
  • Are the amplification factors constant in time or do they vary, e.g. are they influenced by ENSO? If they vary, what is the magnitude and cause of the variation?
  • Is it possible to make a strict division between land and ocean for the complete lower troposphere, e.g. is there some averaging out from ocean to land and vice versa at higher altitudes due to convection?

There are several scientific puzzles remaining; hopefully science can resolve some of them in the near future. Until then, let’s not jump to unsubstantiated conclusions.



8 Responses to “Response to John Christy’s blog post regarding ‘Klotzbach Revisited’”

  1. Eli Rabett Says:

    You are hinting at it, but as far as Eli knows Christy has never released the code for his and Spencer’s transformation of the (A)MSU measurements to temperatures in various layers. While the consistency of the RSS and UAH records in the troposphere appears now to be good, the same cannot be said for the stratosphere where there is definitely a huge difference (and the models are, shall we say, doing a difficult straddle).

    Still have to catch up with my LIDAR friend as that may be a reasonable way of looking at this, also COSMIC.

  2. gaia.sailboat Says:

    Dear Mr. Christy:
    Climate blogs focus on three subjects:
    1. Scientific measurements of the worsening climate problem
    2. Attacks on climate deniers and their unscientific statements
    3. Gadgets and methods for living with climate change impacts

    What is almost completely missing is descriptions of what would happen should CO2 emissions continue and escalate non-stop and emission control fail completely. No one, it seems, wants to go there. Oh, there are fleeting doomsday references, but no serious, full explorations.

    The reason may be blogs would be out of business should they countenance the idea emission control may fail (a very reasonable assumption in light of nearly zero progress curbing emissions). No one wants to suggest something may come to pass that makes his blog meaningless and irrelevant.

    We all want to go forward on a “business as usual” basis because we all have a daily part to play in “business as usual.” (James Lovelock has a wonderful discussion of the role of “business as usual” in his last two books).

    I’d be grateful to know your views on lack of discussion of emission control collapse. A blog, such as yours, could play a role in developing a “Plan B” for when emission control is obviously dead and undeniably serious impacts begin to kick in.

  3. Paul S Says:

    Hi Jos,

    Interesting articles. I’ve looked at some of the issues involved in the past so I thought I would give my views on the questions posed at the bottom:

    Potential biases in the satellite data – While the RSS and UAH datasets produce very similar trends at a global average over the full record, this only occurs due to compensating temporal and regional differences, which can be visualised on RSS’s website. Since both can’t be correct it must be the case that there are biases affecting one or the other, or (more likely) both. The radiosonde data tends to support the higher RSS trends in the Tropics and the higher UAH trends in the NH Extratropics. The SH Extratropics are more of a mixed bag, probably due to poor sampling. [without delving into the point that both are markedly inconsistent with the NOAA STAR TMT analysis]

    Constancy of amplification factor – At the global average scale amplification definitely varies over time. While there may be a purely time-varying aspect I think the more important effect relates to regionally-varying amplification factors e.g. amplification up to ~2 over ocean in the Tropics, whereas it is closer to 0.5 over land in NH high latitudes. You can see that the global average amplification factor will depend on where warming is taking place at the surface. As it happens, observed surface warming is strongly biased towards NH high latitudes, so we should not expect a large observed global amplification factor based on the actual surface temperature change over the satellite period. I’ll note at this point that there is a clear link here between TLT/Tsurf amplification and the land-sea warming contrast, both of which CMIP models get wrong in roughly similar proportion.

    Division between land and ocean – There will definitely be mixing between land and ocean areas, one reason why the expected TLT land-sea warming contrast (from CMIP models) is much smaller than that expected at the surface. This is one aspect which suggests the land surface temperatures are not far wrong – the discrepancy between expected and observed TLT land-sea warming contrast is in proportion with the same discrepancy at the surface. In other words, the observations are doing pretty much what we would expect in relation to each other, if only models were getting the surface warming distribution right. Of course, that also means I don’t think the satellite TLT data is far wrong.

    This can be shown explicitly in AMIP type model runs, where SSTs are prescribed from observations but everything else is free. The regional distribution and magnitude of surface warming is therefore much closer to reality so the main determining factor becomes the model’s pure ability to simulate tropospheric temperatures. Isaac Held made a blog post about this a while ago. This is certainly not to say that the issue is resolved: the other Po-Chedley and Fu 2012 paper collates more AMIP-type runs and finds almost all exhibit a higher trend in the Tropics than the RSS/UAH analyses, although the discrepancy is smaller than with CMIP models.

    For me the TLT-amplification question as commonly discussed – at the global average scale – is largely a sideshow – a symptom of a bigger issue. While AMIP modelling still indicates some discrepancies between models and observations, these pale by comparison to the model-observation discrepancy caused by the difference in surface warming distribution. For me the more important questions are: Why is the land surface warming so much faster than the sea surface? And why has the sea surface warmed so little? Finding an answer to that last question would go a long way to solving many other questions, including the TLT-amplification one.

  4. Jos Hagelaars Says:

    Hello Paul,

    Thanks for your informative response and sorry for my late reply. Been rather busy with a wheelchair (see the Marcott post).

    I had a look at the radiosonde data, see my reply to Eli Rabett in the Klotzbach Revisited post. I got a slope for NCDC minus HadAT2 (their MSU equivalent, data here) of -0.03 °C/decade and using that ‘average global amplification factor’ of 1.25 I got a slope of exactly 0.00 °C/decade. It would be interesting if a scientist would look into the land/ocean differences of HadAT2 and the surface temperatures.

    I missed Isaac Held’s post about this, so I will study it, thanks again.

    The TLT amplification question is interesting, but indeed not a main issue. I do not think it should be used to cast doubt on the surface temperature record, though.

  5. Paul S Says:

    Hello Jos,

    It’s worth being cautious with something like a “global average” of radiosondes if what you want to infer is a genuine global average temperature. The problem is that the spatial sampling is fairly sparse and very probably biased, in particular towards land areas.

    The RSS page I linked above (here again) gives a good indication of the effect of this bias. They show what you get if you sample the RSS and UAH data at HadAT radiosonde sites and in both cases you get higher trends than with full coverage. I would say then that comparing NCDC to HadAT in this way is not a good test. That said, there does seem to be a post-2005 discrepancy forming between all the radiosonde and satellite datasets. There’s possibly some scope there to suggest the satellite data might be biased low over the past several years, but there’s only a couple of hundredths in it, nothing major.

    I thought I should elaborate, with some numbers, on my point that the satellite data really doesn’t indicate a significant problem with surface temperature records when a more detailed comparison is made with model expectations.

    At global land+ocean average, CMIP3 models tended to produce a TLT/Tsurf amplification factor of ~1.25 as you indicate. Breaking down to separate land and ocean areas, the amp factors are ~0.95 and ~1.4 respectively. If we then look at observed trends over 1981-2008 (the reason for this chosen period is a comparison I will make to the aforementioned AMIP model run, which ends in Dec 2008, and removal of the first two years due to discrete anomalous behaviour of unknown origin) we get a land+ocean amp factor of ~0.9 for RSS/NCDC and UAH/NCDC, a land amp factor of ~0.7-0.75 and an ocean amp factor of ~1.1-1.15. You can see the model-obs discrepancy here is of roughly equal size across both geographies.

    Another way to look at the issue is through land-sea warming contrasts at both levels. NOAA land/ocean surface temperatures produce a contrast factor of ~2.6, whereas RSS and UAH produce ~1.65. CMIP3 models averaged at about 1.6 and 1.1 respectively for surface and TLT, so again there are large discrepancies between “expected” and observed values here in roughly similar proportion over both areas. So we can see the relatively large land-area warming in surface station observations is actually supported by the TLT data, since that also indicates there has been a disproportionate amount of warming over land. For any surface temperature bias to be the main issue here it would need to be found in roughly similar proportion over land and ocean measurements, or there would have to be, coincidentally, similar-size unrelated biases in each. This seems unlikely.
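The "roughly similar proportion" argument in this comment can be checked by dividing each observed value by the corresponding model value. The sketch below uses only the approximate figures quoted in the comment; taking the midpoint of the quoted observed ranges, and the variable and function names, are assumptions made for illustration.

```python
# Approximate values quoted in this comment (all are dimensionless ratios).
MODEL_AMP = {"land": 0.95, "ocean": 1.4}                  # CMIP3 TLT/Tsurf
OBS_AMP = {"land": (0.70, 0.75), "ocean": (1.10, 1.15)}   # observed ranges, 1981-2008
MODEL_CONTRAST = {"surface": 1.6, "TLT": 1.1}             # CMIP3 land/sea contrast
OBS_CONTRAST = {"surface": 2.6, "TLT": 1.65}              # NOAA surface; RSS and UAH TLT


def midpoint(bounds):
    """Midpoint of a (low, high) range; an assumption, not a quoted value."""
    low, high = bounds
    return (low + high) / 2.0


def obs_to_model(observed, modelled):
    """Observed value divided by modelled value, per key."""
    return {key: observed[key] / modelled[key] for key in modelled}


amp_ratio = obs_to_model({k: midpoint(v) for k, v in OBS_AMP.items()}, MODEL_AMP)
contrast_ratio = obs_to_model(OBS_CONTRAST, MODEL_CONTRAST)

for name, ratios in (("amplification", amp_ratio), ("land-sea contrast", contrast_ratio)):
    print(name, {key: round(value, 2) for key, value in ratios.items()})
```

With these inputs the observed-to-model ratio for amplification is about 0.76 over land and 0.80 over ocean, and for the land-sea contrast about 1.6 at the surface and 1.5 at TLT, consistent with the discrepancies being of similar relative size in both cases.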

    As a final point I’ll introduce outputs from a realisation of the GFDL AMIP model I mentioned above, in which the distribution of surface warming is effectively prescribed by setting SSTs to observed values: TLT/Tsurf amplification factors are 1.1 land+ocean, 0.85 land, and 1.3 ocean. Land-sea warming contrasts are 2.65 at surface level, and 1.65 at TLT. So we can see that modifying the spatial distribution of warming can comfortably reconcile modelled and observed land-sea warming contrasts. There remains a discrepancy in the difference between modelled and observed TLT/Tsurf amplification, but significantly smaller than in CMIP models.

  6. Jos Hagelaars Says:

    Hello Paul,

    The Met Office presents MSU comparable data based on HadAT2 and I just presumed they took care of the spatial coverage to some extent. When comparing HadAT2-MSU with UAH and RSS over 1981 to March 2012, I get differences in trends, resp. +0.18, +0.14 and +0.13 °C/decade. But, looking at the RSS page you link to, which contains sampled data comparable to HadAT2 (the opposite way in comparing the data), it could well be that my assumption is incorrect. HadAT2 interested me because it has data way back to 1958, 20 years more than the satellites, so uncertainty in trends would be lower.
    I still have Santer’s conundrum in my mind. Why are the satellite and radiosonde data consistent with theory on short timescales and not on decadal timescales?

    I have previously looked into the differences between the first 30 years of the satellite era and the last 30 years (1983-2012); the latter is roughly comparable to the period you chose. The trend ratios for the satellites versus surface temperature (looking at NCDC, HadCRUT and GISTEMP) over this second period differ a lot. The first column is land+ocean, the second land only and the third ocean only:
    UAH/NCDC: 1.05 / 0.74 / 1.27
    RSS/NCDC: 0.93 / 0.74 / 1.05
    UAH/Hadley: 1.01 / 0.78 / 1.06
    RSS/Hadley: 0.89 / 0.78 / 0.88
    UAH/GISS: 0.98 / 0.73 / 1.42
    RSS/GISS: 0.87 / 0.74 / 1.17
    So the ocean trend ratios in particular are very different, and your conclusion about the model-obs. discrepancy does not apply to all datasets. Is this due to the uncertainty in the data, different land masks, different algorithms? More conundrums.

    The trend difference for NCDC minus UAH (the highest value in Klotzbach-2009) has dropped to 0.07 over 1983-2012, a decrease of about 50%. So indeed, the first years of the satellite period do show anomalous behavior. Something Christy and co. are probably well aware of but did not mention at all.

    I think some care should be taken with regard to those land/ocean contrast factors. Looking at all datasets regarding 1983-2012, I get:
    NCDC: 2.5
    HadCRUT4: 2.0
    GISTEMP: 2.9
    UAH: 1.5
    RSS: 1.8
    There’s a large difference between the three surface temperature datasets. Is it again due to the landmask used or to the spatial coverage (GISTEMP does take the Arctic into account)?

    And thanks for your info regarding AMIP, these data are indeed closer to the observed amplification factors. Are the AMIP data accessible for the general public?

    As you state, there remains a discrepancy between models and observations. So for me, lots of scientific puzzles regarding these data remain, and it is up to scientists to resolve these puzzles rather than emphasize uncertainties on one specific item.

  7. gaia.sailboat Says:

    I have this fantasy aliens will land on a burned out earth a hundred thousand years from now and discover a storehouse of all the great research on global warming and say, “Man, those guys had it down to a gnat’s eyelash in their studies. But they must have forgot to tell the people the end results – extinction. But, jeez, their data was pretty!”

  8. Paul S Says:

    Hi Jos,

    One important thing I forgot to mention is that all the model/obs calculations quoted, other than for CMIP3, were from latitudinally cropped outputs between 70S and 82.5N. For obs this was done using Climate Explorer and for the AMIP model data I used my own area averaging code.

    This was done partly because that is the area specified in the RSS data, and partly because surface data is very sparse beyond these limits. I would have contracted further to about 70S – 70N but couldn’t find an easy way to produce separate land and ocean RSS datasets specifically between these intervals. I would think this step would reduce differences across the surface datasets since high latitude coverage and processing are a major factor.

    AMIP model data from a number of groups are archived at PCMDI. You’ll need to create an account to download, if you don’t have one already. Data are all stored as gridded netcdf files. There isn’t a “TLT” dataset to download, only 3D gridded atmospheric temperature files. You have to use RSS (can be found on their FTP area) or UAH vertical atmospheric weights to produce a weighted average of the model atmosphere which we can assume is representative of TLT. Be warned that file sizes for higher resolution models, such as the GFDL one, are 1GB+.
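The weighted-average step described above can be sketched as follows. This is a simplified illustration only: the level temperatures and weights are hypothetical numbers, not the RSS or UAH weighting functions, and the published weights may include additional terms (for example a surface contribution), so their documentation should be checked before doing this for real.

```python
def weighted_layer_temperature(level_temps, level_weights):
    """Weighted mean of temperatures on model levels.

    The weights are renormalised so they sum to 1, which means the result
    always lies between the coldest and warmest input level.
    """
    if len(level_temps) != len(level_weights):
        raise ValueError("need exactly one weight per level")
    total_weight = sum(level_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(t * w for t, w in zip(level_temps, level_weights))
    return weighted_sum / total_weight


# Hypothetical temperatures (K) for one grid cell, ordered from the lowest
# model level upwards, with made-up weights that emphasise the lower troposphere.
temps_k = [288.0, 280.0, 265.0, 245.0, 220.0]
weights = [0.15, 0.35, 0.30, 0.15, 0.05]

tlt_like = weighted_layer_temperature(temps_k, weights)
print(round(tlt_like, 2))
```

In practice the same function would be applied to every grid cell and time step of the 3D temperature files, followed by area averaging, before trends are computed.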
