The (missing) tropical hot spot

The (missing) tropical hot spot is one of the long-standing controversies in climate science. Climate models show amplified warming high in the tropical troposphere due to greenhouse forcing. However, data from satellites and weather balloons don’t show much amplification. What to make of this? Have the models been ‘falsified’, as critics say, or are the errors in the data so large that we cannot conclude much at all? And does it matter if there is no hot spot?

We are really glad that three of the main players in this controversy have accepted our invitation to participate: Steven Sherwood of the University of New South Wales in Sydney, Carl Mears of Remote Sensing Systems and John Christy of the University of Alabama in Huntsville.

Climate Dialogue editorial staff
Rob van Dorland, KNMI
Marcel Crok, science writer
Bart Verheggen 

Introduction: The (missing) tropical hot spot


Based on theoretical considerations and simulations with General Circulation Models (GCMs), it is expected that any warming at the surface will be amplified in the upper troposphere. The reason for this is quite simple. More warming at the surface means more evaporation and more convection. Higher in the troposphere the (extra) water vapour condenses and heat is released. Calculations with GCMs show that the lower troposphere warms about 1.2 times as fast as the surface. For the tropics, where most of the moisture is, the amplification factor is larger, about 1.4.

This change in the thermal structure of the troposphere is known as the lapse rate feedback. It is a negative feedback, i.e. it attenuates the surface temperature response, whatever its cause, since the additional condensation heat in the upper air results in more radiative heat loss.

The IPCC published the following figure in its Fourth Assessment Report (AR4) in 2007:

Source: http://www.ipcc.ch/publications_and_data/ar4/wg1/en/figure-9-1.html (based on Santer 2003)

The figure shows the response of the atmosphere to different forcings in a GCM. As one can see, over the past century, the greenhouse forcing was expected to dominate all other forcings. The expected warming is highest in the tropical troposphere, dubbed the tropical hot spot.

The discrepancy between the strength of the hot spot in the models and in the observations has been a controversial topic in climate science for almost 25 years. The controversy[i] goes all the way back to the first paper by Roy Spencer and John Christy[ii] on their UAH tropospheric temperature dataset in the early nineties. At the time their data didn’t show warming of the troposphere. Later a second group (Carl Mears and Frank Wentz of RSS) joined in, converting the same satellite data into a time series of tropospheric temperature. Several corrections, e.g. for the orbital changes of the satellites, were made over the years, resulting in a warming trend. However, the controversy remains, because the tropical troposphere still shows a smaller amplification of the surface warming than expected.

Positions
Some researchers claim that observations don’t show the tropical hot spot and that the differences between models and observations are statistically significant[iii]. On top of that they note that the warming trend itself is much larger in the models than in the observations (see figure 2 below and also ref.[iv]). Other researchers conclude that the differences between the trends of tropical tropospheric temperatures in observations and models are statistically not inconsistent with each other[v]. They note that some radiosonde and satellite datasets (RSS) do show warming trends comparable with the models (see figure 3 below).

The debate is complex because there are several observational datasets, based on satellite measurements (UAH and RSS) as well as on radiosonde measurements (weather balloons). Which of the datasets is “best”, and how does one determine the uncertainty in both the datasets and the model simulations?

The controversy flared up in 2007/2008 with the publication of two papers[vi][vii] by the opposing groups. Key graphs from both papers give the best impression of the debate. First, Douglass et al. came up with the following graph showing the disagreement between models and observations:

Figure 2. Temperature trends for the satellite era. Plot of temperature trend (°C/decade) against pressure (altitude). The HadCRUT2v surface trend value is a large blue circle. The GHCN and the GISS surface values are the open rectangle and diamond. The four radiosonde results (IGRA, RATPAC, HadAT2, and RAOBCORE) are shown in blue, light blue, green, and purple respectively. The two UAH MSU data points are shown as gold-filled diamonds and the RSS MSU data points as gold-filled squares. The 22-model ensemble average is a solid red line. The 22-model average ±2σSE are shown as lighter red lines. MSU values of T2LT and T2 are shown in the panel to the right. UAH values are yellow-filled diamonds, RSS are yellow-filled squares, and UMD is a yellow-filled circle. Synthetic model values are shown as white-filled circles, with 2σSE uncertainty limits as error bars. Source: Douglass et al. 2008

Santer et al. criticized Douglass et al. for underestimating the uncertainties in both model output and observations and also for not showing all radiosonde datasets. They came up with the following graph:

Figure 3. Vertical profiles of trends in atmospheric temperature (panel A) and in actual and synthetic MSU temperatures (panel B). All trends were calculated using monthly-mean anomaly data, spatially averaged over 20 °N–20 °S. Results in panel A are from seven radiosonde datasets (RATPAC-A, RICH, HadAT2, IUK, and three versions of RAOBCORE; see Section 2.1.2) and 19 different climate models. The grey-shaded envelope is the 2σ standard deviation of the ensemble-mean trends at discrete pressure levels. The yellow envelope represents 2σSE, DCPS07’s estimate of uncertainty in the mean trend. The analysis period is January 1979 through December 1999, the period of maximum overlap between the observations and most of the model 20CEN simulations. Note that DCPS07 used the same analysis period for model data, but calculated all observed trends over 1979–2004. Source: Santer (2008)

The grey-shaded envelope is the 2σ standard deviation of the ensemble-mean trends of Santer et al., while the yellow band is the estimated uncertainty of Douglass et al. Some radiosonde series in the Santer graph (like the RAOBCORE 1.4 dataset) show even more warming higher up in the troposphere than the model mean.

Updates
Not surprisingly, the debate didn’t end there. In 2010 McKitrick et al.[viii] extended the analysis of Santer (2008), who had limited the comparison between models and observations to the period 1979-1999, to 2009. They concluded that over the interval 1979–2009 model-projected temperature trends are two to four times larger than observed trends in both the lower and the mid troposphere, and that the differences are statistically significant at the 99% level.

Christy (2010)[ix] analysed the different datasets used and concluded that some should be discarded in the tropics:

Figure 4. Temperature trends in the lower tropical troposphere for different datasets and for slightly differing periods (79-05 = 1979-2005). UAH and RSS are the estimates based on satellite measurements. HadAt, Ratpac, RC1.4 and Rich are based on radiosonde measurements. C10 and AS08[x] are based on thermal wind data. The other three datasets give trends at the surface (ERSST being for the oceans only while the other two combine land and ocean data). Source: Christy (2010)

Christy (2010) concluded that part of the tropical warming in the RSS series is spurious. They also discarded the indirect estimates that are based on thermal wind. Not surprisingly, Mears (2012) disagreed with Christy’s conclusion that the RSS trend is partly spurious, writing that “trying to determine which MSU [satellite] data set is “better” based on short-time period comparisons with radiosonde data sets alone cannot lead to robust conclusions”.[xi]

Scaling ratio
Christy (2010) also introduced what they called the “scaling ratio”: the ratio of the tropospheric to the surface trend. They concluded that these scaling ratios clearly differ between models and observations. Models show a ratio of 1.4 in the tropics (meaning the troposphere warms 1.4 times as fast as the surface), while the observations give a ratio of 0.8 (meaning the surface warms faster than the troposphere). Christy speculated that an alternative reason for the discrepancy could be that the reported surface temperature trends are spatially inaccurate and actually less positive. A similar hypothesis was tested by Klotzbach (2009).[xii]

In an extensive review article about the controversy, published in early 2011, Thorne et al.[i] concluded that “there is no reasonable evidence of a fundamental disagreement between tropospheric temperature trends from models and observations when uncertainties in both are treated comprehensively”. However, in the same year Fu et al.[xiii] concluded that while “satellite MSU/AMSU observations generally support GCM results with tropical deep‐layer tropospheric warming faster than surface, it is evident that the AR4 GCMs exaggerate the increase in static stability between tropical middle and upper troposphere during the last three decades”. More papers then started to acknowledge that the consistency of tropical tropospheric temperature trends with climate model expectations remains contentious.[xiv][xv][xvi][xvii]

Climate Dialogue
We will focus the discussion on the tropics, as that is where the hot spot is most pronounced in the models. The core questions are, of course, whether we can detect (or have detected) a hot spot in the observations and, if not, what the implications are for the reliability of GCMs and for our understanding of the climate.

Specific questions

1) Do the discussants agree that amplified warming in the tropical troposphere is expected?

2) Can the hot spot in the tropics be regarded as a fingerprint of greenhouse warming?

3) Is there a significant difference between modelled and observed amplification of surface trends in the tropical troposphere (as diagnosed by e.g. the scaling ratio)?

4) What could explain the relatively large difference in tropical trends between the UAH and the RSS dataset?

5) What explanation(s) do you favour regarding the apparent discrepancy surrounding the tropical hot spot? A few options come to mind: a) satellite data show too little warming b) surface data show too much warming c) within the uncertainties of both there is no significant discrepancy d) the theory (of moist convection leading to more tropospheric than surface warming) overestimates the magnitude of the hotspot

6) What consequences, if any, would your explanation have for our estimate of the lapse rate feedback, water vapour feedback and climate sensitivity?


[i] Thorne, P. W. et al., 2011, Tropospheric temperature trends: History of an ongoing controversy. WIREs Climate Change, 2: 66-88

[ii]Spencer RW, Christy JR. Precise monitoring of global temperature trends from satellites. Science 1990, 247:1558–1562.

[iii] Christy, J. R., B. M. Herman, R. Pielke Sr., P. Klotzbach, R. T. McNider, J. J. Hnilo, R. W. Spencer, T. Chase, and D. H. Douglass (2010), What do observational datasets say about modeled tropospheric temperature trends since 1979?, Remote Sens., 2, 2148–2169, doi:10.3390/rs2092148.

[iv] http://www.drroyspencer.com/wp-content/uploads/CMIP5-73-models-vs-obs-20N-20S-MT-5-yr-means1.png

[v]Thorne, P.W. Atmospheric science: The answer is blowing in the wind. Nature Geosci. 2008, doi:10.1038/ngeo209

[vi] Douglass DH, Christy JR, Pearson BD, Singer SF. A comparison of tropical temperature trends with model predictions. Int J Climatol 2008, 27:1693–1701

[vii] Santer, B.D.; Thorne, P.W.; Haimberger, L.; Taylor, K.E.; Wigley, T.M.L.; Lanzante, J.R.; Solomon, S.; Free, M.; Gleckler, P.J.; Jones, P.D.; Karl, T.R.; Klein, S.A.; Mears, C.; Nychka, D.; Schmidt, G.A.; Sherwood, S.C.; Wentz, F.J. Consistency of modelled and observed temperature trends in the tropical troposphere. Int. J. Climatol. 2008, doi:10.1002/joc.1756

[viii] McKitrick, R. R., S. McIntyre and C. Herman (2010) “Panel and Multivariate Methods for Tests of Trend Equivalence in Climate Data Sets.” Atmospheric Science Letters, 11(4) pp. 270-277, October/December 2010 DOI: 10.1002/asl.290

[ix] Christy, J. R., B. M. Herman, R. Pielke Sr., P. Klotzbach, R. T. McNider, J. J. Hnilo, R. W. Spencer, T. Chase, and D. H. Douglass (2010), What do observational datasets say about modeled tropospheric temperature trends since 1979?, Remote Sens., 2, 2148–2169, doi:10.3390/rs2092148

[x] Allen RJ, Sherwood SC. Warming maximum in the tropical upper troposphere deduced from thermal winds. Nat Geosci 2008, 1:399–403

[xi] Mears, C. A., F. J. Wentz, and P. W. Thorne (2012), Assessing the value of Microwave Sounding Unit–radiosonde comparisons in ascertaining errors in climate data records of tropospheric temperatures, J. Geophys. Res., 117, D19103, doi:10.1029/2012JD017710

[xii] Klotzbach PJ, Pielke RA Sr., Pielke RA Jr., Christy JR, McNider RT. An alternative explanation for differential temperature trends at the surface and in the lower troposphere. J Geophys Res 2009, 114:D21102. DOI:10.1029/2009JD011841

[xiii] Fu, Q., S. Manabe, and C. M. Johanson (2011), On the warming in the tropical upper troposphere: Models versus observations, Geophys. Res. Lett., 38, L15704, doi:10.1029/2011GL048101

[xiv] Seidel, D. J., M. Free, and J. S. Wang (2012), Reexamining the warming in the tropical upper troposphere: Models versus radiosonde observations, Geophys. Res. Lett., 39, L22701, doi:10.1029/2012GL053850

[xv] Po-Chedley, S., and Q. Fu (2012), Discrepancies in tropical upper tropospheric warming between atmospheric circulation models and satellites, Environ. Res. Lett.

[xvi] Benjamin D. Santer, Jeffrey F. Painter, Carl A. Mears, Charles Doutriaux, Peter Caldwell, Julie M. Arblaster, Philip J. Cameron-Smith, Nathan P. Gillett, Peter J. Gleckler, John Lanzante, Judith Perlwitz, Susan Solomon, Peter A. Stott, Karl E. Taylor, Laurent Terray, Peter W. Thorne, Michael F. Wehner, Frank J. Wentz, Tom M. L. Wigley, Laura J. Wilcox, and Cheng-Zhi Zou, Identifying human influences on atmospheric temperature, PNAS 2013 110 (1) 26-33; published ahead of print November 29, 2012, doi:10.1073/pnas.1210514109

[xvii] Thorne, P. W., et al. (2011), A quantification of uncertainties in historical tropical tropospheric temperature trends from radiosondes, J. Geophys. Res., 116, D12116, doi:10.1029/2010JD015487

Guest blog Carl Mears

Thoughts and plots about the tropical tropospheric hot spot.

Carl Mears, Remote Sensing Systems

In the deep tropics, in the troposphere, the lapse rate (the rate of decrease of temperature with increasing height above the surface) is largely controlled by the moist adiabatic lapse rate (MALR). This is true both in complicated simulations performed by General Circulation Models, and in simple, back-of-the-envelope calculations (Santer et al., 2005). The reasoning behind this is simple. If the lapse rate were larger than the MALR, then the atmosphere would be unstable to convection. Convection (a thunderstorm) would then occur, heating the upper troposphere via the release of latent heat as water vapor condenses into clouds, and cooling the surface via evaporation and the presence of cold rain/hail. If the lapse rate were smaller than the MALR, then convection would be suppressed, allowing the surface to heat up without triggering a convective event. On average, these processes cause the lapse rate to be very close to the MALR. Note that this argument does not apply outside the tropics, where the dynamics become more complex due to the Coriolis force and the presence of large north/south temperature gradients, or in regions with very low relative humidity, such as deserts, where the atmosphere may be far from saturated near the surface and thus the MALR does not apply.

Because the MALR decreases with temperature, any temperature increase at the surface becomes even larger high in the troposphere. This causes the so-called hot spot, a region high in the troposphere that shows more warming (or cooling) than the surface. Note that at this point, I haven’t said a thing about greenhouse gases. In fact, this effect has nothing to do with the source of the warming, as long as it arises near the surface. Surface warming due to any cause would show a tropospheric hotspot in the absence of other changes to the heating and cooling of the atmosphere. Nevertheless, the tropospheric hotspot is often presented as some sort of linchpin of global warming theory. It is not. It is just a feature of a close-to-unstable moist atmosphere.
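The back-of-the-envelope version of this argument can be made concrete. The sketch below is an editorial illustration, not code from the original analysis: it uses standard textbook constants and the common pseudoadiabatic approximation to the MALR, lifts a saturated parcel from a 300 K surface and from a 301 K surface, and compares the resulting temperatures at 10 km. The helper names, step sizes, and starting values are all illustrative choices.

```python
import math

# Physical constants (SI): gravity, dry-air gas constant, dry-air heat
# capacity, latent heat of vaporisation, and epsilon = Rd/Rv.
G, RD, CPD, LV, EPS = 9.81, 287.04, 1004.0, 2.5e6, 0.622

def esat_hpa(t_k):
    """Saturation vapour pressure in hPa (Bolton 1980 approximation)."""
    tc = t_k - 273.15
    return 6.112 * math.exp(17.67 * tc / (tc + 243.5))

def moist_lapse_rate(t_k, p_hpa):
    """Pseudoadiabatic (moist adiabatic) lapse rate in K/m."""
    rs = EPS * esat_hpa(t_k) / (p_hpa - esat_hpa(t_k))  # sat. mixing ratio
    num = G * (1.0 + LV * rs / (RD * t_k))
    den = CPD + LV**2 * rs * EPS / (RD * t_k**2)
    return num / den

def parcel_temperature(t_surf_k, z_target, p_surf_hpa=1000.0, dz=50.0):
    """Temperature (K) at height z_target (m) of a saturated parcel
    lifted moist-adiabatically from the surface."""
    t, p, z = t_surf_k, p_surf_hpa, 0.0
    while z < z_target:
        t -= moist_lapse_rate(t, p) * dz
        p *= math.exp(-G * dz / (RD * t))  # hydrostatic pressure decrease
        z += dz
    return t

# Warm the surface by 1 K and see how much the parcel warms at 10 km.
amp = parcel_temperature(301.0, 10000.0) - parcel_temperature(300.0, 10000.0)
print(f"1 K surface warming -> {amp:.2f} K warming at 10 km")
```

Run as-is, the 1 K of surface warming arrives at 10 km noticeably amplified, which is the hot spot mechanism in miniature; the exact factor depends on the chosen constants and profile.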

Now, I will turn my attention to one of the core questions of this discussion – “can we detect / have we detected a hot spot in the observations?”. On monthly time scales, there is no question. If we average across the tropics, the temperature of the upper troposphere is strongly correlated with the temperature of the surface, only with larger amplitude (Santer et al., 2005). On decadal time scales, the results obtained depend on the datasets chosen, as Santer et al. (2005) showed for the RSS and UAH satellite datasets and a few homogenized radiosonde datasets. Here we expand this a little further to include more homogenized radiosonde datasets and two of the more recent reanalysis datasets, MERRA and ERA-Interim. Figure 1 shows the ratio of the mid-to-upper-tropospheric temperature trends to surface temperature trends in the deep tropics (20S to 20N). Each point on the graph is the trend starting in January 1979 and ending at the date on the x-axis. The surface temperature is from HadCRUT4. The mid-to-upper-tropospheric data is the “temperature tropical troposphere” product, or TTT, first introduced by Fu and Johanson (2005). For MSU/AMSU, it is equal to 1.1*TMT – 0.1*TLS. This combination adjusts for the cooling effect of the stratosphere on TMT by subtracting off part of the stratospheric cooling measured by TLS. The weighting function for this product is centered in the mid-to-upper tropical troposphere, where we expect the hot spot to be most pronounced.

Fig. 1. Ratio of trends in TTT to trends in TSurf as a function of the ending year of the trend analysis. The starting point is January 1979. The surface dataset used is HADCRUT4. The pink horizontal line is at a value of 1.4, the amplification factor for TTT in reference 1.

Two conclusions can easily be reached from this plot. First, it takes about 25 years (or more) for the measured trend ratios to settle down to reasonably constant values. This is due to the effects of both measurement errors and “weather noise”. I think that this is part of the cause of the controversy surrounding this topic – we began discussing such trend ratios before we had enough data for the ratios to be stable over time. Second, the values that are ultimately reached depend strongly on which upper air dataset is used. For some datasets (HadAT, UAH, IUK, RAOBCORE 1.5, ERA-Interim), the trend ratio is less than 1.0, indicating the lack of a tropospheric hotspot. For other datasets (RICH, RAOBCORE 1.4, RSS, MERRA, and STAR), the ratio is greater than one, indicating tropospheric amplification and the presence of a hotspot. CMIP-3 climate models predicted an amplification value of about 1.4 for the TTT temperature product used here (Santer et al., 2005). Some upper air datasets are in relatively close agreement with these expectations, such as the RSS and STAR satellite data, the older version of RAOBCORE (V1.4), and the MERRA reanalysis (which uses the STAR data as one of its inputs, so it is not completely independent of STAR). Often one or more of these datasets is used to argue that a tropical hotspot exists or does not exist. A more balanced analysis shows that it is difficult to prove or disprove the presence of the tropospheric hotspot given the current state of the data.
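The mechanics behind this kind of plot can be sketched in a few lines. The series below are synthetic — the trend and noise magnitudes are invented for illustration, not taken from any real dataset; only the TTT coefficients (1.1 and −0.1) come from the text. The sketch builds TTT from TMT and TLS and then computes the trend ratio for windows that all start in January 1979 but end progressively later:

```python
import numpy as np

def trend(series, months_per_decade=120.0):
    """Least-squares linear trend of a monthly anomaly series, K/decade."""
    t = np.arange(len(series)) / months_per_decade
    return np.polyfit(t, series, 1)[0]

# Synthetic monthly anomalies from Jan 1979 (illustrative numbers only).
rng = np.random.default_rng(0)
n = 34 * 12
dec = np.arange(n) / 120.0
tmt = 0.14 * dec + rng.normal(0, 0.10, n)    # mid-troposphere channel
tls = -0.30 * dec + rng.normal(0, 0.20, n)   # lower-stratosphere channel
tsurf = 0.12 * dec + rng.normal(0, 0.10, n)  # surface anomalies

# TTT (Fu and Johanson, 2005): remove the stratospheric cooling that
# leaks into TMT.
ttt = 1.1 * tmt - 0.1 * tls

# Trend ratio for windows that all start in Jan 1979 but end later and
# later: short windows give noisy ratios, long ones settle down.
for years in (10, 20, 30):
    m = years * 12
    print(f"1979-{1978 + years}: ratio = {trend(ttt[:m]) / trend(tsurf[:m]):.2f}")
```

With 30 or more years the ratio settles near its underlying value; with only 10 years of data it can land almost anywhere, which is the first conclusion drawn from the figure above.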

In Fig. 2, I have reproduced panel D of Fig. 4 from Santer et al. (2005), except with updated measured data, the addition of the reanalysis data, and the use of CMIP-5 model results. The CMIP-5 model results for 1979-2012 were made by splicing together results from 20th century simulations (before 2005, using measured values of the forcings) and RCP8.5 21st century predictions (after 2005, using predicted values for the various forcings; for details on this process, see Santer et al., 2012).

Fig. 2. Scatter plot of trend (1979-2012) in TTT as a function of trend in TSurf. The model results cluster around a line with a slope of 1.45, indicating a tropospheric hotspot. For the observed results, HadCRUT4 is used for the surface temperatures, and various sources of tropospheric temperature (satellites, radiosondes, and reanalyses) are used for TTT.

The general story around the hotspot remains unchanged from Santer et al. (2005), except that the expected scaling ratio has increased from 1.40 to 1.45 with the use of CMIP-5 data. Two sources of measured data (I realize that a reanalysis is not really a measurement), STAR and MERRA, are reasonably close to the fitted line, while others, such as the HadAT radiosonde dataset and the UAH satellite dataset, are far from the line. Other datasets are distributed in between. Note that the RSS data point has error bars in both the X and Y directions. These are 90% uncertainty ranges derived from the error ensembles that have recently been produced for the RSS dataset (Mears et al., 2011) and the HadCRUT4 dataset (Morice et al., 2012). (These error ensembles are made up of different realizations of the datasets that are consistent with the estimated errors, including measurement, sampling, and construction errors. The correlations of the errors across both time and location are thus automatically included if the error ensemble members are processed by the user in the same way as the baseline data.) This is the first time that we have been able to put error bars on the observed points on this plot, in both directions, in such a consistent manner.

Looking at Fig. 2, it is obvious that the observed trends in both temperature datasets are at the extreme low end of the model predictions. This problem has grown over time as the length of the measured record grows. (As the comparison time period gets longer, the uncertainty in the linear trends of both the measured and modeled time series decreases simply because of the longer time period.) For the time being, I am tabling the discussion of this problem and focusing on the discussion of the hot spot. In my mind, the problem of the trend magnitude is more interesting than the argument about the hotspot, and I hope to return to it later in this process. But for now I will stay focused on the hotspot.

Fig. 3. Histograms of the troposphere/surface trend scaling ratio from the RSS/HadCRUT4 error ensembles, and from 33 CMIP-5 model runs.

In Fig. 3, we explore the implications of the RSS and HadCRUT4 error ensembles further. The top histogram shows the range of scaling ratios consistent with the RSS satellite data and the HadCRUT4 surface data when the estimated errors in each are taken into consideration. The bottom histogram shows the range of scaling ratios found in the 33 CMIP-5 model runs. The two distributions overlap, indicating consistency of this set of observations with the models, though the mean value shown by the observations is clearly lower than that predicted by the models.
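The error-ensemble propagation behind the top histogram can be sketched as follows. The Gaussian "realizations" below are stand-ins, not the actual RSS or HadCRUT4 ensemble members, and all the numerical values are invented for illustration; the point is the procedure: every ensemble member is run through exactly the same trend-ratio calculation as the baseline data, and the spread of the results is the uncertainty estimate.

```python
import numpy as np

def trend(series):
    """Least-squares slope of a monthly anomaly series, in K/decade."""
    t = np.arange(len(series)) / 120.0
    return np.polyfit(t, series, 1)[0]

# Stand-in error ensembles (illustrative): each row is one plausible
# realization of the dataset given its estimated errors.
rng = np.random.default_rng(1)
n_members, n_months = 100, 408
dec = np.arange(n_months) / 120.0
ttt_members = 0.13 * dec + rng.normal(0, 0.08, (n_members, n_months))
surf_members = 0.11 * dec + rng.normal(0, 0.04, (n_members, n_months))

# Process every ensemble member the same way as the baseline data and
# collect the scaling ratios; their spread maps out the histogram.
ratios = np.array([trend(ttt_members[i]) / trend(surf_members[i])
                   for i in range(n_members)])
lo, hi = np.percentile(ratios, [5, 95])
print(f"median ratio {np.median(ratios):.2f}, 90% range [{lo:.2f}, {hi:.2f}]")
```

Because each member is processed identically to the baseline, correlated errors across time and location are carried through automatically, which is the property the text emphasises.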

It has been suggested that the lack of a tropospheric hotspot (if there is such a lack) is mostly due to errors in the surface temperature datasets, which are (in this story line) suspected of being biased in the direction of too much warming. This seems unlikely. Clearly, the above spread in results for different upper air datasets reveals considerable structural uncertainty (Thorne et al., 2005) in the upper air data, and the error bar on the RSS trend value is much larger than the error bar on the HadCRUT4 value. Also, the various surface datasets are much more similar to one another than the upper air datasets are. To show this, I redo the analysis of Figure 1 using a different surface dataset constructed by NOAA (GHCN-ERSST). The final trend ratios are almost identical to those found using HadCRUT4, and the conclusions reached are unchanged.

Figure 4. Ratio of trends in TTT to trends in TSurf as a function of the ending year of the trend analysis. The starting point is January 1979. The surface dataset used is GHCN-ERSST.

Conclusion: Taken as a whole, the errors in the measured tropospheric data are too great to either prove or disprove the existence of the tropospheric hotspot. Some datasets are consistent (or even in good agreement) with the predicted values for the hotspot, while others are not. Some datasets even show the upper troposphere warming less rapidly than the surface.

Biosketch
Dr. Mears has a B.S. in Physics from the University of Washington (1985), and a Ph.D. in Physics from the University of California, Berkeley (1991), where his thesis research involved the development of quantum-noise-limited superconducting microwave heterodyne receivers. He joined Remote Sensing Systems in 1998. Since then, he has validated SSM/I and TMI winds against in situ measurements, and developed and validated a rain-flagging algorithm for the QuikScat scatterometer. Over the past several years he has constructed and maintained a climate-quality data record of atmospheric temperatures from MSU and AMSU, and studied human-induced change in atmospheric water vapor and oceanic wind speed using measurements from passive microwave imagers. Dr. Mears was a convening lead author for the U.S. Climate Change Science Program Synthesis and Assessment Product 1.1 (the first CCSP report to reach final form), and a contributing author to the IPCC 4th assessment report. He is a member of two international working groups: the Global Climate Observing System Working Group on Atmospheric Reference Observations, and the WCRP Stratospheric Trends Working Group, which is part of the Stratospheric Processes and their Role in Climate (SPARC) project.

References

B. D. Santer et al., “Amplification of surface temperature trends and variability in the tropical atmosphere,” Science, vol. 309, no. 5740, pp. 1551-1556, 2005.

C. A. Mears and F. J. Wentz, “Construction of the Remote Sensing Systems V3.2 atmospheric temperature records from the MSU and AMSU microwave sounders,” Journal of Atmospheric and Oceanic Technology, vol. 26, pp. 1040-1056, 2009.

J. R. Christy, R. W. Spencer, W. B. Norris, W. D. Braswell, and D. E. Parker, “Error estimates of version 5.0 of MSU-AMSU bulk atmospheric temperatures,” Journal of Atmospheric and Oceanic Technology, vol. 20, no. 5, pp. 613-629, 2003.

P. W. Thorne et al., “Revisiting radiosonde upper-air temperatures from 1958 to 2002,” Journal of Geophysical Research, vol. 110, 2005. (This is the HADAT dataset)

L. Haimberger, “Homogenization of radiosonde temperature time series using innovation statistics,” Journal of Climate, vol. 20, no. 7, pp. 1377-1403, 2007. (This describes the RAOBCORE dataset)

L. Haimberger, C. Tavolato, and S. Sperka, “Towards the elimination of warm bias in historic radiosonde records -- some new results from a comprehensive intercomparison of upper air data,” Journal of Climate, vol. 21, pp. 4587-4606, 2008. (This describes the RICH dataset)

S. C. Sherwood, C. L. Meyer, R. J. Allen, and H. A. Titchner, “Robust tropospheric warming revealed by iteratively homogenized radiosonde data,” Journal of Climate, vol. 21, no. 20, pp. 5336-5352, Oct. 2008. (This describes the IUK dataset)

M. M. Rienecker et al., “MERRA: NASA’s Modern-Era Retrospective Analysis for Research and Applications,” Journal of Climate, 2011. (This describes the MERRA dataset)

Q. Fu and C. M. Johanson, “Satellite-derived vertical dependence of tropospheric temperature trends,” Geophysical Research Letters, vol. 32, 2005. (This introduces the concept of TTT)

Mears, C. A., F. J. Wentz, P. W. Thorne, and D. Bernie, 2011, “Assessing uncertainty in estimates of atmospheric temperature changes from MSU and AMSU using a Monte-Carlo estimation technique”, Journal of Geophysical Research, 116, D08112, doi:10.1029/2010JD014954. (This discusses the RSS error ensembles.)

Thorne, P. W., D. E. Parker, J. R. Christy, and C. A. Mears, 2005, “Uncertainties in Climate Trends: Lessons From Upper-Air Temperature Records”, Bulletin of the American Meteorological Society, 86, 1437-1442. (This discusses the idea of structural uncertainty)

Morice, C. P., J. J. Kennedy, N. A. Rayner, and P. D. Jones (2012), Quantifying uncertainties in global and regional temperature change using an ensemble of observational estimates: The HadCRUT4 data set, J. Geophys. Res., 117, D08101, doi:10.1029/2011JD017187. (This discusses HadCRUT4 and the HadCRUT4 error ensembles.)

Santer, B. D., J. F. Painter, C. A. Mears, C. Doutriaux, P. Caldwell, J. M. Arblaster, P. J. Cameron-Smith, N. P. Gillett, P. J. Gleckler, J. Lanzante, J. Perlwitz, S. Solomon, P. A. Stott, K. E. Taylor, L. Terray, P. W. Thorne, M. F. Wehner, F. J. Wentz, T. M. L. Wigley, L. J. Wilcox, and C. Z. Zou, 2012: Identifying Human Influences on Atmospheric Temperature. Proceedings of the National Academy of Sciences, 110, 26-33, doi:10.1073/pnas.1210514109. (This describes, among other things, the construction of the 1979-2012 model datasets by combining 20th century simulations with RCP8.5 21st century predictions)

Guest blog Steven Sherwood

The tropical upper-tropospheric warming “hot spot”: is it missing, and what if it were?

Prof. Steven Sherwood, Director, Climate Change Research Centre, University of New South Wales, Sydney Australia.

In this post I’ll address two issues: first, confidence in tropical lapse-rate* changes and what they would mean for our understanding of atmospheric physics; second, the broader implications for global warming. My main positions on this issue could be summarised as: a) lapse-rate changes differing significantly from those expected from basic thermodynamic arguments would be very interesting, but, b) they would have no clear implications for global warming, and c) evidence that they have occurred is not reliable (which in a way is too bad, because of (a)). A side point is that there are other model-observation discrepancies that I think are more worthy of attention (and are accordingly receiving more attention from the mainstream scientific community).

Confidence and Implications for Atmospheric Physics
I first became interested in the tropical lapse rate (now alternatively known as “hot spot”) issue around 2001, shortly after it was raised in a prominent paper in Science (Gaffen et al. 2000). I had been using radiosonde data to look at wind fields and temperature trends near the tropical tropopause and lower stratosphere. This new problem drew my attention because I was interested in how tropical atmospheric convection (e.g. storms) responds to its environment, one of the grand unsolved problems in atmospheric modelling. Tropical convection was supposed to prevent the kind of lapse-rate changes that were being reported, so what was going on?

I considered various ways that aerosols might alter the convective lapse rate and how to test these hypotheses. Before going too far with this, however, I wanted to assure myself that the reported trends were robust, and began my own analysis of the radiosonde data (or in fact continued it, since I was already using radiosondes to understand change in the lower stratosphere). By 2005, based on my own work and others’ plus a better understanding of the basic challenges, I no longer thought there was credible evidence for any unexpected changes in atmospheric temperature structure. Consequently I dropped this as a research topic (my student who was thinking along these lines, Bob Allen, changed gears to examine the possible impacts of aerosol on the general circulation which did lead to some very interesting results published later; he also showed that wind trends in the tropics were consistent with the hot spot).

Small changes
Although there has been more to-ing and fro-ing in the literature since then, as described in the opening article for this exchange, I still remain unconvinced that we can observe the small changes in temperature structure that are being discussed. Tests of radiosonde homogenisation methods (e.g., Thorne et al. 2011) show that they are often unreliable. MSU is not well calibrated and its homogenisation issues are also serious, as shown by the range of results previously obtained from this instrument series. To obtain upper-tropospheric trends from Channel 2 of MSU requires subtracting out a large contribution to trends in this channel coming from lower-stratospheric cooling. The latter remains highly uncertain due to a discrepancy between cooling rates in radiosondes and MSU. Tropical ozone trends are sufficiently uncertain so as to render either of these physically plausible (Solomon et al. 2012). I used to think (as do most others) that the radiosondes were wrong, but in Sherwood et al. 2008 we found (to my surprise) that when we homogenised the global radiosonde data they began to show cooling in the lower stratosphere that was very similar to that of MSU Channel 4 at each latitude, except for a large offset that varied smoothly with latitude. Such a smoothly varying and relatively uniform offset is very different from what we’d expect from radiosonde trend biases (which tend to vary a lot from one station to the next) but is consistent with an uncorrected calibration error in MSU Channel 4. If that were indeed responsible, it would imply that there has been more cooling in the stratosphere than anyone has reckoned on, and that the true upper-tropospheric warming is therefore stronger than what any group now infers from MSU data. By the way, our tropospheric data also came out very close to those published at the time by RSS, both in global mean and in the latitudinal variation (Sherwood et al., 2008).

Changes in tropical lapse rate remain an interesting problem in principle, because we know that convective schemes in global atmospheric models need improving, and this could be informative as to what is wrong. Current schemes enforce the theoretical “moist-adiabatic” lapse rate quite strongly. It would not surprise me much if it turned out that they are too heavy-handed in this respect, and that a better model would anchor the upper tropospheric temperature less firmly to the surface temperature. Indeed there is reason to believe that other problems with these models, such as difficulties in generating proper hurricanes or a tropical phenomenon known as the Madden-Julian oscillation, may also derive from the schemes triggering convection too easily and enforcing these lapse rates too vigorously. So I would not at all discount the possibility of these lapse-rate changes occurring, but one needs strong evidence, and we just don’t have that.

Broader implications for global warming
Perhaps the most remarkable and puzzling thing about the “hot spot” question is the tenacity with which climate contrarians have promoted it as evidence against climate models, and against global warming in general.

If I were looking for climate model defects, there are far more interesting and more damning ones around. For example, no climate model run for the IPCC AR4 (c. 2006) was able to reproduce the losses of Arctic sea ice that had been observed in recent decades (and which have continued accelerating since). No model, to my knowledge, produces the large asymmetry in warming between the north and south poles observed since 1980. Models underpredict the observed poleward shifts of the atmospheric circulation and climate zones by about a factor of three over this same period (Allen et al. 2012); cannot explain the warmings at high latitudes indicated by palaeoclimate data in past warm climates such as the Pliocene (Fedorov et al. 2013); appear to underpredict observed trends in the hydrological cycle (Wentz et al. 2007, Min et al. 2011) and in their simulated climatologies tend to produce rain that is too frequent, too light, and on land falls at the wrong time of day (Stephens et al. 2010). Finally, the tropical oceans are not warming as much as the land areas, or as much as predicted by most models, and this may be the root cause of why the recent warming of the tropical atmosphere is slower than predicted by most models (there is a nice series of posts about this on Isaac Held’s blog). What makes the “hot spot” more important than these other discrepancies which, in many cases, are supported by more convincing evidence? Is it because the “missing hot spot” can be spun into a tale of model exaggeration, whereas all the other problems suggest the opposite problem?

Nil
Let us suppose for the moment that the “hot spot” really has been missing while the surface has warmed. What would the implications be?

The implications for attribution of observed global warming are nil, as far as I can see. The regulation of lapse rate changes by atmospheric convection is expected to work exactly the same way whether global temperature changes are natural or forced (say, by greenhouse gases from fossil fuel burning).

The implications for climate sensitivity are also roughly nil. The total feedback from water vapour and lapse-rate changes depends only on the changes in relative humidity in the upper troposphere, not on the lapse rate itself (see Ingram, 2013). In fact, in climate models where the lapse rate becomes relatively steeper as climate warms (as would be the case with a missing hot spot), the total warming feedback is very slightly stronger because the increased lapse rate increases the greenhouse effect of carbon dioxide and other well-mixed greenhouse gases. So a missing hot spot would not mean less surface warming, at least according to our current understanding.

Moreover, the discrepancy with models was in the opposite direction over 1958-1979 (Gaffen et al. 2000): that is to say, the observed tropical upper-tropospheric warming was evidently stronger than expected. But the world was warming then too. So if this interesting phenomenon is real, it probably is not connected to global warming.

Fig. 1. Weaker upper-tropospheric warming and hence weaker water-vapour feedback actually implies, on average, slightly stronger overall positive feedback due to lapse rate and water vapour combined (from Ingram 2013).

Anyone who wants to argue that the “missing hot spot” implies something as to the future (say, that global warming will be less than current models predict) needs to come up with an alternative model of climate that agrees just as well with observations, obeys physical laws, predicts the absence of a “hot spot,” and predicts less future global warming (or whatever other novel outcome). This is how science advances: through the consideration of multiple hypotheses. If a new one comes along that fits the observations, I’ll gladly consider it.

Currently none of the explanations I can see for the “missing hot spot” would change our estimate of future warming from human activities, except one: that the overall warming of the tropics is simply slower than expected. It does seem that global-mean surface warming is starting to fall behind predictions, and this is particularly so in the tropical oceans (though not, curiously, on land). Possible causes are (a) aerosols, solar or other forcings have recently exerted a stronger (temporary) cooling influence than we think; (b) negative feedbacks from clouds have kicked in; or (c) the oceans are burying the heat faster than we expected. If (b) were true, we would revise our estimates of climate sensitivity downward. There are observations supporting options (c) and to a small extent (a), but there is plenty of room for new surprises. If it is (c) (which appears most likely), we then have to decide whether this is a natural variation or if it is a feature of global warming. In the former case the heat will soon come back; in the latter, the oceans will delay climate change more effectively than we thought. Another decade or so of observations should reveal the answer.

*for readers unfamiliar with the term “lapse rate,” it is the rate at which air temperature decreases with altitude.

Biosketch
Dr. Steven Sherwood is professor at the Climate Change Research Centre of The University of New South Wales in Sydney. He received his M.S. degree (1989) in Eng. Physics/Fluid Mechanics at the University of California San Diego, USA, and his Ph.D. degree (1995) in Oceanography at the Scripps Institution of Oceanography.
Sherwood studies how the various processes in the atmosphere conspire to establish climate, how these processes might be expected to control the way climate changes, and how the atmosphere will ultimately interact with the oceans and other components of Earth. Clouds and water vapour in particular remain poorly understood in many respects, but are very important not only in bringing rain locally, but also to global climate through their effect on the net energy absorbed and emitted by the planet. Tropospheric convection (disturbed weather) is a key process by which the atmosphere transports water and energy and in the process creates clouds, but it is also a turbulent phenomenon for which we have no basic theory and which observations cannot yet fully characterise.
Sherwood leads a research group that applies basic physics to complex problems by a combination of simple theoretical ideas and hypotheses and directed analyses of observations.

References
Allen, R. J., S. C. Sherwood, J. R. Norris and C. Zender, Recent Northern Hemisphere tropical expansion primarily driven by black carbon and tropospheric ozone, Nature, Vol. 485, 2012, 350-355.

Fedorov, A. V., C. M. Brierley, K. T. Lawrence, Z. Liu, P. S. Dekens and A. C. Ravelo (2013). "Patterns and mechanisms of early Pliocene warmth." Nature 496(7443): 43-49.

Gaffen, D. J., B. D. Santer, J. S. Boyle, J. R. Christy, N. E. Graham and R. J. Ross, Multidecadal changes in the vertical temperature structure of the tropical troposphere, Science, 2000, V. 287, 1242-1245.

Ingram, W. (2013). "Some implications of a new approach to the water vapour feedback." Climate Dynamics 40: 925-933.

Min, S. K., X. B. Zhang, F. W. Zwiers and G. C. Hegerl (2011). "Human contribution to more-intense precipitation extremes." Nature 470(7334): 378-381.

Sherwood, S. C., C. L. Meyer, R. J. Allen, and H. A. Titchner, Robust tropospheric warming revealed by iteratively homogenized radiosonde data. Journal of Climate, Vol. 21, 2008, 5336-5352.

Solomon, S., P. J. Young and B. Hassler (2012). "Uncertainties in the evolution of stratospheric ozone and implications for recent temperature changes in the tropical lower stratosphere." Geophysical Research Letters 39.

Stephens, G. L., T. L'Ecuyer, R. Forbes, A. Gettlemen, J.-C. Golaz, A. Bodas-Salcedo, K. Suzuki, P. Gabriel and J. Haynes (2010). "Dreary state of precipitation in global models." Journal of Geophysical Research 115: D24211.

Thorne, P. W. et al., A quantification of uncertainties in historical tropical tropospheric temperature trends from radiosondes, J. Geophys. Res., Vol. 116, 2011, D12116.

Wentz, F. J., L. Ricciardulli, K. Hilburn and C. Mears, 2007, How much more rain will global warming bring?, Science, Vol. 317, 233-235.

Guest blog John Christy

Why should we care about the tropical temperature?

John R. Christy, Distinguished Professor, Department of Atmospheric Science, Director Earth System Science Center, The University of Alabama in Huntsville

One important part of climate change research is to document the amount of change that can already be attributed to human activity. In other words we want to know the answer to the question, “How has the climate changed specifically because of the enhancement of the natural greenhouse effect caused by extra emissions due to human progress?” These rising emissions come primarily from energy production using carbon-based fuels which emit, as a by-product, the ubiquitous and life-sustaining greenhouse gas, carbon dioxide (CO2). From about 280 ppm in the 19th century, the current concentration of CO2 has risen to about 400 ppm.

So, what have the extra CO2 and other greenhouse gases done to the climate as of today? Climate model simulations indicate that a prominent and robust response to extra greenhouse gases is the warming of the tropical troposphere, a layer of air from the surface to about 16 km altitude in the region of the globe from 20°S to 20°N. A particularly obvious feature of this expected warming, and a key focus of this blog post, is that it increases with altitude: the rate of warming at 10 km altitude is over twice the rate at the surface. This clear model response should be detectable by now (i.e. 2012), which gives us an opportunity to check whether the real world is responding as the models simulate for a large-scale, easy-to-compare quantity. This is why we care about the tropical atmospheric temperature.

Accumulating heat
There are two aspects to this tropical warming that are sometimes confused. One aspect is the simple magnitude of the warming rate, or temperature trend, of the entire troposphere. This metric quantifies the amount of heat that is accumulating in the bulk atmosphere. A well-established result of adding greenhouse gases to the atmosphere is that heat energy (in units of joules) will accumulate in the troposphere which can be detected as a rise in temperature. [The fundamental issue of the effects of greenhouse warming is: how many joules of heat are accumulating in the climate system per year?]
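The joules-to-temperature link above can be made concrete with a back-of-envelope sketch. This is an illustration only, using standard textbook constants (surface pressure, gravity, heat capacity of air), not values taken from the post:

```python
# Back-of-envelope sketch: how many joules per square metre are needed
# to warm the whole atmospheric column by 1 degree C? (Illustrative only.)
g = 9.81          # m/s^2, gravitational acceleration
cp = 1004.0       # J/(kg K), specific heat of air at constant pressure
p_sfc = 101325.0  # Pa, mean surface pressure

column_mass = p_sfc / g            # kg of air above one square metre (~1.0e4)
joules_per_K = column_mass * cp    # J/m^2 per 1 K of column-average warming

print(f"Column mass: {column_mass:.0f} kg/m^2")
print(f"Heat needed for 1 K column warming: {joules_per_K:.2e} J/m^2")
```

So roughly 10^7 joules per square metre per degree of column warming, which is why even small changes in the planet's energy budget matter over decades.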

We don’t know at what rate that accumulation might occur as other processes may come into play which reduce or magnify it. For example, with extra greenhouse gases, the rate at which the joules are allowed to escape to space may be reduced by additional responses, causing even more heating. On the other hand, there could be an increase in cloudiness which may limit the number of joules (from the sun) which enter the climate system, thus causing a cooling influence. A reaction of the climate system to extra CO2 that promotes even more accumulation over what would have happened due to CO2 alone is a positive feedback, while one that limits the accumulation of joules is a negative feedback. In the climate system, there are numerous feedbacks of both signs, all interdependent and intertwined.

The second aspect of enhanced temperature change is the amount of amplification the higher-altitude layers will experience relative to the surface warming, as noted earlier; it is linked to, and complements, the first aspect. In simple terms, if enough joules are added to the troposphere to increase its temperature by 1 °C throughout, one would expect a uniform 1 °C warming from the surface to the top of the troposphere. However, as seen in the way the real atmosphere behaves on monthly and yearly time scales, the surface temperature change tends to be less than 1 °C while the upper troposphere warms by more than 1 °C. Since the expected increase of the surface temperature is reduced for a given number of joules added, this phenomenon is called a negative lapse-rate feedback on surface temperature (even though the upper air heats up more). So the models anticipate a strong amplification of the surface temperature change as one ascends through the troposphere. [So, if someone claims that surface and upper-air trends agree in magnitude, then they are also claiming that this is not consistent with the enhanced greenhouse effect, since, according to models, the temperature trends at those two levels should not agree.]

Thus, there are two ideas to test in the tropics, (1) the overall magnitude of the layer-average temperature rise and (2) the magnification or amplification of the surface temperature change with height.

Balloons and satellites
Measurements of tropical tropospheric temperature have been performed by balloons that ascend through the air and radio back the atmosphere’s vital statistics, like temperature, humidity, etc. Due to a number of changes in these instruments through the years, research organizations have spent a lot of effort removing such problems to create homogeneous, consistent databases of these readings. For this study we shall assume that the average of four major and well-published datasets (known as RATPAC, RAOBCORE, RICH and HadAT2) will serve as the “best guess” of the tropical temperatures at the various elevations (see Christy et al. 2007, 2010, 2011 for descriptions and earlier results).

For a layer-average of the tropospheric temperature there are two satellite-based tropospheric datasets (known as UAH and RSS) which have by independent methods combined the readings from several spacecraft carrying microwave instruments into a time series beginning in late 1978. There are dozens of publications which detail the methods used by the various groups to generate both balloon and satellite products. Through the years each group has updated their products as new information has come to light, and we use the latest versions as of June 2013.

The time frame considered here begins in Jan 1979 and ends in Dec 2012, as this is the period for which we have output from both models and observations (balloons and satellites). It is also the period in which the greatest accumulation of heat energy (joules) should be evident, due to the increasing impact of rising greenhouse gas concentrations.

To examine the simple magnitude of full-tropospheric trends we look at two layers corresponding to what the satellites measure: roughly the average temperature from the surface to about 10 km (lower troposphere or TLT) and from the surface to about 17 km (mid-troposphere or TMT). TMT gives more weight to the region between 500 hPa (5.5 km) and 200 hPa (12 km), where the warming is expected to be most pronounced according to models, so the figures will focus on TMT. We can simulate the satellite layer using both balloon data and model output for direct, apples-to-apples comparisons (Fig. 1).
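Simulating the satellite layer amounts to collapsing a temperature profile to a single weighted vertical average. A minimal sketch follows; the levels, weights and trend values are made-up placeholders for illustration, not the actual MSU/AMSU TMT weighting function or any dataset from this post:

```python
# Sketch: collapse per-level temperature trends into one "satellite layer"
# value via a vertical weighting function. All numbers are placeholders.
levels_hpa = [1000, 850, 700, 500, 300, 200, 100]
weights    = [0.05, 0.10, 0.15, 0.25, 0.25, 0.15, 0.05]  # sum to 1
trends     = [0.10, 0.08, 0.09, 0.11, 0.13, 0.10, -0.05] # degC/decade, hypothetical

tmt_trend = sum(w * t for w, t in zip(weights, trends))
print(f"Simulated TMT trend: {tmt_trend:+.3f} degC/decade")
```

Note how a weighting function peaking in the mid/upper troposphere emphasises exactly the layers where the amplification is expected, and how stratospheric cooling (the negative trend at 100 hPa) drags the weighted value down slightly.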

Figure 1. Time series of the mid-tropospheric temperature (TMT) of 73 CMIP-5 climate models (rcp8.5) compared with observations (circles are averages of the four balloon datasets and squares are averages of the two satellite datasets). Values are running 5-year averages for all quantities. [There are four basic rcp emission scenarios applied to CMIP-5 models, but their divergence occurs after 2030. Thus, for our comparison which ends in 2012, there are essentially no differences among the rcp scenarios.] The model output for all figures was made available by the KNMI Climate Explorer.

We see that all 73 models anticipated greater warming than actually occurred for the period 1979-2012. Of importance here too is that the balloons and satellites represent two independent observing systems, yet they display extremely consistent results. This provides a relatively high level of confidence that the observations as depicted here have small errors. The observational trends from both systems are slightly less than +0.06 °C/decade, a value not significantly different from zero. The mean TMT model trend is +0.26 °C/decade, which is significantly positive in a statistical sense. The observed satellite and balloon TLT trends (not shown) are +0.10 and +0.09 °C/decade respectively, and the mean model TLT trend is +0.28 °C/decade. In a strict hypothesis test, the mean model trend can be shown to be statistically different from that of the observations, so that one can say the model-mean has been falsified (a result stated in a number of publications already for earlier sets of model output). In other words, the model-mean tropical tropospheric temperature trend is warming significantly faster than observations (see Douglass and Christy 2013 for further information).

Amplification
Regarding the second aspect of temperature change, we show the vertical structure of those changes in Fig. 2 where we display the temperature trend by vertical height (pressure) as indicated by the four balloon datasets (circles), their average (large circle) and 73 model simulations (lines of various types).

Figure 2 Temperature trends in °C/decade by pressure level with 1000 hPa being the surface and 100 hPa being around 16 km. Circles represent the four observational balloon datasets, the largest circle being their mean. The lines represent 73 CMIP-5 model simulations (identities in Fig. 3) with the non-continuous lines representing models sponsored by the U.S. The large black dashed line is the 73-model mean. The pressure values are very close to linear with respect to mass but logarithmic with respect to altitude, so that 500 hPa is near 5.5 km altitude, 300 hPa near 9 km altitude and 200 hPa about 12 km altitude.

Figure 3 Caption for Fig. 2, identifying model runs and observational datasets.

The models in particular show increasing trends as altitude increases up to 250 hPa (about 10 km) before decreasing toward the stratosphere (~90 hPa). In comparing model simulations with the observations it is clear that between 850 and 200 hPa, all model results are warmer than the average of the balloon observations, a result not unexpected given the information in Fig. 1.

The amount of amplification of the surface trend with elevation in Fig. 2 is somewhat difficult to discern, as each model has its own surface-trend magnitude. To better compare the amplification effect, we normalize the pressure-level trend values by the surface trend for each dataset and model simulation.
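The normalization is simply a per-level division by the surface trend. A minimal sketch with hypothetical placeholder values (not the datasets in the figures):

```python
# Sketch: amplification ratio = upper-air trend / surface trend.
# Trend values are hypothetical placeholders for illustration.
levels_hpa = [1000, 850, 700, 500, 300, 250, 200]
trends     = [0.10, 0.09, 0.11, 0.14, 0.19, 0.20, 0.18]  # degC/decade

surface_trend = trends[levels_hpa.index(1000)]
ratios = [t / surface_trend for t in trends]
for p, r in zip(levels_hpa, ratios):
    print(f"{p:4d} hPa: amplification {r:.2f}")
```

One caveat the post itself raises: when the surface trend (the denominator) is small, the ratio becomes very noisy, which is why the observational ratios in Fig. 4 scatter more than the absolute trends in Fig. 2.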

Figure 4. Value of the 1979-2012 temperature trend at various upper levels divided by the magnitude of the respective surface trend, i.e. the ratio of upper air trends to surface trends. Model simulations are lines with the average of the models as the dotted line. Squares are individual balloon observations (green – RATPAC, gray RAOBCORE, purple – RICH and orange – HadAT2) with the averages of observations the gray circles.

Figure 4 displays the ratio, or amplification factor, that observations and models depict for 1979-2012 in the tropics (see Christy et al. 2010 for further information). The mean observational result indicates values between +0.5 and +1.5 through the lower and middle troposphere (850 to 250 hPa). [The observational results tend to have greater variability because the denominator (the surface trend) is relatively small. Viewing Fig. 2 shows that the observations are rather tightly bunched for absolute trends in comparison with the model spread.] The models indicate a systematic increase in the ratio from 1.0 at the surface, with amplification factors well above +1.5 from 500 to 200 hPa. What this figure clearly indicates is that the second aspect of this discussion, namely the amplification of warming with increasing altitude, is also overdone in the climate models. The differences between the means of the observations and the models are significant.

Overwarm
While there is much that can be discussed from these results, we wonder simply why the models overwarm the troposphere compared with observations by such large amounts (on average) during a period when we have the best understanding of the processes that cause the temperature to change. During a period when the mid-troposphere warmed by +0.06 °C/decade, why does the model average simulate a warming of +0.26 °C/decade?

Unfortunately, a complete or even satisfactory answer cannot be provided. Each model is constrained by its own sets of equations and assumptions that prevent simple answers, especially when all of the individual processes are tangled together through their unique complex of interactions. The real world also presents some baffling characteristics since it is constrained by the laws of physics which are not fully and accurately known for this wickedly complex system.

An interesting feature of the models is that almost all show greater year-to-year variability than the observations (Fig. 1). The average detrended annual variance of the model anomalies is 60 percent greater than that of the observational datasets. This is a clue that suggests the models’ atmospheres are more sensitive to forcing than is the real climate system, so that an increase in greenhouse forcing in models will lead to a greater temperature response than experienced by the actual climate system. But saying the climate models are too sensitive only identifies another symptom of the issue, not the cause.

We want to know why the extra joules of energy that increasing CO2 concentrations should be trapping in the climate system are not found in Nature’s atmosphere compared with what the models simulate.

Could the extra joules be absorbed by the deep ocean and prevented from warming the atmosphere (Guemas et al. 2013)? This requires extremely accurate measurements of the deep ocean (better than 0.01 °C precision) which are not now available comprehensively in space and time. Current studies based only on observations suggest this enhanced sequestration of heat is not happening.

Could there be a separate process, like enhanced solar reflection by aerosols, that is keeping the number of joules available for absorption at a smaller level relative to the past? The interaction of aerosols with the entire array of climate processes is another fundamental area of research that has more questions than answers. How do aerosols affect cloudiness (more? less? brighter? darker?)? What is the precise, time-varying distribution of all types of aerosols, and what exactly does each type do in terms of affecting the absorption and reflection of joules at all frequencies? The IPCC typically shows very large error ranges for our knowledge of aerosol effects, so there is a possibility that models have significant and consistent errors in dealing with them (IPCC 2007 AR4 Fig SPM.2).

Clouds and water vapor
Could there be a complex feedback response in the way the real atmosphere handles water vapor and clouds that acts to enhance the expulsion of joules to space under extra greenhouse forcing so they don’t accumulate very rapidly? Of the many processes that models struggle to represent, none are more difficult than clouds and water vapor. As recently shown by Stevens and Bony (2013) different models driven by an identical, simplified forcing produced very different results for cloudiness. This is my favorite option in terms of explaining the lack of joule-accumulation. As my colleague Roy Spencer reminds us, if you think about it, the atmosphere should have 100 percent humidity because it has an essentially infinite source of water in the oceans. However, precipitation prevents that from happening, so precipitation processes are apparently in control of water vapor concentrations – the greenhouse gas with the largest impact on temperature. This means the way precipitation and clouds behave (both in causing changes or responding to them) when slight changes occur in the environment is key in my view. We have actually measured large temperature swings that were preceded by changes in cloudiness in our global temperature measurements. So a response to the extra CO2 forcing by clouds and water vapor, which have a massive impact on temperature, could be the reason for the rather modest temperature rise we’ve experienced (Spencer and Braswell, 2010).

Or, could there be natural variations that completely overcome small enhancements in greenhouse-joule-trapping? These variations have demonstrated the ability to drive large temperature swings in the past, but we cannot simulate or predict them well at all. For that we need extremely accurate ocean simulations along with accurate representations of clouds, precipitation and water vapor (among other things).

The bottom line is that, while I have some ideas based on some evidence, I don’t know why models are so aggressive at warming the atmosphere over the last 34 years relative to the real world. The complete answer is probably different for each model. To answer that question would take a tremendous model evaluation program run by independent organizations that has yet to be formulated and funded.

What I can say, from the standpoint of applying the scientific method to a robust response-feature of models, is that the average model result is inconsistent with the observed rate of change of tropical tropospheric temperature, inconsistent both in absolute magnitude and in vertical structure (Douglass and Christy 2013). This indicates our ignorance of the climate system is still enormous and, as suggested by Stevens and Bony, this performance by the models indicates we need to go back to the basics. From this statement it is only a short step to the next: the use of climate models in policy decisions is, in my view, not to be recommended at this time.

Biosketch
J.R. Christy is Distinguished Professor of Atmospheric Science at the University of Alabama in Huntsville and Director of the Earth System Science Center. He is Alabama’s State Climatologist. In 1989 he and Dr. Roy Spencer, then of NASA, published the first global, bulk-atmospheric temperatures from microwave satellite sensors. For this achievement they were recognized with NASA’s Medal for Exceptional Scientific Achievement and the American Meteorology Society’s Special Award for developing climate datasets from satellites. Christy has served on the IPCC panels as Contributor, Key Contributor and Lead Author and has testified before the U.S. Congress, federal court, many state legislatures and regulatory boards on climate issues.

References

Christy, J. R., W. B. Norris, R. W. Spencer, and J. J. Hnilo. Tropospheric temperature change since 1979 from tropical radiosonde and satellite measurements, J. Geophys. Res., 2007. 112, D06102, doi:10.1029/2005JD006881.

Christy, J.R., B. Herman, R. Pielke, Sr., P. Klotzbach, R.T. McNider, J.J. Hnilo, R.W. Spencer, T. Chase and D. Douglass, (2010): What do observational datasets say about modeled tropospheric temperature trends since 1979? Remote Sens. 2, 2138-2169.

Christy, J.R., R.W. Spencer and W.B. Norris, 2011: The role of remote sensing in monitoring global bulk atmospheric temperatures. Int. J. Remote Sens., 32, 671-685, DOI:10.1080/01431161.2010.517803.

Douglass, D. and J.R. Christy, 2013: Reconciling observations of global temperature change: 2013. Energy and Env., 24 No. 3-4, 414-419.

Guemas, V., F.J. Doblas-Reyes, I. Andreu-Burillo and M. Asif, 2013: Retrospective prediction of global warming slowdown in the past decade. Nature Clim. Ch., 3, 649-653, DOI:10.1038/nclimate1863.

Spencer, R.W. and W.D. Braswell, 2010: On the diagnosis of radiative feedback in the presence of unknown radiative forcing. J. Geophys. Res., 115, DOI:10.1029/2009JD013371.

Stevens, B. and S. Bony, 2013: What are climate models missing? Science, 31 May 2013, doi:10.1126/science.1237554.

Leave a Reply

Expert comments to The (missing) tropical hot spot

Jump to public comments | Jump to off-topic comments
  • Marcel Crok

    Sherwood, Mears and Christy will write a first response to the two other blog posts. These reactions will be published in a few days.
Meanwhile feel free to comment in the public comments section. Be aware that comments are moderated in advance by our moderator at KNMI. This will be done during Dutch daytime hours.

  • Marcel Crok

The first reactions of the participants to each other’s guest blogs will be published early next week. It was quite difficult to find a period in which all three were at the office. So currently one of them, Christy, is on the road with infrequent internet access.
    Meanwhile just continue the discussion.

  • Steven Sherwood

Ross McKitrick seems to be implying that we should not trust the surface warming record, and should regard the atmospheric temperature record as a separate measure of climate change which somehow discounts global warming itself. He and others seem unwilling to accept the clear evidence that the surface and near-surface warming records are, collectively (that is, including independent ocean-surface, near-surface maritime, and 2-meter terrestrial records, which at least over the 20th century are all quite consistent), far more trustworthy and solid than the dodgy free-atmosphere trends. I think we all agree that recent warming in the Tropics has been less than we would have expected no matter how it is measured, and I agree this merits further research (and indeed it has spurred a flurry of efforts in the last year or two, so rest assured more papers will be coming out looking at this).

  • Carl Mears

    This is a response to phi. I think that phi makes a good point, which I'll try to explain a little more fully. If we calculated the amplification starting at 850 hPa instead of the surface (1000 hPa), the trend ratios from John's figures 2 and 4 would be much more in line with the expectations of moist adiabatic lapse rate theory, at least up to about 250 hPa. So the problem (at least in the radiosonde data) mostly occurs between 1000 hPa and 850 hPa. (We can't resolve features this small in height using the satellite data.) What might this mean? Some people with more of a climate change denial perspective might claim that the surface data are wrong, but I don't think this is very likely. What I think is more likely is that the models are getting something wrong about the boundary layer response to global warming. The boundary layer is much more complicated than the free atmosphere above it, so if the models are wrong, it seems more likely to be in this region.

    It is indeed surprising that the trend is less at 850 hPa than at the surface. Here is one way that it might be happening. Assuming that the lapse rate is determined by the moist adiabatic rate from the surface to the tropopause might be too simplistic. Under most convecting clouds, there is a sub-cloud region where the air is not saturated and the lapse rate is closer to the dry adiabatic lapse rate, which is much larger. (You can read about this in Kerry Emanuel's book "Atmospheric Convection".) If the thickness of this layer increased fast enough over time, then the temperature at 850 hPa (above this layer) would have a smaller trend than the surface, and we would get the observed temperature trends. This (I think) would require the relative humidity at the surface to trend downward, and I am not sure how this might occur.

    Maybe Steve could comment on this idea — he knows a lot more about tropical convection than I do.
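The sub-cloud mechanism sketched above turns on the large gap between the dry and moist adiabatic lapse rates. A quick back-of-envelope sketch (standard textbook approximations and nominal round constants; not taken from any of the posts) shows the size of that gap for warm tropical surface air:

```python
# Dry vs. moist adiabatic lapse rates, to show the gap the sub-cloud
# argument relies on. Standard textbook pseudoadiabatic approximation;
# all constants are nominal round values.
import math

g   = 9.81     # gravity, m/s^2
cp  = 1004.0   # specific heat of dry air, J/(kg K)
Lv  = 2.5e6    # latent heat of vaporization, J/kg
Rd  = 287.0    # gas constant for dry air, J/(kg K)
eps = 0.622    # Rd/Rv

def sat_mixing_ratio(T, p):
    """Saturation mixing ratio (kg/kg); Bolton-style fit; T in K, p in Pa."""
    Tc = T - 273.15
    es = 611.2 * math.exp(17.67 * Tc / (Tc + 243.5))  # sat. vapor pressure, Pa
    return eps * es / (p - es)

def moist_lapse_rate(T, p):
    """Approximate pseudoadiabatic (moist) lapse rate, K/km."""
    r = sat_mixing_ratio(T, p)
    num = 1.0 + Lv * r / (Rd * T)
    den = cp + Lv ** 2 * r * eps / (Rd * T ** 2)
    return 1000.0 * g * num / den

dry = 1000.0 * g / cp                           # ~9.8 K/km, everywhere
moist_sfc = moist_lapse_rate(300.0, 100000.0)   # warm, saturated tropical air

print(f"dry adiabatic:   {dry:.1f} K/km")
print(f"moist adiabatic: {moist_sfc:.1f} K/km at 300 K, 1000 hPa")
```

A deepening unsaturated sub-cloud layer, with a lapse rate more than twice the moist value, could in principle open a trend gap between 1000 hPa and 850 hPa in the way Mears describes.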

  • Carl Mears

    First comments of Carl Mears on the two other blog posts:

    It appears to me that the three of us agree fairly well about the basic facts, but differ on interpretation.
    Like Steve, I am somewhat mystified about all the attention given to the tropical hotspot, as I don’t think it is very important for global warming theory, and it is relatively poorly observed. The uncertainties involved in the various types of observations are made worse by focusing on a ratio between two relatively small trend values.
    I think all three of us agree that the observed temperature changes in the tropics (and globally) are less than predicted over the last 35 years. John uses this fact to argue that there are fundamental flaws in all climate models, and that their results should be excluded from influencing policy decisions. This goes much too far. First, many imperfect models are used to inform policy makers in many areas, including models of the economy, population growth, environmental toxins, new medicines, traffic flow, etc. As pointed out by a commenter in this thread, policy makers are used to dealing with uncertain predictions. If we throw out all imperfect models, we will be reduced to consulting the pattern of tea leaves on the bottom of our cups to make decisions about the future. Second, as I argue below, there are many possible reasons for this discrepancy, and only a few substantially influence the long-term predictions.
    Let’s return from this philosophical aside to a discussion of the difference between recent observations and climate model predictions. In my mind, the possible causes for this disagreement fall into 3 general categories.

    Bad Luck
    Bad Forcings
    Bad Model Physics

    Bad Luck. By Bad Luck, I mean that the last decade is cooler than normal due to the random occurrence of some pattern of unforced internal variability. Most climate model simulations exhibit decade-long periods of little or no warming, as shown by Easterling and Wehner (2009). And in general, climate models, even though they tend to have too much year-to-year variability (as mentioned by John in his initial post), often show too little variability on multidecadal and longer scales. There is an interesting discussion of this topic in a recent issue of the AGU newsletter (Lovejoy, 2013). So, for multidecadal time periods, I would expect the real world to be bumpier than a typical model simulation, and much bumpier than the mean of many simulations. Thus I think there is some possibility that part of the cause of the current discrepancy is just bad luck. Still, the time period is getting long enough, and the discrepancy large enough, that we should be able to begin to understand what is going on. In other words, even if it is due to a random fluctuation, we should be able to see the fluctuation in other variables or parts of the system, such as heat flux into either the ocean or into space.
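The "Bad Luck" category lends itself to a quick numerical illustration. The toy sketch below (all parameters invented for illustration, not fitted to any model or dataset) shows how often a steady warming trend plus AR(1) "red" noise produces a decade with zero or negative trend:

```python
# Toy illustration of unforced variability masking a forced trend:
# annual anomalies = linear trend + AR(1) noise, then count 10-year
# windows whose OLS trend is zero or negative. Parameters are invented.
import random

random.seed(1)

def simulate(years=35, trend=0.02, phi=0.7, sigma=0.1):
    """Annual temperature anomalies (deg C): forced trend + AR(1) noise."""
    noise, series = 0.0, []
    for yr in range(years):
        noise = phi * noise + random.gauss(0.0, sigma)
        series.append(trend * yr + noise)
    return series

def ols_trend(y):
    """Ordinary least-squares slope per time step."""
    n = len(y)
    xm = (n - 1) / 2.0
    ym = sum(y) / n
    num = sum((i - xm) * (yi - ym) for i, yi in enumerate(y))
    den = sum((i - xm) ** 2 for i in range(n))
    return num / den

flat, total = 0, 0
for run in range(200):               # 200 synthetic 35-year "climates"
    s = simulate()
    for start in range(len(s) - 9):  # every overlapping 10-year window
        total += 1
        if ols_trend(s[start:start + 10]) <= 0.0:
            flat += 1

print(f"{100 * flat / total:.0f}% of decades show a zero or cooling trend")
```

Even with a built-in warming of 0.2 °C/decade, a noticeable fraction of decades come out flat or cooling, which is the qualitative point of Easterling and Wehner (2009).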

    Bad Forcings. Forcings are the inputs to the climate system by processes external to Earth’s climate. These include anthropogenic modifications to the atmosphere (CO2, methane, various aerosols), volcanic aerosols, and changes in solar output. If the estimates of these forcings, which are used as input to the climate model simulations, are not correct we can hardly expect the climate model output to be correct. Is there any evidence for incorrect forcings over the past 35 years being used for model input? In fact, there is.
    One example is radiative forcing due to stratospheric sulfate aerosols, which are little droplets of sulfuric acid and water that scatter the incoming light from the sun. It is well accepted that increases in stratospheric aerosols warm the stratosphere and cool the surface and troposphere. Both effects can clearly be seen in the MSU temperature record after the colossal eruptions of El Chichon and Pinatubo. These eruptions spewed large amounts of gaseous sulfur into the stratosphere, where it oxidized to form excess levels of sulfate aerosols. These events, and others before 2000, are well represented in the stratospheric aerosol datasets used to drive the 20th century simulations for CMIP-5. After 2000, the level of stratospheric aerosols in the input datasets is allowed to decay to zero. In real life, however, observations indicate that the background level of stratospheric aerosols increased over the 2000-2010 period (Solomon et al., 2011), probably due to a large number of small volcanic eruptions (Neely et al., 2013). The effect is large enough to offset about 25% of the effect of increasing CO2 over this period (Solomon et al., 2011).
    Other forcings with possible problems include solar output, stratospheric ozone, and black carbon aerosols. The sun has been in a quiet, low-output phase for longer than expected. This is not included in the CMIP-5 forcings, and thus model results should be expected to be slightly warmer than real life. Temperature changes in the upper troposphere and lower stratosphere have been shown to be very sensitive to the stratospheric ozone concentrations used (Solomon et al., 2012). These effects appear to extend below the tropical tropopause, low enough to affect tropospheric temperature trends and the tropospheric hotspot. The ozone dataset used in the CMIP-5 simulations is the one with the most conservative trends in ozone; if one of the other datasets had been used, the models would have shown less upper-tropospheric warming. There are probably other similar problems that I am not aware of.
    None of these effects are large enough to explain the model/measurement discrepancies by themselves, but they are each likely to be part of the cause. The cumulative effect of all has not been evaluated.

    Bad Model Physics. It is also possible that Bad Model Physics could be part of the cause. Possible causes in this category include problems with cloud feedback, problems with the effects of tropospheric aerosols (in particular the interaction of aerosols with cloud formation), and poorly modeled interaction between the atmosphere and ocean. The first two are widely acknowledged to be major contributors to the uncertainty in model predictions. For the third, there is some evidence that heat is being subducted into the ocean at a rate higher than the models expect, though exactly where it is going is less clear (Balmaseda et al., 2013; Levitus et al., 2012). In our own observations of ocean surface winds, we see trends in wind speed in the tropical Pacific that are far larger than those predicted by models. These winds may serve to stir up the ocean and remove heat from the surface. We do not know whether these effects represent part of the response to global warming, or part of a pattern of decadal-scale random fluctuation.
    Note that only some of the possible problems with model physics affect the long-term model predictions. Increased heat flux into the ocean only serves to delay the temperature increase (and increase the rate of sea level rise), while an error in cloud feedback could affect the long-term temperature rise.

    In summary, there are a large number of possible explanations for the model/measurement discrepancy in recent temperature rise. Only a few of these, such as errors in cloud feedback, affect the long-term predictions, while others, such as errors in the natural forcings used as model input, or simulated ocean heat uptake do not. At this time, we simply do not know the exact cause or causes, but I strongly suspect that it is due to a combination of causes rather than one dominant cause.

    References.

    Easterling, D. R. and M. F. Wehner, "Is the Climate Warming or Cooling?", Geophysical Research Letters, 36, L08706, doi:10.1029/2009GL037810, 2009.
    Lovejoy, S., “What is Climate”, EOS, 94, number 1, January 2013.
    Solomon, S., J.S. Daniel, R. R. Neely III, J. P.Vernier, E. G. Dutton, and L. W. Thomason, ‘The Persistently Variable “Background” Stratospheric Aerosol Layer and Global Climate Change’, Science 333, pp 866-870, 2011.
    Solomon, S., P. J. Young, and B. Hassler, ‘Uncertainties in the evolution of stratospheric ozone
    and implications for recent temperature changes in the tropical lower stratosphere’, Geophysical Research Letters, 39, L17706, doi:10.1029/2012GL052723, 2012.
    M. A. Balmaseda, K. E. Trenberth, E. Kallen, ‘Distinctive climate signals in reanalysis of global ocean heat content’. Geophys. Res. Lett. 40, doi:10.1002/grl.50382 (2013).
    S. Levitus, et al., 'World ocean heat content and thermosteric sea level change (0-2000 m), 1955-2010'. Geophys. Res. Lett., 39, L10603, doi:10.1029/2012GL051106 (2012).

  • Steven Sherwood

    First comments of Steven Sherwood on the two other blog posts:

    I agree with pretty much everything Carl says, and he's gone into more detail than I did on the latest results. We agree that the data we have are basically not stable enough over time to distinguish whether a "hot spot" exists or not, or is as prominent as we would expect. We also agree that warming over the past couple of decades is running lower than nearly all CMIP5 models predict it should be, which is perhaps a more worthy "debate" topic and one that I think will get a lot of attention when the IPCC report comes out. The reasons likely include cooling influences that have not been applied to the models, such as the unprecedented recent solar minimum, the continuing rise in atmospheric aerosol concentrations and the decline in stratospheric water vapour. To some extent it may also be a chance fluctuation that will go the other way in a few years. Finally, it may signal a somewhat low climate sensitivity, but a sensitivity low enough to make global warming cease to be a problem is basically ruled out by other evidence, particularly palaeoclimate evidence.

    As to John Christy’s post, I don’t really think he’s being forthright about the uncertainties in the data. His Fig. 1 does not identify what datasets are actually being used, but Carl’s own plots show that the results depend on this. Also, the plot states that the model calculations he compares to are based on scenario “RCP8.5,” but that is a high-emissions future scenario so I am puzzled by why he is doing this — there are historical simulations in CMIP5 that are meant for comparing with observations.

    John makes a number of rambling but sometimes interesting points. One is that models have too much interannual variability, which he suggests may be a sign they are too sensitive. He is right that they have too much interannual variability, but they have too *little* decadal variability as compared to paleoclimate data over the Holocene. So by his own reasoning maybe they are actually too insensitive.

    His statement that heat is not being sequestered in the oceans is false. A paper last year by Rahmstorf and colleagues showed that temporary heat storage associated with recent La Niña conditions could explain why warming during the last decade has been slower than in previous decades, and other studies of Earth's heat balance (most importantly papers by Syd Levitus and by Murphy et al.) have shown that the heat is indeed appearing in the world's oceans more or less as expected, as far as we can tell.

    I agree with John Christy that our models are imperfect (see my own post, which discusses this even more than he does), but not that this implies climate change is necessarily any less of a concern.

    I can also respond to one posted comment, on whether heat absorption by the oceans would mean we have less to worry about. The main problem is that this absorption is likely to be temporary.

  • John Christy

    First comments of John Christy on the two other blog posts:

    For some readers of the science of climate, the topic of tropical surface and atmospheric temperature differences has become the “issue that would not die.” We have been discussing this for 15 years and yet resolution has not been achieved, either in the trends themselves or in the physical understanding of the problem. This has become a specific example of the fact that the science of climate change from human causes is not a “settled science.”

    Comments on Mears Original Post

    The information provided in Mears’s comment focuses on a little-used quantity to investigate the tropical temperature. After presenting the information, Mears arrives at the conclusion that the observational data are not yet accurate enough to prove or disprove the magnitude of the model-generated hot-spot as real (i.e. not accurate enough to falsify the dominant model response regarding the enhanced greenhouse effect). I agree with virtually all that Mears writes as background. However, I think the fundamental question examined here may be viewed from a larger perspective that draws on more information that then can lead to a less ambiguous conclusion.

    Mears focuses on “observations” of a quantity known as the “Temperature Tropical Troposphere” TTT which is not actually observed, but is rather a derived quantity dependent upon the difference of two measurements (temperature of the mid-troposphere (TMT) and lower stratosphere (TLS)). The differencing process tends to increase the error opportunities for the derived product. Observations of the lower stratosphere (TLS) have greater uncertainty, and this contributes to the spread of the TTT results among the various datasets.
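Christy's point about differencing can be made concrete with simple error propagation. The sketch below uses the commonly quoted TTT weighting (TTT = 1.1·TMT − 0.1·TLS); the per-channel trend uncertainties are hypothetical round numbers, not published values:

```python
# Error propagation for a derived channel: if TTT = 1.1*TMT - 0.1*TLS
# and the channel errors were independent, the weighted errors add in
# quadrature. The sigma values below are hypothetical illustrations.
import math

w_tmt, w_tls = 1.1, -0.1   # commonly quoted TTT weights

sigma_tmt = 0.05           # hypothetical 1-sigma TMT trend error, C/decade
sigma_tls = 0.15           # stratospheric channel assumed noisier

sigma_ttt = math.sqrt((w_tmt * sigma_tmt) ** 2 + (w_tls * sigma_tls) ** 2)

print(f"TMT alone:   +/-{sigma_tmt:.3f} C/decade")
print(f"derived TTT: +/-{sigma_ttt:.3f} C/decade")
```

With these illustrative numbers the inflation is modest and comes mostly from the 1.1 weight on TMT; correlated errors between the channels, or a noisier TLS, would change the picture, which is why the spread among the TTT datasets is the real diagnostic.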

    A more direct method that reduces observational error (especially of the more uncertain stratospheric portion) is simply to examine the temperature product of one channel which captures the bulk of the desired signal (mid-to-upper troposphere temperature) and which avoids the compounding of errors that a “difference” of products introduces. (The stratosphere only contributes about 7% to the TMT tropical signal.) Hence in my posting, the comparisons focused on the TMT product about which much more is known. [I calculated and discussed, but did not show by chart, that the results using the lower tropospheric temperature or TLT were very similar to those of TMT.] Mears also tends to discuss those datasets which are the “warmest”, i.e. RSS, STAR and MERRA.

    However, there is information on why the datasets differ, and this can be used to infer a more confident assessment (see Christy et al. 2010 and 2011 for more information).

    The following are a few examples of the knowledge we have of these datasets. STARv2.0 contains a spurious warming shift on 1 Jan 2001 which will be corrected in the new v3.0 to be released later this year. So, STAR’s results in Mears’s contribution overstate the warming. MERRA is an outlier dataset for temperature trends with problems, including a significant warm shift between 1990 and 1992 probably due to the inability to correct for infrared contamination from Mt. Pinatubo. As a result, MERRA produces the warmest tropospheric trend by far of all observational and reanalysis datasets, being almost +0.10 °C/decade warmer than the average of the balloons. HadAT2, using a more conservative methodology for detecting shifts in balloon measurements, likely has retained spurious upper troposphere/lower stratosphere cooling from radiosonde equipment changes over time which contributes to its relatively “cool” trend. ERA-I appears to be excellent in the lower troposphere, but with the inclusion of aircraft reports after 2002 experienced a spurious warming in the upper troposphere due to the previously too-cool analyzed values in that region (note: this also impacts the RAOBCORE and RICH datasets).

    A minor controversy appeared last year when Po-Chedley and Fu (2012) allegedly found an error in our UAH TMT dataset, when in fact the main source of their finding was an incorrect understanding of the UAH merging sequence of the satellites (Christy and Spencer 2013). It is understandable that Mears highlights the RSS data, since he is the source of the product, but evidence has been published to demonstrate that the RSS TMT product likely has spurious tropical warming due to an apparent overcorrection of the diurnal cycle errors (e.g. Christy et al. 2010). [In a counter-intuitive result, this correction causes the RSS global TLT trend to be cooler than UAH's.] I'm not clear as to why neither RATPAC (the NOAA balloon dataset) nor JRA-25 (a Japanese reanalysis dataset) was included in Mears's analysis. Both are very near the overall averages shown in my earlier posting, and both are cooler than RSS, STAR and MERRA. So, while none of the datasets can claim to be perfect, we can explain many of their differences and, by averaging, reduce the independent errors.

    Thus there are clear reasons for not highlighting RSS, MERRA or STAR as observational datasets. Rather, to take a more unbiased approach to the observations, I simply calculated the mean of the two categories of datasets (satellite and balloon separately) to reduce the random error opportunities. In this way the impact of independent errors that lead to trends that are too warm or too cool may be limited. The fact that the tropospheric trends from the averages of two very different and independent sets of monitoring systems, i.e. balloons and satellites, are within 0.01 °C/decade of each other lends confidence to the result. [I did not include STAR due to the known shift in its temperature and the fact that it uses the identical diurnal corrections as RSS – thus it is very similar to RSS but with a known spurious shift. However, even if STAR were included as an "independent" dataset, the significance of the results would not change.]

    The simple numbers tell the story and can’t be overlooked. From 73 CMIP-5 model runs, the 1979-2012 mean tropical TMT trend is +0.26 °C/decade. The same trends calculated from observations, i.e. the mean of four balloon and mean of two satellite datasets, are slightly less than +0.06 °C/decade. Tropical TMT is a quantity explicitly tied to the response of models to the enhanced greenhouse effect (or any applied forcing). Because the sample of climate model runs is relatively large (N = 73) we have a very confident assessment of the model-mean value and its error range does not encompass the observations. In addition, the agreement of the means of two independent observational systems further indicates that we have a very good idea of the actual TMT trend. The mean of the models (often used as the “best estimate” in IPCC assessments) and observations differ by +0.20 °C/decade which is highly significant. And, we are not talking about 10 or 15-year trends – this is a 34-year period over which this discrepancy has grown. Regarding the highly significant nature of the differences in my initial posting, I failed to mention the many papers led by Ross McKitrick (e.g. McKitrick et al. 2010, McKitrick et al. 2011 and others) in which they demonstrate with more advanced statistical tools that the models and observations are indeed significantly different regarding tropical tropospheric temperature trends.
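The arithmetic behind this confidence claim can be sketched as follows. The two trend values are from the post, but the ensemble standard deviation is a hypothetical round number, and the sketch deliberately ignores observational error and internal variability, both of which a proper test must include:

```python
# Rough significance sketch: model-mean TMT trend vs. observed trend.
# Trend values are from the post; the ensemble spread is hypothetical,
# and observational/internal-variability uncertainty is ignored.
import math

n_runs     = 73
model_mean = 0.26   # C/decade, mean of CMIP-5 runs (from the post)
obs_trend  = 0.06   # C/decade, mean of balloon/satellite sets (from the post)
ens_sd     = 0.10   # hypothetical spread across model runs, C/decade

# Standard error of the multi-model mean, if the runs were independent
se_mean = ens_sd / math.sqrt(n_runs)

z = (model_mean - obs_trend) / se_mean
print(f"SE of model mean: {se_mean:.4f} C/decade, z ~ {z:.0f}")
```

Whether the standard error of the multi-model mean is even the right yardstick, rather than the full ensemble spread, is itself part of the dispute between the discussants.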

    All in all there is little to argue with in the posting by Mears as it reflects a typically careful and clinical examination of the issue, and indeed allows for the conclusion I state above. However, the extra information shown above, I believe, enhances the confidence in the observational results that then leads to a more definitive statement that the models, on average, have significantly misrepresented the evolution over the past 34 years of a bulk quantity directly tied to the models’ response to the enhanced greenhouse effect.

    Comments on Sherwood’s Original Post

    Sherwood provides a more theoretical discussion of the topic, as well as bringing up some different problems with which climate models must also contend. He points to the likely tendency of models to tie the surface trends too tightly to upper tropospheric temperature and I agree. If I read the post correctly, however, Sherwood also expresses the opinions that (1) the observations are too error-prone for any definitive use, and (2) even if they were useful, the large disagreement between observations and models in the tropical troposphere would be largely inconsequential.

    I hold very different opinions, in that (1) with time and increased understanding, the observations of the troposphere from independent systems are converging on the true answer, and (2) the fact that the average model is accumulating heat in the upper atmosphere at a rate three times faster than the observations has serious implications for representing the entire climate system, including a mis-modeling of the surface temperature (see next paragraph). It also betrays a mis-handling of the hydrologic/convective processes of the troposphere – processes that are fundamental to understanding climate variability and change over time, as Sherwood notes. And, by utilizing TMT as I did, I avoided the uncertainties of the stratospheric correction to TTT that Sherwood rightly points out.

    Sherwood displays a plot that shows how the water vapor feedback and the lapse-rate feedback tend to cancel. It is important to realize that this is generated from model output, which I find difficult to accept as a proxy for the real world. In models generally, the water vapor feedback roughly doubles the surface warming, and the lapse rate feedback mitigates this somewhat at the surface. Now, if I follow the train of thought that the upper temperature trend doesn't matter, is Sherwood claiming there would still be the extra surface warming if the water vapor feedback in models were zero? The observational evidence suggests the water vapor feedback is weak to non-existent on multi-decadal time scales, which implies less warming than that depicted by models with their strong positive water vapor feedback.
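The cancellation at issue here can be illustrated with simple feedback arithmetic, using nominal central feedback values of the kind tabulated in the literature (W/m²/K; these numbers are an illustration, not from either post):

```python
# How water-vapour and lapse-rate feedbacks offset each other in the
# sensitivity denominator. Feedback values are nominal literature-style
# central estimates chosen for illustration.
planck  = -3.2   # Planck response, W/m^2/K
wv      = +1.8   # water vapour feedback
lapse   = -0.8   # lapse rate feedback
forcing = 3.7    # ~2xCO2 forcing, W/m^2

def warming(*feedbacks):
    """Equilibrium warming for the given set of feedbacks, K."""
    return forcing / -sum(feedbacks)

print(f"Planck only:          {warming(planck):.1f} K")
print(f"+ water vapour:       {warming(planck, wv):.1f} K")
print(f"+ lapse rate as well: {warming(planck, wv, lapse):.1f} K")
```

With these numbers, water vapour roughly doubles the warming and the lapse-rate feedback claws part of that back, which is the cancellation Sherwood's plot shows; Christy's dispute is over whether the modelled water-vapour term is realistic.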

    Sherwood indicates that the surface temperature record is robust for climate purposes. I have three comments to make here. First, we and others have shown that the land surface record, as represented by daily mean temperature, is likely contaminated by a warming nighttime trend due to surface development around the world (e.g. Christy et al. 2006, McKitrick and Michaels 2007, Christy et al. 2009, McKitrick 2010, McKitrick and Nierenberg 2010, McNider 2012, Christy 2013.) Secondly, if the surface temperature is the most robust and important metric, we find that the 73 models shown in the earlier post, on average, produce a surface trend in the tropics that is almost twice too warm since 1979 even with the contaminated observational data (+0.19 vs. +0.11 °C/decade). Thus, even with the surface temperature metric, there are problems for models. Thirdly, the fundamental measure of greenhouse warming is the accumulation of joules in the climate system. The surface temperature is woefully inadequate to document this metric as the deep atmosphere and ocean represent the reservoirs that should be monitored to detect changes in this quantity.

    When pointing out other model problems Sherwood notes, as an example, that no climate model has replicated the rapid north polar ice loss and that this is an interesting problem. However, none of the models have shown an increasing extent of sea ice in the southern hemisphere either – so we have a problem at both poles for which models have opposing answers and thus opposing issues to solve. There are other such examples of models overwarming the climate. However, as stated in my original post, the importance of the tropical atmospheric temperature is key to model fidelity to the real world because it involves the complicated and ubiquitous interrelationships among the various water components of the climate system.

    I did not quite understand Sherwood’s closing comments challenging scientists to come up with a different theory regarding the hot spot. I understand the challenge regarding a new explanation for tropical features, but the implication is that the current theory should reign. However, it is clear that the current theory (as expressed in climate models) fails the test against observations and thus should not be granted any particular meritorious status. The current theory may be close to reality in some aspects but is missing something important since it diverges so far from reality. Other theories have been offered (i.e. negative cloud feedback as mentioned by Sherwood) which in fact more closely match the observed temperature changes.

    As I indicated in the original post, I cannot say why any particular model departs so much from reality, but I believe the evidence is clear that the departures are real and significant which then exposes serious problems for the models as long-range forecasting tools.

    John R. Christy
    University of Alabama in Huntsville

    References:

    Christy, J.R., W.B. Norris, K. Redmond and K. Gallo, 2006: Methodology and results of calculating central California surface temperature trends: Evidence of human-induced climate change? J. Climate, 19, 548-563.

    Christy, J.R., W.B. Norris and R.T. McNider, 2009: Surface temperature variations in East Africa and possible causes. J. Clim. 22, DOI: 10.1175/2008JCLI2726.1.

    Christy, J.R., B. Herman, R. Pielke, Sr., P. Klotzbach, R.T. McNider, J.J. Hnilo, R.W. Spencer, T. Chase and D. Douglass, 2010: What do observational datasets say about modeled tropospheric temperature trends since 1979? Remote Sens. 2, 2138-2169. Doi:10.3390/rs2092148.

    Christy, J.R., R.W. Spencer and W.B Norris, 2011: The role of remote sensing in monitoring global bulk tropospheric temperatures. Int. J. Remote Sens. 32, 671-685, DOI:10.1080/01431161.2010.517803.

    Christy, J.R. and R.W. Spencer, 2013: Comments on "A bias in the midtropospheric channel warm target factor on the NOAA-9 Microwave Sounding Unit." J. Atmos. Oceanic Technol., 30, 1006-1013, doi:10.1175/JTECH-D-12-00107.1.

    Christy, J.R., 2013: Monthly temperature observations for Uganda. J. Applied Meteor. Clim. (in press).

    McKitrick, R.R. and P.J. Michaels, 2007: Quantifying the influence of anthropogenic surface processes and inhomogeneities on gridded global climate data. J. Geophys. Res., 112:D24S09. DOI:10.1029/2007JD008465.

    McKitrick, R.R., S. McIntyre and C. Herman, (2010): Panel and multivariate methods for tests of trend equivalence in climate data sets. Atmos. Sci. Lett., 11(4), 270-277. doi: 10.1002/asl.290.

    McKitrick, R.R. and N. Nierenberg, 2010: Socioeconomic patterns in climate data. J. Econ. Soc. Meas. 35:149-175. DOI:10.3233/JEM-2010-0336.

    McKitrick, R.R., S. McIntyre and C. Herman, (2011): Corrigendum. Atmos. Sci. Lett., 12(4), 386-388. doi:10.1002/asl.360.

    McNider, R.T., G.J. Steeneveld, A.A.M. Holtslag, R.A. Pielke Sr., S. Mackaro, A. Pour-Biazar, J. Walters, U. Nair, and J.R. Christy, 2012: Response and sensitivity of the nocturnal boundary layer over land to added longwave radiative forcing. J. Geophys. Res., 117, D14106, doi:10.1029/2012JD017578.

    Po-Chedley, S. and Q. Fu, 2012: A bias in the midtropospheric channel warm target factor on the NOAA-9 Microwave Sounding Unit. J. Atmos. Oceanic. Technol., 29, 646-652.

  • A big thank you to the invited experts, Carl Mears, Steven Sherwood, and John Christy, for their detailed essays regarding the tropical hotspot.

    As Carl Mears noted in his response to the others: “It appears to me that the three of us agree fairly well about the basic facts, but differ on interpretation.”

    In that vein I’d like to focus on questions 1 and 2 from the introduction, which address some basic facts about the hotspot, about which it may be easiest to obtain agreement. I would like to invite all three invited experts to explicitly address these two questions (either with a yes or no, or with some context if desired), which I’ll repeat below.

    1) Do the discussants agree that amplified warming in the tropical troposphere is expected?

    Carl Mears explicitly addressed the thermodynamic cause of a tropical hotspot (I'll take that as a "yes") and Steven Sherwood alluded to it. John Christy referred to it only as a model prediction, without addressing the plausibility of the physical underpinning. Do all three of you agree that amplified warming (of an as yet unquantified magnitude) in the tropical troposphere is expected, based on established physics?

    2) Can the hot spot in the tropics be regarded as a fingerprint of greenhouse warming?

    Carl Mears and Steven Sherwood explicitly stated that enhanced tropospheric warming over the Tropics is not specific to a greenhouse mechanism, but should occur for any surface warming, irrespective of its cause (I’ll take that as a “no”). John Christy referred to the hotspot as a model-predicted consequence of the enhanced greenhouse effect, giving the impression that he regards it as a fingerprint specific for a greenhouse mechanism. John, perhaps you could confirm whether or not you view the hotspot as specific for a greenhouse mechanism?

    After getting these issues clarified, we can move on to Q3, on which there seems to be ample disagreement (on whether the enhanced tropical tropospheric warming is significantly different from observations or not).

  • Based on emails from both Steven Sherwood and John Christy, and based on Carl Mears's blog post, I can report that all three agree that

    1) Yes, amplified warming in the tropical troposphere is expected.

    And that

    2) No, the hot spot in the tropics is not specific to a greenhouse mechanism.

    Notice that I changed the wording of question/statement 2 here, because the word “fingerprint” was interpreted differently by John Christy than how we meant it.

    In his email to us, John Christy wrote regarding Q1: “Yes, the hot spot is expected via the traditional view that the lapse rate feedback operates on both short and long time scales.” Regarding Q2 he wrote: “it [the hot spot] is broader than just the enhanced greenhouse effect because any thermal forcing should elicit a response such as the “expected” hot spot.” Further elaborations in the email exchange, e.g. regarding whether to call this a fingerprint, involved interpretations as to the meaning of (a lack of) a hot spot, which we will defer for the moment.

    The next issue that we’ll take up is encapsulated in Q3:

    3) Is there a significant difference between modelled and observed amplification of surface trends in the tropical troposphere (i.e. between the modelled and the observed hot spot)?

  • Marcel Crok

    As concluded in the comment above, the three discussants agree on the first two basic issues. They all expect tropical amplification, and they agree that any forcing in the tropics should generate the hot spot, so that the hot spot is not strictly a fingerprint of greenhouse forcing.

    Based on their guest blogs and first comments there is much less agreement, though, about the existence of the hot spot in the observations. Put briefly, Sherwood and Mears state that the uncertainties in the data are so large that one cannot conclude much. Mears for example wrote in his guest blog: “Taken as a whole, the errors in the measured tropospheric data are too great to either prove or disprove the existence of the tropospheric hotspot. Some datasets are consistent (or even in good agreement) with the predicted values for the hotspot, while others are not. Some datasets even show the upper troposphere warming less rapidly than the surface.”

    Christy on the other hand is quite sure that the differences between observations and models are significant. According to him this applies both to the absolute warming trends of TMT (Temperature Middle Troposphere) (see his figure 1) and to the amplification (the ratio between the warming in the tropical troposphere and the warming at the surface) (see his figure 3).

    Mears (see his figures 1, 2 and 3) and Christy go into the most detail about the data they use, yet they end up drawing different conclusions. This needs clarification: how is that possible, and can we understand why they come to different conclusions?

    Both Mears and Sherwood (and this opinion is also given in several public comments) say that Christy underestimates the uncertainties in the data. Sherwood in his first comments wrote: “As to John Christy’s post, I don’t really think he’s being forthright about the uncertainties in the data.”

    1. Which datasets to use?
    Christy on the other hand states that he feels quite confident about the observational trends because he used the average of four radiosonde datasets (RATPAC, RAOBCORE, RICH and HadAT2) and two satellite datasets (RSS and UAH), and these averages are quite close to each other.

    Mears in his figure 2 shows RAOBCORE, RICH, HadAT2, RSS and UAH as well, but not RATPAC. In addition he shows two reanalysis datasets (ERA-Interim and MERRA) and a third satellite dataset (STAR). STAR and MERRA come closest to the modelled amplification in Mears’ figure 2.

    Christy in his first comments was critical about some of the datasets used by Mears:

    However, there is information on why the datasets differ, and this can be used to infer a more confident assessment (see Christy et al. 2010 and 2011 for more information).

    The following are a few examples of the knowledge we have of these datasets. STARv2.0 contains a spurious warming shift on 1 Jan 2001 which will be corrected in the new v3.0 to be released later this year. So, STAR’s results in Mears’s contribution overstate the warming. MERRA is an outlier dataset for temperature trends with problems, including a significant warm shift between 1990 and 1992 probably due to the inability to correct for infrared contamination from Mt. Pinatubo. As a result, MERRA produces the warmest tropospheric trend by far of all observational and reanalysis datasets, being almost +0.10 °C/decade warmer than the average of the balloons. HadAT2, using a more conservative methodology for detecting shifts in balloon measurements, likely has retained spurious upper troposphere/lower stratosphere cooling from radiosonde equipment changes over time which contributes to its relatively “cool” trend. ERA-I appears to be excellent in the lower troposphere, but with the inclusion of aircraft reports after 2002 experienced a spurious warming in the upper troposphere due to the previously too-cool analyzed values in that region (note: this also impacts the RAOBCORE and RICH datasets).

    So Christy gives reasons why he doesn’t use ERA, MERRA and STAR. A reaction from Mears and Sherwood on this point is invited.

    2. TMT vs TTT
    Another issue is that Christy used TMT while Mears used TTT (Temperature of the Tropical Troposphere) where TTT is defined as 1.1*TMT – 0.1*TLS (Temperature Lower Stratosphere). Christy thinks TTT needlessly increases the uncertainties by using a signal from the lower stratosphere:

    Mears focuses on “observations” of a quantity known as the “Temperature Tropical Troposphere” TTT which is not actually observed, but is rather a derived quantity dependent upon the difference of two measurements (temperature of the mid-troposphere (TMT) and lower stratosphere (TLS)). The differencing process tends to increase the error opportunities for the derived product. Observations of the lower stratosphere (TLS) have greater uncertainty, and this contributes to the spread of the TTT results among the various datasets.

    A more direct method that reduces observational error (especially of the more uncertain stratospheric portion) is simply to examine the temperature product of one channel which captures the bulk of the desired signal (mid-to-upper troposphere temperature) and which avoids the compounding of errors that a “difference” of products introduces. (The stratosphere only contributes about 7% to the TMT tropical signal.) Hence in my posting, the comparisons focused on the TMT product about which much more is known.
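    The combination under discussion is a fixed linear blend of the two measured channels, TTT = 1.1*TMT − 0.1*TLS. A minimal sketch of the arithmetic (the anomaly values below are invented placeholders, purely illustrative):

```python
def ttt(tmt, tls):
    """Fu and Johanson (2005) combination: TTT = 1.1*TMT - 0.1*TLS.

    A cooling stratosphere (negative TLS anomaly) adds to TTT, which is
    why TTT trends come out warmer than TMT trends.
    """
    return 1.1 * tmt - 0.1 * tls

# Invented monthly anomalies (K), illustrative only
print(round(ttt(0.20, -0.50), 2))  # 0.27
```

    Because the stratospheric term enters with a small negative weight, any TLS measurement error propagates into TTT; that weighting is the basis of Christy's concern about compounded errors.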

    3. Trends
    Based on the six datasets he used, Christy concludes that the TMT trend since 1979 is “slightly less than +0.06 °C/decade which is a value insignificantly different from zero. The mean TMT model trend is +0.26 °C/decade which is significantly positive in a statistical sense.” He didn’t mention error bars.
    Mears and Sherwood didn’t give such a number, perhaps because they believe such a number is meaningless given the uncertainties in the data.
    It would of course be informative to have their best estimates or guesses as well, including uncertainty intervals. Could all three of you give these (trends and error bars)?

    In summary, we want to clarify why Christy feels sure that observations and models are inconsistent with each other, while Sherwood and Mears say the data are too uncertain to draw this conclusion.
    Let’s start with the three issues described here and see how far we get in clearing this up.

    Marcel

    • Carl Mears

      I’ll address the question of which datasets to consider in this post, and get to the other questions in later posts.

      I think it is dangerous to eliminate specific datasets from consideration based on a limited set of criteria. Most of the temperature trend community has agreed that it is best to show all the datasets, so that the analyst can use the spread between datasets to assess how well temperature changes are understood. If we start to throw out datasets as soon as we detect a small flaw, we may be left with datasets with larger, but so far undetected, flaws. Several groups, including ours, have moved to producing a large number of possible datasets to further flesh out the range of reasonable datasets.

      John Christy likes to use arguments based on short term trend differences and jumps to throw out datasets — usually those with warmer than average trends. We showed in Mears et al, 2012 that this method is strongly dependent on the comparison datasets used, and that the entire time series should be assessed before drawing conclusions, as opposed to only analyzing one or more segments that are under suspicion. Note that in this paper, we found that when the entire time series was investigated, the STAR 2.0 dataset had short term trends that were closest to those in the various adjusted radiosonde datasets.

      In Christy’s last post, he made an argument for excluding STAR V2.0 based on a small positive jump in temperature in 2001, and on the claim that it is the same as RSS, since it uses the same diurnal correction. First, the jump in 2001 is fairly small, and does not change the 34 year trend very much when removed in V3.0 (the global trend in STAR V3.0 TMT will be about 0.015 K/decade lower, but still warmer than RSS). Second, the STAR analysis uses a completely different calibration scheme based on simultaneous nadir overpasses. In the STAR scheme, the satellite calibration is not polluted by errors in the diurnal correction, because it occurs before the diurnal correction in processing. So I really think STAR 2.0 is an independent dataset, and cannot be excluded based on dependence on RSS. Also note that STAR 3.0 TMT will no longer use the RSS diurnal correction, but still shows more warming than RSS TMT.

      I do tend to leave out RATPAC. This is because the individual station data are adjusted before 2005, and then not adjusted after 2005. We have shown that when comparing global radiosonde averages to satellites, it is critical to subset the satellite data to the radiosonde locations (Mears et al., 2011). (This is not so important for tropics-only averages, because of the smoothness of temperatures in the tropics.)

      I tend to de-emphasize reanalysis output, because I think reanalysis is even less ready than the satellite data for use in global temperature trend assessment. In general, the reanalysis projects ingest uncorrected satellite data, and hope that their analysis system can make the needed adjustments. This has certainly not been proven to be the case, and there are many examples of it not working out, e.g. problems with vapor and clouds in the MERRA reanalysis caused by the advent of AMSU brightness temperatures.

      None of this changes my overall conclusions:

      1. The presence (or not) of the tropospheric hotspot depends on which pair of datasets you use. Thus the result is not statistically significant in the grossest sense.

      2. Measured trends in the tropical troposphere are less than all of the modeled trends (or almost all in the case of STAR 2.0). This is an important, statistically significant, and substantial difference that needs to be understood. I addressed this in my last post.

      References:

      Mears, Carl A., Frank J. Wentz and Peter W. Thorne (2012). Assessing the value of Microwave Sounding Unit–radiosonde comparisons in ascertaining errors in climate data records of tropospheric temperatures. Journal of Geophysical Research: Atmospheres, 117(D19).

      Mears, Carl A., Frank J. Wentz, Peter Thorne and Dan Bernie (2011). Assessing uncertainty in estimates of atmospheric temperature changes from MSU and AMSU using a Monte-Carlo estimation technique. Journal of Geophysical Research: Atmospheres, 116(D8).

    • Carl Mears

      Why use TTT.

      In this post, I explain why I use TTT, especially when comparing to radiosondes.

      Here is a figure showing the TMT, TTT, and TLS weighting functions over the ocean.
      Over land, it is a little different for TMT and TTT, but these differences do not affect my argument.

      MSU/AMSU weighting functions

      First, let’s consider TMT, shown in blue. This weighting function peaks in the mid to lower troposphere, but still has considerable weight above 17 km, which is about where the stratosphere starts in the deep tropics. The problem is that the stratosphere is cooling more rapidly than the troposphere is warming, and this cooling tends to cancel some of the tropospheric warming, making the signal harder to see. There is another MSU/AMSU channel, TLS (red curve), which is sensitive to the upper troposphere and lower stratosphere in the tropics (not just the lower stratosphere, as is sometimes assumed from its name). Fu and Johanson proposed a combination of these two channels that removes much of the weight above 17 km from the TMT weighting function. Fu calls this TTT, for Temperature Tropical Troposphere. The TTT weighting function is shown in purple.

      First let me address the accusations of increased uncertainty. Let me assume that the TMT product has a trend uncertainty of 0.038 K/decade (from Mears et al, 2011), and that TLS has a trend uncertainty of 0.060 K/decade (this is twice what we found in Mears et al 2011). If we assume these two errors are independent, and weight them by the 1.1 and 0.1 coefficients in the TTT combination, we find a resulting error of 0.0422 K/decade.
      Now, since a lot of the error comes from possible problems with the diurnal adjustment, the errors may not be independent, so this could be worse. But even if we assume the worst case scenario (perfectly anti-correlated errors), we only get an uncertainty of 0.0478 K/decade.

      So our procedure has increased the uncertainty, by a factor (in the worst case) of about 1.25. That must be bad, right? No, because we have also increased the signal we are trying to see. By calculating TTT, we have increased the signal in both the RSS and STAR data by a factor of 1.35. For UAH, the factor is even larger, about 1.8. So, by using TTT instead of TMT, we have increased the signal-to-noise ratio in all three satellite cases. This would cease to be true only if the uncertainty in TLS were HUGE compared to that in TMT.

      Trends 1979-2010 (K/decade)

      Dataset TMT TTT Ratio
      UAH 0.050 0.091 1.82
      RSS 0.117 0.158 1.35
      STAR 0.144 0.194 1.35
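      Mears’s noise and signal figures can be checked in a few lines; a sketch using only the numbers quoted here (it assumes the quoted uncertainties are weighted by the 1.1 and 0.1 TTT coefficients, which reproduces the 0.0422 and 0.0478 values):

```python
import math

sigma_tmt = 0.038  # TMT trend uncertainty, K/decade (Mears et al. 2011)
sigma_tls = 0.060  # assumed TLS trend uncertainty, twice the 2011 estimate
a, b = 1.1, 0.1    # weights in TTT = 1.1*TMT - 0.1*TLS

# Independent errors: weighted terms add in quadrature
sigma_indep = math.sqrt((a * sigma_tmt) ** 2 + (b * sigma_tls) ** 2)
print(round(sigma_indep, 4))  # 0.0422

# Worst case (perfectly anti-correlated): weighted terms add linearly
sigma_worst = a * sigma_tmt + b * sigma_tls
print(round(sigma_worst, 4))  # 0.0478

# Signal amplification: TTT trend / TMT trend, from the table above
for name, (tmt, ttt) in {"UAH": (0.050, 0.091),
                         "RSS": (0.117, 0.158),
                         "STAR": (0.144, 0.194)}.items():
    print(name, round(ttt / tmt, 2))
```

      The worst-case noise grows by a factor of about 1.25 (0.0478/0.038) while the signal grows by 1.35 to 1.82, which is the signal-to-noise argument in the text.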

      This is not the end of the story. TTT has even more benefits when we are considering radiosonde data. It is fairly well established that the problems with the radiosonde increase at higher altitude, with most indications being that even the homogenized records show spurious cooling at high altitude. The exact level where this problem sets in is not known. By using TTT instead of TMT, the weights for the radiosonde levels above 100 hPa are very much reduced, reducing the contribution to the error from these troublesome levels.

      References:

      Mears, Carl A., Frank J. Wentz, Peter Thorne and Dan Bernie (2011). Assessing uncertainty in estimates of atmospheric temperature changes from MSU and AMSU using a Monte-Carlo estimation technique. Journal of Geophysical Research: Atmospheres, 116(D8).

      Fu, Q., and C. M. Johanson (2005). Satellite-derived vertical dependence of tropospheric temperature trends. Geophysical Research Letters, 32, L10703, doi:10.1029/2004GL022266.

  • Steven Sherwood

    John Christy reasons that if the means of two different subsets of the data are roughly the same, then we know the answer, even if there is large scatter within each subset. That is very interesting: according to that reasoning there is no longer any doubt about equilibrium climate sensitivity, because the average of the models and of various estimates based on past data are each around 3 °C (e.g., IPCC 2007).

    As for his reasons for rejecting various datasets, they seem like subjective, a posteriori rationalisations. Every dataset shows rapid changes somewhere or other which look like they could be artificial, or has some design limitation. Is there any peer-reviewed paper using objective criteria to show that the datasets John rejects are truly worse than the others? My 2008 paper showed that the warming trends from the UAH version of MSU TMT at the time were significantly smaller than those from radiosonde data, in a fairly consistent manner across different parts of the globe, while the other two analyses available at the time were consistent with the sondes (please compare the comprehensive global approach in that paper with the pick-and-choose methods one sometimes sees). This was never refuted or acknowledged by John, who continues to maintain that his products are the ones to believe.

  • John Christy

    As Mears indicates, he, along with Sherwood, are “mystified” that so much attention is drawn to the tropical hot spot, or lack thereof, because they suspect it is not germane to the global warming issue. Others are even more than “mystified” and seek to completely shut off debate which reminds me of the line “… move along now, nothing to see here” used in several movies including Men In Black and The Naked Gun where secretive and embarrassed authorities try to divert a curious public from observing an obvious disaster caused by said authorities. Seriously, in my opinion, there IS something critically important to see here.

    The “hot spot”, as I stated earlier, represents an integration of much of our understanding of the energy cycle of the climate system. It is the energy cycle that must be well-characterized before attempting to forecast the climate response to a very slight increase in total energy forcing due to the enhanced greenhouse effect. The tropical atmosphere represents about 30% of the global atmospheric mass, plays a significant role in the planetary hydrologic cycle, and is the entry point for about half of the Earth’s solar energy. If the processes that combine to create the observed tropical structure, variations and change are not understood and replicated well, then we cannot claim we know enough about the system to make confident predictions. Thus, I agree with the instigators of this blogpost, by saying “…DON’T move along now, because there IS something to see here.”

    I will address issues as I came across them while reading the posts, so this may appear as a set of disjointed paragraphs. But without reading the somewhat boring details of my following comments: in summary, I don’t believe there is disagreement regarding the basic statement that tropical tropospheric trends of observations and models are significantly different. This means we have some serious questions to explore regarding the energy cycle that have not been well-characterized by the climate modeling establishment to date.

    Why RCP8.5?
    My use of a particular CMIP-5 scenario (here RCP8.5) is irrelevant when dealing with the observational period (a question from Sherwood). All of these RCP scenarios used the same forcing to 2006, then continued on until a prescribed forcing was achieved for each scenario. The differing scenarios don’t lead to different results until after 2030 when the lowest level is approached and that scenario’s forcing is fixed at that level. Thus the use of any of the forcing scenarios is fine as long as one deals with pre-2030 output.

    Dataset numbers
    The idea that the average of two completely different sets of measurements gives the same result is quite helpful in my view (a question from Sherwood). [As an aside, Sherwood describes the average of the balloons and satellites as “roughly” the same when in fact their average is different by only 0.01 C/decade, so “roughly” is not an accurate characterization.]

    To be more specific about the numbers, we can say the trend of the radiosonde-average tropical TMT annual anomalies is +0.047 ± 0.035 °C/decade where the error is large enough to incorporate all four of the realizations. This trend value is smaller than assumed in my original posting because I had erroneously inserted the lower troposphere (TLT) rather than TMT in one of the datasets. It is curious to me that the latest version of RICH displays the warmest trend of the balloon datasets whereas in earlier versions, RAOBCORE was more positive than RICH. I suspect there may be an unwanted artifact that arises from the use of the ECMWF first-guess in the adjustment process. There is an obvious spurious warm trend in RAOBCORE at the 100 hPa pressure level [see light green circle in my original Fig. 2 at 100 hPa where the trend warms dramatically from 150 hPa to 100 hPa when all other evidence indicates the trend should be less at 100 hPa than 150 hPa], but as this has a small impact on TMT (not so TUT retrieval) we shall not address it.

    Averaging the satellites (UAH and RSS) we have +0.059 ±0.031 °C/decade where again the error range encompasses both datasets. Recently I have begun to use the average of RSS and UAH as a useful product in climate discussions as this reduces independent errors that are present in each of our datasets (e.g. McKitrick et al. 2010). Of the six datasets used here, RSS displays the warmest trend and RATPAC the coolest (see Christy et al. 2010 and 2011 for quantitative discussions on dataset differences – information which demonstrates that my decisions regarding dataset selection were based on evidence and were not “subjective, a posteriori rationalizations.”). Taking all datasets together, this provides an average trend between +0.05 and +0.06 °C/decade.
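    One plausible reading of Christy’s “error large enough to incorporate all realizations” is the half-range of the individual dataset trends about their mean. A sketch of that reading (the per-dataset trends below are invented, chosen only to reproduce the quoted summary of +0.047 ± 0.035; Christy’s actual per-dataset values are not listed in the text):

```python
# Hypothetical radiosonde TMT trends (K/decade); illustrative only,
# consistent with RICH warmest and RATPAC coolest as stated in the text
sonde_trends = {"RATPAC": 0.012, "HadAT2": 0.039, "RAOBCORE": 0.055, "RICH": 0.082}

vals = list(sonde_trends.values())
center = sum(vals) / len(vals)            # mean of the realizations
half_range = (max(vals) - min(vals)) / 2  # covers every realization
print(round(center, 3), "+/-", round(half_range, 3))  # 0.047 +/- 0.035
```

    Note that a spread defined this way is not a statistical confidence interval; it only describes where the existing realizations lie.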

    [The appeal to the climate sensitivity analogy regarding dataset differences is actually very interesting (Sherwood question), as recent estimates of its value have fallen well below the IPCC AR4 estimate, with no fewer than 10 recent papers indicating central estimates between 1.0 and 2.1 °C. This is evidence that the calculation of climate sensitivity has considerable uncertainty and that IPCC estimates are likely too high for a number of reasons. By comparison, the observational estimates of TMT shown here are highly consistent.]

    Dataset Selection
    The comment about selecting or deselecting datasets is a minor issue. First, all four of the updated radiosonde datasets were indeed used, so there is no issue there. Even though there is some dependence between RAOBCORE and RICH, as they are produced by the same investigator, their individual construction processes appear to be sufficiently different that I utilized them as independent realizations.

    Secondly, due to obvious issues with the inability to seamlessly incorporate new observing systems into the assimilation process, the Reanalyses (JRA, ERA-I, MERRA) were not used at all (see Sakamoto and Christy 2009 and Christy et al. 2011 – again, the reasons for not using Reanalyses are not “subjective” but based on published information). Additionally, a quick examination of The State of the Climate 2012 (BAMS 2013) indicates the three Reanalyses have much greater spread than the simple observational datasets. Here, regarding Reanalyses, I agree with Mears.

    Thirdly, the only observational satellite dataset not utilized was STAR TMT for the reasons stated, i.e. (a) it has a glitch in 2001 that the authors recognize and (b) it uses the same diurnal corrections as RSS, and thus is not an independent realization of satellite temperatures. I included RSS even though it has been demonstrated that the dataset contains a warming shift in the 1990s in the tropics relative to all other datasets (including surface datasets) that suggests errors in the diurnal correction (Christy et al. 2010). Thus, RSS, with a likely spurious warming due to diurnal overcorrections, is accepted, but to compound this by adding STAR (with the same diurnal correction issue) will lead to double counting a documented problem. The differences in the additional adjustments of bias and calibration are small by comparison.

    However, if it settles anything, adding STAR into the mix does not change the ultimate conclusion at which I had arrived. Since the STAR shift is known to its authors, I calculated the value relative to RSS (since both STAR and RSS use the same diurnal correction and the AMSUs were in use in 2001, this focuses on the shift apart from other adjustments) as +0.056 °C. Subtracting this shift at 1 Jan 2001 now produces a time series almost identical to RSS, with a difference in trend of only +0.004 °C/decade. So, using the three datasets as independent realizations (a poor assumption as noted) we have a satellite result of +0.071 °C/decade.
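    The kind of adjustment Christy describes, subtracting a constant shift from a series after a known break date and recomputing the trend, can be sketched as follows. The series here is synthetic (a 0.1 K/decade trend plus noise, with an artificial +0.056 K step inserted at 2001); the numbers are illustrative only:

```python
import numpy as np

def remove_step(t, y, t_break, shift):
    """Subtract a constant shift from all values at or after the break time."""
    y_adj = y.copy()
    y_adj[t >= t_break] -= shift
    return y_adj

def trend_per_decade(t, y):
    """OLS slope, scaled to K/decade when t is in years."""
    return np.polyfit(t, y, 1)[0] * 10.0

# Synthetic annual anomalies, 1979-2012, with an artificial step at 2001
rng = np.random.default_rng(0)
t = np.arange(1979, 2013, dtype=float)
y = 0.01 * (t - 1979) + 0.056 * (t >= 2001) + rng.normal(0.0, 0.02, t.size)

print(trend_per_decade(t, y))                               # inflated by the step
print(trend_per_decade(t, remove_step(t, y, 2001, 0.056)))  # near the true 0.10
```

    The step sits in the later part of the record, so it projects onto the fitted slope; removing it brings the trend back toward the underlying value.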

    Mears states the new STAR product (v3.0) differs from RSS in the tropics by an unspecified amount, but from Mears’s numbers it would appear to be about +0.010 to 0.015 °C/decade warmer than RSS. We have no information about how STAR has applied adjustments, so no independent assessment, which might reveal problems, has yet been performed. In any case, the calculation done in the previous paragraph with an estimate for STAR v3.0 doesn’t change any of the basic numbers and results. Thus, even if we throw out the radiosonde datasets, a mean satellite trend of +0.07 °C/decade is still highly significantly different from the +0.26 °C/decade found in the models. I don’t see any other conclusion that can be justified, and I note that the other authors essentially agree with this finding.

    The use of TTT
    The issue of TTT vs. TMT is discussed by Mears who prefers TTT. One objection he expresses is the “considerable weight” that the stratosphere exerts on TMT. Actually, that weight is only 7 percent, hardly “considerable” in my opinion. Now, if TTT were a directly-measured quantity, I would agree that it produces a purer tropical tropospheric signal than TMT. However, it is not directly measured and contains error that is larger than TMT alone. Too, one wonders why models should be exempt from getting the tiny portion of the stratosphere that resides in TMT correct to begin with?

    With regard to TTT, I calculated the quantity for all 73 CMIP-5 models with the mean of the tropical TTT 1979-2012 trends of +0.32 °C/decade. The mean of the models’ TMT trends was +0.27 °C/decade (this is the mean of the trends whereas the trend of the annual means is +0.26 °C/decade). Throughout the individual comparisons, a consistent result, and an artifact of the retrieval scheme, was that for models, TTT was warmer than TMT by an amount that was inconsistent with the TMT trend. Indeed for many models with a wide range of TMT trends of +0.13 to +0.45, the addition to TMT to calculate TTT was between +0.04 and +0.05 °C/decade. Note too that the trends of TTT from the satellite datasets provided by Mears are also about 0.04 to 0.05°C/decade warmer than TMT.
    Therefore, from the models, the average ratio of TTT to TMT was 1.18, meaning very little new information is provided by the retrieval as far as the models go. The satellite ratios are much larger (1.35 to 1.82) than the models simply because the denominators (i.e. observed TMT trends) are so much smaller.
    So, whatever one thinks about TTT, and I think the errors are larger than estimated by Mears, one still must come to the conclusion that the trend of TMT is a problem for models to replicate. This seems to be accepted by Mears and Sherwood too.

    Readers Comments
    Reader GavinCawley: The results of Douglass et al. 2007, which are actually less remarkable than shown in my initial post here, still stand. The confusion created by reading later papers and blogs is the misunderstanding of the question that was addressed. We asked the question this way: “If climate models had the same surface temperature trend as the real world, would their upper air temperature trends agree with the real world?” This was clearly stated in the paper. This condition of a required agreement in surface trends allowed us to directly compare the models and observations, and we found that the climate models, on average, were significantly different from observations. We discussed problems with the criticisms of our 2007 paper in Douglass and Christy 2013, such as the improper use of datasets known to be obsolete and the comparison of upper air trends even though surface trends between models and observations were not consistent (i.e. apples to oranges).

    Reader Phi: The 1000 hPa temperature is taken from the NCDC surface temperature dataset. We did not have values at 950 hPa.

    Reader Paul Matthews: Agreed. There are other papers too that are now pointing out the obvious. I’m wondering how the IPCC AR5 will discuss the issue as the earlier drafts were less than forthcoming.

    • Carl Mears

      I think that John and I will continue to disagree about TTT.

      First, about John’s point about TTT not being “directly measured”. Using this argument, many satellite retrievals would not be acceptable, including, to cite an RSS example, all the wind speed and total column water vapor retrievals from microwave imaging instruments, which are derived from measured radiances. But wait a minute, even the MSU/AMSU measurements are derived from radiances! Which are derived from small currents crossing the PN junction in a detector diode. So none are directly measured! In my view, if you don’t believe in the ideas behind calculating TTT, then you don’t believe in the possibility of atmospheric sounding with microwaves. John’s own TLT product uses a similar “combination of weighted measurements” approach, with the added complication that the measurements are made at different locations or times.

      I like TTT because it separates the trends in the troposphere and stratosphere more than TMT does. The trends in these regions have somewhat different drivers (greenhouse gases vs. ozone), different measurement problems when talking about radiosondes (radiosondes are probably more screwed up in the stratosphere than in the troposphere), and different forcing problems in the models (ozone vs. well-mixed greenhouse gases).

      But, that being said, let’s leave it at John’s statement:

      “So, whatever one thinks about TTT, … one still must come to the conclusion that the trend of TMT is a problem for models to replicate. This seems to be accepted by Mears and Sherwood too.”
      (ellipsis mine)

      None of my conclusions would be altered by using TMT instead of TTT.

      To summarize my conclusions:
      1. The observations are probably not good enough to prove or disprove the presence of the hot spot. This is in part due to the added noise that one gets when calculating the ratio of two small, relatively similar, uncertain numbers.
      2. Models are showing much more tropical tropospheric warming than observations.
      3. I don’t think errors in the datasets are large enough to account for this discrepancy.
      4. There are a lot of possible reasons for this, some having to do with inputs to the models, and some having to do with model physics.
      5. I doubt these problems are such that throwing out the idea of anthropogenic global warming is warranted.

  • Marcel Crok

    Ok, let’s try to summarise again what the discussants said about trends. Let’s start with the satellite trends. A few things are still unclear.

    Mears in his latest comment said the following about trends:

    Dataset TMT TTT Ratio
    UAH 0.050 0.091 1.82
    RSS 0.117 0.158 1.35
    STAR 0.144 0.194 1.35

    Christy in his reaction also mentioned some trend numbers.

    “Averaging the satellites (UAH and RSS) we have +0.059 ±0.031 °C/decade”
    Now if I take the Mears numbers ((0.050 + 0.117)/2), this gives an average trend of 0.0835, which is higher than the 0.059 that Christy mentioned. What is the reason for this discrepancy?

    Also, the 0.117 trend falls outside the “error range” of 0.059 ± 0.031. So while Christy said that “again the error range encompasses both datasets”, this doesn’t seem to hold for the numbers that Mears gave.
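    For clarity, here is the arithmetic behind this question, using only the numbers quoted in this thread:

```python
# Mears's quoted TMT trends, K/decade, 1979-2010
mears = {"UAH": 0.050, "RSS": 0.117}
mears_avg = sum(mears.values()) / len(mears)
print(round(mears_avg, 4))  # 0.0835

# Christy's quoted satellite average and range
christy_avg, christy_err = 0.059, 0.031
lo, hi = christy_avg - christy_err, christy_avg + christy_err
print(round(lo, 3), round(hi, 3))  # 0.028 0.09

# Mears's RSS trend lies above the top of Christy's quoted range
print(mears["RSS"] > hi)  # True
```

    Since the averaging itself is trivial, the discrepancy presumably comes from the inputs, e.g. different dataset versions or trend periods.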

    Sherwood so far didn’t give trend estimates. Do you accept those of Mears?

  • Marcel Crok

    In the public comments Ross McKitrick is commenting extensively about the trends. He has two papers about this topic and I therefore add his trend estimates to the list:
    “The UAH trend (0.040 C/decade) is insignificant, the RSS trend (0.111 C/decade) is significantly different from zero at 5%”

    So according to McKitrick and Vogelsang the UAH trend is insignificant and the RSS trend is significant. This relates to our question 4: What could explain the relatively large difference in tropical trends between the UAH and the RSS dataset?

    McKitrick added this statement that is relevant here: “(c) UAH and RSS are significantly different from each other at 5%.”

    Do the discussants agree with this conclusion that differences between UAH and RSS are significant?

    What could be the reason?

    Christy wrote in his latest comment:
    I included RSS even though it has been demonstrated that the dataset contains a warming shift in the 1990s in the tropics relative to all other datasets (including surface datasets) that suggests errors in the diurnal correction (Christy et al. 2010).

    Mears in his comment didn’t address the difference between UAH and RSS directly, but his comment about STAR is relevant:
    “In Christy’s last post, he made an argument for excluding STAR V2.0 based on a small positive jump in temperature in 2001, and that it is the same as RSS, since it uses the same diurnal correction. First, the jump in 2001 is fairly small, and does not change the 34 year trend very much when removed in V3.0 (the global trend in STAR V3.0 TMT will be about 0.015 K/decade lower — but still warmer than RSS).”

    So Mears suggests the small jump in 2001 cannot explain the differences between UAH and RSS.
    For outsiders these differences are intriguing as on a global scale the trends from UAH and RSS are very close to one another.

    What are the possible reasons for the difference between UAH and RSS?

    Marcel

  • Carl Mears

    In his last post, Marcel asked:
    Do the discussants agree with this conclusion that differences between UAH and RSS are significant?

    What could be the reason?

    The significance question depends on the uncertainty estimates in both the RSS and UAH data. The uncertainty in UAH has not been documented well enough for me to feel comfortable doing analysis with it. Using the uncertainty analysis that we have done using a Monte-Carlo approach, I can say that in general for TLT, UAH is within our 95% confidence interval, while for TMT, it is not. This suggests that for TMT, the RSS/UAH differences may be significant.
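    Mears’s “within the 95% confidence interval” criterion amounts to a simple comparison. A hypothetical sketch (the CI half-width below is an assumed placeholder for illustration only, not the actual RSS Monte-Carlo value, which is documented in Mears et al. 2011):

    ```python
    # Tropical TMT trends in °C/decade, as listed earlier in the thread
    rss_trend = 0.117
    uah_trend = 0.050

    # Assumed 95% CI half-width on the RSS trend; illustrative only
    ci_half_width = 0.050

    # UAH is "within" the RSS interval if the trend difference is smaller
    inside = abs(uah_trend - rss_trend) <= ci_half_width
    print(inside)  # False with these assumed numbers
    ```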

    We tried to explore the reasons for these differences several years ago. The two main possibilities are differences in the non-linearity correction (the “target factor” — this accounts for errors due to changes in temperature in the hot calibration target) and differences in the diurnal adjustment applied to account for changes in measurement time. The target factor differences are largest for NOAA-09, where UAH uses a very large value for alpha. This is discussed in Po-Chedley and Fu, 2012. The diurnal adjustment is most important for NOAA-11, due to its long life and large measurement time drift. In 2005, UAH made changes to their diurnal adjustment to TLT that brought their results into closer agreement with RSS, but I am not sure if any changes were made to the UAH TMT adjustment. Around that time, there was a paper circulating describing the new UAH diurnal adjustment, but to my knowledge it is not published (John, is this true?). At any rate, the geographic distribution of RSS/UAH TMT differences points to the diurnal cycle as the most likely culprit. The best agreement is in the Southern Hemisphere Extratropics, where the diurnal adjustment is small due to the prevalence of ocean. The largest disagreement is in the tropics, where the diurnal cycle tends to be large. There is also a ramp in the difference time series during the NOAA-11 lifetime. If the main culprit were the target factors, the differences would be more similar in the different regions.

    As pointed out by Paul S, there is a tool on our website that allows you to look at trend differences between RSS, UAH, and STAR, as well as the radiosonde datasets. The error bars on the RSS data are from our Monte-Carlo analysis, and include estimates of error in the diurnal adjustment, as well as the subsequent effects of these errors on the intersatellite merging process. The error analysis is documented in our 2011 paper.

    Paul S also said, in response to Marcel

    Marcel: For outsiders these differences are intriguing as on a global scale the trends from UAH and RSS are very close to one another.

    Paul S: That’s pretty much true for TLT but there is a reasonable discrepancy for global TMT: 0.078 against 0.046 °C/decade. That is intriguing though, given that TLT is produced from the same base data as TMT. It suggests the similarity in global TLT is largely due to compensating errors rather than agreement.

    I (Carl) agree with this last sentence.

    Po-Chedley, S. and Q. Fu, 2012: A bias in the midtropospheric channel warm target factor on the NOAA-9 Microwave Sounding Unit. J. Atmos. Oceanic. Technol., 29, 646-652.

    Mears, C. A., F. J. Wentz, P. Thorne and D. Bernie, (2011) Assessing Uncertainty in Estimates of Atmospheric Temperature Changes From MSU and AMSU Using a Monte-Carlo Estimation Technique, J. Geophys. Res., 116, D08112, doi:10.1029/2010JD014954.

  • John Christy

    Marcel:

    I will be very busy in the next few days so I will respond as best I can now:

    From Mears (stamp 11 Sep 6:18 p.m.): As Carl notes, “I think John and I will continue to disagree about TTT.” This is a statement with which I agree, and I do so mainly because TTT has greater error than TMT, not because of the physical profile it attempts to produce.

    From Mears (stamp 11 Sep 5:50 p.m.): In a statistical sense one could say TMT for RSS and UAH are significantly different from each other in the tropics. However, one could not say the same regarding the MEAN of UAH and RSS, i.e. both datasets are within error ranges of their common mean value. I often use the mean of RSS and UAH now because whatever spurious warming/cooling there might be in our separate constructions it will likely be minimized in the average. I think Carl’s discussion on the cause is correct – the divergence in the mid-1990s is the key and likely relates to the diurnal adjustments.

    When Wentz and Mears discovered the UAH diurnal adjustment error in TLT back in 2005, we corrected the problem with a new diurnal calculation. However, that new calculation had virtually no impact on TMT’s diurnal adjustment since it was not subject to the erroneous cross-swath-subtraction artifact as was TLT. This diurnal correction was based on 3 co-orbiting AMSU instruments. [For those unfamiliar with this adjustment, UAH uses an empirical technique drawn from (admittedly noisy) observations and RSS relies on a climate model simulation. Neither will be perfect, hence an average of the two products I think is the best way to deal with the differences, since the average is within error ranges.] It should be obvious that an accurate representation of the diurnal drift error is not an easy problem to solve.

    Ultimately, the divergence of model projections and observations tells us we have much to learn regarding the climate system, and satellite observing systems are critical to improving our knowledge. With so much more to learn, and the apparent relative insensitivity of the climate system to CO2 forcing as demonstrated by very modest temperature trends, I believe we are in a situation to question the presumed outcomes of specific carbon-control proposals which will also have tremendous economic impacts. These outcomes are based on model projections which to this point have low credibility in my view.

    Readers Comments
    Paul S.: Starting the comparison with 850 hPa misses the key point of the relationship of the surface to the troposphere. This “within troposphere” lapse rate, say 850-200 hPa, is useful in its own right, but different from the more fundamental issue of the relationship between the surface and the troposphere that involves the climate system. Recall that the vast majority of heat flux to the atmosphere occurs at the surface, not at 850 hPa. How one partitions that heat (i.e. fluxed back to the air as sensible and latent heat vs. being diffused down into the ocean) is critical to the question posed here. So, by examining the surface-troposphere relationship one delves into these tremendously important parts of the climate system.

    Paul S.: The validation comparison of RSS in the “Tropics – 30S-30N” uses a very large number of questionable radiosondes outside of the 20S-20N band. [See Christy et al. 2007 and 2010 for comparisons in the band we are discussing. The updated sonde comparisons in Christy et al. 2011 give a slight tip to UAH.] I also notice that the website’s satellite comparisons with RAOBCORE and RICH may be utilizing their older versions. I note too that though UAH lies outside of RSS for TMT in 30S-30N, UAH agrees almost exactly with HadAT2 and IUK. I agree with Carl that the disagreement would be considerably reduced if we could deal with the NOAA-11 (and some NOAA-14) differences. [Note that UAH cuts off NOAA-11 in Aug 1994 when its instrument temperature started swinging wildly, whereas I believe RSS uses NOAA-11 a little further out.]

  • Marcel Crok

    As we are still discussing datasets, I wanted to raise a point that Sherwood made in his guest blog and that so far hasn’t been discussed.

    Sherwood:
    “I used to think (as do most others) that the radiosondes were wrong, but in Sherwood et al. 2008 we found (to my surprise) that when we homogenised the global radiosonde data they began to show cooling in the lower stratosphere that was very similar to that of MSU Channel 4 at each latitude, except for a large offset that varied smoothly with latitude. Such a smoothly varying and relatively uniform offset is very different from what we’d expect from radiosonde trend biases (which tend to vary a lot from one station to the next) but is consistent with an uncorrected calibration error in MSU Channel 4. If that were indeed responsible, it would imply that there has been more cooling in the stratosphere than anyone has reckoned on, and that the true upper-tropospheric warming is therefore stronger than what any group now infers from MSU data. By the way, our tropospheric data also came out very close to those published at the time by RSS, both in global mean and in the latitudinal variation (Sherwood et al., 2008).”

    I wonder what Mears and Christy have to say on this.

  • Marcel Crok

    In one of his latest comments Mears summarizes his findings (bold mine):

    To summarize my conclusions:
    1. The observations are probably not good enough to prove or disprove the presence of the hot spot. This is in part due to the added noise that one gets when calculating the ratio of two small, relatively similar, uncertain numbers.
    2. Models are showing much more tropical tropospheric warming than observations.
    3. I don’t think errors in the datasets are large enough to account for this discrepancy.
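    Mears’s first point, that the ratio of two small uncertain numbers is noisy, is easy to illustrate numerically. A Monte-Carlo sketch with made-up numbers (a true amplification of 1.4 and an assumed 0.04 °C/decade noise on each trend; these values are illustrative, not the actual dataset uncertainties):

    ```python
    import random

    random.seed(0)

    # Hypothetical "true" trends (°C/decade) and measurement noise
    TRUE_SURFACE, TRUE_TROPO, SIGMA = 0.10, 0.14, 0.04

    ratios = []
    for _ in range(100_000):
        surface = random.gauss(TRUE_SURFACE, SIGMA)
        tropo = random.gauss(TRUE_TROPO, SIGMA)
        if abs(surface) > 1e-6:          # avoid division by ~zero
            ratios.append(tropo / surface)

    ratios.sort()
    p05 = ratios[int(0.05 * len(ratios))]
    p95 = ratios[int(0.95 * len(ratios))]
    # The 5-95% spread of the estimated ratio is far wider than the
    # uncertainty on either trend alone
    print(round(p05, 2), round(p95, 2))
    ```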

    Mears backed up his first conclusion with the figures 1 and 3 in his guest blog which I will repeat here:

    Mears figure 1

    Mears figure 3

    Both these figures deal with the amplification of the warming compared to the surface.

    Christy’s comparable figure is this one:

    Christy figure 4

    Christy’s conclusion is that even if you look at the trend ratios the differences are significant:

    What this figure clearly indicates is that the second aspect of this discussion, namely the rising temperature trends with increasing altitude, is also overdone in the climate models. The differences of the means between observations and models are significant.

    Ross McKitrick, who also has a number of relevant publications on this issue tends to agree with Mears in this case. In a public comment he wrote:

    (b) Mears’ Figures 1 and 3 (+4) nonetheless show that the amplification rate in models is high relative to the distribution in the observations. Current data sets are too short to say whether the difference is statistically significant or not. I doubt they will ever be long enough. The statistical issues involved in figuring out the distributions of ratios of random numbers get complicated quickly, and I wouldn’t be surprised if the problem is intractable.

    It would be interesting to hear Christy’s reaction on this. Would he be willing to reconsider his position on this specific issue?

    Definition of the hot spot
    Having read most of the discussion again I realised there is still some confusion left about the definition of the hot spot. In their contributions Sherwood and Mears clearly referred to the hot spot as “the amplification of the surface warming in the troposphere”.
    In our introductory article we were less clear about it, I would say in hindsight. About our figure 1 (repeated below) we wrote: “The expected warming is highest in the tropical troposphere, dubbed the tropical hot spot.”

    Intro figure 1

    Here the “hot spot” doesn’t refer to the amplification alone but also to the fact that models expect lots of warming aloft. Ross McKitrick also made a remark about this (my bold):

    Regarding the uniqueness of the tropical “hotspot”, the uniqueness arises from the magnitude of the trend, not the amplification with respect to the surface. While it is true that amplification would be observed in response also to increased solar forcing, it’s clear from comparing panels (a) (solar), (c) (GHG) and (f) (all) in the IPCC figure that only GHG’s are expected to have had a sufficiently strong effect to yield the level of warming projected overall. Were there to be a lack of warming, it would be most inconsistent with the GHG simulation.

    Do all the participants agree that the term hot spot is also used for the large absolute warming trend that is expected high in the tropics? Should we limit the term to the “amplification”? If so what other term could we use for the high warming rates aloft?

  • Steven Sherwood

    Marcel asks whether by “hot spot” one means the warming aloft, or the difference (or ratio) between warming aloft and at the surface. The problem here is that the “hot spot” concept was not created by scientists (as far as I know) but is a term coined by climate skeptic bloggers. If one looks at the problem from the point of view of climate physics it decomposes naturally into one on lapse rates (which are governed by atmospheric convective processes) and global surface temperature (which is controlled by top-of-atmosphere radiative balance and ocean heat uptake). For this reason the focus in the scientific literature (as opposed to the internet) has been on either lapse rates, or surface temperatures, and this is the focus I prefer. Obviously it is fair enough to ask whether warming in any particular location is consistent with models or not, if one’s only goal is to falsify models. But if one is trying to understand the system it is better to ask first what is happening at the surface, and then, given that, what is happening in the atmosphere.

  • John Christy

    This installment of my comments on the “hot spot” truly descends into what we in the USA call “the weeds” and deals only with some side issues. Some information on these side issues is provided below and I hope it will be informative to the reader who has considerable patience.

    Carl and I performed our error testing of the satellite in differing ways. Going back to Christy et al. 1995, we indeed tested the parametric ranges of the bias calculation by varying the number of overlapping observations utilized, testing the effect of the magnitude of the noise cut-off parameter, and testing the various options for the sequence of satellites employed to create the backbone. We also tested the reduction in daily noise based on the window-width of the time-filter by latitude. The total impact of the diurnal effect for MT was relatively small (it is larger for other layers.) Thus, we did perform several types of variational testing.

    But testing parametric uncertainties of a dataset construction process does not get to all of the potential error. If a diurnal adjustment, for example, is fundamentally flawed, the parametric variations around that flaw won’t lead to better understanding. So, the main effort for our error calculations was to employ a completely independent observational dataset for testing – that being radiosondes. Unfortunately, the majority of sonde data records are plagued with numerous, and often uncatalogued, changes through time. Our solution was to select a subset that geographically spanned the tropics to the high latitudes, which, by design, had the most consistent set of instrumentation and methods – the U.S. VIZ subset of 32 stations. This was done at the grid point level using the full radiation code, including humidity effects. Below is a table of the results from Christy et al. 2011 regarding the comparison of UAH, RSS and STAR against the VIZ radiosondes.

    Statistical properties of the difference time series between the adjusted VIZ sonde series and each satellite dataset.

    TMT        Mon Std Dev        Ann Std Dev        Monthly r2   Annual r2
               Differences (°C)   Differences (°C)   Composite    Composite
    UAHv5.3    0.088              0.037              0.90         0.96
    RSSv3.2    0.104              0.065              0.89         0.91
    STARv2.0   0.102              0.065              0.89         0.91

    With such a comparison (to make a long story short) we were able to generate a set of error characteristics that was not affected by our own subjective notions of parametric uncertainty. We did some of this in our 1992 papers (Spencer and Christy 1992a,b), but did so more thoroughly beginning with Christy et al. 2003. This type of analysis was most recently updated in Christy et al 2011. We also utilized the 28-station, well-documented Australian network in this paper (results below), but with less success due to some significant changes in their stations that were not consistent in time. Thus each station had to be treated separately and it was UAH which successfully pin-pointed the Australian instrument changes (for which there was documentation) more often than RSS and STAR.

    Characteristics of the detection of breakpoints for the 28 Australian sondes.

    TMT        Positive       Negative       Mon Std Dev        Median r        Annual r2
               break-points   break-points   Differences (°C)   (28 stations)   Composite
    UAHv5.3    41             15             0.114              0.92            0.96
    RSSv3.2    29             11             0.123              0.85            0.91
    STARv2.0   27             13             0.126              0.88            0.91

    Of the three datasets (UAH, RSS and STAR) the results indicated UAH data achieved the smallest error characteristics relative to both the U.S. VIZ and the Australian radiosondes.

    In Christy et al. 2011 (and earlier papers) we found a clear trend in the difference between the VIZ sondes and both RSS and STAR during the 1990s which was not present relative to UAH. This was during a period in which the VIZ instrumentation was completely consistent. We interpret this to demonstrate a spurious warming due to diurnal correction errors in RSS and STAR (they both used the same diurnal adjustment) for NOAA-11 and NOAA-14. Christy et al. 2010 demonstrated the same result using the more traditional area-average surface and radiosonde datasets in the tropics.

    Mears, evidently, does not appreciate these results as I do. He indicates that “The uncertainty in UAH has not been documented well enough for me to feel comfortable in doing analysis with it.” According to the published results above, I could say exactly the same about the RSS dataset. Rather, Mears’ group prefers to use a collection of “adjusted” individual multi-country radiosondes which are, from my perspective, plagued with unknown instrumentation and other changes (Mears et al. 2011). I’ve looked at these sondes individually through the years and many require “adjustments” for undocumented changes whose magnitudes impact the trend to an extent greater than the true trend-signal itself (see Christy and Norris 2004). So I chose not to add such a significant complication into the evaluation methodology. However, I have utilized the tropical average of these radiosonde datasets as a way to minimize their individual errors.

    For the reader, this question of “which is better?” ends up being a dilemma because both of our groups can make strong claims to back up our decisions as to why we chose the particular method of testing. It is entirely understandable that the reader would be suspicious of any group whose methodology of evaluation supports the results of that group. So, which dataset is better? In many of my presentations, as done here, I simply utilize the average of UAH and RSS and so by-pass the question. By so doing, I essentially assume that UAH and RSS contain an equal amount of error on either side of the truth. This seems reasonable to me. (However, if we both contain a systematic error, such as inclusion of what we believe is spurious warming of the NOAA-12 sensor, that error remains in the average.) Averaging the radiosonde datasets (with four members) is reasonable as well.

    Reader Phi: (stamped 2013-09-12 16:34:04)
    Others have discussed this through the years – i.e. estimating the surface trend by making it consistent with the upper-air profile rather than using the scattered, and often deficient, surface thermometer stations. Doing so gives a result for the 1979-2012 surface trend (in your diagram about +0.035 °C/decade) that is outside of the measurement errors of the observations. There is some information to support the hypothesis that the surface warming over the tropical land is misrepresented by the current datasets since they utilize TMean, which contains the contamination of TMin by surface development (see my papers on East Africa temperatures, Christy et al. 2009, Christy 2013). While I think the current surface datasets show more warming than they should, I don’t think they are off by that much. I’m comfortable with the idea that the complex vertical temperature structure of the tropics contains enough degrees of freedom to allow for departures in the area-average from the strict moist-adiabatic lapse rate profile. However, if the tropical surface trend is only +0.04 °C/decade, then model projections of the past 35 years are in even greater error than demonstrated in this blog topic.

    Reader GavinCawley:
    The issue in Douglass et al. 2007 is straightforward. There were two populations of a defined metric in the tropics (a linear trend) that we compared – trends from observations and trends from a model average. There were several altitude levels for this test so the model averages consisted of comparisons at each level. Our analysis simply compared the MEAN of the trends of the models for significance testing at each altitude. (We used the MEAN because this is often used as the “Best Guess” for IPCC assertions. Again, our study focused on the MEAN.) There were 67 model runs representing 22 different modeling groups in our model population. We chose to be conservative and assigned for the models a sample size of N = 22 rather than 67 even though all 67 were included. We calculated the mean and standard error of the model trends at each level according to the most simple of statistical methods in which the magnitude of the population mean is estimated from the sample mean with standard errors appropriately calculated. The observations were significantly different (highly significantly different) from the models’ means. Several new publications are appearing which support this conclusion (e.g. Douglass and Christy 2013, Fyfe et al. 2013).

    The idea of parallel Earths is not needed. To answer the point about calculating the standard-error-of-the-mean by using an infinite number of samples from the population, which then produces a value of zero for the std-err – this answer is correct, i.e. you would have perfect knowledge of the mean since you used all members of the population and thus no error. The simple result of Douglass et al. 2007 is still valid for the limited, restrictive question we asked and answered. We were comparing the MEAN tropospheric trends of the models for the restricted case in which their surface trend was +0.13 °C/decade. As more realizations are added (whether on parallel Earths or not), and because models have a very rigid relationship between the surface and the troposphere, the MEAN of whatever set of models is chosen will be found within a narrow range. This was further demonstrated in Christy et al. 2010. If I understand GavinCawley’s claim, it appears that he/she believes that there should be a very wide range of tropospheric trends for the case in which all of the selected models possess a surface trend of +0.13 °C/decade. I have never seen evidence that would support such a claim, and indeed have seen considerable published evidence that contradicts it. A simple check of the plots I presented in the initial blog indicates as much. [We use the restriction on the surface trend being +0.13 °C/decade for the obvious reason that the tropospheric observations are so constrained as well. Thus we can have a true “apples-to-apples” comparison throughout the atmosphere.]
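    For readers unfamiliar with the test being described, here is a minimal sketch of a standard-error-of-the-mean comparison of the kind Christy outlines (the trend numbers are invented for illustration; Douglass et al. 2007 used N = 22 model groups):

    ```python
    import math

    # Hypothetical model trends (°C/decade) at one altitude level; invented values
    model_trends = [0.10 + 0.02 * math.sin(i) for i in range(22)]
    obs_trend = 0.04                      # hypothetical observed trend

    n = len(model_trends)
    mean = sum(model_trends) / n
    var = sum((t - mean) ** 2 for t in model_trends) / (n - 1)
    sem = math.sqrt(var / n)              # standard error of the mean

    # Douglass-style criterion: is the observation more than ~2 standard
    # errors away from the model mean?
    significant = abs(obs_trend - mean) > 2 * sem
    print(significant)  # True for these invented numbers
    ```

    Note that this tests the model *mean*, exactly the narrow question Douglass et al. say they asked; the wider question of whether an observation falls inside the model *spread* would use the standard deviation instead of the standard error.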

    Christy, J.R., R.W. Spencer and R.T. McNider, 1995. Reducing noise in the MSU daily lower-tropospheric global temperature dataset. J. Climate. 8, 888-896.
    Christy, J.R., R.W. Spencer, W.B. Norris, W.D. Braswell and D.E. Parker, 2003. Error estimates of version 5.0 of MSU-AMSU bulk atmospheric temperatures. J. Atmos. Oc. Tech., 20, 613-628.
    Christy, J.R., W.B. Norris and R.T. McNider, 2009: Surface temperature variations in East Africa and possible causes. J. Clim. 22, DOI: 10.1175/2008JCLI2726.1.
    Christy, J.R., B. Herman, R. Pielke, Sr., P. Klotzbach, R.T. McNider, J.J. Hnilo, R.W. Spencer, T. Chase and D. Douglass, 2010: What do observational datasets say about modeled tropospheric temperature trends since 1979? Remote Sens. 2, 2138-2169. Doi:10.3390/rs2092148.
    Christy, J.R., R.W. Spencer and W.B Norris, 2011: The role of remote sensing in monitoring global bulk tropospheric temperatures. Int. J. Remote Sens. 32, 671-685, DOI:10.1080/01431161.2010.517803.
    Christy, J.R., 2013, Monthly temperature observations for Uganda. J. App. Meteor. Clim., in press.
    Douglass, D.H. and J.R. Christy, 2013. Reconciling observations of global temperature change: 2013. Energy and Env., 24, 415-419.
    Fyfe, J.C., N.P. Gillett and F.W. Zwiers, 2013. Overestimated global warming over the past 20 years. Nature Climate Change. 3. 767-769.

    Mears, C.A., F.J. Wentz, P. Thorne, D. Bernie, 2011. Assessing uncertainty in estimates of atmospheric temperature changes from MSU and AMSU using a Monte-Carlo estimation technique. J. Geophys. Res. 116, Issue D8, 27.

  • Bob Tisdale

    I suspect one of the reasons for the difference between the models (the tropical hotspot) and observations (no hotspot) may result from how poorly climate models simulate sea surface temperatures, primarily in the Pacific.

    The following is a model-data comparison of the Pacific sea surface temperature anomaly trends for the past 31 years on a zonal-mean basis. The data is the Reynolds OI.v2 SST, and the models are the multi-model ensemble mean of the CMIP5-archived models—simulation of TOS (Historic/RCP6.0).
    http://bobtisdale.files.wordpress.com/2013/02/02-zonal-pacific.png

    The models show a relatively high warming rate in the tropics, but the data show little to no warming. In fact, over this period, the equatorial Pacific has cooled.

    And I also suspect the differences between the modeled and observed sea surface temperature trends in the Pacific are caused by the failure of climate models to properly simulate ENSO processes.

    The above graph is from this post:
    http://bobtisdale.wordpress.com/2013/02/28/cmip5-model-data-comparison-satellite-era-sea-surface-temperature-anomalies/

    If the models can’t simulate ENSO or Pacific sea surface temperature trends properly, one wonders how they can ever hope to project regional variations in temperature and precipitation.

    Regards

  • Arthur Smith

    I’m wondering who even thought this was a good or important point for discussion. The intro text is itself misleading in a number of ways – on the facts and on the history, some of which I believe it gets wrong (or at least presents without appropriate error bars). As far as I can tell from my study of the history of this, the first person to highlight tropical lapse rate changes with the name “hot spot” and claim it was a “fingerprint” of greenhouse warming was Christopher Monckton in an August 2007 article here:
    http://scienceandpublicpolicy.org/monckton/greenhouse_warming_what_greenhouse_warming_.html

    and as can be read there (compared with the rational discussion above), he clearly was very confused by Figure 9.1 in IPCC AR4 WG1. It’s not a “spot”, for one thing – we’re talking about the entire equatorial band around the earth. And it’s hardly “hot” – the troposphere is still cooler than the surface; the issue is the ratio of a small relative temperature change, making it hard to measure (something John Christy seems to refuse to acknowledge in his comments, which refer only to problems in models, not in observations). And it’s very definitely not a “fingerprint” of greenhouse warming – the same pattern is there, as discussed clearly by Mears above, for every source of surface warming.

    Furthermore, the ratio of tropical troposphere to surface changes *is* greater than one if you look over relatively short periods of time – as Mears also discussed here, as can be read clearly from Santer’s 2008 paper for instance. So the expected amplification from theory (“models” hardly does the reasoning here justice) seems to be very visible on short time scales. The discrepancy is *ONLY* with regard to the ratio of long-term temperature changes, over periods of greater than a decade.

    For example, look at monthly temperature anomalies between a relative low in January 1989 and relative high in February 1998, a large temperature change over less than a decade. From some numbers I looked at a few years ago (apologies if corrections have changed this significantly) we have:

    January 1989 to February 1998 change:
    Hadley tropical surface temp: 0.970 C (-0.172 to 0.798)
    RSS tropical TLT: 1.836 (-0.522 to 1.314)
    RSS tropical TMT: 1.806 (-0.466 to 1.340)
    UAH tropical T2LT: 1.83 (-0.52 to 1.31)
    UAH tropical T2: 1.76 (-0.48 to 1.28)
    so dividing by the 0.97 surface temperature difference gives an amplification of between 1.81 and 1.89. What explains this clear case of amplification over a little under a decade, but the apparent lack of amplification in the longer-term trends? If there’s no reasonable theoretical explanation for such a difference, the difficulty of observational calibration over such long time periods strongly suggests it’s not the theory that’s wrong.
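    The division in that last step can be verified directly from the numbers quoted above:

    ```python
    # Jan 1989 -> Feb 1998 changes (°C) as listed above
    surface = 0.970                      # Hadley tropical surface
    upper_air = {
        "RSS TLT": 1.836,
        "RSS TMT": 1.806,
        "UAH T2LT": 1.83,
        "UAH T2": 1.76,
    }

    # Amplification = upper-air change / surface change
    ratios = {name: round(v / surface, 2) for name, v in upper_air.items()}
    print(ratios)  # ranges from about 1.81 (UAH T2) to 1.89 (RSS TLT)
    ```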

    So we have a case here of tricky and difficult observations being promoted in a wildly incorrect fashion by the flamboyant Mr. Monckton – and then echoed in venues such as this one – for what reason? The burden of proof here is with the satellite observations, not with basic physical theory. As has been pointed out earlier as well – surface temperature changes are well understood. If the tropical troposphere is warming less than expected, the logic points to higher climate sensitivity, not lower. So this whole mess has only downsides for those trying to avoid action.

  • chris colose

    First, thanks to all participants for their comments. A number of interesting issues were raised, not just with respect to the physical science of the “tropical hotspot,” but also philosophical issues relating to how we place uncertainties in an appropriate context for model (or observational data) evaluation, and how these things then get translated into what is shared with policymakers at varying levels of confidence.

    I was very happy to see an extensive discussion by Steve Sherwood and Carl Mears on the very large uncertainties in the observational datasets, which right now do not provide a robust direct comparison when evaluating whether the tropical troposphere has stayed close to a moist adiabat. Other “proxy” measurements, such as those developed from the thermal wind equation (e.g., Allen and Sherwood, 2008) or those looking at the structure of deep convection changes in the tropics (e.g., Johnson and Xie, 2010), are also a good supplement to the topic, because they are independent from the satellite or radiosonde temperature data, and do not suggest a fundamental data-theory-model mismatch. I was also happy to see a discussion by Steve Sherwood on various implications of a real data-model mismatch should it exist. In the next paragraph, I will outline some points where I disagree with Dr. Sherwood on this. Unfortunately, John Christy’s post read like a defense lawyer’s argument on why models stink and why everything is too complex, with only fairly limited substance on the actual issue of the tropical hotspot (and with only limited reference to a large body of literature on observational uncertainty). Nonetheless, several of his points require elaboration.

    Steve Sherwood correctly concludes that there is no obvious connection between a tropical hotspot and climate sensitivity. In fact, because the greenhouse effect depends on the temperature difference between the surface and layers aloft, lack of upper-level amplification could actually mean a slightly higher climate sensitivity, since the lack of enhanced infrared emission aloft (with no hotspot) would be compensated for by higher temperatures lower down to restore planetary energy balance. This would be a small effect though, and somewhat counteracted by a weaker water vapor feedback.

    I do think Steve oversells his point by saying “nil” however. Large departures from a moist adiabat would signal a fundamental destabilization of the tropical troposphere and have some influence on basically any tropical process involving deep convection. Personally, I tend toward the Sherwood-Mears argument that this is a big problem with observations and probably less of a “real” issue, but to the extent the issue is real, wide communities (e.g., those in hurricane projections) would need to take it seriously. This doesn’t translate trivially into climate change projections or into climate sensitivity.

    Much of John Christy’s post involved all sorts of topics not directly related to the problem of the tropical hotspot. As others have pointed out, for example, the CO2 itself has little to do with the moist adiabat structure. The tropical hotspot would exist even if the Sun were the root cause of global warming (the stratospheric structure is a different story). Disappointingly, John Christy’s post does not give this impression. Moreover, the discussions about aerosol-cloud interactions, or how much the deep oceans are responsible for heat uptake and the decadal-scale changes in global temperature, are rather important but (IMO) distracting issues from the topic at hand. I think that Carl Mears and Steve Sherwood did a much better job at discussing multiple issues with both the data and models.

    It is important to note that no climate modeler regards their model as “the truth” but as testbeds of varying levels of complexity and usefulness, depending on the question being raised. Model evaluation should not be done on the expectation of perfection but on the skill at simulating various features of climate or climate change (e.g., there are countless questions one could raise for Mt. Pinatubo, the LGM, the mid-Holocene, etc.). The tropical hotspot is just one of thousands of topics deserving of attention. Other topics like Arctic sea ice, snowpack in Colorado, and precipitation changes in the subtropics are all governed by different physical phenomena and need to be evaluated on their own. Sometimes, mismatches between models and observations are expected to arise for a number of good reasons, one of which is simply that the observed temperature record is one single realization of many possible realizations that could have emerged in the last few decades given internal climate variability. Another issue is observational uncertainty, or mis-specified forcings in the model. It is non-trivial to establish a real mismatch, but if one exists, finding out why it exists is how interesting science gets done.

    After all of this has settled down, those aspects of climate (like e.g., the water vapor feedback, or summertime cooling following a volcanic eruption) that are robust to multiple datasets/methods/groups, are emerging in observations, have a sound theoretical basis, and are borne out paleoclimatically will be brought forth to policymakers or other groups with strong confidence. Other things that are not yet settled (like what dataset is best for evaluating the tropical hotspot, or whether there is a link between tornadoes and climate change) which do not meet these criteria are not brought forward or are done so with less confidence. Along the way, the implications for the “bigger picture” (e.g., the attribution of global warming to human activity or sensitivity) should be kept in mind.

    Finally, it is bizarre to me that it is recommended that models should not be used to inform policymakers because of uncertainty (which, by extension, would also have to apply to the radiosondes and satellites in this case). Those seeking information in policy, agriculture, the military, insurance, etc. need to be told about uncertainty in an appropriate way, and those groups, I’m sure, are used to dealing with it. Unfortunately, we do not yet have UAH satellite observations of the future, so we need to rely on models to inform outcomes. Uncertainty works in a number of ways, and using it to suggest we cannot inform policy on a number of topics is, in my view, not a defensible position.

  • Glenn Tamblyn

    A question for John & Carl, as the Satellite Temperature experts.

    Could the Hot Spot be hiding in plain sight in the existing data? Follow my chain of thought and see what you think.

    The Upper Troposphere channel (TTS, MSU 3, AMSU 7) is showing essentially no warming or cooling. But this is the channel we would expect to see the hot spot appearing in.

    TTS has a weighting function in which a substantial fraction of the signal (perhaps 40%) originates in the lower stratosphere and the rest in the upper troposphere.

    Channel TLS (MSU 4, AMSU 9) is associated with the lower stratosphere. Its weighting function is strongly located in the lower and middle stratosphere, with only a very small percentage (less than 10%) of its signal originating from the upper troposphere. And this channel is showing substantial cooling (more than 0.3 DegC/decade, which is stronger than the warming trend in the lower troposphere). Also, the less commonly cited higher-altitude channels for the rest of the stratosphere all show the stratosphere cooling as well.

    So, if the stratosphere is cooling strongly then nearly half the signal for the TTS channel, the one that is relevant when looking for the hot spot, is coming from a region of strong cooling. If the overall signal for TTS is showing no warming or cooling, doesn’t this suggest that the part of the TTS signal that originates in the upper troposphere must actually be warming, in order to balance the stratospheric component to give a net flat-line trend?
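    The arithmetic behind this chain of thought can be sketched in a few lines. The 40% stratospheric weight and the trend values below are the round numbers quoted in the comment, used purely for illustration; the real weighting functions are smooth vertical profiles, not a two-layer split:

```python
# Back-of-envelope decomposition of the TTS trend (illustrative numbers only).
# Assume the TTS weighting function draws ~40% of its signal from the lower
# stratosphere and the remaining ~60% from the upper troposphere.
w_strat = 0.40
w_trop = 1.0 - w_strat

tts_trend = 0.0     # observed net TTS trend, DegC/decade (roughly flat)
strat_trend = -0.3  # lower-stratospheric cooling, DegC/decade (from TLS)

# If tts_trend = w_strat*strat_trend + w_trop*trop_trend, then the implied
# upper-tropospheric trend is:
trop_trend = (tts_trend - w_strat * strat_trend) / w_trop
print(f"implied upper-troposphere trend: {trop_trend:+.2f} DegC/decade")
```

    Under these assumed weights, a flat TTS trend combined with -0.3 DegC/decade of stratospheric cooling would indeed imply about +0.2 DegC/decade of upper-tropospheric warming, which is the commenter's point.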

    Isn’t the hot-spot visible in the data if we understand what the data actually means?

    Could an approach similar to that used to create the TLT synthetic channel, using a differencing algorithm applied to off-nadir samples taken from MSU 3/AMSU 7, be used to synthesize a true upper-troposphere measurement? Alternatively, could an approach similar to that of Fu & Johanson be used to extract such a signal?

    That might clear up this question once and for all.

  • Glenn Tamblyn

    Also a question for John Christy.

    In your first graph you have taken the average of the data for the UAH and RSS satellite products for the mid-troposphere channel (TMT). But there is a third team producing such a product from the same channel, the STAR/NESDIS team. Why haven’t you included their analysis? UAH is showing a trend for TMT of 0.04 DegC/decade and RSS of 0.078, giving an average of around 0.06.

    However, STAR/NESDIS are reporting a TMT trend of around 0.124 DegC/decade, so an average of all three products would be more like 0.08, a third higher. Wouldn’t it be more meaningful to show the range of results produced by all three products? A trend of 0.124 from STAR/NESDIS would match very well with the climate model results, for example. Although all three teams are working hard to find the correct result, the range of values obtained still leaves open some significant questions about what is actually happening up there. Obviously all three teams can’t be right, and showing the range of results seems far more meaningful.
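    For what it's worth, the averaging arithmetic in this comment checks out (trend values in DegC/decade as quoted above):

```python
# Tropical TMT trend estimates (DegC/decade) as quoted in the comment.
trends = {"UAH": 0.040, "RSS": 0.078, "STAR/NESDIS": 0.124}

two_way = (trends["UAH"] + trends["RSS"]) / 2          # UAH + RSS only
three_way = sum(trends.values()) / len(trends)         # all three teams
spread = max(trends.values()) - min(trends.values())   # structural uncertainty

print(f"UAH+RSS average:   {two_way:.3f}")
print(f"three-way average: {three_way:.3f}")
print(f"spread:            {spread:.3f}")
```

    Including STAR/NESDIS raises the average from about 0.06 to about 0.08 DegC/decade, and the 0.084 DegC/decade spread between teams is larger than either average, which supports the suggestion that the range, rather than a single average, is the meaningful quantity.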

    Also, in using the radiosonde data, what consideration have you given to the well-recognized issues with radiative heating effects and the way in which they distort the trends reported by the radiosonde products? Many people have expressed reservations about how much credence can be given to the radiosonde data, particularly at higher altitudes.

    Also, the sideways step in the radiosonde data between ground level and 850 hPa looks really suspicious, almost unphysical. If there is an issue with the radiosonde data, with perhaps a bias introduced there, maybe the entire curve needs to be shifted to the right. In that case the radiosondes aren’t as far out of step with the models as they appear.

    Also, why haven’t you included the satellite data from the three teams here where applicable? Although the sat’ data covers broader altitude ranges it would still provide a useful point of comparison. Why omit it?

  • GavinCawley

    It is worth noting that the statistical test used in Douglass et al. (2008) is obviously inappropriate, as a perfect climate model is almost guaranteed to fail it! This is because the uncertainty is measured by the standard error of the mean, rather than the standard deviation, which falls to zero as the number of models in the ensemble goes to infinity. If we could visit parallel universes, we could construct a perfect climate model by observing the climate on those parallel Earths with identical forcings and climate physics, but which differed only in variations in initial conditions. We could perfectly characterise the remaining uncertainty by using an infinite ensemble of these parallel Earths (showing the range of outcomes that are consistent with the forcings). Clearly, as the actual Earth is statistically interchangeable with any of the parallel Earths, there is no reason to expect the climate on the actual Earth to be any closer to the ensemble mean than any randomly selected parallel Earth. However, as the Douglass et al. test requires the observations to lie within +/- 2 standard errors of the mean, the perfect ensemble will fail the test unless the observations exactly match the ensemble mean, as the standard error is zero (because it is an infinite ensemble). Had we used +/- twice the standard deviation, on the other hand, the perfect model would be very likely to pass the test. Having a test that becomes more and more difficult to pass as the size of the ensemble grows is clearly unreasonable. The spread of the ensemble is essentially an indication of the outcomes that are consistent with the forcings, given our ignorance of the initial conditions and our best understanding of the physics. Adding members to the ensemble does not reduce this uncertainty, but it does help to characterise it.
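    This point is easy to demonstrate numerically. The sketch below builds a "perfect" ensemble by construction: the observation is drawn from the very same distribution as the ensemble members (a standard normal; the distribution parameters are arbitrary), and we check how often it falls within 2 standard errors versus 2 standard deviations of the ensemble mean:

```python
import random

random.seed(0)

def perfect_ensemble_test(n_members, n_trials=2000, k=2.0):
    """Fraction of trials where an observation, drawn from the SAME
    distribution as the ensemble members, lies within k standard errors
    (se) and within k standard deviations (sd) of the ensemble mean."""
    hits_se = hits_sd = 0
    for _ in range(n_trials):
        members = [random.gauss(0.0, 1.0) for _ in range(n_members)]
        obs = random.gauss(0.0, 1.0)  # statistically interchangeable with members
        mean = sum(members) / n_members
        sd = (sum((x - mean) ** 2 for x in members) / (n_members - 1)) ** 0.5
        se = sd / n_members ** 0.5
        hits_se += abs(obs - mean) <= k * se
        hits_sd += abs(obs - mean) <= k * sd
    return hits_se / n_trials, hits_sd / n_trials

for n in (5, 20, 100):
    se_rate, sd_rate = perfect_ensemble_test(n)
    print(f"n={n:3d} members: pass rate with 2se = {se_rate:.2f}, with 2sd = {sd_rate:.2f}")
```

    The 2-standard-deviation test passes the perfect model roughly 95% of the time regardless of ensemble size, while the 2-standard-error pass rate shrinks toward zero as members are added, which is exactly the pathology described above.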

    The thing that really concerns me, though, is that Douglass and Christy (2013) discuss their earlier paper quite uncritically, despite the statistical shortcomings of that paper having been widely discussed, both online and in the peer-reviewed literature.

    Douglass DH, Christy JR, Pearson BD, Singer SF (2008) “A comparison of tropical temperature trends with model predictions”, Int. J. Climatol., 28, 1693–1701.

    Douglass D, Christy JR (2013) “Reconciling observations of global temperature change: 2013”, Energy & Environment, 24(3–4), 414–419.

  • Theo Wolters

    The explanation of the physical origin of the tropical hot spot by Mr Mears is based on undisputed physics.
    However, in my opinion that does not mean we can claim that the climate, in all its complexity, should behave according to our simplified physical representations. There is, for instance, no doubt that many climate phenomena behave as chaotic systems, seemingly disobeying physical laws that we would expect to produce linear behaviour.

    In the case of the hot spot, it is in my opinion possible to suggest mechanisms that are consistent with the reality we know, and still don’t produce a hotspot.
    I would like to bring to your attention a theory that might explain why the hot spot is missing.

    It is based on the essence of the tropical thunderstorm: it originates from the temperature difference between the warming surface and the air of the lower troposphere, so it will develop as soon as this temperature difference reaches a certain value. On a hot day that will happen earlier than on a cold day, but at the same surface temperature. And the storm will immediately cool the surface beneath its centre considerably.
    This mechanism can be observed on satellite images.

    This would result in vertical temperature profiles of storms that are the same on both warm and cold days. On a hot day, or in a warmer climate, they just start earlier and last longer, transporting a lot more energy to the tropopause, without the necessity of a temperature change at any height. An increase of the volume of air and water vapour that is transported upwards would suffice to transport and store the extra energy, without producing a hot spot.

    This theory is explained in more detail on http://www.climatetheory.net/11-analysing-the-missing-tropical-hot-spot/

  • Nichol Brummer

    I have a question about the last paragraph from Sherwood, about where the heat has been going in the past decade. If more of the heat has gone into the oceans, as seems likely, is it then appropriate to relax and think that the oceans are keeping us cool? Or will more heat in the oceans also cause a bit more sea level rise? Also, couldn’t a small change of sea temperature be a big problem for ecosystems that may have evolved to depend on the past relative stability of ocean temperatures? So even if atmospheric temperature has not been rising much, we should still worry about the effects on ocean ecosystems and on sea level.

  • Andries Rosema

    As I understand it, the basic argument for a tropical hotspot would be the increase in evaporation due to an increase of surface temperature. It should be clear that evapotranspiration does not depend on surface air temperature, but on the surface “skin” temperature, which may differ considerably from the air temperature. We have studied the surface “skin” temperature, using Meteosat data for the period 1982-2006, and find a significant decrease in both land and ocean skin temperature. Why should we then expect a tropical hotspot?

    See: Rosema A, Foppes S, van der Woerd J (2013) “Meteosat derived planetary temperature trend 1982–2006”, Energy & Environment, 24(3–4), 381–395.

  • chris colose

    Andries

    Actually, evaporation is not really relevant for this picture. The mechanisms governing the amount of water vapor in the air (and the non-linear latent heat release via Clausius-Clapeyron) work even if you reduce evaporation a bit, say through reduced wind speed.

  • Theo Wolters

    @ Chris and Andries
    The way Mr Mears explains the hot spot is clearly illustrated in the link I mentioned before:
    http://www.climatetheory.net/11-analysing-the-missing-tropical-hot-spot/

    Using the graph in the link, and assuming the same absolute humidity (instead of the same relative humidity) of the parcel at the surface, a 10 degrees warmer parcel would first ascend along the DALR until about 1 km altitude, and then follow the SALR.
    This SALR curve would be very close to the 20C curve, resulting in a complete lack of warming at 10 km.

    So I don’t think that the (absolute) humidity at the surface can be disregarded.

  • Ross McKitrick

    Carl Mears distinguishes two aspects of model behaviour in the tropics that seem to be conflated in the use of the term ‘hot spot’: the observed rate of warming, and the ratio of warming aloft to that at the surface. Mears’ post focuses on the amplification aspect, whereas John Christy’s focuses on the rate itself.

    Regarding the uniqueness of the tropical “hotspot”, the uniqueness arises from the magnitude of the trend, not the amplification with respect to the surface. While it is true that amplification would be observed in response also to increased solar forcing, it’s clear from comparing panels (a) (solar), (c) (GHG) and (f) (all) in the IPCC figure that only GHG’s are expected to have had a sufficiently strong effect to yield the level of warming projected overall. Were there to be a lack of warming, it would be most inconsistent with the GHG simulation.

    To ask whether ‘the hotspot is missing’ is evidently too ambiguous a title. I distinguish 4 variants of the question: (a) Is there any observed amplification at all, (b) is there as much as is predicted by climate models, (c) is there any warming trend aloft, and (d) is there as large a trend as is predicted by models?

    Here is what I glean from the 3 postings so far:

    (a) From his Figure 1, Mears argues that, with long enough data sets, enough of the observational series show troposphere/surface trend ratios greater than 1 to allow us to say that evidence does not refute the expectation of amplification.

    (b) Mears’ Figures 1 and 3 (+4) nonetheless show that the amplification rate in models is high relative to the distribution in the observations. Current data sets are too short to say whether the difference is statistically significant or not. I doubt they will ever be long enough. The statistical issues involved in figuring out the distributions of ratios of random numbers get complicated quickly, and I wouldn’t be surprised if the problem is intractable.

    (c) None of the authors focused on this question. In the correction to McKitrick, McIntyre and Herman (2010, herein MMH; see Atmospheric Science Letters, October 7, 2011), we show that, for a 1979-2009 sample, the answer is mostly yes in the LT layer and mostly no in the MT layer. In my work with Tim Vogelsang (under review, herein MV, available at http://econapps-in-climatology.webs.com/MV-revision-April_2013.pdf) we find that over the 1958-2010 interval for HadAT, RICH and RAOBCORE the answer is yes in LT & MT, but if you allow a step-change in 1977 the trends go to zero in both the LT and MT layers. So it’s not a “trend”; it’s a single step that accounts for the change in the mean over the sample.

    (d) Christy focused on this question, which I consider the more interesting one as well. Mears brought the issue up in regards to his Figure 2, acknowledging that the observed trends are at the extreme low end of the model distribution. I thought it was unfortunate that he tabled further discussion, despite agreeing that it is the more interesting issue. In MMH we showed that the difference between models and observations is weakly significant over 1979-1999 and significant (p<0.05) over 1979-2009, at both the LT and MT layers. In MV we find the difference over 1958-2010 is significant whether or not a step-change is permitted at 1977, that the data endogenously determine the necessity of a 1977 step-change, and that the rejection of climate models is very strong (p<0.001 in all cases). In both these papers we use robust time series methods that address the shortcomings of the methods in the Douglass et al/Santer et al dispute.

    Specialists might find the discussion of (a) and (b) interesting, but it seems to me that it amounts to arguing over a specific aspect of the behavior of climate models that contributes toward, but doesn't fully determine, their overall accuracy and validity. (c) and (d) are interesting because the models are in such clear agreement that, given the underlying assumptions they share in common, there ought to be, not merely this or that ratio of warming aloft-versus-surface, but a lot of warming, period. The failure to observe anything like that much warming means either that multiple independent observational systems are missing the warming that really is there, or models share some biases in common. Since the observational record involves two different systems (radiosondes and MSU) and the average balloon record does not differ from the average MSU series (MMH Table III), the credibility of the observed record merits serious consideration.

    Parenthetically, having published numerous papers on problems in the various land surface temperature series and paleoclimate reconstructions, I am aware that people may base their assessment of the reliability of temperature data sets on whether they support (or not) hypotheses that are preferred on a priori bases. So the details of why data sets get deemed to be reliable or not matter. I find the IPCC and CCSP processes implausibly quick to accept the land surface record while dismissing the tropospheric record.

    I don't buy Steven Sherwood's argument that a lack of warming in the tropics is ultimately irrelevant either for attribution or sensitivity calculations. The Gaffen et al data cited includes the 1977 Pacific Climate Shift, and I suspect that had the sample ended in, say, 1976, the results would show model trend overprediction, as is the case in MV. His challenge, that those arguing for the importance of the issue need to come up with an alternative model that "agrees just as well with observations" is misplaced, since the starting point of the whole discussion is that the existing models don't agree with the observations.

    The Figure shown at the top of the discussion, from the AR4, is a backcast indicating that models projected that a lot of warming should have been observed due to the increase in GHG's over the 20th century. Mears' Figure 2 and Christy's Figure 1 show that relatively little has been observed since 1979 in comparison to model projections, and as I explained, the discrepancies are statistically significant. My work in MV shows very little trend since 1958, only a step change at 1977 that is not predicted by the models and is associated with a different cause. It adds up to a potentially serious inconsistency, to coin a phrase.

    Overall, it looks to me like there is enough agreement among GCMs about what "should" be happening in the tropical troposphere if the underlying mechanisms are all well-understood and accurately represented in models, and enough agreement among different data sets that it is either not happening or happening only at a very attenuated level, to take it as read that the models are overstating the expected rate of warming in the tropical troposphere in response to rising GHG levels. I think it will be very important in the years ahead to figure out why this is the case.

  • phi

    I have a question about Figures 2 and 4 of the text of JR Christy: how are the observational trends at 1000 hPa determined, and are there values at 950 hPa?

    And two comments:

    - the declining trend from 1000 hPa to 850 hPa is quite surprising,
    - the calibration of Figure 4 at 850 hPa tells a very different story.

  • Arthur Smith

    Ross McKitrick asserts consistency in the tropospheric observations because “the observational record involves two different systems (radiosondes and MSU) and the average balloon record does not differ from the average MSU series (MMH Table III)”. If ever there was an assertion about scientific observations that tells far more about the asserter than it does about reality, this is surely it. I urge the diligent on this thread to investigate this claim, and what it really implies about the consistency of tropospheric temperature trends, or in more common terminology, what their uncertainty is likely to be, a term that McKitrick notably avoids while talking about “averages”.

  • phi

    Carl Mears,

    Thank you for your interesting answer.

    From my side, I would make a different assumption even if I do not know enough about the source of the data and the tropical dynamics to be very assertive.

    First, I don’t think that surface temperature trends are very reliable. You noted that the boundary layer is complex; one can add that it is also very inhomogeneous in the horizontal plane. In addition, these inhomogeneities are not temporally stable, and a series of puzzles are related to surface temperatures (various divergences and inconsistencies).

    For these reasons, I prefer not to take into account the trend at 1000 hPa and instead to analyze what happens from 850 hPa upward. It may not be statistically very significant, but in that case the observed (relative) hot spot is even more pronounced than the modeled one (about 25% more at the tropopause).

    On the other hand, the modeling of the initial effect of GHGs (before feedbacks) postulates the invariance of the temperature gradient. This assumption is in contradiction with the general theory of heat transfer (flow distribution). In fact, we should expect an initial hot spot (not related to water vapor) in parallel with the increase of the average temperature.

    The strong hot spot observed and the overestimation of surface warming by the models are two features that tend to confirm this theory.

  • Ross McKitrick

    Steven, I am not making a categorical claim that the land surface record should be dismissed entirely and the free atmosphere data should be accepted uncritically. My parenthetical comment was to point out that the research community seems readily to accept that there are possible non-climatic trends in the balloon and MSU records, yet is very resistant to evidence of the same problems in the land data products. The IPCC described biases in the land record as “negligible” and dismissed their possible influence out of hand, though in the most recent draft they seem to have climbed down from this rigid stance somewhat. All data products, including the SST records, have strengths and weaknesses. It strikes me that one of the strengths of the free atmosphere data is that two independent measurement systems operate simultaneously in the same locations. So the fact that, on average, in the tropics, balloons do not disagree with satellites but both disagree with models points to a real mismatch between models and observations.

  • Paul S

    Ross McKitrick,

    “While it is true that amplification would be observed in response also to increased solar forcing, it’s clear from comparing panels (a) (solar), (c) (GHG) and (f) (all) in the IPCC figure that only GHG’s are expected to have had a sufficiently strong effect to yield the level of warming projected overall.”

    As far as I can see you haven’t referenced the figure in question, but I think I recall from arguments you’ve made previously that you’re talking about this one. You’re either misinterpreting this figure or missing the point. What it shows is the expected temperature change given our understanding of how those different climate drivers have changed historically. The GHG hotspot is larger simply because the model considers that historical GHG forcing should have caused more surface warming than, say, historical solar forcing. If we conjecture that historical solar forcing is actually of a similar magnitude to GHG forcing (as some have), the expected hotspot would be the same. The point is that the hotspot in these models is a function of surface warming, mostly unrelated to the cause of that warming.

    As others have indicated, trying to explain a missing hotspot (if it is missing) by invoking biases in the land surface temperature record doesn’t work well. I’ll offer a different perspective than others and simply focus on the magnitudes involved. Land takes up about 23% of surface area in the Tropics (20S to 20N). Even if we were extreme and decided to cut tropical land surface trends by 50%, it would only decrease tropical land+ocean trends by about 10-15%. Furthermore, since the proposed hotspot in mid-tropospheric temperatures is a function of the moist adiabatic lapse rate, it is sea surface temperature change, rather than change over land, which will dominate. So, even a large warming bias in the land surface temperature record would have only a small effect on land+ocean surface trends and would be negligible for expectations of mid-tropospheric temperature trends.
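    The magnitudes here are simple to verify. A minimal sketch, assuming (as in the comment) a 23% tropical land fraction and, as an additional simplification of my own, equal land and ocean trends before the hypothetical bias is removed:

```python
# Effect of halving the tropical LAND trend on the land+ocean trend.
land_frac = 0.23                 # 20S-20N land area fraction, as quoted above
land_trend = ocean_trend = 1.0   # simplifying assumption: equal trends (normalized)

combined = land_frac * land_trend + (1 - land_frac) * ocean_trend
combined_halved = land_frac * 0.5 * land_trend + (1 - land_frac) * ocean_trend

reduction = 1 - combined_halved / combined
print(f"reduction in land+ocean trend: {reduction:.1%}")
```

    Under equal trends the reduction is 11.5%; since land actually warms somewhat faster than ocean, the land share of the combined trend is a bit larger, which is presumably why the comment quotes a 10-15% range.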

    I think there may be scope for discussion of the SST records. Carl Mears states that HadCRUT4 and NCDC tropical land+ocean surface trends are very similar. However, modelling using prescribed SST observations (e.g. as discussed by Isaac Held here) is usually performed using the HadISST1 dataset, which doesn’t feature in any of the major land+ocean records. Whereas the 20S-20N 1979-2012 linear trend in HadSST3, part of HadCRUT4, is 0.10ºC/Dec, in HadISST1 it is 0.055ºC/Dec. Replacing HadSST3 with HadISST1 for the SST portion of HadCRUT4 changes the 20S-20N 1979-2012 trend from 0.115ºC/Dec to 0.080ºC/Dec. This is fairly significant for the surface trend and, because the difference is in the sea surface area, is potentially important for our expectations of mid-tropospheric trends.
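    The swap arithmetic quoted here can be reproduced under a simple assumption of mine (not a stated method from the comment): that the tropical HadCRUT4 trend is a fixed 77%/23% ocean/land blend, so the SST component can be subtracted out and replaced:

```python
ocean_frac = 0.77   # assumed ocean weight in the 20S-20N blend

hadcrut4 = 0.115    # 20S-20N 1979-2012 trends, degC/decade, as quoted above
hadsst3 = 0.10
hadisst1 = 0.055

# Infer the land contribution, then swap HadSST3 for HadISST1.
land_part = hadcrut4 - ocean_frac * hadsst3
swapped = land_part + ocean_frac * hadisst1
print(f"HadCRUT4 with HadISST1 ocean: {swapped:.3f} degC/decade")
```

    This lands on the quoted 0.080ºC/Dec, so the figure in the comment is internally consistent with a roughly 77% ocean weighting.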

    I’ve yet to find anything published looking at reconciliation of tropical HadISST1 and HadSST3 trends. Would be grateful if someone here has seen anything relevant.

  • Paul S

    Regarding the decreased trend at 850hPa compared to the surface, could it be partially related to radiosondes being launched nearly exclusively from land? It has been discussed in numerous papers that part of the land-sea warming contrast is caused by decreased evapotranspiration due to increased CO2. Even on islands this might enhance near-surface warming.

    I would guess the CMIP5 comparisons use complete sampling of the tropics (?) though even attempting to mask for radiosonde sampling might not work since GCM grid cells are too coarse for many of the islands used.

  • chris colose

    Regarding the question of feedback between water vapor and the lapse rate:

    The anti-correlation between the lapse rate and water vapor feedbacks is well understood physically (see e.g., Ingram, 2010), since outgoing radiation changes are largely determined by the relative humidity structure. This partial cancellation has been known for decades, and alternative ways to set up feedback definitions, such as keeping relative humidity fixed while warming the troposphere (instead of the usual base state in which just the temperature is allowed to adjust and specific humidity is held fixed; see e.g., Held and Shell, 2012), yield insight into the framework behind this cancellation. But I think John Christy is trying to dodge any acknowledgment that we know something about climate.

    For the feedback, it’s useful in this context to decompose the tropical water vapor feedback into two components, one of which is the enhanced water vapor that you’d get throughout the column if you warmed the tropical troposphere by a uniform amount. I’ll call that WV(w). The second component would be any “extra water” that you get on top of the uniform-warming assumption through the amplification of surface warming in the upper troposphere. I’ll call that WV(a). So the total water vapor feedback, WV, would be WV = WV(w) + WV(a).

    Let LR be the lapse rate feedback. Ignoring cloud feedbacks, and any other surface albedo changes, the total feedback in the tropics would roughly be feedback = WV + LR = WV(w) + WV(a) + LR.

    In general, we’d expect WV(w) > 0 and WV(a) > 0 (positive feedbacks) and LR < 0 (a negative feedback), with |LR| > WV(a). It is the WV(a) component that is partially cancelled by the lapse rate feedback, and since |LR| > WV(a), any departure from the moist adiabat toward something closer to a uniform warming would result in a more positive net feedback (since the negative lapse rate feedback is larger in magnitude than this second water vapor contribution). Nonetheless, the total water vapor feedback, bringing in WV(w), makes the sum positive regardless.
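    A toy numerical illustration of this bookkeeping, with feedback magnitudes invented purely for the example (they are not estimates from the literature; only the signs and the |LR| > WV(a) ordering matter):

```python
# Hypothetical tropical feedback decomposition (illustrative values, W/m^2/K).
WV_w = 1.2   # water vapor feedback from uniform column warming (positive)
WV_a = 0.4   # extra water vapor from upper-tropospheric amplification (positive)
LR = -0.6    # lapse rate feedback (negative), chosen so |LR| > WV_a

with_hotspot = WV_w + WV_a + LR   # moist-adiabatic (amplified) case
uniform = WV_w                    # no amplification: WV_a and LR both vanish

print(f"net feedback with hotspot:    {with_hotspot:+.2f}")
print(f"net feedback uniform warming: {uniform:+.2f}")
```

    Because |LR| > WV(a), removing the amplification raises the net feedback (here from +1.00 to +1.20), while the sum stays positive in both cases, which is the paragraph's conclusion.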

    There is now clear evidence for a positive water vapor feedback in observations, even on longer timescales (see e.g., Soden et al., 2005, Science; Shi and Bates, 2011, JGR), despite Christy’s assertion otherwise. This is also true in the lower troposphere and on interannual timescales, with many papers on this (e.g., Dai et al., 2011, J. Climate, and see some of Andrew Dessler’s papers). Consistent water vapor responses are also seen in response to Pinatubo.

  • Paul Matthews

    The introductory item glosses over some significant recent papers:
    “More papers then started to acknowledge that the consistency of tropical tropospheric temperature trends with climate model expectations remains contentious.[xiv][xv][xvi][xvii]”

    [xv] Po-Chedley and Fu 2012 says “It is demonstrated that even with historical SSTs as a boundary condition, most atmospheric models exhibit excessive tropical upper tropospheric warming”

    [xvi] Santer et al 2012 says that models “overestimate warming of troposphere” (and it is amusing that Santer et al 2012 does not cite Santer et al 2008).

    [xvii] Thorne et al 2011 says “agreement between models, theory, and observations within the troposphere is uncertain over 1979 to 2003 and nonexistent above 300 hPa.”

    These recent papers would suggest that the question of consistency between observed and modelled tropical temperature trends is in fact not very contentious :)

  • Ross McKitrick

    Here is some information on the question of whether observational trends are significant and whether they match model trends; also what individual data sets say versus what they jointly say. These are relevant results from my papers MMH2010 and MV2013 (references at end). Note the final results for MMH2010 were in the Correction of Oct 2011, and MV2013 is under review.

    MMH2010 looked at the UAH, RSS, RICH and HadAT series, and for each asked (among other things) if the 1979-2009 trends are significantly different from zero and significantly different from the model mean, at both the LT- and MT-equivalent layers. Each paper uses HAC estimation of the variance-covariance matrices to be fully robust to dependence among the series and over time, but in MMH2010 we had some incomplete data segments, so we also used panel estimators allowing for an AR1 correction, since that method can handle unbalanced panels. In MV2013 all the panels are balanced so we only used the HAC estimators.
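    Not the MMH2010 code, but the flavour of the calculation can be sketched with synthetic data: an OLS trend with a Newey-West (HAC) standard error, which is robust to autocorrelation in the residuals. The lag window, AR(1) coefficient and noise scale below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def trend_with_hac_se(y, t, maxlags=24):
    """OLS trend plus a Newey-West (HAC) standard error, robust to
    autocorrelated residuals (Bartlett kernel weights)."""
    X = np.column_stack([np.ones_like(t), t])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    u = y - X @ beta
    Xu = X * u[:, None]
    S = Xu.T @ Xu                       # lag-0 term
    for lag in range(1, maxlags + 1):   # weighted autocovariance terms
        w = 1.0 - lag / (maxlags + 1.0)
        G = Xu[lag:].T @ Xu[:-lag]
        S += w * (G + G.T)
    V = XtX_inv @ S @ XtX_inv           # sandwich covariance
    return beta[1], np.sqrt(V[1, 1])

# Synthetic monthly series, 1979-2009: 0.105 C/decade trend + AR(1) noise
rng = np.random.default_rng(42)
n = 372
t = np.arange(n) / 120.0                # time in decades
noise = np.zeros(n)
for i in range(1, n):
    noise[i] = 0.8 * noise[i - 1] + rng.normal(scale=0.1)
y = 0.105 * t + noise

trend, se = trend_with_hac_se(y, t)
print(f"trend = {trend:.3f} C/decade, HAC s.e. = {se:.3f}")
```

    A significance test then simply compares |trend| against a critical multiple of the HAC standard error.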

    For LT:
    (a) The UAH trend (0.078 C/decade) is significantly different from zero at the 10% level, the RSS trend (0.146 C/decade) is significantly different from zero at the 5% level, the HadAT trend (0.091 C/decade) is significantly different from zero at the 10% level and the RICH trend (0.111 C/decade) is significantly different from zero at the 5% level. The 4 series averaged together have a trend (0.105 C/decade) that is significantly different from zero at 5%. So I conclude the data exhibit a trend at the LT layer.
    (b) UAH and RSS are individually and jointly significantly different from (i.e. below) the models at the 1% and 5% levels respectively. HadAT and RICH are jointly significantly different from (i.e. below) the models at the 1% level. (We didn’t test them individually.) All 4 series averaged together have a trend significantly different from models at 5%. So I conclude the data are significantly below LT model trends.
    (c) UAH and RSS are significantly different from each other at 5%. The MSU series averaged together is not significantly different from the balloon series averaged together (p>0.9). The disagreement within the basic data types is stronger than that across the data types, and is not large compared to the difference between observations and models.

    For MT:
    (a) The UAH trend (0.040 C/decade) is insignificant, the RSS trend (0.111 C/decade) is significantly different from zero at 5%, the HadAT trend (0.018 C/decade) is insignificant and the RICH trend (0.025 C/decade) is insignificant. The 4 series averaged together have a trend (0.025 C/decade) that is statistically insignificant (p=0.53). So only RSS exhibits a trend at the MT layer.
    (b) UAH and RSS are individually and jointly significantly different from (i.e. below) the models at the 1% and 5% levels respectively. HadAT and RICH are jointly significantly different from (i.e. below) the models at the 1% level. (We didn’t test them individually.) All 4 series averaged together have a trend significantly different from models at 5%. So I conclude the data are significantly below MT model trends.
    (c) UAH and RSS are significantly different from each other at 5%. The MSU series averaged together is not significantly different from the balloon series averaged together (p>0.4). The disagreement within the basic data types is stronger than that across the data types, and is not large compared to the difference between observations and models.

    MV2013 looks at the 1958-2010 interval using 3 balloon series: HadAT, RICH and RAOBCORE, in each case the latest version at the time of the analysis. We derived a HAC estimator robust not only to dependence over time and among series, but also to a step-change at either a known or an unknown point. The data identify a significant step-change at December 1977 when comparing observations to models. Allowing for this step change we get the following results:

    For LT:
    (a) HadAT trend is 0.064 C/decade (0.135 without the step change). RICH trend is 0.093 C/decade (0.134 without the step change). RAOBCORE trend is 0.065 C/decade (0.147 without the step change). The trends are all insignificant with the step change (p=.35, .18, .24 resp) and significant without the step change (p<.001 each). So I conclude that evidence for a significant 1958-2010 LT trend in the balloons is not robust to inclusion of a step-change around 1978.
    (b) The average balloon trend is significantly different from the average model (i.e. below) at < 0.1% significance without the step-change, and with a step-change at a known date the difference is significant at <0.02%. In tests of individual models, 12 of 23 individually over-predict the trend by a significant margin. If the break date is assumed unknown the difference between balloons and the average model is significant at <0.1%. So I conclude that over the 1958-2010 period the models have a collective tendency to over-predict the LT trend in the balloon data.

    For MT:
    (a) HadAT trend is -0.001 C/decade (0.089 without the step change). RICH trend is 0.048 C/decade (0.096 without the step change). RAOBCORE trend is 0.042 C/decade (0.132 without the step change). The trends are all insignificant with the step change (p=.99, .50, .39 resp) and significant without the step change (p<.001 each). So I conclude that evidence for a significant 1958-2010 MT trend in the balloons is not robust to inclusion of a step-change around 1978.
    (b) The average balloon trend is significantly different from the average model (i.e. below) at < 0.1% significance without the step-change, and with a step-change at a known date the difference is significant at <0.01%. In tests of individual models, 17 of 23 individually over-predict the trend by a significant margin. If the break date is assumed unknown the difference between balloons and the average model is significant at <0.1%. So I conclude that over the 1958-2010 period the models have a collective tendency to over-predict the MT trend in the balloon data.
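    The level-shift idea can be sketched in simplified form (synthetic data and ordinary least squares, not the HAC-robust estimator of MV2013): adding a step dummy at the candidate break date shows how much of an apparent trend a 1977-style shift can absorb.

```python
import numpy as np

# Synthetic monthly series 1958-2010: small trend + step up near Dec 1977 + noise
rng = np.random.default_rng(1)
n = 636                                      # months, 1958-2010
t = np.arange(n) / 120.0                     # time in decades
step = (np.arange(n) >= 240).astype(float)   # break at month 240 (~Dec 1977)
y = 0.05 * t + 0.2 * step + rng.normal(scale=0.15, size=n)

# Fit trend-only vs trend + step dummy by least squares
X1 = np.column_stack([np.ones(n), t])
X2 = np.column_stack([np.ones(n), t, step])
b1, *_ = np.linalg.lstsq(X1, y, rcond=None)
b2, *_ = np.linalg.lstsq(X2, y, rcond=None)

print(f"trend ignoring the step: {b1[1]:.3f} C/decade")
print(f"trend allowing the step: {b2[1]:.3f} C/decade")
```

    The trend-only fit absorbs the step into a larger slope; allowing the shift recovers a much smaller trend, which is the pattern described for the balloon series above.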

    Regarding the comment by Steven Sherwood, the average model sensitivity calculation only tells us something about the central tendency of models. Whether models approximate the real world is precisely the point at issue.

    Carl points out that, depending on the surface-troposphere pair one uses, one can observe amplification with altitude, so the result concerning the absence of a hotspot is not significant in a gross sense. My understanding is that this depends to some extent on the use of an earlier version of ERSST which was believed to have a cold bias and later replaced. But in any case, each tropospheric series implies a likely surface trend based on the reciprocal of the amplification factor. The distribution of implied surface trends would be interesting to compare to the distribution of observed trends, as another check on surface data quality.
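    The implied-surface-trend check suggested here is a simple division; for example, using the 1979-2009 LT trends quoted above and a nominal amplification factor of 1.4 (an assumed round number, not a fitted value):

```python
# Observed tropical LT trends (C/decade) quoted in the comment above,
# and a nominal model amplification factor of 1.4 (an assumption)
lt_trends = {"UAH": 0.078, "RSS": 0.146, "HadAT": 0.091, "RICH": 0.111}
amplification = 1.4

# Implied surface trend = tropospheric trend / amplification factor
implied_surface = {k: v / amplification for k, v in lt_trends.items()}
for name, trend in implied_surface.items():
    print(f"{name}: implied surface trend = {trend:.3f} C/decade")
```

    Comparing this implied distribution with observed surface trends would be the surface-data-quality check proposed in the comment.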

    References:
    McKitrick, Ross R., Stephen McIntyre and Chad Herman (2010) “Panel and Multivariate Methods for Trend Comparisons in Climate Data Series” Atmospheric Science Letters Volume 11, Issue 4, pages 270–277, October/December 2010 DOI: 10.1002/asl.290. Correction: October 2011.

    McKitrick, Ross R. and Timothy Vogelsang (2013) HAC-Robust Trend Comparisons Among Climate Series with Possible Level Shifts. In review.

  • Paul S

    From my perspective there has been much discussion of observations but little regarding quantification of exactly what it is we should expect to see in these observations. I thought it would be a good idea to offer a review of the main methods, and the expectations extracted from them, pertaining to this topic:

    TLT/Tsurf amplification factor
    The ratio between the trend in a particular vertical weighting of model air temperature and the trend in model land+ocean surface/near-surface temperature. Christy et al. 2010 found an expected TLT/Tsurf amplification factor of ~1.4 using CMIP3 models.

    Using CMIP5 1981-2005 TLT trend figures reported by Po-Chedley and Fu 2012 comparing to CMIP5 land+ocean SAT (2m surface-air temperature) data from Climate Explorer I found the average ratio to be 1.33 with 95% spread 1.05-1.75. However, there would be a small bias here if we tried to relate this ratio to observations because Tsurf records tend to use ocean SST data rather than SAT. For the models which also had consistent SST data stored at Climate Explorer I produced combined LandSAT+OceanSST time series using an area ratio of 0.232:0.768. From these I found an average ratio of 1.4, 95% spread 1.15-1.5. Part of the reduced spread seems to be due to the use of SSTs rather than SATs, and part due to dropping models without available consistent data.

    Equally we can check TTT/Tsurf amplification factors, again using TTT values from P-CF2012. Against CMIP5 landSAT+oceanSST I find an average 1.65 and range 1.35-1.8.
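    The blending step described above can be sketched as follows. The toy series below stand in for the Climate Explorer model fields; the trends and noise level are invented, and only the 0.232:0.768 land:ocean area weighting is taken from the text:

```python
import numpy as np

def trend(y, t):
    """Least-squares trend of y against t (per unit of t)."""
    return np.polyfit(t, y, 1)[0]

# Toy stand-ins for model output, 1981-2005 annual means
rng = np.random.default_rng(0)
t = np.arange(25) / 10.0   # decades since 1981
land_sat = 0.25 * t + rng.normal(scale=0.02, size=t.size)   # land 2m air temp
ocean_sst = 0.15 * t + rng.normal(scale=0.02, size=t.size)  # ocean SST
tlt = 0.24 * t + rng.normal(scale=0.02, size=t.size)        # lower troposphere

# Blend with the tropical land:ocean area ratio used in the comment
tsurf = 0.232 * land_sat + 0.768 * ocean_sst

ratio = trend(tlt, t) / trend(tsurf, t)
print(f"TLT/Tsurf amplification factor = {ratio:.2f}")
```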

    One caveat I would offer here is that very few CMIP5 models produce surface trends close to observed values. It’s possible that the range may be overconfident or even biased, since the ratios come from model runs with mostly larger surface trends.

    TTT/TLT ratio
    This was the actual focus of Po-Chedley and Fu 2012. While the TTT and TLT weightings substantially overlap TTT has greater weighting in the upper troposphere and TLT in the lower troposphere. Therefore the ratio between the two can be considered to reflect the presence and strength of any “hotspot” around the upper troposphere in relation to lower tropospheric warming.

    In all CMIP5 models there was a ratio greater than 1 for 1981-2005 trends, meaning that TTT warmed more than TLT in every model, and the average was 1.19.

    AMIP prescribed SST simulations
    Unlike the fully coupled atmosphere-ocean (and land) CMIP5 model runs, AMIP simulations use atmosphere-only models fed with SSTs from observations. Automatically this means surface trends are a decent match for those which actually occurred, both in terms of magnitude and spatial distribution, so at first glance we might expect that the model output can be compared directly to observed trends as a test of how the model translates surface temperature changes up through the troposphere. However, I found a few details which make me question the value of AMIP simulations in this regard.

    Despite using substantially the same basic models as in CMIP5 historical runs, AMIP simulations produce a substantially shifted TLT/Tsurf range, with the average TLT/land+oceanSAT ratio moving from 1.33 to 1.55. Now, it might be the case that the specific pattern of observed SST trends tends to cause enhanced tropospheric amplification, at least in the models – that is, after all, why you might want to run these types of simulations rather than trusting a universal scaling factor – and it is still within the 1.05-1.75 spread. However, there also appears to be a considerable discrepancy between CMIP5 and AMIP simulations regarding ocean SAT/SST trend ratios – averaging about 1.07 in CMIP5 and 1.15 in AMIP – meaning that TLT/LandSAT+oceanSST ratios average ~1.8 compared to ~1.4 in coupled CMIP5 runs. This suggests to me that the enhanced amplification is really caused by a bug/feature of the AMIP experimental setup (i.e. non-interactive SSTs) which is causing unrealistically rapid trend gradients, particularly in the boundary layer.

    This apparent issue makes me doubt the usefulness of AMIP model simulations for the topic in question. I don’t think the outputs can really be considered to be expected values. At the least it needs to be recognised that they are likely biased high for TLT, TMT and TTT trends.

    Discrete vertical interval trend comparisons
    Basic model air temperature outputs are provided in discrete vertical levels, defined in terms of atmospheric pressure, from the surface to upper stratosphere. We can simply find model trends at each level as John Christy does in his figures 2 and 4. Since satellite MSU/AMSU data doesn’t have the fidelity to properly compare discrete levels in this way, the only observations available for this task are from radiosondes.

    At first glance this seems like the most obvious and best way to test model expectations – it avoids the messy overlapping of vertical domains in the satellite data and uncertainties in vertical weighting functions used to produce the model TLT/TMT/TTT equivalents. However, the problem is in assessing whether the radiosonde observations are up to the task. It’s beyond the scope of this little review of expectations to discuss deficiencies in radiosonde observations (not to mention putting me way out of my depth) but it is relevant to suggest how model data should be processed in order to provide like-for-like expectation comparisons with the radiosonde observations. At the least the model data should be masked to cover only the sites from which radiosondes were launched, rather than sampling the whole tropics as I assume was done to produce John Christy’s graph.

    Other comparisons
    John Christy has offered another comparison, in his original post and latest comment: that of absolute CMIP5 (TMT in this case) trends to relevant satellite and radiosonde observations. It’s difficult to see how this comparison is useful for the discussion at hand given that modelled mid/upper tropospheric trends are so closely linked to surface trends and CMIP5 tropical surface trends are almost all larger than those observed. So, even if mid/upper tropospheric warming was working perfectly consistently with model expectations (relating to the scaling discussed earlier), this comparison might well incorrectly indicate otherwise.

    John Christy might argue that the surface trend is affected by what happens aloft, so it is an appropriate test. While it is very likely the case that surface trends will be affected by the specifics of lapse rate changes and water vapor feedbacks, offering this comparison as evidence of something would require an assumption that those combined factors are the dominant reason for lower observed surface trends. Since that is both a major assumption and an as-yet unfounded one, I would suggest this type of comparison contributes more confusion than insight. I think it’s an important topic for discussion why modelled tropical surface trends are generally so much larger than observed, but it’s not this topic.

  • Paul S

    Regarding John Christy’s statement in his last comment:

    “The 1000 hPa temperature is taken from the NCDC surface temperature dataset. We did not have values at 950 hPa.”

    This is problematic because the NCDC product has very different spatial sampling from radiosonde datasets and the NCDC land surface component mostly relates to the 925hPa level rather than 1000hPa.

    It would make more sense to use the radiosonde 850hPa level as the base reference for comparison with model trend ratios.

  • Paul S

    Marcel,

    The trend discrepancies are mostly due to referencing different periods – 1979-2010 for Mears, 1979-2012 for Christy.

    uah5.6 1979-2010 = 0.048
    uah5.6 1979-2012 = 0.03

    rss3.3 1979-2010 = 0.114
    rss3.3 1979-2012 = 0.089

    To support the idea that RSS and UAH Tropical TMT trends are significantly different from each other the RSS site displays a validation comparison. If you select TMT and Summary buttons you can see that sampled UAH trends are outside the uncertainty range produced by RSS for the same sampling.

    For outsiders these differences are intriguing as on a global scale the trends from UAH and RSS are very close to one another.

    That’s pretty much true for TLT but there is a reasonable discrepancy for global TMT: 0.078 against 0.046 ºC/decade. That is intriguing, though, given that TLT is produced from the same base data as TMT. It suggests the similarity in global TLT is largely due to compensating errors rather than agreement.

  • GavinCawley

    Prof. Christy replies to my comment “The results of Douglass et al. 2007, which are actually less remarkable than shown in my initial post here, still stand. The confusion created by reading later papers and blogs is the misunderstanding of the question that was addressed.”

    Actually I spotted the error from reading the original paper, the problem is not to do with the question posed in the paper, but the statistical validity of the analysis used to provide an answer.

    “We asked the question this way, ”If climate models had the same surface temperature trend as the real world, would their upper air temperature trends agree with the real world?” ”

    Which is a reasonable question, but a reliable answer requires a valid statistical test. It is disappointing that Prof. Christy responded to my comment without actually addressing the “parallel Earths” thought experiment that explains why the test used in Douglass et al. is clearly inappropriate, as a perfect climate model ensemble would be virtually guaranteed to fail it!

    Prof. Christy continues: “We discussed problems with the criticisms of our 2007 paper in Douglass and Christy 2013, such as the improper use of datasets known to be obsolete and the comparison of upper air trends even though surface trends between models and observations were not consistent (i.e. apples to oranges).”

    The discussion in the 2013 paper does not include a discussion of the validity of the statistical test used, so it fails to address the criticism raised in my comment.

  • Arthur Smith

    The focus in the dialogue here on instrumental series problems, and particularly issues with the satellite measurements, I think is fascinating. I congratulate the organizers – at least in this instance we seem to have hit on a substantive issue with serious discussion.

    I still find John Christy’s argument (and Ross McKitrick’s supporting comments) about averages to be rather disconcerting. In Christy’s latest he writes “Neither will be perfect, hence an average of the two products I think is the best way to deal with the differences, since the average is within error ranges.” However, we are discussing errors of unknown origin. The difference between the two shows there are real observational errors here. It’s possible the two datasets have errors in opposite directions, and the average cancels those out. But it’s equally possible they have errors in the same direction, with one larger than the other – in that case the average is not getting us closer to the truth, though, keeping the large error bars in mind (is Christy quoting only one-sigma standard errors?), it may even so be the best we can do. The fact that a third series, STAR, has numbers outside the range between the other two certainly suggests a strong possibility that the truth lies not between but outside the range of the other two.

    In any case, the impression Christy gives here is one of trying to minimize uncertainties. I wonder what Judith Curry would have to say about it? In actual fact the measurements here are still clearly very uncertain, and are simply not of sufficient quality to say anything about differences between tropical tropospheric amplification in observation and theory – the theory is well within the 90% confidence interval regarding the ratio of tropospheric to surface warming, and as I noted in my earlier comment, if you look at shorter time intervals the observations actually show clear and consistent amplification. The certainty Christy conveys here is meant to imply an error in theoretical models of climate, and therefore that all conclusions about climate change are uncertain. To the contrary, there is no strong evidence here at all of any problem with models – but even if there were, uncertainty about climate change is hardly a reason to be certain that nothing bad will happen, and therefore that we can ignore it!

  • phi

    There are inconsistencies, but also things that agree, sometimes surprisingly well. As has been said, a ratio based on the values at 850 hPa avoids the serious inhomogeneity between the radiosondes and the SST/station data. The following graph (http://img708.imageshack.us/img708/6844/s8ht.png) is an overlay on John Christy’s figure 2 (see the orange curve).

    Also shown is an alternative with a correction of the surface temperature, which can be justified as follows:

    1. It is an extrapolation of tropospheric data to the surface, founded on known atmospheric physics and the results of models.

    2. This is roughly the ocean component of the trend, see tropical SST by NCDC (http://www.climate4you.com/images/NOAA%20SST-Tropics%20GlobalMonthlyTempSince1979%20With37monthRunningAverage.gif).

    3. The high value of the surface trends is due to station data, which are notoriously unreliable (see this pretty amazing example: http://imageshack.us/a/img21/1076/polar2.png).

    The tropical hot spot is therefore there, and even more pronounced in relative terms than expected. This should not be too surprising, since the addition of CO2 is not reducible to a simple forcing but acts (before the temperature feedback) especially on the gradient.

  • GavinCawley

    Prof Christy: You have missed the point. Of course having an infinite sample means that we know the MEAN exactly; however, we know a priori that the observed trend is not going to be identical to the ensemble MEAN, even if the model is perfect.

    The ensemble MEAN is an estimate of the forced response of the climate system (i.e. the response of the climate due to changes in the forcings), as the unforced response (i.e. “internal variability” or “weather noise” etc.) is not coherent across model runs and hence averages out. The observed trend, however, is the result of both the forced response and a single realisation of the unforced response. So we shouldn’t expect them to be identical, even if the model is perfect; the expected difference between the ensemble mean and the observations depends on the plausible variability of the unforced response (which we cannot estimate from a single realisation of the observed climate, but which we could estimate from the spread of the runs of a perfect GCM).

    When we compare the observed trend with the GCMs we are comparing ONE realisation of a chaotic process with the MEAN of a set of simulations of that chaotic process. Even if the model producing the simulations is absoultely perfect, there is no reason to expect the realisation we actually observe to be any closer to the MEAN than any of the individual simulations. Hence the correct test for consistency is to determine whether the observed trend could plausibly be a sample from the population of simulated trends (i.e. the standard deviation test).

    As I have pointed out, an infinite ensemble of runs from a GCM with perfect physics and infinite temporal and spatial resolution is almost guaranteed to fail the test used in Douglass et al. (2007). As someone who works in (a branch of) statistics, that seems to me to be absurd: if the test is reasonable, a perfect model should pass with high probability. Your response does not address that point.
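    The point is easy to demonstrate numerically. In the sketch below (all numbers invented), every ensemble member and the “observation” are drawn from the same distribution – a perfect model by construction – yet a Douglass-style test against the standard error of the ensemble mean rejects most of the time, while a test against the ensemble spread rejects at roughly the nominal rate:

```python
import numpy as np

rng = np.random.default_rng(0)
forced, sigma = 0.2, 0.1      # forced trend and internal-variability spread
n_runs, n_trials = 100, 2000

rejections_se = 0
rejections_sd = 0
for _ in range(n_trials):
    # Perfect model: ensemble members and "observation" share one distribution
    ensemble = forced + sigma * rng.standard_normal(n_runs)
    obs = forced + sigma * rng.standard_normal()
    mean, sd = ensemble.mean(), ensemble.std(ddof=1)
    se = sd / np.sqrt(n_runs)
    if abs(obs - mean) > 2 * se:   # standard-error test (Douglass-style)
        rejections_se += 1
    if abs(obs - mean) > 2 * sd:   # spread (standard-deviation) test
        rejections_sd += 1

# The s.e. test fails the perfect model most of the time; the s.d. test
# rejects at close to the nominal ~5% rate.
print(f"rejection rate, s.e. test: {rejections_se / n_trials:.2f}")
print(f"rejection rate, s.d. test: {rejections_sd / n_trials:.2f}")
```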

    The “standard error” test is the statistics cookbook solution to the problem of comparing means, but that assumes that the corresponding population means should be expected to be identical. In this case they should not.


  • Vaughan Pratt

    Just a thought: why can’t the missing tropical hotspot be a side effect of a cell caused by the Hadley cell, call it the Stratocell, similarly to how the Ferrel cell is caused? (I’ll describe the NH, the SH is its mirror image.)

    Whereas the Ferrel cell sits poleward of the Hadley cell, as a much smaller doughnut sitting on the ground at 30N-60N side by side with the larger doughnut at 0N-30N, the Stratocell is also at 0N-30N but as a very slightly bigger doughnut in the stratosphere encircling the Hadley cell (i.e. above it from the point of view of an observer on the ground at 15N looking up vertically).

    Just as the Hadley cell drives the Ferrel cell like one gearwheel driving another touching it, so does it also drive the Stratocell. The Stratocell’s bottom just above the tropopause is driven poleward accompanying (and driven by) the poleward flow of the Hadley cell’s top.

    As the top of the Hadley cell approaches 30N it finds territory getting scarce (decreasing perimeter of the increasing latitudes), so to keep Navier and Stokes happy it dives down and flows back to the equator.

    The bottom of the Stratocell encounters the same problem but it can’t solve it by diving down the way the Hadley cell does because the Hadley cell is selfish: it needs every bit of room it can get at 30N, in fact the pressure there should be getting larger on that account. So instead the Stratocell solves its space crisis by shooting up where there is no opposition, then over the top and back to the equator.

    So now we have one Hadley cell driving two neighbors like touching gearwheels, one beside it, the Ferrel cell, and one sitting on top, the Stratocell. (Actually 6 touching gearwheels altogether when you include the SH, or 8 when the polar cells are counted.)

    For the duration of the Stratocell’s ride where it is in contact with the Hadley cell, it continually picks up heat from the top of the Hadley cell. At 30N this heated stratospheric air then rises, bringing the heat with it, though losing temperature due to the lapse rate. On the way back, with no further heat input, it loses heat. By the time it dives down to the equator it has become a refreshing breeze, cooling what theory would otherwise have predicted to be the tropical hot spot.

    Since this mechanism seems pretty obvious I assume it was considered and discarded decades ago on account of some fatal flaw, such as evidence against any significant poleward flow up there. Nevertheless I’d be interested in seeing the literature where this mechanism is discussed. And if there isn’t any it would be interesting to know what’s wrong with this theory.

Off-topic comments

  • visionar@comcast.net

    Our Sun went into an inactive cycle in 1998 and each sun cycle is diminishing toward a Dalton Minimum. Due to a very active Sun in the latter half of the 20th century our oceans have stored this thermal rise. The PDO shifted to cold several years ago and the AMO should shift to cold soon. When sun cycle 25 arrives expect a major cool-down, ocean levels falling, crop failures and all the bad outcomes of cold weather. With the weaker Sun and its solar wind, expect more cosmic-ray interaction forming clouds with additional cooling. Hopefully we won’t have a serious volcanic eruption in these coming days.




