Studies on the reconstruction of global mean temperature series are reviewed by introducing three series, HadCRUT3, NCDC, and GISS, in detail. Satellite data have been used since 1982 in the NCDC and GISS series. The NCDC series has the most complete spatial coverage among the three because it uses a statistical interpolation technique. The weakened global warming in 2000–2009 revealed in the HadCRUT3 data is possibly caused by the lack of data coverage over the Arctic in this dataset. The GISS and NCDC series showed much stronger warming trends during the last 10 years (~0.1 °C per 10 years). The three series yielded almost the same warming trend for 1910–2009 (0.70–0.75 °C per 100 years).
Keywords: global mean temperature; HadCRUT3; NCDC; GISS
The reconstruction of high-quality global mean surface air temperature data is fundamental to climate change research. Such data allow a more accurate assessment of global warming. They can also be used to determine the magnitude of global warming and the characteristics of its spatial patterns and temporal variations, which form the basis for research on the causes of climate change. Therefore, it is of great importance to compare the various global mean temperature series and to analyze the data and methodologies in use.
Research on global mean temperature began in the late 19th century. Since then, more and more data sources and statistical methods have become available and have been incorporated into the reconstruction and improvement of global mean temperature series [Wigley et al., 1985; Ellsaesser et al., 1986; Hansen and Lebedeff, 1987]. The short history of global mean temperature series research can be roughly divided into three periods: prior to the 1970s, the 1980s–1990s, and the 2000s. The first two periods can be called the “classical” period, whereas the third is the “modern” period.
A great deal of pioneering and innovative work was done in the first period. To determine whether the global climate had a warming or cooling trend, scientists started to study the global mean temperature in the late 19th century. By the mid-20th century, more than 30 scientists had constructed their own global mean temperature series, among which the most representative was that of Mitchell. Since the total number of regular meteorological stations was only about 100–200 over land and islands, Mitchell had to divide the globe into six 30° latitudinal belts, calculate the zonal mean temperature for each belt, and then combine them to estimate a global mean temperature. He compared 5-year averaged temperatures with the 1880–1884 mean and was the first to indicate that the 1940s was the beginning of a warming period. Mitchell’s work is a good example showing that the study of global warming, from its very beginning, has depended heavily on meteorological observations.
The number of in-situ stations increased significantly from the 1980s to the 1990s. As a result, in the second key period, Jones et al. [1986a; 1986b], Hansen and Lebedeff [1987], and Vinnikov et al. [1990] independently developed three global temperature datasets. The numbers of available stations used in their studies are shown in Table 1.
Table 1. Numbers of stations used in the three studies

|Author|Northern Hemisphere|Southern Hemisphere|Global|
|---|---|---|---|
|Jones et al. [1986a; 1986b]|2,666|610|3,276|
|Hansen and Lebedeff [1987]|1,902|738|2,640|
|Vinnikov et al. [1990]|301|265|566|
In the 1980s, the major sources of monthly mean temperature data were World Weather Records and Monthly Climatic Data for the World. The major difference among the three reconstructions was in the methods of processing the observational data. For example, Jones et al. [1986a; 1986b] used a gridding method. They first interpolated the discrete station data onto a 5° latitude by 10° longitude grid, with the exception of the area south of 60°S where there were no observational data, and then calculated the hemispheric and global mean temperatures from area-weighted grid point estimates. Hansen and Lebedeff [1987], on the other hand, divided the globe into 80 equal-area regions. They first established the temperature time series for each region, and then combined the 80 regional time series to form the global mean temperature series. Vinnikov et al. [1990] used data from fewer stations. They first calculated zonal means for each 30° latitudinal belt, and then established hemispheric and global mean temperatures from the zonal means (no data south of 60°S).
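The area-weighted averaging step of a gridding method can be sketched as follows. This is a minimal illustration, not the procedure of any of the cited studies: the function name `area_weighted_mean`, the cosine-of-latitude weighting on a regular grid, and the synthetic anomaly field are all assumptions made for the example.

```python
import numpy as np

def area_weighted_mean(grid, lat_centers):
    """Area-weighted mean of a (n_lat, n_lon) anomaly grid; NaN marks empty cells."""
    # On a regular latitude-longitude grid, cell area is proportional to cos(latitude).
    weights = np.cos(np.deg2rad(lat_centers))[:, None] * np.ones_like(grid)
    valid = ~np.isnan(grid)
    return float(np.sum(grid[valid] * weights[valid]) / np.sum(weights[valid]))

# Hypothetical 5° latitude by 10° longitude grid (36 rows, 36 columns).
lat_centers = np.arange(-87.5, 90.0, 5.0)
# Synthetic anomalies (°C) that grow with distance from the equator.
grid = np.abs(lat_centers)[:, None] / 90.0 * np.ones((1, 36))
# No observational data south of 60°S, so those cells are left empty.
grid[lat_centers < -60.0, :] = np.nan

weighted = area_weighted_mean(grid, lat_centers)
unweighted = float(np.nanmean(grid))
```

Because high-latitude cells cover less area, the weighted mean of this synthetic field is lower than the simple average of the grid values, which is the reason for weighting by area in the first place.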
Although the numbers of stations used to obtain the three series were vastly different and the statistical methods employed were not the same, the estimated series of global mean temperature were highly correlated with one another, with correlation coefficients between 0.94 and 0.95 [Jones et al., 1986b; Vinnikov et al., 1990]. In general, the studies in this period constructed mean temperature series representative of land areas but did not include observational data over the oceans.
With observational data only from land areas, global representativeness was clearly lacking. The key feature of the studies in the third period is the incorporation of oceanic observations for better representativeness. At first, scientists attempted to use on-deck observations of air temperature. However, these observations were found to be strongly affected by the reflection of solar radiation from the deck of the ship. The approach was then improved by using only night-time observations. Finally, Parker et al. and Rayner et al. showed that sea surface temperature (SST) should be used to give a better estimate of surface temperature over the oceans. Therefore, the present global temperature datasets all combine land surface temperature and SST observations to obtain the global mean temperature, or earth surface temperature. In the following, three widely used global mean temperature series are introduced and reviewed.
The Climatic Research Unit (CRU) at the University of East Anglia compiled a dataset of global land surface temperature [Jones and Moberg, 2001] using 5,159 meteorological stations, of which 4,167 were used to calculate the 1961–1990 climatology. Observations with anomalies above 5σ (standard deviations) were removed. The 5°×5° grid average temperatures, weighted by grid area, were used to obtain Southern Hemisphere and Northern Hemisphere series, which were then combined into a global mean temperature series. In addition, this land surface series was merged with the SST data from the Hadley Centre to form the global surface temperature series named HadCRUT2.
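The anomaly and screening steps can be illustrated with a short sketch. It is a simplification under stated assumptions: a single annual series for one hypothetical station (operational processing works on monthly values), a standard deviation estimated from the 1961–1990 base period, and a function name, `screened_anomalies`, invented for this example.

```python
import numpy as np

def screened_anomalies(years, temps, base=(1961, 1990), n_sigma=5.0):
    """Anomalies relative to the base-period mean, with outliers set to NaN."""
    years = np.asarray(years)
    temps = np.asarray(temps, dtype=float)
    in_base = (years >= base[0]) & (years <= base[1])
    climatology = np.nanmean(temps[in_base])     # base-period mean
    sigma = np.nanstd(temps[in_base], ddof=1)    # base-period standard deviation
    anomalies = temps - climatology
    # Discard values whose anomaly exceeds n_sigma standard deviations.
    anomalies[np.abs(anomalies) > n_sigma * sigma] = np.nan
    return anomalies

# Synthetic station record (°C) with one spurious value in 1995.
years = np.arange(1950, 2001)
temps = 10.0 + 0.3 * np.sin(years)
temps[years == 1995] = 18.0

anoms = screened_anomalies(years, temps)
```

Only the spurious 1995 value is removed here; the remaining anomalies average to zero over 1961–1990 by construction.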
HadCRUT2 was subsequently revised by quantifying its uncertainties and error ranges, and was combined with the improved Hadley Centre SST data to become HadCRUT3 [Brohan et al., 2006], which was cited in the Fourth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC AR4). The IPCC version of this dataset uses more than 4,500 land stations. Its spatial resolution is flexible rather than fixed at 5°×5°, which makes comparison with model results easier.
The uncertainties of land surface temperature variations in HadCRUT3 come from three sources: 1) local station errors, including instrumental error, normalization bias, and computational bias; 2) spatial coverage errors; and 3) representativeness bias, including the heat island effect. The spatial coverage error is particularly worth mentioning. To assess its effect, the NCEP/NCAR reanalysis dataset, which spans 50 years and has complete spatial coverage, can be used to identify the difference between the mean temperature over the entire area and that over only the grid cells with land stations. With regard to the influence of urbanization in developing countries, emphasis has been placed mainly on the contrast between urban and rural areas. However, whether an area is classified as urban or rural affects the outcome of the contrast. In building HadCRUT3, some urban stations were explicitly excluded and a few others were corrected through quality control and normalization. Within the overall uncertainty estimate, the urban heat island contributes a standard error of 0.055 °C per 100 years since 1900. In all, the three kinds of uncertainty amount to a long-term bias of ±0.1 °C, or a 0.2 °C spread, at the 95% confidence level. This 0.2 °C spread has decreased to 0.1–0.15 °C since the mid-20th century owing to the incorporation of more and more observations, especially SST data [Rayner et al., 2006].
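The idea of estimating a coverage error from a spatially complete field can be sketched as below. The field is synthetic rather than actual reanalysis data, and the helper `global_mean`, the cosine-of-latitude weights, and the mask that omits cells north of 70°N are illustrative assumptions.

```python
import numpy as np

def global_mean(field, lat_centers, mask=None):
    """Area-weighted mean of a complete field, optionally limited to observed cells."""
    weights = np.cos(np.deg2rad(lat_centers))[:, None] * np.ones_like(field)
    if mask is not None:
        weights = np.where(mask, weights, 0.0)   # unobserved cells get zero weight
    return float(np.sum(field * weights) / np.sum(weights))

lat = np.arange(-87.5, 90.0, 5.0)    # 5° cell centres
lon = np.arange(2.5, 360.0, 5.0)

# Synthetic complete anomaly field (°C) with warming amplified toward the North Pole.
field = 0.3 + 0.7 * np.clip(lat, 0.0, None)[:, None] / 90.0 * np.ones((1, lon.size))

# Hypothetical observing network with no data north of 70°N.
observed = np.broadcast_to((lat < 70.0)[:, None], field.shape)

full_mean = global_mean(field, lat)
sampled_mean = global_mean(field, lat, observed)
coverage_error = sampled_mean - full_mean
```

In this synthetic case the sampled mean falls below the full mean because the omitted cells are the warmest ones, so the coverage error is negative.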
The National Climatic Data Center (NCDC) established a global temperature dataset with complete global coverage. Its global mean temperature is the area-weighted average of surface air temperature on a 5°×5° grid [Smith and Reynolds, 2005]. The processing of the SST data not only accounts for changes in observational technologies but also assimilates satellite data [Smith and Reynolds, 2003; Smith and Reynolds, 2004]. For the SST data prior to 1941, a negative bias of 0.3 °C was consistently removed, because at that time sea water temperature was measured by hauling a bucket of water onto the deck of the ship, and evaporation during the process lowered the temperature of the water. Since World War II, sea water temperature has instead been measured by pumping water onto the deck. For the SST data after 1982, satellite observations were incorporated to improve the quality of the data [van den Dool et al., 2000].
The land surface temperature data were derived from the Global Historical Climatology Network (GHCN) [Peterson and Vose, 1997], with more than 4,400 stations and a base period of 1961–1990.
Hansen and his collaborators at the Goddard Institute for Space Studies (GISS) extended the GISS global temperature series to 2009 and gave a comprehensive overview of its construction and extension [Hansen et al., 2010]. They also compared the GISS series with the HadCRUT3 and NCDC series. The land surface temperature in the updated version of the GISS data was also derived from GHCN [Peterson and Vose, 1997], with more than 6,300 stations. Records with anomalies greater than 2.5σ were removed and replaced by values interpolated from station records within a radius of 1,200 km, weighted by the inverse of the distance between the two points. The global gridded data were used to calculate the zonal mean for four latitudinal belts (90°–23.6°S; 23.6°S–0°; 0°–23.6°N; 23.6°–90°N), and the global mean was then estimated by averaging the four zonal means with weights of 0.3, 0.2, 0.2, and 0.3, respectively. In the latest version of the GISS data, two improvements were made: 1) the total area of each zonal belt, rather than the area of the grid cells with observations, is used as the weight when calculating the hemispheric or global mean from the zonal means; 2) the urban heat island effect is corrected by distinguishing urban from rural areas, using satellite-observed night-time lights for the United States and population density data for other regions. Hansen et al. [2010] noted that the warming over the United States from 1900 to 2009 is 0.70 °C without the heat island correction and 0.64 °C with it. For the global mean temperature, the effect of the urban heat island is about 0.01 °C, which is a minor bias.
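The belt weights of 0.3, 0.2, 0.2, and 0.3 correspond to the fractions of the Earth's surface covered by each belt, since the fraction of a sphere between two latitudes is half the difference of their sines. The short sketch below checks this and combines four zonal means; the function names and the zonal mean values are hypothetical and serve only as an illustration.

```python
import math

def belt_area_fraction(lat_south, lat_north):
    """Fraction of a sphere's surface between two latitudes given in degrees."""
    return (math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))) / 2.0

# The four latitudinal belts, from south to north.
belts = [(-90.0, -23.6), (-23.6, 0.0), (0.0, 23.6), (23.6, 90.0)]
weights = [belt_area_fraction(south, north) for south, north in belts]

def global_from_zonal(zonal_means, belt_weights):
    """Weighted average of zonal means, normalized by the sum of the weights."""
    total = sum(w * z for w, z in zip(belt_weights, zonal_means))
    return total / sum(belt_weights)

# Hypothetical zonal mean anomalies (°C), ordered from south to north.
zonal = [0.2, 0.4, 0.5, 0.9]
global_mean = global_from_zonal(zonal, [0.3, 0.2, 0.2, 0.3])
```

Rounded to one decimal place, the computed area fractions reproduce the weights quoted above, because sin(23.6°) is very close to 0.4.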
GISS also uses the Hadley Centre’s SST data [Rayner et al., 2003]. In particular, after 1982 the SST data are improved by incorporating satellite observations corrected with ship and buoy data [Reynolds et al., 2002]; the combination is commonly referred to as HadISST+OISST.
The time series of global mean temperature from HadCRUT3, NCDC, and GISS are shown in Figure 1 [Hansen et al., 2010]. We further calculated the 10-year averaged temperature anomalies relative to the 1961–1990 mean state, and the 100-year warming trend from 1910 to 2009, as summarized in Table 2. Figure 1 and Table 2 show that all three time series are very close. The differences among the 10-year mean temperature anomalies of these series are about 0.10 °C for the first 40 years, decrease to 0.05 °C by the 1950s, and fall below 0.05 °C in the past 50 years. The global mean warming rate from 1910 to 2009 is 0.70–0.75 °C per 100 years.
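The calculation of 10-year mean anomalies can be sketched as follows. The annual series is synthetic (a straight line with a slope of 0.7 °C per 100 years), not any of the three datasets, and the function name `decadal_means` is an assumption of this example.

```python
import numpy as np

def decadal_means(years, anomalies, start=1910, end=2009):
    """Mean anomaly for each 10-year block, keyed by the first year of the block."""
    years = np.asarray(years)
    anomalies = np.asarray(anomalies, dtype=float)
    means = {}
    for first in range(start, end + 1, 10):
        in_block = (years >= first) & (years <= first + 9)
        means[first] = float(anomalies[in_block].mean())
    return means

# Synthetic annual anomalies (°C) rising linearly by 0.007 °C per year.
years = np.arange(1910, 2010)
series = -0.4 + 0.007 * (years - 1910)

means = decadal_means(years, series)
change = means[2000] - means[1910]   # difference between last and first decade
```

Applying the same function to each of several series and differencing the results decade by decade gives the kind of inter-series spread discussed above.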
Figure 1. Global mean temperature anomalies for 1880–2009 relative to 1961–1990 [Hansen et al., 2010]
An interesting issue emerges when the three time series are compared in detail (Fig. 2). HadCRUT3 shows 1998 as the warmest year, whereas both GISS and NCDC show 2005 as the warmest year. In particular, the GISS series has been much higher than HadCRUT3 since 2005. A possible reason is that HadCRUT3 lacks observations over the Arctic, where the warming has become significant in the last 10 years [Hansen et al., 2010]. Based on the weak warming trend in 1999–2008 calculated from the HadCRUT3 series, some scientists concluded that global warming had come to a halt in the last 10 years [Knight et al., 2009; Kerr, 2009]. To analyze the change in the last 10 years, we calculated the warming trends for 2000–2009 with the three series. The warming rates are 0.12 °C per 10 years for GISS, 0.07 °C per 10 years for NCDC, and 0.03 °C per 10 years for HadCRUT3.
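A warming rate in °C per 10 years can be obtained from annual anomalies with a linear fit. The text does not state which fitting method was used, so ordinary least squares is an assumption here, as are the function name `decadal_trend` and the synthetic series, which is constructed to have a trend of 0.12 °C per 10 years.

```python
import numpy as np

def decadal_trend(years, anomalies):
    """Least-squares linear trend in °C per 10 years."""
    years = np.asarray(years, dtype=float)
    # Centre the years so that the fit is numerically well conditioned.
    slope, _intercept = np.polyfit(years - years.mean(), anomalies, 1)
    return 10.0 * float(slope)

# Synthetic annual anomalies (°C) for 2000-2009 with a known trend.
years = np.arange(2000, 2010)
synthetic = 0.40 + 0.012 * (years - 2000)

trend = decadal_trend(years, synthetic)
```

With only 10 annual values, a trend estimated from real data is sensitive to individual warm or cold years, so differences of a few hundredths of a degree per decade should be read with that in mind.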
Figure 2. Global surface temperature anomalies for 1990–2009 relative to 1961–1990 (the right plot shows two GISS cases, one based on the original definitions and another limited to the HadCRUT3 area) [Hansen et al., 2010]
It is clear that GISS and NCDC, both of which incorporate observations over the North Pole, yield higher warming rates than HadCRUT3. Hansen et al. [2010] recalculated the GISS series using only the HadCRUT3 grid coverage, and found the resulting warming rate to be very similar to that from HadCRUT3. This further indicates that the weak warming rate in HadCRUT3 is primarily due to its sparse data coverage over the North Pole.
The global mean temperature series is an important basis for detecting climate change and identifying its causes. From the early days, when observations were available only from land and island stations, to the present, when more oceanic and satellite observations are incorporated, the study of global mean temperature has passed through three periods. Over these periods, data quality, spatial coverage, and analysis methods have improved significantly. In this paper, we reviewed and compared three popular datasets of global mean temperature that are often used by the international research community. Our analyses indicate that, in general, the three time series are highly correlated, although there is a difference of almost 0.10 °C among the decadal mean temperature anomalies in the first 40 years, and that the 100-year (1910–2009) warming trends from the three series are very close (0.70–0.75 °C per 100 years). However, for the past 10 years (2000–2009), the GISS and NCDC series show a much stronger warming trend than HadCRUT3. This is very likely due to the lack of observations over the Arctic in HadCRUT3. This result not only shows the importance of the spatial coverage of the data, but might also to some extent explain the discrepancies among the three datasets in the early period.
This work is supported by the LASG Open Research Program and the National Natural Science Foundation of China (No. 41005035/D0507). We also thank the reviewers for their helpful suggestions.
Received: 19 August 2011