World Data Center for Paleoclimatology
World Data Center for Paleoclimatology
News Article | April 6, 2016
We systematically searched the scientific literature for proxy records of hydroclimate and temperature. We retrieved all records fulfilling the following criteria: (1) Temporal coverage. All proxy records must cover at least the period 1000–1899. (2) Temporal resolution. All proxies must have at least, on average, two data points per century between 1000 and 1899 and no more than two centuries over the 1000–1899 period may have fewer. In the data subset containing only hydroclimate records resolved decadally or better, we required that all proxies have, on average, one data point per century between 1000 and 1899. (3) Publication requirements. We use only proxy records that have been published in the scientific literature. (4) Requirements regarding the climate signal. The proxies must, according to their published description, contain a hydrological signal described as precipitation, drought, moisture balance, stream flow, lake level change, bog surface wetness, flooding or similar hydrological information. (5) Geographical requirements. All the proxies must be located in, or nearly in, the Northern Hemisphere. Proxies just south of the Equator (by a few degrees) may be included if they contribute to gridded values north of the Equator. The majority of records were obtained from public repositories (for example, http://www.ncdc.noaa.gov/paleo/ and http://www.pangaea.de/). Those records not in the public domain were acquired either by direct request from their authors, or digitized from figures in the original articles (Supplementary Tables 1 and 2). The proxy data were divided into seven categories: (1) ice cores, (2) peat cores, (3) tree rings, (4) speleothems, (5) documentary, (6) lake sediments, and (7) marine sediments. In cases where there exist several versions of a proxy record from the same site, preference was given to the latest published version. The collected proxy records were divided into three simplified categories of seasonal response: annual, winter and summer, with spring and early autumn considered as part of the summer season. In cases when no clear information about a proxy’s seasonality could be found, either in the original article or in subsequent publications, we assumed the seasonality to be annual. We only included proxy records with age models constrained by at least one dating point after 1900 (D1), another between 1000 and 1900 (D2) and one dating point before 1000 (D3). The number of years between D1 and D2 and D2 and D3 should be less than 1,000 years. (1) The top of the core exists (that is, year of collection) or, if it does not, there must be a dating point (for example, 137Cs, 210Pb) to verify the age (A1 is the top age). (2) There should be at least one dating point (14C, U/Th, tephra, 210Pb) between 800 and 1999 (A2). (3) There should be at least one dating point between 800 and 1 (A3). (4) The spacing between these three dating points should be less than 800 years (A1 − A2 < 800 years, A2 − A3 < 800 years). (5) A1, A2 and A3 cannot be bulk datings. (6) The acceptable dating errors for A1, A2 and A3 must be <200 years (2σ).For the decadal- or better-resolved subset, each record must have at least one dating point per full century. Prior to computing centennial anomalies, all proxy records with irregularly spaced time steps were converted into time series with annually spaced time steps using simple linear interpolation, then smoothed by a cubic smoothing spline having a 50% frequency response at 100 years. The smoothed proxy series are transformed to standard normal deviates with respect to the 1000–1899 baseline period. From the resulting centennial anomalies every 25th value is extracted from each century with at least 85 (interpolated) values. From those proxy records spanning the complete 800–1999 period, we obtain 45 centennial anomalies. Of these 45 centennial anomalies the 12 representing the hundred-year periods 800–899, 900–999, 1000–1099, 1100–1199, …, 1900–1999, are those presented in the majority of figures throughout this Letter (Extended Data Fig. 8b). However, all 45 centennial anomalies are used in the correlation experiments as well as in Fig. 2b and c. For the subset including only hydroclimate records decadally resolved or better, we obtained decadal anomalies using the same method as described above but instead using a spline with a 50% frequency response at 10 years. Because individual proxy records can contain noise, a more robust signal is presumably obtained by producing an average of multiple proxies from within a geographic area where hydroclimate can be assumed to be coherent. Unfortunately, the size of such geographic areas, on centennial timescales, is not well constrained owing to the limited length and distribution of instrumental observations. This deficiency required the development of a measure of correlation decay length (CDL)31 of hydroclimate32, 33 on centennial timescales. Encouraged by the agreement found between instrumental precipitation data on annual and decadal timescales34 and simulated values from the ECHO-G model35 on the same timescales, we used the simulated centennial values from the ECHO-G model to calculate a conservative estimate of the spatial auto-correlation function for hydroclimate on centennial timescales by the following equation: where t is the time index, i and j are the longitude index and the latitude index of the grid, respectively; f(t, i, j) are the deviations from the local long-term mean at time t and at a grid-cell (i, j) divided by the local temporal standard deviation (that is, standardized to one standard deviation), d is the spatial separation between grid cells (i, j) and (k, l). The sum of the product f(t, i, j)f(t, k, l) over time and space is restricted to the grid cells (k, l) that are separated from the grid cell (i, j) by the distance between L − ΔL and L + ΔL, where ΔL is 100 km. This sum is divided by the number of occurrences of grid-cell pairs. The outer sum is calculated over all grid cells located on an index circle j, and then this sum is divided by that number of grid cells. It is thus assumed that the spatial autocorrelation is isotropic and zonally homogeneous, that is, it depends only on length L and on latitude j. The assumption of isotropy and zonal homogeneity is an assumption that is not completely fulfilled for all terrestrial locations. However, this approach yields an approximate and average estimation of the decorrelation length at multi-decadal to centennial timescales, and this estimation amounts to a few hundred kilometres (Extended Data Fig. 1a). The precise value of the decorrelation length does not affect the basic features of the reconstruction. A simplified CDL function of hydroclimate variability on centennial timescales was derived from the output of the climate model ECHO-G35 over the period 1000–1990 to calculate the maximum search distances as a function of latitude in this study (Extended Data Fig. 1b). We use the distance function described above, derived from the estimated centennial hydroclimate CDL, to define the search radius for averaging proxy anomalies (Extended Data Fig. 1a and b). Finding the decorrelation of hydroclimate increases with distance from a search centre, we assign a weight for each proxy record, for any given latitude, decreasing from 1 at the centre to e−2 ≈ 0.14 at the search periphery using the following Gaussian weight function: Here, w is the weight assigned to a proxy value located at x distance from a grid node, and R is the search radius defined by equation (1). Every reconstructed grid cell location is required to have three or more proxy centennial anomalies within the search radius. This requirement effectively reduces the total number of proxies used in the experiment, although some regions remain densely covered (such as North America, China and Europe; Extended Data Fig. 8a). Weighted means are computed for all 45 centennial anomalies and are those used in the correlation analyses. All gridding and weighting of proxy (and model) data are performed using the method described above for producing the proxy local weighted averages, although the search area centres are moved to each intersection of a 1° × 1° longitude by latitude grid superimposed over a polar projection of the Northern Hemisphere. Maps of centennial gridded proxy hydroclimate variability are shown in Fig. 2a, and time series corresponding to the fraction of land area at or above a given wetness or dryness threshold are shown in Fig. 2b. The weighted and gridded proxy temperature reconstruction shown in Extended Data Fig. 6a and used in Fig. 3 is produced in the same manner as the hydroclimate reconstructions, with the exception of the CDL function used for weighting and gridding. Detailed descriptions of the CDL, the proxy weighting function, and the search distance function used for reconstructing Northern Hemisphere centennial temperature variations are given in ref. 15. We have used simulations from six coupled atmosphere–ocean general circulation model simulations that are part of the Climate Model Intercomparison project Phase 5 (CMIP5)36. The simulations denoted as ‘past1000’ cover the period 850–1850, whereas the ‘historical’ simulations cover the period 1851–2005. The ‘past1000’ simulations are sometimes performed using a lower resolution or simplified versions of the more comprehensive models used for the ‘historical’ simulations. We have restricted our analysis to those six simulations—bcc-csm1-137, CCSM438, IPSL-CM5A-LR39, GISS-E2R40, HadCM341, and MPI-ESM-P42—conducted with the same model versions for both periods and stitch the ‘past1000’ and ‘historical’ simulations that carry the same ensemble run-initialisation-physics (rip) identifier in the CMIP5 repository (Table 1). However, the simulation conducted with the MIROC-ESM model was not used owing to a documented, unrealistic, long-term positive temperature trend in global mean annual near-surface temperature over the past millennium43. The models were driven by estimations of external forcing, including variations in the orbital changes, total solar irradiance, volcanism, atmospheric trace gases and land-use changes. The CMIP5 Consortium recommended similar, but somewhat different, options for these external forcings23, 44 and the modelling groups did not always use the same options for all CMIP5 simulations. From each of the six simulations we extracted the annual mean precipitation and temperature at the model grid-cell location nearest to our proxy locations. Once the selection was made, an equal length extraction was performed to make all comparisons between model and proxy as equitable as possible. The smoothing, weighted averaging and gridding of all model data followed the same protocol as described above for the proxy data. The centennial forcing anomalies in Fig. 3 are produced from time series of solar45, volcanic46 and CO forcing47 the same or similar to those used in the models. All temporal correlations between two weighted and gridded data sets are calculated by Pearson product-moment correlation coefficient48. Assessment of the statistical significance of these correlations is complicated by the fact that any two data sets are both temporally and spatially correlated. We have taken temporal correlation, that is, autocorrelation, into account by using the method of blocked bootstrap (3,001 repetitions)49 to provide a more realistic estimate of the correlation’s statistical significance. In Extended Data Fig. 4a we show the correlations between the gridded hydroclimate proxy data and corresponding gridded hydroclimate model data and Extended Data Fig. 4b shows the statistical significance of the correlations. In Extended Data Fig. 4c we show the correlations between the gridded hydroclimate proxy data and gridded temperature proxy data and Extended Data Fig. 4d shows the statistical significance of the correlations. We produce histograms of all cross-correlations in this study (Extended Data Fig. 9a–d) and their Fisher z transform50 (Extended Data Fig. 9e–h) to assess the normality of their distributions. A slightly bimodal distribution is seen, for both the original and Fisher z-transform correlations, between the gridded hydroclimate proxy data and gridded temperature proxy data. This is not what we would expect to see if hydrological and temperature anomalies were unrelated: the distribution of the Fisher z transform of the correlations should then be approximately normal and, in particular, unimodal. Instead, the histograms suggest that the distributions of the correlations between proxy hydroclimate and proxy temperature values are a mixture of two unimodal distributions, corresponding to two distinct mechanisms relating hydroclimate and temperature. To assess the robustness of our gridded weighted hydroclimate reconstruction we have produced a number of ‘subset’ reconstructions, using different data screening schemes, as well as comparing our multi-proxy reconstruction to other hydroclimate products. We created a subset of proxies with only those records that have decadal to annual resolution and at least one dating point per century. The gridded reconstruction obtained from this subset has less spatial coverage than the reconstruction from the full data set. The median Pearson product-moment correlation coefficient, at grid point level, between the two reconstructions (using 45 centennial values, lagged by 25 years) is r = 0.97 (p < 0.01), suggesting that any potential dating uncertainties that exist in the full data set have a small influence on our overall results (Extended Data Table 1). Furthermore, we correlated the full reconstruction with reconstructions produced excluding one of the seven proxy types at the time, which notably also involves excluding the numerous but less well age-constrained lake sediment records (Extended Data Table 1). Moreover, we produced similar tests excluding/retaining proxy records reflecting quantitative estimates of precipitation as well as excluding/retaining proxy records with annual signal, summer signal, and winter signal, respectively. Finally, we correlated our reconstruction from the full data set as well as the reconstruction from the more strictly screened data set resolved decadally or better, set against the North American Drought Atlas20 and the Monsoon Asia Drought Atlas21. At grid point level the median Pearson’s correlation is r = 0.34 (p < 0.15) and r = 0.50 (p < 0.03), respectively. These significant, but not very high, correlation values are acceptable considering our hydroclimate reconstruction metric is not an equivalent measure of the PDSI51, 52 and furthermore that the PDSI reconstructions are for the summer season only. The Earth’s climate system contains variability from sub-daily to Milankovitch to deep geological timescales17, 53, 54, 55. It has been known for a long time that proxy records may record climatic variation occurring on different wavelengths (that is, interannual to decadal to centennial to millennial) with varying fidelity resulting in disproportionate manifestations of the variance across these different wavelengths56, 57, 58, 59, 60, 61, 62. Such ‘spectral biases’ are a type of noise that, if not considered or corrected, potentially challenge a precise quantitative assessment of the continuum of climate variability. Using instrumental records to characterize and subsequently correct the possible spectral biases in proxy records should yield more accurate reconstructions of climate. However, we note that this endeavour to characterize proxy biases is questionable, because the true long-term characteristics of the climate system remain broadly unknown, which is part of the motivation to develop high-quality proxy data and reconstructions. Knowledge of proxy data sets is, however, sufficiently advanced to state that individual proxy types may be characterized by one or more biases. For example, tree-ring data, arguably the most intensively investigated proxy archive for the past millennium in both the temporal and frequency domains, are known to have two potentially opposing biases. On the one hand, owing to a ‘biological memory’, tree-ring width data especially contain somewhat less high-frequency (interannual) variability than instrumental temperature (or precipitation) data59, 60, 61, 62. On the other hand, the necessary removal of the long-term age/size-related trends in tree-ring records, if performed using certain techniques, results in the so-called ‘segment length curse’63 that limits our ability to preserve trends/variability exceeding the length of the mean length of the individual tree-ring sequences. In this case, diminished low-frequency (centennial to millennial) variability would be expected. There are, however, different techniques that can be applied to overcome or mitigate such spectral biases. The removal of the age-related trends using a single population estimate as in the Regional Curve Standardisation technique has been shown to break the ‘segment length curse’ and to preserve long wavelengths better63, 64, 65. Similarly, the biological memory can be estimated and removed by autoregressive modelling and the resulting time series can be assigned the spectral properties expected from the instrumental data sets56. Analyses of long-term instrumental data sets are limited to the past couple of centuries, which does not, as also mentioned in the main text, allow a full validation of the low-frequency (for example, centennial to millennial timescale) components of proxy data or reconstructions. In most proxy records, just as with the tree-ring data, the low-frequency variability can be affected by nonlinear processes such as the compression or mixing of layers, a time-varying temperature–precipitation relationship, anthropogenic impacts on the local environment and, in the case of ice-core data, ice-flow movements in the ice cap. It should be noted that the interpolation of non-annual records to annual resolution also results in a spectral bias towards lower frequencies66. Given the uncertainties back in time in many individual proxy records, both in the high- and low-frequency signal, a multi-proxy approach, using the average signal from a number of disparate proxy records, is arguably best suited for the study of low-frequency climate variability. In multi-proxy reconstructions, where the aim is to obtain both the high (annual) and low (centennial) frequency signal, any spectral bias in the proxies, influencing the ratio of the high- and low-frequency signal, is a problem of major importance. Our aim, however, is only to obtain the signal on centennial and longer timescales. By standardizing centennial mean values, proxy records that have both strong and weak low-frequency signals are transformed to records of just centennial-scale information of the same amplitude. This transformation essentially eliminates the influence of spectral biases by making irrelevant whatever prior ratio of high- and low-frequency information had existed in the proxy before standardization. Even if a proxy does not capture the full amplitude of centennial-scale variability (for example, tree-ring width chronologies built by many short individual series and/or limited replication63) the impact is small because the standardization process gives all proxies the same amplitude of centennial-scale variability. As long as the proxies have some centennial-scale variability (which is the only variability we retain for our primary reconstruction) the proxies are useful to us. Our different sensitivity tests (Extended Data Table 1), removing various proxy types from the complete data set and correlating their resulting reconstruction with the full reconstruction, demonstrate a very limited impact of potential spectral biases from particular proxy types on our full reconstruction. These experiments produce remarkably similar hydroclimate reconstructions, suggesting that the purported negative effect of ‘mixing’ proxies with potential spectral biases is not large (if not altogether mitigated by the standardization). Especially important in this regard are results from those tests in which all the records with less than decadal resolution, with only centennial dating control, and those in which all the tree-ring records and all the lake sediment records are excluded, respectively. Not surprisingly, the reduction in proxy coverage gave rise to some spatially local differences; however, the overall patterns remained the same. Most of the changes seen are less likely to be attributable to spectral biases in the proxies, but rather to a diminishment in the considered number of proxies, and hence greater noise in the reconstruction. We mitigate the impact of proxy noise in our reconstruction approach by requiring multiple records within the search radius for every grid point reconstructed. All the proxy data and programs used to perform the experiments herein have been made publicly available at NOAA Paleoclimatology/World Data Center for Paleoclimatology (https://www.ncdc.noaa.gov/paleo/study/19725). The Source Data we used to plot each figure in the Letter are also stored there. The suggested computing environment and dependencies are: Mac OSX 10.6 or greater, Generic Mapping Tools (GMT) 4.5.6 or later67, Ghostscript 9.10, and Matlab2007b or later. A detailed description of the software used, with example data and runtime commands is supplied with the archived data. The climate model data included in the simulations of the Climate Model Intercomparison Project Version 5 (CMIP5) can be downloaded from any of the nodes of the Earth System Grid Federation (registration is required and a password is provided after registration), for example from: https://esgf-data.dkrz.de/search/cmip5-dkrz/. The menu on the left side of the page allows the user to select the ‘experiment’ (in our case, ‘past1000’ and ‘historical’), the model (for example, MPI-ESM-P), the variable (pr = precipitation), and the time frequency (mon = monthly). A click on the search button displays all matching results. The data download can be performed interactively by clicking on the links of the individual files or by downloading a Unix script ‘wget’, which can be locally run on a Unix computer. All files are written in ‘netcdf’ format, which also includes the metadata information, and that can be accessed with a variety of software tools, for instance with the public statistical packages R, Climate Data Operators (CDO) or netcdf Operators (NCO). Data of the ECHO-G simulation can be retrieved from the CERA data bank of the German Climate Computing Centre (DKRZ) at http://cera-www.dkrz.de/. An account on the CERA data bank is required to access the data and can be obtained by sending an email to email@example.com. CERA will then provide a personal account and password as soon as possible. Once an account has been created, the data of the ECHO-G simulation can be accessed.
Wahl E.R.,World Data Center for Paleoclimatology |
Anderson D.M.,World Data Center for Paleoclimatology |
Bauer B.A.,World Data Center for Paleoclimatology |
Buckner R.,World Data Center for Paleoclimatology |
And 4 more authors.
Geochemistry, Geophysics, Geosystems | Year: 2010
High-resolution temperature reconstructions (typically annually to seasonally resolved) have played a key role in understanding paleoclimate immediately prior to the beginning of the instrumental record, especially when calibrated to form an extension of comparable instrumental data coverage (global, hemispheric, and regional). Such calibration allows the information in the instrumental record to be quantitatively extended backward in time in an objective way, enabling description of much longer term fluctuations in climate than possible with instrumental data alone, although with significantly increased uncertainties inherent in proxy-based reconstructions. This data brief describes a newly integrated archive of nearly all the high-resolution temperature reconstructions of the past 2+ millennia included in NOAA's National Climatic Data Center, from small-regional to global scale, which also have been recalibrated to a standard set of instrumental data. Examination of the spectral structure of the data is additionally provided. Copyright 2010 by the American Geophysical Union.