News Article | April 20, 2016
Our historical weather data came from two sources: the Global Surface Summary of the Day (GSOD) data set, version 7 (https://data.noaa.gov/dataset/global-surface-summary-of-the-day-gsod; accessed 22 July 2014)21, and the US Historical Climatology Network (USHCN), version 2.5 (http://www.ncdc.noaa.gov/oa/climate/research/ushcn/; accessed 22 September 2015)22, both maintained by the National Centers for Environmental Information. Block group-level and county-level US Census data, including geographical boundary data, came from the Minnesota Population Center’s National Historical Geographic Information System, version 2.0 (https://data2.nhgis.org/main; accessed 30 July 2014)31. We obtained county-level monthly temperature projections from the National Climate Change Viewer (NCCV) (http://www.usgs.gov/climate_landuse/clu_rd/nccv.asp; accessed through direct communication with J. Alder on 20 July 2015 and 3 December 2015)26, 27, a US Geological Survey product that takes downscaled climate scenarios prepared by NASA (the National Aeronautics and Space Administration) and averages the 800-m gridded temperature data to the county level. Our international projections data came from the Royal Netherlands Meteorological Institute’s Climate Change Atlas (http://climexp.knmi.nl/plot_atlas_form.py; accessed 11 September 2015)32. We limited our analysis to data from weather stations in the contiguous United States that operated continuously between 1974 and 2013. This 40-year period is long enough to minimize sensitivity to natural variability in weather data, and it begins at a point in time when the number of weather stations included in standard data sets and the completeness of the data they reported both increased. The timespan covers the entire history of Americans’ exposure to the idea of climate change, allowing us to track how weather has shifted during the time when the public might have perceived such shifts as attributable to climate change. 
Daily weather data on temperature and humidity came from the GSOD data set, produced by the National Centers for Environmental Information from hourly weather station observations contained in the Integrated Surface Hourly data set21. Of the various land-based weather station data sets that offer daily summary data, GSOD is the only one that includes weather records necessary to measure a location’s relative humidity, which the urban economics literature has shown to be an important climate amenity driving regional population growth. Temperature and humidity records in our data set came from the GSOD’s daily station records of mean and maximum temperatures and mean dew point temperature (from which we calculated daily relative humidity and, in turn, daily heat index values). We included in the study only those GSOD stations reporting valid data on each of these weather indicators for at least 50% of the days in each of the 480 months of our study period, reducing the total number of stations in the analysis from 672 to n = 324 (Supplementary Table 1). Raising the threshold for valid data from 50% to 75% produced similar results (Extended Data Table 4a). Our final data set included a small number (n = 34) of stations that were relocated to nearby sites at some point between 1974 and 2013. In each case, the site location changed no more than 10 m in elevation and 0.1 decimal degree in latitude or longitude, and data reported by the relocated stations covered the entire study period with no more than a 15-day gap. Running our main analysis after omitting these relocated stations produced similar results (Extended Data Table 4b). Synoptic reporting of weather conditions in the GSOD data set introduces error in the calculation of daily precipitation indicators, so our main analyses employ daily precipitation records from weather stations in the USHCN, a designated subset of the National Oceanic and Atmospheric Administration’s Cooperative Observer Program Network22.
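The GSOD completeness rule described above (valid data on at least 50% of the days in each of the 480 months) can be illustrated with a minimal Python sketch. The function name and the input layout, a mapping from (year, month) to a count of valid days, are assumptions made for illustration and are not taken from the study's own code.

```python
import calendar


def passes_completeness(valid_days, first_year=1974, last_year=2013, min_share=0.5):
    """Return True if a station has enough valid days in every month.

    valid_days maps (year, month) -> number of days with valid data
    (hypothetical layout). A month missing from the mapping counts as
    zero valid days.
    """
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            days_in_month = calendar.monthrange(year, month)[1]
            # "At least 50% of the days" means the station fails as soon
            # as one month falls below the required share.
            if valid_days.get((year, month), 0) < min_share * days_in_month:
                return False
    return True
```

Passing `min_share=0.75` would reproduce the stricter threshold used in the robustness check.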
Sites are chosen for inclusion in the USHCN according to their spatial coverage, record length, data completeness, and historical stability. USHCN records are subject to rigorous quality control checks and have been demonstrated to be less error-prone than the GSOD33. We included in the study only those USHCN stations for which at least 90% of daily precipitation data were available in no fewer than 95% of the 480 months of our study period, reducing the total number of stations in the analysis from 1,218 to n = 601. This reduced the share of days in the USHCN data set with missing precipitation data to 1.2%. Because any time-dependent missing daily precipitation data could potentially affect our measurements of annual total precipitation and precipitation days, we used a procedure for simulating the occurrence of precipitation on missing data days that has been employed in leading research on over-time precipitation trends34, 35. All simulations were conducted at the station-by-month level. For any day with missing precipitation data, we first employed a random-number generator to simulate whether precipitation occurred by using the observed frequency of precipitation within the station-month over the 40-year period of our analysis. We fitted a separate gamma distribution—which has been shown to realistically represent precipitation processes—to each station’s daily precipitation by month (for a total of 601 × 12 = 7,212 distributions), using only months for which the station had complete data. A random draw from the fitted distribution was then used to simulate missing daily precipitation for any day in the station-month on which precipitation was simulated to occur. As a robustness check, we carried out the same analysis using GSOD precipitation data; results were similar (Extended Data Table 4c). 
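The two-step simulation of missing precipitation days described above (an occurrence draw, then an amount drawn from a fitted gamma distribution) might be sketched as follows. The text does not state how the gamma distributions were fitted, so the method-of-moments fit here is an assumption, as are the function names and the choice to fit on wet-day amounts only.

```python
import random
import statistics


def fit_gamma_moments(wet_day_amounts):
    """Method-of-moments gamma fit (an assumed fitting method).

    Returns (shape, scale) such that shape * scale equals the sample mean
    and shape * scale**2 equals the sample variance.
    """
    mean = statistics.fmean(wet_day_amounts)
    var = statistics.variance(wet_day_amounts)
    return mean * mean / var, var / mean


def simulate_missing_day(p_wet, shape, scale, rng):
    """Simulate precipitation for one missing day in a station-month.

    p_wet is the observed frequency of precipitation days for that
    station-month over the study period.
    """
    # Step 1: random draw deciding whether precipitation occurred.
    if rng.random() < p_wet:
        # Step 2: random draw of the amount from the fitted gamma.
        return rng.gammavariate(shape, scale)
    return 0.0


# Usage with made-up numbers (not data from the study).
rng = random.Random(42)
shape, scale = fit_gamma_moments([2.0, 4.0, 6.0, 1.5, 9.0])
filled = [simulate_missing_day(0.3, shape, scale, rng) for _ in range(5)]
```

One distribution per station and calendar month keeps seasonal differences in rainfall intensity out of the fitted parameters.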
To estimate population exposure to weather conditions, we used a method employed by health geographers that weights weather station observations based on their distance from population centroids of US counties36, 37. Unlike other geographic units we might use to measure population exposure, counties are the smallest unit of geography for which boundaries remained almost entirely unchanged during the 40-year period. We located the population-weighted centroid for each county using block group population and boundary data from the 1990 Census, which was conducted approximately at the midpoint of our study period. We then assigned weights to GSOD and USHCN weather stations located within 160 km of a county’s population centroid based on the inverse of the station’s squared distance from the centroid. Counties with no weather station within 160 km from their centroids (n = 66 of the 3,103 counties in the contiguous United States) were dropped from the analysis; the counties remaining accounted for 98% of the 2010 contiguous US population. The median number of GSOD weather stations assigned to counties was 7; the median number of USHCN stations was 13 (Supplementary Table 1). Extended Data Figure 1 shows a map of the weather stations and counties in our data set. Our findings are robust to other methods of matching weather conditions to the population. The results were similar when including only stations located within 80 km of population centroids (Extended Data Table 4d). In a separate analysis, we created a Voronoi polygon around each GSOD weather station with valid temperature, humidity and precipitation data (Extended Data Fig. 3) and then assigned population to the polygons by 1990 Census block groups. Using this method—which relies only on a single weather station’s data for each block group and does not include USHCN data—we find results similar to those in our main analyses (Extended Data Table 5). 
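The inverse-squared-distance weighting of stations within 160 km of a county's population centroid can be sketched as below. The great-circle (haversine) distance, the Earth radius constant, the treatment of a station located exactly at the centroid, and all names are assumptions for illustration; the text does not specify how distances were computed.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def idw_value(centroid, stations, max_km=160.0):
    """Inverse-squared-distance weighted mean of station values.

    centroid is (lat, lon); stations is a list of (lat, lon, value).
    Returns None when no station lies within max_km, mirroring the
    counties that were dropped from the analysis.
    """
    num = den = 0.0
    for lat, lon, value in stations:
        d = haversine_km(centroid[0], centroid[1], lat, lon)
        if d > max_km:
            continue
        if d == 0.0:
            return value  # assumed handling: a co-located station dominates
        w = 1.0 / d ** 2
        num += w * value
        den += w
    return num / den if den > 0 else None
```

Setting `max_km=80.0` corresponds to the narrower radius used in the robustness check.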
We used daily data to calculate monthly averages by weather station for each of the weather indicators in our data set, yielding data at the station × year × month level, and then calculated annual values of January average daily maximum temperature, amount of precipitation, and number of days on which precipitation occurred. We used standard formulas to calculate July average daily mean relative humidity38 and July daily heat index39. Because we were interested in Americans’ experience with the weather rather than distinguishing between short-term natural variability and long-term climate trends, we did not adjust the data to remove urban heat island effects. We also did not adjust for changes in instrumentation or observation routine. Research on the effects of these changes on temperature measurements suggests that the effects are modest and should bias results against our findings. The transition from afternoon to morning temperature observations and the adoption of electronic instruments both had the effect of recording lower maximum temperatures4, 40, 41. These effects do not seem to vary between winter and summer41. Because warming that has occurred over the last 40 years has been more pronounced and widespread in winter than in summer, instrument changes would result in understating the amount of January warming that has occurred and the corresponding increase in WPI. The effects of instrument changes on dewpoint temperature, and thus relative humidity measurements, are less systematic over time, but they have been detected at only a small proportion of stations24. Considering the modest role that relative humidity plays in our preference model and the limited evidence that instrumentation substantially affects measurements, we determined that data adjustments were not necessary. We assigned weather station data to counties using the inverse-distance weights described earlier. 
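As an illustration of the kind of "standard formulas" referred to above, the sketch below derives relative humidity from air and dew point temperatures with a Magnus-type approximation and computes a heat index with the US National Weather Service's Rothfusz regression. Both choices are assumptions: the exact formulas from refs 38 and 39 are not reproduced in the text and may differ, and the low-humidity and low-temperature adjustments that the NWS applies around the regression are omitted.

```python
import math


def fahrenheit_to_celsius(temp_f):
    return (temp_f - 32.0) * 5.0 / 9.0


def relative_humidity(temp_c, dewpoint_c):
    """Relative humidity (%) from a Magnus-type vapour pressure ratio.

    The coefficients 17.625 and 243.04 are one common parameterisation
    (an assumption, not necessarily the one used in the study).
    """
    def gamma(t):
        return 17.625 * t / (243.04 + t)

    return 100.0 * math.exp(gamma(dewpoint_c) - gamma(temp_c))


def heat_index_f(temp_f, rh_percent):
    """Rothfusz regression for heat index in degrees Fahrenheit.

    Intended for warm, humid conditions (roughly 80 F and above);
    the NWS adjustment terms are not included in this sketch.
    """
    t, r = temp_f, rh_percent
    return (-42.379 + 2.04901523 * t + 10.14333127 * r
            - 0.22475541 * t * r - 0.00683783 * t * t
            - 0.05481717 * r * r + 0.00122874 * t * t * r
            + 0.00085282 * t * r * r - 0.00000199 * t * t * r * r)


# Usage: GSOD reports Fahrenheit, so convert before the humidity step.
rh = relative_humidity(fahrenheit_to_celsius(90.0), fahrenheit_to_celsius(70.0))
hi = heat_index_f(90.0, rh)
```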
Because our interest was in population exposure, we weighted our annual indicators by 2010 county population42. We used constant population weights, rather than adjusting them for shifts over time in county population, to isolate the impact of weather trends from any changes in aggregate exposure that are attributable to population migration or growth. As shown in Table 1, using population weights from 1970 rather than 2010, or unweighted values, produced similar results. Summary statistics for the county-level weather indicators over our 40-year study period are reported in Extended Data Table 1; mean values of WPI by county are shown in Fig. 2a. To produce the results reported in the paper, we used a WPI derived from a population growth model reported in a widely cited study14 (ref. 14, table 3, model 6). The model includes both linear and quadratic weather terms to flexibly assess preferences about weather. In the model used here, county population growth from 1970 to 2000 is a function of five long-term normal weather indicators: January average daily maximum temperature (JAN_MAX); July daily heat index (JULY_HI); July average daily mean relative humidity (JULY_RH); annual precipitation (PRECIP_IN); and the number of days on which precipitation occurs annually (PRECIP_DAYS). Control variables include county geographical, coastline and topological features, baseline population density and total population, and baseline shares of county population employed in different sectors, including those tied closely to weather such as agriculture and transportation. Taking the reported coefficients estimating the partial relationships between the weather indicators included in the model and population growth, we calculated a WPI score for each county j in each year t using equation (1). All weather indicators are centred at their means, and thus the linear term coefficients can be interpreted as the effect of a one-unit shift in the indicator on WPI at the indicator’s mean value.
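The general shape of such an index, a sum of linear and quadratic terms on mean-centred indicators followed by a fixed population weighting across counties, can be sketched as follows. The coefficient values in the usage example are placeholders chosen for illustration only; they are not the coefficients from ref. 14, and whether every indicator carries a quadratic term is an assumption.

```python
def wpi_score(indicators, means, linear, quadratic):
    """Index for one county-year from mean-centred weather indicators.

    indicators, means, linear and quadratic are dicts keyed by indicator
    name (for example "JAN_MAX"). An indicator missing from `quadratic`
    is treated as having no quadratic term.
    """
    score = 0.0
    for name, value in indicators.items():
        centred = value - means[name]
        score += linear[name] * centred + quadratic.get(name, 0.0) * centred ** 2
    return score


def population_weighted_mean(values, populations):
    """Average county values using fixed (for example 2010) populations."""
    total = sum(populations)
    return sum(v * p for v, p in zip(values, populations)) / total


# Usage with placeholder numbers (hypothetical, not from the study).
means = {"JAN_MAX": 40.0, "JULY_HI": 90.0}
linear = {"JAN_MAX": 0.5, "JULY_HI": -0.3}
quadratic = {"JAN_MAX": -0.1, "JULY_HI": -0.05}
county_a = wpi_score({"JAN_MAX": 42.0, "JULY_HI": 90.0}, means, linear, quadratic)
county_b = wpi_score({"JAN_MAX": 40.0, "JULY_HI": 92.0}, means, linear, quadratic)
national = population_weighted_mean([county_a, county_b], [100_000, 300_000])
```

Because the indicators are centred, a county sitting exactly at every mean scores zero, and the linear coefficient gives the marginal effect at that point.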
As shown in equation (1), the analyses in this study (and all published studies from which we derive WPIs) were conducted using US conventional (imperial) units of measure. To comport with these studies, we employed imperial units in our calculations of all WPIs and then transformed results into SI units to report temperature and precipitation trends. We checked the robustness of our finding by calculating alternative WPIs based on five other published analyses9, 11, 12, 13 estimating the effect of climate amenities on local population growth. These studies employ simpler treatments of climate amenities, in some cases including only two or three indicators related to temperature, precipitation or humidity. All treat summer and winter temperatures separately. For each study, we developed a WPI based on reported coefficients on the study’s weather-related variables (Extended Data Tables 2 and 3). We then used our county-level weather data to calculate annual WPI scores for all US counties from each of these WPI formulas. We were able to measure all weather-related variables at the county level over the entire 40 years except for sunshine hours, a variable that appears in two of the models12, 13. Our estimates therefore assume no long-term change in the amount of sunshine experienced by individual counties. County-level temperature estimates for the RCP4.5 and RCP8.5 emissions scenarios came from the NCCV, which uses NASA Earth Exchange Downscaled Climate Projections (NEX-DCP30) data to project future changes in climate and water balance for states, counties and hydrologic units26, 27. The NEX-DCP30 data set statistically downscales projections from 33 models included in the 5th Climate Model Intercomparison Program (CMIP5) to an 800-m grid. The NCCV includes 30 of the 33 models that cover both emissions scenarios and creates area-weighted averages at the county level. 
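For the imperial-to-SI step mentioned at the start of this passage, a small sketch helps make one subtlety explicit: a temperature level and a temperature change convert differently, because the 32-degree offset cancels when two Fahrenheit values are subtracted. The function names are illustrative assumptions.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature level from Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def fahrenheit_change_to_celsius(delta_f):
    """Convert a temperature change or trend; no offset is applied."""
    return delta_f * 5.0 / 9.0


def inches_to_mm(inches):
    """Convert a precipitation amount from inches to millimetres."""
    return inches * 25.4
```

A warming trend computed in Fahrenheit would therefore be reported in Celsius via `fahrenheit_change_to_celsius`, not via the level conversion.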
Consistent with the NCCV’s presentation of these data, we have examined projections over three time periods (2025–2049, 2050–2074, and 2075–2099) under the two emissions scenarios, comparing mean WPI values within each time period to the observed 1974–2013 means for every county (Extended Data Table 6). Because of data availability, we estimated changes in WPI based only on changes in summer and winter temperatures, weighting counties by their 2010 populations and fixing other weather indicators at their means for the final 10 years of our study period. Consistent with studies of CMIP5 model performance, we found discrepancies between the observed temperature record and hindcasts yielded by climate models7, 43, with the effect that simulated temperature data for the 40-year historical period of our study produce average WPI scores that are lower than those calculated using observed data. Recognizing that discrepancy between modelled and observed temperatures may persist into the future, we performed an additional analysis in which we regression-adjusted projected future WPI scores under all time frames and scenarios to account for the discrepancy. The adjustment was performed by regressing observed annual WPI on simulated WPI derived from the CMIP5 models for the 1974–2013 period. We adjusted the projections in Fig. 3a using predictions from this model and display the adjusted projections in Fig. 3b. We obtained the international projections using the Royal Netherlands Meteorological Institute’s Climate Change Atlas, which provides CMIP5 climate model output for a variety of countries, seasons, time periods and scenarios through its web-based interface32. For each country, we obtained the mean surface-averaged projections of change in maximum winter and summer near-surface temperatures in the 2075–2099 period under RCP4.5 and RCP8.5 with respect to the reported mean of those observed in the 1974–2013 period (Extended Data Table 7).
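The regression adjustment described above, in which observed annual WPI is regressed on simulated WPI for 1974–2013 and the fitted line is then applied to projected scores, can be sketched with a simple ordinary least squares fit. A single-predictor linear model with an intercept is an assumption about the form of the regression, and the numbers in the usage example are invented.

```python
def fit_ols(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def adjust_projections(projected, intercept, slope):
    """Map simulated future scores onto the observed scale."""
    return [intercept + slope * p for p in projected]


# Usage with invented numbers: simulated scores run lower than observed.
simulated_hist = [1.0, 2.0, 3.0]
observed_hist = [3.0, 5.0, 7.0]
intercept, slope = fit_ols(simulated_hist, observed_hist)
adjusted = adjust_projections([4.0], intercept, slope)
```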
News Article | November 29, 2016
Smallholder and family farms are crucial to feeding the planet, and successful policies aimed at alleviating poverty, boosting food security and protecting biodiversity and natural resources depend on the inclusion and participation of small farmers. However, despite the recent spotlight on small farms and increasing consensus on their importance, detailed information on location and size of smallholder farms is virtually absent. Small farms exist in some of the planet's most diverse landscapes and are home to many of the planet's most vulnerable people, and yet we have very little information about them. A new study led by researchers at the University of Minnesota Institute on the Environment attempts to fill this crucial knowledge gap using household census data made available by the Minnesota Population Center to identify and map smallholder farms in developing countries. The study was published today in the journal Environmental Research Letters. "This map is a first step toward a better understanding of where and how smallholder farming can be sustainable for both landscapes and livelihoods," said Leah Samberg, lead author of the new study and scientist with IonE's Global Landscapes Initiative. Information about the number, location and distribution of small farms can be used to guide investments and target policies for agricultural development, food security and sustainable land use, says Paul West, GLI co-director and study co-author. "Surprisingly, there was not a map like this before. Combining both agriculture and household survey data creates a map that is a critical piece of the puzzle for targeting the billions of dollars invested in programs to improve people's lives," he said. "This study is only a first effort at utilizing these rich and complex data sets," said Samberg. 
"We envision numerous future applications of this farm size product in combination with other variables related to food security, natural resource use and human well-being that will further increase our understanding of the dynamics of small farms and the livelihoods of those who depend on them." James S. Gerber, Institute on the Environment; Navin Ramankutty, University of British Columbia; and Mario Herrero, Commonwealth Scientific and Industrial Research Organisation, Australia, are study co-authors. The University of Minnesota Institute on the Environment is leading the way toward a future in which people and the environment prosper together. For more information, visit environment.umn.edu.
Esteve A.,Centre d'Estudis Demogràfics |
McCaa R.,Minnesota Population Center |
Lopez L.A.,University of Costa Rica
Population Research and Policy Review | Year: 2013
The explosive expansion of non-marital cohabitation in Latin America since the 1970s has led to the narrowing of the gap in educational homogamy between married and cohabiting couples (what we call the "homogamy gap"), as shown by our analysis of 29 census samples encompassing eight countries: Argentina, Brazil, Chile, Colombia, Costa Rica, Ecuador, Mexico, and Panama (N = 2,295,160 young couples). Most research on the homogamy gap is limited to a single decade and a small group of developed countries (the United States, Canada, and Europe). We take a historical and cross-national perspective and expand the research to a range of developing countries, where since early colonial times, traditional forms of cohabitation among the poor, uneducated sectors of society have coexisted with marriage, although to widely varying degrees from country to country. In recent decades, cohabitation has been emerging in all sectors of society. We find that among married couples, educational homogamy continues to be higher than among those who cohabit, but in recent decades, the difference has narrowed substantially in all countries. We argue that assortative mating between cohabiting and married couples tends to be similar when the contexts in which they are formed are also increasingly similar. © 2012 Springer Science+Business Media Dordrecht.
Saporito S.,College of William and Mary |
van Riper D.,Minnesota Population Center |
Wakchaure A.,University of Southern California
URISA Journal | Year: 2013
The School Attendance Boundary Information System (SABINS) is a social science data infrastructure project that assembles, processes, and distributes spatial data delineating K through 12th grade attendance boundaries for thousands of school districts in the United States. Until now, attendance boundary data have not been made readily available on a massive basis and in an easy-to-use format. SABINS removes these barriers by linking spatial data delineating attendance boundaries with tabular data that describe the demographic characteristics of populations living within those boundaries. This paper explains why a comprehensive GIS database of K through 12 attendance boundaries is valuable, how original spatial information delineating attendance boundaries is collected from local agencies, and techniques for modeling and storing the data so they provide maximum flexibility to the user community. The goal of this paper is to share the techniques used to assemble the SABINS database so that federal, state, and local agencies can apply a standard set of procedures and models as they gather data for their regions.
McCaa R.,Minnesota Population Center |
Ruggles S.,Minnesota Population Center |
Sobek M.,Minnesota Population Center
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2010
In the last decade, a revolution has occurred in access to census microdata for social and behavioral research. More than 325 million person records (55 countries, 159 samples) representing two-thirds of the world's population are now readily available to bona fide researchers from the IPUMS-International website: www.ipums.org/international hosted by the Minnesota Population Center. Confidentialized extracts are disseminated on a restricted access basis at no cost to bona fide researchers. Over the next five years, from the microdata already entrusted by National Statistical Office-owners, the database will encompass more than 80 percent of the world's population (85 countries, ~100 additional datasets) with priority given to samples from the 2010 round of censuses. A profile of the most frequently used samples and variables is described from 64,248 requests for microdata extracts. The development of privacy protection standards by National Statistical Offices, international organizations and academic experts is fundamental to eliciting world-wide cooperation and, thus, to the success of the IPUMS initiative. This paper summarizes the legal, administrative and technical underpinnings of the project, including statistical disclosure controls, as well as the conclusions of a lengthy on-site review by the former Australian Statistician, Mr. Dennis Trewin. © 2010 Springer-Verlag Berlin Heidelberg.
PubMed | Minnesota Population Center
Type: Journal Article | Journal: Chinese journal of sociology | Year: 2015
IPUMS-International (www.ipums.org/international) disseminates harmonized census microdata for more than 80 countries at no cost, although access is restricted to bona fide researchers and students who agree to the stringent conditions-of-use license. Currently over 270 samples are available, totalling more than 600 million person records. Each year 15-20 additional samples are released, as more countries cooperate with the IPUMS initiative and the integration of 2010 round census samples is completed. With so much microdata so readily available, questions of data quality naturally arise. This paper focusses on statistical coherence over time for a single concept, primary schooling completed. From an analysis of the percentage completing primary schooling by birth year for pairs of samples for thirteen Asia-Pacific countries, we find outstanding coherence for four (China, Mongolia, Vietnam, and Indonesia), with mean differences of less than 0.5 percentage points, regression coefficients (b) ranging from 0.93 to 1.07 and R
PubMed | Minnesota Population Center
Type: Journal Article | Journal: Historical methods | Year: 2012
The Minnesota Population Center (MPC) has released linked datasets through its NAPP and IPUMS projects, making them readily accessible to researchers. Prior to the availability of complete count census microdata from the MPC, researchers applied various forms of record-linking software. This essay describes the techniques used in the MPC's linking program and briefly compares this technique with those used by other researchers. The key feature of the MPC linking method is the construction of cumulative name similarity scores, based on approximately 2.5 billion record comparisons; we also use support vector machines to classify potential links. This article explains modifications made for the final linked datasets and includes a discussion of the role of weighting variables when using linked data.
PubMed | University Pompeu Fabra and Minnesota Population Center
Type: Journal Article | Journal: Review of economics of the household | Year: 2016
In the context of dramatic changes in family organization, this research analyzes time shared with the family (partner and children) among couples with young children in Spain. The main purpose of the paper is to analyze the differences in the roles of mothers and fathers in dual-earner and male-breadwinner couples. For this purpose, we use information derived from the question on with whom each activity is done, which is included in the enumeration form of the Spanish Time Use Survey 2009-2010. The availability of time-use diaries for all the members of a household allows the use of the couple as a unit of analysis. The descriptive and multivariate results show that mothers spend more time with children than fathers do and that the employment-status variables are the strongest determinants. Gender-balanced couples show smaller differences in the time that fathers and mothers spend on activities with their children. However, the differences remain large, and mothers are still the main caregivers in the household. These findings apply to a specific context characterized by weak policies for balancing family and work and by the persistence of a division of roles within the couple that resembles the traditional model, especially in treating mothers as the main caregivers.
PubMed | Minnesota Population Center
Type: Journal Article | Journal: Demography | Year: 2013
In many different fields, social scientists desire to understand temporal variation associated with age, time period, and cohort membership. Among methods proposed to address the identification problem in age-period-cohort analysis, the intrinsic estimator (IE) is reputed to impose few assumptions and to yield good estimates of the independent effects of age, period, and cohort groups. This article assesses the validity and application scope of IE theoretically and illustrates its properties with simulations. It shows that IE implicitly assumes a constraint on the linear age, period, and cohort effects. This constraint not only depends on the number of age, period, and cohort categories but also has nontrivial implications for estimation. Because this assumption is extremely difficult, if not impossible, to verify in empirical research, IE cannot and should not be used to estimate age, period, and cohort effects.