IPUMS Terra Research Examples

The following examples illustrate the types of research that could be facilitated by the IPUMS Terra data access system. IPUMS Terra can significantly reduce the amount of time scientists working on human-environment issues need to spend collecting, processing, and integrating data from a variety of sources.

  • Scenario 1: Explaining mortality and health outcomes at the district level in Ghana, Malawi, and Tanzania
  • Scenario 2: Modeling deforestation and agriculture in the Yucatan Peninsula of Mexico
  • Scenario 3: Reconstructing Water History to Inform a Sustainable Water Future
  • Scenario 4: Exploring the Links Between Agricultural Trends, Deforestation, and Community Composition at the District Level in the Brazilian Amazon
  • Example Publications

Scenario 1:
Explaining mortality and health outcomes at the district level in Ghana, Malawi, and Tanzania

Hypothesis/research question:

Health outcomes depend on local variations in the physical environment.

Research objective:

Demonstrate enhanced understanding of health outcomes related to environmental conditions at sub-national levels. Both physical and social environmental effects are crucial to the study as inequalities in health outcomes can be derived from disparities in access to natural resources as well as gaps in socioeconomic conditions.

Data required:

Variables | Source | Data structure
Temperature | WorldClim | Raster
Rainfall | WorldClim | Raster
Elevation | ASTER-GDEM | Raster
Infant and child mortality | IPUMS International | Microdata
Age structure | IPUMS International | Microdata
Literacy rate | IPUMS International | Microdata
Household access to toilet | IPUMS International | Microdata

Pre-IPUMS Terra data processing steps:

  • Obtain district boundaries from SALB, GADM, Malawi National Statistical Office
  • Obtain temperature and rainfall data from WorldClim
  • Obtain elevation data from ASTER-GDEM
  • Obtain population data from IPUMS
  • Match district boundaries to IPUMS geography variables
  • Use ArcGIS to process environmental datasets
    • Extract relevant portions
    • Remove no data points and other artifacts
    • Calculate mean rainfall, temperature, and elevation by district
  • Use a statistical package to summarize IPUMS population data by district
  • Use ArcGIS to join all variables to boundary shapefiles
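As a rough illustration of the "remove no data points" and "calculate mean ... by district" steps above, the following Python sketch computes a zonal mean with NumPy. It is not how ArcGIS or IPUMS Terra performs the operation. The `zonal_mean` function, the `-9999.0` NoData marker, and the tiny rainfall and district grids are all assumptions made up for this example; real rasters declare their own NoData value and would first have district boundaries rasterized to the same grid.

```python
import numpy as np

NODATA = -9999.0  # assumed NoData marker for this sketch only


def zonal_mean(values, zones, nodata=NODATA):
    """Return {zone id: mean of values in that zone}, skipping NoData cells."""
    values = np.asarray(values, dtype=float)
    zones = np.asarray(zones)
    valid = values != nodata  # drop NoData cells before summarizing
    ids, inverse = np.unique(zones[valid], return_inverse=True)
    sums = np.bincount(inverse, weights=values[valid])
    counts = np.bincount(inverse)
    return {int(z): float(s / c) for z, s, c in zip(ids, sums, counts)}


# Hypothetical 2 x 3 rainfall raster and a district-id raster on the same grid.
rainfall = np.array([[100.0, 120.0, NODATA],
                     [80.0, 100.0, 60.0]])
districts = np.array([[1, 1, 2],
                      [1, 2, 2]])

means = zonal_mean(rainfall, districts)
print(means)
```

The resulting dictionary is keyed by district id, so it can be joined to district-level population summaries on the same key.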

Obtaining data through IPUMS Terra:

  • Choose area-level output
  • Select desired environmental and population variables
  • Select operations to summarize raster data to district level
  • Obtain extract of variables summarized by district

Approximate time saved by using IPUMS Terra:

25 hours

Scenario 2:
Modeling deforestation and agriculture in the Yucatan Peninsula of Mexico

Hypothesis/research question:

Where and why do farmers cut down tropical rainforest to plant agricultural crops?

Research objective:

Establish the social and environmental factors in farm-level decision making. The land change science community has identified a suite of these factors, and one of its research goals is to find out how they play out in specific places, times, and contexts.

Data required:

Variables | Source | Data structure
Temperature | Interpolated stations | Raster
Rainfall | Interpolated stations | Raster
Elevation | Interpolated topographic | Vector and Raster
Socioeconomic/Demographic | IPUMS | Microdata
Land use/land cover | Interpreted RS or GLC | Raster

Pre-IPUMS Terra data processing steps:

  • Obtain district boundaries from SALB, GADM, INEGI (Mexican census)
  • Obtain temperature and rainfall data from a variety of sources, interpolate, and validate
  • Obtain elevation data from a variety of sources, interpolate, and validate
  • Obtain raw remotely sensed data, classify, and validate
  • Obtain population data from IPUMS
  • Match district boundaries to IPUMS geography variables
  • Use ArcGIS and Idrisi to process environmental datasets
    • Extract relevant portions
    • Remove no data points and other artifacts
    • Calculate mean rainfall, temperature, and elevation by district
    • Further validate and verify interpolations via cross-comparisons among data
  • Use a statistical package to summarize IPUMS population data by district
  • Use ArcGIS to join all variables to boundary shapefiles
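The "interpolate, and validate" steps above do not name a method. Purely as an illustration, the Python sketch below uses inverse distance weighting (IDW) for interpolation and leave-one-out cross-validation as a validation check. Both techniques are assumptions chosen for this example, as are the function names and the four made-up station records; the researchers in this scenario may well use other methods.

```python
import math


def idw(x, y, stations, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    `stations` is a list of (station_x, station_y, value) tuples.
    """
    num = den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # the point coincides with a station
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def leave_one_out_rmse(stations, power=2.0):
    """Predict each station from all the others and return the RMSE."""
    errors = []
    for i, (sx, sy, value) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        errors.append(idw(sx, sy, others, power) - value)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical station records: (x, y, mean temperature).
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0),
            (0.0, 2.0, 30.0), (2.0, 2.0, 40.0)]

center = idw(1.0, 1.0, stations)     # estimate at the middle of the square
rmse = leave_one_out_rmse(stations)  # cross-validation error
print(center, rmse)
```

A large leave-one-out error relative to the variable's range would suggest that the station network is too sparse for the chosen method.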

Obtaining data through IPUMS Terra:

  • Choose area-level output
  • Select desired environmental and population variables
  • Select operations to summarize raster data to municipalities
  • Obtain extract of variables summarized by municipality

Approximate time saved by using IPUMS Terra:

100 hours base-case scenario; 200 additional hours if GLC data was substituted for remotely sensed data

Scenario 3:
Reconstructing Water History to Inform a Sustainable Water Future

Goal:

Identify common pathways of sustainability success, and of failure, in integrated human-natural water resource systems.

Research objective:

Combine 60 years of model-based water history for more than 11,000 sub-basins across the globe with demographic information indicating the historical development of irrigation and domestic water consumption. Compile an entirely new data set of crop-specific irrigated area and water use during the last 60 years based on land use modeling using LandSHIFT and the global water model WaterGAP3. Conduct regional case studies focusing on emerging conflicts between historical agricultural water use and increasing urban water demand. Generalize observations from the regional case studies based on demographic data.

Data Required:

Variables | Source | Data structure
Irrigated land | GLI | Raster
Crop-based land use | GLI | Raster
Population | IPUMS/GPW | Raster
Labor force in agriculture | IPUMS | Raster/by water basin
Access to piped water | IPUMS | Raster/by water basin
Sewer service | IPUMS | Raster/by water basin
Household structure | IPUMS | Raster/by water basin


Pre-IPUMS Terra data processing steps:

  • Obtain historical weather data and crop-based land use data from various sources
  • Format weather and land use data as input for LandSHIFT and WaterGAP3 models
  • Run LandSHIFT and WaterGAP3 models
  • Process and format output from models
  • Obtain sub-national boundaries from various sources (where possible)
  • Obtain population, water service access, and demographic data from IPUMS
  • Match district boundaries to IPUMS geography variables
  • Determine method to redistribute population and demographic data from administrative units to raster/water basins
  • Implement redistribution
  • Integrate model output and population data by water basin
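One simple way to picture the redistribution and integration steps above is uniform areal weighting: spread each administrative unit's total evenly over its raster cells, then sum the cells that fall in each water basin. The Python sketch below shows only that simplest case. The function names, grids, and population totals are hypothetical, and the uniform assumption is a stand-in for whichever redistribution method a study actually selects.

```python
import numpy as np


def redistribute_uniform(unit_grid, unit_totals):
    """Spread each unit's total evenly over the cells labeled with that unit."""
    unit_grid = np.asarray(unit_grid)
    out = np.zeros(unit_grid.shape, dtype=float)
    for unit_id, total in unit_totals.items():
        mask = unit_grid == unit_id
        n_cells = mask.sum()
        if n_cells:  # skip units that cover no cells on this grid
            out[mask] = total / n_cells
    return out


def sum_by_zone(values, zone_grid):
    """Sum cell values within each zone (here, each water basin)."""
    zone_grid = np.asarray(zone_grid)
    return {int(z): float(values[zone_grid == z].sum())
            for z in np.unique(zone_grid)}


# Hypothetical grids: administrative unit ids and water basin ids per cell.
admin = np.array([[1, 1, 2],
                  [1, 1, 2]])
basins = np.array([[10, 20, 20],
                   [10, 20, 20]])
population = {1: 400.0, 2: 100.0}  # made-up totals per administrative unit

cells = redistribute_uniform(admin, population)
basin_pop = sum_by_zone(cells, basins)
print(basin_pop)
```

Because the redistribution preserves each unit's total, the basin sums add up to the original population; a useful sanity check on any redistribution method.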

Obtaining data through IPUMS Terra:

  • Choose raster output
  • Select weather, irrigation, and crop-based land use variables for models
  • Select population, water service access, and demographic variables
  • Obtain extract of rasters needed for models and for integration with population data
  • Run and process data from models 
  • Integrate model results with rasterized population, water service access, and demographic variables from IPUMS Terra

Approximate time saved by using IPUMS Terra:

Potentially hundreds to thousands of hours. In addition, IPUMS Terra may include sub-national boundary data not available elsewhere, allowing finer scale spatial resolution of population and demographic data.

Scenario 4:
Exploring the Links Between Agricultural Trends, Deforestation, and Community Composition at the District Level in the Brazilian Amazon

Hypothesis/research question:

Deforestation and agricultural trends will correlate with changes in demographics. (Research question: how are agricultural trends and deforestation trends correlated with markers of community composition?)

Research objective:

Characterize the agricultural trends, deforestation trends, and community composition trends over time in districts of the Brazilian Legal Amazon; consider linkages and correlations between those simple descriptive statistics. Exploring the relationship between the environment, agriculture and communities is a critical first step in determining how political interventions might promote or depress human welfare in Amazonia.

Data Required:

Variables | Source | Data structure
Population, Age Structure, Gender, Literacy Rate, Family Size, Marital Status, Income, Occupation (Sector), Labor Force, Migration Status | IPUMS microdata | 1970-2000, decadal
Deforestation Rate | INPE (30-250 m grid cells) | 1988-2011, annual; 2004-present, monthly
Cropland Area, Pasture Area, Cropland Yield (per ha), Cropland Production, Farmland Size | GLI (5 arc-minute grid cells); IBGE (aggregate, by state) | 1700-2007, annual; 2000

Desired Data Structure:

Area-level, summarized by municipio with boundaries harmonized over 1970-2000

Pre-IPUMS Terra data processing steps:

  • Obtain deforestation rate, agricultural data, and farm size data from INPE, GLI and IBGE/SIDRA
  • Obtain municipio boundaries for Brazil
  • Digitize and process municipio boundaries
  • Harmonize municipio boundaries over time
  • Match municipio boundaries to IPUMS geography variables
  • Use a statistical package to summarize IPUMS population data by municipio
  • Use ArcGIS to process environmental datasets
    • Extract relevant portions
    • Remove no data points and other artifacts
    • Summarize raster variables to harmonized municipio boundaries
  • Use ArcGIS to join all variables to boundary shapefiles
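The boundary harmonization step above can be pictured as a crosswalk: each census-year municipio code maps to a harmonized unit whose outline is stable across years, and counts are summed within those units. The Python sketch below assumes the simplest case, in which every municipio falls entirely inside one harmonized unit. All codes, crosswalks, and population figures are invented for illustration and are not IPUMS or IBGE data.

```python
from collections import defaultdict


def harmonize_counts(counts, crosswalk):
    """Sum per-municipio counts into the harmonized units given by `crosswalk`."""
    totals = defaultdict(float)
    for code, count in counts.items():
        totals[crosswalk[code]] += count
    return dict(totals)


# Hypothetical case: municipio "A" splits into "A1" and "A2" between censuses,
# so both years are reported for the stable harmonized unit "H1".
crosswalk_1970 = {"A": "H1", "B": "H2"}
crosswalk_2000 = {"A1": "H1", "A2": "H1", "B": "H2"}

pop_1970 = {"A": 1000, "B": 500}
pop_2000 = {"A1": 900, "A2": 600, "B": 700}

h1970 = harmonize_counts(pop_1970, crosswalk_1970)
h2000 = harmonize_counts(pop_2000, crosswalk_2000)

# Change over time is only comparable on the harmonized units.
change = {unit: h2000[unit] - h1970[unit] for unit in h1970}
print(change)
```

Boundary changes that cut across existing units would need area- or population-based weights rather than a one-to-one lookup.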

Obtaining data through IPUMS Terra:

  • Choose area-level output
  • Select desired environmental and population variables
  • Choose harmonized municipios as the geographic level
  • Select operations to summarize data to municipios
  • Obtain extract of boundaries and variables summarized by municipios

Approximate time saved by using IPUMS Terra:

On the order of hundreds of hours, primarily needed for digitizing, processing, and harmonizing boundaries and linking to IPUMS microdata.

Example Publications

  1. Kugler, Tracy, David C Van Riper, Steven M Manson, David A Haynes II, Joshua Donato, Katie Stinebaugh. (2015). Terra Populus: Workflows for Integrating and Harmonizing Geospatial Population and Environmental Data. Journal of Map & Geography Libraries 11(2):180-206. DOI: 10.1080/15420353.2015.1036484
  2. Nawrotzki, R. J., Riosmena, F., Hunter, L. M., Runfola, D. M. (2015). Undocumented migration in response to climate change. International Journal of Population Studies 1(1), 60-74. DOI: 10.18063/IJPS.2015.01.004
  3. Nawrotzki, R. J., Hunter, L. M., Runfola, D. M., Riosmena, F. (2015). Climate change as migration driver from rural and urban Mexico. Environmental Research Letters 10(11), 114023. DOI: 10.1088/1748-9326/10/11/114023
  4. Nawrotzki, R. J., Riosmena, F., Hunter, L. M., & Runfola, D. M. (2015). Amplification or suppression: Social networks and the climate change – migration association in rural Mexico. Global Environmental Change 35, 463-474. DOI: 10.1016/j.gloenvcha.2015.09.002
  5. Ruggles, Steven, Tracy A. Kugler, Catherine A. Fitch, David C. Van Riper. (2015). Terra Populus: Integrated Data on Population and Environment. Data Mining Workshop (ICDMW), 2015 IEEE 15th International Conference on. 222-231. DOI: 10.1109/ICDMW.2015.204
  6. Essawy, Bakinam T., Jonathan L. Goodall, Hao Xu, Arcot Rajasekar, James D. Myers, Tracy Kugler, Mirza M. Billah, Mary C. Whitton, Reagan W. Moore. (2016). Server-Side Workflow Execution using Data Grid Technology for Reproducible Analyses of Data-Intensive Hydrologic Systems. Earth and Space Science. 3. DOI: 10.1002/2015EA000139

Supported By

National Science Foundation University of Minnesota