Global Data Hound

All about global data sets


Gridded Global Population Datasets: LandScan and GPW

April 22nd, 2009 · 1 Comment · Data Explorations, TerraViva Data

I have always found it both remarkable and puzzling that there are only two widely accepted, geospatially explicit, gridded data sets describing the current* population of the entire globe. Why only two? It’s not as if the population of the world is a niche subject! Knowing where people live is essential to understanding almost any phenomenon that has global scope and is geospatially heterogeneous (i.e., varies spatially).

Unfortunately, the ubiquitous national-scale map, with every country color-coded to correspond to a single indicator value, is the default solution to many global mapping problems. I say “unfortunately” because national-scale maps can be as misleading as they are ubiquitous, for the simple reason that most changes have varying impacts on people in different locations. For example, national-scale maps that are based on per capita figures can be grossly misleading when they show a densely populated area like a megacity with the same data value as a deserted wasteland.

natl-scale-pop.jpg

If you want to go beyond the cartoon-like level of the national-scale map, and really understand what is happening to people and places on a global scale, you have to involve spatially explicit population data. Yet, if you need a gridded global population data set that provides high geospatial resolution, LandScan 2007 from Oak Ridge National Laboratory and Gridded Population of the World v 3.0 from the NASA-funded Socioeconomic Data and Applications Center at the Center for International Earth Science Information Network are the only two options.

sedac-pop-density.jpg

What’s more, many, if not most, users of gridded global population data are only aware of the existence of one or the other of these global population data sets, but not both. People seem to use the first data set they find, or the first one that is readily available. Even rarer than seeing both data sets mentioned is seeing an analysis that shows the contrasting implications of using both data sets, or a methodological discussion that explicitly articulates the competing considerations in choosing one over the other. To encourage such conscious analysis is the purpose of this post.

The fundamental insight is that LandScan and GPW are different: they have somewhat different objectives, make different assumptions, use different methodologies, and are designed to measure two different indicators. It would not be fair to say that one is “better” than the other; it’s more a question of which tool is best for the job at hand.

The LandScan dataset estimates “ambient” population density, that is, where people are likely to be at noon. LandScan pixels are 30 arc-seconds wide (about 1 km at the equator). LandScan is updated every year on a one-year lag (thus, LandScan 2007 is available now).

The CIESIN GPW dataset estimates residential population density, that is, where people live and are likely to be at night. GPW pixels are 2.5 arc-minutes wide, or 5x wider than the LandScan pixels (and therefore about 25x larger in area). GPW is updated every five years.
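To make the two pixel sizes concrete, here is a small back-of-the-envelope sketch in Python. It treats the Earth as a sphere with an equatorial circumference of roughly 40,075 km, which is my assumption for illustration; the function and constant names are made up for this example and do not come from either data set's documentation.

```python
import math

# Approximate equatorial circumference of the Earth in km (assumed value).
EQUATORIAL_CIRCUMFERENCE_KM = 40075.017


def cell_width_km(arc_seconds, latitude_deg=0.0):
    """East-west width of a grid cell of the given angular size.

    Spherical approximation: the width shrinks with the cosine of latitude.
    """
    km_per_degree = EQUATORIAL_CIRCUMFERENCE_KM / 360.0
    degrees = arc_seconds / 3600.0
    return degrees * km_per_degree * math.cos(math.radians(latitude_deg))


LANDSCAN_ARCSEC = 30.0        # 30 arc-seconds
GPW_ARCSEC = 2.5 * 60.0       # 2.5 arc-minutes = 150 arc-seconds

landscan_km = cell_width_km(LANDSCAN_ARCSEC)   # a little under 1 km
gpw_km = cell_width_km(GPW_ARCSEC)             # a little over 4.6 km

width_ratio = GPW_ARCSEC / LANDSCAN_ARCSEC     # 5x wider
area_ratio = width_ratio ** 2                  # 25x the area

print(f"LandScan cell: {landscan_km:.2f} km wide at the equator")
print(f"GPW cell:      {gpw_km:.2f} km wide at the equator")
print(f"GPW cells are {width_ratio:.0f}x wider and {area_ratio:.0f}x larger in area")
```

Note that both cell sizes are angular, so the east-west width in kilometers shrinks toward the poles: at 60 degrees latitude, each cell is about half as wide as at the equator.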

LandScan uses population reports from official publications, then disaggregates the population according to an algorithm that is based mostly on land use, topography and transportation information. The algorithms used are regionally tailored and reflect local culture. LandScan moves people within administrative reporting units based on suitability models driven by terrain features, infrastructure density, and the like (e.g., people like to be near but not in water bodies, tend to avoid high-slope areas with northern exposure, and spend time on or near roads). So people are not placed in lakes even though the population may be reported in an administrative unit that includes lakes, and population is dispersed much more sparsely over barren land. Locations of central places are noted so that greater densities appear at those locations. The net effect is that LandScan tends to put people in cities and along roads.
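The general idea of moving a reported total around inside an administrative unit according to suitability can be sketched in a few lines of Python. To be clear, this is not LandScan's actual algorithm, which is regionally tailored and far more elaborate; the weights, the cell labels, and the `disaggregate` function are all hypothetical and only illustrate proportional allocation.

```python
def disaggregate(total_population, weights):
    """Split a reported total across cells in proportion to suitability weights."""
    weight_sum = sum(weights)
    if weight_sum == 0:
        raise ValueError("at least one cell must have a non-zero weight")
    return [total_population * w / weight_sum for w in weights]


# One hypothetical administrative unit made of six cells.
# The weights are invented for illustration, not taken from any real model.
cell_labels = ["lake", "steep slope", "barren", "rural", "roadside", "town center"]
suitability = [0.0, 0.5, 1.0, 4.0, 10.0, 34.5]

reported_total = 10_000
cells = disaggregate(reported_total, suitability)

for label, people in zip(cell_labels, cells):
    print(f"{label:12s} {people:8.0f}")
```

The lake gets nobody, the barren cell gets very few people, and most of the reported total ends up in the town center and along the road, while the unit's total is preserved.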

ornl-2007.jpg

CIESIN opts for researching the most highly resolved census data possible. For the US and other developed countries the maps are more detailed because the reporting is more detailed. As a result, they usually have more polygons per country than LandScan does. Their methods do take into account some land use considerations: people are excluded from water bodies, and reported urban population is placed within delineated urban areas (and vice versa for rural population). After that, they spread the people evenly across the pixels of each administrative unit.
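By contrast, the even-spreading step described here can be sketched as follows. Again, this is a simplified illustration under my own assumptions, not CIESIN's production method: the cell labels, the population figures, and the `spread_evenly` function are hypothetical.

```python
def spread_evenly(urban_pop, rural_pop, cell_types):
    """Spread reported urban and rural totals evenly over matching cells.

    cell_types is a list of "water", "urban", or "rural" labels, one per cell.
    Water cells receive no population.
    """
    n_urban = cell_types.count("urban")
    n_rural = cell_types.count("rural")
    if urban_pop > 0 and n_urban == 0:
        raise ValueError("urban population reported but no urban cells")
    if rural_pop > 0 and n_rural == 0:
        raise ValueError("rural population reported but no rural cells")
    per_cell = {
        "water": 0.0,
        "urban": urban_pop / n_urban if n_urban else 0.0,
        "rural": rural_pop / n_rural if n_rural else 0.0,
    }
    return [per_cell[t] for t in cell_types]


# One hypothetical administrative unit: a lake, three rural cells, two urban cells.
unit_cells = ["water", "rural", "rural", "rural", "urban", "urban"]
grid = spread_evenly(urban_pop=8_000, rural_pop=1_500, cell_types=unit_cells)
print(grid)
```

Within each class every cell gets the same value, so the detail in the final map comes from how small the administrative units are rather than from any modeling inside them.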

admin-spread.jpg

Obviously, the nature of the problem at hand is important in choosing whether you care more about ambient or residential population density, or whether you must be concerned with both. Take the example of vulnerability to natural disasters. For most natural disasters (perhaps excepting tornadoes), it is impossible to anticipate whether the disaster will occur at night or during the day, so an estimate of the number of people vulnerable at a single spot really ought to be a range, not a single value. Sensitivity analysis is a useful tool to assess the different implications of these values.
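As a minimal sketch of what reporting a range might look like, the Python snippet below takes an ambient (daytime) and a residential (nighttime) estimate for the same spot and returns the low value, the high value, and the spread relative to their midpoint. The location and both numbers are invented for illustration, and `exposure_range` is a hypothetical helper, not part of either data set.

```python
def exposure_range(ambient, residential):
    """Return (low, high, relative_spread) for two estimates of the same spot.

    relative_spread is the gap between the estimates divided by their midpoint.
    """
    low, high = sorted((ambient, residential))
    midpoint = (low + high) / 2
    relative_spread = (high - low) / midpoint if midpoint else 0.0
    return low, high, relative_spread


# Hypothetical downtown cell: busy at noon, nearly empty at night.
low, high, spread = exposure_range(ambient=12_000, residential=3_000)
print(f"People exposed: between {low} and {high} (spread {spread:.0%} of midpoint)")
```

A large relative spread is a signal that the choice of data set materially changes the answer for that location, which is exactly where a sensitivity analysis earns its keep.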

johann.jpg

