Analyzing Bangalore - Population (Part 1 - Choropleth)
Nov 18, 2017
5 minutes read

This is the start of a series of articles analyzing data specific to Bangalore (or Bengaluru, as it’s now called). I plan to look at different facets of the city in detail, starting with population. We will visualize population-related data in three different representations, of which this is Part 1 (see here for Part 2 and Part 3), each of which I hope will provide unique insights into the city’s past, present and future.

A word about the data sources before we get started: my primary source of information was the website of BBMP (Bruhat Bengaluru Mahanagara Palike, the administrative body responsible for the civic and infrastructural assets of Bangalore). One of my internet search rabbit holes landed me on this pdf of a table somewhere inside the BBMP website. It contained some fascinating data for each of the 198 BBMP wards (locations within Bangalore city), and this is what got me started down this road.

All the visualization and data analysis was done using R. If you’re interested in the details, scroll down to the end for an explanation of how the code works.

This is an animated choropleth time series map of Bangalore. A choropleth map is a thematic map in which areas are shaded in proportion to the measurement of the statistical variable being displayed on the map - in this case that variable is population. The animation shows the change in population over time from 2001 to 2021. The population numbers used for 2001 and 2011 are based on the national census data; the population for the remaining years has been interpolated (2002 to 2010) or extrapolated (2012 to 2021) using the splinefun function in R.

Let’s talk about the key insights we were able to mine from this visualization.

Key insights

  • Population has grown rapidly over the last couple of decades.
  • The bulk of the population growth happened in the outskirts (or what used to be the outskirts) of the city; the central areas didn’t really grow much at all. This likely indicates that immigrants settled down in areas that weren’t traditionally places where Bangaloreans lived.
  • Horamavu, Bellandur and Begur are now the big population centers of the city, and things are expected to stay that way for some time to come.

The next census, scheduled for 2021, should provide interesting data and would help measure the accuracy of my predictions. I wonder if town planners and civic authorities really consider such forecasts when making decisions; they obviously have the data and probably have all the information they need to make the right decisions. Yet I see, for example, that we do not have any metro rail connectivity to any of the most populous locations in Bangalore. Namma Metro anachronistically continues to link places that were the centers of growth two decades ago, while ignoring the locations that have the most people living in them.

How does the code work?

First we include all the libraries that we’ll be using: rgdal to read the kml file for Bangalore wards and leaflet to do the actual choropleth plotting. If we just wanted an interactive choropleth, that would have been sufficient, but here we’re trying to create an animated time series, so we’ll be using the htmlwidgets and webshot libraries as well.

library(rgdal)
library(leaflet)
library(htmlwidgets)
library(webshot)

Next up, we read the kml using readOGR and also read the BBMP ward data. I did some pre-processing on the latter: I used Tabula to extract the data from the BBMP pdf linked at the bottom of the article and had to clean it up a little. We also do some basic post-processing to remove special characters from numeric data.

blr <- readOGR("data", "BBMP-Wards")
blr_wards <- read.csv("data/blr_wards.csv", sep=",", dec=".")

# Strip thousands separators and percent signs, then convert the columns to numeric
blr_wards[,18] <- as.numeric(gsub(",", "", blr_wards[,18]))
blr_wards[,19] <- as.numeric(gsub("%", "", blr_wards[,19]))
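
As a small illustration of what this clean-up does, here is a sketch on a few made-up strings (the values below are hypothetical, not taken from the BBMP file). It uses the same gsub and as.numeric calls, and adds a check I would suggest for catching anything that still fails to convert.

```r
# Hypothetical raw values, as they might look after extracting a PDF table
raw_population <- c("21,345", "1,02,450")
raw_growth <- c("12.5%", "145.3%")

# Remove the thousands separators and percent signs, then convert to numeric
population <- as.numeric(gsub(",", "", raw_population))
growth_pct <- as.numeric(gsub("%", "", raw_growth))

# as.numeric() returns NA (with a warning) for anything it cannot parse,
# so an NA check is a cheap way to spot leftover stray characters
stopifnot(!anyNA(population), !anyNA(growth_pct))
```

If the check fails on real data, inspecting the rows where the converted value is NA usually points straight at the offending characters.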

The BBMP data contains the population of each ward in 2011 and a decadal growth rate, which allows us to calculate the population of each ward in 2001. I use splinefun to interpolate the population for the years 2002 to 2010 and also to predict the population for the years 2012 up to 2021.

# Use splinefun to interpolate and extrapolate the population of each ward from 2001 to 2021
# As reference we use the population from 2011 and the decadal growth rate of each ward from 2001-2011
# Comparing the results of this formula to the United Nations estimates shows that they align very closely
for (ward in seq_len(nrow(blr_wards))) {
  # 2001 population, back-calculated from the 2011 value (column 3) and the growth rate in % (column 19)
  pop2001 <- as.integer((100 * blr_wards[ward, 3]) / (blr_wards[ward, 19] + 100))
  # Eleven evenly spaced reference points between the 2001 and 2011 populations
  population <- seq(pop2001, blr_wards[ward, 3], length.out = 11)
  # Fit the spline once per ward, then evaluate it for every year
  func <- splinefun(x = 2001:2011, y = population, method = "fmm", ties = mean)
  for (j in 1:21) {
    blr_wards[ward, j + 19] <- as.integer(func(2000 + j))
  }
}
# Name the new columns pop2001 ... pop2021
names(blr_wards)[20:40] <- paste0("pop", 2001:2021)
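
To make the arithmetic concrete, here is a minimal sketch for a single made-up ward. The 2011 population of 80,000 and the 60% growth rate are illustrative values, not BBMP figures. The 2001 population is back-calculated as 100 × pop2011 / (growth + 100), and the same splinefun call then fills in the other years. One thing worth noting: because the eleven reference points are evenly spaced, the fitted spline should come out as (very nearly) a straight line, so the forecast beyond 2011 effectively continues the 2001-2011 trend.

```r
# Hypothetical ward: 2011 population and 2001-2011 decadal growth rate (%)
pop2011 <- 80000
growth_pct <- 60

# Back-calculate the 2001 population from the growth rate
pop2001 <- as.integer((100 * pop2011) / (growth_pct + 100))  # 50000

# Eleven evenly spaced reference points for 2001..2011
reference <- seq(pop2001, pop2011, length.out = 11)

# Fit the spline on 2001..2011 and evaluate it for 2001..2021;
# years after 2011 are extrapolated rather than interpolated
f <- splinefun(x = 2001:2011, y = reference, method = "fmm")
estimates <- as.integer(round(f(2001:2021)))
names(estimates) <- 2001:2021
```

Here estimates["2001"] and estimates["2011"] reproduce the two reference values, and estimates["2021"] is the forecast. I round before converting because as.integer truncates, which can shave a person off a value that is off by a tiny floating point error.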

The final step here is to plot the leaflet chart, which has layers to add the map, polygons depicting the wards, the legend and other features for interactive operation like popups, which we won’t be using in our animated version. I couldn’t find a straightforward way to create an animated gif in R, so I’m looping through the years one by one, creating a snapshot of the html and saving it as png. Once the loop is done, I delete the html files and supporting folders that were created during the process and use the convert command (from ImageMagick) to combine the png image files into one gif.

colorrange <- c(15000,30000,50000,80000,125000,163000)
palettePopulation <- colorNumeric("Reds",domain = colorrange)

for (year in 2001:2021) {
  colname <- paste0("pop", year)
  labelname <- paste("Bangalore population", year)
  popupPopulation <- paste0(blr_wards$WardName, ":", blr_wards[, colname])
  html_name <- paste0(colname, ".html")
  png_absolute_name <- file.path("images", paste0(colname, ".png"))
  
  img <- leaflet() %>% 
    addProviderTiles("Esri.WorldGrayCanvas", 
                     options = tileOptions(minZoom = 10, maxZoom = 16)) %>%
    addPolygons(data = blr, fillColor = ~palettePopulation(blr_wards[,colname]),
                fillOpacity = 0.6, color = "darkgrey", weight = 1.5, 
                popup = popupPopulation, group = "Population") %>%
    addLegend(position = 'topright', 
              pal = palettePopulation,
              values = colorrange,
              opacity = 0.6,    
              title = "Population") %>%
    addLabelOnlyMarkers(
      lng = 77.47, lat = 13.18,
      label = labelname,
      labelOptions = labelOptions(noHide = TRUE, textsize = "20px"))
  
  saveWidget(img, html_name, selfcontained = FALSE)
  webshot(html_name, file=png_absolute_name, cliprect = 'viewport')
}

system("rm -rf pop*.html")
system("rm -rf pop*_files")
system("convert -delay 50 images/pop*.png -loop 0 images/blr_population.gif")
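
The rm calls above assume a Unix-like shell. If you want the clean-up to work on other platforms too, a possible alternative is to do it with base R, as in the sketch below. cleanup_frames is a hypothetical helper name of my own, not part of the original script; it only relies on list.files and unlink, and assumes the intermediate files follow the pop*.html and pop*_files naming used in the loop.

```r
# Sketch: remove the intermediate html files and their supporting folders
# using base R only. cleanup_frames() is a hypothetical helper.
cleanup_frames <- function(dir = ".") {
  # Files such as pop2001.html written by saveWidget()
  html_files <- list.files(dir, pattern = "^pop.*\\.html$", full.names = TRUE)
  # Folders such as pop2001_files created because selfcontained = FALSE
  asset_dirs <- list.files(dir, pattern = "^pop.*_files$", full.names = TRUE)

  unlink(html_files)
  unlink(asset_dirs, recursive = TRUE)

  # Return the removed paths invisibly, which is handy for logging
  invisible(c(html_files, asset_dirs))
}
```

Calling cleanup_frames() after the loop would take the place of the two rm lines; the convert step still needs ImageMagick to be installed.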

Data sources

  • BBMP ward data
  • United Nations Top 30 Urban Agglomerations - Population and Growth Rate

References

http://journocode.com

Suggestions and corrections are always welcome; please use the comments section.

