Natural Earth Data and R in ggplot2

I really like the data layers available from naturalearthdata.  They take a simple map and make it look beautiful, which I suppose is the point.

Figure 1.  Here's an amazing background for your maps using Natural Earth Data and ggplot2

Figure 1. Here’s an amazing background for your maps using Natural Earth Data and ggplot2

One thing that I’ve found frustrating though is the need to constantly go back and forth between R and ArcGIS/QGIS/GRASS to make figures with spatial data if I want them to look really pretty.  In preparation for AGU2013 this year (come see our poster) I decided I’d finally solve this annoying problem by figuring out how to get a pretty base raster layer into ggplot, and then superimpose the lakes and streams that serve to make a nice map look pretty.

Here’s the solution to make this fine figure:

library(raster)
library(rgdal)
library(ggplot2)
library(reshape2)
library(plyr)

#  Assuming you have a path 'Maps' that you store your spatial files in.  This
#  is all downloaded from <a href="http://www.naturalearthdata.com/downloads/">http://www.naturalearthdata.com/downloads/</a> using the
#  1:50m "Medium" scale data.

nat.earth <- stack('./Maps/NE2_50M_SR_W/NE2_50M_SR_W.tif')

ne_lakes <- readOGR('./Maps/NaturalEarth/ne_50m_lakes.shp',
                    'ne_50m_lakes')

ne_rivers <- readOGR('./Maps/NaturalEarth/ne_50m_rivers_lake_centerlines.shp',
                     'ne_50m_rivers_lake_centerlines')

ne_coast <- readOGR('./Maps/NaturalEarth/ne_50m_coastline.shp',
                     'ne_50m_coastline')

#  I have a domain I'm interested in, but there's no reason you can't define something else:
quick.subset <- function(x, longlat){

# longlat should be a vector of four values: c(xmin, xmax, ymin, ymax)
  x@data$id <- rownames(x@data)

  x.f = fortify(x, region="id")
  x.join = join(x.f, x@data, by="id")

  x.subset <- subset(x.join, x.join$long > longlat[1] & x.join$long < longlat[2] &
                           x.join$lat > longlat[3] & x.join$lat < longlat[4])

  x.subset
}

domain <- c(-98.6, -66.1, 36.5, 49.7)
lakes.subset <- quick.subset(ne_lakes, domain)
river.subset <- quick.subset(ne_rivers, domain)
coast.subset <- quick.subset(ne_coast, domain)

nat.crop <- crop(nat.earth, y=extent(domain))

rast.table <- data.frame(xyFromCell(nat.crop, 1:ncell(nat.crop)),
                         getValues(nat.crop/255))

rast.table$rgb <- with(rast.table, rgb(NE2_50M_SR_W.1,
                                       NE2_50M_SR_W.2,
                                       NE2_50M_SR_W.3,
                                      1))
# et voila!

ggplot(data = rast.table, aes(x = x, y = y)) +
  geom_tile(fill = rast.table$rgb) +
  geom_polygon(data=lakes.subset, aes(x = long, y = lat, group = group), fill = '#ADD8E6') +
  scale_alpha_discrete(range=c(1,0)) +
  geom_path(data=lakes.subset, aes(x = long, y = lat, group = group), color = 'blue') +
  geom_path(data=river.subset, aes(x = long, y = lat, group = group), color = 'blue') +
  geom_path(data=coast.subset, aes(x = long, y = lat, group = group), color = 'blue') +
  scale_x_continuous(expand=c(0,0)) +
  scale_y_continuous(expand=c(0,0)) +
  xlab('') + ylab('')

Some issues still remain for me. For one, there’s a problem with holes in this plotting. You’ll notice that Manitoulin Island and half of the Keweenaw Peninsula are basically lake (the light blue background) outlined in blue, and not the textured raster. I’m not sure exactly how to fix that, there’s plenty of discussion about how to fix holes in polygons using ggplot, but I haven’t worked out the solution yet.

 

If you have a solution I’m all ears.

About these ads

4 thoughts on “Natural Earth Data and R in ggplot2

  1. Pingback: “Lies, damn lies, and statistics” (and math) | ScienceBorealis.ca Blog

  2. I am really excited about this post! But it’s not working for me. I also got my data from Natural Earth: no .tif. I went back & looked by following your link & there isn’t a .tif file! Just your typical dbf, prj, shp, shx. I just need a simple map of the US cut off at the 102 meridian, so I figured I could just read in the ne_50m_land.shp file. I got as far as”
    rast.table <- data.frame …
    And it gives me error:
    Error in (function (classes, fdef, mtable) :
    unable to find an inherited method for function ‘xmin’ for signature ‘"SpatialPolygonsDataFrame"’
    Even with your exact code (except .shp , no .tif, but I don't think that matters at this point).
    I know this isn't StackOverload, but I wanted to let you know. There's a small chance your code needs to be updated, though I, R newbie, probably unwittingly messed it up somewhere.

    • If you’re not going to use a raster base layer then you don’t need the stack command (where I reference the ‘tif’ file). That’s why the ‘rast.table’ command isn’t working, because it’s looking for a raster stack. When I look here:

      http://www.naturalearthdata.com/downloads/

      I see links to the raster files, maybe you’re at too high a scale?

      My email address is linked in the ‘About’ section, feel free to send me the code and we can work it out together.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s