13 Spatial Data with ggplot2

In Geospatial Sciences we’re constantly working with spatial datasets that come in many different projections. We’ve previously shown how R can be used to read in spatial data, reproject spatial data, and resample spatial datasets. Once a spatial dataset can be stored in R as a data frame, we can use ggplot to plot it.

Function Description
geom_sf() Adds geometry stored in a sf object to a ggplot
coord_sf() Provides coordinate specifications for a ggplot
geom_raster() Plots a data frame as a raster on a ggplot
st_crs() Obtains projection information from an EPSG code
geom_sf_label() adds label to an sf geometry
scale_fill_gradientn() adds a density plot to a ggplot
borders() Adds country borders to a ggplot
coord_quickmap() Approximates projected lines for faster image creation
colorRampPalette() Create custom color ramp palette
brewer.pal() Grabs premade color palette from RColorBrewer
grid.arrange() Function that allows you to organize multiple ggplot objects into the same window

13.1 Simple Features

One package useful for spatial data in R is the sf package (short for simple features). It contains functions that perform equations that place uneven projections on a 2 dimensional computer screen. It can handle many projections and is built to work seamlessly with ggplot. For sample data, we’re also going to load ozmaps which contains maps of Australia.

## Linking to GEOS 3.8.1, GDAL 3.1.4, PROJ 6.3.1
## Simple feature collection with 9 features and 1 field
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 105.5507 ymin: -43.63203 xmax: 167.9969 ymax: -9.229287
## geographic CRS: GDA94
## # A tibble: 9 x 2
##   NAME                                                                  geometry
##   <chr>                                                       <MULTIPOLYGON [°]>
## 1 New South Wales      (((150.7016 -35.12286, 150.6611 -35.11782, 150.6373 -35.…
## 2 Victoria             (((146.6196 -38.70196, 146.6721 -38.70259, 146.6744 -38.…
## 3 Queensland           (((148.8473 -20.3457, 148.8722 -20.37575, 148.8515 -20.3…
## 4 South Australia      (((137.3481 -34.48242, 137.3749 -34.46885, 137.3805 -34.…
## 5 Western Australia    (((126.3868 -14.01168, 126.3625 -13.98264, 126.3765 -13.…
## 6 Tasmania             (((147.8397 -40.29844, 147.8902 -40.30258, 147.8812 -40.…
## 7 Northern Territory   (((136.3669 -13.84237, 136.3339 -13.83922, 136.3532 -13.…
## 8 Australian Capital … (((149.2317 -35.222, 149.2346 -35.24047, 149.2716 -35.27…
## 9 Other Territories    (((167.9333 -29.05421, 167.9188 -29.0344, 167.9313 -29.0…

oz_states is a simple feature which stores data like data frames but has a projection (proj4string), spatial extents, and polygon geometry. In other words, we have shapes on a map that are georeferenced and stored within a data frame that ggplot is great at working with. There are two columns - NAME and geometry. The NAME column is the name of the Australian State. The geometry specifies where the lines of the polygons are drawn (the state boundaries). So, let’s plot these polygons!

The function coord_sf allows to deal with the coordinate system, which includes both projection and extent of the map. By default, the map will use the coordinate system of the first layer that defines one (i.e. scanned in the order provided), or if none, fall back on WGS84 (latitude/longitude, the reference system used in GPS). Remember, our oz_states is a simple feature data frame that contains projection information. Thus, coord_sf is assuming we want the projection of oz_states - +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs. We can specify our own crs argument if we want to override the assumed projection. If we set the crs argument to a valid PROJ4 string, we can accoimplish this. Let’s project our oz_states onto a Mollewide graph.

We did not overwrite the data to this new projection. We simply performed a transformation on the fly to fit the data to a new grid. Instead of a PROJ4 string, we could use an EPSG code with the st_crs() function. Let’s not project our data onto a grid with EPSG code 3112.

st_crs() is that simple, just plug in the code and the function will grab the PROJ4 string for you. coord_sf has other options that are useful too such as x and y limits. Let’s go back to our regular projection of this data and show the limits in action.

We also added geom_sf_label which adds a label to an sf geometry. In this case, we added the Tasmania name tag to the geometry.

13.2 Sea Surface Temperature ggplot

When it comes to spatial data, the raster package is our go-to library. After the data is translated to a dataframe and ready to plot, we’ll need the help of a package called mapproj which makes sense of some external maps that ggplot2 uses.

## Loading required package: maps
## class      : RasterLayer 
## dimensions : 774, 1111, 859914  (nrow, ncol, ncell)
## resolution : 0.018, 0.0181  (x, y)
## extent     : -99.99915, -80.00115, 15.99378, 30.00318  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : memory
## names      : Sea.Surface.Temperature 
## values     : 25.86023, 34.9397  (min, max)
## time       : 2019-08-14 16:30:43
##           x        y Sea.Surface.Temperature
## 1 -99.99015 29.99413                     NaN
## 2 -99.97215 29.99413                     NaN
## 3 -99.95415 29.99413                     NaN
## 4 -99.93615 29.99413                     NaN
## 5 -99.91815 29.99413                     NaN
## 6 -99.90015 29.99413                     NaN

In this case, df is just a data frame - not an sf data frame. We need to use geom_raster in this case to plot a regular data frame as a raster. As for the projeciton, ggplot automatically guessed lat/long which in this case is correct. A better practice is to use the crs from the sstRast which contains all of the projection information.

## CRS arguments: +proj=longlat +datum=WGS84 +no_defs

The georeferenced data is properly placed. Let’s now turn our attention to the coloring of the data. We can use the RColorBrewer package to create our own color palettes. Also, becuase we’re in lat/long projection, we’re going to use coord_quickmap instead of coord_sf. This function approximates geolocated lines for faster plotting. This can be used in lat/long projected data close to the equator.

Let’s get rid of the expanded area beyond the raster domain using the expand=FALSE argument in coord_quickmap()

13.3 R Color Brewer Palettes

There are a number of pre-made color palettes from RColorBrewer. Here is a list.

Let’s plot using a Yellow-Orange-Red palette and a white NA value

13.4 Assignment

  1. Download treecov.nc from the datasets folder

  2. For South America and Africa, plot the tree cover variable using ggplot2. Use a green color theme from RColorBrewer. Add borders.

  3. Place each ggplot next to each other in one plot window using grid.arrange.

  4. Submit resulting image to UD Canvas.

Your final product should look something like…

13.4.1 Extra Credit - 2 Points

Using the data above, approximate the average tree cover for South America and Africa (extents don’t have to be exact, just generally)

  1. Plot the same domains above but this time color each continent (yes, the whole thing) based on the average tree cover. For example, if one of the continent has an average tree cover of 30%, the entire continent would be red based on the colorscale you choose where 30% is red.

  2. Submit plot and code to canvas