In this lab, we will learn how to import shapefiles into R.
Goal: by the end of this lab, you will be able to import arbitrary shapefiles into R.
GIS data in the form of shapefiles is all over the web. Government agencies are particularly good sources for these. Here, we illustrate how to find bicycling trails in Massachusetts from MassGIS.
<- "http://download.massgis.digital.mass.gov/shapefiles/state/biketrails_arc.zip" url <- basename(url) local_file download.file(url, destfile = local_file) unzip(local_file, exdir = "biketrails")
Once we have the files unzipped, we can explore what layers are available in that folder. The
sf package does this nicely.
library(sf) <- path.expand("biketrails/biketrails_arc") dsn dsnlist.files(dsn)
There many files in this directory, each containing different spatial information. But how many coherent layers are there? In this case, just one. What does that layer contain?
Looks good, let’s read it into R.
<- read_sf(dsn) bike_trails bike_trails
This is a collection of
MULTILINESTRING geometries with many attributes. What does it look like?
%>% bike_trails select(TRAILNAME) %>% plot()
This is somewhat evocative of the state of Massachusetts…
Note the projection string:
N.B. - The
mass_gis()function in the
macleishpackage can do some of this work for you, if you only care about intersections with the MacLeish property boundary.
bike_trails is a
data.frame – in addition to being an
sf object – you can wrangle the data just like you would in any other
dplyr functions like
summarize() work just as you might imagine they would.
%>% bike_trails filter(OWNER == "MBTA")
# SAMPLE SOLUTION nrow(bike_trails)
# SAMPLE SOLUTION %>% bike_trails arrange(desc(SHAPE_LEN)) %>% head(1)
# SAMPLE SOLUTION %>% bike_trails filter(str_detect(TRAILNAME, "Norwottuck Rail Trail"))
# SAMPLE SOLUTION %>% bike_trails group_by(TRAILNAME) %>% summarize( num_segments = n(), total_length = sum(SHAPE_LEN) %>% ) arrange(desc(total_length))
summarize() will aggregate the geometries using
st_union(). Thus, while the following two pipelines will produce similar looking plots, the first one consists of 33 rows, each consisting of a single trail segment, while the second one consists of just one row, which contains a
MULTILINESTRING geometry that has the union of the information for all 33 trail segments.
# 33 rows %>% bike_trails filter(OWNER == "MBTA") # one row %>% bike_trails filter(OWNER == "MBTA") %>% group_by(OWNER) %>% summarize( num_segments = n(), total_length = sum(SHAPE_LEN) %>% ) arrange(desc(total_length))
The bike trails are in a Lambert conformal conic projection. Note that the units of the coordinates are very different from lat/long.
In order to get these data onto our
leaflet map, we need to re-project them. The
leaflet data are projected in EPSG:4326 (Web Mercator)—and we can’t change that. So we’ll have to convert the bike trails to EPSG:4326. This is done with the
<- bike_trails %>% bike_trails_4326 st_transform(4326) bike_trails_4326
Note that the projection string and the coordinates are very different now.
Now we can put this on our map.
library(leaflet) leaflet() %>% addTiles() %>% addPolylines(data = bike_trails_4326)
Please consult the Leaflet documentation for more details and examples.
# SAMPLE SOLUTION <- colorNumeric( pal palette = "viridis", domain = bike_trails_4326$SHAPE_LEN ) leaflet(data = bike_trails_4326) %>% addTiles() %>% addPolylines(color = ~pal(SHAPE_LEN)) %>% addLegend("bottomright", pal = pal, values = ~SHAPE_LEN)
Please respond to the following prompt on Slack in the
Prompt: What questions to you still have about working with shapefiles in