# Chapter 18 Geospatial computations

In Chapter 17, we learned how to work with geospatial data. We learned about *shapefiles*, *map projections*, and how to plot spatial data using both **ggplot2** and **leaflet**.
In this chapter, we will learn how to perform computations on geospatial data that will enable us to answer questions about how long or how big spatial features are.
We will also learn how to use geometric operations and spatial joins to create new geospatial objects. These capabilities will broaden the spectrum of analytical tasks we can perform, and accordingly expand the range of questions we can answer.

## 18.1 Geospatial operations

### 18.1.1 Geocoding, routes, and distances

The process of converting a human-readable address into geographic coordinates is called *geocoding*. While there are numerous APIs available online that will do this for you, the functionality provided in **tidygeocoder** by the `geocode()`

function uses Open Street Map and does not require registration to use the API. Here, we build a data frame of the places of business of the three authors, geocode the addresses of the schools, convert the resulting data frame into an **sf** object, and set the projection to `epsg:4326`

(see Chapter 17).

```
library(tidyverse)
library(mdsr)
library(sf)
library(tidygeocoder)
<- tribble(
colleges ~school, ~address,
"Smith", "44 College Lane, Northampton, MA 01063",
"Macalester", "1600 Grand Ave, St Paul, MN 55105",
"Amherst", "Amherst College, Amherst, MA 01002"
%>%
) geocode(address, method = "osm") %>%
st_as_sf(coords = c("long", "lat")) %>%
st_set_crs(4326)
colleges
```

```
Simple feature collection with 3 features and 2 fields
geometry type: POINT
dimension: XY
bbox: xmin: -93.2 ymin: 42.3 xmax: -72.5 ymax: 44.9
CRS: EPSG:4326
# A tibble: 3 x 3
school address geometry
* <chr> <chr> <POINT [°]>
1 Smith 44 College Lane, Northampton, MA 01063 (-72.6 42.3)
2 Macalester 1600 Grand Ave, St Paul, MN 55105 (-93.2 44.9)
3 Amherst Amherst College, Amherst, MA 01002 (-72.5 42.4)
```

*Geodesic* distances can be computed using the `st_distance()`

function in **sf**. Here, we compute the distance between two of the Five Colleges^{36}.

```
%>%
colleges filter(school != "Macalester") %>%
st_distance()
```

```
Units: [m]
[,1] [,2]
[1,] 0 11982
[2,] 11982 0
```

The geodesic distance is closer to “*as the crow files*,” but we might be more interested in the *driving distance* between two locations along the road. To compute this, we need to access a service with a database of roads. Here, we use the openroute service, which requires an API key^{37}. The **openrouteservice** package provides this access via the `ors_directions()`

function. Note that the value of the variable `mdsr_ors_api_key`

is not shown. You will need your own API to use this service.

```
library(openrouteservice)
ors_api_key(mdsr_ors_api_key)
```

```
<- colleges %>%
smith_amherst filter(school != "Macalester") %>%
st_coordinates() %>%
as_tibble()
<- smith_amherst %>%
route_driving ors_directions(profile = "driving-car", output = "sf")
```

Note the difference between the geodesic distance computed above and the driving distance computed below. Of course, the driving distance must be longer than the geodesic distance.

```
%>%
route_driving st_length()
```

`13545 [m]`

If you prefer, you can convert meters to miles using the `set_units()`

function from the **units** package.

```
%>%
route_driving st_length() %>%
::set_units("miles") units
```

`8.42 [miles]`

Given the convenient Norwottuck Rail Trail connecting Northampton and Amherst, we might prefer to bike. Will that be shorter?

```
<- smith_amherst %>%
route_cycling ors_directions(profile = "cycling-regular", output = "sf")
%>%
route_cycling st_length()
```

`14066 [m]`

It turns out that the rail trail path is slightly longer (but far more scenic).

Since the *Calvin Coolidge Bridge* is the only reasonable way to get from Northampton to Amherst when driving, there is only one shortest route between Smith and Amherst, as shown in Figure 18.1.
We also show the shortest biking route, which follows the Norwottuck Rail Trail.

```
library(leaflet)
leaflet() %>%
addTiles() %>%
addPolylines(data = route_driving, weight = 10) %>%
addPolylines(data = route_cycling, color = "green", weight = 10)
```

However, shortest paths in a network are not unique (see Chapter 20). Ben’s daily commute to *Citi Field* from his apartment in *Brooklyn* presented three distinct alternatives:

- One could take the
*Brooklyn-Queens Expressway*(I-278 E) to the*Grand Central Parkway*E and pass by*LaGuardia Airport*. - One could continue on the
*Long Island Expressway*(I-495 E) and then approach Citi Field from the opposite direction on the Grand Central Parkway W. - One could avoid highways altogether and take
*Roosevelt Avenue*all the way through*Queens*.

The latter route is the shortest but often will take longer due to traffic. The first route is the most convenient approach to the Citi Field employee parking lot. These two routes are overlaid on the map in Figure 18.2.

```
<- tribble(
commute ~place, ~address,
"home", "736 Leonard St, Brooklyn, NY",
"lga", "LaGuardia Airport, Queens, NY",
"work", "Citi Field, 41 Seaver Way, Queens, NY 11368",
%>%
) geocode(address, method = "osm") %>%
st_as_sf(coords = c("long", "lat")) %>%
st_set_crs(4326)
<- commute %>%
route_direct filter(place %in% c("home", "work")) %>%
st_coordinates() %>%
as_tibble() %>%
ors_directions(output = "sf", preference = "recommended")
<- commute %>%
route_gcp st_coordinates() %>%
as_tibble() %>%
ors_directions(output = "sf")
leaflet() %>%
addTiles() %>%
addMarkers(data = commute, popup = ~place) %>%
addPolylines(data = route_direct, color = "green", weight = 10) %>%
addPolylines(data = route_gcp, weight = 10)
```

### 18.1.2 Geometric operations

Much of the power of working with geospatial data comes from the interactions between various layers of data. The **sf** package provides many features that enable us to compute with geospatial data.

A basic geospatial question is: what part of one series of geospatial objects lies within another set? To illustrate, we use geospatial data from the MacLeish field station in *Whately, MA*. These data are provided by the **macleish** package. Figure 18.3 illustrates that there are several streams that pass though the MacLeish property.

```
library(sf)
library(macleish)
<- macleish_layers %>%
boundary pluck("boundary")
<- macleish_layers %>%
streams pluck("streams")
<- ggplot(boundary) +
boundary_plot geom_sf() +
scale_x_continuous(breaks = c(-72.677, -72.683))
+
boundary_plot geom_sf(data = streams, color = "blue", size = 1.5)
```

The data from MacLeish happens to have a variable called `Shape_Area`

that contains the precomputed size of the property.

```
%>%
boundary pull(Shape_Area)
```

`[1] 1033988`

Is this accurate? We can easily compute basic geometric properties of spatial objects like area and length using the **st_area** function.

`st_area(boundary)`

`1034035 [m^2]`

The exact computed area is very close to the reported value. We can also convert the area in square meters to *acres* by dividing by a known conversion factor.

`st_area(boundary) / 4046.8564224`

`256 [m^2]`

Similarly, we can compute the length of each segment of the streams and the location of the *centroid* of the property.

```
%>%
streams mutate(length = st_length(geometry))
```

```
Simple feature collection with 13 features and 2 fields
geometry type: LINESTRING
dimension: XY
bbox: xmin: -72.7 ymin: 42.4 xmax: -72.7 ymax: 42.5
CRS: 4326
First 10 features:
Id geometry length
1 1 LINESTRING (-72.7 42.5, -72... 593.2 [m]
2 1 LINESTRING (-72.7 42.5, -72... 411.9 [m]
3 1 LINESTRING (-72.7 42.5, -72... 137.8 [m]
4 1 LINESTRING (-72.7 42.5, -72... 40.2 [m]
5 1 LINESTRING (-72.7 42.5, -72... 51.0 [m]
6 1 LINESTRING (-72.7 42.5, -72... 592.5 [m]
7 1 LINESTRING (-72.7 42.5, -72... 2151.1 [m]
8 3 LINESTRING (-72.7 42.5, -72... 1651.7 [m]
9 3 LINESTRING (-72.7 42.5, -72... 316.6 [m]
10 3 LINESTRING (-72.7 42.5, -72... 388.1 [m]
```

```
%>%
boundary st_centroid()
```

```
Simple feature collection with 1 feature and 3 fields
geometry type: POINT
dimension: XY
bbox: xmin: -72.7 ymin: 42.5 xmax: -72.7 ymax: 42.5
CRS: 4326
OBJECTID Shape_Leng Shape_Area geometry
1 1 5894 1033988 POINT (-72.7 42.5)
```

As promised, we can also combine two geospatial layers. The functions `st_intersects()`

and `st_intersection()`

take two geospatial objects and return a `logical`

indicating whether they intersect, or another **sf** object representing that intersection, respectively.

`st_intersects(boundary, streams)`

```
Sparse geometry binary predicate list of length 1, where the predicate was `intersects'
1: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, ...
```

`st_intersection(boundary, streams)`

```
Simple feature collection with 11 features and 4 fields
geometry type: GEOMETRY
dimension: XY
bbox: xmin: -72.7 ymin: 42.4 xmax: -72.7 ymax: 42.5
CRS: 4326
First 10 features:
OBJECTID Shape_Leng Shape_Area Id geometry
1 1 5894 1033988 1 LINESTRING (-72.7 42.5, -72...
1.1 1 5894 1033988 1 LINESTRING (-72.7 42.5, -72...
1.2 1 5894 1033988 1 LINESTRING (-72.7 42.5, -72...
1.3 1 5894 1033988 1 MULTILINESTRING ((-72.7 42....
1.4 1 5894 1033988 1 LINESTRING (-72.7 42.4, -72...
1.5 1 5894 1033988 3 LINESTRING (-72.7 42.5, -72...
1.6 1 5894 1033988 3 LINESTRING (-72.7 42.5, -72...
1.7 1 5894 1033988 3 LINESTRING (-72.7 42.5, -72...
1.8 1 5894 1033988 3 LINESTRING (-72.7 42.5, -72...
1.9 1 5894 1033988 3 LINESTRING (-72.7 42.4, -72...
```

`st_intersects()`

is called a *predicate* function because it returns a `logical`

. It answers the question: “Do these two layers intersect?”
On the other hand, `st_intersection()`

performs a set operation. It answers the question: “What is the set that represents the intersection of these two layers?” Similar functions compute familiar set operations like unions, differences, and symmetric differences, while a whole host of additional predicate functions detect containment (`st_contains()`

, `st_within()`

, etc.), crossings, overlaps, etc.

In Figure 18.4(a) we use the `st_intersection()`

function to show only the parts of the streams that are contained within the MacLeish property.
In Figure 18.4(b), we show the corresponding set of stream parts that lie *outside* of the MacLeish property.

```
+
boundary_plot geom_sf(
data = st_intersection(boundary, streams),
color = "blue",
size = 1.5
)
+
boundary_plot geom_sf(
data = st_difference(streams, boundary),
color = "blue",
size = 1.5
)
```

Different spatial geometries intersect in different ways. Above, we saw that the intersection of steams (which are `LINESTRING`

s) and the boundary (which is a `POLYGON`

) produced `LINESTRING`

geometries. Below, we compute the intersection of the streams with the trails that exist at MacLeish. The trails are also `LINESTRING`

geometries, and the intersection of two `LINESTRING`

geometries produces a set of `POINT`

geometries.

```
<- macleish_layers %>%
trails pluck("trails")
st_intersection(trails, streams)
```

```
Simple feature collection with 10 features and 3 fields
geometry type: GEOMETRY
dimension: XY
bbox: xmin: -72.7 ymin: 42.4 xmax: -72.7 ymax: 42.5
CRS: 4326
name color Id geometry
8 entry trail - 3 POINT (-72.7 42.4)
9 Eastern Loop Blue 3 POINT (-72.7 42.5)
13 Snowmobile Trail <NA> 3 MULTIPOINT ((-72.7 42.5), (...
15 Driveway <NA> 3 POINT (-72.7 42.4)
6 Western Loop Red 3 POINT (-72.7 42.4)
1 Porcupine Trail White 3 POINT (-72.7 42.5)
6.1 Western Loop Red 3 POINT (-72.7 42.5)
4 Vernal Pool Loop Yellow 3 POINT (-72.7 42.4)
3 Poplar Hill Road Road 3 POINT (-72.7 42.5)
14 Snowmobile Trail <NA> 3 POINT (-72.7 42.5)
```

Note that one of the features is a `MULTIPOINT`

. Occasionally, a trail might intersect a stream in more than one place (resulting in a `MULTIPOINT`

geometry). This occurs here, where the Snowmobile Trail intersects one of the stream segments in two different places. To clean this up, we first use the `st_cast()`

function to convert everything to `MULTIPOINT`

, and then cast everything to `POINT`

. (We can’t go straight to `POINT`

because we start with a mixture of `POINT`

s and `MULTIPOINT`

s.)

```
<- st_intersection(trails, streams) %>%
bridges st_cast("MULTIPOINT") %>%
st_cast("POINT")
nrow(bridges)
```

`[1] 11`

Note that we now have 11 features instead of 10. In this case, the intersections of trails and streams has a natural interpretation: these must be bridges of some type! How else could the trail continue through the stream? Figure 18.5 shows the trails, the streams, and these “bridges” (some of the points are hard to see because they partially overlap).

```
+
boundary_plot geom_sf(data = trails, color = "brown", size = 1.5) +
geom_sf(data = streams, color = "blue", size = 1.5) +
geom_sf(data = bridges, pch = 21, fill = "yellow", size = 3)
```

## 18.2 Geospatial aggregation

In Section 18.1.2, we saw how we can split `MULTIPOINT`

geometries into `POINT`

geometries. This was, in a sense, geospatial disaggregation. Here, we consider the perhaps more natural behavior of spatial aggregation.

Just as we saw previously that the intersection of different geometries can produce different resulting geometries, so too will different geometries aggregate in different ways. For example, `POINT`

geometries can be aggregated into `MULTIPOINT`

geometries.

The **sf** package implements spatial aggreation using the same `group_by()`

and `summarize()`

function that you learned in Chapter 4. The only difference is that we might have to specify *how* we want the spatial layers to be aggregated. The default aggregation method is `st_union()`

, which makes sense for most purposes.

Note that the `trails`

layer is broken into segments: the Western Loop is comprised of three different features.

` trails`

```
Simple feature collection with 15 features and 2 fields
geometry type: LINESTRING
dimension: XY
bbox: xmin: -72.7 ymin: 42.4 xmax: -72.7 ymax: 42.5
CRS: 4326
First 10 features:
name color geometry
1 Porcupine Trail White LINESTRING (-72.7 42.5, -72...
2 Western Loop Red LINESTRING (-72.7 42.5, -72...
3 Poplar Hill Road Road LINESTRING (-72.7 42.5, -72...
4 Vernal Pool Loop Yellow LINESTRING (-72.7 42.4, -72...
5 Eastern Loop Blue LINESTRING (-72.7 42.5, -72...
6 Western Loop Red LINESTRING (-72.7 42.5, -72...
7 Western Loop Red LINESTRING (-72.7 42.4, -72...
8 entry trail - LINESTRING (-72.7 42.4, -72...
9 Eastern Loop Blue LINESTRING (-72.7 42.5, -72...
10 Easy Out Red LINESTRING (-72.7 42.5, -72...
```

Which trail is the longest? We know we can compute the length of the features with `st_length()`

, but then we’d have to add up the lengths of each segment. Instead, we can aggregate the segments and do the length computation on the full trails.

```
<- trails %>%
trails_full group_by(name) %>%
summarize(num_segments = n()) %>%
mutate(trail_length = st_length(geometry)) %>%
arrange(desc(trail_length))
```

```
although coordinates are longitude/latitude, st_union assumes that they are planar
although coordinates are longitude/latitude, st_union assumes that they are planar
although coordinates are longitude/latitude, st_union assumes that they are planar
although coordinates are longitude/latitude, st_union assumes that they are planar
although coordinates are longitude/latitude, st_union assumes that they are planar
although coordinates are longitude/latitude, st_union assumes that they are planar
although coordinates are longitude/latitude, st_union assumes that they are planar
although coordinates are longitude/latitude, st_union assumes that they are planar
although coordinates are longitude/latitude, st_union assumes that they are planar
```

` trails_full`

```
Simple feature collection with 9 features and 3 fields
geometry type: GEOMETRY
dimension: XY
bbox: xmin: -72.7 ymin: 42.4 xmax: -72.7 ymax: 42.5
CRS: 4326
# A tibble: 9 x 4
name num_segments geometry trail_length
<fct> <int> <GEOMETRY [°]> [m]
1 Snowmobi… 2 MULTILINESTRING ((-72.7 42.5, -72.7 4… 2575
2 Eastern … 2 MULTILINESTRING ((-72.7 42.5, -72.7 4… 1940
3 Western … 3 MULTILINESTRING ((-72.7 42.5, -72.7 4… 1351
4 Poplar H… 2 MULTILINESTRING ((-72.7 42.4, -72.7 4… 1040
5 Porcupin… 1 LINESTRING (-72.7 42.5, -72.7 42.5, -… 700
6 Vernal P… 1 LINESTRING (-72.7 42.4, -72.7 42.4, -… 361
7 entry tr… 1 LINESTRING (-72.7 42.4, -72.7 42.4, -… 208
8 Driveway 1 LINESTRING (-72.7 42.4, -72.7 42.4, -… 173
9 Easy Out 2 MULTILINESTRING ((-72.7 42.5, -72.7 4… 136
```

## 18.3 Geospatial joins

In Section 17.4.3, we show how the `inner_join()`

function can be used to merge geospatial data with additional data. This works because the geospatial data was stored in an **sf** object, which is also a data frame. In that case, since the second data frame was not spatial, by necessity the key on which the join was performed was a non-spatial attribute.

A geospatial join is a fundamentally different type of operation, in which *both* data frames are geospatial, and the joining key is a geospatial attribute. This operation is implemented by the `st_join()`

function, which behaves similarly to the `inner_join()`

function, but with some additional complexities due to the different nature of its task.

To illustrate this, we consider the question of in which type of forest the two campsites at MacLeish lie (see Figure 18.6).

```
<- macleish_layers %>%
forests pluck("forests")
<- macleish_layers %>%
camp_sites pluck("camp_sites")
+
boundary_plot geom_sf(data = forests, fill = "green", alpha = 0.1) +
geom_sf(data = camp_sites, size = 4) +
geom_sf_label(
data = camp_sites, aes(label = name),
nudge_y = 0.001
)
```

It is important to note that this question is inherently spatial. There simply is no variable between the `forests`

layer and the `camp_sites`

layer that would allow you to connect them other than their geospatial location.

Like `inner_join()`

, the `st_join()`

function takes two data frames as its first two arguments. There is no `st_left_join()`

function, but instead `st_join()`

takes a `left`

argument, that is set to `TRUE`

by default. Finally, the `join`

argument takes a predicate function that determines the criteria for whether the spatial features match. The default is `st_intersects()`

, but here we employ `st_within()`

, since we want the `POINT`

geometries of the camp sites to lie *within* the `POLYGON`

geometries of the forests.

```
st_join(camp_sites, forests, left = FALSE, join = st_within) %>%
select(name, type)
```

```
Simple feature collection with 2 features and 2 fields
geometry type: POINT
dimension: XY
bbox: xmin: -72.7 ymin: 42.5 xmax: -72.7 ymax: 42.5
CRS: EPSG:4326
# A tibble: 2 x 3
name type geometry
<chr> <fct> <POINT [°]>
1 Group Campsite Old Field White Pine Forest (-72.7 42.5)
2 Remote Campsite Sugar Maple Forest (-72.7 42.5)
```

We see that the Group Campsite is in a *Eastern White Pine* forest, while the Remote Campsite is in a *Sugar Maple* forest.

## 18.4 Extended example: Trail elevations at MacLeish

Many hiking trails provide trail elevation maps (or elevation profiles) that depict changes in elevation along the trail. These maps can help hikers understand the interplay between the uphill and downhill segments of the trail and where they occur along their hike.

More formally, various trail railing systems exist to numerically score the difficulty of a hike. *Shenandoah National Park* uses this simple trail rating system:

\[ rating = \sqrt{gain \cdot 2 \cdot distance} \] A rating below 50 corresponds to the easiest class of hike.

In this example, we will construct an elevation profile and compute the trail rating for the longest trail at MacLeish.
The **macleish** package contains elevation contours that are 30 feet apart. These are relatively sparse, but they will suffice for our purposes.

```
<- macleish_layers %>%
elevations pluck("elevation")
```

First, we leverage the spatial aggregation work that we did previously to isolate the longest trail.

```
<- trails_full %>%
longest_trail head(1)
longest_trail
```

```
Simple feature collection with 1 feature and 3 fields
geometry type: MULTILINESTRING
dimension: XY
bbox: xmin: -72.7 ymin: 42.4 xmax: -72.7 ymax: 42.5
CRS: 4326
# A tibble: 1 x 4
name num_segments geometry trail_length
<fct> <int> <MULTILINESTRING [°]> [m]
1 Snowmobi… 2 ((-72.7 42.5, -72.7 42.5, -72.7 42.5,… 2575
```

Next, we compute the geospatial intersection between the trails and the elevation contours, both of which are `LINESTRING`

geometries. This results in `POINT`

s, but just as we saw above, there are several places where the trail crosses the same contour line more than once. If you think about walking up a hill and then back down the other side, this should make sense (see Figure 18.7). These multiple intersections result in `MULTIPOINT`

geometries. We unravel these just as we did before: by casting everything to `MULTIPOINT`

and then casting everything back to `POINT`

.

```
<- longest_trail %>%
trail_elevations st_intersection(elevations) %>%
st_cast("MULTIPOINT") %>%
st_cast("POINT")
```

Figure 18.7 reveals that the Snowmobile trail starts near the southernmost edge of the property at about 750 feet above sea level, snakes along a ridge at 780 feet, before climbing to the local peak at 870 feet, and finally descending the back side of the hill near the northern border of the property.

```
+
boundary_plot geom_sf(data = elevations, color = "dark gray") +
geom_sf(data = longest_trail, color = "brown", size = 1.5) +
geom_sf(data = trail_elevations, fill = "yellow", pch = 21, size = 3) +
geom_sf_label(
data = trail_elevations,
aes(label = CONTOUR_FT),
hjust = "right",
size = 2.5,
nudge_x = -0.0005
)
```

Finally, we need to put the features in order, so that we can compute the distance from the start of the trail. Luckily, in this case the trail goes directly south to north, so that we can use the latitude coordinate as an ordering variable.

In this case, we use `st_distance()`

to compute the geodesic distances between the elevation contours.
This function returns a \(n \times n\) `matrix`

with all of the pairwise distances, but since we only want the distances from the southernmost point (which is the first element), we only select the first column of the resulting matrix.

To compute the actual distances (i.e., along the trail) we would have to split the trail into pieces. We leave this as an exercise.

```
<- trail_elevations %>%
trail_elevations mutate(lat = st_coordinates(geometry)[, 2]) %>%
arrange(lat) %>%
mutate(distance_from_start = as.numeric(st_distance(geometry)[, 1]))
```

Figure 18.8 shows our elevation profile for the Snowmobile trail.

```
ggplot(trail_elevations, aes(x = distance_from_start)) +
geom_ribbon(aes(ymax = CONTOUR_FT, ymin = 750)) +
scale_y_continuous("Elevation (feet above sea level)") +
scale_x_continuous("Geodesic distance from trail head (meters)") +
labs(
title = "Trail elevation map: Snowmobile trail",
subtitle = "Whately, MA",
caption = "Source: macleish package for R"
)
```

With a rating under 20, this trail rates as one of the easiest according to the Shenandoah system.

```
%>%
trail_elevations summarize(
gain = max(CONTOUR_FT) - min(CONTOUR_FT),
trail_length = max(units::set_units(trail_length, "miles")),
rating = sqrt(gain * 2 * as.numeric(trail_length))
)
```

`although coordinates are longitude/latitude, st_union assumes that they are planar`

```
Simple feature collection with 1 feature and 3 fields
geometry type: MULTIPOINT
dimension: XY
bbox: xmin: -72.7 ymin: 42.4 xmax: -72.7 ymax: 42.5
CRS: 4326
# A tibble: 1 x 4
gain trail_length rating geometry
<dbl> [miles] <dbl> <MULTIPOINT [°]>
1 120 1.6 19.6 ((-72.7 42.5), (-72.7 42.5), (-72.7 42.5), (-72…
```

## 18.5 Further resources

Lovelace, Nowosad, and Muenchow (2019) and Engel (2019) were both helpful in updating this material to take advantage of **sf**.

## 18.6 Exercises

**Problem 1 (Medium)**: The `Violations`

data frame in the `mdsr`

package contains information on violations noted in Board of Health inspections of New York City restaurants. These data contain spatial information in the form of addresses and zip codes.

Use the

`geocode`

function in`tidygeocoder`

to obtain spatial coordinates for these restaurants.Using the spatial coordinates you obtained in the previous exercise, create an informative static map using

`ggspatial`

that illustrates the nature and extent of restaurant violations in New York City.Using the spatial coordinates you obtained in the previous exercises, create an informative interactive map using

`leaflet`

that illustrates the nature and extent of restaurant violations in New York City.

**Problem 2 (Medium)**:

Use the spatial data in the

`macleish`

package and`ggspatial`

to make an informative static map of the MacLeish Field Station property.Use the spatial data in the

`macleish`

package and`leaflet`

to make an informative interactive map of the MacLeish Field Station property.

**Problem 3 (Hard)**: GIS data in the form of shapefiles is all over the Web. Government agencies are particularly good sources for these. The following code downloads bike trail data in Massachusetts from MassGIS. Use `bike_trails`

to answer the following questions:

```
if (!file.exists("./biketrails_arc.zip")) {
<- "http://download.massgis.digital.mass.gov/"
part1 <- "shapefiles/state/biketrails_arc.zip"
part2 <- paste(part1, part2, sep = "")
url <- basename(url)
local_file download.file(url, destfile = local_file)
unzip(local_file, exdir = "./biketrails/")
}
```

```
library(sf)
<- path.expand("./biketrails/biketrails_arc")
dsn st_layers(dsn)
```

```
Driver: ESRI Shapefile
Available layers:
layer_name geometry_type features fields
1 biketrails_arc Line String 272 13
```

`<- read_sf(dsn) bike_trails `

How many distinct bike trail segments are there?

What is the longest individual bike trail segment?

How many segments are associated with the Norwottuck Rail Trail?

Among all of the named trails (which may have multiple features), which one has the longest total length?

The bike trails are in a Lambert conformal conic projection. Note that the units of the coordinates are very different from lat/long. In order to get these data onto our leaflet map, we need to re-project them. Convert the bike trails to EPSG:4326, and create a leaflet map.

Color-code the bike trails based on their length, and add an informative legend to the plot.

**Problem 4 (Hard)**: The MacLeish snowmobile trail map generated in the book is quite rudimentary.
Generate your own map that improves upon the aesthetics and information content.

## 18.7 Supplementary exercises

Available at https://mdsr-book.github.io/mdsr2e/geospatial-II.html#geospatialII-online-exercises

No exercises found