Introduction
Trees are associated with a range of health benefits, from reducing the effects of the urban heat island1 to increasing pedestrian safety2,3 and neighborhood social cohesion4 to removing pollutants from the air.5 Given their contributions to health, localized cooling, and stormwater absorption, tree planting programs are touted as important climate change adaptation strategies.6
Access to trees and other types of nature in American cities is unevenly distributed across lines of race, ethnicity, and class, with profound implications for human and non-human health and well-being.7-14 In Denver, we know that the neighborhoods with the least tree cover (Figure 1) are those in the “inverted-L” areas15 of historical disadvantage and disinvestment, and the city is actively working to plant trees here.16
Figure 1: Local Moran’s I map showing high-high and low-low clusters of tree canopy in Denver by Census block group.
This study uses both traditional and geospatial statistical methods to make the inequities at play in Denver explicit.17 Specifically, this work asks:
- What variables predict higher tree canopy?
- How do the relationships between each variable and tree canopy play out in space?
- Where do areas with high tree canopy neighbor areas with high values of each variable?
- How do the strength and effect size of the relationship between variables change over space?
Methods
Data
This work combines American Community Survey demographic data (2017-2021, Census block group scale)18 with high-resolution (3-foot) landcover raster data from the Denver Regional Council of Governments’ Land Use Land Cover project.19 Based on literature exploring variables related to tree canopy cover,7,9,14,20,21 the initial demographic variables explored here included median household income, percent non-Hispanic white, percent Black, percent Hispanic, median age, percent with a bachelor’s degree or higher, percent owner occupied, median year structure built, and median home value. Block groups with a population of zero were excluded. Using ArcGIS Pro 3.2, I reclassified landcover data to isolate the dependent variable, percent tree canopy cover, and joined tree data to the demographic data using zonal statistics.
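The zonal statistics step can be illustrated with a minimal Python sketch. This is not the ArcGIS Pro workflow used in the study: the tiny grids, the zone labels, and the `TREE_CLASS` code are hypothetical stand-ins for the landcover raster and the block group boundaries. It only shows the underlying arithmetic, counting tree-classified cells per zone and dividing by the zone's total cell count.

```python
from collections import defaultdict

TREE_CLASS = 2  # hypothetical class code for tree canopy; real rasters define their own


def percent_canopy_by_zone(landcover, zones):
    """Return {zone_id: percent of that zone's cells classified as tree canopy}."""
    total = defaultdict(int)
    trees = defaultdict(int)
    for lc_row, zone_row in zip(landcover, zones):
        for lc, zone in zip(lc_row, zone_row):
            total[zone] += 1
            if lc == TREE_CLASS:
                trees[zone] += 1
    return {zone: 100.0 * trees[zone] / total[zone] for zone in total}


# Hypothetical 3 x 4 landcover grid and the zone (block group) each cell falls in.
landcover = [
    [2, 2, 2, 1],
    [2, 1, 1, 3],
    [2, 2, 3, 3],
]
zones = [
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
]
canopy = percent_canopy_by_zone(landcover, zones)
```

In this toy example, zone A has five tree cells out of six and zone B has one out of six, so the resulting percentages can be joined to demographic records by zone identifier.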
Data Exploration
Data exploration used the free and open-source statistical software R and sought to answer the question, what variables predict higher tree canopy? My initial exploratory analyses included plotting histograms and Q-Q plots for each variable. This analysis showed that tree canopy is normally distributed, but the social variables are not. Next, I calculated Spearman’s rho for each variable to determine which variables are correlated with tree canopy. Finally, I made linear models for tree canopy as a function of each variable separately. Based on these analyses, I isolated the following independent variables for further study: percent white, median age, percent owner occupied, and percent with bachelor’s degree or higher. These variables were selected to reduce collinearity (e.g., between percent white and percent Hispanic) and to account for gaps in the available data (e.g., missing values in median year structure built). The selected variables are consistent with those in the literature.7,9,14,20,21
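For readers unfamiliar with these two exploratory measures, the following Python sketch shows how Spearman's rho and a one-variable linear model can be computed. It is an illustration only: the study used R, the six data values are invented, and the function names are hypothetical.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for the model y = a + b * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope, pearson(x, y) ** 2


# Invented values for six block groups (not the study's data).
pct_white = [20.0, 35.0, 50.0, 65.0, 80.0, 90.0]
pct_canopy = [5.0, 9.0, 8.0, 14.0, 13.0, 20.0]
rho = spearman_rho(pct_white, pct_canopy)
intercept, slope, r_squared = simple_ols(pct_white, pct_canopy)
```

Because Spearman's rho works on ranks, it suits variables that are not normally distributed, which is the situation described above for the social variables.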
To check each variable for spatial autocorrelation, I ran Global Moran’s I in ArcGIS Pro using Queen contiguity to define neighbors. All variables exhibit positive spatial autocorrelation: similar values cluster together in space, so the relationship between each explanatory variable and tree canopy should be examined with spatially explicit methods.
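As a rough illustration of what Global Moran's I measures, the sketch below computes it in Python on a small regular grid. It rests on several assumptions: a grid stands in for block group polygons, Queen contiguity is approximated by cells sharing an edge or corner, and weights are row-standardized. ArcGIS Pro's tool handles real polygon geometry and also reports significance, which this sketch does not.

```python
def queen_neighbors(n_rows, n_cols):
    """Map each grid cell index to the cells that share an edge or a corner with it."""
    nbrs = {}
    for r in range(n_rows):
        for c in range(n_cols):
            nbrs[r * n_cols + c] = [
                rr * n_cols + cc
                for rr in range(r - 1, r + 2)
                for cc in range(c - 1, c + 2)
                if (rr, cc) != (r, c) and 0 <= rr < n_rows and 0 <= cc < n_cols
            ]
    return nbrs


def global_morans_i(values, nbrs):
    """Global Moran's I with row-standardized contiguity weights."""
    n = len(values)
    m = sum(values) / n
    z = [v - m for v in values]
    cross = 0.0   # sum over i, j of w_ij * z_i * z_j
    s0 = 0.0      # sum of all weights
    for i, js in nbrs.items():
        w = 1.0 / len(js)
        for j in js:
            cross += w * z[i] * z[j]
            s0 += w
    return (n / s0) * cross / sum(v * v for v in z)


nbrs = queen_neighbors(4, 4)
# Hypothetical surfaces: high values grouped on one side versus alternating values.
clustered = [10, 10, 1, 1] * 4
checkerboard = [10 if (r + c) % 2 == 0 else 1 for r in range(4) for c in range(4)]
i_clustered = global_morans_i(clustered, nbrs)
i_checkerboard = global_morans_i(checkerboard, nbrs)
```

The clustered surface yields a positive index and the alternating surface a negative one, which matches the interpretation of positive spatial autocorrelation as similar values sitting next to each other.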
Spatial Analyses
My spatial analyses explore how the relationships between tree canopy and each variable play out in space. First, I wanted to visualize where high values of tree canopy neighbor high values of each variable, and vice versa. Following Greene et al.,22 I created bivariate local Moran’s I maps showing clusters and outliers of percent tree canopy as a function of each of the four independent variables. This step used the rgeoda package in R,23 with script based on examples provided in the R version of Luc Anselin’s GeoDa Workbook.24 My code used first order Queen contiguity to define weights as well as the default number of permutations (999) and alpha level for confidence (0.05).
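The logic behind a bivariate local Moran's I map can be sketched in plain Python. This is not the rgeoda code used in the study: the grid, the data values, and the function names are hypothetical, and the pseudo p-value rule here is one simple one-sided variant of conditional permutation. The sketch pairs each location's standardized canopy value with the average standardized value of the second variable among its Queen neighbors, then asks how often random neighbors would produce a statistic at least as extreme.

```python
import random
from statistics import mean, pstdev


def queen_neighbors(n_rows, n_cols):
    """Map each grid cell index to the cells that share an edge or a corner with it."""
    return {
        r * n_cols + c: [
            rr * n_cols + cc
            for rr in range(r - 1, r + 2)
            for cc in range(c - 1, c + 2)
            if (rr, cc) != (r, c) and 0 <= rr < n_rows and 0 <= cc < n_cols
        ]
        for r in range(n_rows)
        for c in range(n_cols)
    }


def standardize(values):
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def bivariate_local_moran(x, y, nbrs, permutations=999, alpha=0.05, seed=0):
    """Label each location by x at the location versus the neighbor average of y."""
    rng = random.Random(seed)
    zx, zy = standardize(x), standardize(y)
    labels = []
    for i in range(len(x)):
        k = len(nbrs[i])
        lag = sum(zy[j] for j in nbrs[i]) / k
        observed = zx[i] * lag
        # Conditional permutation: hold location i fixed, draw k random "neighbors".
        others = [zy[j] for j in range(len(y)) if j != i]
        extreme = 0
        for _ in range(permutations):
            sim_lag = sum(rng.sample(others, k)) / k
            simulated = zx[i] * sim_lag
            if (observed >= 0 and simulated >= observed) or (
                observed < 0 and simulated <= observed
            ):
                extreme += 1
        p_value = (extreme + 1) / (permutations + 1)
        if p_value > alpha:
            labels.append("not significant")
        elif zx[i] > 0:
            labels.append("high-high" if lag > 0 else "high-low")
        else:
            labels.append("low-low" if lag < 0 else "low-high")
    return labels


# Hypothetical 6 x 6 city: the western half is high on both variables.
nbrs = queen_neighbors(6, 6)
pct_canopy = [30.0 if c < 3 else 5.0 for r in range(6) for c in range(6)]
pct_white = [80.0 if c < 3 else 20.0 for r in range(6) for c in range(6)]
labels = bivariate_local_moran(pct_canopy, pct_white, nbrs)
```

Cells deep inside the western half come out as high-high and cells deep inside the eastern half as low-low, while cells with few neighbors or mixed neighbors tend not to reach significance.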
Next, I used geographically weighted regression (GWR) in ArcGIS Pro to create linear models for each block group and understand how the strength and effect size of the relationship between tree canopy and each of these four variables change across space. GWR is used in a wide range of disciplines, including for environmental justice25 and health-related26 research. I began by building a multiscale geographically weighted regression (MGWR) with all four variables. Unlike standard GWR, MGWR finds the best bandwidth for each variable in a multivariate model, recognizing that each independent variable may interact with the dependent variable at a different scale.27 My analysis used the Golden Search function to optimize each bandwidth as well as the bisquare weighting method. While this analysis does not output an r2 value for each variable (my real interest), it was a good starting point to confirm that my variables created a decent model.
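The idea behind a golden-section bandwidth search can be sketched as follows. This is a generic Python illustration, not ArcGIS Pro's implementation: the `toy_score` function is a hypothetical stand-in for whatever model-fit criterion the software evaluates at each candidate bandwidth, and it assumes that criterion has a single minimum within the search range.

```python
import math


def golden_section_min(score, low, high, tol=1e-3):
    """Find the input that minimizes a single-minimum function on [low, high]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = low, high
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = score(c), score(d)
    while b - a > tol:
        if fc < fd:
            # The minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = score(c)
        else:
            # The minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = score(d)
    return (a + b) / 2


def toy_score(bandwidth):
    """Hypothetical fit criterion with its best value at a bandwidth of 42."""
    return (bandwidth - 42.0) ** 2 + 100.0


best_bandwidth = golden_section_min(toy_score, 10.0, 200.0)
```

Each iteration shrinks the search interval by a constant factor while evaluating the criterion only once, which matters when every evaluation means refitting a regression at every location.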
Finally, I ran univariate GWR models for each variable, again in ArcGIS Pro using Golden Search. These analyses identify the coefficients (effect size) and r2 (strength) values for each variable across space.
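A univariate GWR can be sketched in a few lines of Python to show where the local coefficients and local r2 values come from. The assumptions are substantial: locations sit on a hypothetical one-dimensional transect, the bandwidth is a fixed distance rather than one chosen by Golden Search, the data are invented, and the local r2 uses one common weighted definition that may differ from the one ArcGIS Pro reports.

```python
import math


def bisquare(distance, bandwidth):
    """Bisquare kernel: weight falls smoothly to zero at the bandwidth."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2


def local_fit(i, coords, x, y, bandwidth):
    """Weighted least squares of y on x centered on location i.

    Returns (intercept, slope, local_r2).
    """
    w = [bisquare(math.dist(coords[i], c), bandwidth) for c in coords]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    ss_tot = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return intercept, slope, 1.0 - ss_res / ss_tot


# Hypothetical transect of 20 locations: the relationship is positive in the
# west (y = x) and negative in the east (y = 10 - x).
coords = [(float(k), 0.0) for k in range(20)]
x = [float((2 * k) % 5) for k in range(20)]
y = [x[k] if k < 10 else 10.0 - x[k] for k in range(20)]
fits = [local_fit(i, coords, x, y, bandwidth=6.0) for i in range(20)]
```

At the western end the local slope is positive, at the eastern end it is negative, and near the middle the local r2 drops because one line cannot describe both regimes. This is the kind of spatial variation in sign and strength that a single citywide regression averages away.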
Results
Data exploration yielded four variables—percent white, median age, percent owner occupied, and percent with bachelor’s degree—that predict tree canopy cover. Linear regression models find that each of these variables alone accounts for roughly 9–15% of variation in tree canopy citywide. All variables exhibit clustering (Table 1).
Variable | Spearman's rho | Coefficient | R2 | Moran's Index |
---|---|---|---|---|
Percent Tree Canopy | -- | -- | -- | 0.689626 |
Percent White | 0.4294310 | 0.13162 | 0.1511 | 0.699215 |
Median Age | 0.3943120 | 0.33862 | 0.1149 | 0.296326 |
Percent Owner Occupied | 0.3611758 | 0.11218 | 0.1372 | 0.371869 |
Percent with Bachelor's Degree | 0.3412113 | 0.12122 | 0.09273 | 0.635945 |
Bivariate local Moran’s I shows clusters where a block group’s percent tree canopy value is more closely related to values of each explanatory variable at neighboring locations than would be expected in a random distribution.28 For example, high-high clusters in the percent tree and percent white map (Figure 2) reflect areas where denser tree canopy is near high percentages of white residents, and low-low clusters reflect areas where sparser tree canopy is near low percentages of white residents. This variable shows the most consistent pattern of high-high clustering, reflecting the tendency for both tree canopy and white residents to cluster in the same part of the city. While the other maps show more outliers, their general pattern corresponds to the general pattern of tree canopy cover in the city (Figure 1).
Figure 2: Bivariate local Moran’s I maps showing clustering of tree canopy with (clockwise from top left) percent white, median age, percent owner occupied, and percent with bachelor’s degree or higher.
The MGWR model combining all four variables yields an adjusted R-squared of 0.7824: this model accounts for roughly 78% of the variation in tree canopy across Denver. The residuals are close to normally distributed (Figure 3), with a global Moran’s index of 0.039995 (p-value = 0.103347). These results confirmed that my chosen variables do a decent job of explaining Denver’s tree canopy distribution.
Figure 3: Distribution of the MGWR model’s residuals. The mean is close to 0 and to the median, and the standard deviation is just over 1.0. This graph suggests that the model is a decent approximation of what is happening on the ground.
The univariate GWR models calculate a regression model for each block group, capturing how the relationship between each variable and tree canopy varies across the city. The r2 values show wide variation in the strength of each relationship across space, while the coefficients show negative rather than positive associations in some places (Figure 4). For example, the coefficients for percent white range from -0.30 to +0.82. The largest effect size (deeper green) of percent white is in the same place as the high-high cluster in our bivariate local Moran’s I map, in Denver’s historically affluent neighborhoods around Washington and Cheesman Parks and Cherry Creek.15 The negative effect size (deeper pink) corresponds with the high-white/low-tree outlier in the bivariate map, in the rapidly gentrifying Five Points / River North area.
Figure 4: Univariate GWR outputs, showing coefficients and local r2 values for each variable and Census block group. These results show wide variation in the explanatory power of our variables across space.
Discussion
Taken together, the bivariate local Moran’s I and GWR maps reflect layers of urban development processes interacting over decades to produce today’s uneven tree canopy. The complexity of these processes cannot be captured using a global measure like linear regression. While historical exclusionary practices like redlining likely underlie the strong positive relationship between whiteness and trees in Denver’s affluent neighborhoods, the negative relationship in Five Points may reflect gentrification happening today.29 Similarly, the inverse relationships between age and trees and percent owner occupied and trees near Capitol Hill may reflect Millennials’ desire to live in urban centers,21 while the outliers and inverse relationships between all variables but age and trees in Central Park may be explained by newer builds in this area. This analysis does support uneven distribution of trees by race and class, but the localized methods used here to account for spatial effects and non-stationarity reveal a more complex picture.
Limitations
This study has several limitations, including the following:
- Some factors excluded from the study (e.g., percent Hispanic, median year home built), either due to multicollinearity or data incompleteness, may be more salient.
- My decision to exclude block groups with zero population removed large urban parks, which may have influenced the analysis.
- Although Census block groups are a common scale of analysis in the literature,12,20,22 data at this scale have high margins of error. Data at the Census tract level also may be more complete, which may have allowed me to select other, more salient variables.
- While the results shown here are visually compelling, interpreting their meaning is complex. Multivariate measures of spatial autocorrelation can conflate geographical and attribute similarity,30 while the statistical significance of GWR results varies across space. My symbolization of GWR coefficients in particular does not follow Mennis’31 best practice of greying out areas lacking significance; between wanting to present a powerful poster and tell an interesting story, I did not want to grey out any results.
Conclusion
Denver is making strides to redress inequitable tree canopy,16 and multiple agency plans provide visions for a greener city.32–34 This study suggests that planners need to attend to the complexities underlying today’s land cover to achieve these goals equitably.
Residents in Denver’s tree-scarce areas fear that greening projects will displace them,35 a fear supported by academic research.11,36 Planners and others working to increase tree canopy and broaden access to nature must ask questions such as:
- How do we work with communities to co-create a solution they want?
- How do we increase tree canopy while reducing outdoor water use?
- How do we increase amenities without contributing to displacement?
- How do we reorient our public space away from cars and toward people?
It’s clear where we need more trees; getting them there is the hard part, particularly given the financial constraints faced by the city and competing demands on public space. As we work toward a greener city, we need to do our best to ensure that those who will benefit most from access to nature are not displaced by it.
Erika Jermé is a Master of Arts candidate in Applied Geography and Geospatial Science.