--- title: "R workflow for forest mapping and inference from ABA prediction models" author: "Jean-Matthieu Monnet" date: "`r Sys.Date()`" output: html_document: default pdf_document: default papersize: a4 bibliography: "../bib/bibliography.bib" --- ```{r setup, include=FALSE} # erase all cat("\014") rm(list = ls()) # knit options knitr::opts_chunk$set(echo = TRUE) # Set so that long lines in R will be wrapped: knitr::opts_chunk$set(tidy.opts = list(width.cutoff = 80), tidy = TRUE) knitr::opts_chunk$set(fig.align = "center") ``` --- The code below presents a workflow to map forest parameters using calibrated prediction models (see [previous tutorial](https://gitlab.irstea.fr/jean-matthieu.monnet/lidartree_tutorials/-/blob/master/area-based.2.model.calibration.Rmd)) and wall-to-wall coverage of the region of interest with airborne laser scanning. This tutorial is the last step of the so-called area-based approach. It is based on functions from `R` packages `lidaRtRee` and `lidR`. Licence: GNU GPLv3 / [Source page](https://gitlab.irstea.fr/jean-matthieu.monnet/lidartree_tutorials/-/blob/master/R/area-based.3.mapping.and.inference.Rmd) Many thanks to Pascal Obstétar for checking code and improvement suggestions. # Load data ## ABA prediction models ABA prediction models calibrated in the previous tutorial can be loaded from the R archive "data/aba.model/output/models.rda". Three models are available: * a single model calibrated with the whole field dataset, * a stratified model which is the combination of two models: one calibrated with field plots located in private forests, and the other with plots in public forests. ```{r loadModels, include = TRUE} load(file = "../data/aba.model/output/models.rda") ``` ## Airborne laser scanning data For wall-to-wall mapping, the ALS metrics that are included in the models have to be computed for each element of a partition of the ALS acquisition. The area of interest is usually divided in square pixels which surface is similar to the one of field plots used for model calibration. The workflow does not yet use the catalog processing capabilities of the package `lidR`. The batch processing of an ALS wall-to-wall thus cover relies on parallel processing of ALS point clouds organized in rectangular tiles, non-overlapping tiles. It is best if their borders correspond to multiple values of the pixel size. For example purpose, two adjacent ALS tiles are provided, in two versions: * one with altitude values, in the folder "../data/aba.model/ALS/tiles.laz", these will be used for terrain statistics computations, * one with (normalized) height values, in the folder "../data/aba.model/ALS/tiles.norm.laz", these will be used for tree detection and point metrics computation. Files are too large to be hosted on the gitlab repository, they can be downloaded as a [zip file](https://drive.google.com/u/0/uc?export=download&id=1ripo-PLZ8_IjE7rAQ2RECj-fjg1UpC5i) from Google drive, and should be extracted in the folder "data/aba.model/ALS/" before proceeding with the processing. Files can be automatically downloaded thanks to the `googledrive` package with the following code, this requires authenticating yourself and authorizing the package to deal on your behalf with Google Drive. ```{r downloadGoogledrive, include = TRUE, eval = FALSE} # set temporary file temp <- tempfile(fileext = ".zip") # download file from google drive dl <- googledrive::drive_download(googledrive::as_id("1ripo-PLZ8_IjE7rAQ2RECj-fjg1UpC5i"), path = temp, overwrite = TRUE) # unzip to folder out <- unzip(temp, exdir = "../data/aba.model/ALS/") # remove temporary file unlink(temp) ``` ```{r loadALS, include = TRUE, warning = FALSE} # build catalogs cata_height <- lidR::readALSLAScatalog("../data/aba.model/ALS/tiles.norm.laz/") cata_altitude <- lidR::readALSLAScatalog("../data/aba.model/ALS/tiles.laz/") lidR::opt_progress(cata_altitude) <- lidR::opt_progress(cata_height) <- FALSE # set projection info lidR::projection(cata_height) <- lidR::projection(cata_altitude) <- 2154 cata_height ``` ## GIS data In order to apply the corresponding models in the case of stratum specific calibration, a GIS layer defining the spatial extent of each category is required. The layer "Public4Montagnes" in the folder "data/aba.model/GIS" corresponds to public forests (Forêts publiques, ONF Paris, 2019). All remaining areas will be considered as private-owned. In order to exclude all non-forest areas from model application, another GIS layer will be used. It is the forest cover as defined by the [BD Forêt](https://inventaire-forestier.ign.fr/spip.php?article646) of IGN. Both layers are available under the [open licence Etalab 2.0](https://www.etalab.gouv.fr/licence-ouverte-open-licence). ```{r stratumExtraction, include=TRUE, message=FALSE, warning=FALSE, fig.width = 4, fig.height = 6} # load GIS layer of public forests public <- sf::st_read("../data/aba.model/GIS/Public4Montagnes.shp", stringsAsFactors = TRUE, quiet = TRUE) # load GIS layer of forest mask forest <- sf::st_read("../data/aba.model/GIS/ForestMask.shp", stringsAsFactors = TRUE, quiet = TRUE) # set coordinate system sf::st_crs(public) <- sf::st_crs(forest) <- 2154 ``` # Forest parameters mapping ## ALS metrics mapping First the ALS metrics used as independent variables in the ABA prediction models have to be mapped on the whole area. For this the SAME functions used for computing the metrics from the plot-extent cloud metrics should be used. Metrics are computed in each cell of the area of interest divided in square pixels. Pixel surface should be of the same size as calibration plots. Using round values is advisable in case other remote sensing data are considered for use. ```{r ALSMetrics, include=TRUE, message=FALSE, warning=FALSE, fig.width = 4, fig.height = 6} # resolution for metrics map resolution <- 25 ``` The following paragraph show how to compute metrics for a 300 x 300 m^2^ subsample of the first ALS tile. ```{r loadTile, include=TRUE, message=FALSE, warning=FALSE, fig.width = 12, fig.height = 6} # load ALS tile point_cloud_height <- lidR::clip_rectangle(cata_height, 899600, 6447600, 899900, 6447900) point_cloud_ground <- lidR::filter_ground(lidR::clip_rectangle(cata_altitude, 899600, 6447600, 899900, 6447900)) ``` ### Point cloud metrics For computation of point cloud metrics, the same function used in `lidaRtRee::clouds_metrics` is now supplied to `lidR::grid_metrics`. Some metrics are displayed hereafter. ```{r pointMetrics, include=TRUE, message=FALSE, warning=FALSE, fig.width = 12, fig.height = 4} # compute point metrics metrics_points <- lidR::grid_metrics(point_cloud_height, aba_point_metrics_fun, res = resolution) ``` ```{r plotpointMetrics, include=TRUE, echo=FALSE, message=FALSE, warning=FALSE, fig.width = 9, fig.height = 3} par(mfrow = c(1, 3)) for (i in c("zmean", "zsd", "imean")) { raster::plot(metrics_points[[i]], main = i) } ``` ### Tree metrics For tree metrics, three steps are required: * computation of the canopy height model, segmentation and extraction of trees; * summary statistics of tree features by target pixels (values calculated based on trees which apices are inside the pixel); * additional statistics about the percentage of cover and the mean canopy height (values calculated based on segmented tree crowns which are inside the target pixel). It is very important that tree segmentation parameters are the same as in the calibration steps, and that the function for aggregating tree attributes into raster metrics is the also the same as in the calibration step (except for the area which might differ slightly between the field plots and target cells). ```{r treeMetrics, include=TRUE, message=FALSE, warning=FALSE, fig.width = 6, fig.height = 6} # compute chm chm <- lidaRtRee::points2DSM(point_cloud_height, res = aba_res_chm) # replace NA, low and high values chm[is.na(chm) | chm < 0 | chm > 60] <- 0 # tree top detection (same parameters as used by clouds_tree_metrics in calibration step -> default parameters) segms <- lidaRtRee::tree_segmentation(chm) # extraction to data.frame trees <- lidaRtRee::tree_extraction(segms$filled_dem, segms$local_maxima, segms$segments_id) # compute raster metrics metrics_trees <- lidaRtRee::raster_metrics(trees[, -1], res = resolution, fun = function(x) { lidaRtRee::std_tree_metrics(x, resolution^2 / 10000) }, output = "raster" ) # remove NAs and NaNs metrics_trees[!is.finite(metrics_trees)] <- 0 # # compute canopy cover in trees and canopy mean height in trees # in region of interest, because it is not in previous step. r_treechm <- segms$filled_dem # set chm to 0 in non segment area r_treechm[segms$segments_id == 0] <- NA # compute raster metrics dummy <- lidaRtRee::raster_metrics(r_treechm, res = resolution, fun = function(x) { c( sum(!is.na(x$filled_dem) / (resolution / aba_res_chm)^2), mean(x$filled_dem, na.rm = T) ) }, output = "raster" ) names(dummy) <- c("TreeCanop_cover_in_plot", "TreeCanopy_meanH_in_plot") metrics_trees <- raster::addLayer(metrics_trees, dummy) ``` The detected trees are plotted below, with the CHM as background. ```{r plotTrees, include=TRUE, echo=FALSE, message=FALSE, warning=FALSE, fig.width = 6, fig.height = 6} par(mfrow = c(1, 1)) raster::plot(chm, main = "CHM and detected trees") sp::plot(trees, cex = trees$h / 60, add = TRUE) ``` Some metrics maps are displayed below. ```{r plottreeMetrics, include=TRUE, echo=FALSE, message=FALSE, warning=FALSE, fig.width = 9, fig.height = 3} par(mfrow = c(1, 3)) for (i in c("Tree_meanH", "Tree_sdH", "Tree_density")) { raster::plot(metrics_trees[[i]], main = i) } ``` ### Terrain metrics Terrain metrics can be computed from the altitude values of ground points. A wrapping function is required to pass `lidaRtRee::terrain_points_metrics` to `lidR::grid_metrics`. ```{r terrainMetrics, include=TRUE, message=FALSE, warning=FALSE} # compute terrain metrics f <- function(x, y, z) { as.list(lidaRtRee::terrain_points_metrics(data.frame(x, y, z))) } metrics_terrain <- lidR::grid_metrics(point_cloud_ground, ~ f(X, Y, Z), res = resolution) ``` ```{r plotterrainMetrics, include=TRUE, echo=FALSE, message=FALSE, warning=FALSE, fig.width = 9, fig.height = 3} par(mfrow = c(1, 3)) raster::plot(metrics_terrain[["altitude"]], main = "altitude") raster::plot(metrics_terrain$azimut_gr, main = "azimut (gr)", breaks = seq(from = 0, to = 400, length.out = 81), col = rainbow(81)) raster::plot(metrics_terrain$slope_gr, main = "Slope (gr)") ``` ### Batch processing of tiles For the batch processing of tiles, parallel processing with packages `foreach` and `doFuture` is used. The processing capabilities of `lidR` are not yet used. ```{r setupCluster, include=TRUE, message=FALSE, warning=FALSE, fig.width = 12, fig.height = 4} library(foreach) # create parallel frontend, specify to use two parallel sessions doFuture::registerDoFuture() future::plan("multisession", workers = 2L) ``` A buffering procedure is designed to handle the border effects when detecting trees at tile edges. A 10 m buffer is enough for tree metrics. One can also specify points classes to use, or apply a threshold to high points. Some parameters specified in the previous paragraphs are also required for the batch processing (output map resolution, chm resolution for tree segmentation, functions for metrics computation). ```{r paramCluster, include=TRUE, message=FALSE, warning=FALSE, fig.width = 12, fig.height = 4} # processing parameters b_size <- 10 # height threshold (m) for high points removal (points above this threshold are considered # as outliers) h_points <- 60 # points classes to retain for analysis (vegetation + ground) # ground should be 2 class_points <- c(0, 1, 2, 3, 4, 5) ``` This first chunk of code computes tree and point metrics from the normalized ALS tiles. ```{r batchProcessing, include=TRUE, message=FALSE, warning=FALSE, fig.width = 12, fig.height = 4} # processing by tile metrics <- foreach::foreach(i = 1:nrow(cata_height), .errorhandling = "remove") %dopar% { # tile extent b_box <- as.numeric(cata_height@data[i, c("Min.X", "Min.Y", "Max.X", "Max.Y")]) # load tile extent plus buffer a <- try(lidR::clip_rectangle( cata_height, b_box[1] - b_size, b_box[2] - b_size, b_box[3] + b_size, b_box[4] + b_size )) # # check if points are successfully loaded if (class(a) == "try-error") { return(NULL) } # add 'buffer' flag to points in buffer with TRUE value in this new attribute a <- lidR::add_attribute( a, a$X < b_box[1] | a$Y < b_box[2] | a$X >= b_box[3] | a$Y >= b_box[4], "buffer" ) # remove unwanted point classes, and points higher than height threshold a <- lidR::filter_poi(a, is.element(Classification, class_points) & Z <= h_points) # check number of remaining points if (length(a) == 0) { return(NULL) } # set negative heights to 0 a@data$Z[a@data$Z < 0] <- 0 # # compute chm chm <- lidaRtRee::points2DSM(a, res = aba_res_chm) # replace NA, low and high values by 0 chm[is.na(chm) | chm < 0 | chm > h_points] <- 0 # # compute tree metrics # tree top detection (default parameters) segms <- lidaRtRee::tree_segmentation(chm) # extraction to data.frame trees <- lidaRtRee::tree_extraction(segms$filled_dem, segms$local_maxima, segms$segments_id) # remove trees outside of tile trees <- trees[trees$x >= b_box[1] & trees$x < b_box[3] & trees$y >= b_box[2] & trees$y < b_box[4], ] # compute raster metrics metrics_trees <- lidaRtRee::raster_metrics(trees[, -1], res = resolution, fun = function(x) { lidaRtRee::std_tree_metrics(x, resolution^2 / 10000) }, output = "raster" ) # compute canopy cover in trees and canopy mean height in trees # in region of interest, because it is not in previous step. r_treechm <- segms$filled_dem # set chm to NA in non segment area r_treechm[segms$segments_id == 0] <- NA # compute raster metrics metrics_trees2 <- lidaRtRee::raster_metrics( raster::crop(r_treechm, raster::extent(b_box[1], b_box[3], b_box[2], b_box[4])), res = resolution, fun = function(x) { c( sum(!is.na(x$filled_dem)) / (resolution / aba_res_chm)^2, mean(x$filled_dem, na.rm = T) ) }, output = "raster" ) names(metrics_trees2) <- c("TreeCanopy_cover_in_plot", "TreeCanopy_meanH_in_plot") # # compute 1D height metrics # remove buffer points a <- lidR::filter_poi(a, buffer == 0) # if (length(a) == 0) { return(NULL) } # all points metrics metrics_points <- lidR::grid_metrics(a, aba_point_metrics_fun, res = resolution) # # extend / crop to match metrics_points metrics_trees <- raster::extend(metrics_trees, metrics_points, values = NA) metrics_trees2 <- raster::extend(metrics_trees2, metrics_points, values = NA) metrics_trees <- raster::crop(metrics_trees, metrics_points) metrics_trees2 <- raster::crop(metrics_trees2, metrics_points) # merge rasterstacks metrics <- metrics_points metrics <- raster::addLayer(metrics, metrics_trees) metrics <- raster::addLayer(metrics, metrics_trees2) return(metrics) } # mosaic rasters names_metrics <- names(metrics[[1]]) metrics_map <- do.call(raster::merge, metrics) names(metrics_map) <- names_metrics ``` The second chunk of code computes terrain metrics from the ALS tiles with altitude values. ```{r batchProcessingTerrain, include=TRUE, message=FALSE, warning=FALSE} # function to compute terrain metrics to input to grid_metrics f <- function(x, y, z) { as.list(lidaRtRee::terrain_points_metrics(data.frame(x, y, z))) } # metrics_terrain <- foreach::foreach(i = 1:nrow(cata_altitude), .errorhandling = "remove") %dopar% { # tile extent b_box <- as.numeric(cata_altitude@data[i, c("Min.X", "Min.Y", "Max.X", "Max.Y")]) # load tile extent plus buffer a <- try(lidR::clip_rectangle( cata_altitude, b_box[1], b_box[2], b_box[3], b_box[4] )) # check if points are successfully loaded if (class(a) == "try-error") { return(NULL) } # keep only ground points a <- lidR::filter_ground(a) lidR::grid_metrics(a, ~ f(X, Y, Z), res = resolution) } # mosaic rasters names_metrics <- names(metrics_terrain[[1]]) metrics_terrain <- do.call(raster::merge, metrics_terrain) names(metrics_terrain) <- names_metrics ``` Finally all metrics are combined in a single object. ```{r allMetrics, include=TRUE, message=FALSE, warning=FALSE} metrics_terrain <- raster::extend(metrics_terrain, metrics_map, values = NA) metrics_terrain <- raster::crop(metrics_terrain, metrics_map) metrics_terrain <- raster::dropLayer(metrics_terrain, "adjR2.plane") # merge rasterstacks metrics_map <- raster::addLayer(metrics_map, metrics_terrain) ``` If the metrics maps are to be displayed in external software, the tif format can be used to produce one single with all bands. Meanwhile band names are not retained, so it is useful to save them in a separated R archive. ```{r saveMetrics, include=TRUE, message=FALSE, warning=FALSE} metrics_names <- names(metrics_map) save(metrics_names, file = "../data/aba.model/output/metrics_names.rda") raster::writeRaster(metrics_map, file = "../data/aba.model/output/metrics_map.tif", overwrite = TRUE) save(metrics_map, file = "../data/aba.model/output/metrics_map.rda") ``` ## Forest parameters mapping ### Single (not stratified) model Forest parameters are then mapped using the mapped metrics and the previously calibrated model with the function `lidaRtRee::aba_predict`. ```{r mapForest, include=TRUE, message=FALSE, warning=FALSE, fig.width = 5.1, fig.height = 3} prediction_map <- lidaRtRee::aba_predict(model_aba, metrics_map) raster::plot(prediction_map, main = "prediction map") ``` ### Stratified models In case strata-specific models have been calibrated, it is necessary to add a layer containing the strata information in the metrics raster. For this, the existing vector layer of public forests is rasterized and added to the metrics raster. The levels in the layer metadata are filled, both in the ID (code: numeric) and stratum (label: character) fields. ```{r rasterizeStrat, include=TRUE, message=FALSE, warning=FALSE} # rasterize layer of public forest using value = 1 in polygons metrics_map$stratum <- raster::rasterize(public, metrics_map, field = 1) # set to 0 in non-public forests areas metrics_map$stratum[is.na(metrics_map$stratum)] <- 0 # label levels in the raster metadata levels(metrics_map$stratum) <- data.frame(ID = c(0, 1), stratum = c("private", "public")) # crop public vector public_cropped <- sf::st_crop(public, raster::extent(metrics_map)) ``` Then the function `lidaRtRee::aba_predict` is used to produce the map. If the input is an object containing the stratified model and if the metrics map contains the strata layer, then the model directly maps the output by retaining the corresponding model in each strata. ```{r mapForestStratified, include=TRUE, message=FALSE, warning=FALSE} # map with stratified model prediction_map_mixed <- lidaRtRee::aba_predict(model_aba_stratified_mixed, metrics_map, stratum = "stratum") # for checking purposes # extracted "private" stratum model from combined model model_private <- list(model = model_aba_stratified_mixed$model$private, stats = model_aba_stratified_mixed$stats["private", ]) # produce corresponding map prediction_map_private <- lidaRtRee::aba_predict(model_private, metrics_map) # extracted "public" stratum model from combined model model_public <- list(model = model_aba_stratified_mixed$model$public, stats = model_aba_stratified_mixed$stats["public", ]) # produce corresponding map prediction_map_public <- lidaRtRee::aba_predict(model_public, metrics_map) ``` ```{r plotmapForestStratified, include=TRUE, echo=FALSE, message=FALSE, warning=FALSE, fig.width = 9, fig.height = 5} # plot all maps par(mfrow = c(2, 2)) # plot raster and vector raster::plot(metrics_map$stratum, main = "Ownership", legend = FALSE, col = c("green", "blue"), xaxt = "n", yaxt = "n") legend("bottom", legend = c("private", "public"), fill = c("green", "blue")) limits <- range(c(raster::values(prediction_map_private), raster::values(prediction_map_public))) plot(public_cropped, col = NA, add = TRUE) raster::plot(prediction_map_mixed, main = "Stratified model", zlim = limits, xaxt = "n", yaxt = "n") plot(public_cropped, col = NA, add = TRUE) raster::plot(prediction_map_private, main = "Private model", zlim = limits, xaxt = "n", yaxt = "n", legend = FALSE) raster::plot(prediction_map_public, main = "Public model", zlim = limits, xaxt = "n", yaxt = "n", legend = FALSE) ``` ### Forest mask and thresholds To avoid applying models in non-forest areas and to remove the extremes values that may have been predicted due to outliers in the ALS and metrics values, the function `lidaRtRee::clean_raster` can be applied: * it applies a lower and upper threshold to map values, * it sets to 0 the NA values in the map, * it then multiplies the map with a mask where non-forest cells should have an NA value for those to propagate in the prediction map. The first step is to rasterize the forest mask. In the current example, all the area is forested so we will used the polygon with ID 27 as mask. ```{r cleanMap, include=TRUE, message=FALSE, warning=FALSE} # rasterize the forest mask forest_mask <- raster::rasterize(forest, metrics_map, field = "FID") # set values to 1 in polygon with ID 27, 0 outside raster::values(forest_mask) <- ifelse(raster::values(forest_mask) == 27, 1, NA) # crop forest vector forest_cropped <- sf::st_crop(forest, raster::extent(metrics_map)) # clean raster # threshold values and set to NA outside of forest prediction_map_mixed_clean <- lidaRtRee::clean_raster(prediction_map_mixed, c(0, 80), forest_mask) ``` ```{r plotcleanMap, include=TRUE, echo=FALSE, message=FALSE, warning=FALSE, fig.width = 12, fig.height = 2.5} par(mfrow = c(1, 3)) raster::plot(forest_mask, main = "Forest mask") plot(sf::st_geometry(forest_cropped), add = TRUE) raster::plot(prediction_map_mixed, main = "Stratified predictions map") raster::plot(prediction_map_mixed_clean, main = "Masked map") plot(sf::st_geometry(forest_cropped), add = TRUE) ``` # Inference Work in progress...