--- title: "R workflow for tree segmentation from ALS data" author: "Jean-Matthieu Monnet" date: "`r Sys.Date()`" output: html_document: default pdf_document: default papersize: a4 bibliography: "../bib/bibliography.bib" --- ```{r setup, include=FALSE} # erase all cat("\014") rm(list = ls()) # knit options knitr::opts_chunk$set(echo = TRUE) # Set so that long lines in R will be wrapped: knitr::opts_chunk$set(tidy.opts = list(width.cutoff = 80), tidy = TRUE) knitr::opts_chunk$set(fig.align = "center") # for display of rgl in html knitr::knit_hooks$set(webgl = rgl::hook_webgl) # output to html html <- TRUE ``` --- The code below presents a tree segmentation workflow from Airborne Laser Scanning (lidar remote sensing) data. The workflow is based on functions from `R` packages [lidaRtRee](https://cran.r-project.org/package=lidaRtRee) (tested with version `r packageVersion("lidaRtRee")`) and [lidR](https://cran.r-project.org/package=lidR) (tested with version `r packageVersion("lidR")`), and it includes the following steps: * treetop detection, + crown segmentation, + accuracy assessment with field inventory, + species classification Steps 1 and 3 are documented in [@Monnet10; @Monnet11c]. The detection performance of this algorithm was evaluated in a benchmark [@Eysn15]. Licence: GNU GPLv3 / [source page](https://gitlab.irstea.fr/jean-matthieu.monnet/lidartree_tutorials/-/blob/master/R/tree.detection.Rmd) Many thanks to Pascal Obstétar for checking code and improvement suggestions. ## Material ### Field inventory The field inventory corresponds to a 50 m x 50 m plot located in the Chablais mountain (France) [@Monnet11c, pp. 34]. All trees with a diameter at breast height above 7.5 cm are inventoried. The data is available in package `lidaRtRee`. ```{r loadTreeInventory, include = TRUE} # load dataset from package (default) data(tree_inventory_chablais3, package = "lidaRtRee") ``` Otherwise you can load your own data provided positions and heights are measured. ```{r prepareTreeInventory, eval=FALSE} # import field inventory fichier <- "chablais3_listeR.csv" tree.inventory <- read.csv(file = fichier, sep = ";", header = F, stringsAsFactors = TRUE) names(tree_inventory_chablais3) <- c("x", "y", "d", "h", "n", "s", "e", "t") # save as rda for later access # save(tree_inventory_chablais3,file="tree_inventory_chablais3.rda") ``` Attributes are: * `x` easting coordinate in Lambert 93 + `y` northing coordinate in Lambert 93 + `d` diameter at breast height (cm) + `h` tree height (m) + `n` tree number + `s` species abreviated as GESP (GEnus SPecies) + `e` appearance (0: missing or lying, 1: normal, 2: broken treetop, 3: dead with branches, 4: snag) + `t` tilted (0: no, 1: yes) ```{r displaytreeinventory, echo=FALSE} head(tree_inventory_chablais3, n = 3L) ``` Function `plot_tree_inventory` is designed to plot forest inventory data. Main species are European beech (*Fagus sylvatica*), Norway spruce (*Picea abies*) and silver fir (*Abies alba*). ```{r plotTreeInventory, include = TRUE, out.width = '50%', fig.dim=c(6.5, 5.5)} # display inventoried trees lidaRtRee::plot_tree_inventory(tree_inventory_chablais3 [, c("x", "y")], tree_inventory_chablais3$h, species = as.character(tree_inventory_chablais3$s) ) ``` The `ggplot2` package also provides nice outputs. ```{r ggplot2, include = TRUE, out.width = '50%', fig.dim=c(5.5, 5.5)} # use table of species of package lidaRtRee to always use the same color for a given species plot.species <- lidaRtRee::species_color()[levels(tree_inventory_chablais3$s), "col"] library(ggplot2) ggplot(tree_inventory_chablais3, aes(x = x, y = y, group = s)) + geom_point(aes(color = s, size = d)) + coord_sf(datum = 2154) + scale_color_manual(values = plot.species) + scale_radius(name = "Diameter") + geom_text(aes(label = n, size = 20), hjust = 0, vjust = 1) + labs(color = "Species") # titre de la légende ``` We define the region of interest (ROI) to crop ALS data to corresponding extent before further processing. ROI is set on the extent of tree inventory, plus a 10 meter buffer. ```{r roi, include = TRUE} # buffer to apply around ROI (meters) ROI_buffer <- 10 # ROI limits ROI_range <- data.frame(round(apply(tree_inventory_chablais3[, c("x", "y")], 2, range))) ``` ### Airborne Laser Scanning data In this tutorial, ALS data available in the `lidaRtRee` package is used. ```{r loadALS, include = TRUE, message = FALSE} # load data in package lidaRtRee (default) LASfile <- system.file("extdata", "las_chablais3.laz", package="lidaRtRee") las_chablais3 <- lidR::readLAS(LASfile) ``` Otherwise, ALS data is loaded with functions of package `lidR`. First a catalog of files containing ALS data is built. Then points located inside our ROI are loaded. ```{r prepareALS, eval=FALSE, message = FALSE, warning = FALSE} # directory for laz files lazdir <- "../data/tree.detection" # build catalog of files # specifying ALS data cata <- lidR::readALSLAScatalog(lazdir) # set coordinate system lidR::projection(cata) <- 2154 # extract points in ROI plus additional buffer las_chablais3 <- lidR::clip_rectangle( cata, ROI_range$x[1] - ROI_buffer - 5, ROI_range$y[1] - ROI_buffer - 5, ROI_range$x[2] + ROI_buffer + 5, ROI_range$y[2] + ROI_buffer + 5 ) # save as rda for easier access: # save(las_chablais3, file="las_chablais3.rda", compress = "bzip2") ``` ## Data preparation ### Digital Elevation Models From the ALS point cloud, digital elevation models are computed [@Monnet11c, pp. 43-46]. The Digital Terrain Model (DTM) represents the ground surface, it is computed by bilinear interpolation of points classified as ground. The Digital Surface Model (DSM) represents the canopy surface, it is computed by retaining in each raster's pixel the value of the highest point included in that pixel. The Canopy Height Model (CHM) is the normalized height of the canopy. It is computed by subtracting the DTM to the DSM. ```{r computeDEMs, include=TRUE, warning = FALSE} # define extent and resolution of raster output_raster <- raster::raster(resolution = 0.5, xmn = ROI_range$x[1] - ROI_buffer, xmx = ROI_range$x[2] + ROI_buffer, ymn = ROI_range$y[1] - ROI_buffer, ymx = ROI_range$y[2] + ROI_buffer ) # terrain model computed from points classified as ground dtm <- lidR::grid_terrain(las_chablais3, output_raster, lidR::tin()) # dtm_lidaRtRee <- lidaRtRee::points2DTM(lidR::filter_ground(las_chablais3), # res = 0.5, ROI_range$x[1] - ROI_buffer, # ROI_range$x[2] + ROI_buffer, ROI_range$y[1] - ROI_buffer, # ROI_range$y[2] + ROI_buffer # ) # surface model dsm <- lidR::grid_canopy(lidR::clip_roi(las_chablais3, raster::extent(output_raster)), output_raster, lidR::p2r()) # dsm_lidaRtRee <- lidaRtRee::points2DSM(las_chablais3, # res = 0.5, dtm@extent@xmin, # dtm@extent@xmax, dtm@extent@ymin, dtm@extent@ymax # ) # canopy height model chm <- dsm - dtm ``` ```{r plotDEMs, echo=FALSE, fig.width = 12, fig.height = 4.5, out.width='100%'} par(mfrow = c(1, 3)) # display DTM raster::plot(dtm, main = "DTM") # display DSM raster::plot(dsm, main = "DSM") # display CHM raster::plot(chm, main = "CHM") ``` ### Visual comparison of field inventory and ALS data A plot mask is computed from the inventoried positions, using a height-dependent buffer. Indeed, tree tops are not necessarily located verticaly above the trunk positions. In order to compare detected tree tops with inventoried trunks, buffers have to be applied to trunk positions to account for the non-verticality of trees. ```{r computeMask, include=TRUE, message=FALSE} # plot mask computation based on inventoried positions # convex hull of plot mask_chull <- lidaRtRee::raster_chull_mask(tree_inventory_chablais3[, c("x", "y")], dsm) # union of buffers around trees mask_tree <- lidaRtRee::raster_xy_mask( tree_inventory_chablais3[, c("x", "y")], 2.1 + 0.14 * tree_inventory_chablais3$h, dsm ) # union of convexHull and tree buffers mask_plot <- max(mask_chull, mask_tree) # vectorize plot mask mask_plot_v <- raster::rasterToPolygons(mask_plot, function(x) (x == 1), dissolve = T) ``` Displaying inventoried trees on the CHM shows a pretty good agreement of crowns visible in the CHM with trunk locations and sizes. ```{r plotPlot, include = TRUE, out.width = '70%', fig.dim=c(6.5, 4.5), warnings=FALSE} # display CHM raster::plot(chm, col = gray(seq(0, 1, 1 / 255)), main = "Canopy Height Model and tree positions" ) # add inventoried trees lidaRtRee::plot_tree_inventory(tree_inventory_chablais3[, c("x", "y")], tree_inventory_chablais3$h, species = as.character(tree_inventory_chablais3$s), add = TRUE ) # display plot mask raster::plot(mask_plot_v, border = "red", add = TRUE) ``` ## Tree delineation ### Segmentation Tree segmentation is performed on the Canopy Height Model by using a general function (`lidaRtRee::tree_segmentation`) which consists in the following steps: * image pre-processing (non-linear filtering and smoothing for noise removal), + local maxima filtering and selection for apex (local maximum) detection, + image segmentation with a watershed algorithm for crown delineation. The first two steps are documented in @Monnet11c, pp. 47-52. ```{r plotSegm, include=TRUE, fig.width = 12, fig.height = 4.5, out.width='100%'} # tree detection (default settings), applied on canopy height model segms <- lidaRtRee::tree_segmentation(chm) # par(mfrow = c(1, 3)) # display pre-processed chm raster::plot(segms$smoothed_dem, main = "Pre-processed CHM") # display selected local maxima raster::plot(segms$local_maxima, main = "Selected local maxima") # display segments, except ground segment dummy <- segms$segments_id dummy[dummy == 0] <- NA raster::plot(dummy %% 8, main = "Segments (random colors)", col = rainbow(8), legend = FALSE) ``` ### Extraction of apices positions and attributes A `data.frame` of detected apices located in the plot mask is then extracted with the function `tree_extraction.` Segments can be converted from raster to polygons but the operation is quite slow. Attributes are : * `id`: apex id + `x`: easting coordinate of apex + `y`: northing coordinate of apex + `h`: height of apex + `dom_radius`: distance of apex to nearest higher pixel of CHM + `s`: crown surface + `v`: crown volume + `sp`: crown surface inside plot + `vp`: crown volume inside plot ```{r plotSegmTrees, include=TRUE, out.width = '50%', fig.dim=c(4.5, 4.5)} # tree extraction only inside plot mask for subsequent comparison apices <- lidaRtRee::tree_extraction( segms$filled_dem, segms$local_maxima, segms$segments_id, mask_plot ) head(apices, n = 3L) # convert segments from raster to polygons segments_v <- raster::rasterToPolygons(segms$segments_id, dissolve = T) # # display initial image raster::plot(chm, col = gray(seq(0, 1, 1 / 255)), main = "CHM and detected positions") # display segments border sp::plot(segments_v, border = "white", add = T) # display plot mask sp::plot(mask_plot_v, border = "red", add = T) # display detected apices graphics::points(apices$x, apices$y, col = "blue", cex = apices$h / 20, pch = 2) ``` ## Detection evaluation ### Tree matching To assess detection accuracy, reference (field) trees should be linked to detected apices. Despite the possibility of error, automated matching has the advantage of making the comparison of results from different algorithms and settings reproducible and objective. The algorithm presented below is based on the 3D distance between detected treetops and inventory positions and heights [@Monnet10]. It returns an object with the pairs of matched reference trees and detected apices. The function `lidaRtRee::plot_matched` then plots the results. ```{r plotMached, include=TRUE, out.width = '70%', fig.dim=c(6.5, 4.5)} # match detected apices with field trees based on relative distance of apices matched <- lidaRtRee::tree_matching( tree_inventory_chablais3[, c("x", "y", "h")], cbind(apices@coords, apices$h) ) # display matching results lidaRtRee::plot_matched( tree_inventory_chablais3[, c("x", "y", "h")], cbind(apices@coords, apices$h), matched, chm, mask_plot_v ) ``` ### Detection accuracy ```{r detectionStats, include=FALSE} # height histogram of detections detection_stats <- lidaRtRee::hist_detection( tree_inventory_chablais3[, c("x", "y", "h")], cbind(apices@coords, apices$h), matched ) ``` Detection accuracy is evaluated thanks to the number of correct detections (`r detection_stats$true_detections`), false detections (`r detection_stats$false_detections`) and omissions (`r detection_stats$omissions`). In heterogeneous stands, detection accuracy is height-dependent, it is informative to display those categories on a height histogram. ```{r plotDetection, include=TRUE, out.width = '40%', fig.dim=c(4.5, 4.5)} # height histogram of detections detection_stats <- lidaRtRee::hist_detection( tree_inventory_chablais3[, c("x", "y", "h")], cbind(apices@coords, apices$h), matched ) ``` ### Height estimation accuracy ```{r heightRegression, include=FALSE} height_reg <- lidaRtRee::height_regression(tree_inventory_chablais3[, c("x", "y", "h")], cbind(apices@coords, apices$h), matched, species = tree_inventory_chablais3$s ) ``` For true detections, estimated heights can be compared to field-measured heights. Here, the root mean square error is `r round(height_reg$stats$rmse,1)`m, while the bias is `r round(height_reg$stats$bias,1)`m. The linear regression is displayed hereafter. ```{r plotRegression, include=TRUE, out.width = '40%', fig.dim=c(4.5, 4.5)} # linear regression between reference height and estimated height height_reg <- lidaRtRee::height_regression(tree_inventory_chablais3[, c("x", "y", "h")], cbind(apices@coords, apices$h), matched, species = tree_inventory_chablais3$s ) ``` ## Species classification The previous steps also output a segmentation of the CHM, i.e. each detected apex is associated to a crown segment in 2D. Next we aim at checking for relationships between the point cloud contained in the segment, and the species of the largest field tree contained in the segment. The processing steps are: * calculate statistical indices describing the point cloud for each segment, * extract the highest reference tree from each segment, * exploratory analysis of the relationship between the indices and the species. ### Points in segments Before computation of point cloud metrics in each segment, the whole point cloud is normalized to convert point altitude to height above ground. Points are then labeled with the id of the segment they belong to. A list of LAS objects corresponding to the segments is then created. ```{r pointsSegments, include=TRUE, warning=FALSE, message=FALSE} # normalize point cloud lasn <- lidR::normalize_height(las_chablais3, lidR::tin()) # add segment id in LAS object lasn <- lidR::merge_spatial(lasn, segms$segments_id, "seg_id") # put all seg_id values in ordered list liste_seg_id <- sort(unique(lasn$seg_id)) # set names of list equal to values names(liste_seg_id) <- liste_seg_id # extract point cloud for each segment id in a list las_l <- lapply(liste_seg_id, function(x) {lidR::filter_poi(lasn, seg_id == x)}) ``` ### Metrics computation Basic point cloud metrics are computed in each segment (maximum, mean, standard deviation of height, mean and standard deviation of intensity), with the function `lidaRtRee::clouds_metrics` which applies a function to all point clouds in a list, by passing it to `lidR::cloud_metrics`. Please refer to the help of this last function for the expected syntax of the user-defined function to compute metrics. ```{r metrics, include=TRUE, eval=TRUE} # compute basic las metrics in each segment metrics <- lidaRtRee::clouds_metrics(las_l, func = ~ list( maxZ = max(Z), meanZ = mean(Z), sdZ = sd(Z), meanI = mean(Intensity), sdI = sd(Intensity) )) # add segment id attribute metrics$seg_id <- row.names(metrics) head(metrics, n = 3L) ``` ### Merge with reference trees and detected apices Computed metrics are merged with reference trees and detected apices, thanks to the segment id. ```{r mergeMetrics, include=TRUE, eval=TRUE} # associate each reference tree with the segment that contains its trunk. tree_inventory_chablais3$seg_id <- raster::extract( segms$segments_id, tree_inventory_chablais3[, c("x", "y")] ) # create new data.frame by merging metrics and inventoried trees based on segment id tree_metrics <- base::merge(tree_inventory_chablais3, metrics) # remove non-tree segment tree_metrics <- tree_metrics[tree_metrics$seg_id != 0, ] # add metrics to extracted apices data.frame apices <- base::merge(apices, metrics, by.x = "id", by.y = "seg_id", all.x = T) ``` Plotting the reference trees with the mean intensity of lidar points in the segments shows that when they are dominant, broadleaved trees tend to have higher mean intensity than coniferous trees. ```{r Intensity, include=TRUE, out.width = '70%', fig.dim=c(6.5, 4.5)} # create raster of segment' mean intensity r_mean_intensity_segm <- segms$segments_id # match segment id with id in metrics data.frame dummy <- match(raster::values(r_mean_intensity_segm), apices$id) # replace segment id with corresponding mean intensity raster::values(r_mean_intensity_segm) <- apices$meanI[dummy] # display tree inventory with mean intensity in segment background raster::plot(r_mean_intensity_segm, main = "Mean intensity in segment") # display tree inventory lidaRtRee::plot_tree_inventory(tree_inventory_chablais3[, c("x", "y")], tree_inventory_chablais3$h, species = as.character(tree_inventory_chablais3$s), add = T ) ``` ### Exploratory analysis A boxplot of mean intensity in segments per species shows that mean intensity distribution is different between species. The analysis can be run by considering only the highest inventoried trees in each segment, as smaller trees are likely to be suppressed and have smaller contribution to the point cloud. ```{r metricsByHighestTree, include=FALSE, eval=TRUE} # create new variable orderer by segment id then decreasing height tree_metrics_h <- tree_metrics[order(tree_metrics$seg_id, -tree_metrics$h), ] i <- 2 # leave only first (highest) tree per segment id while (i < nrow(tree_metrics_h)) { if (tree_metrics_h$seg_id[i] == tree_metrics_h$seg_id[i + 1]) { tree_metrics_h <- tree_metrics_h[-(i + 1), ] } else { i <- i + 1 } } tree_metrics_h$s <- factor(tree_metrics_h$s) ``` ```{r boxplot1, include=TRUE, out.width = '80%', fig.dim=c(9.5, 5.5)} par(mfrow = c(1, 2)) boxplot(meanI ~ s, data = tree_metrics[, c("s", "maxZ", "meanZ", "sdZ", "meanI", "sdI")], ylab = "Mean intensity in segment", xlab = "Specie", main = "All inventoried trees", las = 2, varwidth = TRUE ) boxplot(meanI ~ s, data = tree_metrics_h, ylab = "Mean intensity in segment", xlab = "Specie", main = "Highest inventoried tree in segment", las = 2, varwidth = TRUE ) ``` ```{r boxplot2, include=FALSE, out.width = '100%', fig.dim=c(9.5, 7.5)} par(mfrow = c(2,3)) boxplot(maxZ ~ s, data = tree_metrics_h, ylab = "Max height in segment", xlab = "Specie", main = "Max height by species", las = 2, varwidth = TRUE ) boxplot(meanZ ~ s, data = tree_metrics_h, ylab = "Mean height in segment", xlab = "Specie", main = "Mean height by species", las = 2, varwidth = TRUE ) boxplot(sdZ ~ s, data = tree_metrics_h, ylab = "Sd(height) in segment", xlab = "Specie", main = "Sd(height) by species", las = 2, varwidth = TRUE ) boxplot(meanI ~ s, data = tree_metrics_h, ylab = "Mean intensity in segment", xlab = "Specie", main = "Mean intensity by species", las = 2, varwidth = TRUE ) boxplot(sdI ~ s, data = tree_metrics_h, ylab = "Sd(intensity) in segment", xlab = "Specie", main = "Sd(intensity) by species", las = 2, varwidth = TRUE ) ``` A linear discriminant analysis shows that it might be possible to discriminate between spruce, fir and the deciduous species, thanks to a combination of height and intensity variables. ```{r AFD, include=TRUE, out.width = '60%', fig.dim=c(6, 6)} # principal component analysis pca <- ade4::dudi.pca(tree_metrics_h[, c("maxZ", "meanZ", "sdZ", "meanI", "sdI")], scannf = F) # linear discriminant analysis lda <- ade4::discrimin(pca, tree_metrics_h$s, scannf = F, nf = 2) plot(lda) ``` The following line creates a new factor variable `Groups` with three groups: the species PIAB, ABAL and all others together. ```{r AFD.groups, include=TRUE, out.width = '60%', fig.dim=c(6, 6)} tree_metrics_h$Groups <- tree_metrics_h$s levels(tree_metrics_h$Groups)[!is.element(levels(tree_metrics_h$Groups), c("ABAL", "PIAB"))] <- "Other" ``` A linear discriminant analysis is performed on this new variable. ```{r AFD.groups.update, include=TRUE, out.width = '60%', fig.dim=c(6, 6)} lda <- ade4::discrimin(pca, tree_metrics_h$Groups, scannf = F, nf = 2) plot(lda) ``` ### Classification The function `MASS::lda` is used for prediction purposes. First, the cross-validation option (`CV = TRUE`) is chosen in order to produce a confusion matrix in a prediction case. ```{r AFD.MASS, include=TRUE, out.width = '60%', fig.dim=c(6, 6)} lda_MASS <- MASS::lda(tree_metrics_h[, c("maxZ", "meanZ", "sdZ", "meanI", "sdI")], tree_metrics_h$Groups, CV = TRUE) # confusion matrix matrix_confusion <- table(tree_metrics_h$Groups, lda_MASS$class) matrix_confusion # percentage of good classification round(sum(diag(matrix_confusion)) / sum(matrix_confusion) * 100, 1) # confidence interval of the percentage binom.test(sum(diag(matrix_confusion)), sum(matrix_confusion))$conf.int ``` Then the model is fitted and applied to all segments of the map. ```{r prediction.MASS, include=TRUE, fig.dim=c(9, 9)} # build model lda_MASS <- MASS::lda(tree_metrics_h[, c("maxZ", "meanZ", "sdZ", "meanI", "sdI")], tree_metrics_h$Groups) # apply model to metrics metrics$predicted_s <- predict(lda_MASS, metrics[, c("maxZ", "meanZ", "sdZ", "meanI", "sdI")])$class # apply model to trees tree_metrics_h$predicted_s <- predict(lda_MASS, tree_metrics_h[, c("maxZ", "meanZ", "sdZ", "meanI", "sdI")])$class ``` To display the classification results, an image of the classified segments is created and the highest field trees in each segment are displayed on top of it. Detentions are correct when: * purple dots are on purple segments (correct ABAL classification), * blue dots are on blue segments (correct PIAB classification), * other colors are on green segments (correct "others" classification). ```{r display.prediction.MASS, include=TRUE, fig.dim=c(7, 7)} # create image of predicted species species <- segms$segments_id # replace segment id by predicted species in image raster::values(species) <- metrics$predicted_s[match(raster::values(segms$segments_id), metrics$seg_id)] # remove ground segment species[segms$segments_id == 0] <- NA # convert to factor raster species <- raster::as.factor(species) # build raster attribute table rat rat <- raster::levels(species)[[1]] rat$Species <- levels(metrics$predicted_s) levels(species)[[1]] <- rat # retrieve reference colors target_col <- lidaRtRee::species_color()[rat$Species, "col"] # set NA color to green target_col[is.na(target_col)] <- "green" # display results raster::plot(species, col = target_col, legend = FALSE) legend("bottomright", legend=rat$Species, fill=target_col) sp::plot(segments_v, add = TRUE, border = "white") lidaRtRee::plot_tree_inventory(tree_metrics_h [, c("x", "y")], tree_metrics_h$h, bg = lidaRtRee::species_color()[as.character(tree_metrics_h$s), "col"], col = "black", pch = 21, add = TRUE ) ``` ```{r saveGIS, include=FALSE, warning=FALSE, message=FALSE} # # extract all apices # apices_all <- lidaRtRee::tree_extraction(segms$filled_dem, segms$local_maxima, segms$segments_id) # # save outputs # raster::writeRaster(chm, file = "./data/output/chm.tif", overwrite = TRUE) # raster::writeRaster(species, file = "./data/output/r_species.tif", overwrite = TRUE) # raster::writeRaster(segms$segments_id, file = "./data/output/r_segments.tif", overwrite = TRUE) # rgdal::writeOGR(segments_v, "./data/output/", "v_segments", driver = "ESRI Shapefile", overwrite = TRUE) # write.csv(apices_all, "./data/output/apices.csv", row.names = FALSE) # write.csv(tree_metrics_h, "./data/output/correct_detections.csv", row.names = FALSE) # write.csv(tree_inventory_chablais3, "./data/output/tree_inventory_chablais.csv", row.names = FALSE) ``` ```{r test, include=FALSE, fig.dim=c(9, 9)} # add attributes of apices to polygons by merging # !!!! for some reason the segment_id is messed up during the process # segments_v@data <- merge(segments_v@data, metrics, by.x = "segments_id", by.y = "seg_id") # remove ground segment # segments_v <- segments_v[segments_v$segments_id != 0,] ``` ## Display point cloud The point cloud can be displayed colored by segment, with poles at the location of inventoried trees. ```{r displayPointCloud, include=TRUE, eval=html, webgl=TRUE, fig.width=6, fig.height=6, warning=FALSE} rgl::par3d(mouseMode = "trackball") # parameters for interaction with mouse # select segment points and offset them to avoid truncated coordinates in 3d plot points.seg <- lasn@data[which(lasn@data$seg_id != 0), c("X", "Y", "Z", "seg_id")] points.seg$X <- points.seg$X - 974300 points.seg$Y <- points.seg$Y - 6581600 # plot point cloud rgl::plot3d(points.seg[, c("X", "Y", "Z")], col = points.seg$seg_id %% 10 + 1, aspect = FALSE) # # add inventoried trees tree_inventory_chablais3$z <- 0 for (i in 1:nrow(tree_inventory_chablais3)) { rgl::lines3d( rbind( tree_inventory_chablais3$x[i] - 974300, tree_inventory_chablais3$x[i] - 974300 ), rbind( tree_inventory_chablais3$y[i] - 6581600, tree_inventory_chablais3$y[i] - 6581600 ), rbind( tree_inventory_chablais3$z[i], tree_inventory_chablais3$z[i] + tree_inventory_chablais3$h[i] ) ) } ``` ```{r plotTreeModel1, include=FALSE, eval=FALSE} # Inventoried trees # Using package rLiDAR, a 3d view of the field inventory can be displayed # shape different between coniferous / deciduous rgl::rgl.open() rgl::rgl.bg(color = "white") for (i in 1:nrow(tree_inventory_chablais3)) { if (is.na(tree_inventory_chablais3$h[i]) | is.na(tree_inventory_chablais3$d[i])) { next } if (!is.element( as.character(tree_inventory_chablais3$s[i]), c("ABAL", "PIAB", "TABA") )) { rLiDAR::LiDARForestStand( crownshape = "halfellipsoid", CL = 0.6 * tree_inventory_chablais3$h[i], CW = tree_inventory_chablais3$h[i] / 4, HCB = 0.4 * tree_inventory_chablais3$h[i], dbh = tree_inventory_chablais3$d[i] / 50, resolution = "high", X = tree_inventory_chablais3$x[i], Y = tree_inventory_chablais3$y[i], mesh = F ) } else { rLiDAR::LiDARForestStand( crownshape = "cone", CL = 0.5 * tree_inventory_chablais3$h[i], CW = tree_inventory_chablais3$h[i] / 4, HCB = 0.5 * tree_inventory_chablais3$h[i], dbh = tree_inventory_chablais3$d[i] / 50, resolution = "high", X = tree_inventory_chablais3$x[i], Y = tree_inventory_chablais3$y[i], mesh = F, stemcolor = "burlywood4", crowncolor = "darkgreen" ) } } ``` ```{r plotTreeModel2, include=FALSE, eval=FALSE} # virtual apices from detection # shape different based on mean lidar intensity value (threshold 55) library(rLiDAR) rgl::rgl.open() rgl::rgl.bg(color = "white") for (i in 1:nrow(apices)) { if (apices$meanI[i] > 55) { rLiDAR::LiDARForestStand( crownshape = "halfellipsoid", CL = 0.6 * apices$h[i], CW = sqrt(4 * apices$s[i] / pi), HCB = 0.4 * apices$h[i], dbh = apices$h[i] / 50, resolution = "high", X = apices$x[i], Y = apices$y[i], mesh = F ) } else { rLiDAR::LiDARForestStand( crownshape = "cone", CL = 0.5 * apices$h[i], CW = sqrt(4 * apices$s[i] / pi), HCB = 0.5 * apices$h[i], dbh = apices$h[i] / 50, resolution = "high", X = apices$x[i], Y = apices$y[i], mesh = F, stemcolor = "burlywood4", crowncolor = "darkgreen" ) } } ``` ## Batch processing The following code exemplifies how to process numerous LAS files and extract apices for the whole area with parallel processing. The processing runs faster if data is provided as chunks to the segmentation algorithm, and results are then aggregated, rather than running on the full coverage of the data. In order to avoid border effects, chunks are provided to the algorithm as overlapping tiles. Results are cropped to prevent the same tree from appearing twice in the final results. Tile size and buffer size are important parameters : - tile size is a trade-off between the number of chunks to process and the amount of RAM required to process a single tile ; - buffer size is a trade-off between redundant processing of the overlap area, and assuring that a tree is which treetop is located at the border of a tile has its full crown within the buffer size. Tiles can be processed with parallel computing, within limits of the cluster's RAM and number of cores. The steps for processing a batch of las/laz files are : - build catalog of files - provide tiling parameters - provide segmentation parameters - provide output parameters - set cluster options for parallel processing - compute the X and Y coordinates of tiles - parallelize the processing with the package `future`, by sending to the parallel sessions the coordinates of tiles to process, and a function with the instructions to proceed : - load tile corresponding to the coordinates - compute CHM - segment and extract apices - aggregate list of results Be aware that in case tree segments are vectorized, some obtained polygons might overlap. The segmentation algorithm might not be deterministic and borders are sometimes not consistent when adjacent polygons are pasted from different tiles. ```{r batch, include=TRUE, out.width = '100%', fig.dim=c(8, 8), warning = FALSE} rm(list = ls()) # BUILD CATALOG OF FILES # folder containing the files lazdir <- "../data/forest.structure.metrics" # build catalog cata <- lidR::readALSLAScatalog(lazdir) # disable display of catalog processing lidR::opt_progress(cata) <- FALSE # set coordinate system lidR::projection(cata) <- 2154 # option to read only xyzc attributes (coordinates, intensity, echo order and classification) from height files lidR::opt_select(cata) <- "xyz" # # BATCH PROCESSING PARAMETERS # tile size to split area into chunks to process # trade-off between RAM capacity VS total number of tiles to process tile_size <- 70 # here 70 for example purpose with small area # buffer size: around each tile to avoid border effects on segmentation results # trade-off between making sure a whole tree crown is processed in case its top # is on the border VS duplicate processing buffer_size <- 10 # 5 m is minimum, 10 is probably better depending on tree size # # TREE SEGMENTATION PARAMETERS # set raster resolution resolution <- 1 # # OUTPUT PARAMETERS # option to vectorize crowns (set to FALSE if too slow) out_vectorize_apices <- TRUE # output canopy height models ? (set to FALSE if too much RAM used) out_chm <- TRUE # save chms on disk out_save_chm <- FALSE # # CLUSTER PARAMETERS # specify to use two parallel sessions future::plan("multisession", workers = 2L) # remove warning when using random numbers in parallel sessions options(future.rng.onMisuse = "ignore") # # COORDINATES OF TILES # low left corner x <- seq( from = floor(cata@bbox["x", "min"] / tile_size) * tile_size, by = tile_size, to = ceiling(cata@bbox["x", "max"] / tile_size - 1) * tile_size ) # [1:2] length_x <- length(x) y <- seq( from = floor(cata@bbox["y", "min"] / tile_size) * tile_size, by = tile_size, to = ceiling(cata@bbox["y", "max"] / tile_size - 1) * tile_size ) # [1:2] length_y <- length(y) # repeat coordinates for tiling while area x <- as.list(rep(x, length_y)) y <- as.list(rep(y, each = length_x)) # # PARALLEL PROCESSING # function takes coordinates of tile as arguments resultats <- future.apply::future_mapply( # i and j are the coordinated of the low left corner of the tile to process FUN = function(i, j) { # initialize output output <- list() output$name <- paste0(i, "_", j) # extract tile plus buffer las_tile <- lidR::clip_rectangle( cata, i - buffer_size, j - buffer_size, i + tile_size + buffer_size, j + tile_size + buffer_size ) # check if points are present if (nrow(las_tile@data) > 0) { # normalization if required # las_tile <- lidR::normalize_height(las_tile, lidR::tin()) # in this example LAS tiles are already normalized # compute canopy height model chm <- lidaRtRee::points2DSM(las_tile) # check all chm is not NA if (!all(is.na(raster::values(chm)))) { # output chm if asked for if (out_chm) { output$chm <- raster::crop(chm, raster::extent( i, i + tile_size, j, j + tile_size )) } # save on disk if (out_save_chm) { raster::writeRaster(raster::crop( chm, raster::extent( i, i + tile_size, j, j + tile_size ) ), file = paste0("chm_", i, "_", j, ".tif"), overwrite = TRUE ) } # # tree detection segms <- lidaRtRee::tree_segmentation(chm) # tree extraction apices <- lidaRtRee::tree_extraction( segms$filled_dem, segms$local_maxima, segms$segments_id ) # remove apices in buffer area apices <- apices[apices$x >= i & apices$x < i + tile_size & apices$y >= j & apices$y < j + tile_size, ] # add tile id to apices to avoid duplicates in final file apices$tile <- paste0(i, "_", j) # convert to vectors if option is TRUE if (out_vectorize_apices) { # vectorize apices_v <- raster::rasterToPolygons(segms$segments_id, dissolve = T) # remove polygons which treetop is in buffer apices_v <- apices_v[is.element(apices_v$segments_id, apices$id), ] # names(apices_v) <- "id" # add attributes # errors when using sp::merge so using sp::over even if it is probably slower # merge(apices_v@data, apices, all.x = TRUE) apices_v@data <- cbind(apices_v@data, sp::over(apices_v, apices)) # save in list output$apices_v <- apices_v } # save apices in list output$apices <- apices } # end of raster is not all NAs check } # end of nrow LAS check output }, x, y, SIMPLIFY = FALSE ) # function applied to the lists of coordinates (x and y) # # RESULTS AGGREGATION # extract results from nested list into separate lists and then bind data id <- unlist(lapply(resultats, function(x) x[["name"]])) # # apices apices <- lapply(resultats, function(x) x[["apices"]]) # remove NULL elements apices[sapply(apices, is.null)] <- NULL # bind remaining elements apices <- do.call(rbind, apices) # # chm if (out_chm) { chm <- lapply(resultats, function(x) x[["chm"]]) # merge chm # no names in list otherwise do.call returns an error chm_all <- do.call(raster::merge, chm) names(chm) <- id } # apices_v if (out_vectorize_apices) { apices_v <- lapply(resultats, function(x) x[["apices_v"]]) # remove NULL elements apices_v[sapply(apices_v, is.null)] <- NULL apices_v <- do.call(rbind, apices_v) # 1-pixel overlapping in apices_v might be present because image segmentation # is not fully identical in overlap areas of adjacent tiles. } ``` The following image displays the results for the whole area. ```{r batch.plot, include=TRUE, out.width = '90%', fig.dim=c(8.5, 5.5)} # threshold outsiders in chm chm_all[chm_all > 40] <- 40 chm_all[chm_all < 0] <- 0 # display chm raster::plot(chm_all, main = "Canopy Height Model and segments" ) # display segments border sp::plot(apices_v, border = "white", add = T) # add apices sp::plot(apices, cex = apices$h / 40, add = TRUE, pch = 2) ``` The following lines save outputs to files. ```{r batch.export, eval = FALSE} # merged chm raster::writeRaster(chm_all, file = "chm.tif", overwrite = TRUE) # apices sf::st_write(sf::st_as_sf(apices), 'apices_points.gpkg', 'apices_points', delete_dsn = TRUE) # vectorized apices if (out_vectorize_apices) sf::st_write(sf::st_as_sf(apices_v), 'v_apices_points.gpkg', 'v_apices_points', delete_dsn = T) ``` ## References