Commit 601fe438 authored by Monnet Jean-Matthieu

Updated area-based after correction of point metrics (buffer removal before computation).

Showing with 134 additions and 169 deletions
......@@ -29,7 +29,7 @@ Many thanks to Pascal Obstétar for checking code and improvement suggestions.
# Load data
The "Quatre Montagnes" dataset from France, prepared as described in the [data preparation tutorial](https://gitlab.irstea.fr/jean-matthieu.monnet/lidartree_tutorials/-/blob/master/R/area-based.1.data.preparation.Rmd) is loaded from the R archive files located in the folder "data/aba.model/output".
The "Quatre Montagnes" dataset from France, prepared as described in the [data preparation tutorial](https://gitlab.irstea.fr/jean-matthieu.monnet/lidartree_tutorials/-/blob/master/R/area-based.1.data.preparation.Rmd) is loaded from the `R` archive files located in the folder "data/aba.model/output".
## Field data
......@@ -88,7 +88,7 @@ Two types of vegetation metrics can be computed.
## Point cloud metrics
Point cloud metrics are computed with the function `lidaRtRee::clouds_metrics`, which applies the `lidR::cloud_metrics` to all point clouds in a list. Default computed metrics are those proposed by the function [`lidR::stdmetrics`](https://github.com/Jean-Romain/lidR/wiki/stdmetrics). Additional metrics are available with the function `lidaRtRee::aba_metrics`. The buffer points, which are located outside of the plot extent inventoried on the field, should be removed before computing those metrics
Point cloud metrics are computed with the function `lidaRtRee::clouds_metrics`, which applies the function `lidR::cloud_metrics` to all point clouds in a list. Default computed metrics are those proposed by the function [`lidR::stdmetrics`](https://github.com/Jean-Romain/lidR/wiki/stdmetrics). Additional metrics are available with the function `lidaRtRee::aba_metrics`. The buffer points, which are located outside of the plot extent inventoried on the field, should be removed before computing those metrics.
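
The effect of removing buffer points before computing metrics can be illustrated with a minimal base R sketch. It is not the tutorial's implementation: the helper `remove_buffer`, the toy point cloud and the 15 m radius are assumptions made for illustration only, and no `lidR` or `lidaRtRee` function is called.

```r
# Hypothetical helper (not part of lidR or lidaRtRee): keep only the points
# located within `radius` of the plot center, i.e. drop the buffer points.
remove_buffer <- function(points, center_x, center_y, radius) {
  inside <- (points$X - center_x)^2 + (points$Y - center_y)^2 <= radius^2
  points[inside, , drop = FALSE]
}

# Toy point cloud (assumed values): the last point lies in the buffer.
toy_cloud <- data.frame(
  X = c(0, 5, 14, 20),
  Y = c(0, 5, 0, 20),
  Z = c(10, 20, 30, 40)
)

# Assumed plot geometry: center at (0, 0), radius of 15 m.
plot_points <- remove_buffer(toy_cloud, center_x = 0, center_y = 0, radius = 15)

# Mean height computed with and without the buffer point.
zmean_with_buffer <- mean(toy_cloud$Z)
zmean_plot_only <- mean(plot_points$Z)
```

In this toy case the buffer point is the highest one, so keeping it would bias the mean height upwards relative to what was inventoried in the field plot.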
```{r computeMetrics, include=TRUE}
# define function for later use
......@@ -159,37 +159,27 @@ model_aba$stats
The function computes values predicted in leave-one-out cross-validation, by using the same combination of dependent variables and fitting the regression coefficients with all observations except one. Predicted values can be plotted against field values with the function `lidaRtRee::aba_plot`. It is also informative to check the correlation of prediction errors with other forest or environmental variables.
In this example, only tree metrics are selected in the basal area prediction model. The model seems to fail to predict large values. The prediction errors are positively correlated with basal area because large values are under-estimated.
The model seems to fail to predict large values, and the prediction errors are positively correlated with basal area.
```{r modelPlot, include=TRUE, fig.height = 4.5, fig.width = 8}
```{r modelCorrelation, include=TRUE, fig.height = 4.5, fig.width = 8}
# check correlation between errors and other variables
round(cor(cbind(model_aba$values$residual, plots[subsample, c("G_m2_ha", "N_ha", "D_mean_cm")], metrics_terrain[subsample, 1:3])), 2)[1, ]
# significance of correlation value
cor.test(model_aba$values$residual, plots[subsample, variable])
# plot predicted VS field values
par(mfrow = c(1, 2))
lidaRtRee::aba_plot(model_aba, main = variable)
plot(plots[subsample, c("G_m2_ha")], model_aba$values$residual, ylab = "Prediction errors", xlab = "Field values")
abline(h = 0, lty = 2)
```
In case only point cloud metrics are used as potential inputs, the errors are hardly better distributed. Coloring points by ownership shows that plots located in private forests have the largest basal area values which tend to be under-estimated.
```{r metrics_pointsOnly, include=TRUE, fig.height = 4.5, fig.width = 8}
model_aba_metrics_points <- lidaRtRee::aba_build_model(plots[subsample, variable], metrics_points[subsample, ], transform = "boxcox", nmax = 4, xy = plots[subsample, c("X", "Y")])
# renames outputs
row.names(model_aba_metrics_points$stats) <- names(model_aba_metrics_points$model) <- variable
# model_aba_metrics_points$model[[variable]]
model_aba_metrics_points$stats
# cor.test(model_aba_metrics_points$values$residual, plots[subsample, variable])
Coloring points by ownership shows that plots located in private forests have the largest basal area values.
```{r modelPlot, include=TRUE, fig.height = 4.5, fig.width = 8}
par(mfrow = c(1, 2))
# plot predicted VS field values
lidaRtRee::aba_plot(model_aba_metrics_points,
lidaRtRee::aba_plot(model_aba,
main = variable,
col = ifelse(plots$stratum == "public", "green", "blue")
)
legend("topleft", c("public", "private"), col = c("green", "blue"), pch = 1)
plot(plots[subsample, c("G_m2_ha")],
model_aba_metrics_points$values$residual,
model_aba$values$residual,
ylab = "Prediction errors", xlab = "Field values",
col = ifelse(plots$stratum == "public", "green", "blue")
)
......@@ -220,6 +210,8 @@ The obtained models are presented below. The table columns correspond to:
* `CV-RMSE.%` coefficient of variation of the Root Mean Square Errors of prediction in CV (%),
* `CV-RMSE` Root Mean Square Error of prediction in CV.
The two largest (outlier) values of mean diameter are underestimated by the model, which greatly decreases the accuracy statistics. This might be explained by the fact that when trees reach maturity, diameter growth continues while height growth almost stops. As the ALS point cloud mostly contains height information, there is some signal saturation for high mean diameter values. It might also be the case for high biomass values.
```{r multipleModelsTable, echo = FALSE, fig.width = 12, fig.height = 4.5}
# prepare output for report
table_output <- cbind(
......@@ -242,14 +234,12 @@ rm(models_aba, model_stats)
## Motivation
When calibrating a statistical relationship between forest stand parameters, which are usually derived from diameter measurements, and ALS metrics, one relies on the hypothesis that the interaction of laser pulses with the leaves and branches structure is constant on the whole area. However, differences can be expected either due to variations in acquisition settings (flight parameters, scanner model), in forests (stand structure and composition) or in topography (slope). Better models might be obtained when calibrating stratum-specific relationships, provided each stratum is more homogeneous regarding the laser / vegetation interaction. A trade-off has to be achieved between the within-strata homogeneity and the number of available plots for calibration in each stratum. A minimum number of plots is approximately 50, while 100 would be recommended. In this example we hypothesize that ownership reflects both structure and composition differences in forest stands.
When calibrating a statistical relationship between forest stand parameters, which are usually derived from diameter measurements, and ALS metrics, one relies on the hypothesis that the interaction of laser pulses with the leaves and branches structure is constant on the whole area. However, differences can be expected either due to variations in acquisition settings (flight parameters, scanner model), in forests (stand structure and composition) or in topography (slope). Better models might be obtained when calibrating stratum-specific relationships, provided each stratum is more homogeneous regarding the laser interaction with the vegetation. A trade-off has to be achieved between the within-strata homogeneity and the number of available plots for calibration in each stratum. A minimum number of plots is approximately 50, while 100 would be recommended. In this example we hypothesize that ownership reflects both structure and composition differences in forest stands.
## Calibration of stratum-specific models
Stratum-specific models are computed and stored in a list during a `for` loop. The function `lidaRtRee::aba_combine_strata` then combines the list of models corresponding to each stratum to compute aggregated statistics for all plots, making it easier to compare stratified with non-stratified models.
In this example, the model for "private" ownership yields a large error on the plot "Verc-C5-1", which considerably lowers the accuracy of the stratified approach.
```{r stratifiedmodelCalibration, include=TRUE, warning = FALSE}
# stratification variable
strat <- "stratum"
......@@ -335,6 +325,9 @@ The following lines save the data required for the [area-based mapping step](htt
save(model_aba_stratified_mixed, model_aba, aba_point_metrics_fun, aba_res_chm,
file = "../data/aba.model/output/models.rda"
)
```
```{r saveForlidaRtRee, include=FALSE, eval=FALSE}
# save data for lidaRtRee package
# quatre_montagnes <- cbind(plots, metrics)
# save(quatre_montagnes, file = "quatre_montagnes.rda")
......