biodivMapR.Rmd 15.1 KB
Newer Older
Florian de Boissieu's avatar
Florian de Boissieu committed
1
---
Florian de Boissieu's avatar
Florian de Boissieu committed
2
title: "Tutorial for biodivMapR"
Florian de Boissieu's avatar
Florian de Boissieu committed
3
4
5
author: "Jean-Baptiste Féret, Florian de Boissieu"
date: "`r Sys.Date()`"
output: 
floriandeboissieu's avatar
floriandeboissieu committed
6
  html_vignette:
Florian de Boissieu's avatar
Florian de Boissieu committed
7
8
9
10
11
12
13
    number_sections: true
vignette: >
  %\VignetteIndexEntry{Tutorial}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

14
15
16
17
18
19
20
21
22
```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval=FALSE
)
```


Florian de Boissieu's avatar
Florian de Boissieu committed
23
24
25
26
27
28
29
30
31
32
33
34
35
36
This tutorial aims at describing the processing workflow and giving the associated code to compute $\alpha$ and $\beta$ diversity maps on an extraction of Sentinel-2 image taken over Cameroun forest. The workflow is composed of the following steps:

* define the processing parameters

    * input / output files paths
    * output spatial resolution
    * computing options
    
* compute the $\alpha$ and $\beta$ diversity maps:
    * compute PCA and select best components
    * compute the diversity maps
    
* validate results comparing to field plots measurements

floriandeboissieu's avatar
floriandeboissieu committed
37
Below is the typical flow chart of the computation of diversity maps with __biodivMapR__ :
Florian de Boissieu's avatar
Florian de Boissieu committed
38

floriandeboissieu's avatar
floriandeboissieu committed
39
<img align="bottom" width="100%" height="100%" src="../man/figures/Figure1_FlowChart_biodivMapR_Process_600DPI.png">
Florian de Boissieu's avatar
Florian de Boissieu committed
40

floriandeboissieu's avatar
floriandeboissieu committed
41
42
43


# Define processing parameters
Florian de Boissieu's avatar
Florian de Boissieu committed
44
45

## Input / Output files
jbferet's avatar
jbferet committed
46
47
The input images are expected to be in ENVI HDR format, BIL interleaved. 
The functions `perform_radiometric_filtering` and `perform_PCA` start with a procedure checkin if the image format is as expected. if not, the functions will retrun a message explaining the problem and will stop the process.
Florian de Boissieu's avatar
Florian de Boissieu committed
48

jbferet's avatar
jbferet committed
49
If the format is not ENVI format BIL interleaved, the function `raster2BIL` allows conversion into proper format and returns updated `Input_Image_File`
Florian de Boissieu's avatar
Florian de Boissieu committed
50

jbferet's avatar
jbferet committed
51
52
53
54
55
Spectral bands should be defined if the image is multi or hyperspectral image. 

A mask can also be set to work on a selected part of the input image. The mask is expected to be a raster in the same format as the image (ENVI HDR), with values 0 = masked or 1 = selected. Only one band is required. If no mask is to be used set `Input_Mask_File = FALSE`.

The output directory defined with `Output_Dir` will contain all the results. For each image processed, a subdirectory will be automatically created after its name.
Florian de Boissieu's avatar
Florian de Boissieu committed
56

57
58

```{r Input / Output files}
jbferet's avatar
jbferet committed
59
Input_Image_File  = system.file('extdata', 'RASTER', 'S2A_T33NUD_20180104_Subset', package = 'biodivMapR')
Florian de Boissieu's avatar
Florian de Boissieu committed
60

jbferet's avatar
jbferet committed
61
62
63
64
# Input.Image.File  = raster2BIL(Raster.Path = Input.Image.File,
#                                        Sensor = 'SENTINEL_2A',
#                                        Convert.Integer = TRUE,
#                                        Output.Directory = '~/test')
Florian de Boissieu's avatar
Florian de Boissieu committed
65

jbferet's avatar
jbferet committed
66
67
68
Input_Mask_File   = FALSE

Output_Dir        = 'RESULTS'
Florian de Boissieu's avatar
Florian de Boissieu committed
69
70
```

71
72
The image provided with the package is a subset of tile T33NUD acquired by Sentinel-2A satellite over Cameroonese rainforest in January 4th, 2018. 
<img align="bottom" width="100%" height="100%" src="../man/figures/01_RGB_S2A_T33NUD_20180104_Subset.png">
73

Florian de Boissieu's avatar
Florian de Boissieu committed
74
## Spatial resolution
jbferet's avatar
jbferet committed
75
The algorithm estimates \alpha and \beta diversity within a window, that is also the output spatial resolution. It is defined in number of pixels of the input image with parameter `window_size`, e.g. `window_size = 10` meaning a window of 10x10 pixels. It will be the spatial resolution of the ouput rasters.
Florian de Boissieu's avatar
Florian de Boissieu committed
76
77

As a rule of thumb, spatial units between 0.25 and 4 ha usually match with ground data.
jbferet's avatar
jbferet committed
78
if `window_size` is too small, it will result in low number of pixels per spatial unit, hence limited range of variation of diversity in the image.
Florian de Boissieu's avatar
Florian de Boissieu committed
79

jbferet's avatar
jbferet committed
80
In this example, the spatial resolution of the input raster is 10m. Setting `window_size = 10` will result in diversity maps of spatial resolution 100m x 100m.
Florian de Boissieu's avatar
Florian de Boissieu committed
81

82
```{r Spatial resolution}
jbferet's avatar
jbferet committed
83
window_size = 10
Florian de Boissieu's avatar
Florian de Boissieu committed
84
85
86
87
88
```

## PCA filtering
If set to `TRUE`, a second filtering based on PCA outliers is processed.

89
```{r PCA filtering}
90
FilterPCA = FALSE
Florian de Boissieu's avatar
Florian de Boissieu committed
91
92
93
94
95
```

## Computing options
The use of computing ressources can be controled with the following parameters: 

jbferet's avatar
jbferet committed
96
97
* `nbCPU` controls the parallelisation of the processing: how many CPUs will be asssigned for multithreading, 
* `MaxRAM` controls the size in GB of the input image chunks processed by each thread (this does not correspond to the max amount of RAM allocated),
jbferet's avatar
jbferet committed
98
* `nbclusters` controls the number of clusters defined by k-means clustering for each repetition. Images showing moderate diversity (temperate forests) may need lower number of clusters (20), whereas highly diverse tropical forest require more (50). The larger the value the longer the computation time. Users are invited to experiment with different numbers of clusters in order to identify the snsitivity of divrsity patterns to this parameter.
Florian de Boissieu's avatar
Florian de Boissieu committed
99

100
```{r Computing options}
jbferet's avatar
jbferet committed
101
nbCPU         = 2
Florian de Boissieu's avatar
Florian de Boissieu committed
102
103
104
105
MaxRAM        = 0.5
nbclusters    = 50
```

floriandeboissieu's avatar
floriandeboissieu committed
106
107
# Compute diversity maps

Florian de Boissieu's avatar
Florian de Boissieu committed
108
109
## Mask non vegetated / shaded / cloudy pixels

110
```{r Mask non vegetated / shaded / cloudy pixels}
jbferet's avatar
jbferet committed
111
112
113
NDVI_Thresh = 0.5
Blue_Thresh = 500
NIR_Thresh  = 1500
Florian de Boissieu's avatar
Florian de Boissieu committed
114
print("PERFORM RADIOMETRIC FILTERING")
jbferet's avatar
jbferet committed
115
116
117
Input_Mask_File = perform_radiometric_filtering(Input_Image_File, Input_Mask_File, Output_Dir,
                                            NDVI_Thresh = NDVI_Thresh, Blue_Thresh = Blue_Thresh,
                                            NIR_Thresh = NIR_Thresh)
Florian de Boissieu's avatar
Florian de Boissieu committed
118
119
120
```

## PCA
121
A pixel-based PCA is run on the input image across the spectral bands to select the most interesting spectral information relative to spectral diversity and remove shaded pixels, spatial noise and sensor artefacts. 
Florian de Boissieu's avatar
Florian de Boissieu committed
122

123
124
125
126
127
The resulting PCA raster are then stored in a binalry file in the results directory, which in our case corresponds to
`RESULTS/S2A_T33NUD_20180104_Subset/SPCA/PCA/OutputPCA_8_PCs`

This PCA raster file can be displayed using QGIS or any GIS / image processing software. Here, the PCs corresponding to our image look like this: 

jbferet's avatar
jbferet committed
128
129
<img align="bottom" width="100%" height="100%" src="../man/figures/PCs_1234.png">
<img align="bottom" width="100%" height="100%" src="../man/figures/PCs_5678.png">
130
131
132
133

This PCA band selection left to user judgement, who then writes the band to be kept in a `.txt` file located in the same directory as the PCA raster file. The file is automatically created and ready to edit with function `select_PCA_components`. Each selected band should be identified per line in this file. 
The main goal of PC selection is to discard PCs showing no relevant information corresponding to vegetation, or including artifacts possibly explained by sensor properties. It is somehow a subjective process, and we are currently working on automatic selection of these components.

jbferet's avatar
jbferet committed
134
For this example, PCA bands 1, 2, 5, 6 and 8 can be kept if writing the following lines in file `Selected_Components.txt` opened for edition (do not forget carriage return after last value):
Florian de Boissieu's avatar
Florian de Boissieu committed
135
136
137
138
```
1
2
5
139
140
6
8
jbferet's avatar
jbferet committed
141

Florian de Boissieu's avatar
Florian de Boissieu committed
142
```
143
PC#3 and PC#7 were discarded as the main patterns observed for these components did not correspond to vegetation patterns. As a rule of thumb, between 2 and 6 selected PCs are usually sufficient to catch the main diversity patterns, but this selection strongly depends on the conditions of acquisition, the type of sensor, and the complexity of the vegetation being observed.
Florian de Boissieu's avatar
Florian de Boissieu committed
144
145
Here is the code to perform PCA and select PCA bands:

146
```{r PCA}
Florian de Boissieu's avatar
Florian de Boissieu committed
147
print("PERFORM PCA ON RASTER")
jbferet's avatar
jbferet committed
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
PCA_Output        = perform_PCA(Input_Image_File, Input_Mask_File, Output_Dir,
                               FilterPCA = TRUE, nbCPU = nbCPU,MaxRAM = MaxRAM)
# path for the PCA raster
PCA_Files         = PCA_Output$PCA_Files
# number of pixels used for each partition used for k-means clustering
Pix_Per_Partition = PCA_Output$Pix_Per_Partition
# number of partitions used for k-means clustering
nb_partitions     = PCA_Output$nb_partitions
# path for the updated mask
Input_Mask_File   = PCA_Output$MaskPath
# parameters of the PCA model
PCA_model         = PCA_Output$PCA_model
# definition of spectral bands to be excluded from the analysis
SpectralFilter    = PCA_Output$SpectralFilter

Florian de Boissieu's avatar
Florian de Boissieu committed
163
print("Select PCA components for diversity estimations")
jbferet's avatar
jbferet committed
164
select_PCA_components(Input_Image_File, Output_Dir, PCA_Files, File_Open = TRUE)
Florian de Boissieu's avatar
Florian de Boissieu committed
165
166
```

167

Florian de Boissieu's avatar
Florian de Boissieu committed
168
169
## $\alpha$ and $\beta$ diversity maps

jbferet's avatar
jbferet committed
170
The first step towards  \alpha and \beta diversity mapping corresponds to the computation of a `SpectralSpecies` map, which identifies the cluster ('spectral species') assigned to each pixel in the image, after k-means clustering is performed.
171
172
173


```{r Spectral species map}
Florian de Boissieu's avatar
Florian de Boissieu committed
174
print("MAP SPECTRAL SPECIES")
jbferet's avatar
jbferet committed
175
176
177
map_spectral_species(Input_Image_File, Output_Dir, PCA_Files, PCA_model, SpectralFilter, Input_Mask_File,
                     Pix_Per_Partition, nb_partitions, nbCPU=nbCPU, MaxRAM=MaxRAM, 
                     nbclusters = nbclusters, TypePCA = TypePCA, CR = TRUE)
178
179
180
181
182
```

SpectralSpecies is then stored in a raster file located here:

`RESULTS/S2A_T33NUD_20180104_Subset/SPCA/SpectralSpecies`
Florian de Boissieu's avatar
Florian de Boissieu committed
183

184
185
186
187
$\alpha$ and $\beta$ diversity maps, as well as validation, are based on this `SpectralSpecies` raster. 

The code to compute $\alpha$ and $\beta$ diversity maps from this file is as follows:
```{r alpha and beta diversity maps}
Florian de Boissieu's avatar
Florian de Boissieu committed
188
189
print("MAP ALPHA DIVERSITY")
# Index.Alpha   = c('Shannon','Simpson')
jbferet's avatar
jbferet committed
190
Index_Alpha   = c('Shannon')
jbferet's avatar
jbferet committed
191
192
map_alpha_div(Input_Image_File, Output_Dir, window_size, nbCPU=nbCPU,
              MaxRAM=MaxRAM, Index_Alpha = Index_Alpha, nbclusters = nbclusters)
Florian de Boissieu's avatar
Florian de Boissieu committed
193
194

print("MAP BETA DIVERSITY")
jbferet's avatar
jbferet committed
195
map_beta_div(Input_Image_File, Output_Dir, window_size, nb_partitions=nb_partitions,
jbferet's avatar
jbferet committed
196
             nbCPU=nbCPU, MaxRAM=MaxRAM, nbclusters = nbclusters)
Florian de Boissieu's avatar
Florian de Boissieu committed
197
198
```

199
200
201
202
203
204
205
206
207
208
209
210
$\alpha$ and $\beta$ diversity maps are then stored in raster files located here: 
`RESULTS/S2A_T33NUD_20180104_Subset/SPCA/ALPHA`
and here:
`RESULTS/S2A_T33NUD_20180104_Subset/SPCA/BETA`

Different rasters can be produced and users are invited to refer to the documentation for more options.

Here, processing our example leads to the following $\alpha$ and $\beta$ diversity maps

<img align="bottom" width="100%" height="100%" src="../man/figures/03_AlphaBeta.png">


floriandeboissieu's avatar
floriandeboissieu committed
211
212
# Compute diversity indices from a vector layer

213
The following code computes $\alpha$ and $\beta$ diversity from field plots and extracts the corresponding diversity indices from previouly computed `SpectralSpecies` raster in order to perform validation.
Florian de Boissieu's avatar
Florian de Boissieu committed
214

215
```{r alpha and beta diversity indices from vector layer}
Florian de Boissieu's avatar
Florian de Boissieu committed
216
217
218
219
220
221
222
# location of the spectral species raster needed for validation
TypePCA     = 'SPCA'
Dir.Raster  = file.path(Output.Dir,basename(Input.Image.File),TypePCA,'SpectralSpecies')
Name.Raster = 'SpectralSpecies'
Path.Raster = file.path(Dir.Raster,Name.Raster)

# location of the directory where shapefiles used for validation are saved
Florian de Boissieu's avatar
Florian de Boissieu committed
223
vect        = system.file('extdata', 'VECTOR', package = 'biodivMapR')
Florian de Boissieu's avatar
Florian de Boissieu committed
224
225
226
Shannon.All = list() # ??

# list vector data
jbferet's avatar
jbferet committed
227
Path.Vector         = list_shp(vect)
Florian de Boissieu's avatar
Florian de Boissieu committed
228
229
230
Name.Vector         = tools::file_path_sans_ext(basename(Path.Vector))

# get alpha and beta diversity indicators corresponding to shapefiles
231
Biodiv.Indicators           = diversity_from_plots(Raster = Path.Raster, Plots = Path.Vector,NbClusters = nbclusters)
Florian de Boissieu's avatar
Florian de Boissieu committed
232
233
234
235
236
237
238
# if no name
Biodiv.Indicators$Name.Plot = seq(1,length(Biodiv.Indicators$Shannon[[1]]),by = 1)
Shannon.RS                  = c(Biodiv.Indicators$Shannon)[[1]]
```

The tables are then written to tab-seperated files.

239
```{r Write validation}
Florian de Boissieu's avatar
Florian de Boissieu committed
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
# write RS indicators
####################################################
# write indicators for alpha diversity
Path.Results = file.path(Output.Dir, basename(Input.Image.File), TypePCA, 'VALIDATION')
dir.create(Path.Results, showWarnings = FALSE, recursive = TRUE)
ShannonIndexFile <- file.path(Path.Results, "ShannonIndex.tab")
write.table(Shannon.RS, file = ShannonIndexFile, sep = "\t", dec = ".", na = " ", 
            row.names = Biodiv.Indicators$Name.Plot, col.names= F, quote=FALSE)

Results =  data.frame(Name.Vector, Biodiv.Indicators$Richness, Biodiv.Indicators$Fisher,                                Biodiv.Indicators$Shannon, Biodiv.Indicators$Simpson)
names(Results)  = c("ID_Plot", "Species_Richness", "Fisher", "Shannon", "Simpson")
write.table(Results, file = paste(Path.Results,"AlphaDiversity.tab",sep=''), sep="\t", dec=".",               na=" ", row.names = F, col.names= T,quote=FALSE)

# write indicators for beta diversity
BC_mean = Biodiv.Indicators$BCdiss
colnames(BC_mean) = rownames(BC_mean) = Biodiv.Indicators$Name.Plot
write.table(BC_mean, file = paste(Path.Results,"BrayCurtis.csv",sep=''), sep="\t", dec=".", na=" ", row.names = F, col.names= T,quote=FALSE)

```
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297

These results can then be displayed according to the need for further analysis. Here, for the purpose of illustration, we provide a code in order to visualize the differences among field plots located in the image: we first perform a PCoA on the Bray Curtis dissimilarity matrix computed from the field plots:


```{r PCoA on Field Plots}
# apply ordination using PCoA (same as done for map_beta_div)
library(labdsv)
MatBCdist = as.dist(BC_mean, diag = FALSE, upper = FALSE)
BetaPCO   = pco(MatBCdist, k = 3)

```

The plots corresponding to forested areas with high, medium and low diversity, as well as low vegetation/degraded forest close tomain roads are distributed as follows: 

<img align="bottom" width="100%" height="100%" src="../man/figures/04_RGB_FieldLegend.png">

Here, we produce figures in order to locate the different types of vegetation in the PCoA space:

```{r plot PCoA & Shannon}
# very uglily assign vegetation type to polygons in shapefiles
nbSamples = c(6,4,7,7)
vg        = c('Forest high diversity', 'Forest low diversity', 'Forest medium diversity', 'low vegetation')
Type_Vegetation = c()
for (i in 1: length(nbSamples)){
  for (j in 1:nbSamples[i]){
    Type_Vegetation = c(Type_Vegetation,vg[i])
  }
}

# create data frame including alpha and beta diversity
library(ggplot2)
Results     =  data.frame('vgtype'=Type_Vegetation,'pco1'= BetaPCO$points[,1],'pco2'= BetaPCO$points[,2],'pco3' = BetaPCO$points[,3],'shannon'=Shannon.RS)

# plot field data in the PCoA space, with size corresponding to shannon index
ggplot(Results, aes(x=pco1, y=pco2, color=vgtype,size=shannon)) +
  geom_point(alpha=0.6) +
  scale_color_manual(values=c("#e6140a", "#e6d214", "#e68214", "#145ae6"))
filename = file.path(Path.Results,'BetaDiversity_PcoA1_vs_PcoA2.png')
ggsave(filename, plot = last_plot(), device = 'png', path = NULL,
jbferet's avatar
jbferet committed
298
       scale = 1, width = 6, height = 4, units = "in",
299
300
301
302
303
304
305
306
       dpi = 600, limitsize = TRUE)


ggplot(Results, aes(x=pco1, y=pco3, color=vgtype,size=shannon)) +
  geom_point(alpha=0.6) +
  scale_color_manual(values=c("#e6140a", "#e6d214", "#e68214", "#145ae6"))
filename = file.path(Path.Results,'BetaDiversity_PcoA1_vs_PcoA3.png')
ggsave(filename, plot = last_plot(), device = 'png', path = NULL,
jbferet's avatar
jbferet committed
307
       scale = 1, width = 6, height = 4, units = "in",
308
309
310
311
312
313
314
       dpi = 600, limitsize = TRUE)

ggplot(Results, aes(x=pco2, y=pco3, color=vgtype,size=shannon)) +
  geom_point(alpha=0.6) +
  scale_color_manual(values=c("#e6140a", "#e6d214", "#e68214", "#145ae6"))
filename = file.path(Path.Results,'BetaDiversity_PcoA2_vs_PcoA3.png')
ggsave(filename, plot = last_plot(), device = 'png', path = NULL,
jbferet's avatar
jbferet committed
315
       scale = 1, width = 6, height = 4, units = "in",
316
317
318
319
320
321
322
323
324
       dpi = 600, limitsize = TRUE)

```

The resulting figures are displayed here: 

<img align="bottom" width="100%" height="100%" src="../man/figures/BetaDiversity_PcoA1_vs_PcoA2.png">
<img align="bottom" width="100%" height="100%" src="../man/figures/BetaDiversity_PcoA1_vs_PcoA3.png">
<img align="bottom" width="100%" height="100%" src="../man/figures/BetaDiversity_PcoA2_vs_PcoA3.png">