Commit 9be9ede3 authored by Cresson Remi

DOC: improve documentation in README.md

parent 5e9bf445
Let's quickly describe the new applications provided.
## PatchesExtraction
This application performs the extraction of patches in images from a vector data containing points.
The OTB sampling framework can be used to generate the set of selected points.
After that, you can use the **PatchesExtraction** application to perform the sampling of your images.
We denote _input source_ an input image, or a stack of input images (of the same size!).
The user can set the **OTB_TF_NSOURCES** environment variable to select the number of _input sources_ needed.
For example, for sampling a Time Series (TS) together with a single Very High Resolution image (VHR), a number of 2 sources is required: 1 input images list for the time series and 1 input image for the VHR.
The sampled patches are extracted at each position defined by the points, only if they lie entirely inside all _input sources_ extents.
For each _input source_, the patch size must be provided.
For each _input source_, the application exports all sampled patches as a single multiband raster, stacked in rows.
For instance, for *n* samples of size *16 x 16* from a *4*-channel _input source_, the output image will be a raster of size *16 x 16n* with *4* channels.
An optional output is an image of size *1 x n* containing the value of one specific field of the input vector data.
Typically, the *class* field can be used to generate a dataset suitable for a model that performs pixel-wise classification.
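As an illustration of this row-stacked layout, the patches image can be loaded back as a 4D array, for instance with the GDAL Python bindings. This is only a minimal sketch: the file name is the one used in the example below, and a 4-band, 16 x 16 patch geometry is assumed.

```python
# Minimal sketch (GDAL Python bindings assumed): turn the row-stacked patches
# raster back into a 4D array of shape (n, 16, 16, nb_channels).
import numpy as np
from osgeo import gdal

ds = gdal.Open("outpatches_16x16.tif")   # raster of size 16 x 16n, one patch every 16 rows
arr = ds.ReadAsArray()                    # shape (nb_channels, 16n, 16)
arr = np.transpose(arr, (1, 2, 0))        # shape (16n, 16, nb_channels)
patch_size = 16
n_patches = arr.shape[0] // patch_size
patches = arr.reshape((n_patches, patch_size, patch_size, arr.shape[-1]))
print(patches.shape)                      # (n, 16, 16, nb_channels)
```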
![Schema](doc/images/patches_extraction.png)
```
Examples:
otbcli_PatchesExtraction -vec points.sqlite -source1.il $s2_list -source1.patchsizex 16 -source1.patchsizey 16 -field class -source1.out outpatches_16x16.tif -outlabels outlabels.tif
```
## Build your Tensorflow model <a name="buildmodel"></a>
You can build models using the TensorFlow Python API as shown in the `./python/` directory.
Models must be exported in **SavedModel** format.
When using a model in OTBTF, the important thing is to know the following parameters related to the _placeholders_ (the inputs of your model) and _output tensors_ (the outputs of your model).
- For each _input placeholder_:
  - Name
  - **Receptive field**
- For each _output tensor_:
  - Name
  - **Expression field**
  - **Scale factor**
![Schema](doc/images/schema.png)
The **scale factor** describes the physical change of spacing of the outputs, typically introduced in the model by non-unitary strides in pooling or convolution operators.
For each output, it is expressed relative to one single input of the model, called the _reference input source_.
Additionally, the names of the _target nodes_ must be known (e.g. "optimizer").
Also, the names of the _user placeholders_, typically scalar placeholders used to control some parameters of the model, must be known (e.g. "dropout_rate").
The **receptive field** corresponds to the input volume that the deep net "sees".
The **expression field** corresponds to the output volume that the deep net will create.
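As a hedged illustration, here is what a very small model could look like when exported with the TensorFlow 1.x Python API. The network, the names **x1**, **y1**, **prediction**, **optimizer** and the export path are illustrative only (they match the toy example used later in this document, not the models provided in `./python/`).

```python
# Minimal sketch (TensorFlow 1.x API assumed); the network is a toy example.
import tensorflow as tf

# Input placeholder "x1": receptive field of 16 x 16 pixels (4 channels assumed)
x1 = tf.placeholder(tf.float32, shape=(None, 16, 16, 4), name="x1")
# Reference labels placeholder "y1": one label per patch
y1 = tf.placeholder(tf.int32, shape=(None, 1, 1, 1), name="y1")

# One dense layer producing a class score vector per patch,
# hence an expression field of 1 x 1
features = tf.layers.flatten(x1)
logits = tf.layers.dense(features, units=2)
prediction = tf.argmax(logits, axis=1, name="prediction")

# Loss and a training target node named "optimizer"
loss = tf.losses.sparse_softmax_cross_entropy(labels=tf.reshape(y1, [-1]), logits=logits)
optimizer = tf.train.AdamOptimizer().minimize(loss, name="optimizer")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Export in SavedModel format, exposing the named inputs and outputs
    tf.saved_model.simple_save(sess, "/tmp/my_saved_model",
                               inputs={"x1": x1},
                               outputs={"prediction": prediction})
```

With such a model, the **receptive field** of **x1** would be 16 x 16, the **expression field** of **prediction** would be 1 x 1, and the **scale factor** would be expressed relative to **x1**, the _reference input source_.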
## Train your Tensorflow model
Here we assume that you have produced patches using the **PatchesExtraction** application, and that you have a **SavedModel** stored in a directory somewhere on your filesystem.
The **TensorflowModelTrain** application performs the training and the validation (against the test dataset and against the validation dataset), providing the usual metrics that machine learning frameworks provide (confusion matrix, recall, precision, f-score, ...).
You must provide the path of the **SavedModel** to the _model.dir_ parameter.
The _model.restorefrom_ and _model.saveto_ parameters correspond to the files used respectively to restore and to save the variables of the **SavedModel**.
For instance, you can overwrite the variables of the **SavedModel** directly in its `variables/variables...` file.
Set your _input sources_ for training (_training_ parameter group) and for validation (_validation_ parameter group): the evaluation is performed against the training data, and optionally also against the validation data (only if you set _validation.mode_ to "class").
For each _input source_, the patch size and the placeholder name must be provided.
Regarding the validation, if a different placeholder name is found for a particular _input source_ of the _validation_ parameter group, the application knows that this _input source_ is not fed to the model at inference time, but is used as a reference to compute the evaluation metrics on the validation dataset.
Batch size (_training.batchsize_) and number of epochs (_training.epochs_) can be set.
_User placeholders_ can be set separately for training (_training.userplaceholders_) and validation (_validation.userplaceholders_).
The _validation.userplaceholders_ can be useful if you have a model that behaves differently depending on the given placeholder.
Let's take the example of dropout: it's nice for training, but you have to disable it to use the model at inference time.
Hence you will pass a placeholder with "dropout\_rate=0.3" for training and "dropout\_rate=0.0" for validation.
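One possible way to declare such a scalar placeholder in the model is sketched below (TensorFlow 1.x API assumed; the name "dropout_rate" is simply the one used in the example above). The values passed through _training.userplaceholders_ or _validation.userplaceholders_ are then fed to this placeholder.

```python
# Sketch (TensorFlow 1.x assumed): a scalar placeholder controlling the dropout rate
import tensorflow as tf

dropout_rate = tf.placeholder(tf.float32, shape=(), name="dropout_rate")
# ... used somewhere in the model, e.g.:
# net = tf.nn.dropout(net, keep_prob=1.0 - dropout_rate)
```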
Of course, one can train models from hand-written Python code: to import the patches images, a convenient method consists in reading them as numpy arrays using OTB applications (e.g. **ExtractROI**) or GDAL, then reshaping them (np.reshape) to the desired dimensions.
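A possible sketch of such a hand-made training loop is given below, assuming the patches have already been loaded as 4D numpy arrays (e.g. as shown earlier) and that the **SavedModel** defines the **x1**, **y1** and **optimizer** nodes of the toy example above. The `.npy` file names and the way the variables are overwritten are assumptions, not the application's own method.

```python
# Sketch (TensorFlow 1.x assumed): restore a SavedModel and run a few training steps
import numpy as np
import tensorflow as tf

batch_size, n_epochs = 32, 10
x_patches = np.load("x_patches.npy")   # illustrative: shape (n, 16, 16, 4), float32
y_labels = np.load("y_labels.npy")     # illustrative: shape (n, 1, 1, 1), int32

graph = tf.Graph()
with graph.as_default(), tf.Session() as sess:
    tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING],
                               "/tmp/my_saved_model")
    for epoch in range(n_epochs):
        for start in range(0, x_patches.shape[0], batch_size):
            batch_x = x_patches[start:start + batch_size]
            batch_y = y_labels[start:start + batch_size]
            # Run the training target node, feeding the placeholders by name
            sess.run("optimizer", feed_dict={"x1:0": batch_x, "y1:0": batch_y})
    # One possible way to overwrite the SavedModel variables (assumption)
    tf.train.Saver().save(sess, "/tmp/my_saved_model/variables/variables",
                          write_meta_graph=False)
```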
![Schema](doc/images/model_training.png)
```
Use -help param1 [... paramN] to see detailed documentation of those parameters.
Examples:
otbcli_TensorflowModelTrain -source1.il spot6pms.tif -source1.placeholder x1 -source1.patchsizex 16 -source1.patchsizey 16 -source2.il labels.tif -source2.placeholder y1 -source2.patchsizex 1 -source2.patchsizey 1 -model.dir /tmp/my_saved_model/ -training.userplaceholders is_training=true dropout=0.2 -training.targetnodes optimizer -model.saveto /tmp/my_saved_model/variables/variables
```
As you can note, there are `$OTB_TF_NSOURCES` + 1 sources, because we often need at least one more source for the reference data (e.g. terrain truth for land cover mapping).
## Serve the model
The **TensorflowModelServe** application performs the model serving: it can be used to produce an output raster with the desired tensors.
Thanks to the streaming mechanism, very large images can be produced.
The application uses the `TensorflowModelFilter` and a `StreamingFilter` to force the streaming of output.
The latter can optionally be disabled if you prefer using the extended filenames to deal with chunk sizes.
However, it is still very useful when the application is used inside other composite applications, or simply without the extended filename magic.
Some models can consume a lot of memory.
In addition, the native tiling strategy of OTB consists in strips, but this might not always be the best choice.
For Convolutional Neural Networks for instance, square tiles are more interesting, because the padding required to compute one single strip of pixels forces the reading of far more input pixels than computing one single square tile does (e.g. with a 16 x 16 receptive field, producing a strip of height 1 requires reading roughly 16 times more input pixels than output pixels, whereas a 256 x 256 tile requires only about 12% more).
So, this application takes as input one or multiple _input sources_ (the number of _input sources_ can be changed by setting `OTB_TF_NSOURCES` to the desired number) and produces one output image composed of the specified tensors.
The user is responsible for providing the **receptive field** and **name** of the _input placeholders_, as well as the **expression field**, **scale factor** and **name** of the _output tensors_.
The first _input source_ (_source1.il_) corresponds to the _reference input source_.
As explained [previously](#buildmodel), the **scale factor** provided for the _output tensors_ is related to this _reference input source_.
The user can ask for multiple _output tensors_, which will be stacked along the channel dimension of the output raster.
However, if the sizes of those _output tensors_ are not consistent (e.g. a different number of (x,y) elements), an exception will be thrown.
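For users who prefer the OTB Python bindings over the command line, serving could look like the sketch below. This assumes the OTBTF applications are registered in your OTB installation; the parameter keys are taken from the command-line examples of this document, but their exact types (string vs. string list) are an assumption.

```python
# Sketch (OTB Python bindings assumed): run TensorflowModelServe from Python
import otbApplication

app = otbApplication.Registry.CreateApplication("TensorflowModelServe")
app.SetParameterStringList("source1.il", ["spot7.tif"])   # illustrative file name
app.SetParameterString("source1.placeholder", "x1")
app.SetParameterInt("source1.rfieldx", 16)
app.SetParameterInt("source1.rfieldy", 16)
app.SetParameterString("model.dir", "/path/to/oursavedmodel")
app.SetParameterStringList("output.names", ["prediction"])  # assumed to be a string list
app.SetParameterString("out", "map.tif")
app.ExecuteAndWriteOutput()
```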
![Schema](doc/images/classif_map.png)
## Composite applications for classification
Who has never dreamed of using classic classifiers on deep learning features?
This is possible thanks to two new applications that use the existing training/classification applications of OTB.
Note that you can still set the `OTB_TF_NSOURCES` environment variable.
## The basics
Here we provide a simple example of a classification using a deep net that works on one single VHR image.
Our data set consists of one Spot-7 image, *spot7.tif*, and a training vector data, *terrain_truth.shp*, that sparsely describes forest / non-forest polygons.
First, we compute statistics of the vector data: how many points can we sample inside the objects, and how many objects are there in each class.
We use the **PolygonClassStatistics** application of OTB.
```
otbcli_PolygonClassStatistics -vec terrain_truth.shp -field class -in spot7.tif -out vec_stats.xml
```
Then, we will select some samples with the **SampleSelection** application of the existing machine learning framework of OTB.
Since the terrain truth is sparse, we want to randomly sample points inside the polygons, using the default strategy of the **SampleSelection** OTB application.
```
otbcli_SampleSelection -in spot7.tif -vec terrain_truth.shp -instats vec_stats.xml -field class -out points.shp
```
Now we extract the patches with the **PatchesExtraction** application.
We want to produce one image of 16x16 patches, and one image for the corresponding labels.
```
otbcli_PatchesExtraction -source1.il spot7.tif -source1.patchsizex 16 -source1.patchsizey 16 -vec points.shp -field class -source1.out samp_labels.tif -outpatches samp_patches.tif
```
Now we have two images for patches and labels.
We can split them to distinguish test/validation groups (with the **ExtractROI** application for instance).
But here, we will just perform some fine tuning of our model.
The **SavedModel** is located in the `outmodel` directory.
Our model is quite basic: it has two input placeholders, **x1** and **y1** respectively for input patches (with size 16x16) and input reference labels (with size 1x1).
We named **prediction** the tensor that predicts the labels, and the operator that performs the stochastic gradient descent is named **optimizer**.
We perform the fine tuning and we export the new model variables directly in the `outmodel/variables` folder, overwriting the existing variables of the model.
We use the **TensorflowModelTrain** application to perform the training of this existing model.
```
otbcli_TensorflowModelTrain -model.dir /path/to/oursavedmodel -training.targetnodesnames optimizer -training.source1.il samp_patches.tif -training.source1.patchsizex 16 -training.source1.patchsizey 16 -training.source1.placeholder x1 -training.source2.il samp_labels.tif -training.source2.patchsizex 1 -training.source2.patchsizey 1 -training.source2.placeholder y1 -model.saveto /path/to/oursavedmodel/variables/variables
```
Note that we could also have performed validation in this step. In this case, the `validation.source2.placeholder` would be different from the `training.source2.placeholder`, and would be **prediction**. This way, the application knows which target tensor to evaluate.
After this step, we use the trained model to produce the entire map of forest over the whole Spot-7 image.
For this, we use the **TensorflowModelServe** application to produce the **prediction** tensor output for the entire image.
```
otbcli_TensorflowModelServe -source1.il spot7.tif -source1.placeholder x1 -source1.rfieldx 16 -source1.rfieldy 16 -model.dir /path/to/oursavedmodel -output.names prediction -out map.tif uint8
```
## Begin with provided models