diff --git a/readme.MD b/readme.MD
index ac5f216408efc4c79ed109b506efa1aedb581c90..658116b862f3068b6047e0f5e165ecfa26438bcd 100644
--- a/readme.MD
+++ b/readme.MD
@@ -12,9 +12,9 @@ The objective of this document is to describe the input and output files that we
 
 Input final vector file is available on : _./FinalDBPreprocessed/DATABASE_SAMPLED_.
 
-The following steps have been done to create this file from the initial input _/media/DATA/johann/PUL/TileHG/**FinalDBPreprocessed**/DATABASE_READY_ (dense annotation):
-1. Split large geometry using the intersection regular grid of 1*1 km and keep only objects with a surface higher than 250m2, the output is _./**FinalDBPreprocessed**/DATABASE_GEOM_SPLIT_
-2. Random sampling over a regular grid of 5*5 km (K cells) ( _VectorsDPP/Sampling.py_), given a fixed sampling rate r (e.g. 10%), we randomly select n objects from N available per cell equal to (N\*r/K) . Sampling rate values are the following :
+The following steps have been done to create this file from the initial input _./FinalDBPreprocessed/DATABASE_READY_ (dense annotation):
+1. Split large geometry using the intersection regular grid of 1*1 km and keep only objects with a surface higher than 250m2, the output is _./FinalDBPreprocessed/DATABASE_GEOM_SPLIT_
+2. Random sampling over a regular grid of 5*5 km (K cells) (_VectorsDPP/Sampling.py_), given a fixed sampling rate r (e.g. 10%), we randomly select n objects from N available per cell equal to (N\*r/K) . Sampling rate values are the following :
 - 2.5% for **Cereals/Oilseeds**, **Meadows/Uncultivated** and **Forest**
 - 5% for **Built** class
 - 10% for **Water**, **Market Gardenning** and **Fodder**
@@ -49,7 +49,7 @@ The steps are included together from the script _Sentinel2Theia/main.py_, which
 4. Spatial interpolation of 20 meters bands (B5,B6,B7,B8A,B9,B11,B12) into 10 meters (_Sentinel2Theia/GFSuperImpose.py_)
 5. Compute NDVI and NDWI vegetation indices (_Sentinel2Theia/VegetationIndices.py_)
 6. Build a training dataset and compute descriptive statistics ("meta info"), given a list of feature names, and a vector file (_Sentinel2Theia/training_set.py_)
-7. Scaling the data between 0 and 1 and split into Positive and Unlabeled (50%), and Testing (50%) given a class of interest. Repeat it 10 times independently, given in input different sampling object size (e.g. [20, 40, 60, 80 ,100]), and a window (e.g. 20) (_./PUL/Experiments.py_)
+7. Scaling the data between 0 and 1 and split into Positive and Unlabeled (50%), and Testing (50%) given a class of interest. Repeat it 10 times independently, given in input different sampling object size (e.g. [20, 40, 60, 80 ,100]), and a window (e.g. 20) (_PUL/Experiments.py_)
 
 #### Description of the final output files
 
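
Note on step 7 of the hunk above: the scaling to [0, 1] and the ten independent 50/50 splits can be sketched as follows. This is only a minimal illustration under stated assumptions, not the code in _PUL/Experiments.py_. The function names `min_max_scale` and `repeated_half_splits` are hypothetical, the split is assumed to be a uniform random shuffle over samples, and the sampling object size and window parameters mentioned in the README are left out.

```python
import random


def min_max_scale(rows):
    """Scale every feature column of `rows` to the range [0, 1].

    Assumption: a constant column (max == min) is mapped to 0.0.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


def repeated_half_splits(n_samples, n_repeats=10, seed=0):
    """Return `n_repeats` independent (train_idx, test_idx) pairs.

    Each repetition shuffles the sample indices and cuts them in half:
    the first half stands in for the Positive/Unlabeled part, the second
    half for the Testing part (hypothetical reading of the README).
    """
    rng = random.Random(seed)
    splits = []
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        half = n_samples // 2
        splits.append((sorted(indices[:half]), sorted(indices[half:])))
    return splits


if __name__ == "__main__":
    features = [[0, 10], [5, 10], [10, 10]]
    print(min_max_scale(features))
    for train_idx, test_idx in repeated_half_splits(n_samples=10):
        print(train_idx, test_idx)
```

Fixing the seed makes the ten repetitions reproducible. Marking which training samples are positive for the class of interest and which are unlabeled would happen after the split and is not shown here.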