with right package for home-made observers

b89452c7 · patrick.lambert · 5b83d13e · b89452c7 · b89452c7
Commit b89452c7 authored 4 years ago by patrick.lambert
Hide whitespace changes
Inline Side-by-side

Showing

with 85 additions and 37 deletions
+85 -37
--- a/data/input/northeastamerica/RIO_obs_NEA.xml
+++ b/data/input/northeastamerica/RIO_obs_NEA.xml
@@ -66,12 +66,12 @@
 					</yAxisLabel>
 					<variableName>getFemaleSpawnerEffective</variableName>
 				</fr.cemagref.observation.observers.jfreechart.TemporalSerieChart>
-				<miscellaneous.TemporalRangeSerieChart>
+				<observer.TemporalRangeSerieChart>
 					<title>Range distribution</title>
 					<xAxisLabel>Time (season)</xAxisLabel>
 					<yAxisLabel>latitude</yAxisLabel>
 					<variableName>getRangeDistributionWithLat</variableName>
-				</miscellaneous.TemporalRangeSerieChart>
+				</observer.TemporalRangeSerieChart>
 				<fr.cemagref.observation.observers.jfreechart.TemporalSerieChart>
 					<graphType>LINE</graphType>
 					<title>Mean age at first reproduction for female</title>
@@ -127,7 +127,7 @@
 		<java-class>environment.RiverBasin</java-class>
 		<fr.cemagref.observation.kernel.ObservablesHandler>
 			<observers>
-				<miscellaneous.TemporalSerieChartForBasin>
+				<observer.TemporalSerieChartForBasin>
 					<graphType>LINE</graphType>
 					<title>Number of juveniles</title>
 					<xAxisLabel>Time
@@ -135,14 +135,14 @@
 					</xAxisLabel>
 					<yAxisLabel>number of juveniles</yAxisLabel>
 					<variableName>getJuvenileNumber</variableName>
-				</miscellaneous.TemporalSerieChartForBasin>
-				<miscellaneous.TemporalSerieChartForBasin>
+				</observer.TemporalSerieChartForBasin>
+				<observer.TemporalSerieChartForBasin>
 					<graphType>LINE</graphType>
 					<title>% Autochtone</title>
 					<xAxisLabel>Time (season)</xAxisLabel>
 					<yAxisLabel>% Autochtone</yAxisLabel>
 					<variableName>getLastPercentageOfAutochtone</variableName>
-				</miscellaneous.TemporalSerieChartForBasin>
+				</observer.TemporalSerieChartForBasin>
 			</observers>
 		</fr.cemagref.observation.kernel.ObservablesHandler>
 	</entry>

--- a/exploration/GR3D_Rdescription/deathBasinW.Rmd
+++ b/exploration/GR3D_Rdescription/deathBasinW.Rmd
@@ -12,7 +12,7 @@ output:
 ---

 ```{r setup, include=FALSE}
-knitr::opts_chunk$set(echo = TRUE)
+knitr::opts_chunk$set(echo = TRUE, include = FALSE)
 ```

 ```{r include = FALSE}
@@ -24,15 +24,29 @@ library(flextable)
 ```

 ```{r load data, include = FALSE}
-distance <-  as.matrix(read.csv("../../data/input/northeastamerica/distanceGridNEA.csv", row.names = 1, stringsAsFactors = FALSE))
-distance <- distance %>%
+distanceNEA <-  as.matrix(read.csv("../../data/input/northeastamerica/distanceGridNEA.csv", row.names = 1, stringsAsFactors = FALSE))
+distanceNEA <- distanceNEA %>%
  replace(., col(.) == row(.), NA) %>% 
  as.data.frame() %>% mutate(destination = row.names(.)) %>% 
  pivot_longer(cols = -destination, names_to = 'departure', values_to = 'distance') %>% 
  dplyr::select(departure, destination, distance) %>% 
  arrange(departure, distance)

-riverBasins =  read.csv("../../data/input/northeastamerica/nea_riverbasins.csv")
+riverBasinsNEA =  read.csv("../../data/input/northeastamerica/nea_riverbasins.csv")
+
+
+distanceAA <-  as.matrix(read.csv("../../data/input/atlanticarea/distanceGridAA.csv", row.names = 1, stringsAsFactors = FALSE))
+distanceAA <- distanceAA %>%
+  replace(., col(.) == row(.), NA) %>% 
+  as.data.frame() %>% mutate(destination = row.names(.)) %>% 
+  pivot_longer(cols = -destination, names_to = 'departure', values_to = 'distance') %>% 
+  dplyr::select(departure, destination, distance) %>% 
+  arrange(departure, distance)
+
+riverBasinsAA =  read.csv("../../data/input/atlanticarea/aa_basins.csv")
+
+riverAARougier2015 = read.csv("basinsRougieretal2015.csv", stringsAsFactors = FALSE) %>% rename(basin_id = id, basin_name = nomBV) %>% 
+  mutate(basin_name =replace(basin_name, basin_name == "Sevre_Niortaise", "Sevre Niortaise"))

 ```

@@ -41,16 +55,23 @@ source("GR3Dfunction.R")

 ```

-From the distance grid file used by GR3D for the US application, the data, after reshaping, look like
+# *Computation of the death basin weight*

-```{r distance, echo=FALSE, include = FALSE}
-#paged.print=TRUE
-ft <- head(distance,15) %>%   flextable() %>% set_formatter_type(fmt_double = "%.02f") %>% autofit()
-set_caption(ft, 'Examples of distance (in km) between departure and destination basins')
-#
-```
+## *Baseline*

-The parameters for the kernel function based on accessibility from @rougier2015CombinedUseEmpirical are:
+*The weight for destination basin* $j_2$ *from basin* $j_1$ *is:*
+
+$$w_{j_1\rightarrow j_2} = \frac {1} {1 +  e ^{\alpha_0 + \alpha_1 \cdot {\frac {( D_{j_1\rightarrow j_2}   - \mu_D)} {\sigma_D} } } }$$
+
+*The sum of weights for the departure basin* $j_1$ *is:* $$w_{j_1} = \sum_{j_2 \neq j_1} {w_{j_1\rightarrow j_2}}$$
+
+*The mean weight across all basins is then:* $$\overline{w} = \sum_{j_1 =1}^{n_B} {w_{j_1}}$$
+
+*It is advised to use this mean value for the death basin weight. Notice that the* $\mu_D$ *and* $\sigma_D$ *depend on the basins list considered.*
+
+## *Atlantic area application*
+
+*The parameters for the kernel function based on accessibility for Atlantic Area application defined in @rougier2015CombinedUseEmpirical are:*

 ```{r}
 alpha0 = -2.9
@@ -60,35 +81,56 @@ standardDeviationInterDistance = 978

 ```

-```{r include = FALSE}
-meanInterDistance <- mean(distance$distance, na.rm = TRUE )
-standardDeviationInterDistance <- sd(distance$distance, na.rm = TRUE) 
+*and the resulting death basin weight is*
+
+```{r}
+wDeathBasin = .4
 ```

-Notice that the $\mu_D$ and $\sigma_D$ depend on the basins list considered. The true values for $\mu_D$ and $\sigma_D$ are respectively `r meanInterDistance` and `r standardDeviationInterDistance`. So there is a problem in the AA application <!--# calculte rhe value for Rougier 2015 --> since the number of basins was increased in comparison with @rougier2015CombinedUseEmpirical.
+```{r echo =TRUE, include = FALSE}
+distanceAA <- distanceAA %>% mutate(W =  logitKernel(distance, alpha0, alpha1, meanInterDistance, standardDeviationInterDistance))
+```
+
+```{r}
+riverAARougier2015 %>% select(basin_name) %>% setdiff(riverBasinsAA %>% select(basin_name))
+
+
+distanceAA %>% inner_join(riverAARougier2015, by =c('departure' = 'basin_id')) %>% 
+  inner_join(riverAARougier2015, by =c('destination' = 'nomBV')) %>% 
+  group_by(departure) %>% 
+  summarise(sumW = sum(W, na.rm = TRUE), .groups = 'drop') %>% 
+  summarise(mean(sumW)) %>% unlist()

-```{r kernel function, echo = TRUE}
-range = round(range(distance$distance, na.rm = TRUE))
-dist = range[1]:range[2]
-dataKernel = data.frame(dist = dist, w  = logitKernel(dist, alpha0, alpha1, meanInterDistance, standardDeviationInterDistance))
-dataKernel %>% ggplot(aes(x=dist, y=w)) + geom_line() + labs(x='distance between departure and destination basins (km)')
 ```

-The weight for destination basin $j_2$ from basin $j_1$ is:
+## 

-$$w_{j_1\rightarrow j_2} = \frac {1} {1 +  e ^{\alpha_0 + \alpha_1 \cdot {\frac {( D_{j_1\rightarrow j_2}   - \mu_D)} {\sigma_D} } } }$$
+## *North East America application*

-```{r echo =TRUE, include = FALSE}
-distance <- distance %>% mutate(W =  logitKernel(distance, alpha0, alpha1, meanInterDistance, standardDeviationInterDistance))
+*From the distance grid file used by GR3D for the US application, the data, after reshaping, look like*
+
+```{r distance, echo=FALSE, include = FALSE}
+#paged.print=TRUE
+ft <- head(distanceNEA,15) %>%   flextable() %>% set_formatter_type(fmt_double = "%.02f") %>% autofit()
+set_caption(ft, 'Examples of distance (in km) between departure and destination basins')
+#
 ```

-The sum of weights for the departure basin $j_1$ is: $$w_{j_1} = \sum_{j_2 \neq j_1} {w_{j_1\rightarrow j_2}}$$
+```{r include = FALSE}
+meanInterDistance <- mean(distance$distance, na.rm = TRUE )
+standardDeviationInterDistance <- sd(distance$distance, na.rm = TRUE) 
+```

-The mean weight across all basins is then: $$\overline{w} = \sum_{j_1 =1}^{n_B} {w_{j_1}}$$
+$\mu_D$ *and* $\sigma_D$ *values for this application are respectively `r meanInterDistance` and `r standardDeviationInterDistance`.*

-It is advised to use this value for the death basin weight.
+```{r kernel function, echo = TRUE, include = FALSE}
+range = round(range(distance$distance, na.rm = TRUE))
+dist = range[1]:range[2]
+dataKernel = data.frame(dist = dist, w  = logitKernel(dist, alpha0, alpha1, meanInterDistance, standardDeviationInterDistance))
+dataKernel %>% ggplot(aes(x=dist, y=w)) + geom_line() + labs(x='distance between departure and destination basins (km)')
+```

-The histogram of weights sum in NorthEast America application is given by
+*The histogram of weights sum in North East America application is given by*

 ```{r histogram, echo=FALSE, message=FALSE, warning=FALSE, fig.cap="Distribution of weight sums for departure basins"}
 distance %>% group_by(departure) %>% summarise(sumW = sum(W, na.rm = TRUE)) %>%
@@ -104,7 +146,7 @@ deathBasinWeight <- distance %>%
  summarise(mean(sumW)) %>% unlist()
 ```

-The dashed blue line corresponds to the mean weight ($w_{j_1}$ =`r round(deathBasinWeight, 4)` that is advised to be used as death basin weight. For a departure basin with a $w_{j_1}$ below this value, the strayers mortality is higher than 50 %. For these departure basins, most destination basins are far from the departure basin, the sum of destination basins weights is low and the death basin is attractive.
+*The dashed blue line corresponds to the mean weight (*$w_{j_1}$ *=`r round(deathBasinWeight, 4)` that is advised to be used as death basin weight. For a departure basin with a* $w_{j_1}$ *below this value, the strayers mortality is higher than 50 %. For these departure basins, most destination basins are far from the departure basin, the sum of destination basins weights is low and the death basin is attractive.*

 ```{r, echo =FALSE, warning = FALSE, include = FALSE, mortalyRateLatitude, fig.cap ="Evolution of mortality rate in the death basin according to departure basin latitude"}
 distance %>% group_by(departure) %>% summarise(sumW = sum(W, na.rm = TRUE), .groups = 'drop') %>% 
@@ -114,6 +156,10 @@ distance %>% group_by(departure) %>% summarise(sumW = sum(W, na.rm = TRUE), .gro

 ```

+## Virtual linear network of basin
+
+## full universe
+
 ```{r fake list of basin, echo false, include = TRUE}
 nbBasin = 100
 distBetweenBasin = 10
@@ -159,6 +205,8 @@ drawMortalityVslatitude(data=basinDistance)

 ```

+##Sampled universe
+
 ```{r  irregular sampling of fake universe, echo FALSE}

 sampledBasin <- basin %>% sample_n(25) %>%  arrange(latitude)
@@ -178,7 +226,7 @@ drawMortalityVslatitude(sampledBasinDistance)

 The strayer mortality increases at the edge of the distribution.

-The selection of basins impacts the strayers mortality. It is probably safer to considered a constant rate rather a death basin.
+The selection of basins impacts the strayers mortality. It is probably safer to considered a constant rate rather a death basin. It ais a priority to improve the coverage of the area by increasing the number of basins considered.

 The logit function to compute the basin weights introduces a plateau for the short distance that leads to random destination in the the departure vicinity