\newpage

% Description of the traits and tree growth data formatting for workshop on traits and competitive interactions


# Introduction

This document describes the data structure and the main R functions currently available for formatting the data for the working group on traits and competition.

![Workflow](workflow.png "Workflow")


# Structure of data for analysis

For the analysis we need, for each ecoregion within a country (or each large tropical plot), a list with three elements.

## First element is a data.frame of individual tree data with columns


----------------------------------------------------------------------------
var       numeric   units   description                                     
--------- --------- ------- ------------------------------------------------
obs.id    0                 a unique identifier of each observation (if
                            multiple observations of the same tree)

tree.id   0                 a unique identifier of each tree                

sp        0                 the species code                                

sp.name   0                                                                 

cluster   0                                                                 

plot      0                 the plot code                                   

ecocode   0                 the ecoregion code (similar ecoregions are
                            merged so that each ecoregion has enough
                            observations)

D         1         cm      diameter

G         1         mm/yr   the diameter growth rate                        

dead      1                 a dummy variable: 0 alive, 1 dead

year      1         yr      the number of years for the growth measurement

htot      1         m       the height of the individual (for data sets
                            where it is available), used to compute the
                            maximum height per species

Lon       1         deg     Longitude of the plot in WGS84                  

Lat       1         deg     Latitude of the plot in WGS84

perc.dead 1                 the percentage of dead trees computed on each
                            plot, used to exclude plots with perturbation
                            (equal to 1 for plots with known perturbation)

weights   1         /mm2    the weight of the tree, used to estimate basal
                            area per m^2

census    1         0       the name of the year of the census 1            
----------------------------------------------------------------------------


## Second element is a data.frame of competition indices with columns

- $tree.id$ a unique identifier of each tree
- $ecocode$ the ecoregion code
- one column per species, named with the species code $sp$ used in the previous table
- $BATOT.COMPET$ the sum of the basal area of all species

## Third element is a data.frame for the species traits data with columns


--------------------------------------------------------------------------------------
var                numeric   units   description                                      
------------------ --------- ------- -------------------------------------------------
sp                 0                 the species code used in other tables            

Latin_name         0                 the Latin name of the species

Leaf.N.mean        1         mg/g    Leaf Nitrogen per mass                           

Seed.mass.mean     1         mg      seed mass                                        

SLA.mean           1         mm2/mg  specific leaf area                               

Wood.density.mean  1         mg/mm3  wood density                                     

Max.height.mean    1         m       the 99% quantile of height computed from NFI data

Leaf.N.sd          1                                                                  

Seed.mass.sd       1                                                                  

SLA.sd             1                                                                  

Wood.density.sd    1                                                                  

Max.height.sd      1                                                                  

Leaf.N.exp         1                                                                  

Seed.mass.exp      1                                                                  

SLA.exp            1                                                                  

Wood.density.exp   1                                                                  

Leaf.N.genus       1                                                                  

Seed.mass.genus    1                                                                  

SLA.genus          1                                                                  

Wood.density.genus 1                                                                  

Leaf.N.nobs        1                                                                  

Seed.mass.nobs     1                                                                  

SLA.nobs           1                                                                  

Wood.density.nobs  1                                                                  
--------------------------------------------------------------------------------------

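The table also contains the same columns with $sd$ instead of $mean$, holding either the mean within-species sd (if the species mean is used) or the mean within-genus sd (if the genus mean is used because no species data are available), and a dummy variable (the $genus$ columns) that is true if the genus mean is used and false otherwise.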
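
As an illustration of the structure described above, here is a minimal sketch of such a list built from toy data. The object and element names (`data.tree`, `data.BA.sp`, `data.traits`, `list.eco1`) and all values are hypothetical, and only a subset of the columns is shown.

```r
# Toy individual tree data (hypothetical values, subset of the columns).
data.tree <- data.frame(
  obs.id  = c("o1", "o2", "o3"),
  tree.id = c("t1", "t2", "t3"),
  sp      = c("sp.A", "sp.B", "sp.A"),
  plot    = c("p1", "p1", "p2"),
  ecocode = "eco1",
  D       = c(12.5, 30.1, 22.0),  # cm
  G       = c(2.1, 0.8, 1.5),     # mm/yr
  dead    = c(0, 0, 1),
  stringsAsFactors = FALSE
)

# Toy competition index: one column per species code plus the total.
data.BA.sp <- data.frame(
  tree.id = data.tree$tree.id,
  ecocode = data.tree$ecocode,
  sp.A    = c(0.0, 5.2, 0.0),
  sp.B    = c(7.9, 0.0, 0.0),
  stringsAsFactors = FALSE
)
data.BA.sp$BATOT.COMPET <- data.BA.sp$sp.A + data.BA.sp$sp.B

# Toy species traits table (hypothetical values).
data.traits <- data.frame(
  sp         = c("sp.A", "sp.B"),
  Latin_name = c("Genus.a species.a", "Genus.b species.b"),
  SLA.mean   = c(12.3, 8.7),
  SLA.sd     = c(2.1, 2.1),
  SLA.genus  = c(FALSE, TRUE),
  stringsAsFactors = FALSE
)

# The list with three elements for one ecoregion.
list.eco1 <- list(data.tree   = data.tree,
                  data.BA.sp  = data.BA.sp,
                  data.traits = data.traits)

# Basic consistency checks between the three elements.
stopifnot(identical(data.BA.sp$tree.id, data.tree$tree.id))
stopifnot(all(data.tree$sp %in% data.traits$sp))
stopifnot(all(unique(data.tree$sp) %in% names(data.BA.sp)))
```

The checks at the end are one possible way to verify that the species codes and tree identifiers agree across the three tables.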

# Competition index

## National forest inventory type data

We compute the sum of basal area (BA) per plot, in total and per species, excluding the BA of the target tree. Each tree is weighted so that the basal area is expressed in $m^2/ha$ (see the R function `BA.SP.FUN` in the file format.function.R).
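
The computation can be sketched as follows. This is not the code of `BA.SP.FUN`; the function name `fun.sketch.ba.plot` is hypothetical, and the sketch assumes that `D` is in cm and that `weights` is the number of trees per hectare represented by each tree, so that the target tree contributes its own weighted BA to the plot sum and that amount is subtracted.

```r
# Sketch: per-plot basal area per species, excluding the target tree.
# Assumptions: D in cm; weights = trees per hectare represented by each tree.
fun.sketch.ba.plot <- function(tree.id, plot, sp, D, weights) {
  ba <- pi * (D / 200)^2 * weights          # weighted basal area in m^2/ha
  species <- sort(unique(sp))
  res <- matrix(0, nrow = length(D), ncol = length(species),
                dimnames = list(NULL, species))
  for (s in species) {
    ba.s <- ifelse(sp == s, ba, 0)          # BA of species s only
    plot.sum <- ave(ba.s, plot, FUN = sum)  # plot total for species s
    res[, s] <- plot.sum - ba.s             # remove the target tree itself
  }
  out <- data.frame(tree.id = tree.id, res,
                    check.names = FALSE, stringsAsFactors = FALSE)
  out$BATOT.COMPET <- rowSums(res)          # sum over all species
  out
}

# Usage on a toy plot with three trees.
ba.plot <- fun.sketch.ba.plot(
  tree.id = c("t1", "t2", "t3"),
  plot    = c("p1", "p1", "p1"),
  sp      = c("A", "A", "B"),
  D       = c(20, 40, 20),
  weights = c(10, 10, 10)
)
```

In the toy plot, the competition index of `t1` for species `A` contains only the basal area of `t2`, because `t1` itself is excluded.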

## Large plot data

We need to compute the basal area ($m^2/ha$) per species in the neighbourhood of each individual, within a given radius $R$. The function `BA.SP.FUN.XY` in the file format.function.R should do this but has not been tested yet.
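
A minimal sketch of this neighbourhood computation is given below. It is not the code of `BA.SP.FUN.XY`; the function name `fun.sketch.ba.xy` is hypothetical, and the sketch assumes that `D` is in cm, that `x`, `y` and `R` are in metres, and it ignores edge effects for trees closer than `R` to the plot border.

```r
# Sketch: basal area per species within radius R of each tree (target excluded).
# Assumptions: D in cm; x, y and R in metres; no edge correction.
fun.sketch.ba.xy <- function(tree.id, sp, D, x, y, R) {
  ba <- pi * (D / 200)^2                     # basal area of each tree in m^2
  area.ha <- pi * R^2 / 10000                # neighbourhood area in ha
  dist.mat <- as.matrix(dist(cbind(x, y)))   # pairwise distances between trees
  neigh <- (dist.mat <= R) * 1               # 1 if within radius R, else 0
  diag(neigh) <- 0                           # a tree is not its own neighbour
  species <- sort(unique(sp))
  res <- matrix(0, nrow = length(D), ncol = length(species),
                dimnames = list(NULL, species))
  for (s in species) {
    res[, s] <- as.vector(neigh %*% (ba * (sp == s))) / area.ha
  }
  out <- data.frame(tree.id = tree.id, res,
                    check.names = FALSE, stringsAsFactors = FALSE)
  out$BATOT.COMPET <- rowSums(res)
  out
}

# Usage: t1 and t2 are 5 m apart, t3 is far away from both.
ba.xy <- fun.sketch.ba.xy(
  tree.id = c("t1", "t2", "t3"),
  sp      = c("A", "B", "A"),
  D       = c(20, 40, 20),
  x       = c(0, 3, 100),
  y       = c(0, 4, 100),
  R       = 10
)
```

The full distance matrix is convenient for a sketch but grows with the square of the number of trees, so a large plot would probably need a more memory-efficient neighbour search.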


# Traits data

The objective is to have a table with the species mean of each trait, or the genus mean if no data are available for the species.

## TRY data

* The TRY data are provided with one row for each variable measured on a single individual (trait and non-trait variables). The function `fun.extract.try` (in FUN.TRY.R) extracts the trait and non-trait variables that we need and creates a table with one row per individual (Observation.ID) and one column per trait or non-trait variable.

* Then we compute for each species (and all its potential synonyms) the mean of the observations of each trait (in log10), without experimental data if possible, or with experimental data if no other data are available. If no data are available for a given species we compute the *genus* mean (with a dummy variable indicating that this is a genus mean). The function also computes the trait sd. See the function `fun.species.traits`. This function also excludes outliers based on the method used by Kattge et al. 2011 (GCB) (see the function `fun.out.TF2`).

* Then I computed the mean within-species sd (assuming that the within-species sd is constant across species).

* So far, for the French data, I have only a self-built list of potential species synonyms. It would be better either to create a list of potential synonyms from existing lists or, alternatively, to match the TRY species and the forest inventory species against the same reference list so that both use the same species names.
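
The species mean with genus fallback described above can be sketched as follows. This is not the code of `fun.species.traits`: the function name `fun.sketch.species.mean` and the column names of its arguments are hypothetical, and the sketch leaves out the handling of synonyms, experimental data, outliers and the sd.

```r
# Sketch: species mean of a trait (in log10) with genus mean as fallback.
# obs:          one row per observation, columns species, genus, value
# species.list: one row per inventory species, columns sp, genus
fun.sketch.species.mean <- function(obs, species.list) {
  log.val    <- log10(obs$value)
  sp.mean    <- tapply(log.val, obs$species, mean)
  sp.nobs    <- tapply(log.val, obs$species, length)
  genus.mean <- tapply(log.val, obs$genus, mean)

  i.sp    <- match(species.list$sp, names(sp.mean))
  i.genus <- match(species.list$genus, names(genus.mean))

  trait.mean <- as.vector(sp.mean)[i.sp]     # NA if no species data
  nobs       <- as.vector(sp.nobs)[i.sp]
  use.genus  <- is.na(trait.mean)            # dummy: genus mean used
  trait.mean[use.genus] <- as.vector(genus.mean)[i.genus[use.genus]]
  nobs[is.na(nobs)] <- 0

  data.frame(sp = species.list$sp, mean = trait.mean,
             genus = use.genus, nobs = nobs, stringsAsFactors = FALSE)
}

# Usage: Quercus ilex has no observation, so the genus mean is used.
obs <- data.frame(
  species = c("Quercus robur", "Quercus robur", "Quercus petraea"),
  genus   = c("Quercus", "Quercus", "Quercus"),
  value   = c(10, 1000, 10000),
  stringsAsFactors = FALSE
)
species.list <- data.frame(
  sp    = c("Quercus robur", "Quercus ilex"),
  genus = c("Quercus", "Quercus"),
  stringsAsFactors = FALSE
)
traits.sketch <- fun.sketch.species.mean(obs, species.list)
```

A species whose genus also has no data keeps a missing mean while still being flagged in the `genus` column, which a complete implementation would need to report separately.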



## Other traits data provided with each data set

* We need to write a function to compute the mean per species for each trait, and to decide whether we use the same within-species sd for these data sets.

### Ecoregions

For the NFI data we will divide the data set into regions with similar ecological conditions. This will allow us to estimate the link between competitive interactions and traits within regions of similar conditions and to see how the results vary (for instance, in the US there is large variability between the north and the south). It will also make comparison with the large tropical plots easier, and the smaller data sets will speed up the estimation. Could you please either provide a source of ecoregions with a GIS layer that we can use or, better, directly include this variable in the data (at the plot level)? Similarly, for climatic variables I was planning to use the best variables available for each data set rather than a global database of lower quality. Could you either give the link to such a data set or, better, directly extract the variables for each plot?
I think that we do not have any ecoregion information that was directly measured in the SFI data. However, we have joined each SFI plot with the Olson ecoregions.
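
Once an `ecocode` column is available at the plot level, dividing a data set by ecoregion could look like the sketch below. The toy data and the object names (`data.tree`, `list.by.ecoregion`, `n.obs`) are hypothetical.

```r
# Toy tree data with an ecoregion code (hypothetical values).
data.tree <- data.frame(
  tree.id = paste0("t", 1:5),
  ecocode = c("eco1", "eco1", "eco2", "eco1", "eco2"),
  G       = c(2.1, 0.8, 1.5, 3.0, 0.4),
  stringsAsFactors = FALSE
)

# One data.frame per ecoregion.
list.by.ecoregion <- split(data.tree, data.tree$ecocode)

# Number of observations per ecoregion, to spot ecoregions that are
# too small and may need to be merged with a similar one.
n.obs <- vapply(list.by.ecoregion, nrow, integer(1))
```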


# Progress


---------------------------------------------------------
Data.name                  Demographic.data              
-------------------------- ------------------------------
BCI                        Large 50 ha plot with
                           semi-spatial localisation of
                           trees with multiple censuses

Fushan                     Large plot with spatial
                           localisation of trees with
                           multiple censuses

Luquillo                   Large plot with spatial
                           localisation of trees with
                           multiple censuses

La Chonta                  Large plot with spatial
                           localisation of trees with
                           multiple censuses

Paracou                    Large plot with spatial
                           localisation of trees with
                           multiple censuses

Mbaiki                     Large plot with spatial
                           localisation of trees with
                           multiple censuses

FIA                        Forest inventory plots in the
                           US. Formatting by M. Vanderwel
                           to be done

Canada                     Forest inventory plots in
                           Canada. Formatting by John
                           Caspersen to be done

France                     Forest inventory plots

Spain                      Forest inventory plots. Check
                           with M Zaval; formatting
                           probably done

Sweden                     Forest inventory plots.
                           Formatting to be discussed

Switzerland                Forest inventory plots.
                           Formatting to be discussed

New Zealand                Forest inventory plots.
                           Formatting to be discussed
                           (Coomes sub sample)

Australia NSW Kooyman      Several medium size plots.
plots                      Formatting in progress

CSIRO plots                Several medium size plots.
                           Formatting in progress
---------------------------------------------------------

Table: Table continues below

 
---------------------------------------------
Demo.data.availability    Traits.data        
------------------------- -------------------
ok                        Available with data

ok                        Available with data

Need to contact Zimmerman Available with data

no                        Available with data

ok                        Available with data

Wait                      Available with data

ok                        TRY                

ok                        TRY                

ok                        TRY                

ok                        TRY                

ok                        TRY                

ok                        TRY                

ok                        Available with data

ok                        Available with data

Wait                      Available with data
---------------------------------------------

Table: Table continues below

 
------------------------------------------------
Traits.data.availability  Abiotic.variables     
------------------------- ----------------------
ok                        topography and/or soil

ok                        topography and/or soil

Need to contact Swenson   topography and/or soil

ok                        topography and/or soil

ok                        topography and/or soil

Wait                      topography and/or soil

ok                        climate               

ok                        climate               

ok                        climate               

ok                        climate               

ok                        climate               

ok                        climate               

ok                        climate               

ok                        climate               

Wait                      climate               
------------------------------------------------

 
---------------------------------------------------------------
Progress.in.formatting.the.data   TODO                         
--------------------------------- -----------------------------
demo data ok                      compute CI and process traits

demo data ok                      compute CI and process traits

NO                                send email

NO

demo and competition index ok     Traits: ask Ghislain to do

Wait for Ghislain

Done                              Need to add max height from
                                  FIA MISSING CENSUS VARIABLE

Need to update with new code      wait for new data with Quebec
per ecoregion                     MISSING CENSUS VARIABLE

Done                              rewrite to format per
                                  ecoregion

Demo done                         Competition index and TRY

demo ok                           missing TreeID and mortality

demo ok                           missing mortality, ecoregion

demo ok

demo and competition index        Traits: ask Ghislain to do
ok

Wait                              Daniel send email with traits
---------------------------------------------------------------