Commit 23eba576 authored by Fize Jacques

change Readme

parent 696d4787
Showing with 43 additions and 7 deletions
@@ -12,11 +12,13 @@ This repository contains two ways of executing Biotex (Python and Java). A list
## Requirements
* Python 3.6+
* Java 7-8
* TreeTagger (can be found [here](https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/))
It is advised to put *TreeTagger* in `$HOME/.tree-tagger/`. Otherwise, use the `treetagger_src` parameter of `BiotexWrapper` to set your TreeTagger directory.
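The directory rule above can be checked before the wrapper is created. The snippet below is a minimal sketch and not part of Biotex: `resolve_treetagger_dir` is a hypothetical helper name, and it only assumes that `treetagger_src` expects the directory path as a string, as the parameter documentation states.

```python
from pathlib import Path
from typing import Optional

# Default location advised by this README.
DEFAULT_TREETAGGER_DIR = Path.home() / ".tree-tagger"


def resolve_treetagger_dir(custom_dir: Optional[str] = None) -> Path:
    """Return the TreeTagger directory to use, or raise if it is missing.

    Hypothetical helper (not part of biotex): falls back to the advised
    default when no custom directory is given.
    """
    candidate = Path(custom_dir).expanduser() if custom_dir else DEFAULT_TREETAGGER_DIR
    if not candidate.is_dir():
        raise FileNotFoundError(f"TreeTagger directory not found: {candidate}")
    return candidate


# Possible usage (requires biotex and a TreeTagger installation):
# from biotex import BiotexWrapper
# wrapper = BiotexWrapper(treetagger_src=str(resolve_treetagger_dir("~/tools/tree-tagger")))
```

Failing early with a clear message is a design choice of this sketch; the wrapper itself may report a missing TreeTagger differently.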
## Setup
To make it work, clone this repository using:
git clone https://gitlab.irstea.fr/jacques.fize/biotex_python.git
@@ -27,13 +29,47 @@ Then, install the module by using the following commands :
(sudo) pip3 install .
# Get Started
## A first run
To check that everything works, run the following code.
```python
from biotex import BiotexWrapper

# Create a wrapper configured for French text
wrapper = BiotexWrapper(language="french")

# The corpus is a list of strings, one per document
corpus = ["D'avantage de lignes en commun de bus.",
          'Les dérèglements climatiques (crue, sécheresse)',
          'Protéger les captages d\'eau potable en interdisant toute activité polluante dans les "périmètres de protection rapprochée" et inciter les collectivités locales à acheter les terrains de ces périmètres. Supprimer les avantages fiscaux sur les produits pétroliers utilisés dans le transport aérien, maritime,BTP... Instaurer une taxe sur les camions traversant la France qui serait utilisée soit pour la transition écologique soit pour soigner les personnes atteintes de maladies respiratoires. Aider l\'agriculture à changer de modèle.',
          "Je n'utilise pas la voiture pour des déplacements quotidiens"]

# Extract the terminology from the corpus
wrapper.terminology(corpus)
```
## Parameters of the Biotex Wrapper class
Here is the list of all the available parameters in the wrapper.
```
Parameters
----------
biotex_jar_path : str, optional
    Filepath of the Biotex jar [***]
pattern_path : str, optional
    Directory that contains pre-defined patterns [***]
dataset_src : str, optional
    Filepath of the datasets used by Biotex [***]
stopwords_src : str, optional
    Path of the directory that contains stop-words for each language [***]
treetagger_src : str, optional
    Path of the directory that contains TreeTagger
type_of_terms : str, optional
    type of terms to extract ("all", "multi"), by default "all"
language : str, optional
    language of the data, by default "french"
score : str, optional
    score used to sort the extracted terms, by default "F-TFIDF-C_M"
patron_number : str, optional
    number of patterns used to extract terms, by default "3"

[***] Only change these settings if you are familiar with the Biotex Java API.
```
\ No newline at end of file
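The parameters listed above can be collected in a dictionary and passed to the wrapper as keyword arguments. The following is a minimal sketch under stated assumptions: `build_wrapper_kwargs` is a hypothetical helper that is not part of Biotex, the allowed values for `type_of_terms` are the two listed in the documentation, and the digit check on `patron_number` is an assumption of this sketch rather than a documented rule.

```python
from typing import Dict

# Values listed in the parameter documentation above.
ALLOWED_TYPES_OF_TERMS = ("all", "multi")


def build_wrapper_kwargs(type_of_terms: str = "all",
                         language: str = "french",
                         score: str = "F-TFIDF-C_M",
                         patron_number: str = "3") -> Dict[str, str]:
    """Validate and collect keyword arguments for BiotexWrapper.

    Hypothetical helper (not part of biotex). Defaults mirror the
    documented ones.
    """
    if type_of_terms not in ALLOWED_TYPES_OF_TERMS:
        raise ValueError(f"type_of_terms must be one of {ALLOWED_TYPES_OF_TERMS}")
    # Assumption: patron_number is documented as a string holding a number.
    if not patron_number.isdigit():
        raise ValueError("patron_number must be a string of digits, e.g. '3'")
    return {
        "type_of_terms": type_of_terms,
        "language": language,
        "score": score,
        "patron_number": patron_number,
    }


# Possible usage (requires biotex):
# from biotex import BiotexWrapper
# wrapper = BiotexWrapper(**build_wrapper_kwargs(type_of_terms="multi"))
```

The settings marked `[***]` are left out on purpose, since the documentation advises changing them only when familiar with the Biotex Java API.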