Commit 2bae0464 authored by rdecoupe

improve README.md

parent 3f60b535
......@@ -3,4 +3,96 @@
This project aims to deploy a REST API around BioTex. **BioTex is an Automated Term Extractor** (see [here](http://tubo.lirmm.fr/biotex/index.jsp) for more details).
BioTex needs a POS (part-of-speech) tagger. Its author suggests [TreeTagger](https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/).
This repository is based upon the work of **Jacques Fize**, who built a Python wrapper for BioTex (see [his repository](https://gitlab.irstea.fr/jacques.fize/biotex_python) for more details)
\ No newline at end of file
This repository is based upon the work of:
* **Juan Antonio LOSSIO-VENTURA**, creator of [BioTex](https://github.com/sifrproject/biotex/tree/master)
* **Andon Tchechmedjiev**, who adapted the BioTex repository to Maven
* **Jacques Fize**, who built a Python wrapper for BioTex (see [his repository](https://gitlab.irstea.fr/jacques.fize/biotex_python) for more details)
## How does it work
This repository deploys a virtual machine with BioTex and all its dependencies installed.
The project itself is not a virtual machine (VM) but a playbook for setting one up.
To deploy BioTex you can:
* Use your own VM and start with the Ansible [playbook](biotex-rest/biotex-rest-deployment/ansible/playbook/biotex-wrapper.yml)
* Create a fresh VM with [Vagrant and Ansible](biotex-rest/biotex-rest-deployment/vagrant/Vagrantfile)
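For the first option, a run against an existing VM might look like the sketch below. The address `192.0.2.10`, the user `deploy`, and the inventory group `biotex` are placeholders, not values taken from this repository; check the `hosts:` line of the playbook for the name it actually expects. The run is skipped when `ansible-playbook` or the playbook file cannot be found.

```shell
# Placeholder values (assumptions): replace with your VM's address and SSH user.
VM_HOST="192.0.2.10"
VM_USER="deploy"

# Write a one-host inventory to a temporary file.
# The group name "biotex" is hypothetical.
INVENTORY="$(mktemp)"
printf '[biotex]\n%s ansible_user=%s\n' "$VM_HOST" "$VM_USER" > "$INVENTORY"

PLAYBOOK="biotex-rest/biotex-rest-deployment/ansible/playbook/biotex-wrapper.yml"

# Run the playbook only if the tool and the file are both available.
if command -v ansible-playbook >/dev/null 2>&1 && [ -f "$PLAYBOOK" ]; then
  ansible-playbook -i "$INVENTORY" "$PLAYBOOK"
else
  echo "ansible-playbook or $PLAYBOOK not found; nothing was run" >&2
fi
```

Add `--ask-pass` if you log in with a password (this relies on `sshpass`), and `--ask-become-pass` if privilege escalation requires one.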
## Prerequisites
* Ansible
+ Install Ansible on your own computer (Ubuntu or Debian):
```shell
apt-get install ansible
```
+ If you connect to your remote machine with a password instead of an SSH key (SSH keys are recommended), you also need to install `sshpass`:
```shell
apt-get install sshpass
```
+ Configure Ansible to avoid this error when becoming an unprivileged user: "Failed to set permissions on the temporary files Ansible needs to create when becoming an unprivileged user":
```shell
sed -i 's/.*pipelining.*/pipelining = True/' /etc/ansible/ansible.cfg
sed -i 's/.*allow_world_readable_tmpfiles.*/allow_world_readable_tmpfiles = True/' /etc/ansible/ansible.cfg
```
* Vagrant with VirtualBox
+ Install VirtualBox on your own computer (Ubuntu or Debian):
```shell
apt-get install virtualbox
```
+ Install Vagrant on your own computer (Ubuntu or Debian):
```shell
apt-get install vagrant
```
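The two `sed` commands in the Ansible prerequisites only rewrite lines that already mention the option; if `ansible.cfg` contains no such line, they change nothing and report no error. One cautious approach is to try them on a scratch copy first. The sketch below uses a made-up sample file rather than a real `/etc/ansible/ansible.cfg`, so its contents are an assumption, and it relies on GNU `sed -i` as found on Ubuntu and Debian.

```shell
# Build a scratch config that mimics commented-out defaults
# (sample content, not the real file).
SCRATCH_CFG="$(mktemp)"
cat > "$SCRATCH_CFG" <<'EOF'
[defaults]
#allow_world_readable_tmpfiles = False

[ssh_connection]
#pipelining = False
EOF

# Apply the same substitutions, but to the scratch copy.
sed -i 's/.*pipelining.*/pipelining = True/' "$SCRATCH_CFG"
sed -i 's/.*allow_world_readable_tmpfiles.*/allow_world_readable_tmpfiles = True/' "$SCRATCH_CFG"

# Confirm that both options are now set; grep exits non-zero if a line is missing.
grep -x 'pipelining = True' "$SCRATCH_CFG"
grep -x 'allow_world_readable_tmpfiles = True' "$SCRATCH_CFG"
```

Running the same two `grep` checks against the real configuration file after editing it shows whether the options were actually set.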
## Getting started
### Deploy from scratch with Vagrant and Ansible
1. Set your nodes' IP address in the [Vagrantfile](biotex-rest/biotex-rest-deployment/vagrant/Vagrantfile).
Inside this file, also edit your network settings, such as the DNS nameserver: if your host machine is on a corporate network, your network administrator may allow only your company's DNS servers. In that case, set them in this file too.
2. From the command line, start and configure a VM from this [directory](biotex-rest/biotex-rest-deployment/vagrant/):
```shell
vagrant up
```
Note that on some OSes (such as Debian Buster) this command must be run with sudo: `sudo vagrant up`.
3. From the command line, SSH into your new VM:
```shell
vagrant ssh
```
4. From the command line, create a corpus. Separate the documents with the separator `##########END##########`; the corpus file must also end with this separator.
```shell
touch corpus/corpus.txt
vim corpus/corpus.txt
```
Insert your documents' contents, separated by `##########END##########`.
5. Start BioTex from the command line:
```shell
java -jar biotex/target/biotex.jar biotex/biotex.properties
```
6. See the results in the output directory:
```shell
ls output/
```
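Step 4 can also be scripted rather than typed in an editor. The sketch below concatenates a few documents into one corpus file and writes the `##########END##########` separator after each one, so the file also ends with it. The `docs/` directory and the sample texts are hypothetical, and placing the separator on its own line is an assumption. Everything is written to a temporary directory so nothing in the VM is overwritten.

```shell
SEPARATOR='##########END##########'

# Work in a temporary directory; docs/ and its contents are sample data.
WORKDIR="$(mktemp -d)"
mkdir -p "$WORKDIR/docs" "$WORKDIR/corpus"
printf 'First sample document.\n'  > "$WORKDIR/docs/a.txt"
printf 'Second sample document.\n' > "$WORKDIR/docs/b.txt"

# Append each document, followed by the separator on its own line.
CORPUS="$WORKDIR/corpus/corpus.txt"
: > "$CORPUS"
for doc in "$WORKDIR"/docs/*.txt; do
  cat "$doc" >> "$CORPUS"
  printf '%s\n' "$SEPARATOR" >> "$CORPUS"
done

# Sanity checks: print the number of separators and the last line of the corpus.
grep -c -x -F "$SEPARATOR" "$CORPUS"
tail -n 1 "$CORPUS"
```

The separator count should match the number of documents, and the last line should be the separator itself. Source files that do not end with a newline would need one added before the separator is appended.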
## Information on Ansible and Vagrant
### Description of Ansible
[Wikipedia definition:](https://en.wikipedia.org/wiki/Ansible_(software))
> Ansible is an open-source software provisioning, configuration management, and application-deployment tool. It runs on many Unix-like systems, and can configure both Unix-like systems as well as Microsoft Windows. It includes its own declarative language to describe system configuration.
Ansible is a good tool for deploying and maintaining IT systems. Because it is based on [YAML](https://en.wikipedia.org/wiki/YAML) configuration files, Ansible makes it easy to describe your configuration and share it with your collaborators.
You can then deploy it to your infrastructure; all you need is SSH access to your servers.
An Ansible playbook is a list of system instructions to be sent to a machine. That's why you only need two things:
- Ansible installed on your own computer
- A remote machine (physical, VMware, VirtualBox, Docker, LXC, ...) with an SSH server running
### Description of Vagrant
[Wikipedia definition:](https://en.wikipedia.org/wiki/Vagrant_(software))
> Vagrant is an open-source software product for building and maintaining portable virtual software development environments, e.g. for VirtualBox, KVM, Hyper-V, Docker containers, VMware, and AWS. It tries to simplify the software configuration management of virtualizations in order to increase development productivity. Vagrant is written in the Ruby language, but its ecosystem supports development in a few languages.
Vagrant manages your virtual machines (VMs) from the command line. The benefits are:
- Quickly create a VM with a known, controlled environment
- Restore your VM to a known state
- Distribute your VMs easily
Vagrant and Ansible can be combined to create, deploy, and maintain your VM, as we do in this project.
\ No newline at end of file