- Ansible deployment for AIDMOIt
- Prerequisites
- Ansible
- Description of Ansible
- Getting started
- Vagrant & Ansible
- Description of Vagrant
- Getting started
- Deploy a mono-node HDFS
- Deploy mono node HDFS on a VM
- Deploy mono-node HDFS on a server
- Deploy a HDFS cluster
- Deploy cluster HDFS with multiple VMs
- Deploy cluster HDFS on servers
- Deploy GeoNetwork
- With Vagrant and ansible
- On server
- License
Ansible deployment for AIDMOIt
This project aims to deploy a datalake and its ecosystem using Ansible and/or Vagrant.
Currently, this project can:
- Deploy a mono-node HDFS, either on a server using Ansible (production environment) or on a VM using Vagrant and Ansible (sandbox environment)
- Deploy an HDFS cluster on your own computer, using virtual machines with Vagrant and Ansible provisioning
- Deploy GeoNetwork as a metadata management system
Prerequisites
- Ansible
- Install ansible on your own computer (for Ubuntu or Debian) :
apt-get install ansible
- If you connect to your remote machine with a password instead of an SSH key (SSH keys are recommended), also install sshpass:
apt-get install sshpass
- Configure Ansible so that becoming an unprivileged user does not fail with the error "Failed to set permissions on the temporary files Ansible needs to create when becoming an unprivileged user":
sed -i 's/.*pipelining.*/pipelining = True/' /etc/ansible/ansible.cfg
sed -i 's/.*allow_world_readable_tmpfiles.*/allow_world_readable_tmpfiles = True/' /etc/ansible/ansible.cfg
- Vagrant with VirtualBox
- Install VirtualBox on your own computer (for Ubuntu or Debian):
apt-get install virtualbox
- Install vagrant on your own computer (for Ubuntu or Debian):
apt-get install vagrant
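The two sed commands in the Ansible configuration step only rewrite lines that already mention pipelining or allow_world_readable_tmpfiles; if a setting is absent from the file, nothing changes. The sketch below previews the effect on a scratch copy instead of on /etc/ansible/ansible.cfg itself. The sample file content is an assumption made for illustration, not the content shipped by any particular Ansible package, and sed -i is used in its GNU form as found on Ubuntu or Debian.

```shell
#!/bin/sh
# Preview the two sed edits on a scratch copy; the real config is untouched.
cfg="$(mktemp)"

# Assumed sample content: both settings present but commented out.
cat > "$cfg" <<'EOF'
[defaults]
#allow_world_readable_tmpfiles = False

[ssh_connection]
#pipelining = False
EOF

sed -i 's/.*pipelining.*/pipelining = True/' "$cfg"
sed -i 's/.*allow_world_readable_tmpfiles.*/allow_world_readable_tmpfiles = True/' "$cfg"

# Print the active (uncommented) settings after the edit.
grep -E '^(pipelining|allow_world_readable_tmpfiles) = True$' "$cfg"
```

If grep prints fewer than two lines on a copy of your real file, the missing setting has to be added by hand under the appropriate section.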
Ansible
Description of Ansible
Ansible is an open-source software provisioning, configuration management, and application-deployment tool. It runs on many Unix-like systems, and can configure both Unix-like systems and Microsoft Windows. It includes its own declarative language to describe system configuration.
Ansible is a good tool to deploy and maintain IT systems. Because it is based on YAML configuration files, Ansible makes it easy to describe your configuration and share it with your collaborators. You can then deploy it to your infrastructure; all you need is SSH access to your servers.
Getting started
An Ansible playbook is a list of system instructions to be sent to a machine. That is why you only need two things:
- Get ansible installed on your own computer
- Have a remote machine (physical, VMware, VirtualBox, Docker, LXC, ...) with an SSH server running
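To make the idea of a playbook concrete, here is a minimal sketch that writes a one-task playbook to a temporary file and, if Ansible is installed, asks ansible-playbook to check its syntax. The playbook targets localhost with a local connection so no remote machine is needed. The play name, task name, and file location are illustrative and are not part of this project.

```shell
#!/bin/sh
# Write a minimal playbook (illustrative, not one of this project's playbooks).
playbook="$(mktemp)"
cat > "$playbook" <<'EOF'
- name: Minimal example play
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Check that Ansible can run a module
      ping:
EOF

# Only validate the syntax when ansible-playbook is available.
if command -v ansible-playbook >/dev/null 2>&1; then
    ansible-playbook --syntax-check "$playbook"
else
    echo "ansible-playbook not found; playbook written to $playbook"
fi
```

A real playbook for a remote machine would replace localhost with a host or group from an inventory file and drop the local connection setting.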
Vagrant & Ansible
Description of Vagrant
Vagrant is an open-source software product for building and maintaining portable virtual software development environments, e.g. for VirtualBox, KVM, Hyper-V, Docker containers, VMware, and AWS. It tries to simplify the software configuration management of virtualizations in order to increase development productivity. Vagrant is written in the Ruby language, but its ecosystem supports development in a few languages.
Vagrant manages your virtual machines (VMs) from the command line. The benefits are:
- Quickly create a VM with a known, controlled environment
- Restore your VM to a known state
- Distribute your VMs easily
Vagrant and Ansible can be combined to create, deploy, and maintain your VMs, as this project does.
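The benefits listed above map onto a handful of Vagrant subcommands. The sketch below only prints the commands by default (DRY_RUN=1), so it can be read or run safely on any machine. The run helper, the snapshot name clean-install, and the box file name datalake.box are illustrative choices, not part of this project.

```shell
#!/bin/sh
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run vagrant up                               # create and provision the VM
run vagrant snapshot save clean-install      # record a known state
run vagrant snapshot restore clean-install   # return to that state later
run vagrant package --output datalake.box    # export the VM to share it
run vagrant destroy -f                       # remove the VM
```

Running the script with DRY_RUN=0 from a directory that contains a VagrantFile would execute the commands for real, including the final destroy.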
Getting started
You only need VirtualBox and Vagrant installed on your computer. This project creates the VMs that you need for your datalake.
Deploy a mono-node HDFS
Deploy mono node HDFS on a VM
- In cli : Go to the directory which contains the VagrantFile
- In cli : start your VM:
vagrant up
Deploy mono-node HDFS on a server
First, set your server's IP address in the inventory file.
Then run the script ansible-launch.sh:
/bin/bash ansible-launch.sh
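The layout of this project's inventory file is not shown here, so the following is only a sketch of what a minimal INI-style Ansible inventory looks like. The group name hdfs, the host alias server1, the address 192.0.2.10 (taken from a documentation-only range), and the user name deploy are all placeholders. If Ansible is installed, ansible-inventory prints how it parsed the file, which is a cheap check before launching the deployment.

```shell
#!/bin/sh
# Write a placeholder inventory to a scratch file (names and IP are hypothetical).
inventory="$(mktemp)"
cat > "$inventory" <<'EOF'
[hdfs]
server1 ansible_host=192.0.2.10 ansible_user=deploy
EOF

# Show how Ansible parses the inventory, when the tool is available.
if command -v ansible-inventory >/dev/null 2>&1; then
    ansible-inventory -i "$inventory" --list
else
    cat "$inventory"
fi
```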
Deploy a HDFS cluster
Deploy cluster HDFS with multiple VMs
- Set your nodes' IP addresses in the VagrantFile. In the same file, check the network settings, in particular the DNS nameservers: if your host machine is on a corporate network, your network administrator may allow only your company's DNS servers. In that case, edit them in the settings section and in the provisioning shell script of the VagrantFile. If there are no such DNS rules, set the COMPANY_NETWORK_DNS_RULE variable to "false".
- Declare those IP addresses for the Ansible provisioning in vars. If you did not change the IP settings, skip this step.
- Configure your own computer to reach your nodes by hostname (needed to access the Hadoop web UI):
vim /etc/hosts
10.0.0.10 namenode
10.0.0.11 datanode1
10.0.0.12 datanode2
- In cli: start your VMs from the vagrant/cluster directory:
vagrant up
- Format HDFS :
- ssh on namenode
- in cli: as user hadoop, change directory and format HDFS:
sudo su hadoop
cd /usr/local/hadoop/bin/
hdfs namenode -format
- Start the HDFS daemon on your cluster:
- ssh on namenode
- in cli : as root : start service hadoop
sudo systemctl start hadoop
- Work in progress: systemd reports an error at this step, but the cluster works anyway.
- Verify your cluster is up:
- on your own device, open a web browser
- go to [IP-of-your-namenode]:9870 (with the default settings: http://10.0.0.10:9870)
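Editing /etc/hosts by hand works, but repeating the setup can leave duplicate lines. A minimal sketch of an idempotent alternative follows, using the default addresses from this section. It writes to a scratch file by default; pointing HOSTS_FILE at /etc/hosts (as root) is left as a deliberate step. The add_host helper and the HOSTS_FILE variable are illustrative and not part of this project.

```shell
#!/bin/sh
# Target file: a scratch file unless HOSTS_FILE is set explicitly.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

# Append "IP name" only if that exact line is not already present.
add_host() {
    entry="$1 $2"
    grep -qxF "$entry" "$HOSTS_FILE" || echo "$entry" >> "$HOSTS_FILE"
}

add_host 10.0.0.10 namenode
add_host 10.0.0.11 datanode1
add_host 10.0.0.12 datanode2
add_host 10.0.0.10 namenode   # second call is a no-op

cat "$HOSTS_FILE"
```

If you changed the node addresses in the VagrantFile, the same addresses have to be used here.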
Deploy cluster HDFS on servers
work in progress
Deploy GeoNetwork
With Vagrant and ansible
- Set your nodes' IP addresses in the VagrantFile. In the same file, check the network settings, in particular the DNS nameservers: if your host machine is on a corporate network, your network administrator may allow only your company's DNS servers. In that case, edit them in the settings section and in the provisioning shell script of the VagrantFile. If there are no such DNS rules, set the COMPANY_NETWORK_DNS_RULE variable to "false".
- In cli: start a GeoNetwork VM from the vagrant/geonetwork directory:
vagrant up
- On your own computer, open a web browser and go to http://10.0.0.9:8080/geonetwork (if you kept the default IP address)
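A service started by vagrant up may need some time before its web page responds. A small polling helper, sketched here with curl, can replace refreshing the browser by hand. The function name wait_for_url and its retry arguments are illustrative; the URL in the closing comment is the default GeoNetwork address given above.

```shell
#!/bin/sh
# Poll a URL until it answers successfully, or give up after N tries.
# Usage: wait_for_url URL [TRIES] [DELAY_SECONDS]
wait_for_url() {
    url="$1"
    tries="${2:-30}"
    delay="${3:-10}"
    i=1
    while [ "$i" -le "$tries" ]; do
        # -f: treat HTTP errors as failure, -s: quiet, -L: follow redirects
        if curl -fsL --max-time 5 -o /dev/null "$url"; then
            echo "up: $url"
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt is coming.
        if [ "$i" -le "$tries" ]; then
            sleep "$delay"
        fi
    done
    echo "not reachable: $url"
    return 1
}

# Example (default address from this section):
# wait_for_url http://10.0.0.9:8080/geonetwork 30 10
```

The same helper can be pointed at the namenode web UI on port 9870 after starting the HDFS cluster.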
On server
work in progress
License
Aidmoit's Collect is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses