Info
Build a Hadoop cluster in OpenStack with Terraform.
The primary goal of this project is to build a Hadoop cluster. Most of it is generic, though - the Hadoop deployment can be skipped or replaced by implementing a different deployment type (see the deployments directory).
Requirements
Locally installed:
- Terraform
- Python (to run the orchestrate.py orchestration script)
Configuration:
- public part of the SSH key uploaded to OpenStack
- ssh-agent running with the SSH key loaded
- configured access to OpenStack, see the Cloud Documentation (either a downloaded clouds.yaml file or the environment variables set)
- floating IP created
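For example, using the standard OpenStack CLI (the key file, the key name 'mykey', and the external network name 'public' are placeholders):
# upload the public part of the SSH key
openstack keypair create --public-key ~/.ssh/id_rsa.pub mykey
# load the key into ssh-agent
ssh-add ~/.ssh/id_rsa
# allocate a floating IP from the external network
openstack floating ip create public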
Hadoop image
To set up Hadoop on a single machine using the Hadoop image, launch:
/usr/local/sbin/hadoop-setup.sh
The Hadoop image can also be used to build a Hadoop cluster. It contains pre-downloaded and pre-installed Hadoop packages and dependencies, which speeds things up.
Single machine
See above, when the Hadoop image is available.
It is also possible to build Hadoop on a single machine using Terraform + the orchestration scripts in this repository (set type=hadoop-single and n=0).
For example (see also the other values in variables.tf):
cat <<EOF > mycluster.auto.tfvars
n = 0
type = "hadoop-single"
flavor = "standard.large" # >4GB memory needed
EOF
terraform init
terraform apply
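Once the apply finishes, a quick smoke test over SSH (a sketch assuming the image's default debian user; PUBLIC_IP is a placeholder, see the Public IP section below):
ssh debian@PUBLIC_IP hadoop version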
Build cluster
#
# 1. check *variables.tf*
#
# It is possible to override default values using *\*.auto.tfvars* files.
#
cat <<EOF > mycluster.auto.tfvars
domain = "mydomain"
n = 3
security_trusted_cidr = [
"0.0.0.0/0",
"::/0",
]
ssh = "mykey"
EOF
#
# 2. launch the setup
#
terraform init
terraform apply
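After the apply, the result can be cross-checked with standard commands:
# list the created instances in OpenStack
openstack server list
# show the state recorded by Terraform
terraform show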
Destroy cluster
terraform destroy
Usage
Hadoop can be used on the "master" node (the frontend machine). Its name can be configured via master_hostname in variables.tf. This machine is configured with the floating public IP address.
Before accessing Hadoop services, you need to obtain a Kerberos ticket:
kinit
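To verify the ticket and try access (the HDFS path is only an example):
# show the obtained Kerberos ticket
klist
# quick check that HDFS answers
hdfs dfs -ls /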
Password
Look for the generated password of the created Hadoop user in the output, or in the password.txt file in the home directory (/home/debian/password.txt).
It is possible to set a new password on the master server using ('debian' is the user name):
sudo kadmin.local cpw debian
Public IP
The public IP is in the public_hosts file or the inventory file.
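For example, to reach the master node (a sketch assuming public_hosts lists the IP in its first column and the image's default debian user):
ssh debian@$(awk '{print $1; exit}' public_hosts)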
Advanced usage
Add Hadoop node
On the Terraform client machine:
# increase number of nodes in terraform
vim *.auto.tfvars
# check the output
terraform plan
# perform the changes
terraform apply
# refresh configuration
yellowmanager refresh
# (with credentials, this calls: 1) hdfs dfsadmin -refreshNodes, 2) yarn rmadmin -refreshNodes)
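Afterwards, the new nodes can be checked on the master machine with the same report used for decommissioning below:
sudo -u hdfs kinit -k -t /etc/security/keytab/nn.service.keytab nn/`hostname -f`
sudo -u hdfs hdfs dfsadmin -report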
Remove Hadoop node
Data must first be migrated from the nodes being removed from the Hadoop cluster. Theoretically this isn't needed when removing only one node, thanks to the HDFS replication policy; in that case the steps would be the same as for adding a node.
- update Hadoop cluster
On the master machine:
# nodes to remove (these must be the nodes with the highest numbers), for example:
echo node3.terra >> /etc/hadoop/conf/excludes
# refresh configuration
yellowmanager refresh
# (with credentials, this calls: 1) hdfs dfsadmin -refreshNodes, 2) yarn rmadmin -refreshNodes)
# wait for decommissioning to finish (via the CLI below or at http://PUBLIC_IP:50070)
sudo -u hdfs kinit -k -t /etc/security/keytab/nn.service.keytab nn/`hostname -f`
sudo -u hdfs hdfs dfsadmin -report
...
- update infrastructure + SW configuration
On the Terraform client machine:
# decrease number of nodes in terraform
vim *.auto.tfvars
# check the output
terraform plan
# perform the changes
terraform apply
- cleanups
On the master machine:
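# clear the excludes file and refresh the node list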
echo > /etc/hadoop/conf/excludes
sudo -u hdfs hdfs dfsadmin -refreshNodes
Add user
Launch /usr/local/sbin/hadoop-adduser.sh USER_NAME across the whole cluster.
For example, using Ansible from the master machine (replace $USER_NAME with the new user name):
sudo su -l deployadm
cd ~/terraform
ansible -i ./inventory -m command -a "/usr/local/sbin/hadoop-adduser.sh $USER_NAME" all
The generated password is written to the output and stored in the user's home directory.
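To check afterwards that the user exists on all nodes (same hypothetical $USER_NAME):
ansible -i ./inventory -m command -a "id $USER_NAME" all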
Internals
Terraform builds the infrastructure. In the last step the orchestrate.py script is launched; it finishes the missing pieces (waiting for the machines to come up, proper DNS setup, ...) and then deploys and configures the software. Information about the infrastructure from Terraform is stored in the config.json file and used for the orchestration.
The orchestration script has multiple steps and a dry-run option. See ./orchestrate.py --help.
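For example, to list the available steps and inspect the infrastructure information handed over by Terraform (jq is just one option for pretty-printing):
# list orchestration steps and the dry-run option
./orchestrate.py --help
# inspect the infrastructure information from Terraform
jq . config.json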