- Spin up Ubuntu 12.04 HVM (ubuntu/images/hvm/ubuntu-precise-12.04-amd64-server-20140927):
- 1x r3.large, 30GB General Purpose SSD (for Ambari Master)
- 2x t2.small, 30GB General Purpose SSD (for Ambari Slaves)
For production, run the Name Node on its own instance, separate from other services. In this walkthrough, the r3.large hosts the Name Node and all other services as well.
For simplicity, we used the same .pem key file for all three instances. We call it env1pem.pem.
-
Create a security group with the following settings:
- Ambari needs 8080 from everywhere
- TCP/ICMP/UDP from all instances within security group
- 22, 80 open
- 8800-8820 for RVI
- If we need Storm, we will need to add its ports as well
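The rules above could also be created from the command line. The following is a minimal sketch that only prints the AWS CLI calls so they can be reviewed before running. It assumes the AWS CLI is installed, that "open" for 22 and 80 means open to everywhere, and that `SG_ID` (a hypothetical placeholder) is replaced with the real security group ID. Check the exact syntax against the AWS CLI documentation for your version.

```shell
#!/usr/bin/env bash
# Sketch: print (do not run) AWS CLI calls for the rules listed above.
# SG_ID is a hypothetical placeholder; replace it with your group's ID.
SG_ID="${SG_ID:-sg-00000000}"

print_sg_rules() {
  local port
  # 8080 = Ambari UI, 22 = SSH, 80 = HTTP, 8800-8820 = RVI
  # Assumption: all four are opened to 0.0.0.0/0.
  for port in 8080 22 80 8800-8820; do
    echo "aws ec2 authorize-security-group-ingress --group-id $SG_ID --protocol tcp --port $port --cidr 0.0.0.0/0"
  done
  # All protocols between instances that share this security group.
  echo "aws ec2 authorize-security-group-ingress --group-id $SG_ID --ip-permissions 'IpProtocol=-1,UserIdGroupPairs=[{GroupId=$SG_ID}]'"
}

print_sg_rules
```

Run it with `SG_ID=<your group id>` set, inspect the output, and paste the lines you want into a shell.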
-
ssh into master
-
sudo apt-get update
-
sudo apt-get install lamp-server^ -y && sudo apt-get install ntp -y
-
leave password blank for MySQL prompts (we will not be using MySQL)
-
sudo su
-
cd ~
-
ssh-keygen -t rsa
- name it "id_rsa"
-
copy the env1pem.pem into the master, in the same directory as ambari
-
ssh into a slave, from master
-
sudo apt-get update && sudo apt-get install ntp -y
-
sudo su
-
vi /root/.ssh/id_rsa.pub and paste in the master's public key (the contents of /root/.ssh/id_rsa.pub on the master)
-
cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
-
repeat the slave steps above (from "ssh into a slave, from master" through the append to authorized_keys) for the other slaves
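Because the append to authorized_keys is repeated on every slave, it is easy to add the same key twice. The following is a small sketch of an append that skips keys already present. The function name `add_authorized_key` is made up for this example, and both paths are passed as arguments so it can be tried on scratch files first.

```shell
#!/usr/bin/env bash
# Append a public key to an authorized_keys file only if that exact
# line is not already there. add_authorized_key is a hypothetical helper.
add_authorized_key() {
  local pub_file="$1" auth_file="$2" key
  key="$(cat "$pub_file")" || return 1
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  chmod 600 "$auth_file"
  # -x: match the whole line, -F: treat the key as a fixed string
  if ! grep -qxF -- "$key" "$auth_file"; then
    printf '%s\n' "$key" >> "$auth_file"
  fi
}
```

On a slave, as root, this would be called as `add_authorized_key /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys`.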
-
go back to master
-
vi /etc/hosts
-
append these lines below 127.0.0.1 localhost:
- 172.31.6.147 ip-172-31-6-147.us-west-2.compute.internal slave1
- 172.31.3.44 ip-172-31-3-44.us-west-2.compute.internal slave2
- 172.31.42.145 ip-172-31-42-145.us-west-2.compute.internal master
where each line has the form <<Private IP>> <<Private DNS/Fully Qualified Domain Name (FQDN)>> <<alias>>
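The hosts entries follow a regular pattern, so they can be generated rather than typed by hand. This sketch assumes the private DNS naming seen in the entries above (ip-A-B-C-D.us-west-2.compute.internal); `hosts_line` is a hypothetical helper name, and the region is a parameter that defaults to us-west-2. Confirm the real name on each instance with `hostname -f` before relying on it.

```shell
#!/usr/bin/env bash
# Build one /etc/hosts line from a private IP and an alias, assuming the
# EC2 private DNS pattern ip-A-B-C-D.<region>.compute.internal.
hosts_line() {
  local ip="$1" alias_name="$2" region="${3:-us-west-2}"
  local dashed
  dashed="$(printf '%s' "$ip" | tr '.' '-')"
  echo "$ip ip-${dashed}.${region}.compute.internal $alias_name"
}

hosts_line 172.31.6.147 slave1
hosts_line 172.31.3.44 slave2
hosts_line 172.31.42.145 master
```

The output can be reviewed and then appended to /etc/hosts as root.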
-
check if you can ssh root@slave1 from master.
-
wget -nv http://public-repo-1.hortonworks.com/ambari/ubuntu12/2.x/updates/2.0.0/ambari.list -O /etc/apt/sources.list.d/ambari.list
-
apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
-
apt-get update
-
apt-get install ambari-server -y
-
ambari-server setup
- daemon? no
- jdk choice? Oracle JDK 1.7
- agree to jdk? y
- advanced db config? n
-
ambari-server start
-
open browser to port 8080 of master instance
-
sign in with username=admin, password=admin191
- you can change password if needed
-
click launch install wizard
-
in get started: name the cluster ucsdkthxbai, click next
-
stack: HDP 2.2, click next
-
install options:
- target hosts: put all your private DNS (masters and slaves)
- host registration info: copy in env1pem.pem, and ssh user account is ubuntu
- click next
- note: sometimes the next arrow doesn't work. click back and redo
-
Choose services: choose all services, and click next. ignore spark warning.
-
assign masters: put all services onto master, except ZooKeeper Server, which must run on every node.
-
assign slaves and clients: put data nodes on slaves ONLY, put rest on master ONLY.
-
customize services:
- in hive, enter database password: hivepassword
- in oozie, enter database password: ooziepassword
- in knox, enter master secret: knoxpassword
-
review: next
-
wait for installation (will take ~30 minutes)
-
press finish to go to dashboard
- ssh into master instance
- sudo su
- cd /root
- mkdir rvi
- apt-get install git -y
- wget http://packages.erlang-solutions.com/erlang-solutions_1.0_all.deb
- dpkg -i erlang-solutions_1.0_all.deb
- apt-get update
- apt-get install erlang -y
- run erl -v and check that it reports version 6.20
- git clone https://github.com/PDXostc/rvi_core.git
- cd rvi_core/
- vi rvi_sample.config
- (line 66) change "lager_console_backend, notice" to "lager_console_backend, info"
- (line 93) set node_address to <public IP of master instance>:8817
- (line 112) change node_service_prefix to jlr.com/backend
- apt-get install make
- make deps
- make compile
- ./scripts/setup_rvi_node.sh -d -n hdpbackend -c rvi_sample.config
- ./scripts/rvi_node.sh -n hdpbackend
- open new terminal tab
- git clone https://github.com/PDXostc/rvi_backend (TBA after pull request complete; the code for Apache Kafka is currently on our own repo)
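The log-level edit to rvi_sample.config described earlier (changing lager_console_backend from notice to info) can be scripted instead of done in vi. This is only a sketch: `set_console_log_info` is a hypothetical helper, and it matches the literal text "lager_console_backend, notice", so it does nothing if the file spells that entry differently. The node_address and node_service_prefix edits are left manual because their exact format in the file is not shown here.

```shell
#!/usr/bin/env bash
# Switch the lager console log level from notice to info in a config file.
# The result is written to a temp file first, then copied back so the
# original file keeps its permissions.
set_console_log_info() {
  local cfg="$1" tmp
  tmp="$(mktemp)" || return 1
  if sed 's/lager_console_backend, notice/lager_console_backend, info/' "$cfg" > "$tmp"; then
    cat "$tmp" > "$cfg"
  fi
  rm -f "$tmp"
}
```

From the rvi_core directory it would be called as `set_console_log_info rvi_sample.config`; checking the result with `grep lager_console_backend rvi_sample.config` confirms whether the line was changed.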