Hadoop 3.2.1 Installation Steps on Ubuntu
- November 23, 2021
- Posted by: Ankush Thavali
- Category: Hadoop


PuTTY Setup [Assuming the Ubuntu installation is complete]
The settings below let you connect to the Hadoop cluster remotely using PuTTY.
Change Hostname in Ubuntu
sudo hostnamectl set-hostname hadoop.com
Open the /etc/hosts file and update the hostname entry:
root@ankush-virtual-machine:/home/ankush# cat /etc/hosts
127.0.0.1       localhost
127.0.1.1       hadoop.com
Run the hostname command and confirm that the hostname has changed:
root@ankush-virtual-machine:/home/ankush# hostname
hadoop.com
Install packages that enable SSH access and copy-paste between the VMware guest and the host
sudo apt update
sudo apt install net-tools
sudo apt install open-vm-tools-desktop -y
sudo apt install vim -y
sudo apt install openssh-server -y
sudo service ssh status
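To connect from PuTTY you need the VM's IP address. A quick way to find it (net-tools was installed above; the interface name, e.g. ens33, may differ on your machine):

ifconfig
# or
ip addr show

Use the IP address of your network interface as the Host Name in PuTTY, with port 22.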
Switch to root user
sudo su -
Java Installation
sudo apt install openjdk-8-jdk -y
java -version; javac -version
Create the Dedicated Hadoop User and Set Up Passwordless SSH
sudo adduser hdoop
su - hdoop
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
ssh localhost
Add the hdoop user to the sudoers list
su - ankush
sudo adduser hdoop sudo
Downloading Hadoop
wget https://downloads.apache.org/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
Alternatively, download Hadoop directly from the official Apache website and move the tarball to /home/hdoop. Then extract it:
tar xzf hadoop-3.2.1.tar.gz
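Optionally, verify the integrity of the downloaded tarball. A minimal check, assuming the matching .sha512 file is still published next to the tarball:

wget https://downloads.apache.org/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz.sha512
sha512sum hadoop-3.2.1.tar.gz
# compare the printed hash against the value in hadoop-3.2.1.tar.gz.sha512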
Editing 6 important files
1st file [.bashrc]
cd /home/hdoop
sudo vi .bashrc
##If this fails with a message that hdoop is not a sudo user, switch back to the admin user, add hdoop to the sudo group, and try again:
su - ankush
sudo adduser hdoop sudo

#Add below lines in this file
#Hadoop Related Options
export HADOOP_HOME=/home/hdoop/hadoop-3.2.1
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
source ~/.bashrc
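A quick way to confirm the new environment variables are in effect (run as the hdoop user; hadoop should now resolve via the updated PATH):

echo $HADOOP_HOME
hadoop version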
2nd File [hadoop-env.sh]
sudo vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh
#Add the line below at the end of this file
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
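If you are unsure of the exact Java path on your machine (the value above assumes the amd64 OpenJDK 8 package), you can locate it first:

readlink -f $(which javac)
# e.g. /usr/lib/jvm/java-8-openjdk-amd64/bin/javac -- JAVA_HOME is everything before /bin/javac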
3rd File [core-site.xml]
vi $HADOOP_HOME/etc/hadoop/core-site.xml
#Add below lines in this file (between <configuration> and </configuration>)
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hdoop/tmpdata</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
  <description>The name of the default file system.</description>
</property>
4th File [hdfs-site.xml]
vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml
#Add below lines in this file (between <configuration> and </configuration>)
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/hdoop/dfsdata/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/hdoop/dfsdata/datanode</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
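You may also want to create the NameNode and DataNode directories referenced above ahead of time, as the hdoop user (paths taken from the values above):

mkdir -p /home/hdoop/dfsdata/namenode /home/hdoop/dfsdata/datanode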
5th File [mapred-site.xml]
vi $HADOOP_HOME/etc/hadoop/mapred-site.xml
#Add below lines in this file (between <configuration> and </configuration>)
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
6th File [yarn-site.xml]
sudo vi $HADOOP_HOME/etc/hadoop/yarn-site.xml
#Add below lines in this file (between <configuration> and </configuration>)
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>127.0.0.1</value>
</property>
<property>
  <name>yarn.acl.enable</name>
  <value>0</value>
</property>
<property>
  <name>yarn.nodemanager.env-whitelist</name>
  <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
Launching Hadoop
hdfs namenode -format
start-all.sh
Note: start-all.sh starts both the HDFS and YARN daemons; alternatively, run start-dfs.sh and start-yarn.sh separately.
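Once the scripts return, you can check which daemons are running with jps (the exact list can vary with your configuration):

jps
# expect NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager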
Hadoop Web UI URLs
The Hadoop NameNode web UI runs on default port 9870: http://192.168.0.60:9870/
Port 8042 serves the NodeManager web UI with information about the node and the applications running on it: http://localhost:8042/
Access port 9864 to get details about an individual DataNode: http://localhost:9864/
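If the browser cannot reach a UI, a quick check from the VM itself (assuming curl is installed):

curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9870/
# 200 means the NameNode web UI is responding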
Start/Stop Hadoop Services
start-all.sh & stop-all.sh: start and stop all Hadoop daemons at once. Issuing them on the master machine will start/stop the daemons on all nodes of the cluster. Deprecated, as the scripts themselves warn.
start-dfs.sh, stop-dfs.sh and start-yarn.sh, stop-yarn.sh: same as above, but start/stop the HDFS and YARN daemons separately on all nodes from the master machine. These are now preferred over start-all.sh & stop-all.sh.
hadoop-daemon.sh namenode/datanode and yarn-daemon.sh resourcemanager: start individual daemons on an individual machine manually. You need to go to the particular node and issue these commands. Use case: you have added a new DataNode to the cluster and need to start only the DataNode daemon on that machine: bin/hadoop-daemon.sh start datanode
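As a final smoke test, try a few basic HDFS commands as the hdoop user (the /test directory name is just an example):

hdfs dfs -mkdir /test
hdfs dfs -ls /
hdfs dfs -rm -r /test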
