wget https://downloads.apache.org/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
or
Alternatively, download Hadoop directly from the official Apache website. [Move the downloaded hadoop-3.2.1.tar.gz file to the /home/hdoop directory.]
tar xzf hadoop-3.2.1.tar.gz
Editing 6 important files
1st file [.bashrc]
cd /home/hdoop
sudo vi .bashrc   ##here you may get an error saying hdoop is not in the sudoers file
If that happens, switch to a user with sudo rights (ankush in this example) and add hdoop to the sudo group:
su - ankush
sudo adduser hdoop sudo
cd /home/hdoop
sudo vi .bashrc
#Add the below lines to this file
#Hadoop Related Options
export HADOOP_HOME=/home/hdoop/hadoop-3.2.1
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
source ~/.bashrc
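A quick sanity check that the new environment is active (exact output depends on your setup):
echo $HADOOP_HOME     # should print /home/hdoop/hadoop-3.2.1
hadoop version        # should report Hadoop 3.2.1 if PATH is set correctly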
2nd File [hadoop-env.sh]
sudo vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh
#Add the below line at the end of this file
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
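If you are unsure of the JDK path on your machine, it can usually be found like this (assuming a full OpenJDK package installed via apt):
readlink -f /usr/bin/javac | sed "s:/bin/javac::"
#or list the registered alternatives
update-alternatives --list java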
3rd File [core-site.xml]
vi $HADOOP_HOME/etc/hadoop/core-site.xml
#Add the below lines in this file (between <configuration> and </configuration>)
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hdoop/tmpdata</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
<description>The name of the default file system.</description>
</property>
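Note: fs.default.name is the older, deprecated name of this setting; on Hadoop 3.x the same value can equally be set via fs.defaultFS:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>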
4th File [hdfs-site.xml]
vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml
#Add the below lines in this file (between <configuration> and </configuration>)
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hdoop/dfsdata/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hdoop/dfsdata/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
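The directories referenced above do not exist yet; create them up front (paths are the ones used in core-site.xml and hdfs-site.xml):
mkdir -p /home/hdoop/tmpdata
mkdir -p /home/hdoop/dfsdata/namenode
mkdir -p /home/hdoop/dfsdata/datanode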
5th File [mapred-site.xml]
vi $HADOOP_HOME/etc/hadoop/mapred-site.xml
#Add below lines in this file(between <configuration> and </configuration>)
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
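6th File [yarn-site.xml]
vi $HADOOP_HOME/etc/hadoop/yarn-site.xml
#A minimal single-node setup usually only needs the MapReduce shuffle service enabled; adjust as required (between <configuration> and </configuration>)
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
Once all six files are edited, format the NameNode once and start the daemons so the web UIs below become reachable:
hdfs namenode -format
start-dfs.sh
start-yarn.sh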
The Hadoop NameNode web UI runs on its default port 9870:
http://192.168.0.60:9870/
Port 8042 serves the YARN NodeManager web UI, with information about the node and the applications running on it.
http://localhost:8042/
Access port 9864 for the DataNode web UI, with details about the individual data node.
http://localhost:9864/
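If a web UI is not reachable, jps (shipped with the JDK) shows which Hadoop daemons are actually running on the node:
jps
#On a working single-node setup you should see NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager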
Start/Stop Hadoop Services
start-all.sh & stop-all.sh : Start and stop all Hadoop daemons at once. Issuing these on the master machine will start/stop the daemons on all the nodes of the cluster. Both scripts are deprecated.
start-dfs.sh, stop-dfs.sh and start-yarn.sh, stop-yarn.sh : Same as above, but start/stop the HDFS and YARN daemons separately on all the nodes from the master machine. It is advisable to use these over start-all.sh & stop-all.sh.
hadoop-daemon.sh namenode/datanode and yarn-daemon.sh resourcemanager : Start an individual daemon on an individual machine manually. You need to log in to that particular node and issue these commands.
Use case : Suppose you have added a new DataNode to your cluster and need to start the DataNode daemon only on this machine:
$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode
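Note that in Hadoop 3.x these per-daemon scripts are themselves deprecated; the equivalent commands (assuming the same PATH setup as above) are:
hdfs --daemon start datanode
yarn --daemon start nodemanager
yarn --daemon start resourcemanager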