Apache Hadoop — How to install and configure a cluster on Ubuntu 18.04

java -version
sudo apt-get install default-jre
sudo apt-get install default-jdk
sudo addgroup hadoop
sudo adduser -–ingroup hadoop hadoopuser
sudo adduser hadoopuser sudo
sudo apt-get install openssh-server
su - hadoopuser
ssh-keygen -t rsa -P ""
cat $HOME/ .ssh/id_rsa.pub >> $HOME/ .ssh/authorized_keys
ssh localhost
sudo wget -P /home/arga/Desktop/ https://archive.apache.org/dist/hadoop/common/hadoop-2.9.1/hadoop-2.9.1.tar.gz
cd /home/arga/Desktop
sudo tar xvzf hadoop-2.9.1.tar.gz
sudo mv hadoop-2.9.1 /usr/local/hadoop
sudo chown -R hadoopuser /usr/local
sudo gedit ~/.bashrc
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS=""
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar
source ~/.bashrc
cd /usr/local/hadoop/
sudo nano hadoop-env.sh

Now, let’s set up a simple-node for a cluster

sudo nano core-site.xml
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
sudo nano hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_tmp/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_tmp/hdfs/datanode</value>
</property>
sudo nano yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
sudo cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
sudo nano mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
cd
mkdir -p /usr/local/hadoop_space/hdfs/namenode
mkdir -p /usr/local/hadoop_space/hdfs/datanode
sudo chown -R hadoopuser /usr/local/hadoop_space
cd
hdfs namenode -format
start-all.sh
jps

And this is how a single node cluster is created.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Afonso Antunes

Afonso Antunes

Finalist student in Computer Engineering