From the perspective of functionality, a Hadoop cluster is composed of an HDFS cluster and a MapReduce cluster. The HDFS cluster provides the default filesystem for Hadoop. It has one or more NameNodes to keep track of the filesystem metadata, while the actual data blocks are stored on distributed slave nodes managed by DataNode daemons. Similarly, a MapReduce cluster has one JobTracker daemon on the master node and a number of TaskTrackers on the slave nodes. The JobTracker manages the life cycle of MapReduce jobs: it splits each job into smaller tasks and schedules the tasks to run on the TaskTrackers. A TaskTracker executes the tasks assigned to it by the JobTracker in parallel by forking one or more JVM processes. As a Hadoop cluster administrator, you will be responsible for managing both the HDFS cluster and the MapReduce cluster.
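
Before diving into the administration commands, it can be useful to confirm which daemons are actually running on a node. The following is a minimal sketch using the jps tool that ships with the JDK; the process IDs are illustrative, and the exact daemon list depends on whether the node is a master or a slave and on where the SecondaryNameNode has been configured to run.

# On the master node we expect to see the NameNode and JobTracker daemons
# (and possibly the SecondaryNameNode).
jps
2481 NameNode
2761 JobTracker
2652 SecondaryNameNode
3104 Jps

# On a slave node we expect to see the DataNode and TaskTracker daemons.
jps
1934 DataNode
2045 TaskTracker
2210 Jps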

In general, system administrators should maintain the health and availability of the cluster. More specifically, this means managing the NameNodes and DataNodes for HDFS, and the JobTracker and TaskTrackers for MapReduce. Other administrative tasks include the management of Hadoop jobs, for example, configuring job scheduling policies with schedulers.

Managing the HDFS cluster

The health of HDFS is critical for a Hadoop-based Big Data platform. HDFS problems can negatively affect the efficiency of the cluster. Even worse, they can render the cluster nonfunctional. For example, DataNode unavailability caused by network partitioning can lead to under-replicated data blocks. When this happens, HDFS automatically re-replicates those blocks, which adds significant overhead to the cluster and can leave it too unstable to be usable. In this recipe, we will show commands to manage an HDFS cluster.

Getting ready

Before getting started, we assume that our Hadoop cluster has been properly configured and all the daemons are running without any problems.

Log in to the master node from the administrator machine with the following command:

ssh hduser@master


How to do it…

Use the following steps to check the status of an HDFS cluster with hadoop fsck:

  1. Check the status of the root filesystem with the following command:

    hadoop fsck /

    
    

    We will get an output similar to the following:

    FSCK started by hduser from /10.147.166.55 for path / at Thu Feb
    28 17:14:11 EST 2013
    ..

    /user/hduser/.staging/job_201302281211_0002/job.jar: Under
    replicated blk_-665238265064328579_1016. Target Replicas is 10 but
    found 5 replica(s).

    ...................................................Status: HEALTHY

    Total size: 14420321969 B
    Total dirs: 22
    Total files: 35
    Total blocks (validated): 241 (avg. block size 59835360 B)
    Minimally replicated blocks: 241 (100.0 %)
    Over-replicated blocks: 0 (0.0 %)
    Under-replicated blocks: 2 (0.8298755 %)
    Mis-replicated blocks: 0 (0.0 %)
    Default replication factor: 2
    Average block replication: 2.0248964
    Corrupt blocks: 0
    Missing replicas: 10 (2.0491803 %)
    Number of data-nodes: 5
    Number of racks: 1
    FSCK ended at Thu Feb 28 17:14:11 EST 2013 in 28 milliseconds

    The filesystem under path ‘/’ is HEALTHY

    
    

    The output shows that a small percentage of the data blocks is under-replicated. Because HDFS can automatically create new replicas for those blocks, the filesystem under the ‘/’ path is still reported as HEALTHY. (A sketch for manually adjusting the replication factor of a file is shown at the end of these steps.)

  2. Check the status of all the files on HDFS with the following command:

    hadoop fsck / -files

    
    

    We will get an output similar to the following:

    FSCK started by hduser from /10.147.166.55 for path / at Thu Feb
    28 17:40:35 EST 2013
    / <dir>

    /home <dir>
    /home/hduser <dir>
    /home/hduser/hadoop <dir>
    /home/hduser/hadoop/tmp <dir>
    /home/hduser/hadoop/tmp/mapred <dir>
    /home/hduser/hadoop/tmp/mapred/system <dir>
    /home/hduser/hadoop/tmp/mapred/system/jobtracker.info 4 bytes, 1
    block(s): OK
    /user <dir>
    /user/hduser <dir>
    /user/hduser/randtext <dir>
    /user/hduser/randtext/_SUCCESS 0 bytes, 0 block(s): OK
    /user/hduser/randtext/_logs <dir>
    /user/hduser/randtext/_logs/history <dir>
    /user/hduser/randtext/_logs/history/job_201302281451_0002_13620904
    21087_hduser_random-text-writer 23995 bytes, 1 block(s): OK
    /user/hduser/randtext/_logs/history/job_201302281451_0002_conf.xml
    22878 bytes, 1 block(s): OK
    /user/hduser/randtext/part-00001 1102231864 bytes, 17 block(s):
    OK
    Status: HEALTHY

    
    

    Hadoop will scan and list all the files in the cluster. This command prints the size and status of every file on HDFS.

  3. Check the locations of file blocks with the following command:

    hadoop fsck / -files -locations

    
    

    The output of this command will contain the following information:

    The first line tells us that the file part-00000 has 17 blocks in total and that each block has two replicas (the replication factor has been set to 2). The following lines list the location of each block on the DataNodes. For example, block blk_6733127705602961004_1127 has been replicated on hosts 10.145.231.46 and 10.145.223.184. The number 50010 is the port number of the DataNode daemon.

  4. Check the locations of file blocks containing rack information with the following command:

    hadoop fsck / -files -blocks -racks

    
    
  5. Delete corrupted files with the following command:

    hadoop fsck / -delete

    
    
  6. Move corrupted files to /lost+found with the following command:

    hadoop fsck / -move
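
If fsck keeps reporting under-replicated blocks for particular files, as in step 1, the replication factor of those files can also be adjusted manually with the standard hadoop fs -setrep command. The following is a minimal sketch; the path is taken from the earlier fsck output and the target replication factor of 2 matches this cluster's default, so treat both as illustrative values:

# Set the replication factor of a single file to 2 and wait (-w) until
# the DataNodes have finished adjusting the replicas.
hadoop fs -setrep -w 2 /user/hduser/randtext/part-00001

# The same can be done recursively for a whole directory.
hadoop fs -setrep -R -w 2 /user/hduser/randtext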

    
    

Use the following steps to check the status of an HDFS cluster with hadoop dfsadmin:

  1. Report the status of each slave node with the following command:

    hadoop dfsadmin -report

    
    

    The output will be similar to the following:

    Configured Capacity: 422797230080 (393.76 GB)
    Present Capacity: 399233617920 (371.82 GB)
    DFS Remaining: 388122796032 (361.47 GB)
    DFS Used: 11110821888 (10.35 GB)
    DFS Used%: 2.78%
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0

    -------------------------------------------------
    Datanodes available: 5 (5 total, 0 dead)

    Name: 10.145.223.184:50010
    Decommission Status : Normal
    Configured Capacity: 84559446016 (78.75 GB)
    DFS Used: 2328719360 (2.17 GB)
    Non DFS Used: 4728565760 (4.4 GB)
    DFS Remaining: 77502160896(72.18 GB)
    DFS Used%: 2.75%
    DFS Remaining%: 91.65%
    Last contact: Thu Feb 28 20:30:11 EST 2013

    
    

    The first section of the output shows the summary of the HDFS cluster, including the configured capacity, present capacity, remaining capacity, used space, number of under-replicated data blocks, number of data blocks with corrupted replicas, and number of missing blocks.

    The following sections of the output show the status of each HDFS slave node, including the name (ip:port) of the DataNode machine, its decommission status, the configured capacity, the HDFS and non-HDFS used space, the HDFS remaining space, and the time the slave node last contacted the master.

  2. Refresh the set of DataNodes that are allowed to connect to the NameNode (the NameNode re-reads its hosts and exclude files) using the following command:

    hadoop dfsadmin -refreshNodes

    
    
  3. Check the status of the safe mode using the following command:

    hadoop dfsadmin -safemode get

    
    

    We will be able to get the following output:

    Safe mode is OFF

    
    

    The output tells us that the NameNode is not in safe mode. In this case, the filesystem is both readable and writable. If the NameNode is in safe mode, the filesystem will be read-only (write protected).

  4. Manually put the NameNode into safe mode using the following command:

    hadoop dfsadmin -safemode enter

    
    

    This command is useful for system maintenance.

  5. Make the NameNode leave safe mode using the following command:

    hadoop dfsadmin -safemode leave

    
    

    If the NameNode has been in safe mode for a long time or it has been put into safe mode manually, we need to use this command to let the NameNode leave this mode.

  6. Wait until NameNode leaves safe mode using the following command:

    hadoop dfsadmin -safemode wait

    
    

    This command is useful when we want to wait until HDFS finishes data block replication, or until a newly commissioned DataNode is ready for service, before continuing (a minimal scripting sketch is shown at the end of these steps).

  7. Save the metadata of the HDFS filesystem with the following command:

    hadoop dfsadmin -metasave meta.log

    
    

    The meta.log file will be created under the directory $HADOOP_HOME/logs. Its content will be similar to the following:

    21 files and directories, 88 blocks = 109 total
    Live Datanodes: 5
    Dead Datanodes: 0
    Metasave: Blocks waiting for replication: 0
    Metasave: Blocks being replicated: 0
    Metasave: Blocks 0 waiting deletion from 0 datanodes.
    Metasave: Number of datanodes: 5
    10.145.223.184:50010 IN 84559446016(78.75 GB) 2328719360(2.17 GB)
    2.75% 77502132224(72.18 GB) Thu Feb 28 21:43:52 EST 2013
    10.152.166.137:50010 IN 84559446016(78.75 GB) 2357415936(2.2 GB)
    2.79% 77492854784(72.17 GB) Thu Feb 28 21:43:52 EST 2013
    10.145.231.46:50010 IN 84559446016(78.75 GB) 2048004096(1.91 GB)
    2.42% 77802893312(72.46 GB) Thu Feb 28 21:43:54 EST 2013
    10.152.161.43:50010 IN 84559446016(78.75 GB) 2250854400(2.1 GB)
    2.66% 77600096256(72.27 GB) Thu Feb 28 21:43:52 EST 2013
    10.152.175.122:50010 IN 84559446016(78.75 GB) 2125828096(1.98 GB)
    2.51% 77724323840(72.39 GB) Thu Feb 28 21:43:53 EST 2013
    21 files and directories, 88 blocks = 109 total
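
The wait option from step 6 is particularly handy in maintenance scripts that should only proceed once HDFS is writable again. The following is a minimal sketch; the example jar and the input and output paths are placeholders that will differ on your cluster:

# Block until the NameNode leaves safe mode, then submit a MapReduce job.
hadoop dfsadmin -safemode wait
hadoop jar $HADOOP_HOME/hadoop-examples-*.jar wordcount \
  /user/hduser/randtext /user/hduser/wordcount-out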

    
    

How it works…

The HDFS filesystem is write protected while the NameNode is in safe mode. When an HDFS cluster is started, it enters safe mode first. The NameNode checks the replication of each data block; if the replica count of a block is smaller than the configured replication factor (3 by default), the block is marked as under-replicated. The NameNode then calculates the under-replication factor, which is the percentage of under-replicated data blocks. If this percentage is larger than the threshold value, the NameNode stays in safe mode until enough new replicas have been created to bring the under-replication factor below the threshold.
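
The threshold mentioned above is configurable. Assuming the Hadoop 1.x property name dfs.safemode.threshold.pct, whose default value is 0.999, it can be set in $HADOOP_HOME/conf/hdfs-site.xml as in the following sketch:

<!-- Percentage of blocks that must satisfy the minimal replication
     requirement before the NameNode leaves safe mode automatically. -->
<property>
  <name>dfs.safemode.threshold.pct</name>
  <value>0.999</value>
</property>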

We can get the usage of the fsck command using:

hadoop fsck


The usage information will be similar to the following:

Usage: DFSck <path> [-move | -delete | -openforwrite] [-files [-blocks
[-locations | -racks]]]
<path> start checking from this path
-move move corrupted files to /lost+found
-delete delete corrupted files
-files print out files being checked
-openforwrite print out files opened for write
-blocks print out block report
-locations print out locations for every block
-racks print out network topology for data-node locations
By default fsck ignores files opened for write, use
-openforwrite to report such files. They are usually tagged CORRUPT or
HEALTHY depending on their block allocation status.
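
As a hedged illustration of how these options can be combined (the path is only an example), the following command checks a single user directory, prints every file with its blocks and replica locations, and also reports files that are currently opened for write:

hadoop fsck /user/hduser -files -blocks -locations -openforwrite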

 


We can get the usage of the dfsadmin command using:

hadoop dfsadmin


The output will be similar to the following:

Usage: java DFSAdmin
[-report]
[-safemode enter | leave | get | wait]
[-saveNamespace]
[-refreshNodes]
[-finalizeUpgrade]
[-upgradeProgress status | details | force]
[-metasave filename]
[-refreshServiceAcl]
[-refreshUserToGroupsMappings]
[-refreshSuperUserGroupsConfiguration]
[-setQuota <quota> <dirname>...<dirname>]
[-clrQuota <dirname>...<dirname>]
[-setSpaceQuota <quota> <dirname>...<dirname>]
[-clrSpaceQuota <dirname>...<dirname>]
[-setBalancerBandwidth <bandwidth in bytes per second>]
[-help [cmd]]
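
Some of these options go beyond what this recipe covers. For example, the quota-related options limit the number of names and the amount of raw space (including replicas) that a directory may consume. The following is a hypothetical sketch; the directory and the limits are placeholders:

# Limit /user/hduser to at most 1,000,000 file and directory names.
hadoop dfsadmin -setQuota 1000000 /user/hduser

# Limit the raw space consumed under /user/hduser to 100 GB (given in bytes).
hadoop dfsadmin -setSpaceQuota 107374182400 /user/hduser

# Remove both quotas again.
hadoop dfsadmin -clrQuota /user/hduser
hadoop dfsadmin -clrSpaceQuota /user/hduser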


There’s more…

Besides using the command line, we can use the web UI to check the status of an HDFS cluster. For example, we can get the status information of HDFS by opening the URL http://master:50070/dfshealth.jsp.

We will get a web page that shows the summary of the HDFS cluster such as the configured capacity and remaining space. For example, the web page will be similar to the following screenshot:

By clicking on the Live Nodes link, we can check the status of each DataNode. We will get a web page similar to the following screenshot:

By clicking on the link of each node, we can browse the directory of the HDFS filesystem. The web page will be similar to the following screenshot:

The web page shows that file /user/hduser/randtext has been split into five partitions. We can browse the content of each partition by clicking on the part-0000x link.
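
The same information is available from the command line. As a small sketch (the path is the one used earlier in this recipe), listing the output directory shows the individual partition files together with their replication factors and sizes:

hadoop fs -ls /user/hduser/randtext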

Configuring SecondaryNameNode

The Hadoop NameNode is a single point of failure. By configuring a SecondaryNameNode, the filesystem image and edit log files can be backed up periodically, and in case of NameNode failure, the backup files can be used to recover the NameNode. In this recipe, we will outline the steps to configure SecondaryNameNode.

Getting ready

We assume that Hadoop has been configured correctly.

Log in to the master node from the cluster administration machine using the following command:

ssh hduser@master


How to do it…

Perform the following steps to configure SecondaryNameNode:

  1. Stop the cluster using the following command:

    stop-all.sh

    
    
  2. Add or change the following property in the file $HADOOP_HOME/conf/hdfs-site.xml:

    <property>
      <name>fs.checkpoint.dir</name>
      <value>/hadoop/dfs/namesecondary</value>
    </property>

    
    

    If this property is not set explicitly, the default checkpoint directory will be ${hadoop.tmp.dir}/dfs/namesecondary. (Two related checkpoint settings that control how often checkpoints are taken are sketched after these steps.)

  3. Start the cluster using the following command:

    start-all.sh

    
    

    The tree structure of the NameNode data directory will be similar to the following:

    ${dfs.name.dir}/
    ├── current
    │ ├── edits
    │ ├── fsimage
    │ ├── fstime
    │ └── VERSION
    ├── image
    │ └── fsimage
    ├── in_use.lock
    └── previous.checkpoint
    ├── edits
    ├── fsimage
    ├── fstime
    └── VERSION

    
    

    And the tree structure of the SecondaryNameNode data directory will be similar to the following:

    ${fs.checkpoint.dir}/
    ├── current
    │ ├── edits
    │ ├── fsimage
    │ ├── fstime
    │ └── VERSION
    ├── image
    │ └── fsimage
    └── in_use.lock
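
How often the SecondaryNameNode takes a checkpoint can also be tuned. Assuming the Hadoop 1.x property names fs.checkpoint.period (the number of seconds between two checkpoints, 3600 by default) and fs.checkpoint.size (the edit log size in bytes that triggers an early checkpoint, 67108864 by default), a sketch of the relevant configuration in the same file as in step 2 looks as follows:

<!-- Take a checkpoint every hour, or earlier if the edit log grows
     beyond 64 MB. Both values shown here are the usual defaults. -->
<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value>
</property>
<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value>
</property>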

    
    

There’s more…

To increase redundancy, we can configure the NameNode to write the filesystem metadata to multiple locations. For example, we can add an NFS shared directory as a backup location by changing the following property in the file $HADOOP_HOME/conf/hdfs-site.xml:

<property>
  <name>dfs.name.dir</name>
  <value>/hadoop/dfs/name,/nfs/name</value>
</property>
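
Also note that, in a typical Hadoop 1.x setup, the start-dfs.sh script decides where to launch the SecondaryNameNode based on the $HADOOP_HOME/conf/masters file, so the daemon can be kept on the master node or moved to a dedicated host simply by listing that host there. A minimal sketch of the file, assuming we keep the SecondaryNameNode on the master node used throughout this article, is:

# Contents of $HADOOP_HOME/conf/masters (comment lines are ignored)
master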


Managing the MapReduce cluster

A typical MapReduce cluster is composed of one master node that runs the JobTracker and a number of slave nodes that run TaskTrackers. Managing a MapReduce cluster includes maintaining its health as well as managing the membership of the TaskTrackers with the JobTracker. In this recipe, we will outline commands to manage a MapReduce cluster.

Getting ready

We assume that the Hadoop cluster has been properly configured and is running.

Log in to the master node from the cluster administration machine using the following command:

ssh hduser@master


How to do it…

Perform the following steps to manage a MapReduce cluster:

  1. List all the active TaskTrackers using the following command:

    hadoop job -list-active-trackers

    
    

    This command can help us check the registration status of the TaskTrackers in the cluster.

  2. Check the status of the JobTracker safe mode using the following command:

    hadoop mradmin -safemode get

    
    

    We will get the following output:

    Safe mode is OFF

    
    

    The output tells us that the JobTracker is not in safe mode. We can submit jobs to the cluster. If the JobTracker is in safe mode, no jobs can be submitted to the cluster.

  3. Manually let the JobTracker enter safe mode using the following command:

    hadoop mradmin -safemode enter

    
    

    This command is handy when we want to maintain the cluster.

  4. Let the JobTracker leave safe mode using the following command:

    hadoop mradmin -safemode leave

    
    

    When maintenance tasks are done, you need to run this command.

  5. If we want to wait for safe mode to exit, the following command can be used:

    hadoop mradmin -safemode wait

    
    
  6. Reload the MapReduce queue configuration using the following command:

    hadoop mradmin -refreshQueues

    
    
  7. Reload active TaskTrackers using the following command:

    hadoop mradmin -refreshNodes
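
This option is most useful when TaskTrackers are being decommissioned. Assuming the Hadoop 1.x property mapred.hosts.exclude in $HADOOP_HOME/conf/mapred-site.xml already points to an exclude file, a minimal sketch looks as follows; the hostname and the exclude file path are placeholders:

# Add the TaskTracker host to be decommissioned to the exclude file.
echo "slave5" >> $HADOOP_HOME/conf/mapred-exclude

# Ask the JobTracker to re-read its hosts files; the excluded
# TaskTracker will no longer receive tasks.
hadoop mradmin -refreshNodes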

    
    

How it works…

Get the usage of the mradmin command using the following:

hadoop mradmin


The usage information will be similar to the following:

Usage: java MRAdmin
[-refreshServiceAcl]
[-refreshQueues]
[-refreshUserToGroupsMappings]
[-refreshSuperUserGroupsConfiguration]
[-refreshNodes]
[-safemode <enter | leave | get | wait>]
[-help [cmd]]


The meanings of the command options are listed in the following table:

Option                                  Description
-refreshServiceAcl                      Force the JobTracker to reload the service ACLs.
-refreshQueues                          Force the JobTracker to reload the queue configuration.
-refreshUserToGroupsMappings            Force the JobTracker to reload the user-to-groups mappings.
-refreshSuperUserGroupsConfiguration    Force the JobTracker to reload the superuser group mappings.
-refreshNodes                           Force the JobTracker to re-read its hosts files.
-help [cmd]                             Show the help information for the given command, or for all commands.

Summary

In this article, we learned how to manage the HDFS cluster, how to configure SecondaryNameNode, and how to manage the MapReduce cluster. A Hadoop cluster administrator is responsible for both the HDFS cluster and the MapReduce cluster and must know how to maintain their health and availability. More specifically, this means managing the NameNodes and DataNodes for HDFS, and the JobTracker and TaskTrackers for MapReduce, all of which is covered in this article.
