Locating Common Balance Problems
Problem 1: Lack of Permission to Execute the Balance Task (Access denied)
Problem details: After the start-balancer.sh command is executed, the "hadoop-root-balancer-Hostname.out" log displays "Access denied for user test1. Superuser privilege is required."
cat /opt/client/HDFS/hadoop/logs/hadoop-root-balancer-host2.out
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
INFO: Watching file:/opt/client/HDFS/hadoop/etc/hadoop/log4j.properties for changes with interval : 60000
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Access denied for user test1. Superuser privilege is required
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSuperuserPrivilege(FSPermissionChecker.java:122)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkSuperuserPrivilege(FSNamesystem.java:5916)
Cause analysis:
The balance task requires HDFS superuser privileges. In this example, user test1 is not a superuser, so the NameNode rejects the request.
Solution:
- Secure version
Authenticate as user hdfs or as a user in the supergroup group, and then execute the balance task.
- Normal version
Run the su - hdfs command on the client before running the balance command on HDFS.
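The check for this problem can be sketched as a small shell helper. This is a minimal sketch, not part of the product: the function name balancer_access_denied is hypothetical, and it only assumes that the balancer .out log contains the error text quoted in the problem details.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with HDFS): returns 0 if the given
# balancer log contains the superuser-privilege error, non-zero otherwise.
balancer_access_denied() {
    log_file="$1"
    grep -q "Superuser privilege is required" "$log_file"
}

# Example use, with the log path taken from the problem details:
#   if balancer_access_denied /opt/client/HDFS/hadoop/logs/hadoop-root-balancer-host2.out; then
#       echo "Rerun the balance task as hdfs or as a supergroup member."
#   fi
```

If the helper reports the error, switch to a superuser as described in the solution above (authenticate as hdfs in the secure version, or run su - hdfs in the normal version) and rerun the balance command.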
Problem 2: Failed to Execute the Balance Task, and /system/balancer.id Reports an Exception
Problem details:
A user starts a balance process on the HDFS client. After the process stops unexpectedly, the user performs the balance operation again. The operation fails with the following error:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.RecoveryInProgressException): Failed to APPEND_FILE /system/balancer.id for DFSClient because lease recovery is in progress. Try again later.
Cause analysis:
Normally, when a balance operation completes in HDFS, the /system/balancer.id file is released automatically and the balance operation can be performed again.
In the preceding scenario, the first balance operation stopped abnormally, so the /system/balancer.id file still exists and the lease held by the first client has not been released. When the balance operation is performed a second time, it attempts to append to /system/balancer.id. The append is rejected while lease recovery is in progress, and the balance operation fails.
Solution:
Method 1: Wait until the hard lease limit (one hour) has elapsed so that the lease held by the original client is released, and then perform the balance operation again.
Method 2: Delete the /system/balancer.id file from HDFS and perform the balance operation again.
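Method 2 can be sketched in shell as follows. This is a sketch under stated assumptions: the function names are hypothetical, the hdfs command is assumed to be on the PATH of an authorized user, and no other balance process is assumed to be running when the file is deleted.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the given log shows the lease-recovery
# failure on /system/balancer.id described in the problem details.
balancer_id_lease_stuck() {
    grep "RecoveryInProgressException" "$1" | grep -q "/system/balancer.id"
}

# Hypothetical helper for Method 2: delete the stale file if it exists.
# Assumption: no other balance process is currently running.
remove_stale_balancer_id() {
    # "hdfs dfs -test -e" returns 0 when the path exists.
    if hdfs dfs -test -e /system/balancer.id; then
        hdfs dfs -rm /system/balancer.id
    fi
}

# Example use (the log path is a placeholder):
#   balancer_id_lease_stuck /path/to/balancer.out && remove_stale_balancer_id
```

After the file is removed, run the balance command again. Checking the log first avoids deleting /system/balancer.id when the failure has a different cause.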