How to change the replication factor of certain files in HDFS?

HDFS has a global setting in hdfs-site.xml, the “dfs.replication” property, which controls the default replication factor of blocks.
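For reference, that property looks like this in hdfs-site.xml (3 is the usual default; your cluster may be configured differently):

<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>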

However, some “hot” files are accessed by many nodes. How can we increase the replication factor of just these files in HDFS?

You can set the replication factor of a certain file to 10:

hdfs dfs -setrep -w 10 /path/to/file

You can also set the replication factor of all files under a directory. setrep recurses into a directory automatically, so the -R flag below is accepted only for backward compatibility (see the documentation excerpt further down):

hdfs dfs -setrep -R -w 10 /path/to/dir/
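To verify the change, you can print a file's replication factor with stat (the %r format is supported on recent Hadoop versions), or inspect the per-block replica counts with fsck:

hdfs dfs -stat %r /path/to/file

hdfs fsck /path/to/file -files -blocks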

For reference, here is the setrep command from the Hadoop FileSystem Shell documentation:

setrep

Usage: hdfs dfs -setrep [-R] [-w] <numReplicas> <path>

Changes the replication factor of a file. If path is a directory then the command recursively changes the replication factor of all files under the directory tree rooted at path.

Options:

The -w flag requests that the command wait for the replication to complete. This can potentially take a very long time.
The -R flag is accepted for backwards compatibility. It has no effect.

Example:

hdfs dfs -setrep -w 3 /user/hadoop/dir1

Exit Code:

Returns 0 on success and -1 on error.
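If you would rather change the replication factor from application code, the Hadoop Java API exposes the same operation through FileSystem.setReplication(). Below is a minimal sketch; the class name and the file path are placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SetReplication {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml/hdfs-site.xml from the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Ask the NameNode to keep 10 replicas of this "hot" file.
        // Returns true if the new replication factor was accepted.
        boolean ok = fs.setReplication(new Path("/path/to/file"), (short) 10);
        System.out.println("setReplication accepted: " + ok);

        fs.close();
    }
}

Like setrep without -w, setReplication() returns as soon as the NameNode records the new factor; the extra block copies are created in the background.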
