HDFS (Hadoop Distributed File System) is the primary storage component of the Hadoop ecosystem. It stores large data sets of structured or unstructured data across multiple nodes and maintains the associated metadata in the form of log files. Hadoop relies on distributed storage and parallel processing, and HDFS provides the distributed storage layer. To use the HDFS commands listed here, the Hadoop services must be started first.

1. version

hadoop version

2. mkdir

hdfs dfs -mkdir /path/directory_name

3. ls

hdfs dfs -ls /path
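
As a minimal sketch of how mkdir and ls combine, the helper below creates a directory and then lists it. The function name `make_and_list` and the `HDFS_BIN` variable are hypothetical, introduced only so that the client binary can be swapped out. The `-p` option of `-mkdir` creates any missing parent directories.

```shell
# HDFS_BIN is an assumed override hook; it defaults to the real hdfs client.
HDFS_BIN="${HDFS_BIN:-hdfs}"

# make_and_list DIR
# Creates DIR (and any missing parents), then lists its contents.
make_and_list() {
  "$HDFS_BIN" dfs -mkdir -p "$1" || return 1
  "$HDFS_BIN" dfs -ls "$1"
}
```

With a running cluster this could be called as `make_and_list /user/demo/input`, where the path is only an example.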

4. put

hdfs dfs -put <localsrc> <dest>

5. copyFromLocal

hdfs dfs -copyFromLocal <localsrc> <hdfs destination>
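
By default, put refuses to overwrite a destination that already exists. One way to handle that, sketched here under assumptions, is to check the destination first with `-test -e`, which returns exit status 0 when the path exists. The names `upload_if_absent` and `HDFS_BIN` are hypothetical and not part of Hadoop.

```shell
# HDFS_BIN is an assumed override hook; it defaults to the real hdfs client.
HDFS_BIN="${HDFS_BIN:-hdfs}"

# upload_if_absent LOCAL_SRC HDFS_DEST
# Uploads LOCAL_SRC only when HDFS_DEST does not exist yet.
upload_if_absent() {
  if "$HDFS_BIN" dfs -test -e "$2"; then
    echo "skip: $2 already exists"
  else
    "$HDFS_BIN" dfs -put "$1" "$2"
  fi
}
```

If overwriting is what you want instead, `-put -f` replaces an existing destination.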

6. get

hdfs dfs -get <src> <localdest>

7. copyToLocal

hdfs dfs -copyToLocal <hdfs source> <localdst>

8. cat

hdfs dfs -cat /path_to_file_in_hdfs

9. mv

hdfs dfs -mv <src> <dest>

10. cp

hdfs dfs -cp <src> <dest>

11. du

hdfs dfs -du -s /directory/filename

12. text

hdfs dfs -text /directory/filename

13. count

hdfs dfs -count <path>
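
The plain `-count` output has four columns: DIR_COUNT, FILE_COUNT, CONTENT_SIZE and PATHNAME. The sketch below labels those columns to make the output easier to read. The function name `summarize_count` is hypothetical, and the parsing assumes that path names contain no spaces.

```shell
# summarize_count
# Reads `hdfs dfs -count` output on stdin and labels each column.
# Assumes the default four-column layout and paths without whitespace.
summarize_count() {
  awk '{ printf "%s: %s dirs, %s files, %s bytes\n", $4, $1, $2, $3 }'
}

# Intended use (requires a running cluster):
#   hdfs dfs -count /some/path | summarize_count
```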

14. setrep

hdfs dfs -setrep -R 4 /geeks
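
Because setrep can trigger a lot of block copying, it may help to validate the replication factor before running it. This is a rough sketch: `set_replication` and `HDFS_BIN` are hypothetical names, while `-w` is the standard setrep option that waits until replication has completed (which can take a long time on large paths).

```shell
# HDFS_BIN is an assumed override hook; it defaults to the real hdfs client.
HDFS_BIN="${HDFS_BIN:-hdfs}"

# set_replication FACTOR PATH
# Rejects an empty, non-numeric, or zero FACTOR, then applies it and waits.
set_replication() {
  case "$1" in
    ''|*[!0-9]*|0)
      echo "replication factor must be a positive integer" >&2
      return 2
      ;;
  esac
  "$HDFS_BIN" dfs -setrep -w "$1" "$2"
}
```

For example, `set_replication 4 /geeks` would run the same change as the command above, with waiting enabled.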


16. jps

jps

17. chmod

hdfs dfs -chmod [-R] <mode> <path>

18. getmerge

hdfs dfs -getmerge <src> <localdest>
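
A common use of getmerge is to collect the part files of a job's output directory into one local file. The wrapper below is only a sketch: `merge_parts` and `HDFS_BIN` are hypothetical names, and `-nl` is the getmerge option that adds a newline at the end of each merged file.

```shell
# HDFS_BIN is an assumed override hook; it defaults to the real hdfs client.
HDFS_BIN="${HDFS_BIN:-hdfs}"

# merge_parts HDFS_DIR LOCAL_FILE
# Concatenates the files under HDFS_DIR into LOCAL_FILE on the local disk,
# adding a newline after each source file.
merge_parts() {
  "$HDFS_BIN" dfs -getmerge -nl "$1" "$2"
}
```

A call such as `merge_parts /user/demo/output result.txt` uses example paths only.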
