Interview Questions
HDFS – Hadoop Distributed File System
The Hadoop Distributed File System (HDFS) is the Apache framework designed to store and process large data sets on clusters of commodity hardware. The file system is modeled on GFS (the Google File System) and is known for its fault tolerance and reliability. HDFS lets users interact with the file system through shell-like commands. Like MapReduce, HDFS follows a master/slave architecture: the master NameNode manages the file system's metadata, while the slave DataNodes store the actual data. Each file stored in HDFS is split into blocks that are replicated across a set of DataNodes. The DataNodes not only store and retrieve the actual data but also perform operations such as block creation, replication, and deletion as instructed by the NameNode.
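To make the client/NameNode/DataNode interaction concrete, here is a minimal Java sketch using Hadoop's FileSystem API. The NameNode address and the file path are placeholders for illustration; in practice the Configuration object usually picks these up from core-site.xml on the classpath.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical NameNode address; normally set in core-site.xml.
        conf.set("fs.defaultFS", "hdfs://namenode-host:8020");

        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/demo/sample.txt"); // hypothetical path

        // Write: the client asks the NameNode which DataNodes should hold
        // the blocks, then streams the bytes to those DataNodes directly.
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
        }

        // Read: the NameNode returns the block locations; the data itself
        // is served by the DataNodes holding the replicas.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
            System.out.println(in.readLine());
        }

        fs.close();
    }
}
```

The same operations are available from the command line via shell-like commands such as `hdfs dfs -put` and `hdfs dfs -cat`.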
Compatibility with other File Systems
Although HDFS is the main file system used by Hadoop, Hadoop's file system abstraction also supports working with other file systems such as the Local FS, HFTP FS, and S3 FS.
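This compatibility comes from the fact that Hadoop's FileSystem class resolves a concrete implementation from the URI scheme. A brief sketch, with placeholder URIs and hosts, assuming the relevant connectors (e.g. hadoop-aws for S3) are on the classpath:

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListAnyFs {
    // Lists a directory on whichever file system the URI scheme names.
    static void list(String uri, Configuration conf) throws Exception {
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        for (FileStatus status : fs.listStatus(new Path(uri))) {
            System.out.println(status.getPath());
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        list("file:///tmp", conf);                    // Local FS
        list("hdfs://namenode-host:8020/user", conf); // HDFS (hypothetical host)
        // list("s3a://my-bucket/", conf);            // S3, given credentials
    }
}
```

Because application code targets the abstract FileSystem interface rather than HDFS directly, the same program can run against any supported backend by changing only the URI.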