what is yarn in hadoop

what is yarn in hadoop

1 year ago 39
Nature

YARN stands for Yet Another Resource Negotiator and is a resource management and job scheduling technology in the open-source Hadoop distributed processing framework. It was introduced in Hadoop 2.0 to remove the bottleneck on Job Tracker, which was present in Hadoop 1.0. The fundamental idea of YARN is to split up the functionalities of resource management and job scheduling/monitoring into separate daemons. YARN allows administrators to allocate and monitor the resources required by each application in a cluster, such as CPU, memory, and disk space.

YARN has several key components, including:

  • ResourceManager (RM): A global resource manager that manages the allocation of resources to applications.
  • NodeManager (NM): A per-machine agent that is responsible for containers, monitoring their resource usage (CPU, memory, disk, network) and reporting the same to the ResourceManager.
  • ApplicationMaster (AM): A per-application framework-specific entity that negotiates resources from the ResourceManager and works with the NodeManager(s) to execute and monitor the tasks.
  • Container: A generic term in YARN used to describe a collection of resources (CPU, memory, disk, network) allocated to an application.

YARN has opened up new uses for Hadoop, allowing it to process and run data for batch processing, stream processing, interactive processing, graph processing, and many more. It supports various data processing engines such as Apache Spark, Apache Flink, Apache Storm, and others. YARN also allows different data processing engines to run and process data stored in HDFS, making the system much more efficient.

In summary, YARN is a resource management and job scheduling technology in the open-source Hadoop distributed processing framework. It allows administrators to allocate and monitor the resources required by each application in a cluster, and it supports various data processing engines such as Apache Spark, Apache Flink, Apache Storm, and others. YARN has opened up new uses for Hadoop, allowing it to process and run data for batch processing, stream processing, interactive processing, graph processing, and many more.

Read Entire Article