What is Apache Beam?

Apache Beam is an open-source, unified programming model for defining and executing data processing pipelines, including ETL, batch, and stream processing. Because one model covers both batch and streaming data-parallel pipelines, the same pipeline logic can serve either kind of workload, which simplifies large-scale data processing. Using one of the provided SDKs, you write a program that defines the pipeline, and one of the supported runners then executes it. Supported runners include Apache Flink, Apache Samza, Apache Spark, and Google Cloud Dataflow.

Apache Beam is particularly useful for embarrassingly parallel data processing tasks, in which the problem can be decomposed into many smaller bundles of data that can be processed independently and in parallel. It can also be used for Extract, Transform, and Load (ETL) tasks and pure data integration, such as moving data between different storage media and data sources, transforming data into a more desirable format, or loading data onto a new system.
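The idea of decomposing a problem into independently processed bundles can be sketched in plain Python, without Beam. This only illustrates the concept and is not how Beam or its runners form or schedule bundles. The function names, the bundle size, and the use of a thread pool are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_bundles(records, bundle_size):
    """Cut a list of records into consecutive bundles of at most bundle_size."""
    return [
        records[i:i + bundle_size]
        for i in range(0, len(records), bundle_size)
    ]


def process_bundle(bundle):
    """Transform one bundle; it needs no data from any other bundle."""
    return [record.strip().upper() for record in bundle]


def process_in_parallel(records, bundle_size=2, max_workers=4):
    """Process every bundle independently, then merge the results in order."""
    bundles = split_into_bundles(records, bundle_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input bundles.
        processed = list(pool.map(process_bundle, bundles))
    return [record for bundle in processed for record in bundle]


if __name__ == "__main__":
    print(process_in_parallel([" flink", "spark ", " samza ", "dataflow"]))
```

Because `process_bundle` depends only on its own input, the bundles can run in any order or on any worker, which is what makes the task embarrassingly parallel.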

Apache Beam is an Apache Software Foundation project, available under the Apache License, Version 2.0. It offers abstractions that insulate users from low-level details of distributed data processing, such as coordinating individual workers, reading from sources, and writing to sinks. Pipelines are expressed with generic transforms, which keeps them understandable and maintainable and makes it easier to onboard new team members. Because Beam hides these low-level details and offers SDKs in several programming languages, it is comparatively easy to adopt.

Apache Beam is extensible: projects such as TensorFlow Extended and Apache Hop are built on top of it. It is a top-level project at the Apache Software Foundation, and contributors from around the world work on Apache Beam's development, advancing distributed data processing.
