What is Avro?

Avro is an open-source data serialization framework developed within Apache's Hadoop project. It uses JSON to define data types and protocols, and serializes data in a compact binary format. Its primary use is in Apache Hadoop, where it provides both a serialization format for persistent data and a wire format for communication between Hadoop nodes and from client programs to the Hadoop services.
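To make the "JSON schema, compact binary data" idea concrete, here is a minimal sketch in plain Python. It is not the Avro library: the `User` schema, the helper names, and the sample record are all made up for illustration, and only two types (`string` and `long`) are handled. It follows the encoding rules from the Avro specification, where a `long` is written as a zig-zag variable-length integer, a `string` as a length followed by UTF-8 bytes, and a record as its fields concatenated in schema order.

```python
import json

# Hypothetical record schema, written in JSON the way Avro schemas are.
USER_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "long"}
  ]
}
""")


def encode_long(n: int) -> bytes:
    """Zig-zag encode a signed 64-bit integer, then write it as a varint."""
    z = (n << 1) ^ (n >> 63)  # zig-zag: small magnitudes become small unsigned values
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits with the continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Write the UTF-8 byte length as a long, followed by the bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


# Only the two types used by the illustrative schema are supported here.
ENCODERS = {"string": encode_string, "long": encode_long}


def encode_record(schema: dict, record: dict) -> bytes:
    """Concatenate the encoded fields in schema order; no field names are written."""
    return b"".join(
        ENCODERS[field["type"]](record[field["name"]]) for field in schema["fields"]
    )


user = {"name": "Ada", "age": 30}
avro_bytes = encode_record(USER_SCHEMA, user)      # b'\x06Ada<'
json_bytes = json.dumps(user).encode("utf-8")
print(len(avro_bytes), len(json_bytes))
```

Because field names live in the schema rather than in every record, the binary form is much shorter than the equivalent JSON text. The trade-off is that a reader needs the schema to make sense of the bytes.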

Avro uses a schema to structure the data that is being encoded. It supports schema evolution, so data written with one version of a schema can be read with another, and it has implementations for the JVM (Java, Kotlin, Scala, etc.), Python, C/C++/C#, PHP, Ruby, Rust, JavaScript, and Perl. Avro is used extensively in big data frameworks such as Apache Hadoop and Apache Flink, enabling efficient storage, processing, and data interchange in distributed systems.
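Schema evolution can be sketched roughly as follows. This is a simplified illustration in plain Python, not the Avro library: real implementations resolve the writer's and reader's schemas while decoding the binary data, whereas this sketch works on an already-decoded dictionary. The schemas, the `resolve` helper, and the sample record are hypothetical. The rules it applies come from the Avro specification: fields are matched by name, writer fields unknown to the reader are ignored, and reader fields missing from the writer take their default value.

```python
# Schema the data was originally written with (hypothetical).
WRITER_SCHEMA = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "long"},
    ],
}

# Newer schema used by the reader: "age" was dropped, "email" was added with a default.
READER_SCHEMA = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "email", "type": "string", "default": "unknown"},
    ],
}


def resolve(writer_schema: dict, reader_schema: dict, record: dict) -> dict:
    """Project a record written with writer_schema onto reader_schema."""
    writer_fields = {field["name"] for field in writer_schema["fields"]}
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            resolved[name] = record[name]        # present in both schemas
        elif "default" in field:
            resolved[name] = field["default"]    # new field, fall back to its default
        else:
            raise ValueError(f"no value or default for reader field {name!r}")
    return resolved


old_record = {"name": "Ada", "age": 30}
new_view = resolve(WRITER_SCHEMA, READER_SCHEMA, old_record)
print(new_view)
```

A reader field that is absent from the writer's schema and has no default is an error, which is why adding a default is the usual way to introduce a new field without breaking existing data.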

One of Avro's key features is that data is described by a schema, which determines the binary layout of the data and maps it to types in the programming language of choice. Avro's data model maps well to Hadoop data formats and Hive, as well as to other data systems. It is a popular row-based binary serialization format that can be seen as a binary alternative to JSON.
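As a rough illustration of mapping a schema to language-level types, the sketch below builds a Python dataclass from a record schema. The `Reading` schema and the `record_class` helper are hypothetical, only flat records with primitive field types are handled, and the type table simply pairs each Avro primitive type with its closest built-in Python type. Avro libraries typically offer their own mapping or code generation, so treat this as a sketch of the idea rather than how any particular library does it.

```python
from dataclasses import asdict, make_dataclass

# Avro primitive type names paired with the closest built-in Python types.
AVRO_TO_PYTHON = {
    "null": type(None),
    "boolean": bool,
    "int": int,
    "long": int,
    "float": float,
    "double": float,
    "bytes": bytes,
    "string": str,
}

# Hypothetical record schema used only for this example.
READING_SCHEMA = {
    "type": "record",
    "name": "Reading",
    "fields": [
        {"name": "sensor_id", "type": "string"},
        {"name": "temperature", "type": "double"},
    ],
}


def record_class(schema: dict) -> type:
    """Create a dataclass whose fields mirror a flat Avro record schema."""
    fields = [(f["name"], AVRO_TO_PYTHON[f["type"]]) for f in schema["fields"]]
    return make_dataclass(schema["name"], fields)


Reading = record_class(READING_SCHEMA)
reading = Reading(sensor_id="s-1", temperature=21.5)
print(asdict(reading))
```

Note that dataclass annotations are not checked at runtime, so this gives a typed shape for the record but does not validate values against the schema.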
