What is Snowpark?


Snowpark is a developer framework designed to make building complex data pipelines much easier and to let developers work with data in Snowflake directly, without moving it elsewhere. It is a set of libraries and runtimes in Snowflake that securely deploy and process non-SQL code, including Python, Java, and Scala. Snowpark provides a client API for Python, Scala, and Java that developers use to perform transformations, and, like Spark, it evaluates lazily: operations are recorded as they are written and run only when a result is requested. Some key features of Snowpark include:

  • DataFrame API: Snowpark consists of libraries, including the DataFrame API, which allows developers to write queries and data transformations using familiar DataFrames. Processing is pushed down to Snowflake, so it benefits from the performance and scale of the Snowflake elastic processing engine.

  • Machine Learning APIs: Snowpark also includes native Snowpark machine learning (ML) APIs for model development (public preview) and deployment (private preview). Snowpark Model Registry (private preview) provides a unified repository for an organization's ML models to streamline and scale MLOps.

  • User-Defined Functions (UDFs): Snowpark allows developers to execute custom Python, Java, and Scala code in Snowflake, including business logic or trained machine learning models. The embedded Anaconda repository gives that code access to open-source libraries.

  • Stored Procedures: Snowpark enables developers to operationalize and orchestrate their DataFrame operations and custom code to run on a desired schedule and at scale.

  • Snowpark Container Services: Developers can register, deploy, and run container images (private preview) in Snowflake-managed infrastructure.
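
The lazy, pushed-down style of the DataFrame API can be illustrated with a small toy model. This is only a sketch and not the Snowpark API: `LazyFrame`, the `orders` table, and its columns are hypothetical names, and Python's built-in `sqlite3` stands in for the Snowflake engine. The point it tries to show is that chained transformations only build up a query, and nothing runs until an action such as `collect()` sends one SQL statement to the engine.

```python
import sqlite3


class LazyFrame:
    """Toy stand-in for a lazily evaluated DataFrame (NOT the Snowpark API)."""

    def __init__(self, conn, table, columns="*", predicates=()):
        self.conn = conn
        self.table = table
        self.columns = columns
        self.predicates = tuple(predicates)

    def filter(self, predicate):
        # Only record the condition; no query is sent to the engine yet.
        return LazyFrame(self.conn, self.table, self.columns,
                         self.predicates + (predicate,))

    def select(self, *columns):
        # Only record the projection; still nothing is executed.
        return LazyFrame(self.conn, self.table, ", ".join(columns),
                         self.predicates)

    def to_sql(self):
        # Compile the recorded operations into a single SQL statement.
        sql = f"SELECT {self.columns} FROM {self.table}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(f"({p})" for p in self.predicates)
        return sql

    def collect(self):
        # The action: the whole chain runs inside the engine as one query.
        return self.conn.execute(self.to_sql()).fetchall()


# sqlite3 is an assumption standing in for Snowflake; the data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "US", 80.0), (3, "EU", 40.0), (4, "US", 300.0)],
)

big_orders = LazyFrame(conn, "orders").filter("amount > 100").select("id", "region")
sql_text = big_orders.to_sql()  # "SELECT id, region FROM orders WHERE (amount > 100)"
rows = big_orders.collect()     # only now does the engine do any work
```

Because the filter and projection are compiled into the query, only the matching rows and columns ever leave the engine. The predicates here are raw SQL strings for brevity; a real DataFrame library builds them from column expressions instead.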

Snowpark lets developers build data pipelines in their preferred language, however complex those pipelines are. It brings deeply integrated, DataFrame-style programming to the languages developers prefer to use. Snowpark jobs are conceptually very similar to Spark jobs, and Snowflake can also connect to Spark through the Snowflake Connector for Spark. Snowpark accelerates workloads by up to 99 percent and lets developers use Snowflake’s computing power by shipping their code to the data, rather than exporting data to run in other environments where big data is a second-class citizen.
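
The idea of shipping code to the data, which is what a UDF does, can be sketched in the same spirit. This is an analogy under stated assumptions, not Snowpark code: `price_band` and the `orders` table are hypothetical, and `sqlite3` again stands in for Snowflake. Since SQLite is an in-process engine, the sketch only models the shape of the workflow (register a function, then call it from SQL); in Snowpark the registered code runs inside Snowflake.

```python
import sqlite3


def price_band(amount):
    """Hypothetical business rule: classify an order by its amount."""
    if amount is None:
        return None
    return "high" if amount >= 100 else "low"


# sqlite3 is an assumption standing in for Snowflake; the data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 120.0), (2, 80.0), (3, None)],
)

# Register the Python function with the engine so SQL can call it by name.
conn.create_function("price_band", 1, price_band)

# The query applies the custom logic row by row where the data lives,
# instead of exporting the table and looping over it elsewhere.
banded = conn.execute(
    "SELECT id, price_band(amount) FROM orders ORDER BY id"
).fetchall()
# banded == [(1, "high"), (2, "low"), (3, None)]
```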
