CustomFunctions

Functions and procedures for Snowflake. This is a simple repository containing the functions and procedures I have created to show the ease of management and the power of the Snowflake CLI.

Setup and Deployment Guide

1. Installing Snowflake CLI

Link

Cheat Sheet

To install the Snowflake CLI, follow the instructions provided in the official installation guide. This guide covers all the necessary steps to set up the CLI on your system.
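
For example, a minimal pip-based install might look like the following; the package name has changed across releases (it was originally published as snowflake-cli-labs), so defer to the installation guide for the current name on your platform:

# Install the Snowflake CLI via pip (check the install guide for the current package name)
pip install snowflake-cli-labs

# Verify the installation
snow --version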

2. Configuring the CLI

To use the Snowflake CLI, you need to create a connection configuration in the ~/.snowflake/config.toml file. This file allows you to specify connection details for various environments. Refer to the Snowflake CLI configuration documentation for a comprehensive guide.
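
A minimal sketch of what a connection entry can look like, assuming a connection named DATASCIENCE to match the commands used later in this guide (the role and warehouse shown are the ones created in step 3; substitute values that exist in your account):

# ~/.snowflake/config.toml
[connections.DATASCIENCE]
account = "<your_account_identifier>"
user = "<your_username>"
password = "<your_password>"
role = "DATA_SCIENTIST"
warehouse = "DS_WH_XS"
database = "DATASCIENCE"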

Using Snowpark in Snowflake CLI

3. Setting Up Your Database and Schema

You can use the Snowsight UI or the Snowflake CLI to execute the following SQL commands. While the code example uses the ACCOUNTADMIN role for simplicity, you should use a role with the appropriate permissions in your environment.

NOTE: The following SQL commands create a new role, grant the necessary permissions, and set up a warehouse and schema for the Snowpark project. You can adjust the code based on your requirements; you don't need to grant all of these permissions to the role, but this is a good starting point.

ALTER SESSION SET query_tag = '{"team":"Solutions","name":"JeremyDemlow", "version":0.1, "attributes":{"medium":"setup", "source":"DATASCIENCE", "purpose": "setup"}}';

USE ROLE ACCOUNTADMIN;
CREATE ROLE DATA_SCIENTIST;

USE ROLE SYSADMIN;
GRANT CREATE DATABASE ON ACCOUNT TO ROLE DATA_SCIENTIST;
GRANT CREATE WAREHOUSE ON ACCOUNT TO ROLE DATA_SCIENTIST;
GRANT CREATE COMPUTE POOL ON ACCOUNT TO ROLE DATA_SCIENTIST;

USE ROLE ACCOUNTADMIN;
GRANT CREATE INTEGRATION ON ACCOUNT TO ROLE DATA_SCIENTIST;
GRANT MONITOR USAGE ON ACCOUNT TO ROLE DATA_SCIENTIST;
GRANT EXECUTE MANAGED TASK ON ACCOUNT TO ROLE DATA_SCIENTIST;
GRANT BIND SERVICE ENDPOINT ON ACCOUNT TO ROLE DATA_SCIENTIST;
GRANT IMPORTED PRIVILEGES ON DATABASE snowflake TO ROLE DATA_SCIENTIST;
SET my_user_var = (SELECT  '"' || CURRENT_USER() || '"' );
GRANT ROLE data_scientist TO USER identifier($my_user_var);

USE ROLE DATA_SCIENTIST;

CREATE OR REPLACE WAREHOUSE DS_WH_XS
  WAREHOUSE_SIZE = XSMALL
  AUTO_SUSPEND = 120
  AUTO_RESUME = TRUE;

CREATE DATABASE DATASCIENCE; 
CREATE SCHEMA DATASCIENCE;
USE SCHEMA DATASCIENCE;
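
As a quick sanity check after running the setup, you can confirm that the role received the expected privileges:

SHOW GRANTS TO ROLE DATA_SCIENTIST;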

4. Initializing and Deploying the Snowpark Project

Creating a Boilerplate

To set up a boilerplate Snowpark project, use the following command:

snow init custom_functions --template example_snowpark

Note: This will create a boilerplate project for you. In this example, adjustments were made to move all the files from the default app/ directory to the repository root. You can modify this layout based on your preferences; it was a custom decision to match the way I typically develop.
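
For orientation, here is a rough sketch of the kind of snowflake.yml the template produces, adjusted for the move to the root. Treat it as illustrative only: field names and layout vary by CLI version, and the generated file is the source of truth.

definition_version: 1
snowpark:
  project_name: "custom_functions"
  stage_name: "dev_deployment"
  src: "."  # moved from the default "app/" to the repository root
  functions:
    - name: hello_function
      handler: "functions.hello_function"
      signature:
        - name: "name"
          type: "string"
      returns: string
  procedures:
    - name: hello_procedure
      handler: "procedures.hello_procedure"
      signature:
        - name: "name"
          type: "string"
      returns: string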

Adding Repo To Snowflake

snow connection set-default DATASCIENCE
snow connection test
snow git setup DATASCIENCE
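
snow git setup should walk you through the repository details interactively (origin URL and, for private repos, a secret and API integration). Once it completes, you can verify the repository object from SQL:

SHOW GIT REPOSITORIES;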

Building and Deploying the Project

To build and deploy the Snowpark project, execute:

# Build the project, deploy it (replacing any existing objects), then clean up local build artifacts
snow snowpark build
snow snowpark deploy --replace
rm DATASCIENCE.zip dependencies.zip dev.zip requirements.snowflake.txt

The deployment output will list the created objects, such as procedures and functions.
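
You can also confirm the objects yourself, assuming the project deploys into DATASCIENCE.CUSTOM_FUNCTIONS as the calls below expect:

snow sql -q "SHOW PROCEDURES IN SCHEMA DATASCIENCE.CUSTOM_FUNCTIONS;"
snow sql -q "SHOW USER FUNCTIONS IN SCHEMA DATASCIENCE.CUSTOM_FUNCTIONS;"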

5. Creating and Testing a Table

Creating a Fake Orders Table

Navigate to the dev/development.ipynb notebook and run the cells to create a fake orders table in the CUSTOM_FUNCTIONS schema. This setup is essential for testing the procedure deployed from this repo.

Note: Running the entire notebook will also perform an aggregation in the CUSTOM_FUNCTIONS schema. You can defer this step if you prefer to follow the next steps, but it also shows how you can develop locally and run everything before pushing it off to Snowflake.
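
If you prefer a script over the notebook, a minimal Snowpark sketch of the same idea is below. The column names are assumptions chosen to line up with the aggregation call later in this guide, and the connection_name option requires a reasonably recent snowflake-snowpark-python:

# Hypothetical sketch: create a tiny fake ORDERS table with Snowpark.
from snowflake.snowpark import Session

session = Session.builder.configs({"connection_name": "DATASCIENCE"}).create()
session.sql("CREATE SCHEMA IF NOT EXISTS DATASCIENCE.CUSTOM_FUNCTIONS").collect()

orders = session.create_dataframe(
    [
        ("O1", "Electronics", "WEST", 120.50, 120.50, "completed", "2023-03-14"),
        ("O2", "Clothing", "EAST", 75.00, 75.00, "shipped", "2023-06-02"),
    ],
    schema=["ORDER_ID", "PRODUCT_CATEGORY", "REGION", "SALES_AMOUNT",
            "ORDER_VALUE", "STATUS", "DATE"],
)
orders.write.save_as_table("DATASCIENCE.CUSTOM_FUNCTIONS.ORDERS", mode="overwrite")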

Testing Functions and Procedures

NOTE: pytest is included in the project, and you can run the tests with the command below. Coverage is not 100%, but it shows that testing fits into this flow as well. The deploy_snowpark.yaml workflow in .github/workflows runs the tests, and if they fail the deployment of the code should be skipped.

pytest -v
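
For a feel of what such a test looks like, here is a hypothetical example; the repo's actual tests and handler module names may differ, and this assumes the hello function echoes its input back in its greeting:

# Hypothetical test sketch; adjust the import to match the actual handler module.
from functions import hello_function

def test_hello_function_echoes_input():
    assert "sandwich" in hello_function("I would like a sandwich, please")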

To test the functions and procedures, use the following commands in Snowflake CLI or in Snowsight:

# Run Function
snow sql -q "SELECT DATASCIENCE.CUSTOM_FUNCTIONS.HELLO_FUNCTION('I would like a sandwich, please');"

# Run Procedures
snow sql -f ./customfunctions/files/perform_aggregation_config.sql
snow sql -f ./customfunctions/files/perform_aggregation_config.sql -D SOURCE_TABLE=DATASCIENCE.CUSTOM_FUNCTIONS.ORDERS -D TARGET_TABLE=DATASCIENCE.CUSTOM_FUNCTIONS.ORDERS_AGGREGATES

In Snowsight:

CREATE SCHEMA IF NOT EXISTS DATASCIENCE.UAT;
-- Using caller rights of the session, but using CUSTOM_FUNCTIONS data and putting the results in uat
CALL DATASCIENCE.CUSTOM_FUNCTIONS.PERFORM_AGGREGATION(
'{
    "request_id": "AGG_002",
    "source_table": "DATASCIENCE.CUSTOM_FUNCTIONS.orders",
    "target_table": "DATASCIENCE.uat.orders_aggregates",
    "group_by": ["PRODUCT_CATEGORY", "REGION"],
    "metrics": [
        {"name": "TOTAL_SALES", "function": "sum", "column": "SALES_AMOUNT"},
        {"name": "AVERAGE_ORDER_VALUE", "function": "avg", "column": "ORDER_VALUE"},
        {"name": "ORDER_COUNT", "function": "count", "column": "ORDER_ID"}
    ],
    "filters": [
        {"column": "DATE", "operator": "between", "value": ["2023-01-01", "2023-12-31"]},
        {"column": "STATUS", "operator": "in", "value": ["completed", "shipped"]}
    ],
    "version": "1.0.0"
}'
);
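
-- After the call completes, a quick sanity check on the target table:
SELECT * FROM DATASCIENCE.UAT.ORDERS_AGGREGATES;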

-- This shows how you can use a file in your procedures or functions.
CALL DATASCIENCE.CUSTOM_FUNCTIONS.HELLO_PROCEDURE('Developer we should expect our yaml to appear');

This guide provides a streamlined approach to setting up and deploying your Snowflake environment with Snowpark. Customize as necessary to fit your project’s requirements.

About

Managing UDFs, UDTFs, UDAFs, and other objects in this repository
