Author: Sachin Shrikant Gotal
In this blog, we are going to discuss how we can use the schemachange tool to manage our database objects and scripts and deploy them to Snowflake.
Tools we are going to use
- GitHub for version control
- GitHub Actions for creating the pipeline/trigger
- schemachange for database change management
What is schemachange?
schemachange is a simple Python-based tool for managing all your Snowflake database scripts and objects. We can combine schemachange with version control tools like GitHub and CI/CD tools like GitHub Actions to follow modern software delivery practices and enable DataOps.
Project Structure
schemachange expects a project folder structure as follows.
You need a root folder (with any name). Inside the root folder, you can have any number of subfolders.
You also need to follow naming conventions when creating script files. Here’s an example of a demo folder, which is the root folder, with scripts inside it.
|- demo
   |- V1.1__Release1.sql
   |- V1.2__Release2.sql
   |- V1.3__Release3.sql
Script names must follow a certain pattern.
Prefix – All scripts must start with the letter “V”, denoting versioned scripts.
Version number – A version number follows the prefix; in our example, 1.1, 1.2, 1.3, etc.
Separator – Two underscores “__” act as the separator after the version number.
Description – A description of the script (in our example, Release1, Release2, etc.)
Suffix – .sql
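For example, the next script added to this folder could be named V1.4__Release4.sql (a hypothetical name following the same pattern) and contain any standard SQL, such as:

-- Hypothetical contents of V1.4__Release4.sql; the object names are illustrative only.
CREATE TABLE IF NOT EXISTS CUSTOMER
(
ID NUMBER
,NAME VARCHAR
);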
Prerequisites
- We must already have a METADATA database created. This database will contain a schema with a change history table, in which information about every executed script is stored. schemachange uses this table to keep track of the scripts (a minimal setup sketch follows this list).
- Python 3 installed.
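If you want to create this metadata database and schema up front, a minimal sketch could look like the one below. The METADATA and SCHEMACHANGE names are only examples; also note that the workflow later in this post passes --create-change-history-table, so schemachange can create the change history table on its own.

-- Sketch only: database and schema names are examples, adjust to your setup.
CREATE DATABASE IF NOT EXISTS METADATA;
CREATE SCHEMA IF NOT EXISTS METADATA.SCHEMACHANGE;
-- The change history table can be created manually here, or left to schemachange
-- when it runs with --create-change-history-table.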
Now let’s create a repository on GitHub. I have created one with the name SchemachangeCICD and then cloned it locally.
Next, I created a folder named dbscripts. This will be the root folder for my project, where I’ll keep all my change scripts.
Let’s go ahead and create a workflow in GitHub Actions. To create one, go to your repo, click on Actions, and then New workflow / Set up a workflow yourself.
Let’s make changes to the YAML file. I have named it main.yaml.
We want the workflow to trigger as soon as someone pushes changes to the dbscripts folder on the main branch. Below is the YAML configuration for that.
name: CI
on:
  push:
    branches:
      - main
    paths:
      - 'dbscripts/**'
As soon as changes are committed and pushed to the main branch inside the dbscripts folder, we want schemachange to run and deploy our scripts to Snowflake.
Below is the YAML file configuration.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Use Python 3.8.x
        uses: actions/setup-python@v2.2.1
        with:
          python-version: 3.8.x
      - name: Run schemachange
        env:
          SF_ACCOUNT: ${{ secrets.SF_ACCOUNT }}
          SF_USERNAME: ${{ secrets.SF_USERNAME }}
          # schemachange reads the password from the SNOWFLAKE_PASSWORD environment variable
          SNOWFLAKE_PASSWORD: ${{ secrets.SF_PASSWORD }}
          SF_ROLE: ${{ secrets.SF_ROLE }}
          SF_WAREHOUSE: ${{ secrets.SF_WAREHOUSE }}
          SF_DATABASE: ${{ secrets.SF_DATABASE }}
        run: |
          echo "GITHUB WORKSPACE $GITHUB_WORKSPACE"
          python --version
          echo "Step 1: Installing schemachange"
          pip install schemachange
          echo "Step 2: Running schemachange"
          schemachange -f $GITHUB_WORKSPACE/dbscripts -a $SF_ACCOUNT -u $SF_USERNAME -r $SF_ROLE -w $SF_WAREHOUSE -d $SF_DATABASE -c $SF_DATABASE.SCHEMACHANGE.CHANGE_HISTORY --create-change-history-table
We need all of these secrets set up in GitHub, so let’s go ahead and configure them as GitHub secrets.
Configuring Secrets/Environment Variables
Go to the GitHub repository, click on Settings, and then click on Secrets.
Configure the secrets below. The names must match the ones referenced in the workflow: SF_ACCOUNT, SF_USERNAME, SF_PASSWORD, SF_ROLE, SF_WAREHOUSE, and SF_DATABASE.
Now that all of our secrets are configured and the workflow is in place, let’s create a script file, place it inside the dbscripts folder, commit the changes, and see if it works.
I have named my file V1.1__initial.sql and created a simple CREATE TABLE script.
CREATE SCHEMA DEMO;
CREATE TABLE FIRST_TABLE_CICD
(
COL1 VARCHAR
,COL2 VARCHAR
);
After I commit the changes and push them to the main branch, the workflow will trigger, and our scripts will be executed by schemachange inside Snowflake.
If you go to GitHub Actions and check, you will see that the build was successful. The job completed, and all the queries inside the script were executed on Snowflake.
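You can also verify this from the Snowflake side by querying the change history table. Here is a quick sketch, assuming the table configured with the -c flag in the workflow above and schemachange’s default column names; adjust the names to your environment.

-- Sketch only: assumes the change history table configured via the -c flag
-- (SCHEMACHANGE.CHANGE_HISTORY inside the SF_DATABASE database) and
-- schemachange's default columns.
SELECT VERSION, DESCRIPTION, SCRIPT, STATUS, INSTALLED_ON
FROM SCHEMACHANGE.CHANGE_HISTORY
ORDER BY INSTALLED_ON DESC;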
I hope this tutorial has given you an idea of how to use schemachange with GitHub.
Feel free to comment and ask questions. Thank you.