Author: Scott Fair
Sharing data with internal and external customers and business partners can be a complicated, brittle, and time-consuming process. Moving data between systems requires extracting, staging, securing, and granting access to the data before the partner can consume it. The data producer and consumer must agree on the format, and if one column change has not been documented, the process breaks. This increases data preparation time, increases costs, and impacts business decision-making.
Snowflake excels in its capacity to share data quickly and easily between teams and organizations. Let’s look at an example of how Snowflake’s data sharing capabilities can improve time-to-insight for a small, independent record label. This example is taken from a real-life project; only the names have been changed for privacy reasons.
Pain Points
Record labels are busy places. Releases need to be distributed, sales must be tracked, and band revenues must be paid. In today’s digital world, all these activities are 1’s and 0’s flying across the internet. And that means data. Lots and lots of data is created every day. This data can be used to identify new marketing opportunities, improve fan outreach, and strengthen the relationship between the label and the artist.
The biggest pain points were identified as:
· Extracting data from the distribution channels into spreadsheets for the artists is a manual, labor-intensive process.
· This process is only done once per year due to the level of effort required.
· Sending spreadsheets via email can be difficult if the document is too big for the email system and the email gets rejected.
· Compressing the files doesn’t always help because the receiver has to have the right software to decompress.
Snowflake has a better way.
Solution: Sharing data with Reader Accounts
Snowflake has several ways to share data:
· Data can be directly shared from one account to another account
· Data can be shared to the Snowflake Data Marketplace
· Data can be shared in a Private Data Exchange
But what if the external partner, in this case, a band or solo artist, doesn’t have a Snowflake account? Snowflake has that situation covered with Reader Accounts.
A Reader Account, in our example, is a read-only user in the record label’s Snowflake account. The account can select data from tables that have been shared with the Reader Account. And with Snowflake’s Row-Based Access Control (RBAC) and Secure Views, each Reader Account can only see the data that they are allowed to see.
(Source: Introduction to Secure Data Sharing — Snowflake Documentation)
Design:
Let’s look at how all this is put together:
1. Sales, Spins, and Social Media (S3M) data is received by the record label (the frequency was increased to weekly) and is stored in their respective tables in the S3M schema. The tables have an Artist Name on each record.
2. A schema is created to hold the secure views of the S3M data (S3M_Shared).
3. An Access Control table is created in the S3M_Shared schema.
4. Each artist is given a Reader Account.
5. Each Reader Account is entered into an Access Control table that equates their Reader Account ID with the Artist Name that is on each record.
6. Secure Views are created in the S3M_Shared schema that joins the S3M table data to the Access Control table on the artist’s name.
7. The Reader Account value is collected when the Reader Account user logs into Snowflake. The join assures that the Reader Account user can only see the data associated with their Artist Name and Reader Account.
In action
Artists can access the data at any time. No one from the label is required to extract and format a subset of the “official” spreadsheet.
Reader Accounts use different server resources than data loading, so there is never contention for resources.
Onboarding a new artist into the new system is easy. A python script was written to create the reader account, create the reader account’s compute resource, add the account to the access control table, and grant the reader account access to the shared data.
Best of all, there is no data movement. The Reader Account accesses live data views, so updates to the base tables are instantly available.
Results
Setting up data sharing with the artists improved the record label’s operations along with several key performance indicators:
1. Allowed the artists to access their coordinated sales, spins, and social media data
2. Reduced the costs and time required to share data with the artists
3. Facilitated sharing data with marketing analysts
4. Identified new opportunities, for example:
a. One artist found that they were popular in Scandinavia, so marketing was increased to that geography
b. Artists that were popular in the same area could coordinate tours, reducing performance costs by sharing the venue.
c. Identified the songs that had the most plays, allowing the artists to tailor their setlist to the most popular songs and provide an excellent fan experience.
d. Identified the most effective social media groups for advertising
Summary
In this example, we’ve seen how Snowflake’s powerful and easy-to-use Data Sharing capabilities can be used with Reader Accounts to enable business partners to access approved company data in a secure way. This increases the business value of data, decreases time to insight, and strengthens business relationships.