In the rapidly evolving landscape of modern big data management, one data management system that has grown in popularity is Snowflake. As the role of a data management consultant becomes increasingly important for business success, having access to the right tools to handle vast datasets, thus optimising performance times, has become paramount. Snowflake is a cloud-based data warehouse that provides a range of features for managing data, including support for roles and hierarchies. In this blog post, we’ll take a closer look at how Snowflake roles and hierarchies work and how they can help you manage access to your data warehouse.
As Enterprise data warehouses have grown in size and complexity, managing access to data has become increasingly important, especially in relation to Enterprise data security. In a typical data warehouse team, there are many different types of users, from data analysts, engineers, and data scientists to administrators, business users, etc. Each of these users requires access to different parts of the data warehouse, and each has a unique set of needs and requirements. To manage this complexity, Snowflake offers 2 types of Access Control Mechanisms:
- Discretionary Access Control (DAC): Each object has an owner who can, in turn, grant access to that object.
- Role-based Access Control (RBAC): Access privileges are assigned to roles, which are, in turn, assigned to users.
The following table highlights the significance of role-based access to Data Warehouses:
Discretionary Access Control (DAC)
In Snowflake, roles are used to control access to data. Each role defines a set of privileges that determine what a user can and cannot do within the data warehouse. For example, a role might allow a user to read data from a particular table and perform some limited data entry but remove or delete data.
Roles can be assigned to individual users or to groups of users. For example, you might create a role for the Testing Team that allows them to read data from all tables in the data warehouse but not to modify or delete any data.
Let’s delve into the following example:
Create 2 users, test_user_1 and dev_user_1, each belonging to the test and dev teams, respectively, using the below code:
# | Instruction |
---|---|
1 | -Use SECURITYADMIN ROLE to Create User |
2 | –Create a user from Testing Team |
3 | create user test_user_1 |
4 | password = ‘Testuser1234’ |
5 | comment = ‘This user belongs to TESTING Team’ |
6 | must_change_password = FALSE; |
7 | |
8 | –Create a user from DEV Team |
9 | create user dev_user_1 |
10 | password = ‘Devuser1234’ |
11 | comment = ‘This user belongs to DEV Team’ |
12 | must_change_password = FALSE; |
Granting the below roles to the Dev user:
# | Instruction |
---|---|
1 | grant role SYSADMIN to user dev_user_1; |
2 | grant role SYSADMIN to user dev_user_1; |
3 | grant role SECURITYADMIN to user dev_user_1; |
Using role SYSADMIN create a Database, Schema and Table:
# | Instruction |
---|---|
1 | –Use Role SYSADMIN |
2 | use role SYSADMIN; |
3 | –create a database |
4 | create database my_db; |
5 | –create a schema |
6 | create schema my_schema; |
7 | –create a table |
8 | create table my_table (c1 varchar(10)); |
If I log in as a Test User, I won’t be able to see the database, schema and table created in the above table as a Test user is not granted any role as opposed to a dev user.
Where as if I login as a Dev User, I can see the database and schemas created above.
Summary of Discretionary Access Control:
As per the above example:
Database, Schema and Table are created using the user dev user, which in turn is granted the SYSADMIN (sys admin) role. Thus, the owner of the object is SYSADMIN, who can grant access to any other user.
Dev_user can drop, alter, and insert into the table at its discretion; hence, it is called Discretionary Access Control.
Let’s see how Discretionary Access Control (DAC) will now work in conjunction with Role Based Access Control (RBAC)
In addition to roles, Snowflake also supports hierarchies. A hierarchy is a way of organising roles into a tree-like structure. Each role in the hierarchy inherits the privileges of its parent role and can also define its own privileges.
By combining roles and hierarchies, you can create a flexible system for managing access to your data warehouse. For example, you might create a hierarchy that includes roles for admin, data engineers and QA Teams. Each of these roles would have its own set of privileges but would also inherit privileges from higher-level roles in the hierarchy.
You could then assign individual users to the appropriate roles based on their job function and level of access.
For example, a Snowflake Administrator might be assigned to the “sf_admin” role, while a senior data engineer might be assigned to the “de_team” role. This will be explained in more detail below.
Create Roles as below:
# | Instruction |
---|---|
1 | Create role “sf_admin” comment = “This is Snowflake Administrator for Sales Database”; |
2 | Grant role “sf_admin” to role SECURITYADMIN; |
3 | |
4 | Create role “de_team” comment = “This is Data Engineering Team Role”; |
5 | grant role “de_team” to role “sf_admin”; |
6 | |
7 | Create role “qa_team” comment = “This is QA Team Role”; |
8 | grant role “qa_team” to role “sf+admin”; |
Create Users as below using USERADMIN role as best practice:
# | Instruction |
---|---|
1 | Use role useradmin: |
2 | |
3 | create user sf_admin_user_1 password = ‘sales1234’ comment = ‘This user belongs to TESTING Team’ must_change_password = FALSE; |
4 | create user sf_admin_user_2 password = ‘sales1234’ comment = ‘This user belongs to TESTING Team’ must_change_password = FALSE; |
5 | create user de_team_user_1 password = ‘sales1234’ comment = ‘This user belongs to TESTING Team’ must_change_password = FALSE; |
6 | create user de_team_user_2 password = ‘sales1234’ comment = ‘This user belongs to TESTING Team’ must_change_password = FALSE; |
7 | create user qa_team_user_1 password = ‘sales1234’ comment = ‘This user belongs to TESTING Team’ must_change_password = FALSE; |
Grant Roles to Users:
# | Instruction |
---|---|
1 | use role securityadmin; |
2 | |
3 | grant role “sf_admin” to user sf_admin_user_1; |
4 | grant role “sf_admin” to user sf_admin_user_2; |
5 | |
6 | grant role “de_team” to user de_team_user_1; |
7 | grant role “de_team” to user de_team_user_2; |
8 | |
9 | grant role “qa_team” to user qa_team_user_1; |
10 | grant role “qa_team” to user qa_team_user_2; |
Role Hierarchy should looks like this:
Now, If I log in using sf_admin_user_1, I can see I have access to the other 2 roles (de_team and qa_team) as the sf_admin role is at the top of the hierarchy:
Now, If I log in using de_team_user_1, I cannot see any other role:
Summary of Role-based Access Control:
Thus, a role (sf_admin) that sits at the top of the role hierarchy can perform all the privileges as the role (de_team / qa_team), which is a child role but not vice-versa, is true.
To conclude, Snowflake roles and hierarchies offer a practical and effective approach to managing access within your data warehouse. By creating roles that match different job functions and access levels and then organising these roles into hierarchies, you can establish a versatile system that fits your organisation’s needs.
Snowflake roles and hierarchies provide a steady framework for striking the right balance between data security and accessibility, no matter the scale of your data operations. Whether you’re overseeing a data setup or managing a data landscape, Snowflake roles and hierarchies provide a reliable tool to ensure your data remains well-organised and that you have control over access.
FAQ
Q: What does a data warehouse allow organisations to achieve?
A data warehouse enables organisations to centralise and store large volumes of data from various sources in a structured manner. It can provide a unified view of data, making it easier to analyse and make informed decisions. With data warehousing, organisations can improve data accessibility, enhance reporting and analytics, optimise business processes, and ultimately gain valuable insights to drive growth and innovation.
Q: What is a data quality manager?
A data quality manager is responsible for ensuring the accuracy, consistency and reliability of an organisation’s data. Their role involves developing and implementing data quality standards, policies, and procedures. Data Quality Managers collaborate with different teams to monitor data quality, identify and rectify data errors or inconsistencies, and establish data governance practices.
Further Reading…
About Skillfield:
Skillfield is a Melbourne-based Cyber Security and Data Services consultancy and professional services company. We provide solutions that help our customers discover, protect and optimise big data in a way that works for them.