In this tutorial, we will discuss what the Snowflake Data Warehouse is, the Snowflake architecture, how to create a free trial account for a test drive, and finally how to access the Snowflake WebUI.
1. What is Snowflake Data Cloud Warehouse?
Snowflake is a cloud-based data warehousing platform built on top of public cloud infrastructure. At present, it is available on the popular cloud providers Amazon Web Services (AWS), Microsoft Azure, and Google Cloud.
Because it runs fully on public cloud infrastructure, there is no hardware (virtual or physical) or software to install, configure, or maintain. Snowflake is a true SaaS product.
Snowflake delivers a data warehouse model that is quicker, easier to set up, and significantly more adaptable than typical data warehouse systems.
It quickly rose to the top of the data management solutions for analytics market because of its unique characteristics.
2. Snowflake Architecture
The Snowflake database design is a combination of shared-disk and shared-nothing database systems. Like shared-disk systems, Snowflake uses a central data repository that is accessible from all compute nodes in the platform.
Snowflake performs computation utilizing MPP (massively parallel processing) compute clusters, wherein each node in the cluster maintains a piece of the full data set locally, similar to shared-nothing systems.
This method combines the ease of data management of a shared-disk design with the speed and scale-out advantages of a shared-nothing architecture.
There are 3 layers in the Snowflake Architecture.
- Storage layer,
- Computation layer,
- Cloud Services layer.
Let us discuss each layer in detail.
2.1. Storage layer
Snowflake divides the data into numerous micro partitions, each of which is optimised and compressed internally. It stores data in a columnar fashion.
Data is kept in cloud storage and managed using a shared-disk approach, which keeps data administration simple. Unlike in a pure shared-nothing architecture, users do not have to worry about how data is distributed across several nodes.
To fetch data for query processing, compute nodes connect to the storage layer. Because the storage layer is independent of compute, storage is billed separately.
Snowflake's storage is elastic since it is provided in the cloud, and it is billed monthly per TB consumed.
The storage cost is determined using the average amount of storage used per month, after compression.
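The storage calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `monthly_storage_cost`, the daily sampling of storage usage, and the price per TB are all assumptions made for the example, not Snowflake's actual billing code or rates.

```python
from typing import Sequence


def monthly_storage_cost(daily_compressed_tb: Sequence[float], price_per_tb: float) -> float:
    """Estimate a monthly storage bill (hypothetical helper).

    daily_compressed_tb: compressed storage in TB, sampled once per day (assumed).
    price_per_tb: monthly price per TB; a placeholder, not a real Snowflake rate.
    """
    if not daily_compressed_tb:
        raise ValueError("at least one daily sample is required")
    # The bill is based on the average storage used over the month.
    average_tb = sum(daily_compressed_tb) / len(daily_compressed_tb)
    return average_tb * price_per_tb


# 15 days at 2 TB followed by 15 days at 4 TB averages out to 3 TB.
samples = [2.0] * 15 + [4.0] * 15
print(monthly_storage_cost(samples, price_per_tb=25.0))
```

The point of the sketch is that a short spike in storage raises the bill only in proportion to how long it lasts, because the average over the month is what counts.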
2.2. Computation layer
This layer consists of Virtual Warehouses which are scalable computational units.
The compute layer reads data from the storage layer and caches it locally to speed up future queries, i.e., each Virtual Warehouse has its own cache.
Multiple virtual warehouses can run at the same time and process the same data concurrently while maintaining ACID guarantees.
In Snowflake, multiple Virtual Warehouses can be created for different requirements, depending on the workload.
These Warehouses are MPP (Massively Parallel Processing) in nature. All virtual warehouses share the same single storage layer.
Each Virtual Warehouse has its own compute cluster and does not interact with other virtual warehouses; this is the shared-nothing part of the architecture.
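The relationship between one shared storage layer and several independent warehouses can be pictured with a toy model. The classes `SharedStorage` and `VirtualWarehouse` below are hypothetical stand-ins written for this sketch; they are not part of any Snowflake API and ignore everything except the caching behaviour.

```python
class SharedStorage:
    """Stand-in for the central storage layer: one copy of the data, visible to all."""

    def __init__(self, tables):
        self._tables = dict(tables)
        self.reads = 0  # how many times compute had to go to storage

    def fetch(self, table):
        self.reads += 1
        return self._tables[table]


class VirtualWarehouse:
    """Stand-in for a compute cluster with its own private cache."""

    def __init__(self, name, storage):
        self.name = name
        self._storage = storage
        self._cache = {}  # local to this warehouse, never shared

    def scan(self, table):
        # Go to shared storage only on a cache miss.
        if table not in self._cache:
            self._cache[table] = self._storage.fetch(table)
        return self._cache[table]

    def is_cached(self, table):
        return table in self._cache


storage = SharedStorage({"orders": [10, 20, 30]})
etl = VirtualWarehouse("etl_wh", storage)
reporting = VirtualWarehouse("reporting_wh", storage)

etl.scan("orders")        # cache miss: reads from shared storage
etl.scan("orders")        # cache hit: no storage read
reporting.scan("orders")  # separate warehouse, separate cache: reads again
print(storage.reads)
```

Both warehouses see the same `orders` data because there is only one storage layer, but a cache warmed by `etl_wh` does nothing for `reporting_wh`.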
Warehouse pricing is usage-based, i.e., charges are calculated from the amount of compute you consume, measured in credits.
The billing clock runs only while a Virtual Warehouse is running; a suspended warehouse consumes no credits. The table below shows the credits charged per hour for each warehouse size.
Virtual Warehouse Size | Credit per hour |
X-Small | 1 |
Small | 2 |
Medium | 4 |
Large | 8 |
X-Large | 16 |
2X-Large | 32 |
3X-Large | 64 |
4X-Large | 128 |
5X-Large | 256 |
6X-Large | 512 |
Usage is billed per second, with a minimum charge of one minute each time a warehouse starts. For example, if a warehouse runs for only 30 seconds, it is still billed for one minute.
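To make the per-second rule and the one-minute minimum concrete, here is a minimal Python sketch. The credit rates come from the table above; the helper name `credits_used` and the assumption that we are pricing a single continuous run are choices made for this example.

```python
# Credits per hour for each warehouse size, taken from the table above.
CREDITS_PER_HOUR = {
    "X-Small": 1, "Small": 2, "Medium": 4, "Large": 8, "X-Large": 16,
    "2X-Large": 32, "3X-Large": 64, "4X-Large": 128, "5X-Large": 256,
    "6X-Large": 512,
}

MINIMUM_BILLED_SECONDS = 60  # one-minute minimum each time the warehouse starts


def credits_used(size: str, running_seconds: float) -> float:
    """Estimate credits for one continuous run of a warehouse (hypothetical helper)."""
    if running_seconds <= 0:
        return 0.0
    # Per-second billing, but never less than the one-minute minimum.
    billed_seconds = max(running_seconds, MINIMUM_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billed_seconds / 3600


print(credits_used("X-Small", 30))   # billed as a full minute
print(credits_used("X-Small", 60))   # same charge as the 30-second run
print(credits_used("Medium", 1800))  # half an hour on a Medium warehouse
```

Multiplying the result by the price per credit for your edition and cloud provider gives the cost in currency.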
2.3. Cloud Services layer
This layer contains the services that coordinate activity across Snowflake, such as authentication, security, metadata management for the loaded data, and query optimization.
The services layer provides a SQL client interface for data operations such as DDL and DML. Like the other layers, the cloud services layer can also be scaled.
Cloud services are not charged directly. Because these services also require compute power to run, a certain amount of credit usage for them is included by default.
Cloud services are charged only when their usage goes beyond 10% of the total compute used per day. For example, if compute used 100 credits and cloud services used 15 credits, then the charge for cloud services would be 15 - (10% of 100) = 5 credits.
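The 10% rule can be written as a small Python function. This is a sketch of the rule as described in this section; the function name `billable_cloud_services` is made up for the example, and the inputs are assumed to be daily totals in credits.

```python
def billable_cloud_services(compute_credits: float, cloud_services_credits: float) -> float:
    """Credits charged for cloud services in one day (hypothetical helper)."""
    # Cloud services usage up to 10% of the day's compute usage is not charged.
    free_allowance = 0.10 * compute_credits
    # Only the portion above the allowance is billed; it can never be negative.
    return max(0.0, cloud_services_credits - free_allowance)


# Figures from the example in the text: 100 compute credits, 15 cloud services credits.
print(billable_cloud_services(100, 15))
# Usage below the 10% threshold results in no charge.
print(billable_cloud_services(100, 8))
```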
3. What is Credit?
A credit is the charging unit in Snowflake. The price of each credit depends on the edition and the cloud provider. You can check and estimate the pricing in this link.
Select the cloud provider, region, and currency to get the estimated charge per credit.
4. Create Snowflake Account
Snowflake provides a free trial account for a test drive, with $400 worth of usage. You can create a free trial account in the link below.
On the sign-up page, enter your basic details and click Continue.
Choose the Snowflake edition, cloud provider, and region. Make sure you have checked the license agreement box, then click Get started.
A confirmation email will be sent to your registered email address.
Go to your email inbox, open the activation email from Snowflake support, and click the 'Click to Activate' link. You will be redirected to a new page where you can set your username and password.
Set the username and password, then click 'Get started' to proceed.
Once you get started, you will receive a confirmation email saying that your account has been activated. Click the 'Log in to Snowflake' link.
You will now be redirected to the landing page of the Snowflake WebUI.
Conclusion
In this article, we went through the concept of Snowflake and its architecture, and we opened a free trial account for POC and testing purposes. We also discussed how to access Snowflake's WebUI.
In the upcoming article, we will see how to install SnowSQL, access Snowflake using SnowSQL (CLI), and perform various database management activities.
Resources: