Snowflake stores table data in micro-partitions using a columnar storage format, and keeps MIN/MAX value statistics for each column both in every micro-partition and for the table as a whole. Let’s investigate how this affects the performance of SQL queries involving the MIN and MAX functions.
I have an events table that is quite large: 62.5 billion rows and 9.1 TB of data. The first query is very simple:
SELECT MIN(event_hour), MAX(event_hour) FROM events;

 MIN(EVENT_HOUR) | MAX(EVENT_HOUR)
-----------------+-----------------
               0 | 23
It was very fast and took just 408 ms to return the result. Looking at the query execution profile in Snowflake, we can see that it was purely a metadata operation (METADATA-BASED RESULT), and no data was scanned at all.
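As a side note, other unfiltered aggregates that can be derived from the micro-partition statistics typically behave the same way; for example, these queries against the same events table should also be metadata-only:

-- These unfiltered aggregates can usually be answered from
-- micro-partition metadata alone, without scanning table data
SELECT COUNT(*) FROM events;
SELECT MIN(event_date), MAX(event_date) FROM events;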
Now let’s add a filter:
SELECT MIN(event_hour), MAX(event_hour) FROM events WHERE event_date = '2019-04-30';
This query took 5 seconds to execute. Now Snowflake cannot use the MIN/MAX statistics for the event_hour column directly, because they are collected for the entire table and for each micro-partition, while we need the MIN/MAX values within the specified date only.
Still, the query was quite fast:
Bytes scanned       376.53 MB
Partitions scanned  2,092
Partitions total    828,926
You can see that Snowflake read only 2,092 of the 828,926 partitions of the table. This means that the events table is well-clustered with respect to the event_date column. In other words, most micro-partitions contain a single constant value of event_date in all of their rows, so Snowflake can efficiently apply the MIN/MAX statistics at the partition level and skip scanning the partitions that do not include the specified date.
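You can check how well a table is clustered by specific columns using Snowflake’s SYSTEM$CLUSTERING_INFORMATION function; a minimal sketch, using the table and column from this article:

-- Returns clustering depth and partition overlap statistics for the
-- given columns; a smaller average depth means better clustering
SELECT SYSTEM$CLUSTERING_INFORMATION('events', '(event_date)');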
Now let’s use another filter:
SELECT MIN(event_hour), MAX(event_hour) FROM events WHERE event_name = 'LOGIN';
Now it took 7 minutes 48 seconds (!) to complete the query. Snowflake had to scan much more data:
Bytes scanned       121.7 GB
Partitions scanned  769,848
Partitions total    828,926
Snowflake scanned almost all partitions of the table. In the previous example we saw that the table is well-clustered by the event_date column, so excessive scanning for the event_name filter can mean one of two things:
1. The table may be badly clustered with respect to the event_name column. For example, there may be a relatively small number of LOGIN event rows, but they exist in almost every partition, so those partitions cannot be skipped from reading. If you re-cluster the table, moving rows for the same events into the same partitions, Snowflake will be able to query the table more efficiently and prune more partitions when an event_name filter is applied (see the sketch after this list).
2. There may be too many LOGIN events: they appear in every partition with a large number of rows, i.e. the table is heavily skewed by this event, and it takes 80-90% of the rows. In this case re-clustering will not help much, since Snowflake will still need to read most partitions. But clustering can still help filter by other, non-skewed values more efficiently (a query to check the skew is sketched below).
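Here is a minimal sketch of both checks, assuming the table and column names used in this article; the clustering key (event_date, event_name) is only an illustration, not a general recommendation:

-- Check the skew: how are rows distributed across event names?
SELECT event_name, COUNT(*) AS row_count
FROM events
GROUP BY event_name
ORDER BY row_count DESC;

-- If the distribution is not heavily skewed, a clustering key that
-- includes event_name lets Snowflake co-locate rows for the same event
-- and prune more partitions when an event_name filter is applied
ALTER TABLE events CLUSTER BY (event_date, event_name);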
You can note that thanks to the columnar storage, Snowflake scanned data only for 2 columns (not the entire table of 9.1 TB), which is just 1.3% of the table size (although it is still 121.7 GB).
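If you want to compare these scan statistics across queries without opening every query profile, they are also exposed in the SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY view; a minimal sketch (the ILIKE pattern is just an illustration):

-- Scan statistics for recent queries touching the events table
-- (note that ACCOUNT_USAGE views are populated with some latency)
SELECT query_text,
       bytes_scanned,
       partitions_scanned,
       partitions_total,
       total_elapsed_time
FROM snowflake.account_usage.query_history
WHERE query_text ILIKE '%FROM events%'
ORDER BY start_time DESC
LIMIT 10;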