Snowflake stores table data in micro-partitions using a columnar storage format, and keeps MIN/MAX value statistics for each column both in every micro-partition and for the table as a whole. Let’s investigate how this affects the performance of SQL queries involving the MIN and MAX functions.
I have an events table that is quite large: 62.5 billion rows and 9.1 TB of data. The first query is very simple:
SELECT MIN(event_hour), MAX(event_hour) FROM events;

 MIN(EVENT_HOUR) | MAX(EVENT_HOUR)
-----------------+-----------------
               0 | 23
It was very fast and took just 408 ms to return the result. Looking at the query execution profile in Snowflake, we can see that it was purely a metadata operation (METADATA-BASED RESULT), and no data was scanned at all.
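As a side note, other unfiltered aggregates that can be derived from the micro-partition statistics typically behave the same way; for example, these queries against the same events table should also be metadata-only:

-- These unfiltered aggregates can usually be answered from
-- micro-partition metadata alone, without scanning table data
SELECT COUNT(*) FROM events;
SELECT MIN(event_date), MAX(event_date) FROM events;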
Now let’s add a filter:
SELECT MIN(event_hour), MAX(event_hour) FROM events WHERE event_date = '2019-04-30';
This query took 5 seconds to execute. Now Snowflake cannot use the MIN/MAX statistics for the event_hour column directly, because they are collected for the entire table and for each micro-partition, while we need the MIN/MAX values within the specified date only.
Still, the query was quite fast:
Bytes scanned       376.53 MB
Partitions scanned  2,092
Partitions total    828,926
You can see that Snowflake read only 2,092 of the 828,926 partitions of the table. This means that the events table is well-clustered with respect to the event_date column. In other words, most micro-partitions contain a single constant value of event_date in all of their rows, so Snowflake can efficiently apply the MIN/MAX statistics at the partition level and skip scanning the partitions that do not include the specified date.
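You can check how well a table is clustered by specific columns using Snowflake’s SYSTEM$CLUSTERING_INFORMATION function; a minimal sketch, using the table and column from this article:

-- Returns clustering depth and partition overlap statistics for the
-- given columns; a smaller average depth means better clustering
SELECT SYSTEM$CLUSTERING_INFORMATION('events', '(event_date)');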
Now let’s use another filter:
SELECT MIN(event_hour), MAX(event_hour) FROM events WHERE event_name = 'LOGIN';
Now it took 7 minutes 48 seconds (!) to complete the query. Snowflake had to scan much more data:
Bytes scanned       121.7 GB
Partitions scanned  769,848
Partitions total    828,926
Snowflake scanned almost all partitions of the table. In the previous example we saw that the table is well-clustered by the event_date column, so excessive scanning for the event_name filter can mean one of two things:
1. The table may be badly clustered with respect to the event_name column. For example, there may be a relatively small number of LOGIN event rows, but they exist in almost every partition, so those partitions cannot be skipped from reading. If you re-cluster the table, moving rows for the same events into the same partitions, Snowflake will be able to query the table more efficiently and prune more partitions when an event_name filter is applied (see the sketch after this list).
2. There may be too many LOGIN events: they appear in every partition with a large number of rows, i.e. the table is heavily skewed by this event, and it takes 80-90% of the rows. In this case re-clustering will not help much, since Snowflake will still need to read most partitions. But clustering can still help filter by other, non-skewed values more efficiently (a query to check the skew is sketched below).
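Here is a minimal sketch of both checks, assuming the table and column names used in this article; the clustering key (event_date, event_name) is only an illustration, not a general recommendation:

-- Check the skew: how are rows distributed across event names?
SELECT event_name, COUNT(*) AS row_count
FROM events
GROUP BY event_name
ORDER BY row_count DESC;

-- If the distribution is not heavily skewed, a clustering key that
-- includes event_name lets Snowflake co-locate rows for the same event
-- and prune more partitions when an event_name filter is applied
ALTER TABLE events CLUSTER BY (event_date, event_name);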
You can note that thanks to the columnar storage, Snowflake scanned data only for 2 columns (not the entire table of 9.1 TB), which is just 1.3% of the table size (although it is still 121.7 GB).
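If you want to compare these scan statistics across queries without opening every query profile, they are also exposed in the SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY view; a minimal sketch (the ILIKE pattern is just an illustration):

-- Scan statistics for recent queries touching the events table
-- (note that ACCOUNT_USAGE views are populated with some latency)
SELECT query_text,
       bytes_scanned,
       partitions_scanned,
       partitions_total,
       total_elapsed_time
FROM snowflake.account_usage.query_history
WHERE query_text ILIKE '%FROM events%'
ORDER BY start_time DESC
LIMIT 10;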