Handy Snowflake functions
There are many very handy functions in Snowflake.
1. LIKE ALL and LIKE ANY
Let’s start with LIKE ALL. How often do you write: WHERE Column1 LIKE '%X%' AND Column1 LIKE '%Y%'? Well we have a function for that in Snowflake. It’s like this: WHERE Column1 LIKE ALL ('%X%', '%Y%'). Handy isn’t it? Obviously in Snowflake strings are case sensitive so we need to use UPPER(). And we have LIKE ANY too, meaning it’s matching ANY of the items in the brackets.
We also have ILIKE and ILIKE ANY, for case insensitive comparison. Which is also handy.
2. DECODE, IIF, GREATEST
I write CASE WHEN like this quite often:
SELECT CASE WHEN Column1 = 1 THEN 'X'
WHEN Column1 = 2 THEN 'Y'
ELSE 'Z' END
But we have DECODE in Snowflake, like this: SELECT DECODE(Column1, 1, 'X', 2, 'Y', 'Z').
What's the difference? Decode is a function, whereas CASE WHEN is an expression. So Decode is handy when you want to combine it with another function. And it’s shorter too. Obviously if it is more than one variable/column then we still need to CASE WHEN.
Similar to that is the IFF function. That second F stands for “Function” (because we have IF statement in Snowflake). Like this: IIF(Column1 = 1, 'A', 'B'). So if Column1 = 1 it returns A, otherwise it returns B. Simple isn’t it?
Talking about comparing stuff (string, date, number), we have GREATEST in Snowflake. It takes many arguments, not just two. Like this:
· GREATEST(1,2,3) – it returns 3.
· GREATEST('a','A','b') – it returns b. Because lower case has higher ASCII.
· GREATEST(TO_DATE('2024-08-21'),TO_DATE('2024-07-13')) – it returns 21/8/2024.
GREATEST is handy because it supports any data type including variant. Obviously it has to be the same data type. You can’t compare a date to a string or to a number. And we have GREATEST_IGNORE_NULL too, which is very handy as it ignores the NULLs.
Recommended by LinkedIn
3. LAST_DAY and NEXT_DAY
When I need to find out the end of the month I used to start from the first day of the following month, then substract one day. Well we have a function for that in Snowflake. It's called LAST_DAY. You can use it to find the last day of the month, or the last day of the quarter, or of the year. Like this: LAST_DAY({date}). If you put today's date for example (28th July 2024) it will return 31st July 2024. Handy isn't it? To return the end of quarter, put 'quarter' as the second parameter: LAST_DAY({date}, 'quarter').
Snowflake also have another handy date function called next_day, which return the date of next Thursday, or next Monday, etc. Like this next_day({date}, 'Thursday'). For example, if we put in today's date (28th July 2024) it will return 1st August 2024, the date. You don't need to type "Thursday", you can just type "Thu" it will understand.
4. EQUAL_NULL and NVL
In the WHERE clause we often compare two columns (could be from different tables). And we always have to use IFNULL or COALESCE, in case the data contains NULL. So it’s like this: WHERE COALESCE(Column1,'') = COALESCE(Column2,''). In Snowflake we have a function for that. It’s called EQUAL_NULL. Like this: WHERE EQUAL_NULL(Column1,Column2). If Column1 is NULL but Column2 is not the output is FALSE. The same thing with the reverse (Column2 is NULL but Column1 is not). But if Column1 and Column2 are both NULL, then the output is TRUE. Which is what we want. So this EQUAL_NULL function is handy.
There is a similar function to that, called IS_DISTINCT_FROM. I’ll let you have a look.
We often use COALESCE(Column1, Column2) which means that if Column1 is NULL, then use Column2. Well there is a function for that in Snowflake. It’s called NVL, which stands for NULL Value Logic. Same usage, i.e. NVL(Column1, Column2). But we also have NVL2 like this: NVL2(Column1, Column2, Column3). If Column1 is NULL, it returns Column3, otherwise it returns Column2.
5. EditDistance and JaroWinkler
I’ll close this article with EditDistance, which is very useful when comparing strings. It’s like this: EDITDISTANCE('Night', 'Lighter'). It returns how many changes we need to do to change Night to become Light. In this case it is 3: change N to L, add e, add r.
A more sophisticated version of it is JaroWinkler_Similarity, which returns the similarity percentage. JAROWINKLER_SIMILARITY('Night', 'Lighter') – the output is 79%. Whereas between Night and Light it is 86%.
6. CONCAT, || and CONCAT_WS
Last one, I promise. For string concatenation you can do CONCAT('A', 'B', 'C') – which returns ABC. But you can also do 'A' || 'B' || 'C' which also returns ABC.
And in Snowflake we also have CONCAT_WS. WS stands for “with separator”. It’s like CONCAT but the first argument is the separator. For example if we do CONCAT_WS(',', 'A', 'B', 'C') it returns A,B,C. So it puts comma between the items.
That’s all folks. Happy learning!