Part of the data cleaning process involves understanding the quality of your data. NULL values are usually best avoided, so counting their occurrences is a common operation. There are several methods that can be used here:
- `sum(iff(<column> is null, 1, 0))` - use the `IFF` function to return 1 if a value is NULL and 0 otherwise, then sum the results.
- `count(*) - count(<column>)` - use the two forms of the `count()` aggregation: `count(*)` counts every row, while `count(<column>)` counts only rows where the column is not NULL, so the difference is the number of NULLs.
- `sum(case when <column> is null then 1 else 0 end)` - similar to the `IFF` method, but using a `CASE` expression instead.
    with data as (
      select * from (values (1), (2), (null), (null), (5)) as data (x)
    )
    select
      sum(iff(x is null, 1, 0)) with_iff,
      count(*) - count(x) with_count,
      sum(case when x is null then 1 else 0 end) with_case
    from data
| WITH_IFF | WITH_COUNT | WITH_CASE |
|---|---|---|
| 2 | 2 | 2 |
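To see why the `count(*) - count(x)` form works, it can help to look at the two counts separately. The query below is a minimal sketch that reuses the same sample data. The column aliases `all_rows`, `non_null_rows` and `null_rows` are illustrative names chosen here, not part of the original snippet.

```sql
with data as (
  select * from (values (1), (2), (null), (null), (5)) as data (x)
)
select
  count(*) as all_rows,                 -- counts every row, whether or not x is NULL
  count(x) as non_null_rows,            -- counts only rows where x is not NULL
  count(*) - count(x) as null_rows      -- the difference is the number of NULLs
from data
```

With five rows, two of which are NULL, this should return 5, 3 and 2 respectively, matching the `WITH_COUNT` result above.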