You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Jul 2, 2024. It is now read-only.
Not sure if this is a bug or a feature... either way I don't think it adds any privacy benefit since it can be worked around by decomposing the group by grouping sets query into multiple group by statements.
☝️ This is equivalent to the union of group by 2 and group by 3 and I get the results I expect: The results contain the same two rows as in the query above (with different grouping_ids). Link
select
grouping_id(
bucket ("Line No_" by 2000.0),"Quantity"
),
bucket ("Line No_" by 2000.0),"Quantity",
count(*),
count_noise(*)
from"SalesCrMemoLine"group by grouping sets ((2), (3))
☝️ The two rows for the "Quantity" grouping are still there, but the numbers are totally different. It looks like the mere fact of including another group has added extra noise to the unrelated column.
In other examples, this can even cause a column to be suppressed or removed entirely from the result set.
TLDR; group by grouping sets ought to be equivalent to the union of the individual group by ${subgroup} queries. But in some cases adding an extra (and unrelated) grouping set can change results from other grouping sets.
The text was updated successfully, but these errors were encountered:
Each grouping expression generates noise layers, which should apply only to those buckets containing it.
It seems the Cloak always applies these noise layers globally, to all the buckets, returning different noise values for a bucket which should not be affected by unrelated grouping expressions.
Sign up for freeto subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Not sure if this is a bug or a feature... either way I don't think it adds any privacy benefit since it can be worked around by decomposing the
group by grouping sets
query into multiplegroup by
statements.The following were run against https://demo.aircloak.com/data_sources/NAV_W1_TENANT_WS001.
☝️ This is equivalent to just
group by 2
and I get the same result whichever way I write the query statement. So far, so good. Link☝️ This is equivalent to the union of
group by 2
andgroup by 3
and I get the results I expect: The results contain the same two rows as in the query above (with differentgrouping_id
s). LinkNow if instead I add a column with the
bucket
functions something weird happens.☝️ The two rows for the "Quantity" grouping are still there, but the numbers are totally different. It looks like the mere fact of including another group has added extra noise to the unrelated column.
In other examples, this can even cause a column to be suppressed or removed entirely from the result set.
TLDR;
group by grouping sets
ought to be equivalent to the union of the individualgroup by ${subgroup}
queries. But in some cases adding an extra (and unrelated) grouping set can change results from other grouping sets.The text was updated successfully, but these errors were encountered: