activity_occurrence is leaving out activity in partition, why? #4

Open
jbergenblom opened this issue Dec 11, 2023 · 0 comments

Comments

@jbergenblom

I just started diving into the Activity Schema pattern and am using both this dbt-activity-schema repo and the ActivitySchema repo as references.

When trying out the dbt macros, though, I couldn't get anything useful out of the window aggregates (activity_occurrence and activity_repeated_at), at least not on the assumption that they should reflect the ordinal ordering of events for a specific customer AND activity.

Take the pseudo-code for these columns from the ActivitySchema implementation specification (the occurrence-columns calculation). There, the activity field is included in the partitioning:

row_number() over (partition by coalesce (activity, customer, anonymous_customer_id) order by ts asc) as activity_occurrence,
lead(ts) over (partition by coalesce (activity, customer, anonymous_customer_id) order by ts asc) as activity_repeated_at

But in this repo's implementation, the partitioning is done on the customer only:

# macros/activity_occurrence.sql
...
partition by coalesce (
    {{ safe_cast("customer", type_string()) }},
    {{ safe_cast("anonymous_customer_id", type_string()) }}
) order by ts asc ) as activity_occurrence,
...

So the two contradict each other, and I can't see the value in an ordinal numbering across all of a customer's events regardless of activity. Was this difference intentional?
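
To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module (it needs a SQLite build with window-function support, 3.25 or later). The table name `stream`, the sample rows, and the output column names are all made up for illustration and come from neither repo. The sketch also assumes the spec's intent is to partition by the activity together with the customer key, and it leaves out anonymous_customer_id for brevity.

```python
import sqlite3

# Hypothetical activity stream: two customers, two activity types.
rows = [
    ("c1", "visited_page", "2023-01-01"),
    ("c1", "placed_order", "2023-01-02"),
    ("c1", "visited_page", "2023-01-03"),
    ("c2", "visited_page", "2023-01-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table stream (customer text, activity text, ts text)")
conn.executemany("insert into stream values (?, ?, ?)", rows)

query = """
select
    customer,
    activity,
    ts,
    -- customer-only partition: numbers every event of a customer
    row_number() over (partition by customer order by ts asc)
        as occurrence_by_customer,
    -- activity + customer partition (assumed intent of the spec)
    row_number() over (partition by activity, customer order by ts asc)
        as occurrence_by_activity,
    lead(ts) over (partition by activity, customer order by ts asc)
        as repeated_at_by_activity
from stream
order by customer, ts
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With the customer-only partition, c1's second visited_page gets occurrence 3 because the placed_order in between is counted too. With activity in the partition it gets occurrence 2, and the first visited_page gets a repeated-at timestamp pointing at the second one, which is what I expected from the spec.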
