Databricks loader: Add collector_tstamp_date column #943

Closed
istreeter opened this issue Jun 23, 2022 · 1 comment

Comments

@istreeter
Contributor

When loading to Delta tables, it is helpful to partition the data so that queries can skip irrelevant files and run faster.

For Snowplow data, an appropriate partitioning column is the date of the collector timestamp. It has relatively low cardinality, and collector_tstamp is a field that appears in many analytic queries, including Snowplow's standard data models. To make this column available for partitioning, we need to explicitly add it to the COPY INTO statement.
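
As a rough illustration of the idea (not the loader's actual implementation), the sketch below builds a COPY INTO statement whose SELECT adds a derived collector_tstamp_date column. The helper name `build_copy_into`, the table name, the source path, and the column list are all hypothetical; the only assumption about Databricks is that COPY INTO accepts a SELECT over the source location and that the files are Parquet.

```python
from typing import Sequence


def build_copy_into(table: str, source_path: str, columns: Sequence[str]) -> str:
    """Build a COPY INTO statement that adds a derived date column.

    Hypothetical helper for illustration only. `columns` are the columns
    copied as-is; collector_tstamp_date is derived from collector_tstamp.
    """
    # Append the derived partition column to the columns selected from the files.
    select_items = list(columns) + [
        "CAST(collector_tstamp AS DATE) AS collector_tstamp_date"
    ]
    select_list = ",\n    ".join(select_items)
    return (
        f"COPY INTO {table}\n"
        "FROM (\n"
        "  SELECT\n"
        f"    {select_list}\n"
        f"  FROM '{source_path}'\n"
        ")\n"
        "FILEFORMAT = PARQUET"
    )


if __name__ == "__main__":
    # Placeholder table, path, and columns; substitute real values.
    print(
        build_copy_into(
            "snowplow.events",
            "s3://example-bucket/transformed/run=1/",
            ["event_id", "collector_tstamp"],
        )
    )
```

For the new column to help, the target table would also need to be partitioned by collector_tstamp_date, and queries would need to filter on it.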

@istreeter
Contributor Author

Closing in favour of #951

@istreeter istreeter removed this from the 4.1.0 milestone Jun 25, 2022