Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove agate dependency? #3413

Closed
jtcohen6 opened this issue Jun 2, 2021 · 5 comments
Closed

Remove agate dependency? #3413

jtcohen6 opened this issue Jun 2, 2021 · 5 comments
Labels
dependencies Changes to the version of dbt dependencies enhancement New feature or request seeds Issues related to dbt's seed functionality stale Issues that have gone stale

Comments

@jtcohen6
Copy link
Contributor

jtcohen6 commented Jun 2, 2021

Describe the feature

As time goes on, our reliance on the agate package for light-weight data frames may be providing us more grief than gain.

That said, there's still a lot agate does get us:

  • Type inference for seeds that's better than nothing
  • A reasonable programming interface for tables of info dynamically returned from the database. Many dbt internals, packages, and community members have tapped directly into agate methods, e.g. print_table() (docs)
  • It's not pandas

Any changes we make here have a lot of potential visibility and surface area.

Describe alternatives you've considered

I'm not sure... are there other packages that could occupy the same niche for us? Is this something we should look to hand-roll? I'm not convinced that the problems we're encountering, especially around data type inference, have "easy" solutions, if only we could be writing the code for them ourselves.

Additional context

This affects all databases. While agate functionality is obscured from many users, I imagine that a nontrivial percentage of the overall user base depends on agate-specific methods today, whether via their own Jinja code or by leveraging macros from a package.

I'm adding this to the v1.0 list, not because we should definitely do it, but because we should consider whether we can comfortably include the current agate dependency in our stable major-version release.

Who will this benefit?

  • RPC users → dbt Cloud IDE users
  • Maintainers of dbt-core
  • The workings of run_query are wrapped in more mystique today than they ultimately should be. While agate provides an okay syntax for working with query results, it's not ideal. Along with work around execute, we could improve the experience of templating model SQL based on the results from introspective queries.
@jtcohen6 jtcohen6 added enhancement New feature or request dependencies Changes to the version of dbt dependencies 1.0.0 Issues related to the 1.0.0 release of dbt labels Jun 2, 2021
@jtcohen6
Copy link
Contributor Author

jtcohen6 commented Jul 1, 2021

We can't estimate this right now. Instead, let's spike:

  • all the places agate is used
  • what we need out of a type inferencer
  • backwards compatibility options for documented methods on agate.Table objects

Some options:

  • re-roll the functional bits we need from agate
  • fork or vendor agate to fix dependency/upgrade issues, and give ourselves points of intervention in the future

@jtcohen6 jtcohen6 added the seeds Issues related to dbt's seed functionality label Oct 5, 2021
@jtcohen6
Copy link
Contributor Author

jtcohen6 commented Oct 5, 2021

@drewbanin I'd be curious to hear what sticks out as the really bad parts of relying on agate. The two issues I outlined above have both been resolved:

The biggest downside for me is, we'd want to keep the dependency tightly pinned, since the maintainers have a history of including breaking changes in patch releases (e.g. coercing bools into numbers).

@drewbanin
Copy link
Contributor

@jtcohen6 I'm with you that the big downside risk is around pinning the dependency. It would be unfortunate if a dependency of agate==1.6.3 impacted another dependency in a way that bricked installs or prevented us from using another lib. Short of that though, I can't think of any functional impacts of agate that we'd need to change/remove/bypass in dbt anymore.

@jtcohen6 jtcohen6 removed the 1.0.0 Issues related to the 1.0.0 release of dbt label Oct 6, 2021
@github-actions
Copy link
Contributor

github-actions bot commented Apr 5, 2022

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@jpmckinney
Copy link

Hiya, I took over maintenance of agate a little while ago. Which patch coerced bools into numbers?

Anyhow, semantic versioning should be fine going forward.

agate accepts PRs if you need anything :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Changes to the version of dbt dependencies enhancement New feature or request seeds Issues related to dbt's seed functionality stale Issues that have gone stale
Projects
None yet
Development

No branches or pull requests

3 participants