
Snowflake hook doesn't parameterize SQL passed as a string type, causing SnowflakeOperator to fail #16068

Closed
grassten opened this issue May 25, 2021 · 5 comments · Fixed by #16102
Labels
kind:bug This is a clearly a bug

Comments

grassten (Contributor) commented May 25, 2021

Apache Airflow version: 2.1.0

What happened: The new Snowflake hook's run method does not take parameters into account when the SQL is passed as a string: it uses the Snowflake connector's execute_string method, which does not support parameterization. As a result, the only way to parameterize your query from a SnowflakeOperator is to put the SQL into a list.

if isinstance(sql, str):
    cursors = conn.execute_string(sql, return_cursors=True)
    for cur in cursors:
        self.query_ids.append(cur.sfqid)
        self.log.info("Rows affected: %s", cur.rowcount)
        self.log.info("Snowflake query id: %s", cur.sfqid)
        cur.close()

https://docs.snowflake.com/en/user-guide/python-connector-api.html#execute_string

How to reproduce it: Pass a sql string and parameters to SnowflakeOperator; the query will not be parameterized and will fail with a SQL syntax error on the placeholder characters, e.g. %(param)s.
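For illustration, a minimal operator definition along these lines reproduces it (the connection id, table, and parameter names below are placeholders, not taken from the issue):

from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

# sql passed as a plain string: the hook routes it through execute_string(),
# the parameters dict is ignored, and Snowflake raises a syntax error on %(run_date)s
failing = SnowflakeOperator(
    task_id="string_sql_fails",
    snowflake_conn_id="snowflake_default",
    sql="SELECT * FROM my_table WHERE run_date = %(run_date)s",
    parameters={"run_date": "2021-05-25"},
)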

Anything else we need to know: A quick workaround is to put your sql string into a single-item list; lists of sql are still parameterized correctly.
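Continuing the sketch above, wrapping the same statement in a list sends it through cursor.execute(), which does honor the parameters:

from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

# the same query wrapped in a list: each item goes through cursor.execute(),
# so %(run_date)s is bound correctly
working = SnowflakeOperator(
    task_id="list_sql_works",
    snowflake_conn_id="snowflake_default",
    sql=["SELECT * FROM my_table WHERE run_date = %(run_date)s"],
    parameters={"run_date": "2021-05-25"},
)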

grassten added the kind:bug label on May 25, 2021
boring-cyborg (bot) commented May 25, 2021

Thanks for opening your first issue here! Be sure to follow the issue template!

potiuk (Member) commented May 25, 2021

Would you be willing to submit a fix, @grassten? Happy to review and merge it.

grassten (Contributor, Author) commented:

> Would you be willing to submit a fix, @grassten? Happy to review and merge it.

Sure! I'll work on getting a PR in today or tomorrow morning.

grassten (Contributor, Author) commented May 26, 2021

@potiuk I'm wondering if we should split the passed sql param and then parameterize the resulting queries, or stop splitting the string altogether (splitting was implemented recently in #15533). To split the string, I think we could use snowflake.connector.util_text.split_statements; that's what execute_string uses internally in the Snowflake connector. (I have that in a PR in my fork here)

That being said, I don't think parameterizing a list of queries (after splitting) with a single set of params is valid. I think we should put the passed sql into a single-item list and pass it through the existing code; this seems more in line with what you would expect to happen. cc @mobuchowski @eladkal, I see you both on that PR; what do you think?

Edit: I've submitted the PR using split_statements; this implementation makes strings with multiple statements behave like lists of sql in their parameter handling.
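For context, a rough sketch of that direction (the function name and structure here are illustrative, not the actual PR code): split the string with the connector's own splitter, then run each statement through cursor.execute() so the parameters are bound, mirroring the existing list handling.

from io import StringIO

from snowflake.connector.util_text import split_statements


def run_string_with_params(conn, sql, parameters=None):
    """Execute each statement of a multi-statement string with parameter binding."""
    query_ids = []
    cur = conn.cursor()
    try:
        # split_statements yields (statement, is_put_or_get) tuples; it is the
        # same splitter execute_string uses internally
        for statement, _ in split_statements(StringIO(sql)):
            cur.execute(statement, parameters)
            query_ids.append(cur.sfqid)
    finally:
        cur.close()
    return query_ids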

potiuk (Member) commented May 26, 2021

@mobuchowski - WDYT?
