Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[airflow 2.4.3] presto queries returning none following upgrade to common.sql provider #31612

Closed
1 of 2 tasks
yanivsin opened this issue May 30, 2023 · 14 comments · Fixed by #35132
Closed
1 of 2 tasks
Labels
area:providers kind:bug This is a clearly a bug pending-response provider:trino stale Stale PRs per the .github/workflows/stale.yml policy file

Comments

@yanivsin
Copy link

yanivsin commented May 30, 2023

Apache Airflow version

Other Airflow 2 version (please specify below)

What happened

After upgrading apache-airflow-providers-common-sql from 1.2.0 to anything above 1.3.0 presto queries using the get_records() and or get_first() function returns none.

using the same query -- select 1:
1.2.0: Done. Returned value was: [[1]]
1.3.0 and above:

Running statement: select 1, parameters: None
[2023-05-30, 11:57:37 UTC] {{python.py:177}} INFO - Done. Returned value was: None

What you think should happen instead

i would expect that running the query select 1 on presto would provide the same result when the environment is running apache-airflow-providers-common-sql 1.2.0 or apache-airflow-providers-common-sql 1.5.1.

How to reproduce

run the following query: PrestoHook(conn_id).get_records(select 1)
ensure that the requirements are as labelled below.

Operating System

NAME="Amazon Linux" VERSION="2" ID="amzn" ID_LIKE="centos rhel fedora"

Versions of Apache Airflow Providers

apache-airflow==2.4.3
apache-airflow-providers-amazon==6.0.0
apache-airflow-providers-celery==3.0.0
apache-airflow-providers-common-sql==1.5.1
apache-airflow-providers-ftp==3.1.0
apache-airflow-providers-google==8.4.0
apache-airflow-providers-http==4.0.0
apache-airflow-providers-imap==3.0.0
apache-airflow-providers-jenkins==3.0.0
apache-airflow-providers-mysql==3.2.1
apache-airflow-providers-postgres==5.2.2
apache-airflow-providers-presto==5.1.0
apache-airflow-providers-sendgrid==3.0.0
apache-airflow-providers-slack==6.0.0
apache-airflow-providers-snowflake==3.3.0
apache-airflow-providers-sqlite==3.2.1
apache-airflow-providers-trino==4.1.0

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@yanivsin yanivsin added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels May 30, 2023
@boring-cyborg
Copy link

boring-cyborg bot commented May 30, 2023

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@hussein-awala
Copy link
Member

#26944 added a condition on cursor.description before returning the result, could you try executing this one?

PrestoHook(conn_id).run(sql="<your query>", handler=lambda cursor: cursor.fetchall())

@yanivsin
Copy link
Author

thank you @hussein-awala. this works.

In continuation with the issue, at the moment the sql sensor for presto queries is broken as all get_records() queries return None. i've created this #31630 pr which i believe should fix the issue. WDYT?

@hussein-awala
Copy link
Member

We should know why the description is None (it should not be), and if we need to change something, we should do that in the presto provider and not the common sql. I will do some tests from my side

@yanivsin
Copy link
Author

thank you.

@eladkal
Copy link
Contributor

eladkal commented May 31, 2023

Is the issue from Airflow side or Presto SDK?
We had previous case of #26774 (comment) that turned not to be Airflow issue.
Can we confirm if the SDK work as expected ?

@github-actions
Copy link

github-actions bot commented Jul 1, 2023

This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Jul 1, 2023
@github-actions
Copy link

github-actions bot commented Jul 9, 2023

This issue has been closed because it has not received response from the issue author.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jul 9, 2023
@arghya18
Copy link

Any update on this?

@potiuk
Copy link
Member

potiuk commented Aug 17, 2023

Any update on this?

Yes - as you saw: This issue has been closed because it has not received response from the issue author.

If you have similar issue, I suggest you to open a new one and explain circumstances you have - especially addressing the above questions in the discussion.

@arghya18
Copy link

I have exactly the same issue as the original issue is not solved. Do you still want me to open a new ticket? It might be redundant

@potiuk
Copy link
Member

potiuk commented Aug 17, 2023

I have exactly the same issue as the original issue is not solved. Do you still want me to open a new ticket? It might be redundant

Or not. You might have similar issue. You might have different airflow configuration. You migh thave different provider version . Depending on your evidences, details and logs. And when you open an issue and somone will try to add extra questions that will help to test some hypotheses and maybe ask you to provide evidences, then it will be on you to provide those answers.

Some poeple asked extra questions that needed answer and they were not answered by the author.

Note that there is no guarantee it will be solved either way. It will need someone to take a look and solve it, and amount of evidences and effort put by those who raise an issue in making it easier to diagnose and help increases the chances that someone will, actually take a look and help. The more you show that you've done a lot to help to diagnose it, the more likely somoene will want to spend their free time on fixing it (note that Airflow has > 2500 contributors and they mostly contribute in their free time, so the easier and better quality information is there).

@ming-zhang-SN
Copy link

TL;DR
I encountered the same issue. Updating the presto-python-client to version 0.8.4 resolves it.

Details:
As previously discussed, the issue was due to the cursor.description being None within either fetch_all_handler or fetch_one_handler, which we believed shouldn't occur. After further investigation, I found that the description was None because the _query's columns were also None:
Link to 0.8.3 code on GitHub

This is because the columns are only assigned within the fetch function here:
Link to 0.8.3 code on GitHub

Essentially, one must execute a fetch command, after which the columns will be assigned, preventing the description from being None.

However, in the latest version of presto-python-client (0.8.4), a call to fetch has been added to the execute function:
Link to commit on GitHub

And the execute function is always called here:
Link to Apache Airflow code on GitHub
This occurs before either the fetch_all_handler or fetch_one_handler is invoked.

potiuk added a commit to potiuk/airflow that referenced this issue Oct 23, 2023
Previous version of client had compatibility issue with dbapi expected
behaviour and returned None for getting records.

Fixes: apache#31612
@potiuk
Copy link
Member

potiuk commented Oct 23, 2023

Thanks @ming-zhang-SN for the comment explaining it. I just added PR #35132 so that new version of presto provider will have min-version bumped for the client.

potiuk added a commit that referenced this issue Oct 23, 2023
…#35132)

Previous version of client had compatibility issue with dbapi expected
behaviour and returned None for getting records.

Fixes: #31612
potiuk added a commit that referenced this issue Oct 29, 2023
…#35132)

Previous version of client had compatibility issue with dbapi expected
behaviour and returned None for getting records.

Fixes: #31612
(cherry picked from commit 8ef2a99)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers kind:bug This is a clearly a bug pending-response provider:trino stale Stale PRs per the .github/workflows/stale.yml policy file
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants