cannot use get_current_context() in a @task.virtualenv task #34158

Open
1 of 2 tasks
mziwisky opened this issue Sep 7, 2023 · 14 comments
Labels
area:core; area:core-operators (Operators, Sensors and hooks within Core Airflow); good first issue; kind:feature (Feature Requests)

Comments

@mziwisky

mziwisky commented Sep 7, 2023

Apache Airflow version

Other Airflow 2 version (please specify below)

What happened

On Airflow 2.5.1 (on AWS MWAA), I ran this DAG:

from datetime import datetime, timedelta
from airflow.decorators import dag, task
from airflow.operators.python import get_current_context

@task.virtualenv(system_site_packages=True)
def test():
    data_interval_end = get_current_context()['data_interval_end']
    print(data_interval_end)

@dag(
    start_date=datetime(2023, 9, 6),
    schedule="10 * * * *",
)
def bug_test():
    test()

the_dag = bug_test()

And I got airflow.exceptions.AirflowException: Current context was requested but no context was found! Are you running within an airflow task?

I know that I can do it like this:

@task.virtualenv(system_site_packages=True)
def test(data_interval_end=None):
    print(data_interval_end)

That works fine if I only need the context directly inside that function, but where this actually popped up in practice was a DAG that used some shared lib functions that used get_current_context, which of course works fine when called from normal tasks but blew up when called from a virtualenv task.
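
One way to keep shared library functions usable from both regular and virtualenv tasks is to let them accept the needed value explicitly and fall back to get_current_context() only when nothing is passed. The following is a minimal sketch under that assumption; report_window_end is a hypothetical helper name, not something from Airflow or from the DAG above.

```python
from datetime import datetime, timezone
from typing import Optional


def report_window_end(data_interval_end: Optional[datetime] = None) -> str:
    """Hypothetical shared-library helper that needs data_interval_end."""
    if data_interval_end is None:
        # Fallback for regular tasks only. The import is done lazily so the
        # module can be imported without touching Airflow when the caller
        # supplies the value explicitly.
        from airflow.operators.python import get_current_context

        data_interval_end = get_current_context()["data_interval_end"]
    return data_interval_end.isoformat()


if __name__ == "__main__":
    # Explicit value, as a virtualenv task would pass it along.
    print(report_window_end(datetime(2023, 9, 6, 1, 10, tzinfo=timezone.utc)))
```

Inside a virtualenv task, the helper would be called as report_window_end(data_interval_end), with the value received as a task function parameter, so the fallback branch never runs there.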

What you think should happen instead

ideally, get_current_context() should work even if it's called from a virtualenv task.

How to reproduce

described above

Operating System

Linux? it's AWS MWAA

Versions of Apache Airflow Providers

No response

Deployment

Amazon (AWS) MWAA

Deployment details

No response

Anything else

this was also mentioned in this closed issue: #20974 (comment)

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

mziwisky added the area:core, kind:bug, and needs-triage labels Sep 7, 2023
potiuk removed the needs-triage label Sep 7, 2023
@potiuk
Member

potiuk commented Sep 7, 2023

I think it would be possible. It's just a matter of serializing the context and adding code in https://github.com/apache/airflow/blob/main/airflow/utils/python_virtualenv_script.jinja2 to save the context to _CONTEXT. It's not a super difficult task, so if anyone would like to pick it up, it is up for grabs. @mziwisky, maybe you would like to contribute it?
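
The idea of serializing the context and restoring it inside the separate interpreter can be illustrated without Airflow. The sketch below is not the actual template code; it assumes a JSON-serializable subset of the context, and CHILD_SCRIPT, _CURRENT_CONTEXT, and run_with_context are hypothetical names chosen for illustration. A real context also holds objects such as the task instance that do not serialize to JSON, which this sketch leaves out.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the generated script that runs in the other
# interpreter: it loads the serialized context before calling user code.
CHILD_SCRIPT = """
import json
import sys

_CURRENT_CONTEXT = None  # hypothetical module-level holder


def get_current_context():
    if _CURRENT_CONTEXT is None:
        raise RuntimeError("no context was found")
    return _CURRENT_CONTEXT


with open(sys.argv[1]) as fh:
    _CURRENT_CONTEXT = json.load(fh)

# Stand-in for the user's task function.
print(get_current_context()["data_interval_end"])
"""


def run_with_context(context: dict) -> str:
    """Write the context to disk, run the child script, return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        ctx_path = Path(tmp) / "context.json"
        script_path = Path(tmp) / "script.py"
        ctx_path.write_text(json.dumps(context))
        script_path.write_text(CHILD_SCRIPT)
        result = subprocess.run(
            [sys.executable, str(script_path), str(ctx_path)],
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_with_context({"data_interval_end": "2023-09-06T01:10:00+00:00"}))
```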

potiuk added the type:new-feature, kind:feature, and good first issue labels and removed the kind:bug and type:new-feature labels Sep 7, 2023
@mziwisky
Author

mziwisky commented Sep 7, 2023

thanks for the pointer, @potiuk. i'll see if i find some time to take a stab at it

@pedro-cf

pedro-cf commented Oct 31, 2023

Hello, are there any updates on this? Perhaps a workaround in the meantime? I'm trying to figure out if it's possible to access the context or the task instance inside a @task.virtualenv task.

@Taragolis
Contributor

Still good first issue 😉

@potiuk
Member

potiuk commented Oct 31, 2023

Hello, are there any updates on this? Perhaps a workaround in the meantime? I'm trying to figure out if it's possible to access the context or the task instance inside a @task.virtualenv task.

I believe it's easy to get any specific context property, same as any other parameter. You can pass whatever context variables you want via templating (https://airflow.apache.org/docs/apache-airflow/stable/howto/operator/python.html#templating). That would be the first thing I'd check if I were you.

Have you tried it and run into problems with it, I wonder?

@pedro-cf

pedro-cf commented Oct 31, 2023

Could you give me an example of a very simple DAG with one @task.virtualenv task, where you would access the context for the task instance and the params inside?

@potiuk
Member

potiuk commented Oct 31, 2023

Could you give me an example of a very simple DAG with one @task.virtualenv task, where you would access the context for the task instance and the params inside?

Look at the examples I linked to. I do not know if it will work and I have no such example at hand, but this direction should be good enough.

@pedro-cf

Is there any update on this?

@potiuk
Member

potiuk commented Feb 27, 2024

Is there any update on this?

Does not look like there are any. But you (and anyone else) can attempt to make a change. What could help is explaining the results of what you've done, to help those who might pick up the issue and work on it. I am sure that if you did that and provided your findings here, it would be easier for someone to decide what they could do. Without it, it might be difficult for them.

Providing the results of your checks would be very useful to those who decide to move things forward, and that would definitely help to implement it faster (as opposed to asking if there is any update, when evidently there is not).

Contrary to popular belief, asking whether there are any updates on an issue that does not have any update, without providing any more information or contributing findings, does not make issues get solved faster. At most it scares people away from picking up a task and contributing, because it gives the impression that you want to push them to fix it (in their free time) without any contribution from your side.

This is how open-source contributions work

@gGonz

gGonz commented Jun 13, 2024

Contrary to popular belief, asking whether there are any updates on an issue that does not have any update, without providing any more information or contributing findings, does not make issues get solved faster.

Can we at least have some workaround or example? The docs at the links above don't provide any hint of how to fix this issue.

@mziwisky
Author

The original post on this issue offers a workaround -- access context properties by making them parameters of your task function. e.g.

@task.virtualenv(system_site_packages=True)
def test(data_interval_end=None):
    print(data_interval_end)

@pedro-cf

pedro-cf commented Jun 13, 2024

@gGonz

Can we at least have some workaround or example? The docs at the links above don't provide any hint of how to fix this issue.

Even though it is not possible to access the full **context or ti, you can access most variables directly:
https://airflow.apache.org/docs/apache-airflow/stable/templates-ref.html

example:

@task.virtualenv(
    requirements=[
        "colormap==1.1.0"
    ]
)
def test(params=None, data_interval_end=None, run_id=None, logical_date=None):
    print(params)
    print(data_interval_end)
    print(run_id)
    print(logical_date)

If you need to pass additional static data to the task, you can also use the default_args of the DAG; e.g., you can pass the dag_id this way.

@gGonz

gGonz commented Jun 14, 2024

Thanks, in my case I also needed to add use_dill=True to the @task.virtualenv() decorator to make it work.

@pedro-cf

The original post on this issue offers a workaround -- access context properties by making them parameters of your task function. e.g.

@task.virtualenv(system_site_packages=True)
def test(data_interval_end=None):
    print(data_interval_end)

@mziwisky Is it possible to add custom ones, such as the dag_id or map_index?
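
A possible approach for values such as dag_id or map_index, following the templating suggestion earlier in the thread, is to pass them as templated string arguments. This is an untested sketch: it assumes that arguments passed to a @task.virtualenv function are rendered as Jinja templates before the call, and describe_run is a hypothetical name. By default, rendered values arrive as strings, so numeric ones need converting.

```python
def describe_run(dag_id: str, run_id: str, map_index: str = "-1") -> dict:
    """Hypothetical task body; every argument is a rendered template string."""
    return {
        "dag_id": dag_id,
        "run_id": run_id,
        # Rendered templates are text, so convert before using as a number.
        "map_index": int(map_index),
    }


# Assumed wiring inside a DAG definition (requires Airflow, not executed here):
#
#   from airflow.decorators import task
#
#   describe = task.virtualenv(system_site_packages=True)(describe_run)
#   describe(
#       dag_id="{{ dag.dag_id }}",
#       run_id="{{ run_id }}",
#       map_index="{{ ti.map_index }}",
#   )

if __name__ == "__main__":
    # Simulates the values the task would receive after rendering.
    print(describe_run("bug_test", "manual__2023-09-06", "3"))
```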
