Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates from long-running tests #19

Merged
merged 4 commits into from
Feb 19, 2024
Merged

Updates from long-running tests #19

merged 4 commits into from
Feb 19, 2024

Conversation

jrbourbeau
Copy link
Member

Folks can ignore this for now. I'm just checkpointing some local work from yesterday.

@jrbourbeau jrbourbeau changed the title [DNM] Updates Updates from long-running tests Feb 19, 2024
@jrbourbeau jrbourbeau marked this pull request as ready for review February 19, 2024 15:41
Copy link
Member Author

@jrbourbeau jrbourbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, this PR contains updates from running the full pipeline several times (some long-running overnight). Some of the changes here are temporary, but I'm going to merge them for now as I'd like to show them during a call today. @mrocklin feel free to leave post-merge comments and I can address in a follow-up

@jrbourbeau jrbourbeau merged commit bcfbb4d into main Feb 19, 2024
1 check passed

lock_generate = FileLock("generate.lock")
lock_json_to_parquet = FileLock("json_to_parquet.lock")
lock_compact = FileLock("compact.lock")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess these work even in the cloud setting because this is all on the Prefect VM, right? But they have to be file locks rather than threading.Locks because prefect will create many processes?

Copy link
Member Author

@jrbourbeau jrbourbeau Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct

Also, I was using Prefect's built in concurrency limiting functionality https://docs.prefect.io/latest/guides/global-concurrency-limits/, which I like the idea of more than using filelocks. But we would sporadically run into deadlocks which is why I switched back to filelocks.

Also opened this issue PrefectHQ/prefect#12015 as a possible alternative

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably what we actually want here is something like ...

if lock.acquired:
    return

with lock:
    ...

Because we don't want things to pile up.

Not saying that we should actually do this, just mentioning it. Short term we could also make our own limiting decorator. My guess is that this makes more sense / is alot easier with prefect serve than it does with other Prefect deployment strategies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants