Map component state inconsistent with condor_q #129

Open
JoshKarpel opened this issue Mar 24, 2019 · 4 comments
Labels: bug (Something isn't working), wontfix (This will not be worked on)

Comments

@JoshKarpel
Contributor

Presumably I'm missing some state transitions that are present in the event logs in this HTMap directory: https://www.dropbox.com/s/m2q748xuz944p1r/htmap-missed-events.tar.gz?dl=0

Vacating all of the maps and letting them restart brought them back in line with condor_q.
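
To illustrate the kind of failure I have in mind, here is a minimal sketch of how a single missed event leaves a component's tracked state stale. The transition table, the `replay` helper, and the sample events are hypothetical and simplified. The event names mirror common HTCondor job events, but this is not HTMap's actual state-tracking code.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical, simplified mapping from event name to resulting status.
# This table is an assumption for illustration only.
TRANSITIONS = {
    "SUBMIT": "IDLE",
    "EXECUTE": "RUNNING",
    "JOB_EVICTED": "IDLE",
    "JOB_HELD": "HELD",
    "JOB_RELEASED": "IDLE",
    "JOB_TERMINATED": "COMPLETED",
}


def replay(events: Iterable[Tuple[int, str]]) -> Dict[int, str]:
    """Replay (component, event) pairs in order and return the final status of each component."""
    statuses: Dict[int, str] = {}
    for component, event in events:
        new_status = TRANSITIONS.get(event)
        if new_status is not None:  # ignore events that do not change status
            statuses[component] = new_status
    return statuses


# Made-up event stream: component 0 starts running and is then evicted.
events = [
    (0, "SUBMIT"),
    (0, "EXECUTE"),
    (0, "JOB_EVICTED"),
    (1, "SUBMIT"),
    (1, "EXECUTE"),
]

full = replay(events)
# Same stream with the eviction dropped, as if the tracker never saw it.
missed = replay([e for e in events if e != (0, "JOB_EVICTED")])

print(full)    # {0: 'IDLE', 1: 'RUNNING'}
print(missed)  # {0: 'RUNNING', 1: 'RUNNING'}
```

With the eviction dropped, component 0 stays "RUNNING" until a later event for it arrives, which would be consistent with a forced restart bringing things back in line.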

JoshKarpel added the bug label Mar 24, 2019
JoshKarpel self-assigned this Mar 24, 2019
JoshKarpel added the wontfix label Mar 28, 2019
@JoshKarpel
Contributor Author

JoshKarpel commented Mar 28, 2019

I have no idea how to reproduce this, and I didn't notice anything in the event log that would obviously cause it. Moving to wontfix for now.

@JoshKarpel
Contributor Author

JoshKarpel commented Apr 8, 2019

This has cropped back up, but I still have no idea what's causing it. The number of "Done" HTCondor jobs agrees with the number of "Completed" HTMap components, but HTMap thinks more components are currently running than actually are.

So far this only seems to happen with large maps with many thousands of components, but that may just be because the sample size is large enough to see it. I'm getting about 100 mismatched statuses in a 6000-component map (roughly 1.7%).
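
For anyone who wants to check their own maps, the comparison I'm describing can be sketched roughly as below. It assumes both views have already been reduced to a plain mapping of component number to status string. The `compare_statuses` helper and the sample data are hypothetical, and collecting the two mappings from HTMap and from condor_q is not shown.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def compare_statuses(
    tracked: Dict[int, str],
    queue: Dict[int, str],
) -> Tuple[Counter, Counter, Dict[int, Tuple[str, Optional[str]]]]:
    """Return per-status counts for both views, plus the components whose statuses disagree."""
    mismatched = {
        component: (status, queue.get(component))
        for component, status in tracked.items()
        if status != queue.get(component)
    }
    return Counter(tracked.values()), Counter(queue.values()), mismatched


# Made-up data: the tracker still thinks component 0 is running.
tracked = {0: "RUNNING", 1: "RUNNING", 2: "COMPLETED"}
queue = {0: "IDLE", 1: "RUNNING", 2: "COMPLETED"}

counts_tracked, counts_queue, mismatched = compare_statuses(tracked, queue)
print(mismatched)  # {0: ('RUNNING', 'IDLE')}
print(counts_tracked["RUNNING"], counts_queue["RUNNING"])  # 2 1
```

In this toy case the "COMPLETED" counts agree while the "RUNNING" counts do not, which is the same shape as the mismatch above.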

JoshKarpel removed the wontfix label Apr 9, 2019
@JoshKarpel
Contributor Author

This appears to be a bug in HTCondor: https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=6982

JoshKarpel changed the title from "Job state inconsistent with condor_q" to "Map component state inconsistent with condor_q" Apr 9, 2019
JoshKarpel added the wontfix label Jul 17, 2019
@JoshKarpel
Contributor Author

Adding wontfix because there's nothing to do on our end.
