Map component state inconsistent with condor_q #129
Comments
I have no idea how to reproduce this, and didn't notice anything in the event log that would obviously cause this. Moving to
This has cropped back up, but I still have no idea what's causing it. The number of "Done" HTCondor jobs agrees with the number of "Completed" HTMap components, but HTMap thinks more components are currently running than actually are. This only seems to happen with large maps with many thousands of components, but that may just be because the sample size is large enough to see it. I'm getting about 100 mismatched statuses in a 6000-component map.
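For anyone trying to quantify the mismatch, here is a minimal sketch of the comparison described above. It assumes you have already collected two per-component views by some means: the status HTMap reports and the numeric JobStatus that the queue reports. The function name `find_mismatches`, the dict layout, and the sample data are hypothetical and are not part of HTMap or the HTCondor bindings; only the JobStatus codes (1 = Idle, 2 = Running, 3 = Removed, 4 = Completed, 5 = Held) come from HTCondor itself.

```python
from collections import Counter

# HTCondor JobStatus codes mapped to readable names.
JOB_STATUS = {1: "IDLE", 2: "RUNNING", 3: "REMOVED", 4: "COMPLETED", 5: "HELD"}


def find_mismatches(htmap_statuses, condor_statuses):
    """Return {component: (htmap_status, condor_status)} where the two views disagree.

    Both arguments are hypothetical dicts keyed by component index.
    A component absent from the condor view is reported with None.
    """
    mismatches = {}
    for component, tracked in htmap_statuses.items():
        actual = condor_statuses.get(component)
        if tracked != actual:
            mismatches[component] = (tracked, actual)
    return mismatches


# Illustrative data only: component 2 is tracked as running but is actually idle.
htmap_view = {0: "COMPLETED", 1: "RUNNING", 2: "RUNNING", 3: "IDLE"}
condor_codes = {0: 4, 1: 2, 2: 1, 3: 1}
condor_view = {c: JOB_STATUS[code] for c, code in condor_codes.items()}

htmap_counts = Counter(htmap_view.values())
condor_counts = Counter(condor_view.values())
mismatches = find_mismatches(htmap_view, condor_view)

print("HTMap running:", htmap_counts["RUNNING"])
print("condor running:", condor_counts["RUNNING"])
print("mismatched components:", sorted(mismatches))
```

Comparing per-component rather than only by totals shows which components drifted, which helps when looking for the corresponding entries in the event log.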
This appears to be a bug in HTCondor: https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=6982
Adding wontfix because there's nothing to do on our end.
Presumably I'm missing some state transitions that exist in the event logs in this HTMap dir: https://www.dropbox.com/s/m2q748xuz944p1r/htmap-missed-events.tar.gz?dl=0
Vacating all of the maps and letting them restart brought them back in line with condor_q.
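To illustrate the "missed state transition" hypothesis, here is a small sketch of how replaying an event log yields a component's status, and how one unseen event leaves it stuck. The event names follow HTCondor job event types, but the `TRANSITIONS` table and the `replay` helper are assumptions made for illustration and are not HTMap's actual implementation.

```python
# Assumed mapping from job event name to the status it implies.
TRANSITIONS = {
    "SUBMIT": "IDLE",
    "EXECUTE": "RUNNING",
    "JOB_EVICTED": "IDLE",
    "JOB_HELD": "HELD",
    "JOB_RELEASED": "IDLE",
    "JOB_TERMINATED": "COMPLETED",
    "JOB_ABORTED": "REMOVED",
}


def replay(events):
    """Fold (component, event_name) pairs, in log order, into a status per component."""
    statuses = {}
    for component, event in events:
        new_status = TRANSITIONS.get(event)
        if new_status is not None:  # ignore events that do not change status
            statuses[component] = new_status
    return statuses


# Illustrative log: component 0 starts running and is then evicted.
full_log = [(0, "SUBMIT"), (0, "EXECUTE"), (0, "JOB_EVICTED")]
# The same log as seen by a reader that never observes the eviction.
partial_log = full_log[:-1]

print(replay(full_log))     # component 0 ends up IDLE
print(replay(partial_log))  # component 0 stays RUNNING
```

If this is what is happening, the symptom would match the report: completed counts agree, while some components remain marked as running until a later event (such as the restart after a vacate) overwrites the stale status.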