Proposal: Lazy statepoint loading #238
Comments
We discussed this offline (I can't remember who was a part of the discussion). Our conclusion was that loading a job by id should not require validating that
I am ok with lazy loading and only validating the state point metadata when it is accessed.
@bdice I don't remember being a part of the conversation, but I agree with the approach.
@bdice and I did discuss and agree on this approach. I need to post my pseudo-prototype for #249 along with a class diagram so that we can progress on that front. IMO that serves as the cleanest path forward for implementing lazy loading by isolating the logic for synchronization and ensuring that a clear set of invariants is well-defined and well-tested for the different possible use-cases of nesting and buffering.
@vyasr I think (but am not sure) that my intended implementation of lazy loading can be clearly separated from your work on #249. I may need to work on #239 a little more to be sure, but I believe my implementation will rely more heavily on the Project class's caching and opening of jobs than on the specifics of how job state points / job documents are synchronized. You should feel free to go ahead and post your "pseudo-prototype" (😉) in the meantime.
Feature description
I am experimenting with refactoring statepoint loading to be lazy. Currently, `project.open_job()` is called on every job during a `flow run` command, which causes the statepoints to be loaded. However, the statepoint information isn't even used in many cases. This may be a case where signac is executing unnecessary I/O (which is costly for large projects like the ones I'm working with).

Proposed solution
The property `job.statepoint` currently just returns `job._statepoint`, meaning that the internal state is effectively identical to the publicly-accessible state. I am investigating making the property `job.statepoint` load lazily, with `job._statepoint = None` until the statepoint is requested.

I see a HUGE performance boost (from 30 seconds to <0.5 seconds for a simple operation on 3000 jobs) since the I/O is dropped dramatically, but some tests fail because job statepoint corruption (which raises a `JobsCorruptedError`) is not detected at the time of `project.open_job()`. I think that lazy loading is generally safe except in cases where the statepoint hash and job id don't match (corrupted jobs).

I just wanted to open this for discussion - I am not sure where we stand on approaches to handling corrupted data. Specifically, I want to know whether we think that validating `hash(statepoint) == id` (and thus the I/O cost of a statepoint load) is always necessary when opening a job by id.

Additional context
Maybe investigate `async` as an alternative...? Still expensive but maybe less so.
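
To make the lazy-loading idea concrete, here is a minimal, self-contained sketch of the pattern. It is not signac's actual code. `LazyJob`, `CorruptedError`, `init_job`, `calc_id` and the file name `statepoint.json` are all hypothetical stand-ins, and the md5-of-sorted-JSON hash is only an assumption about how an id could be derived from a statepoint. The point is just that opening by id does no I/O, and the `hash(statepoint) == id` check moves to the first access of `statepoint`.

```python
import hashlib
import json
import os


class CorruptedError(RuntimeError):
    """Hypothetical stand-in for a JobsCorruptedError-style exception."""


def calc_id(statepoint):
    # Assumption: the id is a hash of a canonical JSON form of the statepoint.
    blob = json.dumps(statepoint, sort_keys=True)
    return hashlib.md5(blob.encode()).hexdigest()


class LazyJob:
    """Toy job opened by id; the statepoint file is read on first access."""

    FN_STATEPOINT = "statepoint.json"  # hypothetical file name

    def __init__(self, workspace, job_id):
        self._workspace = workspace
        self._id = job_id
        self._statepoint = None  # nothing is read until it is requested
        self.reads = 0  # counts file reads, for demonstration only

    @property
    def id(self):
        return self._id

    @property
    def statepoint(self):
        if self._statepoint is None:
            path = os.path.join(self._workspace, self._id, self.FN_STATEPOINT)
            with open(path) as f:
                statepoint = json.load(f)
            self.reads += 1
            # Validation is deferred to this point instead of open time.
            if calc_id(statepoint) != self._id:
                raise CorruptedError(self._id)
            self._statepoint = statepoint
        return self._statepoint


def init_job(workspace, statepoint):
    """Write a statepoint file under its id and return that id."""
    job_id = calc_id(statepoint)
    job_dir = os.path.join(workspace, job_id)
    os.makedirs(job_dir, exist_ok=True)
    with open(os.path.join(job_dir, LazyJob.FN_STATEPOINT), "w") as f:
        json.dump(statepoint, f)
    return job_id
```

With this shape, constructing `LazyJob(workspace, job_id)` for thousands of jobs touches no files, and an operation that only needs `job.id` never pays for the read. The trade-off is the one raised above: a corrupted statepoint goes unnoticed until something actually reads `job.statepoint`.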