Allow generating execution graphs from MPI runs #154
Conversation
Could you add a test to check that the MPI calls record what we expect them to?
// Update the result for the master message
sch.setFunctionResult(msg);
// Wait for the scheduler to set the result on the MPI non-master messages
SLEEP_MS(500);
Rather than sleeping here, can we instead call getFunctionResult for each message in turn? Although some sleeps still exist in the tests, we should try to avoid them wherever possible.
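A rough sketch of what I mean (assuming getFunctionResult(messageId, timeoutMs) blocks until the scheduler has set that message's result; nonMasterMsgs is a placeholder for however the test holds the MPI messages):

// Update the result for the master message
sch.setFunctionResult(msg);

// Block on each non-master message's result in turn, instead of
// sleeping for a fixed period and hoping the scheduler has caught up
for (const auto& mpiMsg : nonMasterMsgs) {
    sch.getFunctionResult(mpiMsg.id(), 1000);
}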
tests/utils/message_utils.cpp (outdated)
REQUIRE(msgA.mpiworldid() == msgB.mpiworldid());
REQUIRE(msgA.mpirank() == msgB.mpirank());
REQUIRE(msgA.mpiworldsize() == msgB.mpiworldsize());
}
I see the need for this method, but is it really just doing the same thing as checkMessageEquality, just not checking the ID? Could we instead add a bool flag to checkMessageEquality, something like checkIDs, which is true by default, then pass false when we need to do this sort of check? Then we could change the parameter on checkExecGraphEquality to also be checkIDs, defaulting to true.
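Roughly this (a sketch only; the signature and the three unconditional checks below stand in for the full list of field checks):

void checkMessageEquality(const faabric::Message& msgA,
                          const faabric::Message& msgB,
                          bool checkIDs = true)
{
    // IDs are skipped when the caller can't control them, e.g. for
    // messages generated internally by MPI
    if (checkIDs) {
        REQUIRE(msgA.id() == msgB.id());
    }

    REQUIRE(msgA.mpiworldid() == msgB.mpiworldid());
    REQUIRE(msgA.mpirank() == msgB.mpirank());
    REQUIRE(msgA.mpiworldsize() == msgB.mpiworldsize());
    // ... every other message field checked unconditionally
}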
The reason we check only these fields is that these are the ones MPI changes, and thus the ones we manually set in the test. Any other field we wanted to check we would, afaict, have to set manually on each message as well.
But if we don't touch any of the others, is there any harm in checking them? They will all be uninitialised and will therefore pass the checks, won't they? We'd just be comparing lots of blank strings, zeros, etc.
I.e. can we avoid having to add a new comparison function for every permutation of edits we might make to a message?
Not really; for instance, the executedHost or finishtimestamp fields are set by the scheduler when the chained functions finish (i.e. when the world is destroyed).
Will the executed host not be the current host, so we can check that? We deal with the finish timestamp issue in other tests by just overriding it: https://github.com/faasm/faabric/blob/master/tests/test/scheduler/test_scheduler.cpp#L663
The reason I'm against having more than one message comparison function is that it's easy to forget to add fields to them; if we have just one, it's easier to keep track.
It's a sort of blacklisting/whitelisting problem: rather than saying "we want to check that this subset of fields is the same", we're saying "we only expect these fields to have changed, and all others to be the same". The latter is stricter (and more verbose), but catches problems more often as a result.
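E.g. something like this in the MPI test, mirroring the linked scheduler test (a sketch; the protobuf setter names follow from the field names above but are assumptions):

// Copy the fields we can't predict from the actual message onto the
// expected one, then compare everything else with the single checker
expected.set_executedhost(actual.executedhost());
expected.set_finishtimestamp(actual.finishtimestamp());

checkMessageEquality(expected, actual);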
In this PR I add support for generating execution graphs from MPI runs.

Before, only chained calls would log subsequent invocations (i.e. calls to sch.callFunctions) to Redis. Logging all invocations to callFunctions would prevent needing more fixes like this, but I am not sure whether we want to do that or not. Plus, we'd have to pass the current message id to the scheduler, but that's a minor issue I guess.

I also add a test checking that MPI world creation generates the expected execution graph (i.e. the root process fanning out to size - 1 processes). Given that we don't have control over the messages that MPI uses to generate that graph, we can only imitate it in the expected graph for the test. In particular, some fields like msg.id() will differ between the actual graph and the expected one. Thus we only compare the fields that MpiWorld::create() sets.
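For reference, building the expected graph in the test looks roughly like this (a sketch; the world id, function names, and the ExecGraphNode shape are assumptions based on the rest of this thread):

// Root message fans out to (size - 1) chained messages, imitating
// what MpiWorld::create() does; only the MPI fields it sets (world
// id, rank, world size) are compared against the actual graph
int worldSize = 4;
int worldId = 123;

faabric::Message rootMsg = faabric::util::messageFactory("mpi", "hellompi");
rootMsg.set_mpiworldid(worldId);
rootMsg.set_mpirank(0);
rootMsg.set_mpiworldsize(worldSize);

std::vector<faabric::scheduler::ExecGraphNode> children;
for (int rank = 1; rank < worldSize; rank++) {
    faabric::Message chainedMsg =
      faabric::util::messageFactory("mpi", "hellompi");
    chainedMsg.set_mpiworldid(worldId);
    chainedMsg.set_mpirank(rank);
    chainedMsg.set_mpiworldsize(worldSize);
    children.push_back({ chainedMsg, {} });
}

faabric::scheduler::ExecGraphNode rootNode = { rootMsg, children };
faabric::scheduler::ExecGraph expected = { rootNode };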