You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We have an initial stat setup in src/statistics which has already been handy to get initial Thread modes running.
Since we have issues with Lin and STMThread-mode predictability and STM_domain.agree_prop_par_asym's usage of Semaphore.Binary not always triggering bugs, it would be welcome to expand on that stats setup to guide us towards stats-backed improvements.
Right now, one has to rewrite code to setup a stats experiment every time.
Ideally, we would have stats tests in place ready to run. Running them continuously on the CI would probably be overkill and a waste of CPU cycles - it would nevertheless be useful to be able to do so, to help make sure improvements are indeed improvements across across the various platforms and architectures.
It could furthermore back changes to the Lin/STM core such as #324 by being able to say, e.g., "with 95% confidence this change does not affect the bug finding rate".
An ideal setup would measure
the fraction of failed properties (no shrinking required) of some large number of iterations, e.g., 37/20000
across the usual ref and CList tests + a few more representable examples (Hashtbl?, ...) that also allocate
test both Lin and STM properties in Domain and Thread modes
We should be able to do so, without too much new code or duplication by factoring the 'test specifications' into a reusable module akin to src/neg_tests/stm_tests_spec_ref.ml that can then be used from both the "regular test drivers" and from a "stats driver".
The src/neg_tests/stats.ml file of #99 contains steps in this direction.
This is a prerequisite for making progress on #338 (and #47) AFAICS.
The text was updated successfully, but these errors were encountered:
We have an initial stat setup in
src/statistics
which has already been handy to get initialThread
modes running.Since we have issues with
Lin
andSTM
Thread
-mode predictability andSTM_domain.agree_prop_par_asym
's usage ofSemaphore.Binary
not always triggering bugs, it would be welcome to expand on that stats setup to guide us towards stats-backed improvements.Right now, one has to rewrite code to setup a stats experiment every time.
Ideally, we would have stats tests in place ready to run. Running them continuously on the CI would probably be overkill and a waste of CPU cycles - it would nevertheless be useful to be able to do so, to help make sure improvements are indeed improvements across across the various platforms and architectures.
It could furthermore back changes to the Lin/STM core such as #324 by being able to say, e.g., "with 95% confidence this change does not affect the bug finding rate".
An ideal setup would measure
ref
andCList
tests + a few more representable examples (Hashtbl
?, ...) that also allocateLin
andSTM
properties inDomain
andThread
modesWe should be able to do so, without too much new code or duplication by factoring the 'test specifications' into a reusable module akin to
src/neg_tests/stm_tests_spec_ref.ml
that can then be used from both the "regular test drivers" and from a "stats driver".The
src/neg_tests/stats.ml
file of #99 contains steps in this direction.This is a prerequisite for making progress on #338 (and #47) AFAICS.
The text was updated successfully, but these errors were encountered: