[SKIP] Update Benchmark Results (Azure F4s-v2-4vCPU-8GB-RAM) #403
Conversation
| C# (AOT) | 26.34 ms | 373.28 ms | 3.26 s | 3.65 s |
| C# (JIT) | 27.52 ms | 376.84 ms | 3.26 s | 3.67 s |
| F# (AOT) | 27.38 ms | 392.00 ms | 3.42 s | 3.84 s |
| F# (JIT) | 91.46 ms | 532.33 ms | 4.04 s | 4.67 s |
What happened to the first time here... is the vCPU really this unstable?
Are you referencing F#? It has consistently gotten that score over the last ~10 runs, so I consider that stable. The JIT just performs better on longer runs.
As I said before, we should use the minimum time as a proxy for performance, because the VM is too unstable.
If we assume a "real world" example, I can imagine a web service that gets those JSON files in requests and returns the processed top-5 JSON files in the response; you would probably be interested in the average performance of the service, or in the 95th-99th percentile of its latency (see the sketch below).
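A minimal Julia sketch of the metrics being debated here (minimum vs. mean vs. tail percentiles). All latency numbers are made up for illustration; nothing below is data from this benchmark:

```julia
using Statistics

# Hypothetical per-run latencies in ms (synthetic, for illustration only).
latencies_ms = [26.3, 27.1, 26.8, 31.4, 26.5, 90.2, 27.0, 26.9, 27.3, 26.6]

println("min  = ", minimum(latencies_ms), " ms")            # best case
println("mean = ", mean(latencies_ms), " ms")               # average performance
println("p95  = ", quantile(latencies_ms, 0.95), " ms")     # tail latency
println("p99  = ", quantile(latencies_ms, 0.99), " ms")
```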
That metric on a proper machine should still be closer to the minimum time than to this vCPU "average" time.
I mean, compare the Julia time to just a few hours ago on the exact same code base and machine: a stable 24 ms versus 31 ms in this PR.
Comparing what is happening now with the original: when comparing the two, I can see much higher variability in the recent implementations (e.g., before we only had ~1 ms of spread). P.S. We can easily put this to the test by actually reviving that (exact) PR and stopping all the aesthetic changes (that we assume will not impact the perf). P.P.S. I am aware of the recent issue from here, so maybe avoid precompilation and rely on the warm-up for now (the initial …)
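To illustrate the "rely on warm-up instead of precompilation" idea, here is a minimal Julia sketch. `timed_runs` and `process_posts` are hypothetical names, not part of this repo; the point is only that untimed warm-up iterations pay the JIT/compilation cost before any measurement starts:

```julia
# Hypothetical harness: `f` stands in for the real benchmark workload.
function timed_runs(f, args...; warmups = 1, runs = 10)
    for _ in 1:warmups
        f(args...)                           # untimed warm-up: pays compile cost
    end
    [@elapsed(f(args...)) for _ in 1:runs]   # per-run wall times in seconds
end

# Usage (illustrative): times = timed_runs(process_posts, posts)
```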
I understand that Julia has some problems with that, but consider these two cases. Case 1: a run comes out unusually fast; who knows what happened there… cache, CPU registers, operating system, file system, miracle. Case 2: a run comes out unusually slow; should this outlier be counted? It is like one client waiting too long for his response. We should definitely be aware of such behaviour, but probably not count it in the benchmark result. That's why maybe the mean is not the best metric, and we could use the median or mode instead (sketch below).
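A tiny Julia sketch of the point about outliers: a single slow run drags the mean up, while the median barely moves (numbers are synthetic):

```julia
using Statistics

# One slow outlier among otherwise stable runs (made-up values).
times_ms = [24.1, 24.3, 24.0, 24.2, 95.0]

println("mean   = ", mean(times_ms), " ms")    # pulled up to ~38 ms by the outlier
println("median = ", median(times_ms), " ms")  # stays at ~24 ms
```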
whatever crap happened can only make time slower :) https://blog.kevmod.com/2016/06/10/benchmarking-minimum-vs-average/
What? This is a local, CPU-intensive benchmark, without any network or I/O, so whatever happens (such as the vCPU being hectic) is not real. Fluctuations cannot make your program faster, only slower, so they are noise.
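The minimum-vs-average argument from the linked post can be sketched in a few lines of Julia: if noise can only add time, then observed = true time + non-negative noise, and the minimum over many runs approaches the true time while the mean stays biased upward. Purely synthetic demonstration:

```julia
using Statistics

true_ms  = 24.0
observed = true_ms .+ 5 .* abs.(randn(1_000))  # non-negative noise term

println("min  ≈ ", minimum(observed), " ms")   # close to 24.0
println("mean ≈ ", mean(observed), " ms")      # inflated by the noise
```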
This is not noise in this case, @Moelf. I think you agree that something is strange about Julia, and as such I don't understand why we should favor it by using the minimum. In any case, the VM is showing the same behaviour as my two computers, attested in JuliaLang/julia#51988; these are the results of the last run in Julia:
so using the minimum will not help in this case either. But it shouldn't be done anyway, since I don't think these oscillations are just noise: if you look at all the other languages, they don't oscillate as much as Julia, even on previous versions where this clear strange pattern for Julia hadn't yet emerged.
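One way to make the "Julia oscillates more than the other languages" claim concrete is to compare the relative spread of each language's runs, e.g. the coefficient of variation. A Julia sketch with made-up per-run timings (the values are illustrative, not measurements):

```julia
using Statistics

# Hypothetical per-run timings in ms; only the spread matters here.
runs = Dict(
    "Julia"    => [24.1, 31.2, 24.3, 30.8, 24.0],
    "C# (AOT)" => [26.3, 26.5, 26.4, 26.6, 26.3],
)

for (lang, t) in runs
    cv = std(t) / mean(t)                  # coefficient of variation
    println(rpad(lang, 10), " CV = ", round(cv; digits = 3))
end
```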
One thing to try is to see if these oscillations are only related to …
The answer is no... Incredibly, it happens less often that precompilation goes bad with …
Doing a fresh run right now.
Automated PR