-
Notifications
You must be signed in to change notification settings - Fork 396
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix computation of cache-line size on Power #6515
Conversation
d538b0a
to
df47457
Compare
port/unix/omrcpu.c
Outdated
:"r"((void *) &buf[512])); | ||
memset(buf, 255, sizeof(buf)); | ||
|
||
__asm__ __volatile__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect the __volatile__
isn't needed. The port library copy doesn't use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah nm, I see you just added it here. I guess it doesn't hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was used in both places in j9memclr.cpp
and won't hurt. In fact, it may save us from a future compiler optimization.
jenkins build all |
The changes LGTM, but can the cache line size detection be factored out into a single place? |
I'll look into whether there's a clear path to that. |
df47457
to
18814e6
Compare
Updated as suggested, reusing |
The description of the functional fix (i.e. the memory clobber) seems to have gone missing from the commit message. It's probably more important now than it was before, since there are more changes |
The compiler needs to be told that our uses of dcbz clobber memory, otherwise the optimizer can reasonably assume in getCacheLineSize(), that each element of buf still has the value (255) assigned by memset(). * reuse getCacheLineSize() in omrcpu_startup() * use uint32_t consistently for cache line size Signed-off-by: Keith W. Campbell <[email protected]>
18814e6
to
8b3bf5b
Compare
That description was never in the commit message itself (only in the description here); it's now also in the commit message. |
Jenkins build all |
This allows the compiler to do the widening operation once, rather than repeatedly in loops, fixing a performance regression introduced in eclipse#6515. Signed-off-by: Keith W. Campbell <[email protected]>
This allows the compiler to do the widening operation once, rather than repeatedly in loops, fixing a performance regression introduced in eclipse#6515. Signed-off-by: Keith W. Campbell <[email protected]>
We need to inform the compiler that our uses of
dcbz
clobber memory, otherwise the optimizer can reasonably assume, e.g. inomrcpu_startup(),
thatbuf[0] == 255
at the beginning of the for loop.