Investigate unnecessary vector register moves #33975
Is v8 possibly a register that needs to be saved due to the method call?
That could be the case: even though it's a non-volatile register (to be precise, only its low 64 bits are non-volatile), we still need to preserve its high 64 bits if they are actually used somewhere.
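To make the point about the high 64 bits concrete, here is a minimal Python sketch of the save/restore pattern around a call. It assumes the AArch64 calling convention in which a callee preserves only the low 64 bits of v8-v15. The helper names (`make_regs`, `call_clobbering`, `call_with_upper_save`) are hypothetical and only model the behaviour; they are not JIT code.

```python
# Each 128-bit register is modelled as [d[0], d[1]] (low half, high half).
CALLEE_SAVED_LOW = {f"v{i}" for i in range(8, 16)}  # only d[0] survives a call


def make_regs():
    return {f"v{i}": [0, 0] for i in range(32)}


def call_clobbering(regs, junk=0xDEADBEEF):
    """Model a worst-case callee: it keeps d[0] of v8-v15 and trashes the rest."""
    for name, lanes in regs.items():
        if name in CALLEE_SAVED_LOW:
            lanes[1] = junk
        else:
            lanes[0] = lanes[1] = junk


def mov_lane(regs, dst, dst_lane, src, src_lane):
    """Model `mov dst.d[dst_lane], src.d[src_lane]`."""
    regs[dst][dst_lane] = regs[src][src_lane]


def call_with_upper_save(regs, live, temp):
    """Park the upper half of `live` in d[0] of callee-saved `temp` across the call."""
    mov_lane(regs, temp, 0, live, 1)   # e.g. mov v9.d[0], v8.d[1]
    call_clobbering(regs)              # e.g. bl <callee>
    mov_lane(regs, live, 1, temp, 0)   # e.g. mov v8.d[1], v9.d[0]


# Usage: with the save/restore pair the full 128-bit value survives the call.
regs = make_regs()
regs["v8"] = [0x1111, 0x2222]
call_with_upper_save(regs, live="v8", temp="v9")
assert regs["v8"] == [0x1111, 0x2222]
```

The sketch parks the upper half in another callee-saved register because, in this model, only the low halves of v8-v15 are guaranteed to survive the call.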
That's right, but in the second example the value is used before it's restored, so if the parent did clobber the top bits the result would be wrong. So this looks like a dataflow bug. Also I think the RA should avoid using
or if the JIT knows which registers are used in the function call, it can safely skip the spill/restore entirely. I don't know if it has that information for LRA?
Also, what's the general preference here: stack space or performance? If stack space is a concern, then using the
I agree this could be an option, but not sure if this can be done easily. @CarolEidt probably knows.
I believe generally such information is not available to the JIT due to top-down compilation approach.
I didn't know that spilling could be cheaper than register move. @BruceForstall @briansull Do you see any issues with JIT doing this instead of ASIMD MOV for volatile SIMD&FP registers around a function call? |
So you're saying it would be better to save & restore the full vector, rather than doing the partial save & restore? The use of |
Actually I take that back, this is a bit nuanced. While the stores can end up being cheaper as soon as you have more than one register to save, the loads are a lot more expensive, so until you spill a LOT, the
But how does the JIT handle this case under high register pressure? I'm slightly worried that it'll spill another register so it can do the saving of the top part.
I believe that what it will do is spill it to a volatile register (since those are always available), and then store that upper half to memory. It might be reasonable to add support for spilling to memory if the callee-save registers are all occupied. |
Ah ok, that sounds reasonable. Then I think if we can avoid using I suspect that once all the helper functions have some naive AArch64 code behind them that having a non-vector call in between vector code would be much rarer... unless they're vector function calls but that would ideally use the vector APCs anyway. so the only bug here is the use before restore of |
@echesakovMSFT @TamarChristinaArm - unless I'm looking at the wrong thing, it seems that we are no longer generating the same code for these |
@CarolEidt We might not see it now because |
Yeah, what @echesakovMSFT suggested should work. We could probably increase the chances of hitting this exact issue by increasing register pressure.
I can take a look at this |
I was trying to find instances of the "use-before-restore" issue and I couldn't find them, even in the original PR #33749. The code that I quoted in the first message of the thread was copied from Tamar's comment #33749 (comment) and I believe corresponds to

6E084509 mov v9.d[0], v8.d[1]
6E08454B mov v11.d[0], v10.d[1]
97FFF096 bl System.Runtime.Intrinsics.Vector128:<Create>g__SoftwareFallback|23_0(int):System.Runtime.Intrinsics.Vector128`1[Int32]
4E003810 zip1 v16.16b, v0.16b, v0.16b
4E103A10 zip1 v16.16b, v16.16b, v16.16b
4E103A11 zip1 v17.16b, v16.16b, v16.16b
6E180528 mov v8.d[1], v9.d[0]
4E281E31 and v17.16b, v17.16b, v8.16b
6E18056A mov v10.d[1], v11.d[0]

Upper 64 bits of
Unless I misunderstood something, I don't think such an issue exists in the JIT.
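The "use-before-restore" question for the listing above can also be checked mechanically. The following is a rough Python sketch, not a JIT tool: the function `find_use_before_restore` and its regular expressions are hypothetical, and it assumes a single call, that saves have the form `mov vA.d[0], vB.d[1]`, that restores have the form `mov vB.d[1], vA.d[0]`, and that the first vector operand of every other instruction is its destination.

```python
import re

LISTING = [
    "6E084509 mov v9.d[0], v8.d[1]",
    "6E08454B mov v11.d[0], v10.d[1]",
    "97FFF096 bl System.Runtime.Intrinsics.Vector128:<Create>g__SoftwareFallback|23_0(int):System.Runtime.Intrinsics.Vector128`1[Int32]",
    "4E003810 zip1 v16.16b, v0.16b, v0.16b",
    "4E103A10 zip1 v16.16b, v16.16b, v16.16b",
    "4E103A11 zip1 v17.16b, v16.16b, v16.16b",
    "6E180528 mov v8.d[1], v9.d[0]",
    "4E281E31 and v17.16b, v17.16b, v8.16b",
    "6E18056A mov v10.d[1], v11.d[0]",
]

SAVE = re.compile(r"mov\s+(v\d+)\.d\[0\],\s*(v\d+)\.d\[1\]")
RESTORE = re.compile(r"mov\s+(v\d+)\.d\[1\],\s*(v\d+)\.d\[0\]")
CALL = re.compile(r"\bbl\b")
VECTOR_OPERAND = re.compile(r"\b(v\d+)\.\d+[bhsd]\b")


def find_use_before_restore(lines):
    """Return registers read after the call while their upper half is still unrestored."""
    saved = set()      # registers whose upper half was saved before the call
    pending = set()    # saved registers not yet restored after the call
    seen_call = False
    violations = []
    for line in lines:
        if CALL.search(line):
            seen_call = True
            pending = set(saved)
            continue
        if not seen_call:
            m = SAVE.search(line)
            if m:
                saved.add(m.group(2))
            continue
        m = RESTORE.search(line)
        if m:
            pending.discard(m.group(1))
            continue
        operands = VECTOR_OPERAND.findall(line)
        for src in operands[1:]:  # skip the destination operand
            if src in pending:
                violations.append(src)
    return violations


violations = find_use_before_restore(LISTING)
```

On this listing the restore of v8 precedes the `and` that reads it, so the checker reports no violation; swapping those two lines makes it flag v8.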
@echesakovMSFT hmm, no, I think I agree. I don't know how that bit changed during copy and paste, but clearly it's not the same sequence as in the original comment you made. I usually copy them locally to make inline comments and something must have flipped. So this looks like a bogus report. Sorry for the noise, I'll double-check these from now on to make sure.
@echesakovMSFT - should this now be closed? |
@CarolEidt I was looking into another aspect of the issue - whether the mov-s are necessary or not, so I kept the issue open. |
That sounds like a great idea! |
Closing since the original issue was confirmed to be not an issue. |
Copied from @TamarChristinaArm reply in #33749 (comment)
category:cq
theme:hardware-intrinsics
skill-level:intermediate
cost:medium