Missed optimization in i.div_euclid(power_of_two) #71096

Closed
tspiteri opened this issue Apr 13, 2020 · 2 comments · Fixed by #125347
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@tspiteri
Contributor

If a signed integer is divided by a power of two using Euclidean division, the result is equivalent to an arithmetic right shift, but the compiler does not catch this. For example

pub fn foo(a: i32) -> i32 {
    a.div_euclid(4)
}

produces the assembly code

example::foo:
        mov     eax, edi
        sar     eax, 31
        shr     eax, 30
        add     eax, edi
        mov     ecx, eax
        sar     ecx, 2
        and     eax, -4
        sub     edi, eax
        sar     edi, 31
        lea     eax, [rdi + rcx]
        ret

while it could simply be

example::foo:
        mov     eax, edi
        sar     eax, 2
        ret
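
The equivalence claimed above can be checked directly in Rust. This is only a minimal sketch: the helper names `div_euclid_4`, `shift_2` and `agree_on` are hypothetical and introduced for illustration, and the check relies on `>>` being an arithmetic shift for signed integer types.

```rust
/// Euclidean division by 4, as in `foo` from the issue.
pub fn div_euclid_4(a: i32) -> i32 {
    a.div_euclid(4)
}

/// Arithmetic right shift by 2 (`>>` sign-extends for signed integers).
pub fn shift_2(a: i32) -> i32 {
    a >> 2
}

/// Returns true if both functions agree on every value in `range`.
/// (Hypothetical helper, only used to spot-check the equivalence.)
pub fn agree_on(range: std::ops::RangeInclusive<i32>) -> bool {
    range.into_iter().all(|a| div_euclid_4(a) == shift_2(a))
}
```

Both functions round toward negative infinity when the divisor is a positive power of two, which is why they agree for negative inputs as well, for instance `-7`, where both give `-2`.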

The LLVM IR is

define i32 @_ZN7example3foo17h28d3a223d67ddccdE(i32 %a) unnamed_addr #0 !dbg !6 {
start:
  %q.i = sdiv i32 %a, 4, !dbg !9
  %_11.i = srem i32 %a, 4, !dbg !16
  %_11.lobit.i = ashr i32 %_11.i, 31, !dbg !18
  %.0.i = add nsw i32 %_11.lobit.i, %q.i, !dbg !18
  ret i32 %.0.i, !dbg !19
}

attributes #0 = { norecurse nounwind nonlazybind readnone uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }

which I interpret to be something like

pub fn foo_ir(a: i32) -> i32 {
    let q = a / 4;
    let r = a % 4;
    let sign_r = if r < 0 { -1 } else { 0 };
    sign_r + q
}

(I did check that this function produces the same LLVM IR.)
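
To illustrate why the `sign_r` correction is there: `/` and `%` truncate toward zero, so for a negative dividend with a nonzero remainder the quotient is one too high compared with Euclidean division. The sketch below restates `foo_ir` and compares it with `div_euclid`; `matches_div_euclid` is a hypothetical helper name, and the correction as written assumes a positive divisor.

```rust
/// Restatement of `foo_ir`: truncating division plus a sign correction.
pub fn foo_ir(a: i32) -> i32 {
    let q = a / 4; // truncates toward zero, e.g. -7 / 4 == -1
    let r = a % 4; // remainder takes the sign of `a`, e.g. -7 % 4 == -3
    let sign_r = if r < 0 { -1 } else { 0 };
    sign_r + q // subtract 1 only when the remainder is negative
}

/// Returns true if `foo_ir` agrees with `div_euclid(4)` for `a`.
/// (Hypothetical helper, only used to spot-check the interpretation.)
pub fn matches_div_euclid(a: i32) -> bool {
    foo_ir(a) == a.div_euclid(4)
}
```

For `a = -7` this gives `q = -1`, `r = -3`, `sign_r = -1`, so the result is `-2`, the same as `(-7_i32).div_euclid(4)`.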

I don't know LLVM well enough to say whether there is some kind of flooring division intrinsic that could be used to optimize this, or whether this is a missed optimization on the LLVM side and LLVM could recognize that pattern and turn it into an arithmetic shift.

@jonas-schievink jonas-schievink added A-codegen Area: Code generation C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 13, 2020
@nagisa nagisa added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Apr 14, 2020
@nagisa
Member

nagisa commented Apr 14, 2020

I marked this as A-LLVM as it could also be an optimisation in LLVM, but I suspect that we may just need to write the function in a way that would make this more obvious to LLVM.

@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
@DianQK
Member

DianQK commented Mar 29, 2024

rustc 1.70 has already fixed this issue: https://rust.godbolt.org/z/YxdfjMdxh.

@rustbot label E-needs-test -llvm-fixed-upstream

@rustbot rustbot added E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes and removed llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes labels Mar 29, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 9, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 10, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 11, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 13, 2024
@bors bors closed this as completed in 7ac6c2f Jun 14, 2024