Missed optimization in i.div_euclid(power_of_two) #71096

tspiteri · 2020-04-13T14:09:30Z

If a signed integer is divided by a power of two using Euclidean division, the result is equivalent to an arithmetic right shift, but the compiler does not catch this. For example,

pub fn foo(a: i32) -> i32 {
    a.div_euclid(4)
}

produces the assembly code

example::foo:
        mov     eax, edi
        sar     eax, 31
        shr     eax, 30
        add     eax, edi
        mov     ecx, eax
        sar     ecx, 2
        and     eax, -4
        sub     edi, eax
        sar     edi, 31
        lea     eax, [rdi + rcx]
        ret

while it could simply be

example::foo:
        mov     eax, edi
        sar     eax, 2
        ret
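The claimed equivalence can be checked at the value level by comparing both forms over a range of inputs. This is only a sketch: the helper name `shifted` and the choice of sample values are assumptions for illustration, not part of the original report, and the check says nothing about the generated code.

```rust
/// Euclidean division by 4, as in the report.
pub fn foo(a: i32) -> i32 {
    a.div_euclid(4)
}

/// Hypothetical helper: the arithmetic right shift the optimizer could emit instead.
pub fn shifted(a: i32) -> i32 {
    a >> 2
}

fn main() {
    // Sample values around zero plus the extremes of i32.
    let samples = (-1000..=1000).chain([i32::MIN, i32::MIN + 1, i32::MAX - 1, i32::MAX]);
    for a in samples {
        assert_eq!(foo(a), shifted(a), "mismatch at a = {a}");
    }
    println!("div_euclid(4) matches >> 2 on all sampled inputs");
}
```

Both forms round toward negative infinity when the divisor is positive, which is why negative inputs such as `-7` give `-2` in either case.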

The LLVM IR is

define i32 @_ZN7example3foo17h28d3a223d67ddccdE(i32 %a) unnamed_addr #0 !dbg !6 {
start:
  %q.i = sdiv i32 %a, 4, !dbg !9
  %_11.i = srem i32 %a, 4, !dbg !16
  %_11.lobit.i = ashr i32 %_11.i, 31, !dbg !18
  %.0.i = add nsw i32 %_11.lobit.i, %q.i, !dbg !18
  ret i32 %.0.i, !dbg !19
}

attributes #0 = { norecurse nounwind nonlazybind readnone uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }

which I interpret to be something like

pub fn foo_ir(a: i32) -> i32 {
    let q = a / 4;
    let r = a % 4;
    let sign_r = if r < 0 { -1 } else { 0 };
    sign_r + q
}

(I did check that this function produces the same LLVM IR.)
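Beyond comparing the IR, the interpretation can also be checked at the value level against `div_euclid(4)`. The following is a minimal sketch; the sampled range is an arbitrary choice for illustration.

```rust
/// Rust rendering of the LLVM IR: truncating division, then subtract one
/// when the remainder is negative.
pub fn foo_ir(a: i32) -> i32 {
    let q = a / 4;
    let r = a % 4;
    let sign_r = if r < 0 { -1 } else { 0 };
    sign_r + q
}

fn main() {
    // Small values on both sides of zero cover every remainder case.
    for a in -16..=16 {
        assert_eq!(foo_ir(a), a.div_euclid(4), "mismatch at a = {a}");
    }
    // The extremes of i32 do not overflow because the divisor is 4, not -1.
    assert_eq!(foo_ir(i32::MIN), i32::MIN.div_euclid(4));
    assert_eq!(foo_ir(i32::MAX), i32::MAX.div_euclid(4));
}
```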

I don't know LLVM well enough to say whether there is some kind of flooring division intrinsic that could be used to optimize this, or whether this is a missed optimization on the LLVM side and LLVM should recognize that pattern as an arithmetic shift.

nagisa · 2020-04-14T01:17:36Z

I marked this as A-LLVM as it could also be an optimisation in LLVM, but I suspect that we may just need to write the function in a way that would make this more obvious to LLVM.

DianQK · 2024-03-29T04:47:21Z

rustc 1.70 has already fixed this issue: https://rust.godbolt.org/z/YxdfjMdxh.

@rustbot label E-needs-test -llvm-fixed-upstream

Add codegen tests for E-needs-test

close rust-lang#36010
close rust-lang#68667
close rust-lang#74938
close rust-lang#83585
close rust-lang#93036
close rust-lang#109328
close rust-lang#110797
close rust-lang#111508
close rust-lang#112509
close rust-lang#113757
close rust-lang#120440
close rust-lang#118392
close rust-lang#71096

r? nikic

nagisa added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Apr 14, 2020

tspiteri mentioned this issue Sep 14, 2022

Missed optimization for flooring division of signed by power of two llvm/llvm-project#57741

Closed

workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023

tesuji mentioned this issue May 20, 2024

Add codegen tests for E-needs-test #125347

Merged

bors closed this as completed in #125347 Jun 14, 2024

bors closed this as completed in 7ac6c2f Jun 14, 2024
