Backport to 1.9: Improve performance of global code by emitting fewer atomic barriers. #49411

Merged


maleadt
Member

@maleadt maleadt commented Apr 19, 2023

Manual backport of #47636. We should be careful with this change: #47636 originally hit a mysterious segfault on the i686 tester during LinearAlgebra/special, and that failure just disappeared at some point.

@maleadt
Member Author

maleadt commented Apr 19, 2023

As expected:

LinearAlgebra/special                             (8) |         failed at 2023-04-19T09:21:43.942
[236] signal (11.1): Segmentation fault
in expression starting at /cache/build/default-amdci5-7/julialang/julia-release-1-dot-9/julia-0772b1241e/share/julia/stdlib/v1.9/LinearAlgebra/test/special.jl:228
gc_try_setmark at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gc.c:1965 [inlined]
gc_mark_scan_obj8 at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gc.c:2215 [inlined]
gc_mark_loop at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gc.c:2515
_jl_gc_collect at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gc.c:3400
ijl_gc_collect at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gc.c:3707
maybe_collect at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gc.c:1078 [inlined]
jl_gc_pool_alloc_inner at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gc.c:1443 [inlined]
jl_gc_pool_alloc_noinline at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gc.c:1504 [inlined]
jl_gc_alloc_ at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/julia_internal.h:460 [inlined]
jl_gc_alloc at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gc.c:3754
_new_array_ at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/array.c:134 [inlined]
_new_array at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/array.c:198 [inlined]
ijl_alloc_array_2d at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/array.c:443
Array at ./boot.jl:479 [inlined]
Array at ./boot.jl:487 [inlined]
similar at ./abstractarray.jl:847 [inlined]
similar at ./abstractarray.jl:835 [inlined]
* at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/LinearAlgebra/src/qr.jl:683 [inlined]
* at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/LinearAlgebra/src/uniformscaling.jl:260 [inlined]
macro expansion at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/Test/src/Test.jl:478 [inlined]
macro expansion at /cache/build/default-amdci5-7/julialang/julia-release-1-dot-9/julia-0772b1241e/share/julia/stdlib/v1.9/LinearAlgebra/test/special.jl:233 [inlined]
macro expansion at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
top-level scope at /cache/build/default-amdci5-7/julialang/julia-release-1-dot-9/julia-0772b1241e/share/julia/stdlib/v1.9/LinearAlgebra/test/special.jl:229
jl_toplevel_eval_flex at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/toplevel.c:903
jl_eval_module_expr at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/toplevel.c:203 [inlined]
jl_toplevel_eval_flex at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/toplevel.c:715
jl_toplevel_eval_flex at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/toplevel.c:856
ijl_toplevel_eval at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/toplevel.c:921
ijl_toplevel_eval_in at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1866
_jl_invoke at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2739 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2940
_include at ./loading.jl:1926
include at ./Base.jl:457 [inlined]
macro expansion at /cache/build/default-amdci5-7/julialang/julia-release-1-dot-9/julia-0772b1241e/share/julia/test/testdefs.jl:29 [inlined]
macro expansion at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
macro expansion at /cache/build/default-amdci5-7/julialang/julia-release-1-dot-9/julia-0772b1241e/share/julia/test/testdefs.jl:23 [inlined]
macro expansion at ./timing.jl:501 [inlined]
#runtests#3 at /cache/build/default-amdci5-7/julialang/julia-release-1-dot-9/julia-0772b1241e/share/julia/test/testdefs.jl:21
runtests at /cache/build/default-amdci5-7/julialang/julia-release-1-dot-9/julia-0772b1241e/share/julia/test/testdefs.jl:5 [inlined]
runtests at /cache/build/default-amdci5-7/julialang/julia-release-1-dot-9/julia-0772b1241e/share/julia/test/testdefs.jl:5
unknown function (ip: 0xe4a8c868)
_jl_invoke at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/julia.h:1879 [inlined]
jl_f__call_latest at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/builtins.c:774
_jl_invoke at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2739 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/julia.h:1879 [inlined]
do_apply at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/builtins.c:730
#invokelatest#2 at ./essentials.jl:818
_jl_invoke at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/julia.h:1879 [inlined]
do_apply at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/builtins.c:730
invokelatest at ./essentials.jl:813
_jl_invoke at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/julia.h:1879 [inlined]
do_apply at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/builtins.c:730
#110 at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/Distributed/src/process_messages.jl:285
run_work_thunk at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/Distributed/src/process_messages.jl:70
macro expansion at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/usr/share/julia/stdlib/v1.9/Distributed/src/process_messages.jl:285 [inlined]
#109 at ./task.jl:514
unknown function (ip: 0xe4a87c80)
_jl_invoke at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/julia.h:1879 [inlined]
start_task at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/task.c:1092
Allocations: 229540207 (Pool: 229429149; Big: 111058); GC: 590

I haven't been able to catch this in rr yet.

@maleadt maleadt added the DO NOT MERGE Do not merge this PR! label Apr 19, 2023
@maleadt maleadt marked this pull request as draft April 19, 2023 10:34
@maleadt
Member Author

maleadt commented Apr 21, 2023

I can't reproduce this locally anymore, only with the build from CI, so I'm having CI produce a debug build for me. The bug also seems very sensitive to how the stack is initialized (e.g., piping output or setting env vars hides the bug).

EDIT: that doesn't work on CI either, because of #47938.

@maleadt
Member Author

maleadt commented Apr 21, 2023

Reproducer:

wget "https://buildkite.com/organizations/julialang/pipelines/julia-release-1-dot-9/builds/176/jobs/018798ba-cd8d-49c6-83b5-bf22c72497f7/artifacts/018798c9-f54c-4821-bb77-c6e37badf491"
tar -xvf 018798c9-f54c-4821-bb77-c6e37badf491
cat <<EOD >main.jl
using Test, LinearAlgebra 
begin
    for typ in 
        rand(0)
        b = rand(0,0)
        for pivot in (ColumnNorm(), NoPivot)
            qrb = qr(b, pivot)
            lmul!(qrb.Q, matri)
        end
    end
end
for pivot in (ColumnNorm(), NoPivot()), A in (rand(5, 3),)
    @test qr(A, pivot).Q
end
EOD
env -i ./julia-0772b1241e/bin/julia --check-bounds=yes --startup-file=no -t1 main.jl

cc @vtjnash

@maleadt maleadt force-pushed the tb/backport_global_atomic_barrier branch from 11b8dc1 to 85a869d Compare April 21, 2023 13:10
@oscardssmith
Member

Why are we backporting this? It seems very featury.

@KristofferC
Sponsor Member

It seems very featury.

What feature? It fixes the regression in #47561.

@oscardssmith
Member

I didn't realize the performance improvement was fixing a regression. Never mind.

@vtjnash
Sponsor Member

vtjnash commented Apr 21, 2023

Very cool repro: fast and reliable. It looks like the object was never initialized by the time we segfault:

LinearAlgebra.QRCompactWY{Float64, Array{Float64, 2}, Array{Float64, 2}}(factors=#<894>, T=#<null>)

The code here looks like:

  Core.PiNode(val=SSAValue(60), typ=LinearAlgebra.QRCompactWY{Float64, Array{Float64, 2}, Array{Float64, 2}}),
  Expr(:call, LinearAlgebra.getfield, SSAValue(63), :(:factors)),
  Expr(:call, LinearAlgebra.getfield, SSAValue(63), :(:T)),
  Expr(:new, LinearAlgebra.QRCompactWYQ{Float64, Array{Float64, 2}, Array{Float64, 2}}, SSAValue(64), SSAValue(65)),
  goto 74,

with an initial ssavalue of:

%60 = Core.PhiNode(edges=Array{Int32, (2,)}[51, 55], values=Array{Any, (2,)}[SSAValue(50), SSAValue(54)])::Union{LinearAlgebra.QRCompactWY{Float64, Array{Float64, 2}, Array{Float64, 2}}, LinearAlgebra.QRPivoted{Float64, Array{Float64, 2}, Array{Float64, 1}, Array{Int32, 1}}},

where that value came from (as sret):

%54 =  Expr(:invoke, #qr#85(Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, typeof(LinearAlgebra.qr), Array{Float64, 2}, LinearAlgebra.NoPivot) from #qr#85(Base.Pairs{Symbol, V, Tuple{Vararg{Symbol, N}}, NamedTuple{names, T}} where T<:Tuple{Vararg{Any, N}} where names where N where V, typeof(LinearAlgebra.qr), AbstractArray{T, 2}, Any...) where {T}, LinearAlgebra.:(#qr#85), quote Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(data=NamedTuple(), itr=()) end, LinearAlgebra.qr, SSAValue(4), quote LinearAlgebra.NoPivot() end),

This looks like it was emitted okay:

L54:                                              ; preds = %L52
  %93 = load atomic i32, i32* @jl_world_counter acquire, align 4
  store i32 %93, i32* %world_age, align 4
  call void @"j_#qr#85_53"([2 x {} addrspace(10)*]* noalias nocapture noundef sret([2 x {} addrspace(10)*]) %5, {} addrspace(10)* %12) #0, !dbg !97
  %94 = bitcast {}*** %6 to {}**
  %current_task21 = getelementptr inbounds {}*, {}** %94, i32 -18
  %95 = load atomic i32, i32* @jl_world_counter acquire, align 4
  store i32 %95, i32* %world_age, align 4
  %96 = call noalias nonnull {} addrspace(10)* @julia.gc_alloc_obj({}** %current_task21, i32 8, {} addrspace(10)* addrspacecast ({}* inttoptr (i32 -472480960 to {}*) to {} addrspace(10)*)) #7, !dbg !19
  %97 = bitcast {} addrspace(10)* %96 to i8 addrspace(10)*
  %98 = bitcast [2 x {} addrspace(10)*]* %5 to i8*
  %99 = load atomic i32, i32* @jl_world_counter acquire, align 4
  store i32 %99, i32* %world_age, align 4
  call void @llvm.memcpy.p10i8.p0i8.i64(i8 addrspace(10)* align 4 %97, i8* %98, i64 8, i1 false), !dbg !19, !tbaa !125, !alias.scope !126, !noalias !127
  br label %L58, !dbg !84

but the post-optimization result doesn't have stores to initialize the object:

define nonnull {} addrspace(10)* @"japi1_top-level scope_51"({} addrspace(10)* %0, {} addrspace(10)** noalias nocapture noundef readonly %1, i32 %2) #0 !dbg !4 {
top:
  %3 = alloca [3 x {} addrspace(10)*], align 4
  %gcframe79 = alloca [13 x {} addrspace(10)*], align 16
  %gcframe79.sub = getelementptr inbounds [13 x {} addrspace(10)*], [13 x {} addrspace(10)*]* %gcframe79, i32 0, i32 0
  %.sub = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %3, i32 0, i32 0
  %4 = bitcast [13 x {} addrspace(10)*]* %gcframe79 to i8*
  call void @llvm.memset.p0i8.i32(i8* noundef nonnull align 16 dereferenceable(52) %4, i8 0, i32 52, i1 false), !tbaa !7
  %5 = getelementptr inbounds [13 x {} addrspace(10)*], [13 x {} addrspace(10)*]* %gcframe79, i32 0, i32 4
  %6 = bitcast {} addrspace(10)** %5 to [3 x {} addrspace(10)*]*
  %7 = getelementptr inbounds [13 x {} addrspace(10)*], [13 x {} addrspace(10)*]* %gcframe79, i32 0, i32 2
  %8 = bitcast {} addrspace(10)** %7 to [2 x {} addrspace(10)*]*
  %9 = alloca [184 x i8], align 16
  %10 = alloca {} addrspace(10)**, align 4
  %.sub80 = getelementptr inbounds [184 x i8], [184 x i8]* %9, i32 0, i32 0
  store volatile {} addrspace(10)** %1, {} addrspace(10)*** %10, align 4
  %pgcstack_i8 = call i8* asm "movl %gs:0, $0;\0Aaddl $$-4, $0", "=r,~{dirflag},~{fpsr},~{flags}"() #12
  %ppgcstack = bitcast i8* %pgcstack_i8 to {}****
  %pgcstack = load {}***, {}**** %ppgcstack, align 4
  %11 = bitcast [13 x {} addrspace(10)*]* %gcframe79 to i32*
  store i32 44, i32* %11, align 16, !tbaa !7
  %12 = getelementptr inbounds [13 x {} addrspace(10)*], [13 x {} addrspace(10)*]* %gcframe79, i32 0, i32 1
  %13 = bitcast {} addrspace(10)** %12 to {}***
  %14 = load {}**, {}*** %pgcstack, align 4
  store {}** %14, {}*** %13, align 4, !tbaa !7
  %15 = bitcast {}*** %pgcstack to {} addrspace(10)***
  store {} addrspace(10)** %gcframe79.sub, {} addrspace(10)*** %15, align 4
  %world_age25 = getelementptr inbounds {}**, {}*** %pgcstack, i32 1
  %world_age = bitcast {}*** %world_age25 to i32*
  %16 = load i32, i32* %world_age, align 4, !tbaa !7, !alias.scope !11, !noalias !14
  %17 = load atomic i32, i32* @jl_world_counter acquire, align 4
  store i32 %17, i32* %world_age, align 4
  %current_task1827 = getelementptr inbounds {}**, {}*** %pgcstack, i32 -18
  %18 = bitcast {}*** %current_task1827 to {}*
  %19 = addrspacecast {}* %18 to {} addrspace(10)*
  %20 = bitcast {} addrspace(10)** %7 to i64*
  %21 = getelementptr inbounds [13 x {} addrspace(10)*], [13 x {} addrspace(10)*]* %gcframe79, i32 0, i32 5
  %22 = getelementptr inbounds [13 x {} addrspace(10)*], [13 x {} addrspace(10)*]* %gcframe79, i32 0, i32 6
  %23 = getelementptr inbounds [13 x {} addrspace(10)*], [13 x {} addrspace(10)*]* %gcframe79, i32 0, i32 8
  store {} addrspace(10)* %19, {} addrspace(10)** %23, align 16
  %ptls_field81 = getelementptr inbounds {}**, {}*** %pgcstack, i32 2, !dbg !19
  %24 = bitcast {}*** %ptls_field81 to i8**, !dbg !19
  %ptls_load8283 = load i8*, i8** %24, align 4, !dbg !19, !tbaa !7
  %25 = call noalias nonnull {} addrspace(10)* @ijl_gc_pool_alloc(i8* %ptls_load8283, i32 716, i32 12) #7, !dbg !19
  %26 = bitcast {} addrspace(10)* %25 to i32 addrspace(10)*, !dbg !19
  %27 = getelementptr inbounds i32, i32 addrspace(10)* %26, i32 -1, !dbg !19
  store atomic i32 -472480960, i32 addrspace(10)* %27 unordered, align 4, !dbg !19, !tbaa !20
  %28 = bitcast {} addrspace(10)* %25 to i64 addrspace(10)*
  %29 = bitcast {} addrspace(10)* %25 to {} addrspace(10)* addrspace(10)*
  %30 = addrspacecast {} addrspace(10)* addrspace(10)* %29 to {} addrspace(10)* addrspace(11)*
  %31 = bitcast {} addrspace(10)* %25 to [2 x {} addrspace(10)*] addrspace(10)*
  %32 = addrspacecast [2 x {} addrspace(10)*] addrspace(10)* %31 to [2 x {} addrspace(10)*] addrspace(11)*
  %33 = getelementptr inbounds [2 x {} addrspace(10)*], [2 x {} addrspace(10)*] addrspace(11)* %32, i32 0, i32 1
  br label %L2, !dbg !19

L2:                                               ; preds = %L124, %top
  %tindex_phi = phi i8 [ 1, %top ], [ %172, %L124 ]
  %value_phi = phi i32 [ 2, %top ], [ %161, %L124 ]
  %34 = load atomic i32, i32* @jl_world_counter acquire, align 4
  store i32 %34, i32* %world_age, align 4
  %35 = getelementptr inbounds [13 x {} addrspace(10)*], [13 x {} addrspace(10)*]* %gcframe79, i32 0, i32 7
  store {} addrspace(10)* %25, {} addrspace(10)** %35, align 4

@vtjnash
Sponsor Member

vtjnash commented Apr 21, 2023

It looks like refstore is incorrectly computed, so it doesn't block julia-loop-licm of this value, resulting in an uninitialized value in the julia GC frame (@pchintalapudi)
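
The hazard described in this comment can be modeled in a few lines of C. This is a toy sketch under stated assumptions, not the Julia runtime or the actual LLVM pass: `toy_obj`, `toy_alloc`, `toy_init`, `toy_collect`, and `toy_run_loop` are hypothetical names, and the poison constant merely stands in for whatever recycled memory a pool allocator might hand back. It shows why moving an allocation and its GC-frame store in front of a loop, while the initialization stays inside the loop, lets a collection observe garbage fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: every name here is hypothetical, not a Julia runtime symbol. */
#define TOY_POISON ((uintptr_t)0xDEADBEEF) /* stands for recycled, non-zeroed memory */

typedef struct {
    uintptr_t fields[2]; /* two pointer-sized fields, like factors and T */
} toy_obj;

int toy_live_a, toy_live_b; /* stand-ins for two live heap objects */

/* Allocation hands back memory that still holds old garbage. */
void toy_alloc(toy_obj *o) {
    o->fields[0] = TOY_POISON;
    o->fields[1] = TOY_POISON;
}

/* Initialization: the role played by the memcpy from the sret buffer. */
void toy_init(toy_obj *o) {
    o->fields[0] = (uintptr_t)(void *)&toy_live_a;
    o->fields[1] = (uintptr_t)(void *)&toy_live_b;
}

/* Mark-phase stand-in: returns 1 if the rooted object has a field that is
 * neither null nor a known live object, i.e. something the GC must not follow. */
int toy_collect(const toy_obj *root) {
    if (root == NULL)
        return 0;
    for (size_t i = 0; i < 2; i++) {
        uintptr_t f = root->fields[i];
        bool ok = f == 0
               || f == (uintptr_t)(void *)&toy_live_a
               || f == (uintptr_t)(void *)&toy_live_b;
        if (!ok)
            return 1;
    }
    return 0;
}

/* Runs the loop and counts collections that saw a garbage field.
 * hoisted == true models the allocation and its GC-frame store being moved
 * in front of the loop while the initialization stays inside it. */
int toy_run_loop(bool hoisted, int iterations) {
    assert(iterations >= 0);
    toy_obj storage = {{0, 0}};
    const toy_obj *root = NULL; /* the GC frame slot */
    int bad_scans = 0;
    if (hoisted) {
        toy_alloc(&storage);
        root = &storage;
    }
    for (int i = 0; i < iterations; i++) {
        /* A call inside the loop may allocate and therefore collect. */
        bad_scans += toy_collect(root);
        if (!hoisted)
            toy_alloc(&storage);
        toy_init(&storage);
        root = &storage;
        bad_scans += toy_collect(root);
    }
    return bad_scans;
}
```

With three iterations, `toy_run_loop(false, 3)` reports no bad scans, while `toy_run_loop(true, 3)` reports one: the collection that runs before the first initialization.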

Base automatically changed from backports-release-1.9 to release-1.9 April 25, 2023 14:28
@maleadt maleadt force-pushed the tb/backport_global_atomic_barrier branch from 85a869d to 8d40445 Compare April 28, 2023 07:07
@maleadt
Member Author

maleadt commented Apr 28, 2023

OK, let's try reverting #43057 then.

@maleadt
Member Author

maleadt commented Apr 28, 2023

The 32-bit error that happens here is different:

fatal: error thrown and no exception handler available.
ErrorException("fatal error allocating signal stack: mmap: Cannot allocate memory")
ijl_errorf at /cache/build/default-amdci4-7/julialang/julia-release-1-dot-9/src/rtutils.c:77
alloc_sigstack at /cache/build/default-amdci4-7/julialang/julia-release-1-dot-9/src/signals-unix.c:658 [inlined]
jl_install_thread_signal_handler at /cache/build/default-amdci4-7/julialang/julia-release-1-dot-9/src/signals-unix.c:665
jl_init_root_task at /cache/build/default-amdci4-7/julialang/julia-release-1-dot-9/src/task.c:1570
ijl_adopt_thread at /cache/build/default-amdci4-7/julialang/julia-release-1-dot-9/src/threading.c:409
unknown function (ip: 0xd7b6014a)
jl_work_wrapper at /cache/build/default-amdci4-7/julialang/julia-release-1-dot-9/src/jl_uv.c:980
uv__queue_work at /workspace/srcdir/libuv/src/threadpool.c:305
worker at /workspace/srcdir/libuv/src/threadpool.c:122
start_thread at /lib/i386-linux-gnu/libpthread.so.0 (unknown line)
clone at /lib/i386-linux-gnu/libc.so.6 (unknown line)

It also seems to happen on master, so maybe the GC corruption that happened before is fixed?

@gbaraldi
Member

It seems to reliably fail in a ccall test, both here and in other places.

@KristofferC
Sponsor Member

(and on 1.9) AFAIU this started happening after a kernel upgrade on the buildbots.

@pchintalapudi
Member

It looks like refstore is incorrectly computed, so it doesn't block julia-loop-licm of this value, resulting in an uninitialized value in the julia GC frame

So here's the full function, just before we run escape analysis on %46: https://gist.github.com/pchintalapudi/e73cff1ff78af737076fa9745de048e5

%49 is a load of an i64 from %11, which then gets stored to %46 as initialization. However, %11 is actually a bitcast of our GC frame to an i64*, so %49 is actually a {} addrspace(10)* masquerading as an i64. I personally think something in LLVM saw an opportunity to do store-load forwarding, and took the opportunity to generate an untracked pointer from a tracked one.
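
To make the "tracked pointer masquerading as an i64" point concrete, here is a minimal sketch in C. It assumes a hypothetical mini-IR in which each value carries the type the optimizer sees alongside its run-time bits. `toy_value`, `toy_forward_as_i64`, and `toy_is_ref_store` are invented for illustration and are not LLVM or Julia APIs. The idea is only that a type-driven check (in the spirit of a refstore flag) stops seeing a GC reference once the same bits are re-read through an integer-typed view.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mini-IR, not LLVM's or Julia's data structures. */
typedef enum { TOY_TRACKED_PTR, TOY_INT64 } toy_kind;

typedef struct {
    toy_kind kind; /* the type the optimizer sees */
    uint64_t bits; /* the value actually held at run time */
} toy_value;

/* Forwarding a stored value to a load through an i64-typed view of the slot:
 * the bits are preserved, the "tracked pointer" type is not. */
toy_value toy_forward_as_i64(toy_value stored) {
    toy_value v = { TOY_INT64, stored.bits };
    return v;
}

/* A type-driven check: does storing this value put a GC reference
 * into the destination object? */
bool toy_is_ref_store(toy_value v) {
    return v.kind == TOY_TRACKED_PTR;
}

/* Small self-check tying the pieces together. */
void toy_demo(void) {
    toy_value p = { TOY_TRACKED_PTR, 0x1000 }; /* 0x1000 is an arbitrary address */
    toy_value q = toy_forward_as_i64(p);
    assert(q.bits == p.bits);       /* same pointer at run time */
    assert(toy_is_ref_store(p));    /* seen as a reference store */
    assert(!toy_is_ref_store(q));   /* the forwarded copy is not */
}
```

In this model the forwarded value still points at the same object, yet any analysis keyed on the value's type treats the store as plain integer data.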

@vtjnash
Sponsor Member

vtjnash commented Apr 28, 2023

It looked like we did a memcpy to initialize the object, since we knew it wasn't visible to the application. But the hoisting pass is moving the allocation without keeping the initialization.

@pchintalapudi
Member

We could wipe objects when we hoist them so that initialization can no longer pose an issue there, but alloc-opt also relies on the ability to distinguish between a pointer field and a non-pointer field. If by the time alloc-opt sees the IR it's just a store of an i64, then alloc-opt could do the wrong thing and e.g. remove an allocation without rerooting the object.
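
The "wipe on hoist" idea can be sketched as follows. This is an illustrative model under assumptions, not the pass itself: `toy_alloc`, `toy_alloc_hoisted_wiped`, and `toy_scan_safe` are hypothetical names, and the poison constant stands in for stale pool memory. It only shows that zeroing a hoisted object makes an early scan see null fields. It does not address the second concern above, since the later store of an i64 still carries no pointer type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: hypothetical names, not Julia runtime symbols. */
#define TOY_POISON ((uintptr_t)0xDEADBEEF) /* stale contents of recycled memory */

typedef struct {
    uintptr_t fields[2];
} toy_obj;

/* Allocation returns memory with leftover garbage in it. */
void toy_alloc(toy_obj *o) {
    o->fields[0] = TOY_POISON;
    o->fields[1] = TOY_POISON;
}

/* Hoisting variant: zero the object immediately after allocating it, so it
 * can be rooted before its real initialization runs. */
void toy_alloc_hoisted_wiped(toy_obj *o) {
    toy_alloc(o);
    memset(o, 0, sizeof *o);
}

/* A scan of a not-yet-initialized object is safe only if every field is
 * null, because null means "nothing to mark". */
bool toy_scan_safe(const toy_obj *o) {
    for (size_t i = 0; i < 2; i++) {
        if (o->fields[i] != 0)
            return false;
    }
    return true;
}

/* Small self-check comparing the two allocation paths. */
void toy_wipe_demo(void) {
    toy_obj raw, wiped;
    toy_alloc(&raw);
    toy_alloc_hoisted_wiped(&wiped);
    assert(!toy_scan_safe(&raw));
    assert(toy_scan_safe(&wiped));
}
```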

@vtjnash
Sponsor Member

vtjnash commented Apr 28, 2023

Yeah, codegen may be most significantly at fault there for not respecting the memory type, and treating ptr(addrspace(10) ptr) as ptr(i8), allowing misoptimization

@maleadt maleadt force-pushed the tb/backport_global_atomic_barrier branch from 31415a3 to 0be97bb Compare May 5, 2023 08:12
@maleadt
Member Author

maleadt commented May 5, 2023

I backported #49584 here too, which IIUC should fix the 32-bit segfault.

@maleadt maleadt marked this pull request as ready for review May 5, 2023 19:30
@maleadt
Member Author

maleadt commented May 5, 2023

CI looks good!

@KristofferC I'll let you merge this in release-1.9; it backports both #49584 and #47636 (both of which had merge conflicts).

@DilumAluthge
Member

@KristofferC Should this target the backports-release-1.9 branch instead of the release-1.9 branch?

@KristofferC KristofferC changed the base branch from release-1.9 to backports-release-1.9 May 8, 2023 07:19
@KristofferC KristofferC merged commit 052931a into backports-release-1.9 May 8, 2023
@KristofferC KristofferC deleted the tb/backport_global_atomic_barrier branch May 8, 2023 07:20
@DilumAluthge DilumAluthge removed the DO NOT MERGE Do not merge this PR! label Jun 24, 2023