
Improve mul!, AddedOperator, and update_coefficients! to remove memory allocations #249

Open
wants to merge 12 commits into base: master
Conversation

albertomercurio
Contributor

This PR fixes #246 and the remaining allocations of #247.

This is related to this issue in the SparseArrays.jl package.

@albertomercurio
Contributor Author

Ok, I'm done.

@albertomercurio
Contributor Author

I've also noticed that the time-dependent case is still not fixed. As an example:

T = ComplexF64
N = 500
M = 10
# A1 = MatrixOperator(rand(T, N, N))
# A2 = MatrixOperator(rand(T, N, N))
A1 = MatrixOperator(sprand(T, N, N, 5 / N))
A2 = MatrixOperator(sprand(T, N, N, 5 / N))
A3 = MatrixOperator(sprand(T, N, N, 5 / N))

coeff1(a, u, p, t) = sin(p.ω * t)
coeff2(a, u, p, t) = cos(p.ω * t)
coeff3(a, u, p, t) = sin(p.ω * t) * cos(p.ω * t)

c1 = ScalarOperator(rand(T), coeff1)
c2 = ScalarOperator(rand(T), coeff2)
c3 = ScalarOperator(rand(T), coeff3)

H = c1 * A1 + c2 * A2 + c3 * A3

u = rand(T, N);
du = similar(u);
p = (ω = 0.1,);
t = 0.1;

@benchmark $H($du, $u, $p, $t)
BenchmarkTools.Trial: 10000 samples with 7 evaluations.
 Range (min … max):  4.964 μs …  10.530 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     5.045 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.079 μs ± 211.087 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

       ▂▆▇██▇▆▅▃▂▁           ▁▁▁ ▁       ▁▁▁▁                 ▂
  ▄▁▄▃▁█████████████▆▆▅▆▆▆▆▆█████████▆▇▇██████▇▅▇▆▅▆▄▄▄▅▅▃▄▁▄ █
  4.96 μs      Histogram: log(frequency) by time      5.48 μs <

 Memory estimate: 752 bytes, allocs estimate: 8.

And the number of allocations scales with the number of operators in the sum. I actually don't understand what the problem is.

Anyhow, the time-independent code in #246 is fixed.

@albertomercurio
Contributor Author

Ok, the time-dependent case should also be fixed, at least for the case I studied. I basically followed the discussion in this thread, which recommended using generated functions when dealing with tuples of different types.

Indeed, the memory allocations are absent now.

@albertomercurio albertomercurio changed the title Improve mul! to remove memory allocations Improve mul!, AddedOperator, and update_coefficients! to remove memory allocations Oct 15, 2024
iszero(L.λ) && return lmul!(false, v)
a = convert(Number, L.λ)
mul!(v, L.L, u, a, false)
end

function LinearAlgebra.mul!(v::AbstractVecOrMat,
@inline function LinearAlgebra.mul!(v::AbstractVecOrMat,
Member

why is this needed?

Contributor Author

It is related to this issue in SparseArrays.jl. We currently need it to avoid extra allocations.
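To make the workaround concrete, here is a hedged sketch of the pattern in the diff above (the variables are illustrative, not the exact PR code): converting the scalar to a concretely-typed `Number` and passing a literal `false` beta keeps the sparse 5-arg `mul!` on its non-allocating path.

```julia
using SparseArrays, LinearAlgebra

# Illustrative sketch (names are assumptions): a concretely-typed scalar
# `a`, mirroring convert(Number, L.λ), plus the Bool `false` beta lets
# the sparse 5-arg mul! avoid the allocating generic fallback.
A = sprand(ComplexF64, 100, 100, 0.05)
u = rand(ComplexF64, 100)
v = similar(u)
λ = 2.0 + 0.0im
a = convert(Number, λ)   # concrete scalar, no abstract boxing
mul!(v, A, u, a, false)  # v = a * A * u; beta = false never reads v
```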

Comment on lines -459 to +511
for op in L.ops
iszero(op) && continue
mul!(v, op, u, α, true)
ops_types = L.parameters[2].parameters
N = length(ops_types)
Member

is this actually required?

Contributor Author

The AddedOperator contains a Tuple of objects of different types. A plain for loop over it leads to runtime dispatch and extra allocations. Following this thread, the recommendation was to use @generated functions; that way we work directly with the individual types in the Tuple and unroll the loop.
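A minimal sketch of that unrolling technique (function and variable names are hypothetical, not the exact PR code): the loop over the ops Tuple is expanded at compile time, so each `mul!` call dispatches statically.

```julia
using LinearAlgebra

# @generated: the body runs at compile time with access to the argument
# *types*, and returns the expression that will be compiled.
@generated function unrolled_mul!(v, ops::Tuple, u, α)
    N = length(ops.parameters)  # number of operators, known from the Tuple type
    calls = [:(mul!(v, ops[$i], u, α, true)) for i in 1:N]
    quote
        $(calls...)   # one statically-dispatched mul! per operator
        v
    end
end
```

Each `ops[$i]` index is a compile-time constant, so indexing the heterogeneous Tuple is type-stable and never falls back to runtime dispatch.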

Member

A recursive implementation is probably cleaner and would get more compilation reuse?

Contributor Author

How exactly?
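For reference, one way such a recursive formulation could look (a sketch under assumptions, not the maintainers' exact proposal): peel off the first operator, apply it, and recurse on `Base.tail`. Each recursion level specializes on a shorter Tuple type, so operators sharing a common tail can reuse compiled code.

```julia
using LinearAlgebra

# Base case: empty Tuple, nothing left to accumulate.
apply_ops!(v, ::Tuple{}, u, α) = v

# Recursive case: apply the first operator, then recurse on the rest.
function apply_ops!(v, ops::Tuple, u, α)
    mul!(v, first(ops), u, α, true)   # v += α * ops[1] * u
    apply_ops!(v, Base.tail(ops), u, α)
end
```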

@ChrisRackauckas
Member

Needs tests.

@albertomercurio
Contributor Author

I didn't add any new functionality, so the current tests should be fine. This was just to remove extra allocations and improve performance. But if you want to add some tests, do you have something specific in mind?

@ChrisRackauckas
Member

A test for zero allocations so it doesn't regress

@albertomercurio
Contributor Author

Ok, I added the tests. It seems that @allocations returns 1 inside the test instead of 0, unlike BenchmarkTools.jl; outside of the test, @allocations returns 0.

Anyway, the number of allocations on the current master branch is on the order of hundreds.

@ChrisRackauckas
Member

It's allocating a return value because of the global scope, IIUC. Wrap it in a function that then returns nothing.

@albertomercurio
Contributor Author

Still the same. Outside the tests it returns 0 allocations with both @allocations and BenchmarkTools.jl; inside the test it returns 1.

@ChrisRackauckas
Member

@oscardssmith rfc

@oscardssmith

This happens because apply_op! is a dispatch, so it's allocating for the method call. It does seem to me that this might be a bug in the @allocations macro, but I'm not entirely sure. The easiest way to get around it is to do something like

test_apply_noalloc(H, du, u, p, t) = @test (@allocations apply_op!(H, du, u, p, t)) == 0

since that will introduce a function barrier.
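A self-contained illustration of that barrier effect (the toy `apply!` stands in for the PR's apply_op!; the names here are assumptions): with the measurement inside its own function, argument types are concrete and the measured call is statically dispatched and already compiled when @allocations runs.

```julia
using LinearAlgebra, Test

# Toy stand-in for the PR's apply_op! (an assumption for illustration).
apply!(du, A, u) = (mul!(du, A, u); nothing)

# Function barrier: inside this function the argument types are concrete,
# so the inner call itself incurs no dispatch allocation.
function test_apply_noalloc(du, A, u)
    apply!(du, A, u)  # warm up so compilation isn't counted
    @test (@allocations apply!(du, A, u)) == 0
end
```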

Successfully merging this pull request may close these issues.

AddedOperator of Sparse matrices allocates memory
3 participants