Weakly reference binary operation codomains #14058

Closed
robertwb opened this issue Feb 5, 2013 · 91 comments

Comments

@robertwb
Contributor

robertwb commented Feb 5, 2013

R(1) + S(1) caches a strong reference to both R and S, which we'd like to avoid.

See discussion at https://groups.google.com/forum/?fromgroups=#!topic/sage-devel/ozhIHIwQ4WA
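
For illustration, a pure-Python model of the leak (hypothetical Parent and Map stand-ins, not Sage's classes): a cache whose values hold morphisms strongly pins the morphisms' domains, so even weak cache keys never expire.

import gc
import weakref

class Parent(object):          # stand-in for a Sage parent
    pass

class Map(object):             # stand-in for a coercion morphism
    def __init__(self, domain, codomain):
        self.domain = domain
        self.codomain = codomain

cache = {}                     # models the strong values of the coercion cache
R, S, C = Parent(), Parent(), Parent()
cache[(id(R), id(S))] = (Map(R, C), Map(S, C))   # strong refs to both maps

r = weakref.ref(R)
del R
_ = gc.collect()
print(r() is not None)         # True: the cached map's .domain pins R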

CC: @nbruin @simon-king-jena @jpflori

Component: memleak

Author: Robert Bradshaw, Nils Bruin

Branch: 13cb2a7

Reviewer: Simon King, Frédéric Chapoton, Jean-Pierre Flori, Sébastien Labbé

Issue created by migration from https://trac.sagemath.org/ticket/14058

@robertwb robertwb added this to the sage-5.11 milestone Feb 5, 2013
@robertwb
Contributor Author

robertwb commented Feb 5, 2013

Attachment: 14058-weak-binop.patch.gz

@simon-king-jena
Member

comment:2

Is it possible to add an example that shows that parents become collectable that previously weren't?

@nbruin
Contributor

nbruin commented Feb 5, 2013

Doctest proposal

@nbruin
Contributor

nbruin commented Feb 5, 2013

comment:3

Attachment: trac_14058-doctest.patch.gz

I like the idea. Can we get some data on speed regressions due to this? In principle the double lookup might be noticeable. I don't mean speed regressions due to parents getting deleted -- those are good to know but need other fixes. I mean when the parents are still around.

Attached doctests may need #12313 to work, in which case this ticket should depend on that (because then that's necessary to get benefit from this ticket).

@nbruin
Contributor

nbruin commented Feb 5, 2013

comment:4

Sadly, the performance hit is quite noticeable.

a=ZZ(1)
b=QQ(1)
c=ZZ['x'](1)
d=b+c

def test(x,y):
  for i in xrange(10^6):
    _=x+y

Within a run, timings of test seem fairly consistent. However, between different compiles of the same code I'm observing fluctuations of up to 0.10 s. Lucky/unlucky memory layout?
Anyway, with patch:

sage: %time test(a,b)
CPU times: user 2.15 s, sys: 0.00 s, total: 2.15 s
Wall time: 2.16 s
sage: %time test(a,c)
CPU times: user 2.25 s, sys: 0.00 s, total: 2.25 s
Wall time: 2.26 s
sage: %time test(b,c)
CPU times: user 16.83 s, sys: 0.00 s, total: 16.83 s

Without patch:

sage: %time test(a,b)
CPU times: user 1.54 s, sys: 0.00 s, total: 1.54 s
Wall time: 1.55 s
sage: %time test(a,c)
CPU times: user 1.64 s, sys: 0.00 s, total: 1.64 s
Wall time: 1.64 s
sage: %time test(b,c)
CPU times: user 15.68 s, sys: 0.00 s, total: 15.68 s
Wall time: 15.76 s

For test(a,a), test(b,b), test(c,c) there doesn't seem to be a difference (luckily!)

The penalties seem to be about 0.6s, 0.6s, 1.2s, which is rather consistent with 1,1,2 extra lookups.

This suggests it may be worth trying to store a weakref to the morphism instead, since that can probably be dereferenced faster than a coercion lookup on the codomain. Logically this should be equivalent: we're just storing a weakref to the object we're interested in, rather than instructions for where to go and look up that object.

Caveat:

For stored coercions of the type
(R,S, ...): (None,<map S->R>)
or
(R,S, ...): (<map R->S>,None)
the codomain is part of the key, so deletion of the codomain (which is a prerequisite for the morphism being deleted and hence the weakref dying) implies removal of the weak key entry in TripleDict.

However, for stored coercions of the type
(R,S,...): (<map R->C>,<map S->C>)
we could have C and the weakrefs to the morphisms die without that triggering removal from the TripleDict. So we could still encounter dead weakrefs (though far fewer than one might initially think!). It's not really much different from how attachment: 14058-weak-binop.patch handles dead weakrefs.
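
A sketch, with a hypothetical helper (not the patch's actual code), of how a lookup can tolerate such dead weakrefs: dereference, and on failure fall back to rediscovering the coercion and overwriting the stale entry. The real cache stores a pair of maps, one of which may be None; this shows a single slot only.

import weakref

def lookup_coercion(cache, key, discover):
    wr = cache.get(key)
    if wr is not None:
        m = wr()                     # dereference the stored weakref
        if m is not None:
            return m                 # fast path: morphism still alive
    m = discover(key)                # slow path: rediscover the coercion
    cache[key] = weakref.ref(m)      # overwrite the dead/stale entry
    return m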

Major problem for immediately applying this idea:

sage: M=QQ.coerce_map_from(ZZ)
sage: import weakref
sage: weakref.ref(M)
TypeError: cannot create weak reference to 'sage.rings.rational.Z_to_Q' object
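
The TypeError arises because extension types, like classes using __slots__, are only weak-referenceable if they provide a __weakref__ slot; in Cython that means declaring something like cdef object __weakref__ on the morphism base class. A pure-Python illustration of the mechanism:

import weakref

class NoSlot(object):
    __slots__ = ()                 # no __weakref__ slot

class WithSlot(object):
    __slots__ = ('__weakref__',)   # opts in to weak references

try:
    weakref.ref(NoSlot())
except TypeError as e:
    print(e)                       # cannot create weak reference to ...
print(weakref.ref(WithSlot()))     # works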

@robertwb
Contributor Author

robertwb commented Feb 5, 2013

comment:5

Yeah, I had exactly the same idea. I'll post an updated patch.

@robertwb
Contributor Author

robertwb commented Feb 5, 2013

comment:6

Attachment: 14058-weak-binop-morphisms.patch.gz

Apply 14058-weak-binop-morphisms.patch and trac_14058-doctest.patch

@nbruin
Contributor

nbruin commented Feb 5, 2013

comment:7

Great! With this new patch I cannot distinguish the timing differences from the noise one normally gets already, so I'd think this is quite acceptable.

@nbruin
Contributor

nbruin commented Feb 5, 2013

comment:8

patchbot seems unhappy about the doctest. Probably #12313 is indeed necessary to make parents reclaimable.

@nbruin
Contributor

nbruin commented Feb 5, 2013

Dependencies: #12313

@simon-king-jena
Member

comment:9

Stated in the ticket description which patches are to be applied. Admittedly, I didn't yet test whether the doctest patch applies after Robert's new patch, but the test patch is needed, I think.

Apply 14058-weak-binop-morphisms.patch and trac_14058-doctest.patch

@simon-king-jena
Member

comment:11

The patches apply, and at least the new test passes. So this, together with #12313, does fix another leak.

@simon-king-jena
Member

comment:13

In vanilla Sage, the coercion model stores (coercion) morphisms in its cache (a TripleDict, hence with weak keys), and the only change introduced by your patch is to store weak references to the (coercion) morphisms instead. Did I understand this correctly?

In vanilla Sage, the coercion model kept a strong reference to the morphism, which kept strong references to its domain and codomain, which were thus not reclaimable; so the item of the TripleDict had eternal life, in spite of the weak keys. Correct?

With the patch, the coercion model keeps a weak reference to the coercion morphism; hence the strong references from the morphism to domain and codomain don't matter, and the item of the TripleDict may disappear.

But how is the morphism kept alive? The coercion model only has a weak reference to it. Do I understand correctly that the morphism involved in the binary operation is, at the same time, stored in the coerce dict of its codomain, and that is why it does not immediately die?

In other words, do I understand that the layout is as follows? We have parents A and B, and want to apply some binary operator, with a result in the parent C (C may coincide with either A or B). And we have maps r_A from A to C and r_B from B to C.

Both r_A and r_B are stored with a strong reference in a cache located in C, with weak keys A and B. At the same time, they are stored with a weak reference in the coercion model, again with weak keys A and B. r_A has strong references to C and to A, r_B has strong references to C and to B.

What do we want? Do we want that keeping C alive makes A and B survive? Or do we want that keeping both A and B alive makes C survive?

If the user has a strong reference to C, then C has strong references to r_A and r_B (in its coerce cache), and r_A/r_B strongly refer to A/B. Hence the existence of C keeps A and B alive. Since C is a complicated object, it is more likely to be mortal, so this is probably not much of a problem.

If the user has a strong reference to both A and B, then C keeps a strong reference to both r_A and r_B, and the coercion model keeps a weak reference to r_A and r_B. Moreover, r_A and r_B strongly refer to C.

However, isn't it just a reference cycle between C and r_A/r_B? It would not prevent C from becoming collectable, right?

I doubt that we want that. It is not desirable that, when adding elements of ZZ['x'] and QQ, the ring QQ['x'] is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).

Or did I misunderstand something?
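
For what it's worth, a pure reference cycle is indeed no obstacle: once Python's cyclic collector runs, the whole cycle is reclaimed. A minimal illustration with hypothetical stand-in classes:

import gc
import weakref

class Codomain(object):
    pass

class Morphism(object):
    def __init__(self, codomain):
        self.codomain = codomain   # strong reference back to the codomain

C = Codomain()
C.coerce_cache = [Morphism(C)]     # C -> morphism -> C: a pure cycle
r = weakref.ref(C)
del C                              # refcounting alone cannot free the cycle
_ = gc.collect()                   # ... but the cyclic collector can
print(r() is None)                 # True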

@simon-king-jena
Member

An example that shows that the current patch needs work

@nbruin
Contributor

nbruin commented Feb 6, 2013

comment:14

Attachment: testcoercion.py.gz

I doubt that we want that. It is not desirable that, when adding elements of ZZ['x'] and QQ, the ring QQ['x'] is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).

In my opinion, that is exactly what we want, or at least what we have to settle for. If a person wants to work efficiently with QQ['x'], he should ensure his elements already live there: adding elements with the same parents is orders of magnitude faster than relying on coercion. A step closer is to ensure that your parent remains in memory. I suspect that will usually be the case, because a user who regularly creates elements somewhere probably keeps a reference to the relevant parent around anyway.

In this scenario:

P = [ZZ['x%d'%i] for i in [1..1000]]
F = [GF(p) for p in prime_range(1000)]
for Zxi in P:
    for k in F:
        a = Zxi.0   # was P.0, but P is a list; we want the current ring's generator
        b = k(1)    # was F(1), but F is a list; we want the current field
        c = a + b

If we keep the GF(p)[xi] alive because at some point we constructed an element in it from parents that are still around, we've just created quadratic memory requirements where linear should be enough.

I don't think that we can afford to assume that just because a user has combined two parents he/she will do so again in the future. We may be able to afford doing so if it happened recently, but that really is an optimisation, and currently I don't see a hook where we can do this efficiently.

A limited length buffer that gets updated whenever a certain component sees a parent go by might work. In Coercion discovery perhaps? That's already slow and it would exactly protect these automatically generated parents. Logically it's hard to defend but it may be OK in practice.

Another possibility is just UniqueRepresentation_c.__cinit__. Only when we're making a lot of parents do we care about losing them again ...

The problem with artificially keeping transient objects alive longer is that it defeats the heuristics of Python's generational garbage collection, on which we depend rather heavily to get rid of cyclic garbage. Testing would be required to see whether this is indeed a problem, and it will be very application-dependent.
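
One cheap realization of the limited-length buffer idea above, sketched with an arbitrary size and hook point (nothing like this exists in Sage):

from collections import deque

_recent_parents = deque(maxlen=64)   # the size is an arbitrary tuning choice

def saw_parent(P):
    # To be called wherever coercion discovery sees a parent go by.
    # The deque keeps a strong reference until 64 newer parents have
    # displaced P, giving recently used parents a grace period.
    _recent_parents.append(P)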

@simon-king-jena
Member

comment:15

Before replying to Nils' post, here is an example showing that the existence of A and B does not ensure the survival of the pushout of A and B. First, attach attachment: testcoercion.py in a Sage session. Then:

sage: %attach /home/simon/SAGE/work/memleak/testcoercion.py
sage: OA = A()
sage: OB = B()
sage: cm = sage.structure.element.get_coercion_model()
sage: OC = cm.explain(OA,OB)
Coercion on left operand via
    Conversion map:
      From: A
      To:   C
Coercion on right operand via
    Conversion map:
      From: B
      To:   C
Arithmetic performed after coercions.
Result lives in C
sage: OC.is_coercion_cached(OA)
True
sage: OC.is_coercion_cached(OB)
True
sage: del OC
sage: import gc
sage: _ = gc.collect()
sage: len([X for X in gc.get_objects() if isinstance(X,C)])
0
sage: xc = OA(1)*OB(1)
sage: isinstance(xc.parent(),C)
True
sage: del xc
sage: _ = gc.collect()
sage: len([X for X in gc.get_objects() if isinstance(X,C)])
0

I do believe that this is not desired behaviour.

@simon-king-jena
Member

comment:16

Replying to @nbruin:

I doubt that we want that. It is not desirable that, when adding elements of ZZ['x'] and QQ, the ring QQ['x'] is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).

In my opinion, that is exactly what we want, or at least what we have to settle for. If a person wants to work efficiently with QQ['x'] he should ensure his elements already live there.

Granted. But it is very typical usage not to explicitly convert everything into a common parent, but to let the coercion model do the work. That is what the coercion model is for!

I suspect that will usually be the case, because a user who regularly creates elements somewhere probably keeps a reference to the relevant parent around anyway.

However, in the example that I posted above, note that multiplying OA(1)*OB(1) repeatedly will also create C() repeatedly, even though it is a UniqueRepresentation and hence (weakly, nowadays) cached!

I think it is not acceptable that repeated creation is triggered.

In this scenario:

P = [ZZ['x%d'%i] for i in [1..1000]]
F = [GF(p) for p in prime_range(1000)]
for Zxi in P:
    for k in F:
        a = Zxi.0   # was P.0, but P is a list; we want the current ring's generator
        b = k(1)    # was F(1), but F is a list; we want the current field
        c = a + b

If we keep the GF(p)[xi] alive because at some point we constructed an element in it from parents that are still around, we've just created quadratic memory requirements where linear should be enough.

In your example, GF(p)[xi] is created exactly once, for each value of p and i. Clearly, in this situation it would not make sense to keep it alive, because it is never reused. However, I doubt that this is a typical use case.

A limited length buffer that gets updated whenever a certain component sees a parent go by might work. In Coercion discovery perhaps? That's already slow and it would exactly protect these automatically generated parents. Logically it's hard to defend but it may be OK in practice.

Another possibility is just UniqueRepresentation_c.__cinit__. Only when we're making a lot of parents do we care about losing them again ...

OK, but not all unique parent structures rely on it.

@nbruin
Contributor

nbruin commented Feb 6, 2013

comment:17

Replying to @simon-king-jena:

I think it is not acceptable that repeated creation is triggered.

...
Why not? Should the user expect that doing something a second time is faster than the first, without taking special precautions?

In your example, GF(p)[xi] is created exactly once, for each value of p and i. Clearly, in this situation it would not make sense to keep it alive, because it is never reused. However, I doubt that this is a typical use case.

But how do you tell the difference? Against too little caching there is a simple defense: keep a reference yourself. Against too much caching there is nothing you can do; you just run out of memory because data structures outside your control blow up.

We clearly have too much caching presently, in some cases by design. I think we first need a design that is fairly clean and for which we can reliably reason that there is not "too much" caching (changing the order of the memory requirement is definitely "too much" for me). The next step is tweaking it, possibly with MRU queues (which should then really be investigated for interference with efficient GC, which is based on the assumption that objects live either very short or very long; artificially creating objects with a medium lifespan is bad).

@simon-king-jena
Member

comment:18

By the way, it seems that the "insufficient" caching (not enough to keep C() alive) has no influence on performance, at least not if computations are done in a tight loop:

sage: %attach /home/simon/SAGE/work/memleak/testcoercion.py
sage: OA = A()
sage: OB = B()
sage: a = OA(1)
sage: b = OB(1)
sage: def test(x,y):
....:     for _ in xrange(2*10^5):
....:         z = x*y
....:         
sage: %time test(a,a)
CPU times: user 4.49 s, sys: 0.00 s, total: 4.49 s
Wall time: 4.49 s
sage: %time test(a,b)
CPU times: user 8.99 s, sys: 0.11 s, total: 9.11 s
Wall time: 9.13 s
sage: c = a*b
sage: print c.parent()
C
sage: %time test(a,b)
CPU times: user 8.85 s, sys: 0.03 s, total: 8.88 s
Wall time: 8.89 s

In the first run of test(a,b), there is the possibility that parent(a*b) is created repeatedly, as I have demonstrated above. In the second run, parent(a*b) is kept alive by keeping a pointer to the result of a*b. There is no significant difference in performance. Of course, we also see that multiplication within one parent is faster.

Is there a way to have the patchbots report significant regressions in some examples? Because I am sure that some parts of Sage (schemes and combinat are typical candidates) rely on a strong coercion cache.

@nbruin
Contributor

nbruin commented Feb 6, 2013

comment:19

Replying to @simon-king-jena:

By the way, it seems that the "insufficient" caching (not enough to keep C() alive) has no influence on performance, at least not if computations are done in a tight loop:

Ah right: because C._coerce_from_hash[A]._codomain is C, there's a cycle involving C, so instead of being deleted eagerly by Python's refcounting, it is only freed by the (relatively infrequent) cyclic GC runs. So this free "caching" even comes with a trick to improve it: increase the interval for garbage collection; see gc.get_threshold and gc.set_threshold. By tweaking those numbers you can probably manipulate the running times in these examples.

NOTE: doesn't timeit turn garbage collection off? That may make it a misleading tool for investigating performance of these things.
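
For reference, a sketch of the knobs mentioned here (the values are illustrative only, not recommendations):

import gc
import timeit

print(gc.get_threshold())        # CPython default: (700, 10, 10)
gc.set_threshold(10000, 10, 10)  # run generation-0 collections less often,
                                 # so cyclic trash (and our parents) lives longer

# And indeed timeit disables the collector while timing by default, so no
# cyclically-referenced parent would be reclaimed during a timed run:
t = timeit.Timer('x = []; x.append(x)')
print(t.timeit(number=1000))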

@simon-king-jena
Member

comment:20

I think it makes sense to ask sage-devel whether it is better to fix this leak or not.

I believe that keeping A and B alive should prevent C from garbage collection. In contrast, keeping C alive should not prevent A and B from garbage collection. Currently, it is the other way around. Perhaps we should try to think through whether having a weak reference to the domain of coercion or conversion maps would work.
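
A sketch of that last idea, with a hypothetical class (not Sage's Map): the map references its domain weakly and its codomain strongly.

import weakref

class WeakDomainMap(object):
    def __init__(self, domain, codomain):
        self._domain = weakref.ref(domain)   # weak: does not pin the domain
        self.codomain = codomain             # strong: pins the codomain

    def domain(self):
        D = self._domain()
        if D is None:
            raise RuntimeError("the domain has been garbage collected")
        return D

If the coercion model held such maps strongly under weak keys A and B, then keeping A and B alive would keep C alive (via the map's strong codomain reference), while keeping only C alive would pin neither A nor B: exactly the asymmetry described above.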

@videlec
Contributor

videlec commented Aug 10, 2015

comment:62

The failing doctest is reproducible after applying this branch onto 6.9.beta1.

@seblabbe
Contributor

comment:63

On sage-6.9.beta1, I get:

$ sage nbruin.sage
Time: CPU 7.95 s, Wall: 7.95 s
Time: CPU 7.68 s, Wall: 7.68 s
Time: CPU 76.76 s, Wall: 76.94 s

On sage-6.9.beta1 + #14058, I get a very good improvement:

$ sage nbruin.sage
Time: CPU 2.48 s, Wall: 2.48 s
Time: CPU 2.07 s, Wall: 2.07 s
Time: CPU 32.38 s, Wall: 32.42 s

Note: I originally thought it was a regression in Sage, since I was getting this earlier this morning with sage-6.8 + #14058. But thankfully, it seems to be an improvement obtained from this ticket.

The file nbruin.sage is from comment 4:

 $ cat nbruin.sage
a=ZZ(1)
b=QQ(1)
c=ZZ['x'](1)
d=b+c

def test(x,y):
  for i in xrange(10^6):
    _=x+y

time test(a,b)
time test(a,c)
time test(b,c)

@seblabbe
Contributor

comment:64

I also get improvements on the tests of Simon King above, from comment 18.

With sage-6.9.beta1:

$ sage testcoercion.sage
Time: CPU 1.69 s, Wall: 1.69 s
Time: CPU 4.05 s, Wall: 4.05 s
C
Time: CPU 4.03 s, Wall: 4.03 s

With sage-6.9.beta1 + this ticket:

$ sage testcoercion.sage
Time: CPU 0.34 s, Wall: 0.34 s
Time: CPU 0.85 s, Wall: 0.85 s
C
Time: CPU 0.86 s, Wall: 0.86 s

(My file testcoercion.sage contains testcoercion.py + the lines at comment 18.)

@seblabbe
Contributor

comment:65

With sage-6.9.beta1 + this ticket, I get "All tests passed" when I do

./sage -t --long src/sage/structure/coerce.pyx

but I get the problem when I test after the build

./sage -bt --long src/sage/structure/coerce.pyx

which gives the error:

**********************************************************************
File "src/sage/structure/coerce.pyx", line 1307, in sage.structure.coerce.CoercionModel_cache_maps.coercion_maps
Failed example:
    print N2-N0
Expected:
    0
Got:
    -1
**********************************************************************

I have never seen such behavior.

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Aug 12, 2015

Changed commit from 8713960 to b73d537

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Aug 12, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

c136ebd Merge #14058 into 6.9.beta1
b73d537 Trac #14058: fix broken doctest

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Aug 12, 2015

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

e153d08 #14058: Add doctest
b9141ee Merge branch 'ticket/14058' into develop
90ed181 Trivial fix for a coercion doctest
64b572c refcount libsingular rings used in plural
9793dbc Make one test more stable
2c04029 make a test against a memory leak clearer/more stable
e976b79 Merge branch 'u/SimonKing/weakly_reference_binary_operation_codomains' into 6.9.b0
8713960 trac #14058 convert some raise statements into py3 syntax
c136ebd Merge #14058 into 6.9.beta1
13cb2a7 Trac #14058: fix broken doctest

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Aug 12, 2015

Changed commit from b73d537 to 13cb2a7

@seblabbe
Contributor

comment:68

In other doctests in the same file, GF(3), GF(7) and GF(41) are also created. While there are references to GF(7) and GF(41), no variable references GF(3), which is therefore available for garbage collection. If GF(3) gets garbage collected before the offending doctest, everything is fine and the doctest passes. But it can also be collected during the call to gc.collect() in the doctest itself; in that case N0 was also counting GF(3), which explains why N2-N0 can be negative.

One way to fix this is to call gc.collect() before creating N0.

(My first fix was using sets of ids... sorry for the noise.)
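
The resulting doctest pattern, sketched in plain Python (SomeParentClass is a placeholder): collect before taking the baseline, so it cannot include objects that were merely waiting to be collected.

import gc

_ = gc.collect()                               # flush pre-existing garbage
N0 = len([X for X in gc.get_objects()
          if isinstance(X, SomeParentClass)])
# ... exercise the code under test, then delete the references ...
_ = gc.collect()
N2 = len([X for X in gc.get_objects()
          if isinstance(X, SomeParentClass)])
print(N2 - N0)                                 # expected: 0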

@simon-king-jena
Member

comment:69

Replying to @seblabbe:

One way to fix this is to call gc.collect() before creating N0.

I agree, that's the way to fix it.

@jpflori

jpflori commented Aug 13, 2015

comment:71

I assume Simon's comment gives positive review here, as this was the only thing left to fix.

@jpflori

jpflori commented Aug 13, 2015

Changed reviewer from Simon King, Frédéric Chapoton, Jean-Pierre Flori to Simon King, Frédéric Chapoton, Jean-Pierre Flori, Sébastien Labbé

@fchapoton fchapoton modified the milestones: sage-6.4, sage-6.9 Aug 14, 2015
@vbraun
Member

vbraun commented Sep 12, 2015

Changed dependencies from #12313 to none

@vbraun
Member

vbraun commented Sep 13, 2015

Changed branch from public/ticket/14058 to 13cb2a7

@jdemeyer

Changed commit from 13cb2a7 to none

@jdemeyer

comment:76

On arando:


sage -t --long src/sage/categories/fields.py
**********************************************************************
File "src/sage/categories/fields.py", line 104, in sage.categories.fields.Fields.__contains__
Failed example:
    len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)]) - n
Expected:
    0
Got:
    -1
**********************************************************************

@jdemeyer

comment:77

See #19244
