Weakly reference binary operation codomains #14058

Closed
robertwb opened this issue Feb 5, 2013 · 91 comments

Comments

@robertwb
Contributor

robertwb commented Feb 5, 2013

R(1) + S(1) caches a strong reference to both R and S, which we'd like to avoid.

See discussion at https://groups.google.com/forum/?fromgroups=#!topic/sage-devel/ozhIHIwQ4WA
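
For illustration, a pure-Python model of the leak (hypothetical Parent and Map stand-ins, not Sage's classes): a cache whose values hold morphisms strongly pins the morphisms' domains, so even weak cache keys never expire.

import gc
import weakref

class Parent(object):          # stand-in for a Sage parent
    pass

class Map(object):             # stand-in for a coercion morphism
    def __init__(self, domain, codomain):
        self.domain = domain
        self.codomain = codomain

cache = {}                     # models the strong values of the coercion cache
R, S, C = Parent(), Parent(), Parent()
cache[(id(R), id(S))] = (Map(R, C), Map(S, C))   # strong refs to both maps

r = weakref.ref(R)
del R
_ = gc.collect()
print(r() is not None)         # True: the cached map's .domain pins R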

CC: @nbruin @simon-king-jena @jpflori

Component: memleak

Author: Robert Bradshaw, Nils Bruin

Branch: 13cb2a7

Reviewer: Simon King, Frédéric Chapoton, Jean-Pierre Flori, Sébastien Labbé

Issue created by migration from https://trac.sagemath.org/ticket/14058

@robertwb robertwb added this to the sage-5.11 milestone Feb 5, 2013
@robertwb
Contributor Author

robertwb commented Feb 5, 2013

Attachment: 14058-weak-binop.patch.gz

@simon-king-jena
Member

comment:2

Is it possible to add an example that shows that parents become collectable that previously weren't?

@nbruin
Contributor

nbruin commented Feb 5, 2013

Doctest proposal

@nbruin
Contributor

nbruin commented Feb 5, 2013

comment:3

Attachment: trac_14058-doctest.patch.gz

I like the idea. Can we get some data on speed regressions due to this? In principle the double lookup might be noticeable. I don't mean speed regressions due to parents getting deleted -- those are good to know but need other fixes. I mean when the parents are still around.

Attached doctests may need #12313 to work, in which case this ticket should depend on that (because then that's necessary to get benefit from this ticket).

@nbruin
Contributor

nbruin commented Feb 5, 2013

comment:4

Sadly, the performance hit is quite noticeable.

a=ZZ(1)
b=QQ(1)
c=ZZ['x'](1)
d=b+c

def test(x,y):
  for i in xrange(10^6):
    _=x+y

Within a run, timings of test seem fairly consistent. However, between different compiles of the same code I'm observing fluctuations of up to 0.10 s. Lucky/unlucky memory layout?
Anyway, with patch:

sage: %time test(a,b)
CPU times: user 2.15 s, sys: 0.00 s, total: 2.15 s
Wall time: 2.16 s
sage: %time test(a,c)
CPU times: user 2.25 s, sys: 0.00 s, total: 2.25 s
Wall time: 2.26 s
sage: %time test(b,c)
CPU times: user 16.83 s, sys: 0.00 s, total: 16.83 s

Without patch:

sage: %time test(a,b)
CPU times: user 1.54 s, sys: 0.00 s, total: 1.54 s
Wall time: 1.55 s
sage: %time test(a,c)
CPU times: user 1.64 s, sys: 0.00 s, total: 1.64 s
Wall time: 1.64 s
sage: %time test(b,c)
CPU times: user 15.68 s, sys: 0.00 s, total: 15.68 s
Wall time: 15.76 s

For test(a,a), test(b,b), test(c,c) there doesn't seem to be a difference (luckily!)

The penalties seem to be about 0.6s, 0.6s, 1.2s, which is rather consistent with 1,1,2 extra lookups.

This suggests it may be worth trying to store a weakref to the morphism instead, since that can probably be dereferenced faster than a coercion lookup on the codomain. Logically this should be equivalent: we're just storing a weakref to the object we're interested in, rather than instructions for where to go and look up that object.

Caveat:

For stored coercions of the type
(R,S, ...): (None,<map S->R>)
or
(R,S, ...): (<map R->S>,None)
the codomain is part of the key, so deletion of the codomain (which is a prerequisite for the morphism being deleted and hence the weakref dying) implies removal of the weak key entry in TripleDict.

However, for stored coercions of the type
(R,S,...): (<map R->C>,<map S->C>)
we could have C and the weakrefs to the morphisms die without that triggering removal from the TripleDict. So we could still encounter dead weakrefs (though far fewer than one might initially think!). It's not really much different from how attachment: 14058-weak-binop.patch handles dead weakrefs.
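
A sketch, with a hypothetical helper (not the patch's actual code), of how a lookup can tolerate such dead weakrefs: dereference, and on failure fall back to rediscovering the coercion and overwriting the stale entry. The real cache stores a pair of maps, one of which may be None; this shows a single slot only.

import weakref

def lookup_coercion(cache, key, discover):
    wr = cache.get(key)
    if wr is not None:
        m = wr()                     # dereference the stored weakref
        if m is not None:
            return m                 # fast path: morphism still alive
    m = discover(key)                # slow path: rediscover the coercion
    cache[key] = weakref.ref(m)      # overwrite the dead/stale entry
    return m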

Major problem for immediately applying this idea:

sage: M=QQ.coerce_map_from(ZZ)
sage: import weakref
sage: weakref.ref(M)
TypeError: cannot create weak reference to 'sage.rings.rational.Z_to_Q' object
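
The TypeError arises because extension types, like classes using __slots__, are only weak-referenceable if they provide a __weakref__ slot; in Cython that means declaring something like cdef object __weakref__ on the morphism base class. A pure-Python illustration of the mechanism:

import weakref

class NoSlot(object):
    __slots__ = ()                 # no __weakref__ slot

class WithSlot(object):
    __slots__ = ('__weakref__',)   # opts in to weak references

try:
    weakref.ref(NoSlot())
except TypeError as e:
    print(e)                       # cannot create weak reference to ...
print(weakref.ref(WithSlot()))     # works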

@robertwb
Contributor Author

robertwb commented Feb 5, 2013

comment:5

Yeah, I had exactly the same idea. I'll post an updated patch.

@robertwb
Contributor Author

robertwb commented Feb 5, 2013

comment:6

Attachment: 14058-weak-binop-morphisms.patch.gz

Apply 14058-weak-binop-morphisms.patch and trac_14058-doctest.patch

@nbruin
Contributor

nbruin commented Feb 5, 2013

comment:7

Great! With this new patch I cannot distinguish the timing differences from the noise one normally gets already, so I'd think this is quite acceptable.

@nbruin
Contributor

nbruin commented Feb 5, 2013

comment:8

patchbot seems unhappy about the doctest. Probably #12313 is indeed necessary to make parents reclaimable.

@nbruin
Contributor

nbruin commented Feb 5, 2013

Dependencies: #12313

@simon-king-jena
Member

comment:9

Stated in the ticket description which patches are to be applied. Admittedly, I didn't yet test whether the doctest patch applies after Robert's new patch, but the test patch is needed, I think.

Apply 14058-weak-binop-morphisms.patch and trac_14058-doctest.patch

@simon-king-jena
Member

comment:11

The patches apply, and at least the new test passes. So this, together with #12313, does fix another leak.

@simon-king-jena
Member

comment:13

In vanilla Sage, the coercion model stores (coercion) morphisms in its cache (a TripleDict, hence with weak keys), and the only change introduced by your patch is to store weak references to the (coercion) morphisms instead. Did I understand this correctly?

In vanilla Sage, the coercion model kept a strong reference to the morphism, which kept strong references to its domain and codomain, which were thus not reclaimable; so the item of the TripleDict had eternal life, in spite of the weak keys. Correct?

With the patch, the coercion model keeps a weak reference to the coercion morphism; hence the strong references from the morphism to domain and codomain don't matter, and the item of the TripleDict may disappear.

But how is the morphism kept alive? The coercion model only has a weak reference to it. Do I understand correctly that the morphism involved in the binary operation is, at the same time, stored in the coerce dict of its codomain, and that is why it does not immediately die?

In other words, do I understand that the layout is as follows? We have parents A and B, and want to apply some binary operator, with a result in the parent C (C may coincide with either A or B). And we have maps r_A from A to C and r_B from B to C.

Both r_A and r_B are stored with a strong reference in a cache located in C, with weak keys A and B. At the same time, they are stored with a weak reference in the coercion model, again with weak keys A and B. r_A has strong references to C and to A, r_B has strong references to C and to B.

What do we want? Do we want that keeping C alive makes A and B survive? Or do we want that keeping both A and B alive makes C survive?

If the user has a strong reference to C, then C has strong references to r_A and r_B (in its coerce cache), and r_A/r_B strongly refer to A/B. Hence the existence of C keeps A and B alive. Since C is a complicated object, it is more likely to be mortal, so this is probably not much of a problem.

If the user has a strong reference to both A and B, then C keeps a strong reference to both r_A and r_B, and the coercion model keeps a weak reference to r_A and r_B. Moreover, r_A and r_B strongly refer to C.

However, isn't it just a reference cycle between C and r_A/r_B? It would not prevent C from becoming collectable, right?

I doubt that we want that. It is not desirable that, when adding elements of ZZ['x'] and QQ, the ring QQ['x'] is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).

Or did I misunderstand something?
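
For what it's worth, a pure reference cycle is indeed no obstacle: once Python's cyclic collector runs, the whole cycle is reclaimed. A minimal illustration with hypothetical stand-in classes:

import gc
import weakref

class Codomain(object):
    pass

class Morphism(object):
    def __init__(self, codomain):
        self.codomain = codomain   # strong reference back to the codomain

C = Codomain()
C.coerce_cache = [Morphism(C)]     # C -> morphism -> C: a pure cycle
r = weakref.ref(C)
del C                              # refcounting alone cannot free the cycle
_ = gc.collect()                   # ... but the cyclic collector can
print(r() is None)                 # True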

@simon-king-jena
Member

An example that shows that the current patch needs work

@nbruin
Contributor

nbruin commented Feb 6, 2013

comment:14

Attachment: testcoercion.py.gz

I doubt that we want that. It is not desirable that, when adding elements of ZZ['x'] and QQ, the ring QQ['x'] is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).

In my opinion, that is exactly what we want, or at least what we have to settle for. If a person wants to work efficiently with QQ['x'], he should ensure his elements already live there: adding elements with the same parents is orders of magnitude faster than relying on coercion. A step closer is to ensure that your parent remains in memory. I suspect that will usually be the case, because a user who regularly creates elements somewhere probably keeps a reference to the relevant parent around anyway.

In this scenario:

P = [ZZ['x%d'%i] for i in [1..1000]]
F = [GF(p) for p in prime_range(1000)]
for Zxi in P:
    for k in F:
        a = Zxi.0   # was P.0, but P is a list; we want the current ring's generator
        b = k(1)    # was F(1), but F is a list; we want the current field
        c = a + b

If we keep the GF(p)[xi] alive because at some point we constructed an element in it from parents that are still around, we've just created quadratic memory requirements where linear should be enough.

I don't think that we can afford to assume that just because a user has combined two parents he/she will do so again in the future. We may be able to afford doing so if it happened recently, but that really is an optimisation, and currently I don't see a hook where we can do this efficiently.

A limited length buffer that gets updated whenever a certain component sees a parent go by might work. In Coercion discovery perhaps? That's already slow and it would exactly protect these automatically generated parents. Logically it's hard to defend but it may be OK in practice.

Another possibility is just UniqueRepresentation_c.__cinit__. Only when we're making a lot of parents do we care about losing them again ...

The problem with artificially keeping transient objects alive longer is that it defeats the heuristics of Python's generational garbage collection, on which we depend rather heavily to get rid of cyclic garbage. Testing would be required to see whether this is indeed a problem, and it will be very application-dependent.
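
One cheap realization of the limited-length buffer idea above, sketched with an arbitrary size and hook point (nothing like this exists in Sage):

from collections import deque

_recent_parents = deque(maxlen=64)   # the size is an arbitrary tuning choice

def saw_parent(P):
    # To be called wherever coercion discovery sees a parent go by.
    # The deque keeps a strong reference until 64 newer parents have
    # displaced P, giving recently used parents a grace period.
    _recent_parents.append(P)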

@simon-king-jena
Member

comment:15

Before replying to Nils' post, here is an example showing that the existence of A and B does not ensure the survival of the pushout of A and B. First, attach attachment: testcoercion.py in a Sage session. Then:

sage: %attach /home/simon/SAGE/work/memleak/testcoercion.py
sage: OA = A()
sage: OB = B()
sage: cm = sage.structure.element.get_coercion_model()
sage: OC = cm.explain(OA,OB)
Coercion on left operand via
    Conversion map:
      From: A
      To:   C
Coercion on right operand via
    Conversion map:
      From: B
      To:   C
Arithmetic performed after coercions.
Result lives in C
sage: OC.is_coercion_cached(OA)
True
sage: OC.is_coercion_cached(OB)
True
sage: del OC
sage: import gc
sage: _ = gc.collect()
sage: len([X for X in gc.get_objects() if isinstance(X,C)])
0
sage: xc = OA(1)*OB(1)
sage: isinstance(xc.parent(),C)
True
sage: del xc
sage: _ = gc.collect()
sage: len([X for X in gc.get_objects() if isinstance(X,C)])
0

I do believe that this is not desired behaviour.

@simon-king-jena
Member

comment:16

Replying to @nbruin:

I doubt that we want that. It is not desirable that, when adding elements of ZZ['x'] and QQ, the ring QQ['x'] is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).

In my opinion, that is exactly what we want, or at least what we have to settle for. If a person wants to work efficiently with QQ['x'] he should ensure his elements already live there.

Granted. But it is very typical usage not to explicitly convert everything into a common parent, but to let the coercion model do the work. That is what the coercion model is for!

I suspect that will usually be the case, because a user who regularly creates elements somewhere probably keeps a reference to the relevant parent around anyway.

However, in the example that I posted above, note that multiplying OA(1)*OB(1) repeatedly will also create C() repeatedly, even though it is a UniqueRepresentation and hence (weakly, nowadays) cached!

I think it is not acceptable that repeated creation is triggered.

In this scenario:

P = [ZZ['x%d'%i] for i in [1..1000]]
F = [GF(p) for p in prime_range(1000)]
for Zxi in P:
    for k in F:
        a = Zxi.0   # was P.0, but P is a list; we want the current ring's generator
        b = k(1)    # was F(1), but F is a list; we want the current field
        c = a + b

If we keep the GF(p)[xi] alive because at some point we constructed an element in it from parents that are still around, we've just created quadratic memory requirements where linear should be enough.

In your example, GF(p)[xi] is created exactly once, for each value of p and i. Clearly, in this situation it would not make sense to keep it alive, because it is never reused. However, I doubt that this is a typical use case.

A limited length buffer that gets updated whenever a certain component sees a parent go by might work. In Coercion discovery perhaps? That's already slow and it would exactly protect these automatically generated parents. Logically it's hard to defend but it may be OK in practice.

Another possibility is just UniqueRepresentation_c.__cinit__. Only when we're making a lot of parents do we care about losing them again ...

OK, but not all unique parent structures rely on it.

@nbruin
Contributor

nbruin commented Feb 6, 2013

comment:17

Replying to @simon-king-jena:

I think it is not acceptable that repeated creation is triggered.

...
Why not? Should the user expect that doing something a second time is faster than the first, without taking special precautions?

In your example, GF(p)[xi] is created exactly once, for each value of p and i. Clearly, in this situation it would not make sense to keep it alive, because it is never reused. However, I doubt that this is a typical use case.

But how do you tell the difference? Against too little caching there is a simple defense: keep a reference yourself. Against too much caching there is nothing you can do; you just run out of memory because data structures outside your control blow up.

We clearly have too much caching presently, in some cases by design. I think we first need a design that is fairly clean and for which we can reliably reason that there is not "too much" caching (changing the order of the memory requirement is definitely "too much" for me). The next step is tweaking it, possibly with MRU queues (which should then really be investigated for interference with efficient GC, which is based on the assumption that objects live either very short or very long; artificially creating objects with a medium lifespan is bad).

@simon-king-jena
Member

comment:18

By the way, it seems that the "insufficient" caching (not enough to keep C() alive) has no influence on performance, at least not if computations are done in a tight loop:

sage: %attach /home/simon/SAGE/work/memleak/testcoercion.py
sage: OA = A()
sage: OB = B()
sage: a = OA(1)
sage: b = OB(1)
sage: def test(x,y):
....:     for _ in xrange(2*10^5):
....:         z = x*y
....:         
sage: %time test(a,a)
CPU times: user 4.49 s, sys: 0.00 s, total: 4.49 s
Wall time: 4.49 s
sage: %time test(a,b)
CPU times: user 8.99 s, sys: 0.11 s, total: 9.11 s
Wall time: 9.13 s
sage: c = a*b
sage: print c.parent()
C
sage: %time test(a,b)
CPU times: user 8.85 s, sys: 0.03 s, total: 8.88 s
Wall time: 8.89 s

In the first run of test(a,b), there is the possibility that parent(a*b) is created repeatedly, as I have demonstrated above. In the second run, parent(a*b) is kept alive by keeping a pointer to the result of a*b. There is no significant difference in performance. Of course, we also see that multiplication within one parent is faster.

Is there a way to have the patchbots report significant regressions in some examples? Because I am sure that some parts of Sage (schemes and combinat are typical candidates) rely on a strong coercion cache.

@nbruin
Contributor

nbruin commented Feb 6, 2013

comment:19

Replying to @simon-king-jena:

By the way, it seems that the "insufficient" caching (not enough to keep C() alive) has no influence on performance, at least not if computations are done in a tight loop:

Ah right: because C._coerce_from_hash[A]._codomain is C, there's a cycle involving C, so instead of being deleted eagerly by Python's refcounting, it is only freed by the (relatively infrequent) cyclic GC runs. So this free "caching" even comes with a trick to improve it: increase the interval for garbage collection; see gc.get_threshold and gc.set_threshold. By tweaking those numbers you can probably manipulate the running times in these examples.

NOTE: doesn't timeit turn garbage collection off? That may make it a misleading tool for investigating performance of these things.
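
For reference, a sketch of the knobs mentioned here (the values are illustrative only, not recommendations):

import gc
import timeit

print(gc.get_threshold())        # CPython default: (700, 10, 10)
gc.set_threshold(10000, 10, 10)  # run generation-0 collections less often,
                                 # so cyclic trash (and our parents) lives longer

# And indeed timeit disables the collector while timing by default, so no
# cyclically-referenced parent would be reclaimed during a timed run:
t = timeit.Timer('x = []; x.append(x)')
print(t.timeit(number=1000))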

@simon-king-jena
Member

comment:20

I think it makes sense to ask sage-devel whether it is better to fix this leak or not.

I believe that keeping A and B alive should prevent C from garbage collection. In contrast, keeping C alive should not prevent A and B from garbage collection. Currently, it is the other way around. Perhaps we should try to think through whether having a weak reference to the domain of coercion or conversion maps would work.
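
A sketch of that last idea, with a hypothetical class (not Sage's Map): the map references its domain weakly and its codomain strongly.

import weakref

class WeakDomainMap(object):
    def __init__(self, domain, codomain):
        self._domain = weakref.ref(domain)   # weak: does not pin the domain
        self.codomain = codomain             # strong: pins the codomain

    def domain(self):
        D = self._domain()
        if D is None:
            raise RuntimeError("the domain has been garbage collected")
        return D

If the coercion model held such maps strongly under weak keys A and B, then keeping A and B alive would keep C alive (via the map's strong codomain reference), while keeping only C alive would pin neither A nor B: exactly the asymmetry described above.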

@videlec
Contributor

videlec commented Aug 10, 2015

comment:62

The failing doctest is reproducible after applying this branch onto 6.9.beta1.

@seblabbe
Contributor

comment:63

On sage-6.9.beta1, I get:

$ sage nbruin.sage
Time: CPU 7.95 s, Wall: 7.95 s
Time: CPU 7.68 s, Wall: 7.68 s
Time: CPU 76.76 s, Wall: 76.94 s

On sage-6.9.beta1 + #14058, I get a very good improvement:

$ sage nbruin.sage
Time: CPU 2.48 s, Wall: 2.48 s
Time: CPU 2.07 s, Wall: 2.07 s
Time: CPU 32.38 s, Wall: 32.42 s

Note: I originally thought it was a regression in Sage, since I was getting this earlier this morning with sage-6.8 + #14058. But thankfully, it seems to be an improvement obtained from this ticket.

The file nbruin.sage is from comment 4:

 $ cat nbruin.sage
a=ZZ(1)
b=QQ(1)
c=ZZ['x'](1)
d=b+c

def test(x,y):
  for i in xrange(10^6):
    _=x+y

time test(a,b)
time test(a,c)
time test(b,c)

@seblabbe
Contributor

comment:64

I also get improvements on the tests of Simon King above, from comment 18.

With sage-6.9.beta1:

$ sage testcoercion.sage
Time: CPU 1.69 s, Wall: 1.69 s
Time: CPU 4.05 s, Wall: 4.05 s
C
Time: CPU 4.03 s, Wall: 4.03 s

With sage-6.9.beta1 + this ticket:

$ sage testcoercion.sage
Time: CPU 0.34 s, Wall: 0.34 s
Time: CPU 0.85 s, Wall: 0.85 s
C
Time: CPU 0.86 s, Wall: 0.86 s

(My file testcoercion.sage contains testcoercion.py + the lines at comment 18.)

@seblabbe
Contributor

comment:65

With sage-6.9.beta1 + this ticket, I get "All tests passed" when I do

./sage -t --long src/sage/structure/coerce.pyx

but I get the problem when I test after the build

./sage -bt --long src/sage/structure/coerce.pyx

which gives the error:

**********************************************************************
File "src/sage/structure/coerce.pyx", line 1307, in sage.structure.coerce.CoercionModel_cache_maps.coercion_maps
Failed example:
    print N2-N0
Expected:
    0
Got:
    -1
**********************************************************************

I have never seen such behavior.

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Aug 12, 2015

Changed commit from 8713960 to b73d537

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Aug 12, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

c136ebd Merge #14058 into 6.9.beta1
b73d537 Trac #14058: fix broken doctest

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Aug 12, 2015

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

e153d08 #14058: Add doctest
b9141ee Merge branch 'ticket/14058' into develop
90ed181 Trivial fix for a coercion doctest
64b572c refcount libsingular rings used in plural
9793dbc Make one test more stable
2c04029 make a test against a memory leak clearer/more stable
e976b79 Merge branch 'u/SimonKing/weakly_reference_binary_operation_codomains' into 6.9.b0
8713960 trac #14058 convert some raise statements into py3 syntax
c136ebd Merge #14058 into 6.9.beta1
13cb2a7 Trac #14058: fix broken doctest

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Aug 12, 2015

Changed commit from b73d537 to 13cb2a7

@seblabbe
Contributor

comment:68

In other doctests in the same file, GF(3), GF(7) and GF(41) are also created. While there are references to GF(7) and GF(41), no variable references GF(3), which is therefore available for garbage collection. If GF(3) gets garbage collected before the offending doctest, everything is fine and the doctest passes. But it can also be collected during the call to gc.collect() in the doctest itself; in that case N0 was also counting GF(3), which explains why N2-N0 can be negative.

One way to fix this is to call gc.collect() before creating N0.

(My first fix was using sets of ids... sorry for the noise.)
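
The resulting doctest pattern, sketched in plain Python (SomeParentClass is a placeholder): collect before taking the baseline, so it cannot include objects that were merely waiting to be collected.

import gc

_ = gc.collect()                               # flush pre-existing garbage
N0 = len([X for X in gc.get_objects()
          if isinstance(X, SomeParentClass)])
# ... exercise the code under test, then delete the references ...
_ = gc.collect()
N2 = len([X for X in gc.get_objects()
          if isinstance(X, SomeParentClass)])
print(N2 - N0)                                 # expected: 0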

@simon-king-jena
Member

comment:69

Replying to @seblabbe:

One way to fix this is to call gc.collect() before creating N0.

I agree, that's the way to fix it.

@jpflori

jpflori commented Aug 13, 2015

comment:71

I assume Simon's comment gives positive review here, as this was the only thing left to fix.

@jpflori

jpflori commented Aug 13, 2015

Changed reviewer from Simon King, Frédéric Chapoton, Jean-Pierre Flori to Simon King, Frédéric Chapoton, Jean-Pierre Flori, Sébastien Labbé

@fchapoton fchapoton modified the milestones: sage-6.4, sage-6.9 Aug 14, 2015
@vbraun
Member

vbraun commented Sep 12, 2015

Changed dependencies from #12313 to none

@vbraun
Member

vbraun commented Sep 13, 2015

Changed branch from public/ticket/14058 to 13cb2a7

@jdemeyer

Changed commit from 13cb2a7 to none

@jdemeyer

comment:76

On arando:


sage -t --long src/sage/categories/fields.py
**********************************************************************
File "src/sage/categories/fields.py", line 104, in sage.categories.fields.Fields.__contains__
Failed example:
    len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)]) - n
Expected:
    0
Got:
    -1
**********************************************************************

@jdemeyer

comment:77

See #19244
