
speedup crystal weight and spin e/f #27057

Closed

mantepse opened this issue Jan 14, 2019 · 35 comments

@mantepse
Collaborator

CC: @tscrim @anneschilling

Component: combinatorics

Author: Travis Scrimshaw

Branch/Commit: d2fe93d

Reviewer: Travis Scrimshaw, Martin Rubey

Issue created by migration from https://trac.sagemath.org/ticket/27057

@mantepse mantepse added this to the sage-8.6 milestone Jan 14, 2019
@mantepse
Collaborator Author

Commit: 6f26a6e

@mantepse
Collaborator Author

Author: Martin Rubey

@mantepse
Collaborator Author

comment:2

I need faster crystals, but I am not sure how to go about it. Here is a first try - old:

sage: n = 3; r = 7; C = crystals.Spins(CartanType(["B", n])); T = crystals.TensorProduct(*[C for i in range(r)])
sage: hi = T.highest_weight_vectors()
sage: %timeit [v.weight() for v in T.highest_weight_vectors()]
1 loop, best of 3: 6.52 s per loop

new:

sage: n = 3; r = 7; C = crystals.Spins(CartanType(["B", n])); T = crystals.TensorProduct(*[C for i in range(r)])
sage: hi = T.highest_weight_vectors()
sage: %timeit [v.weight() for v in T.highest_weight_vectors()]
1 loop, best of 3: 4.22 s per loop

Removing the assertions assert i in self.index_set() in spins.py would also save some time; I don't know how useful they are. Perhaps they could be replaced with assert 0 < i <= rank?

Somewhat surprisingly, a large amount of time is spent recomputing the root_system of a Cartan type inside weight_lattice_realization. I think it makes sense to cache it, but maybe I am overlooking something better.
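
For concreteness, a minimal sketch of the proposed caching, using Sage's @cached_method; the method body is a simplified stand-in for Crystals.ParentMethods.weight_lattice_realization, not the actual source:

from sage.misc.cachefunc import cached_method

class ParentMethods:
    @cached_method
    def weight_lattice_realization(self):
        # Cache the realization on the parent so repeated weight()
        # calls do not rebuild the root system of the Cartan type.
        ct = self.cartan_type()
        if ct.is_finite():
            return ct.root_system().ambient_space()
        return ct.root_system().weight_lattice()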


New commits:

6f26a6e avoid calling cartan_type in spin e/f, cache root_system, avoid calls in weight_lattice_realization

@tscrim
Collaborator

tscrim commented Jan 15, 2019

comment:3

I am worried that caching root_system would lead to a memory leak since RootSystem is a UniqueRepresentation (there already is a memory leak for root systems IIRC, but this will likely make the problem worse).

We do not want to use assert to validate user input (asserts should only be used for serious internal problems); instead raise an exception, e.g., a ValueError. (The asserts here are a holdover from older code.) However, generically, the index_set() does not have to be [n]. Granted, many of the crystals use this as an assumption (e.g., CrystalOfLetters) for various reasons. I probably would not remove the check within Sage itself because it is already there. If this does become a serious bottleneck in the computations you need, I would simply remove it from the code for your computation.

To really get a good speedup, I would cythonize the spin crystals (using some sort of bitset or bint array) and implement a custom weight function. You probably also want a faster check for epsilon and phi.
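
A minimal sketch of the recommended validation pattern, assuming the index set convention of spins.py (the helper name is hypothetical):

def check_crystal_index(i, rank):
    # Raise an explicit exception instead of using assert: asserts are
    # for internal invariants and are skipped under python -O.
    if not 0 < i <= rank:
        raise ValueError("i = {} is not in the index set".format(i))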

@mantepse
Collaborator Author

comment:4

Thank you for looking into this! One question: I realize that the index set does not need to be [n] generically, but does it have to be [n] within spins.py? (I think ret[i] = 1 wouldn't work otherwise, right?)

Since weight is currently the worst offender, I'll try implementing a custom weight function first.

I'll keep pushing to this ticket, but leave it as needs info, in case anything is good to keep.
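
For orientation, a custom weight for a spin element could look roughly like this (a sketch, not the code on the branch; signs is assumed to hold the +1/-1 spins of the element):

def spin_weight(signs, ambient_space):
    # The weight of a spin vector is (s_1/2, ..., s_n/2) in the
    # ambient space, where s_i = +1 or -1 is the i-th spin.
    from sage.rings.rational_field import QQ
    return ambient_space.sum(QQ((s, 2)) * ambient_space.monomial(i)
                             for i, s in enumerate(signs))

For example, spin_weight([1, 1, -1], RootSystem(["B", 3]).ambient_space()) should give (1/2, 1/2, -1/2).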

@tscrim
Collaborator

tscrim commented Jan 15, 2019

comment:5

Yes, it seems to be that way for spins.py (but also for letters.pyx, and consequently for all tableaux).

@mantepse
Collaborator Author

comment:6

#18426 is the ticket concerning a memory leak involving root systems.

@tscrim
Collaborator

tscrim commented Jan 15, 2019

comment:7

Replying to @mantepse:

#25560 is the ticket concerning a memory leak involving root systems.

#18426

@mantepse
Collaborator Author

comment:8

Besides, is there a good reason to cache highest_weight_vectors?

I am frequently iterating over the highest weight vectors of tensor powers. In fact, I would prefer it to be an iterator, but that's probably asking for too much.
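
Such an iterator can be written on top of the public API (a sketch; it trades the cached list for one full pass over the crystal, using that an element is highest weight exactly when every e_i sends it to None):

def highest_weight_vectors_iter(T):
    # Yield the highest weight vectors one at a time instead of
    # caching the full list on the parent.
    I = T.index_set()
    for v in T:
        if all(v.e(i) is None for i in I):
            yield v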

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 15, 2019

Branch pushed to git repo; I updated commit sha1. New commits:

fe338ae add dedicated weight method for spin crystals

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 15, 2019

Changed commit from 6f26a6e to fe338ae

@mantepse
Collaborator Author

comment:10

Caching Crystals.ParentMethods.weight_lattice_realization yields an even greater speedup. But maybe it would be even better if crystal elements knew (and cached) their weight_lattice_realization, because then we would not constantly need to go through the parent. Note, for example, that there is Crystals.ElementMethods.cartan_type. Is there a good reason not to do this?

@embray
Contributor

embray commented Jan 15, 2019

comment:11

Retargeting tickets optimistically to the next milestone. If you are responsible for this ticket (either its reporter or owner) and don't believe you are likely to complete it before the next release (8.7), please retarget this ticket's milestone to sage-pending or sage-wishlist.

@embray embray modified the milestones: sage-8.6, sage-8.7 Jan 15, 2019
@tscrim
Collaborator

tscrim commented Jan 19, 2019

comment:12

I am not fully convinced about elements having a cartan_type, but that is, in a way, data associated to the element. The weight lattice realization, however, definitely is not. So I am -1 on caching it on the elements.

How much of a speedup do you get with caching it? I am probably in favor of caching it, but I would be interested in seeing how performance critical it is first (at least, I do not think computing the weight is the slow part).

We cache highest_weight_vectors because it is a (relatively) expensive computation whose result can be reused by other computations. If you make it into an iterator, you lose that benefit or have to add a second method. Again, if you find it better for your computations, you can just make local changes.

Also, I think if you really need speed, you should cythonize this. I might be able to do that soon, but it is a little lower on the priority list for me right now.

@mantepse
Collaborator Author

comment:13

Thanks for looking into this!

I can only answer your comment on the highest_weight_vectors caching: my (very minor) problem is that after computing

sage: C = crystals.Letters("A5")
sage: T = crystals.TensorProduct(*([C]*14))
sage: hi = T.highest_weight_vectors()
# do some stuff
sage: C = crystals.Letters("C5")
sage: T = crystals.TensorProduct(*([C]*14))
sage: hi = T.highest_weight_vectors()
# do some stuff
sage: C = crystals.Spins("B5")
sage: T = crystals.TensorProduct(*([C]*14))
sage: hi = T.highest_weight_vectors()

I do not know how to free the memory - and until very recently, I did not understand why my computer became unusable...

@tscrim
Collaborator

tscrim commented Jan 19, 2019

comment:14

That is a misrepresentation. It is not the @cached_method that is the problem. If the memory is not being freed, then there is a memory leak. If so, then my guess is that it is a side effect of #18426.

@mantepse
Collaborator Author

comment:15

Are you saying that the highest weight vectors shouldn't take up so much space?
I just tried

sage: C = crystals.Letters("A5")
sage: for d in range(2, 14):
....:     T = crystals.TensorProduct(*([C]*d))
....:     hi = T.highest_weight_vectors()
....:     print len(hi)

after which, according to htop, a big portion of the available memory on my laptop was used.

It is not a big problem for me, because I can do such experiments with "large" tensor powers (e.g., yesterday I had to check something for crystals.Spins(5) with d=16) in a separate sage process, and not on my laptop.

@tscrim
Collaborator

tscrim commented Jan 19, 2019

comment:16

No. I am asking you what you observed. Are you seeing a memory leak or is it just high usage in that specific computation?

It is almost certainly not the caching of the highest weight vectors that is causing the high memory usage. Computing the highest weight vectors requires a certain amount of iteration over the crystal, which is not O(1) in memory (I am not sure an algorithm that is O(1) exists). So you might get some memory improvement for your particular computation by storing the list of elements of the crystal (currently I believe it regenerates them for each factor, a by-product of being an iterator).
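
As a plain illustration of that point (ignoring the crystal structure entirely): once the factor's elements are materialized, iterating over the tensor power only ever holds the factor list plus the current tuple in memory:

import itertools

def iterate_tensor_factors(C, r):
    # Generate C's elements once and reuse the list for every factor,
    # rather than re-running C's iterator per factor.
    elts = list(C)
    return itertools.product(elts, repeat=r)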

@mantepse
Collaborator Author

comment:17

I don't really understand the question. I am observing that, after the sequence of instructions above, the memory of my computer is used up, and the usage doesn't go down anymore.

In numbers:

sage: C = crystals.Letters("A5")
sage: for d in range(2, 13):
....:     T = crystals.TensorProduct(*([C]*d))
....:     hi = T.highest_weight_vectors()
....:     print get_memory_usage(), len(hi)

4192.42578125 2
4192.42578125 4
4192.67578125 10
4192.67578125 26
4192.67578125 76
4192.92578125 231
4193.17578125 756
4195.19140625 2556
4201.953125 9096
4226.99609375 33231
4320.625 126060
sage: 

@tscrim
Collaborator

tscrim commented Jan 20, 2019

comment:18

After Cythonizing, I get the following:

sage: C = crystals.SpinsMinus(['D',4])
sage: D = crystals.SpinsPlus(['D',4])
sage: T = tensor([C,D,C,D,C,D])
sage: T.cardinality()
262144
sage: %time for x in T: pass
CPU times: user 2.31 s, sys: 5.79 ms, total: 2.31 s
Wall time: 2.29 s
sage: %time hw = T.highest_weight_vectors()
CPU times: user 16.6 ms, sys: 135 µs, total: 16.8 ms
Wall time: 14.5 ms
sage: %time for x in T: _ = x.weight()
CPU times: user 32.9 s, sys: 34 ms, total: 33 s
Wall time: 32.9 s
sage: %time for x in T: _ = x.weight()    # with caching weight_lattice_realization()
CPU times: user 19.3 s, sys: 23.1 ms, total: 19.3 s
Wall time: 19.3 s

vs Sage 8.6:

sage: C = crystals.SpinsMinus(['D',4])
sage: D = crystals.SpinsPlus(['D',4])
sage: T = tensor([C,D,C,D,C,D])
sage: T.cardinality()
262144
sage: %time for x in T: pass
CPU times: user 5.21 s, sys: 30.5 ms, total: 5.24 s
Wall time: 5.19 s
sage: %time hw = T.highest_weight_vectors()
CPU times: user 93.4 ms, sys: 8.53 ms, total: 102 ms
Wall time: 82.5 ms
sage: %time for x in T: _ = x.weight()
> 1 minute

However, there is a definite memory leak with the tensor product:

sage: C = crystals.Letters('A5')
sage: T = tensor([C]*12)
sage: T._dummy = True
sage: del T
sage: import gc
sage: gc.collect()
sage: T = tensor([C]*12)
sage: T._dummy
True

I get the same result when doing this just for C. So that is one step closer to pinning down the memory leak, but that is a separate issue.
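
The same check can be done without attaching a dummy attribute, using a weak reference (a sketch; a live reference after gc.collect() means the parent was not freed):

import gc, weakref
C = crystals.Letters('A5')
T = tensor([C]*12)
r = weakref.ref(T)
del T
gc.collect()
print(r() is not None)  # True here means the tensor product leaked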


New commits:

abbbbee Cythonization of spin crystals.
31c666c Caching weight_lattice_realization.
844ae90 Forgot to add letters.pxd.
a5d2401 Full coverage and other cleanup.

@tscrim
Collaborator

tscrim commented Jan 20, 2019

Changed commit from fe338ae to a5d2401

@tscrim
Collaborator

tscrim commented Jan 20, 2019

Changed author from Martin Rubey to Martin Rubey, Travis Scrimshaw

@tscrim
Collaborator

tscrim commented Jan 20, 2019

Reviewer: Travis Scrimshaw

@mantepse
Collaborator Author

comment:19

That looks wonderful!

I'll be playing with this until (something like) tomorrow!

1+1/2 immediate questions:

sage: C = crystals.Spins("B3")
sage: C[0].value

doesn't exist anymore.

  1. Is this intentional? Should there be a deprecation?
  2. I used it to identify particular elements of the crystal when it needs to be really fast. Should I be using something else anyway?

I created #27083 for the memory leak.

@tscrim
Collaborator

tscrim commented Jan 20, 2019

comment:20
  1. Yes, it is intentional, as value has been repurposed as an array of bint (booleans), but it is only available from Cython code. I can make it public, but it will behave completely differently.
  2. You should be using the normal == to compare two crystal elements. It should not really be much slower than comparing by value (and any speedup you saw before should now be erased, as the internal data structure has changed).

After some thought, I think it is best for now to have a value @property. I am not sure about fully deprecating it at this point, but I am leaning towards it. Something else that might be worthwhile is adding a __getitem__. Thoughts?
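
For concreteness, the kind of value @property under discussion could look roughly like this (a sketch with a hypothetical internal field _spins; the actual commit may store and expose the data differently):

@property
def value(self):
    # Rebuild the old tuple-of-signs interface from the new internal
    # boolean array (hypothetical field _spins).
    return tuple(-1 if b else 1 for b in self._spins)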

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 20, 2019

Branch pushed to git repo; I updated commit sha1. New commits:

4e9d29b Squeezing some more speed out.
222bb33 Some little cleanup and a deprecation for value.

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 20, 2019

Changed commit from a5d2401 to 222bb33

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 20, 2019

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

d2fe93d Some little cleanup and a @property for value.

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 20, 2019

Changed commit from 222bb33 to d2fe93d

@mantepse
Collaborator Author

Changed reviewer from Travis Scrimshaw to Travis Scrimshaw, Martin Rubey

@mantepse
Collaborator Author

comment:23

Excellent! Many thanks!

@mantepse
Collaborator Author

Changed author from Martin Rubey, Travis Scrimshaw to Travis Scrimshaw

@vbraun
Member

vbraun commented Jan 26, 2019

Changed branch from public/crystals/cythonize_spins-27057 to d2fe93d
