Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Caching of data needed for computations in k_dual #14228

Closed
simon-king-jena opened this issue Mar 5, 2013 · 43 comments
Closed

Caching of data needed for computations in k_dual #14228

simon-king-jena opened this issue Mar 5, 2013 · 43 comments

Comments

@simon-king-jena
Copy link
Member

#12313 brought a severe slow-down to sage.combinat.sf.k_dual. The purpose of task #13991 is to mitigate this. The purpose of this ticket is to deal with one aspect of the slow-down.

Certain data of partitions should be cached, namely WeylGroup and the Young subgroup. And sage.combinat.sf.k_dual constructs a couple of partitions, but they are only weakly cached, so that they eventually are created repeatedly. Hence, they should be cached, too.

Apply

Depends on #13605
Depends on #14225

CC: @sagetrac-sage-combinat @tscrim @nthiery @nbruin @zabrocki

Component: combinatorics

Author: Travis Scrimshaw

Reviewer: Simon King

Merged: sage-5.8.beta4

Issue created by migration from https://trac.sagemath.org/ticket/14228

@simon-king-jena
Copy link
Member Author

comment:1

Attachment: trac_14288-partition_caches.patch.gz

@simon-king-jena
Copy link
Member Author

Dependencies: #13605

@simon-king-jena
Copy link
Member Author

Attachment: trac_14228-partition_caches.patch.gz

Disregard the first patch, it has a wrong ticket number.

@simon-king-jena

This comment has been minimized.

@simon-king-jena
Copy link
Member Author

comment:2

Sorry for the "wrong" patch name.

Apply trac_14228-partition_caches.patch

@tscrim
Copy link
Collaborator

tscrim commented Mar 5, 2013

comment:3

I'm wondering how much of a speedup we'd get if we made IntegerListsLex a cached representation (currently it takes lambda functions as arguments which makes it impossible to hash).

sage: P = Partitions(4, max_part=2)
sage: P
Partitions of the integer 4 satisfying constraints max_part=2
sage: type(P)
sage.combinat.integer_list.IntegerListsLex_with_category
sage: P is Partitions(4, max_part=2)
False

So we alternatively want to make a new partitions parent specifically for this which just wraps the iterator given by IntegerListsLex...

I also restructured it slightly by creating an abstract base class for the actual bases for easier maintenance.

Anyways, I've attached my patch.

@tscrim
Copy link
Collaborator

tscrim commented Mar 5, 2013

Changed author from Simon King to Simon King Travis Scrimshaw

@simon-king-jena
Copy link
Member Author

Changed dependencies from #13605 to #13605, #14225

@simon-king-jena
Copy link
Member Author

comment:4

There is one hunk of my patch that only applies with #14225. Hence, new dependency.

@simon-king-jena
Copy link
Member Author

comment:5

Wow. You made a major restructuring of code. It will be difficult to merge the two approaches.

@tscrim
Copy link
Collaborator

tscrim commented Mar 5, 2013

comment:6

There was one failing doctest in the first version of my patch. Because of the structural changes, the error thrown is changed (although it fundamentally is the same error).

@tscrim
Copy link
Collaborator

tscrim commented Mar 5, 2013

comment:7

Replying to @simon-king-jena:

Wow. You made a major restructuring of code. It will be difficult to merge the two approaches.

The big thing that I did was pulled out the common code for the bases into an ABC. It looks scarier than it is. I also avoided calling Partition() directly in case what was being passed in was a list, in which case it should be slower than constructing an element from the appropriate parent. Although it's faster if we are passing in a Partition object, but this might be a marginal speed change... I'll do some investigating to see what's being passed around.

I can handle the merging.

@simon-king-jena
Copy link
Member Author

comment:8

Arrgh. I tried to post this twice, but trac didn't accept because you posted too. A minute later, trac didn't seem to accept posts at all. Let's try again.

Some timings:

Unfortunately, our two patches are incompatible. How to get the best of both worlds?

@simon-king-jena
Copy link
Member Author

comment:9

Replying to @tscrim:

Replying to @simon-king-jena:

Wow. You made a major restructuring of code. It will be difficult to merge the two approaches.

The big thing that I did was pulled out the common code for the bases into an ABC. It looks scarier than it is. I also avoided calling Partition() directly in case what was being passed in was a list, in which case it should be slower than constructing an element from the appropriate parent. Although it's faster if we are passing in a Partition object, but this might be a marginal speed change... I'll do some investigating to see what's being passed around.

I can handle the merging.

Great, thank you. Looking at your patch, it seems that your code includes parts of the changes that I suggested (but differently in the case of the Weyl group), and I think other parts of my code (namely the cached method _Partitions) make sense to be used in your patch, too.

@tscrim
Copy link
Collaborator

tscrim commented Mar 5, 2013

comment:10

I'll let you know what happens some of my other possible speedups as well. Expect something within a few hours.

@tscrim
Copy link
Collaborator

tscrim commented Mar 5, 2013

comment:11

Okay, I've merged the patches (I didn't incorperate the Weyl group) and made IntegerListsLex a cached representation (this was desirable anyways). However I realized that PartitionsGreatestLE does exactly what Partitions(n, max_part=k) does except it already is a unique representation, so I'm using that instead. I saw a very small difference between caching this in a function and just explicitly calling this, but I feel safer not having this be strongly cached (however feel free to ignore my opinion as I'm not a python/sage memory expert).

Here's some data. Without patch (with #14225):

sage: from sage.combinat.sf.k_dual import AffineSchurFunctions
sage: F = AffineSchurFunctions(SymmetricFunctions(QQ['t']).kBoundedQuotient(4,>
sage: %time TestSuite(F).run()
CPU times: user 7.79 s, sys: 0.22 s, total: 8.01 s
Wall time: 8.39 s
sage: %time TestSuite(F).run()
CPU times: user 4.82 s, sys: 0.04 s, total: 4.86 s
Wall time: 5.07 s

With:

sage: from sage.combinat.sf.k_dual import AffineSchurFunctions
sage: F = AffineSchurFunctions(SymmetricFunctions(QQ['t']).kBoundedQuotient(4,>
sage: %time TestSuite(F).run()
CPU times: user 7.17 s, sys: 0.09 s, total: 7.26 s
Wall time: 7.52 s
sage: %time TestSuite(F).run()
CPU times: user 4.26 s, sys: 0.03 s, total: 4.29 s
Wall time: 4.58 s

Edit - I still have not been able to see the approx 2x speedup that you got with just your patch, so I'm still doing something relatively slow...

@tscrim
Copy link
Collaborator

tscrim commented Mar 5, 2013

Changed author from Simon King Travis Scrimshaw to Simon King, Travis Scrimshaw

@simon-king-jena
Copy link
Member Author

comment:12

Replying to @tscrim:

... but I feel safer not having this be strongly cached...
...
Edit - I still have not been able to see the approx 2x speedup that you got with just your patch, ...

I suppose that's why...

UniqueRepresentation is weakly cached. Hence, if you call the same Partitions, Weyl groups or permutation groups, you will get different instances, unless you keep a strong reference. Since the same Partitions (and Weyl groups or permutation groups, too) occur in different methods, it won't help to keep the reference inside of a method. Instead, one needs a cache that is shared by the methods.

It would probably be bad to have a global cache. That's why in my patch I suggested to have the cache local to the parent that uses the data. In that way, the data can still be garbage collected (it is not a global cache!), as soon as the parent using the data is done.

It seems to me that we will not get up to speed, unless we cache the few partitions that are local to these k-bounded thingies.

@simon-king-jena
Copy link
Member Author

comment:13

Your patch seems to perform quite well:

sage: from sage.combinat.sf.k_dual import AffineSchurFunctions
sage: F = AffineSchurFunctions(SymmetricFunctions(QQ['t']).kBoundedQuotient(Integer(4),t=Integer(1)))
sage: %time TestSuite(F).run()
CPU times: user 5.78 s, sys: 0.12 s, total: 5.90 s
Wall time: 5.92 s
sage: %time TestSuite(F).run()
CPU times: user 3.18 s, sys: 0.03 s, total: 3.21 s
Wall time: 3.22 s

with

trac11490-coercion_tutorial.patch
14182_whitespace.patch
trac_13618-rings_doc_real-ts.patch
trac_13618-rings_doc_complex-ts.patch
trac_13618-rings_doc_others-ts.patch
trac_14040_housekeeping.patch
trac_14040_repr_option.patch
trac_14040_doctest.patch
trac_14040_review.patch
trac14054_fast_methods-5.8.patch
trac_12313-mono_dict-combined-random-sk.patch
trac_13387-combined.patch
trac_13387-guard_gc.patch
trac_14159_weak_value_triple_dict.patch
trac_14159_use_cdef_get.patch
trac12951_cached_extension.patch
trac_14214-cython_homset.patch
trac_14214-backup_cache.patch
trac_14063-remove_cc_compositions-ts.patch
trac_13688-finite_sets_cardinality_override-ts.patch
trac_14138.patch
trac_14138-doctests.patch
trac_13605-partition_options-ts.patch
trac_14011-Sphinx_roles-fh.patch
trac_14054-fix-rigged-list.2.patch
trac_11410-zero_one_sequence_partitions-pod.v2.patch
trac_11410-zero_one_sequence_partitions-review-ts.patch
trac_14225-partition_classcall_private.patch
trac_14228-sym_speedup-ts.patch

Let's see if caching the few permutations gives us yet some more speed.

@simon-king-jena
Copy link
Member Author

comment:14

Question about your merged patch: In partition.py, you've put a cached_method in front of one of the two young_subgroup methods and one of the two young_soubgroup_generator methods.

Was that intended? I thought it would be better to keep a pointer to the resulting permutation groups, not to some lists.

@tscrim
Copy link
Collaborator

tscrim commented Mar 5, 2013

comment:15

I tried with the cached method in the parent as well and got approximately the same timings:

sage: %time TestSuite(F).run()
CPU times: user 7.23 s, sys: 0.10 s, total: 7.33 s
Wall time: 7.46 s
sage: %time TestSuite(F).run()
CPU times: user 4.38 s, sys: 0.01 s, total: 4.40 s
Wall time: 4.63 s

Hmm...strange we're getting such different timings. I'm running 5.8.beta1 with the following patches:

trac_14040_doctest.patch
trac_14040_housekeeping.patch
trac_14040_repr_option.patch
trac_14040_review.patch
trac_14085-root_system-ambient_spaces-nt.patch
trac_14000.patch
trac_14000-doctests.patch
trac_14174-coxeter_matrix-type_H.patch
trac_14063-remove_cc_compositions-ts.patch
trac_8920-alphabet.patch
trac_7886_conjugacy_classes_combined.patch
trac_7886-conjugacy_classes-review-ts.patch
trac_14177-graph-color_by_labels-nt.patch
trac_14176-polyhedrons-operators-nt.patch
14182_whitespace.patch
trac_14130-gyw-bs.patch
trac_13605-partition_options-ts.patch
trac_14111-qsym_tutorial-sb.patch
trac_14111-qsym_tutorial-review-ts.patch
trac_12543-import_statements-vd.patch
trac14054_fast_methods-5.8.patch
trac_14054-fix-rigged-list.2.patch
trac_2023-dynkin_graphs-ts.patch
trac_11410-zero_one_sequence_partitions-pod.v2.patch
trac_11410-zero_one_sequence_partitions-review-ts.patch
trac_14225-partition_classcall_private.patch
trac_14228-sym_speedup-ts.patch

I thought that's what your patch did. Whoops! I'll change it on the next version. There's still a few more things I want to try.

@simon-king-jena
Copy link
Member Author

comment:16

Did you do tests with your patch?

I get errors such as

    Traceback (most recent call last):
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_0[17]>", line 1, in <module>
        Partitions(Integer(4), outer=[Integer(3),Integer(1),Integer(1)]).list()###line 138:
    sage: Partitions(4, outer=[3,1,1]).list()
      File "classcall_metaclass.pyx", line 279, in sage.misc.classcall_metaclass.ClasscallMetaclass.__call__ (sage/misc/classcall_metaclass.c:932)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/lib/python/site-packages/sage/combinat/partition.py", line 4006, in __classcall_private__
        return IntegerListsLex(n, **kwargs)
      File "classcall_metaclass.pyx", line 279, in sage.misc.classcall_metaclass.ClasscallMetaclass.__call__ (sage/misc/classcall_metaclass.c:932)
      File "cachefunc.pyx", line 991, in sage.misc.cachefunc.WeakCachedFunction.__call__ (sage/misc/cachefunc.c:5120)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/lib/python2.7/weakref.py", line 56, in __getitem__
        o = self.data[key]()
    TypeError: unhashable type: 'list'

Hence, it seems that IntegerListsLex should have a __classcall_private__ or something like that. UniqueRepresentation will only work if the defining data are hashable.

@tscrim
Copy link
Collaborator

tscrim commented Mar 5, 2013

comment:17

I just did them and removed the CachedRepresentation since that no longer will affect the timing (since I'm using PartitionsGreatestLE which already is a UniqueRepresentation). I'm also removing the (near) duplicate young_subgroup* methods.

@tscrim
Copy link
Collaborator

tscrim commented Mar 5, 2013

comment:18

Okay, I also added a little micro-optimization by explicitly calling the element_class of the stored k bounded partitions parent which circumvents some the additional/duplicate _element_constructor_ tests and doctests in modified files all pass. I don't know what else to do (or we can do) at this point. With your patches, it seems we're back close to the original speed.

@tscrim

This comment has been minimized.

@simon-king-jena
Copy link
Member Author

comment:19

With the patch that you posted last, I get

sage: from sage.combinat.sf.k_dual import AffineSchurFunctions
sage: F = AffineSchurFunctions(SymmetricFunctions(QQ['t']).kBoundedQuotient(Integer(4),t=Integer(1)))
sage: %time TestSuite(F).run()
CPU times: user 5.77 s, sys: 0.06 s, total: 5.83 s
Wall time: 5.87 s
sage: %time TestSuite(F).run()
CPU times: user 3.19 s, sys: 0.04 s, total: 3.24 s
Wall time: 3.24 s

Additionally using a local cache for Partitions or Weyl group gives no further improvement.

So, I suggest we use the merged patch as basis for a our reviews. Is it cross review? Does this patch contain my code?

@tscrim
Copy link
Collaborator

tscrim commented Mar 5, 2013

comment:20

I believe only the cached method on young_subgroup() and the change in the import statements in partition.py.

@simon-king-jena
Copy link
Member Author

Reviewer: Simon King

@simon-king-jena
Copy link
Member Author

Changed author from Simon King, Travis Scrimshaw to Travis Scrimshaw

@simon-king-jena
Copy link
Member Author

comment:21

OK. So, I am supposed to review it, then.

@simon-king-jena
Copy link
Member Author

comment:22

The change to PartitionsGreatestLE gives the following errors in sage.combinat.sf.kschur. I know that this module is deprecated, but nevertheless, I think the following attribute error (occurring in several tests) should be fixed:

File "/home/simon/SAGE/prerelease/sage-5.8.beta0/devel/sage-main/sage/combinat/sf/kschur.py", line 42:
    sage: s(ks3([3,2,1]))
Exception raised:
    Traceback (most recent call last):
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_1[5]>", line 1, in <module>
        s(ks3([Integer(3),Integer(2),Integer(1)]))###line 42:
    sage: s(ks3([3,2,1]))
      File "parent.pyx", line 914, in sage.structure.parent.Parent.__call__ (sage/structure/parent.c:7826)
      File "coerce_maps.pyx", line 82, in sage.structure.coerce_maps.DefaultConvertMap_unique._call_ (sage/structure/coerce_maps.c:3656)
      File "coerce_maps.pyx", line 77, in sage.structure.coerce_maps.DefaultConvertMap_unique._call_ (sage/structure/coerce_maps.c:3558)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/lib/python/site-packages/sage/combinat/free_module.py", line 1469, in _element_constructor_
        return self.monomial(self._basis_keys(x))
      File "parent.pyx", line 910, in sage.structure.parent.Parent.__call__ (sage/structure/parent.c:7772)
      File "parent.pyx", line 2288, in sage.structure.parent.Parent.convert_map_from (sage/structure/parent.c:15651)
      File "parent.pyx", line 2295, in sage.structure.parent.Parent.discover_convert_map_from (sage/structure/parent.c:15816)
      File "parent.pyx", line 2106, in sage.structure.parent.Parent.coerce_map_from (sage/structure/parent.c:14358)
      File "parent.pyx", line 2927, in sage.structure.parent._register_pair (sage/structure/parent.c:21520)
      File "parent.pyx", line 2901, in sage.structure.parent.EltPair.__hash__ (sage/structure/parent.c:21168)
      File "category_object.pyx", line 788, in sage.structure.category_object.CategoryObject.__hash__ (sage/structure/category_object.c:8150)
      File "sage_object.pyx", line 154, in sage.structure.sage_object.SageObject.__repr__ (sage/structure/sage_object.c:1769)
      File "/home/simon/SAGE/prerelease/sage-5.8.beta0/local/lib/python/site-packages/sage/combinat/partition.py", line 5653, in _repr_
        return "Partitions of %s having parts less than or equal to %s"%(self.n, self.k)
      File "parent.pyx", line 669, in sage.structure.parent.Parent.__getattr__ (sage/structure/parent.c:6277)
      File "misc.pyx", line 205, in sage.structure.misc.getattr_from_other_class (sage/structure/misc.c:1406)
    AttributeError: 'PartitionsGreatestLE_with_category' object has no attribute 'n'

For the patchbot:

Apply trac_14228-sym_speedup-ts.patch

@simon-king-jena
Copy link
Member Author

comment:24

PS: According to the traceback of this error, it uses the hash of CategoryObject---which is extremely slow, as it involves computing the string representation. So, does PartitionsGreatestLE_with_category not have a reasonable hash?

@tscrim
Copy link
Collaborator

tscrim commented Mar 5, 2013

comment:25

I'll implement a custom hash for it and fix those attribute errors shortly.

@tscrim
Copy link
Collaborator

tscrim commented Mar 6, 2013

Fixed doctests

@tscrim
Copy link
Collaborator

tscrim commented Mar 6, 2013

comment:26

Attachment: trac_14228-sym_speedup-ts.patch.gz

Sorry that took so long. I went out for dinner in between. I fixed the doctests in kschur.py, it was trying to pass in a NonNegativeIntergers() (which is currently not really supported by Partitions but probably should be), so I just removed that. I also implemented a custom hash which may not generate the best hash values, but it should do the job.

@tscrim
Copy link
Collaborator

tscrim commented Mar 6, 2013

comment:27

I also forgot:

For patchbot:

Apply: trac_14228-sym_speedup-ts.patch

@simon-king-jena
Copy link
Member Author

comment:28

It seems that the coverage script is complaining. Indeed, KBoundedQuotientBasis.__init__ needs a test, and perhaps documentation what it is doing.

You now define a hash for PartitionsGreatestLE. But I see that it inherits from unique representation. Hence, it should have a blazingly fast hash anyway! So, I wonder where that hash (and comparison!) has disappeared!

I reckon that one should do

class PartitionsGreatestLE(UniqueRepresentation, IntegerListsLex):

and not

class PartitionsGreatestLE(IntegerListsLex, UniqueRepresentation):

I think it is documented somewhere that UniqueRepresentation should be the first base class given.

My plan is to add a reviewer patch, removing the custom hash and bringing UniqueRepresentation's hash forward, and adding documentation plus test for the new base class.

@simon-king-jena
Copy link
Member Author

Attachment: trac_14228-reviewer.patch.gz

@simon-king-jena

This comment has been minimized.

@simon-king-jena
Copy link
Member Author

comment:30

I added a reviewer patch, as I indicated in my previous post. Hope you agree with my changes. I will finish the review as soon as all tests pass.

Apply trac_14228-sym_speedup-ts.patch trac_14228-reviewer.patch

@simon-king-jena
Copy link
Member Author

comment:31

The tests pass (including the review patch), Travis' patch looks fine to me, and things in k_dual become a lot faster. Hence, it is a positive review here. I will now see how the new timings compare with pre-#12313 and post-#12313, so that hopefully #13991 can be resolved as fixed.

@tscrim
Copy link
Collaborator

tscrim commented Mar 6, 2013

comment:33

Looks good to me too. Thank you Simon.

Best,

Travis

@jdemeyer
Copy link

jdemeyer commented Mar 7, 2013

Merged: sage-5.8.beta4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants