Improvements to query performance #3258
beets/dbcore/db.py
Outdated
@@ -581,16 +581,6 @@ def add(self, db=None):
        self._dirty.add(key)
        self.store()

    # Formatting and templating.

    _formatter = FormattedMapping
This change is not as innocuous as it seems: the `_formatter` attribute is used by the `beets.library.Item` subclass of `Model`, which replaces `FormattedMapping` with `FormattedItemMapping`. I strongly suspect that this, rather than the removal of an additional factory method wrapping it, is the reason for the performance improvement. Without reading the code, I don't know what the exact implications are, but this commit must surely be breaking some of beets' functionality.
There might of course be room for optimization in `FormattedItemMapping`.
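To illustrate the concern, here is a minimal sketch using simplified stand-in classes (these are not the actual beets implementations, and `describe` and `formatted_inlined` are hypothetical names). It shows how a class attribute used as a factory hook lets a subclass swap in its own mapping, and how hard-coding the base class in the method silently bypasses that override.

```python
class FormattedMapping:
    """Stand-in for the generic formatted view of a model."""

    def __init__(self, model):
        self.model = model

    def describe(self):
        return "generic"


class FormattedItemMapping(FormattedMapping):
    """Stand-in for the item-specific formatted view."""

    def describe(self):
        return "item-specific"


class Model:
    # Hook: subclasses may override this attribute to customize formatting.
    _formatter = FormattedMapping

    def formatted(self):
        # Looks the hook up through the instance, so subclass overrides apply.
        return self._formatter(self)

    def formatted_inlined(self):
        # Hypothetical "inlined" variant: the base class is hard-coded,
        # so any subclass override of _formatter is ignored.
        return FormattedMapping(self)


class Item(Model):
    _formatter = FormattedItemMapping


item = Item()
print(item.formatted().describe())          # item-specific
print(item.formatted_inlined().describe())  # generic
```

If the inlined variant skips work that the item-specific mapping would normally do, a speedup from inlining would be expected, but it would come with a change in behavior.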
Ah, thanks! I couldn't figure out why this would impact performance so much. I guess I'll investigate this further.
Thanks for tackling this long-standing issue! However, I believe the overall role of templates in beets should be investigated somewhat further before deciding on a strategy for fixing it. In particular, @sampsyo made a few suggestions in #2030: there does not seem to be one obvious way to approach caching templates. EDIT: I might personally be a little too perfectionist in that regard; it's probably fine to just choose one of the approaches that have been proposed and go for it, in particular since there are no user-facing changes.
You already mentioned #2030 in your comments in #2388: I think one of the two approaches I benchmarked is very close to what you intended to do. Note the comments about the default template. I believe there are plans to drop Python 2 support in the near future, but as long as we still support it, this change must at least be disabled when on Python 2 (which might be an acceptable approach if Python 2 support is indeed removed soon-ish).
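One possible way to disable the caching on Python 2 is sketched below, assuming "this change" refers to the use of `functools.lru_cache`, which does not exist in the Python 2 standard library. The names `optional_lru_cache` and `compile_template` are hypothetical and chosen only for this illustration; the PR may handle this differently.

```python
import functools


def optional_lru_cache(maxsize=128):
    """Return functools.lru_cache where available, else a no-op decorator."""
    if hasattr(functools, "lru_cache"):  # Python 3.2 and later
        return functools.lru_cache(maxsize=maxsize)

    def passthrough(func):
        # Python 2: leave the function uncached.
        return func

    return passthrough


calls = []  # records every real (uncached) invocation


@optional_lru_cache(maxsize=32)
def compile_template(source):
    # Stand-in for an expensive compilation step.
    calls.append(source)
    return source.upper()


compile_template("$artist")
compile_template("$artist")  # served from the cache on Python 3
```

On Python 3 the second call does not reach the function body; on Python 2 the code would still run, just without the speedup.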
Without any traceback, I cannot really tell what the issue is. Note that you can run individual test environments using
Thanks, I should have read the issue more carefully. So the suggestion to create a I looked into the performance of
Awesome work so far! This is super interesting. It seems like there are two main things here: LRU-caching the templates, and making I don't see any big downsides to using
We were previously calling Template() directly, sometimes in a loop. This caused the same template to be recompiled over and over. This commit introduces a function template() which caches the results, so that multiple calls with the same template string do not require recompilation.
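The idea in this commit message can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the commit: the `Template` class here is a stand-in that only counts how often it is constructed, and `maxsize=128` is an arbitrary choice.

```python
from functools import lru_cache


class Template:
    """Stand-in for a template class that is expensive to compile."""

    compile_count = 0

    def __init__(self, source):
        Template.compile_count += 1  # count each (re)compilation
        self.source = source


@lru_cache(maxsize=128)
def template(source):
    """Return a compiled Template, reusing the result for repeated strings."""
    return Template(source)


# Calling in a loop now compiles the template only once.
for _ in range(1000):
    t = template("$artist - $title")

print(Template.compile_count)  # 1
```

Note that every caller receives the same cached object, so this approach is only safe if compiled templates are treated as immutable.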
38939ea to c5075b2
I fixed the flake8 errors, and disabled
I can confirm the performance impact on my machine! Here are three
And with it:
So that's a change from about 7.0 seconds to about 4.9 seconds, or a 1.4x speedup. 🎉
To answer your question:
Yep, tox depends on Python interpreters for each version you want to test under. You can always choose a specific environment by typing something like
Improvements to query performance
Don't restrict to Python 2 precisely.
As discussed in PR #2388, this PR gives some performance improvements for querying. On my database of around 14k songs, these commits make `beet ls > /dev/null` run in 6.4 seconds, down from 15.5.

The second commit confuses me. For some reason, inlining the call to `formatted()` makes the difference between 11 seconds and 6.4 seconds. I was confused for quite some time, thinking that `FormattedMapping` itself was doing something slow, but it turns out that the function call overhead was causing it?!

In this PR I have used `lru_cache` to cache the formatting templates. I would have preferred to explicitly create the template at the call site, but I can't seem to figure it out. The function `list_items`, in `ui/commands.py`, receives an argument `fmt`. I figured that this argument would contain the formatting string, but it appears to be empty. Whether I specify a formatting string on the command line or not does not seem to matter. Any pointers here would be great.

In its current state, I imagine that this PR would break Python 2 support. I don't know if you intend to support Python 2, given that it will be deprecated in half a year now.

Finally, I'm not sure how to run the tests. I run the `tox` command, but it complains about some interpreters missing. Am I expected to have every version of Python installed on my computer for this to work? Sorry, I'm not really a pythonista :)