Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

checkexpr: cache type of container literals when possible #12707

Merged
merged 1 commit into from
May 20, 2022

Conversation

huguesb
Copy link
Contributor

@huguesb huguesb commented May 1, 2022

When a container (list, set, tuple, or dict) literal expression is
used as an argument to an overloaded function it will get repeatedly
typechecked. This becomes particularly problematic when the expression
is somewhat large, as seen in #9427

To avoid repeated work, add a new field in the relevant AST nodes to
cache the resolved type of the expression. Right now the cache is
only used in the fast path, although it could conceivably be leveraged
for the slow path as well in a follow-up commit.

To further reduce duplicate work, when the fast-path doesn't work, we
use the cache to make a note of that, to avoid repeatedly attempting to
take the fast path.

For #9427

@github-actions

This comment has been minimized.

1 similar comment
@github-actions

This comment has been minimized.

@huguesb huguesb force-pushed the pr-fast-container-type-cache branch from 3acbc1f to ac11374 Compare May 3, 2022 20:31
@huguesb
Copy link
Contributor Author

huguesb commented May 3, 2022

@JukkaL @hauntsaninja I would appreciate a review. I realize that this approach to caching might cause some debate but I do think it is worthwhile.

@github-actions

This comment has been minimized.

@JukkaL
Copy link
Collaborator

JukkaL commented May 9, 2022

[Not a full review.] I don't love the idea of putting the cached inferred type in the AST node. It can get stale if we reuse the AST in daemon, etc. It seems like it could cause some hard-to-debug bugs.

However, caching intermediate results is a good idea -- we just need an implementation that isn't error-prone. One idea would be to maintain a cache in TypeChecker as a dict from AST node to inferred type. We can then flush the cache at the end of each function, module or block, for example, to avoid cached results impacting results in another type checking pass, etc.

@huguesb
Copy link
Contributor Author

huguesb commented May 9, 2022

[Not a full review.] I don't love the idea of putting the cached inferred type in the AST node. It can get stale if we reuse the AST in daemon, etc. It seems like it could cause some hard-to-debug bugs.

Not attached to this approach, although I'm curious when the AST would get reused? Presumably the daemon will have to parse files again if they have changed? Or might an AST be kept around and re-checked after a dependency is updated?

However, caching intermediate results is a good idea -- we just need an implementation that isn't error-prone. One idea would be to maintain a cache in TypeChecker as a dict from AST node to inferred type. We can then flush the cache at the end of each function, module or block, for example, to avoid cached results impacting results in another type checking pass, etc.

Sure, I can rework this to use a dict cache instead.

When a container (list, set, tuple, or dict) literal expression is
used as an argument to an overloaded function it will get repeatedly
typechecked. This becomes particularly problematic when the expression
is somewhat large, as seen in python#9427

To avoid repeated work, add a new cache in ExprChecker, mapping the AST
node to the resolved type of the expression. Right now the cache is
only used in the fast path, although it could conceivably be leveraged
for the slow path as well in a follow-up commit.

To further reduce duplicate work, when the fast-path doesn't work, we
use the cache to make a note of that, to avoid repeatedly attempting to
take the fast path.

Fixes python#9427
@huguesb huguesb force-pushed the pr-fast-container-type-cache branch from ac11374 to af675fe Compare May 16, 2022 07:47
@github-actions
Copy link
Contributor

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

@huguesb huguesb mentioned this pull request May 18, 2022
Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now, thanks for the updates! Large container literals are common, and it's nice to be able to type check more of them quickly.

@JukkaL JukkaL merged commit 1b7e33f into python:master May 20, 2022
JukkaL pushed a commit that referenced this pull request May 20, 2022
When a container (list, set, tuple, or dict) literal expression is
used as an argument to an overloaded function it will get repeatedly
typechecked. This becomes particularly problematic when the expression
is somewhat large, as seen in #9427

To avoid repeated work, add a new cache in ExprChecker, mapping the AST
node to the resolved type of the expression. Right now the cache is
only used in the fast path, although it could conceivably be leveraged
for the slow path as well in a follow-up commit.

To further reduce duplicate work, when the fast-path doesn't work, we
use the cache to make a note of that, to avoid repeatedly attempting to
take the fast path.

Fixes #9427
@huguesb huguesb deleted the pr-fast-container-type-cache branch June 10, 2022 08:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants