[Refactor] Pattern matching #1799

Merged
merged 7 commits into from
Feb 9, 2024

Conversation

yannham
Member

@yannham yannham commented Feb 2, 2024

This PR is preliminary work for the introduction of an elimination form for ADTs. It is a pure refactoring, which shouldn't incur any user-facing change.

This refactoring was motivated by the fact that the existing codebase used inconsistent names and relied heavily on the assumption that the only possible patterns are record patterns. That assumption would make the addition of new patterns impossible (ADTs, but not limited to them: constant patterns as well, e.g. for strings, array patterns, regular expressions, etc.).

Content

  • rename several symbols for better consistency and generality. In particular, `destructuring` is now the submodule `term::pattern` (and the same goes for the other `destructuring` submodules, which are now named `pattern`), and the strange distinction between matches and destructuring is ditched: everything is a pattern.
  • restructure the representation of a pattern. In particular, this PR introduces a structure that better mimics the actual ADT of a pattern, while the previous one mangled nodes together.
  • better handling of aliased patterns: the alias is now an optional field instead of a separate variant, which streamlines its handling in several parts of the code base.
  • add span information to every subpattern.
  • factor out operations that apply to several types of patterns into proper traits, instead of multiplying free-standing functions like `do_stuff_record_pattern`, `do_stuff_field_pattern`, etc.
  • simplify the implementation of the typechecking of patterns. The previous implementation built the type of a pattern in a first step, which was always expected to be a record type, and in a second step extracted from this type the types of the bindings brought into scope by the pattern. This extraction was a bit complicated (it built a hashmap representation of the row type) and a bit unnecessary. The current implementation builds both the type of the whole pattern and the type of each binding in a single pass. This is simpler, and it no longer has to assume that the type of the pattern is made of nested record types, which is a prerequisite for extending patterns in the future.
  • avoid illegal states for the tail of a record pattern (previously, `open = false, rest = Some("captured_var")` could be represented).
  • simplify `LetPattern` and `FunPattern` so that they no longer take an additional top-level alias as an argument, and simply store this data inside the pattern itself.
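
To make the trait-based, single-pass typechecking described above more concrete, here is a minimal sketch. It is not the code of this PR: the trait name `PatternTypes`, the simplified `Type` enum, and the use of plain `String` identifiers are all assumptions made to keep the example self-contained. Each pattern node returns its own type and, in the same traversal, records the type of every binding it introduces.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the typechecker's types (assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    /// A type that is still to be inferred.
    Unknown,
    /// A record type, mapping field names to field types.
    Record(BTreeMap<String, Type>),
}

/// A pattern with an optional alias (hypothetical, simplified layout).
pub struct Pattern {
    pub data: PatternData,
    pub alias: Option<String>,
}

pub enum PatternData {
    /// A pattern that matches anything and binds it to an identifier.
    Any(String),
    Record(RecordPattern),
}

pub struct RecordPattern {
    /// Pairs of (matched field name, sub-pattern).
    pub fields: Vec<(String, Pattern)>,
}

/// One trait for all pattern nodes, instead of one free-standing function
/// per kind of node. The trait name is hypothetical.
pub trait PatternTypes {
    /// Returns the type of the whole pattern and appends the type of each
    /// binding to `bindings`, in a single traversal.
    fn pattern_types(&self, bindings: &mut Vec<(String, Type)>) -> Type;
}

impl PatternTypes for RecordPattern {
    fn pattern_types(&self, bindings: &mut Vec<(String, Type)>) -> Type {
        let mut fields = BTreeMap::new();
        for (name, pattern) in &self.fields {
            fields.insert(name.clone(), pattern.pattern_types(bindings));
        }
        Type::Record(fields)
    }
}

impl PatternTypes for Pattern {
    fn pattern_types(&self, bindings: &mut Vec<(String, Type)>) -> Type {
        let ty = match &self.data {
            PatternData::Any(id) => {
                bindings.push((id.clone(), Type::Unknown));
                Type::Unknown
            }
            PatternData::Record(record) => record.pattern_types(bindings),
        };
        // The alias, if any, is bound to the type of the whole pattern.
        if let Some(alias) = &self.alias {
            bindings.push((alias.clone(), ty.clone()));
        }
        ty
    }
}
```

Because bindings are collected while the type is being built, nothing has to be extracted from the resulting type afterwards, so the result doesn't need to be a record type for the bindings to be found.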

Review

Unfortunately, the git diff doesn't seem to understand that destructuring.rs was renamed to term/pattern.rs (and similarly for typecheck::destructuring -> typecheck::pattern, nickel_lang_lsp::destructuring -> nickel_lang_lsp::pattern). That being said, the changes are quite heavy, so it may not be a bad idea to review the result as new code (but take a look at the deleted file to get an idea of the previous state).

@yannham yannham marked this pull request as ready for review February 6, 2024 17:52
@yannham yannham requested review from jneem and vkleen February 6, 2024 18:07
@yannham yannham mentioned this pull request Feb 8, 2024
1 task
core/src/transform/desugar_destructuring.rs: 5 review threads (outdated, resolved)
yannham and others added 7 commits February 9, 2024 15:14
This commit is preliminary work for the upcoming ADTs (and more
generally the introduction of pattern matching, instead of just having
destructuring).

The refactoring aims at establishing consistent naming, simplifying the
representation of patterns and their associated methods, and paving the
way for patterns other than just records. Indeed, the previous
implementation often made the implicit assumption that patterns were
only record patterns.

Record patterns previously used a tuple `(open: bool, rest:
Option<LocIdent>)` to represent the presence of either no tail, a
non-capturing tail `..`, or a capturing tail `..rest`. This
representation allows the illegal state `(false, Some("x"))`: indeed, if
the tail is capturing, the record contract is necessarily open.

This commit flattens this representation to a single enum that correctly
represents those three different cases, instead of the 4 allowed by the
previous representation.
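
The flattening described in this commit message can be sketched as follows. The enum and variant names are assumptions (the message doesn't name them), and `String` stands in for `LocIdent`. The `from_legacy` helper is purely illustrative: it shows that exactly one of the four `(open, rest)` combinations has no counterpart in the enum.

```rust
/// The three legal shapes of a record pattern's tail.
/// Type and variant names are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RecordPatternTail {
    /// `{foo, bar}`: no tail, so the matched record is closed.
    Empty,
    /// `{foo, ..}`: extra fields are allowed but not bound.
    Open,
    /// `{foo, ..rest}`: extra fields are bound to `rest`.
    Capture(String),
}

impl RecordPatternTail {
    /// Converts the old `(open, rest)` pair. Returns `None` for the
    /// illegal state `(false, Some(_))`, which the enum cannot express.
    pub fn from_legacy(open: bool, rest: Option<String>) -> Option<Self> {
        match (open, rest) {
            (false, None) => Some(Self::Empty),
            (true, None) => Some(Self::Open),
            (true, Some(id)) => Some(Self::Capture(id)),
            (false, Some(_)) => None,
        }
    }

    /// A capturing tail is necessarily open, by construction.
    pub fn is_open(&self) -> bool {
        !matches!(self, Self::Empty)
    }
}
```

With this shape, code that consumes a tail matches on three variants and never needs a branch for a capturing but closed tail.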

The AST of patterns had a special node for an aliased pattern, which was
a variant containing the alias and a potential nested pattern. However,
this doesn't model patterns correctly: usually, it doesn't make sense to
stack aliases (and the parser won't accept it), but the previous
representation accepted ASTs for things like `x @ y @ z @ <pat>`, which
incurs an additional burden to handle, although it can actually never
happen.

Additionally, the alias of the top pattern was duplicated as an optional
field in the `LetPattern` and `FunPattern` nodes of the `Term` AST.

This commit makes things simpler by storing `alias` as an optional field
directly in the `Pattern` struct, which makes it accessible without
having to pattern match on an enum variant, and forbids nested aliases.
Doing so, we remove the duplication from `LetPattern` and
`FunPattern`, which now only take a pattern instead of an optional
identifier and a pattern, leading to code simplification.
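
As a rough illustration of the representation described in this commit message, the sketch below stores the alias as an optional field of `Pattern`. Only the names `Pattern` and `alias` come from the message. `PatternData`, `FieldPattern`, their fields, the `Ident` alias (a stand-in for `LocIdent`) and the `bindings` helper are assumptions.

```rust
/// Stand-in for Nickel's `LocIdent`, to keep the sketch self-contained.
type Ident = String;

/// The shape of a pattern, without any alias (hypothetical name and variants).
#[derive(Debug, Clone, PartialEq)]
pub enum PatternData {
    /// Matches anything and binds it to an identifier.
    Any(Ident),
    /// Matches a record, field by field.
    Record(Vec<FieldPattern>),
}

/// A field of a record pattern (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
pub struct FieldPattern {
    pub matched_id: Ident,
    pub pattern: Pattern,
}

/// A pattern carries at most one alias: `x @ y @ <pat>` cannot be built.
#[derive(Debug, Clone, PartialEq)]
pub struct Pattern {
    pub data: PatternData,
    pub alias: Option<Ident>,
}

impl Pattern {
    /// Lists the identifiers bound by this pattern: the alias first, if
    /// any, then the bindings of the pattern's content. The alias is read
    /// directly from the field, without matching on a dedicated variant.
    pub fn bindings(&self) -> Vec<Ident> {
        let mut out: Vec<Ident> = self.alias.iter().cloned().collect();
        match &self.data {
            PatternData::Any(id) => out.push(id.clone()),
            PatternData::Record(fields) => {
                for field in fields {
                    out.extend(field.pattern.bindings());
                }
            }
        }
        out
    }
}
```

Since the alias travels with the pattern, a node such as `LetPattern` only needs to hold a `Pattern`, and there is no separate optional identifier to keep in sync.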

The refactoring of patterns has introduced a slightly different
algorithm for typechecking patterns, which isn't entirely
backward-compatible, although it's more consistent. We'll probably rule
out (i.e. deprecate) the offending special cases, but until then, this
commit restores the previous behavior, which fixes a previously failing
test.

This commit only applies pure renaming of several symbols of the
destructuring module for improved clarity. The whole module is also
moved to `term::pattern`, as patterns are just a syntactic component of
the term AST.
@yannham yannham added this pull request to the merge queue Feb 9, 2024
Merged via the queue into master with commit b57f74b Feb 9, 2024
5 checks passed
@yannham yannham deleted the refactor/pattern-matching branch February 9, 2024 14:52
2 participants