-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Type-based completion in the new completer #1577
Conversation
(By the way, no particular need to get this into 1.2. I'm happy to hold off on merging until we get that out) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must say I'm not sure to understand the overall strategy very well, and in particular why do we need multiple linearizer and to linearize several times. Also, it seems like we do completion either using the term definition, either using the type when it's just a type, but not both. If this is indeed the case, I think it doesn't follow the philosophy that is laid out in the notes about the semantics of completion. Completion should take information from every possible source: if there is a defined value, a type annotation, and multiple contract annotations including a dict annotation with a record as a sub contract (think {foo | {_ : {bar : Number, ..}} = {baz = {other = 2}}}
, then we should compute completion information from all of those sources at the same time, and propose both bar
and other
when trying to complete foo.baz.[.]
. Dictionaries might be trickier, but this is not really the point - we can do that later.
lsp/nls/src/field_walker.rs
Outdated
/// `{ foo: Number }` | ||
/// ^-----------------------------------------------------------| | ||
/// then in the expression `x.foo`, the definition of `foo` is here | and the | ||
/// value of its definition is `Number`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might understand as I go more into the code, but this part tickles me. If you have {foo : Number}
, then x.foo
is undefined, it's not Number
(but the LSP knows that it has the type Number
). Here it seems like we conflate {foo : Number}
with {foo = Number}
or something? And how {foo : Number = 1}
is represented? What if you have something like {foo | {bar, baz, ..} = {blorg}}
, where the completion information should be bar, baz, blorg
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea here is that Value
(maybe not well named) is a thing that might have fields for recursing into when resolving paths. It could be a term (like { foo = 1 }
) or a type (like { foo : Number }
). For completion (and eventually goto definition) we need to be able to recurse through both of these cases. So for example, if x
has type { foo : { bar : { baz : Number }}}
and we ask for the fields on x.foo.bar
, then in the first call to FieldDefs::resolve
we'll find that x
has a field foo
with "value" { bar : { baz : Number }}
, and then on the next call to FieldDefs::resolve
we'll find the subfield bar
with "value" { baz: Number }
.
The recursion can look at both terms and types/contracts. For example, if you have
let x | { bar, baz, .. } = { blorg } in x.
then in the Term::Annotated
branch we collect the completions from both the annotation and the contract. We also accumulate fields from inferred types in FieldDefs::resolve
, and combine those fields with the ones from the term.
Thanks to this question, I realized that I'd forgotten to accumulate fields from contracts and types in record field metadata; this is fixed in the latest commit. With the fix, if you call resolve_path
on
{ foo : { bar : Number } = { bar = 1 }}
with the path foo
:
- on the first iteration, we'll see that the overall expression has an ident
foo
, and itsDef
has value{ bar = 1 }
and metadata{ bar : Number }
. - on the second iteration, we'll recurse into the previous
Def
forfoo
. We'll visit the value, which has a fieldbar
(whose Def has value 1, and no metadata). And we'll visit the metadata, which has a fieldbar
(whose Def has valueNumber
and no metadata). So we'll return twoDef
s, one for each occurrence ofbar
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for the explanation. I guess it's about naming and code organization rather than semantics, but what I find strange is that a metadata (a type annotation or a contract annotation) at a field level is treated differently depending on if the field has a definite value. Correct me if I'm wrong, but:
# case 1
{ foo | {..1..} = {..2..}}
# case 2
{ foo : {..1..}}
In both cases foo
has the same contract attached. But in the first case, the contract will go in the "metadata" for the completer, and the definition will be {..2..}
, while, in the other case, the definition will be {..1..}
(and then, will foo
have "metadata" in the sense of the completer?)
It seems, from a distance, that we shouldn't actually make a distinction between what's on the right of the equal, what's in metadata.annotation.contracts
and metadata.annotation.typ
. All of them are equal sources of completion information, et we might have to recurse into all of them (in fact, for something which definition is a record, having this record on the right of the equal or as a contract annotation doesn't make any difference for the completer, I believe). That's why I think I find confusing the notion of definition: it seems to arbitrarily pick one source of completion as the "definition", when there are several ones. You say that the completer uses all of the information sources then, so it's fine operationally, but I'm wondering about the modeling. For example, would that make sense to (very schematically, there might be other considerations):
- rename
Defn
->CompletionSource
- make it possible to attach multiple completion sources to each symbol/field/whatever, instead of only one definition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I used definition/Defn a lot, but I believe I'm rather talking about Value
: each field should have one definition, but instead of having one value, it would make more sense to me that they have multiple completion sources. Or, if you don't want to duplicate metadata and mimic the original's field structure, keep the metadata separate and set value to None
for {foo : {bar : Number}}
. I think what's confusing me is that we do something in between: when there's a definite value we keep the same structure as the original field - value inside value
and the rest in the metadata -, but for some reason, when there's no definition, we put the type annotation in the value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks, I'll see if I can find a better model for this. A few considerations:
- the name
Def
was chosen because the same infrastructure is supposed to eventually drive "goto definition/goto declaration" - we want to keep (at least part of) the metadata attached to fields: it's used for generating documentation, and not just for finding subfields
Maybe not relevant for much longer, but the current determinant for whether Value
is a type or a term is that when visiting a type in resolve_type
, the value is a type. When visiting a term in resolve_term
, the value is a term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I think I better see your point of "when resolving a path, we might end up on a subpart either of a type of a term" now. Then do you use metadata only for the documentation? Can Def.metadata
be set if value
is a type for example?
During resolution, what happens if you end up on a value
that has both a value definition, contract annotations and type annotations (say at on a field such as {foo | {..1..} : {..2..} = {..3..}
. Does it gives rise to as many other values that you explore (one for each item), with the path copied?
Sorry for all the questions 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not at all, questions are great!
For your first question, we're currently using it only for documentation. There are potentially other metadata fields in the completion response that we could be filling out; I haven't looked in much detail yet. Right now metadata
is only set for fields in a Term::Record
, so I guess that means it's only paired with value being a term
.
The answer depends on what "end up on a value" means:
- if you ask for all the fields on
{foo | {..1..} : {..2..} = {..3..}}
then you will get back a singleDef
, with identifierfoo
, value{..3..}
, and the rest inmetadata
- but if you take
{foo | {..1..} : {..2..} = {..3..}}
and ask "what are the fields on thefoo
field?" (i.e. you callresolve_path
with path[foo]
) then there will be three recursive calls toFieldDefs::resolve
: one for the value, one for the contract, and one for the type. The results of those will be merged together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I think this modelling issue might be resolved by a refactor that I was thinking of doing anyway: making resolve
one step lazier than it currently is. That is, currently resolve
returns a collection of Ident
s and Def
. Instead, I want resolve
to return a collection of things that have fields. So basically, the old way can be obtained from the new way by flat-mapping over the things that have fields and getting all their fields.
"Things that have fields" are
#[derive(Clone, Debug, PartialEq)]
pub enum FieldHaver { // I'd welcome a better name suggestion!
RecordTerm(RecordData),
Dict(Type),
RecordType(RecordRows),
}
There are three advantages of this:
- efficiency: when resolving a long path, we aren't interested in most fields of the intermediate path elements, and so it was wasteful to collect them all
- avoids some special-casing to handle the annoyance that we can't collect all fields of a
Dict
- avoids the need to have a
Def
type describing the "value" of a field.
I have a draft of this and it seems to work ok. If it sounds reasonable, I'll finish it up on Monday.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, from a conceptual point of view, I do find the FieldHaver
modeling/naming idea simpler and more natural, indeed.
pub struct Uninit; | ||
|
||
/// Marker trait for possible states of the linearization | ||
pub trait LinearizationState {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea here was to avoid having nonsensical things like Linearization<Ident>
. Of course you could still do it but you need the extra step of impl LinearizationState for Ident
. But I'm not convinced this was really useful, so it's probably fine to remove
We discussed this synchronously, but just so there's a record of it: I've replaced the double linearization pass by a single linearization pass that records two things. The eventual goal is to replace the As for accumulating both the type and term information, I believe this is done correctly now. There are three cases here
(This last one was missing until just recently) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plan discussed in the comments is compelling, and this PR is totally functional, so it might be easier reviewing-wise to merge this first and improve then.
Ah sorry, while you were typing that I finished and pushed the refactor... |
It's not very important 🙂 just ping me when it's ready to review |
This also adds a feature flag for turning off the old completer.
@yannham This has been rebased and it's ready for another look. The main changes compared to last time are in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't look into all the details again, but I found the field_walker
part easier to follow, and much less confusing 👍
/// | ||
/// So, for `term.x.y.z`, this will be `vec!["x", "y", "z"]`. | ||
pub path: Vec<Ident>, | ||
pub enum FieldHaver { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FieldHolder
? FieldContainer
? Recordoid
(ok, maybe not this one) ? I'm just randomly spitting suggestions, feel free to ignore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷 I stuck with FieldHaver
for now; we can always rename it later.
lsp/nls/src/field_walker.rs
Outdated
pub value: Option<RichTerm>, | ||
/// Field metadata. | ||
pub metadata: Option<FieldMetadata>, | ||
enum FieldValue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about FieldContent
or FieldData
(or something else) ? All in all I believe value is generally a reasonable name for it, but it's already quite overloaded in Nickel (values in the sense of the lambda calculus, and Field
already has a member named value
, which is rather a RichTerm
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FieldContent
it is.
lsp/nls/src/field_walker.rs
Outdated
pub fn ident(&self) -> LocIdent { | ||
self.ident | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's is the expected usage of self.ident()
instead of just doing self.ident
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I had a reason for it originally, but I can't figure out what 😕 It's gone now.
* Collect types into a table * Type-based completion This also adds a feature flag for turning off the old completer. * Remove bogus file * Explicitly visit type annotations * Add a test for annotated types * Only do one linearization traversal * Fix docs, and avoid some clones * Combine, and resolve metadata * Add a test * Slight improvements to the lsp spec * Refactor * Remove commented-out code * Remove the confusing initial-env thing * Complete the refactor, and document a little better * Fix some rebase issues * Clean up some unused code * Remove obsolete doc * Review comments
Updates the new completer to get completions from types also.
I also added a (hopefully short-lived) feature flag to turn off the old completer. This makes it easier to test the new completer.