Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hclsyntax: Recovery of outer ObjectConsExpr with empty TemplateExpr #641

Open
dbanck opened this issue Nov 17, 2023 · 1 comment
Open

hclsyntax: Recovery of outer ObjectConsExpr with empty TemplateExpr #641

dbanck opened this issue Nov 17, 2023 · 1 comment

Comments

@dbanck
Copy link
Member

dbanck commented Nov 17, 2023

During the work to support template expressions in the language server, we discovered a different recovery behavior when dealing with empty template expressions.

"Regular" / Expected

package main

import (
	"log"

	"github.com/hashicorp/hcl/v2"
	"github.com/hashicorp/hcl/v2/hclsyntax"
)

func main() {
	cfg := `attr = "foo-${}-bar"`
	f, diags := hclsyntax.ParseConfig([]byte(cfg), "", hcl.InitialPos)
	if len(diags) > 0 {
		log.Printf("diagnostics: %s", diags)
	}
	attrs, _ := f.Body.JustAttributes()
	log.Printf("expression: %#v", attrs["attr"].Expr)
}

In this case, we get a diagnostic for an invalid expression, as expected, but the parsed configuration still contains a TemplateExpr with three parts. The second LiteralValueExpr corresponds to the empty template expression ${}.

2009/11/10 23:00:00 diagnostics: :1,15-16: Invalid expression; Expected the start of an expression, but found an invalid expression token.
2009/11/10 23:00:00 expression: &hclsyntax.TemplateExpr{Parts:[]hclsyntax.Expression{(*hclsyntax.LiteralValueExpr)(0xc000062720), (*hclsyntax.LiteralValueExpr)(0xc000062480), (*hclsyntax.LiteralValueExpr)(0xc000062780)}, SrcRange:hcl.Range{Filename:"", Start:hcl.Pos{Line:1, Column:8, Byte:7}, End:hcl.Pos{Line:1, Column:21, Byte:20}}}

Inside ObjectConsExpr

cfg := `attr = { foo = "foo-${}-bar" }`

If we encounter the empty template expression inside an object value, the parser stops and recovers the object without any items. All entries before the template expression are part of the result, all items after it are lost.

2009/11/10 23:00:00 diagnostics: :1,23-24: Invalid expression; Expected the start of an expression, but found an invalid expression token.
2009/11/10 23:00:00 expression: &hclsyntax.ObjectConsExpr{Items:[]hclsyntax.ObjectConsItem(nil), SrcRange:hcl.Range{Filename:"", Start:hcl.Pos{Line:1, Column:8, Byte:7}, End:hcl.Pos{Line:1, Column:31, Byte:30}}, OpenRange:hcl.Range{Filename:"", Start:hcl.Pos{Line:1, Column:8, Byte:7}, End:hcl.Pos{Line:1, Column:9, Byte:8}}}

This looks a bit similar to #597, but in this case the configuration is nearly valid and works in the context of a simple attribute.

Proposal

Explore whether it might be possible to handle the empty template expression differently than the default recovery on invalid value expression?

hcl/hclsyntax/parser.go

Lines 1496 to 1505 in c964a71

value, valueDiags := p.ParseExpression()
diags = append(diags, valueDiags...)
if p.recovery && valueDiags.HasErrors() {
// If expression parsing failed then we are probably in a strange
// place in the token stream, so we'll bail out and try to reset
// to after our closing brace to allow parsing to continue.
close = p.recover(TokenCBrace)
break
}

@apparentlymart
Copy link
Contributor

Thanks for sharing this, @dbanck.

The comment in the last source snippet you shared matched what was my initial instinct while thinking about this: I expect it's the way it is because symmetrical bracket tokens tend to be a more reliable heuristic for recovery than single tokens like , and =.

We could certainly try seeing how it behaves trying to use the terminating comma as the recovery anchor. I think the trick will be how it ends up dealing with the situation where the remainder of the invalid expression itself contains a comma somewhere, such as if there's a nested object constructor, nested tuple constructor, function call, or for expression. In that case, the recovery logic could potentially stop too early and end up at a position where another object entry is not expected, which could then cause some confusing downstream errors.

Of course, everything about this recovery logic is heuristic-based rather than exact anyway -- if the token stream isn't valid we're always doing a fair amount of guessing -- so I think the best we can do here is experiment with some different examples and see how they each behave, and then decide whether the cure seems better or worse than the disease.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants