syntax: unable to parse heredoc inside backtick #729

ristomcgehee · 2021-09-15T03:42:56Z

First off, I'd like to say thank you for maintaining this project. We're using your syntax parser in ossf/scorecard to parse shell files from thousands of repos. We're getting an error when trying to parse a specific file, and since the syntax is valid bash, I'm opening this issue. I've created a small program that reproduces the error:

#!/usr/bin/env bash

OUTPUT=`passwd 2>&1 << EOT
old_password
old_password
new_password
EOT`

echo $OUTPUT

Passing this file to shfmt gives this error:

<standard input>:7:4: reached EOF without closing quote `

mvdan · 2021-09-15T08:32:50Z

Thanks for filing this detailed issue! Parsing backticks is indeed tricky. There's also #636.

In general these edge cases haven't been a huge problem, because backticks have been deprecated for a while and most people use $(). Still seems worthwhile to try to fix the parser if it's possible, though. Better compatibility is generally a good thing, as long as it doesn't bring significant penalties.
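For scripts that can be edited, here is a minimal sketch of the same capture written with $() instead of backticks. It only illustrates the $() syntax mentioned above. `cat` stands in for `passwd` so the snippet runs non-interactively; that substitution is an assumption made for illustration and is not part of the original report.

```shell
# Same heredoc as in the report, captured with $(...) instead of backticks.
# `cat` is a stand-in for `passwd` (illustrative assumption) so this runs anywhere.
OUTPUT=$(cat 2>&1 << EOT
old_password
old_password
new_password
EOT
)

# Quote the variable so the newlines in the captured text are preserved.
echo "$OUTPUT"
```

With $(), the closing parenthesis sits on its own line after the heredoc delimiter, so the heredoc body still ends with a newline.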

mvdan · 2021-09-15T08:46:17Z

By the way, I took a quick look at your code, and I see you're trying to look at what commands people are running in their scripts. You can do this directly with the syntax package, but it's pretty manual as you only have the syntax tree. Have you seen the expand package? For example: https://pkg.go.dev/mvdan.cc/sh/v3/expand#Fields

ristomcgehee · 2021-09-16T02:41:09Z

I had not seen the expand package, but it looks pretty neat. I think it's a little more than we need at the moment since we don't know the environment that these shell files are executing in.

mvdan · 2022-01-01T22:52:58Z

I've sent #787, which should fix this issue. A review, or a confirmation that it fixes the problem for you, would be welcome :)

Our parser assumed that a heredoc must always end with a newline. Unfortunately, the following is valid shell:

`foo <<EOF
body
EOF`

Note the lack of a newline before the closing backquote.

The fix is relatively straightforward. The two methods which tokenize heredoc bodies, advanceLitHdoc and quotedHdocWord, must learn to treat (r == '`' && p.backquoteEnd()) as equivalent to the simpler case (r == '\n').

Note that we also make backquoteEnd more aggressive; it now returns true even if we're in a nested quote state. This is required because heredoc bodies use their own quote state, and otherwise we wouldn't realise we're closing a backtick.

This seems like a good change, because backticks are special in shell. They seem to tokenize at a much lower level, which allows for bits of code like the one quoted above, as well as:

arg0 `# actually an inline comment without a newline!` \
arg1

Fixes #729.
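To see how a shell treats the two constructs described in that commit message, here is a small sketch that runs both. `cat`, `set --`, and the variable names `out` and `argc` are illustrative choices, not part of the fix itself, and the behaviour is assumed for bash as discussed in this issue.

```shell
# Case 1: the heredoc delimiter is followed directly by the closing
# backquote, with no newline in between.
out=`cat <<EOF
body
EOF`
echo "$out"

# Case 2: a backquoted comment runs an empty command and expands to
# nothing, so the trailing backslash still joins the next line.
set -- arg0 `# actually an inline comment without a newline!` \
arg1
argc=$#
echo "$argc: $*"
```

In case 2 the command substitution contributes no words, so only arg0 and arg1 are left as arguments.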

ristomcgehee · 2022-01-07T04:38:41Z

I've run scorecard with the latest code in your master branch, and we're no longer getting an error on the repo that was failing earlier. Thanks Daniel!

ristomcgehee mentioned this issue Sep 15, 2021

BUG: Parsing errors ossf/scorecard#839

Closed

mvdan changed the title ~~Unable to parse heredoc inside backtick~~ syntax: unable to parse heredoc inside backtick Oct 2, 2021

mvdan mentioned this issue Jan 1, 2022

syntax: add support for heredocs directly inside backquotes #787

Merged

mvdan closed this as completed in #787 Jan 2, 2022
