Nested zsh style Parameter Expansion flags throws parser #88
Comments
Do you know what the name of this syntax is?
(note: this comment is unrelated to why the highlighting is failing)
Yes, this is called a parameter flag or parameter subscript. It is available in both bash and zsh. Not sure about sh. Reference: 15.2.3 Subscript Flags. I have been looking at it this morning and I see that the variable regexes don't provision for them. I think there is overlap between this and issue #74. I don't have a background in Textmate, but I have extensive experience with Perl regex. I will look at the Textmate side and offer a proposal.
This is a downstream consequence of missing the variable identification. The quote flag is still open. I have encountered a variety of confusing behaviors that I fiddled with iteratively to see if I could reverse engineer the cause. In this case, I think the other is a symptom of the variable identification.
In terms of fixing the problem: there's a pattern, related to variable assignment, for detecting an array assignment. I think the array pattern handles named and not-named arrays. For the named arrays, the string pretty much has to be matched with a one-line pattern (not a pattern range) due to parser limitations. One line patterns can't handle nested stuff, like nested strings inside of string interpolation. Instead a pattern range has to be used with a start quote and end quote. Slight hiccup tho: Textmate prioritizes long matches. So matching a one-line string (one big chunk), compared to matching just the starting quote, will cause the whole chunk to "win". That "win" scenario is just a warning, idk if that's even happening here. It might simply be that the pattern-range version of the string pattern isn't even included in the array-literal range.
Maybe you already realize this, but the best way to read this is as nested parameters. The inner part is effectively ${(z)pathOpts_str}, where z is a flag that splits the contents into words, consistent with the parsing algorithm of the zsh command line. The result then behaves like "${pathopts_quoted[@]}", which gets passed through Q@. The @ I think is just syntactically preferred by the community, but not necessary. I haven't figured out under what conditions it is. Either way, each element is dequoted iteratively. Two good references for this are:
"One line patterns can't handle nested stuff, like nested strings inside of string interpolation. Instead a pattern range has to be used with a start quote and end quote." I understand. Greedy vs non-greedy matching. This can actually be overcome by clever application of look-ahead/look-behind assertions. By the way, what limits the parser to Textmate? Is that a VSCode thing?
I think there's a bit of miscommunication I'll try and clear up. I don't think the parameter expansion or parameter flag or For the
Sort of. Yes, Textmate is greedy, but that's separate from regex greediness. I say "kinda" because yeah, we still use a lot of lookarounds to solve it on the Textmate side. Also, seeing as you're a regex expert, that will help a lot. In terms of regex hacking, there is one warning. In theory, with enough recursive regex, matching a nested string with a one-line Textmate pattern is possible. Sadly I spent a lot of effort on that once for a different language, only to realize that Textmate will only tag the last (most-inner) part of a recursive regex pattern. So even if the pattern is matched correctly, it's not tagged correctly. I've got an issue on VS Code Textmate about it, but I think you, me, and @redcmd are the only people in the world who might care about the issue. That said, even with broken scopes/tags, sometimes the fact that the pattern matches correctly is enough. It would be enough in this case, but the massive amount of effort it would take to make a recursive nested string pattern (a pattern that needs to contain the entire grammar thanks to subshell interpolation) would be insane for just fixing this one bug and like 2 other non-cascading bugs.
Yep. Other editors are limited to Textmate too. The alternative is the awesome Tree Sitter parser, which would never even run into this problem in the first place. Atom used it, NeoVim uses it. I use it for parsing tasks. Fun fact though: Bash is one of the few languages (I think Perl is another) that is impossible to statically parse perfectly. There can be runtime changes to the syntax thanks to, at minimum, aliases. So even Tree Sitter can't always parse bash. Gotta run it to parse it (sometimes).
Yep, cool. Wouldn't be surprised if that becomes #90 on this grammar.
It took me forever to find this reference. I expected it to be in the redirection section of the manual, but wouldn't you know it was in the Bash documentation on Command Substitution. What's funny about that is that I searched the document for "(<", if that didn't give it away.
Actually, before I run, what is the order of operations when it comes to pattern recognition? Does the engine do sweeps based on the nodes within repository (I am referring to the JSON in autogenerated/shell.tmLanguage.json), does it use some compound search pattern, or some other algorithm altogether? I want to make sure that we have regex patterns that don't conflict. Now I will talk to you later.
Ok. I have some really good solutions that need to be rigorously tested. We should also discuss what is and is not achievable with Textmate. I did find this good Textmate reference, which had a link to this one, which suggests to me the Textmate grammar should be as rich as Perl's, but who knows. We should come up with some more rigorous patterns, but these worked with all of my scripts, which are pretty aggressive:
\b(\w+)(?:=) # Variables anything with assignments
(?:")([^"/]+)(?:") # Any quoted non-path
(?:\$\{)(\w+)(?:\}) # Any simply $ marked variable
(?:\$\{)((?:#)|\w)+(?:\}) # Any simply marked variable including with counting
(?:\$\{)([^ ]+)(?:\}) # Full Variable pattern Excluded everything but the space
(?:\$\{)(?:\()(@|\w+)(?:\))(\w+)(?:\}) # Flagged Variable
# Stack overflow to the Rescue [Regular Expressions to Match Balanced Parenthesis](https://stackoverflow.com/posts/35271017/revisions)
\((?:[^)(]|\((?:[^)(]|\((?:[^)(]|\([^)(]*\))*\))*\))*\) # Nested Parenthesis *Wow!!!!
Shell_variable_identification_regression.zsh
Shell Variable Identification Regression (Note: this code isn't working fully. I was developing when I got sidetracked.)
#!/bin/zsh
# _pathOpts startup is in $BIN for me $SOME_ROOT/bin
pathOpts_path() {
    local fname="$BIN/._pathOpts.sh"
    [[ -e $fname ]] && echo ${fname}
}
<<!
# Usage: assign "$(sourceof $1 to var)"
alias assign=eval
sourceof() {
local_varname='${(P)3}' source_varname="${(P)1}"
echo "local ${local_varname}=${source_varname}"
}
!
deserialize_pathOpts_from_file() {
    typeset -grA _pathOpts
    _pathOpts=( "${(Q@)${(z@)"$(<pathOpts_path)"}" )
}
serialize_pathOpts_to_file() {
    "${(j: :)${(qkv@)_pathOpts}}" #"
}
# Function to initialize _pathOpts
pathOpts() {
    local reinitialize=$1
    [[ -v ${_pathOpts} && ${(t)_pathOpts} == assoc-array ]] && return 0
    deserialize_pathOpts
    # Add your initialization logic here
    # echo "Initializing _pathOpts..."
    local _pathOpts_arr=(
        '.' '$(pwd)'
        'some' '$HOME/General-Atomics'
        'apath' '$SOME_ROOT/apath'
        'core_repos' '$SOME_ROOT/apath/core_repos'
        'reference_repos' '$SOME_ROOT/apath/reference_repos'
        'subject' '$SOME_ROOT/apath/subject_models'
        'projects' '$SOME_ROOT/apath/projects'
        'support' '$SOME_ROOT/apath/support'
        'thesis' '$SOME_ROOT/apath/thesis'
        'somecode' '$SOME_ROOT/apath/subject_models/somecode'
        'base-envs' '$SOME_ROOT/apath/conda_environments'
        'be-miniforge' '$SOME_ROOT/apath/conda_environments/Miniforge3-MacOSX-arm64'
        'be-mf-core' '$SOME_ROOT/apath/conda_environments/mambaforge-core'
        'be-core-build' '$SOME_ROOT/apath/conda_environments/core-build'
        'my-docs' '$SOME_ROOT/Documentation'
        'some-docs' '$SOME_ROOT/SOME-Documentation-Repo'
        'CORE-projects-logs' '$SOME_ROOT/CORE-projects-logs'
        'core-users' '$SOME_ROOT/core-users'
        'zsh_funcs' '$HOME/zsh_funcs'
        'completions' '$HOME/zsh_funcs/completions'
        '.module' '$HOME/.modules'
    )
    get_KOBE _pathOpts_arr
    typeset -agr _pathOpts=( "${j:[:SPACE:]{4}:_pathOpts_arr[@]}" )
    save
}
get_KOBE() { # Key-ordered by entry
    assign "$(source $1 to anArray)"
    start=1 end=${#anArray} stride=2
    typeset -a _pathOpts_KOBE[$end/$stride] # Pre-initializing
    for i in {${start}..${end}..${stride}}; do
        _pathOpts_KOBE[i]=${(qq)anArray[i]}
    done
    typeset -agr _pathOpts_KOBE # Changing KOBE's status
}
# Save _pathOpts to disk
save() {
    pprint _pathOpts _pathOpts_KOBE > $(pathOpts_path) &2> /dev/null
}
# Manage _pathOpts based on flags
edit_PathOpts() {
    local flag key path_value
    flag="$1"; key="$2"; path_value="$3";
    while [[ $# -gt 0 ]]; do
        case $flag in
            "--add")
                _pathOpts[$key]=$path_value; shift 3
                echo "Added path option: $key -> $path_value";;
            "--remove")
                unset _pathOpts[$key]; shift 2
                echo "Removed path option: $key";;
            "--reset")
                mv "$_pathOptsPath" "$_pathOptsPath_$(gdate +%Y%m%d_%H%M%S)"
                unset _pathOpts
                initialize_PathOpts; shift 1
                echo "Reset _pathOpts to default";;
            *)
                echo "Unknown flag: $flag"
                return 1 ;;
        esac
    done
    save
}
# Main function
main() {
    initialize_PathOpts
    local debug=false
    [[ $1 == "--debug" ]] && shift && debug=true
    [[ $1 == "--_pathOpts" ]] && shift && edit_PathOpts "${(P)1[@]}" && return
    [[ $debug == true ]] && set -o xtrace
    find_folder "$@"
    [[ $debug == true ]] && set +o xtrace
}
main "$@"
Assignment Identification:
Quote Identification non-Path
$-wrapped Variable identification
Full Variable pattern - Excludes only spaces
Flagged Variable
Mother of all regex's - Nested Parenthesis - Thank you stack overflow
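A quick way to sanity-check two of the patterns from this comment before porting them to the grammar. This is only a sketch in Python, assuming Python's `re` treats these particular patterns the same way the grammar's regex engine would. The helper names (`flagged_variable`, `is_balanced`) and the sample strings are illustrative and not part of the script.

```python
import re

# Patterns copied from the list in this comment.
FLAGGED_VARIABLE = re.compile(r'(?:\$\{)(?:\()(@|\w+)(?:\))(\w+)(?:\})')
NESTED_PARENS = re.compile(
    r'\((?:[^)(]|\((?:[^)(]|\((?:[^)(]|\([^)(]*\))*\))*\))*\)'
)


def flagged_variable(text):
    """Return (flag, name) for the first flagged variable in text, or None."""
    m = FLAGGED_VARIABLE.search(text)
    return m.groups() if m else None


def is_balanced(text):
    """True when the whole string is one parenthesized group the pattern accepts."""
    return NESTED_PARENS.fullmatch(text) is not None


if __name__ == "__main__":
    # Hypothetical sample inputs, chosen to mirror the forms discussed in the thread.
    print(flagged_variable('${(z)pathOpts_str}'))
    print(flagged_variable('${(Q@)${(z@)pathOpts_str}}'))
    print(is_balanced('(a(b(c(d))))'))
    print(is_balanced('(a(b(c(d(e)))))'))
```

With these inputs the flagged-variable pattern accepts a single flag such as `(z)` but finds nothing in the nested `${(Q@)${(z@)...}}` form, because `(@|\w+)` does not allow `Q@` together and the name group only allows `\w+`. The parenthesis pattern is unrolled to four levels, so the five-level string is rejected.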
I wouldn't say that @jeff-hykin
so in both cases of
"patterns": [
{ "match": "'" },
{ "match": "'.+'" },
]
the single if you attempt to also capture the whitespace before it
EDIT:
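On the question of which pattern "wins": a minimal sketch, assuming the selection rule is that the match starting earliest in the line wins and ties go to the pattern listed first, with match length playing no role. That rule is my understanding of how vscode-textmate behaves and is not confirmed in this thread. `pick_pattern` is a hypothetical helper, and Python's `re` stands in for the real regex engine.

```python
import re


def pick_pattern(patterns, line, pos=0):
    """Return (index, match) for the pattern that wins at `pos`, or None.

    Assumed rule: earliest start position wins; on a tie, the pattern
    that appears first in the list is kept. Length is never compared.
    """
    best = None
    for index, source in enumerate(patterns):
        m = re.compile(source).search(line, pos)
        if m is None:
            continue
        # Only a strictly earlier start replaces the current winner.
        if best is None or m.start() < best[1].start():
            best = (index, m)
    return best


if __name__ == "__main__":
    line = "x = 'abc'"
    for patterns in (["'", "'.+'"], ["'.+'", "'"], ["'", r"\s'.+'"]):
        index, match = pick_pattern(patterns, line)
        print(patterns, "->", index, repr(match.group()))
```

Under that assumption, both patterns from the snippet start at the same quote, so whichever is listed first is chosen. A pattern that also captures the whitespace before the quote starts one character earlier and is chosen regardless of its position in the list.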
The code with a problem is:
With all extensions disabled, the resulting code looks like:
To get the parser to highlight correctly I have to append the following:
Interestingly, if I put a single quote on the deserialize _pathOpts assignment, I get a syntax error indication, which is not correct unless I am missing something about shell grammar. It's just clear that either this is not an official pattern or I should be doing something different somewhere.