Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

substr function gets error if START equals to string's length #8113

Closed
zerovip opened this issue Jan 3, 2021 · 14 comments · Fixed by #8115
Closed

substr function gets error if START equals to string's length #8113

zerovip opened this issue Jan 3, 2021 · 14 comments · Fixed by #8115
Assignees

Comments

@zerovip
Copy link

zerovip commented Jan 3, 2021

What version of Hugo are you using (hugo version)?

$ hugo version
Hugo Static Site Generator v0.80.0/extended linux/amd64 BuildDate: unknown

Does this issue reproduce with the latest release?

Not tested.

Issue

Hi, all.

After upgrading to version v0.80.0, there is an error:

start position out of bounds for x-byte string

I expected that {{ substr "foo" 3 }} gives me empty string, and this always works until v0.80.0.

I believe commit 64789fb caused this problem. It was

if start > len(asRunes) {
	return "", fmt.Errorf("start position out of bounds for %d-byte string", len(aStr))

before, and now:

if start > rlen-1 {
	return "", fmt.Errorf("start position out of bounds for %d-byte string", rlen)
}

Can removing "-1" fix this?

Thank you in advance, and happy new year!

@davidsneighbour
Copy link
Contributor

davidsneighbour commented Jan 3, 2021

That error is expected because counting starts at 0, not 1. So your start position is actually out of bounds. That behaviour changed in 0.80.0 and is now the same way substr works in other languages. That it worked before was a bug ;).

Is there a specific case where you would want an empty string to be returned? I would expect something like {{ substr "foo" 0 0 }} to do that trick.

I am not sure if that is the updated version yet: https://gohugo.io/functions/substr/ - there was a PR for the documentation to fix up the explanations for the "new" substr handling some days ago.

@davidsneighbour
Copy link
Contributor

@moorereason
Copy link
Contributor

@zerovip,
Sorry about the breaking change. The previous substr implementation was pretty buggy. We tried to ensure the new implementation matches PHP's substr. After some looking around, various languages do this differently (empty string vs error). I'm not dead set on keeping this error return, but I want a good argument for keeping the old behavior (other than it broke your site).

It's late here (power outage at work has me up at 3am), and I'm too tired to think clearly about this issue. So, in the meantime...

Can you explain your use case? How would you work around this issue?

@zerovip
Copy link
Author

zerovip commented Jan 3, 2021

Thank you for your reply.

Sorry but I don't think it is related to where counting starts. The only difference is the valid bounds shrunk.

Environment code result
Hugo 0.79.1 {{ substr "foo" 0 }} "foo"
{{ substr "foo" 1 }} "oo"
{{ substr "foo" 2 }} "o"
{{ substr "foo" 3 }} ""
Hugo 0.80.0 {{ substr "foo" 0 }} "foo"
{{ substr "foo" 1 }} "oo"
{{ substr "foo" 2 }} "o"
{{ substr "foo" 3 }} error: start position out of bounds for 3-byte string
Python "foo"[0:] "foo"
"foo"[1:] "oo"
"foo"[2:] "o"
"foo"[3:] ""

Sure I can explain my case:
I write

{{ $curr_page := . }}
{{ substr $curr_page.RelPermalink 4 }}

to get example.html from /en/example.html, then I combine it with

{{ range $.Site.Home.AllTranslations }}
    {{ .Permalink }}

to get, for example /zh/example.html. For the home page /en/, Hugo before 0.79.1 can successfully give me /zh/.

I am not familiar with Hugo, so I will be very happy and grateful if there is a better way to do so. Thank you in advance!

Here is my codes:

{{ $curr_page := . }}
{{ $curr_lang := $curr_page.Language.Lang }}
{{ $curr_trans_list := slice }}
{{ range $curr_page.Translations }}
    {{ $curr_trans_list = $curr_trans_list | append .Language.Lang }}
{{ end }}
{{ range $.Site.Home.AllTranslations }}
    {{ if (eq .Language.Lang $curr_lang) }}
        <li class='left_non-footer_option_active'>{{ .Language.LanguageName }} ⦿ </li>
    {{ else }}
        {{ if (in $curr_trans_list .Language.Lang) }}
            <li class='left_non-footer_option_non-active'>
                <a href="{{ .Permalink }}{{ substr $curr_page.RelPermalink 4 }}">{{ .Language.LanguageName }} ○ </a>
            </li>
        {{ else }}
            <li class='left_non-footer_option_non-active'>{{ .Language.LanguageName }} ⨂ </li>
        {{ end }}
   {{ end }}
{{ end }}

Oh, how would I work around… emmmm, I might write a if to treat for home page(?)


In my opinion, there is no harm to get an empty string rather than get an error. Besides, I really don't have any good argument for that. So what do other people think?


It is not in a hurry, so you can go to bed first. I am in China and it is six pm in the evening. We can listen to more others' opinions.

@jmooring
Copy link
Member

jmooring commented Jan 3, 2021

PHP doesn't complain either, and we refer to the PHP implementation in our documentation. Try this:
https://www.tehplayground.com/mdhhZ1yQKelHccMS

I encountered this breaking change a couple of days ago. I had some code that grabbed everything after the first character of a string, and I didn't provide a special case to check if the string was only a single character.

This worked in 0.79, though it is arguably lazy coding:

{{- $firstWord := "A" -}
{{- $firstCharacter := substr $firstWord 0 1 -}}
{{- $remainingCharacters := substr $firstWord 1 -}}

Changes required for 0.80:

{{- $firstWord := "A" -}
{{- $firstCharacter := substr $firstWord 0 1 -}}
{{- $remainingCharacters := "" -}}
{{- if gt (len $firstWord) 1 -}}
  {{- $remainingCharacters = substr $firstWord 1 -}}
{{- end -}}

@zerovip
Copy link
Author

zerovip commented Jan 3, 2021

PHP doesn't complain either

Yes! You are right!!! Thank you very much. I tried PHP and Javascript, they all don't raise errors! Thank you very much! That must be a strong argument!

@moorereason
Copy link
Contributor

PHP doesn't complain either, and we refer to the PHP implementation in our documentation. Try this:
https://www.tehplayground.com/mdhhZ1yQKelHccMS

Interesting. I assumed from the PHP docs that substr returning false in that scenario would raise an error somewhere, but apparently it doesn't. echo false prints an empty string in PHP (echo true prints 1). 🤦‍♂️

I think we can revert to the previous behavior by getting rid of the error and just returning an empty string. I'll work on it.

@moorereason moorereason added the Bug label Jan 3, 2021
@moorereason moorereason self-assigned this Jan 3, 2021
@jmooring
Copy link
Member

jmooring commented Jan 3, 2021

@moorereason, Thank you.

@zerovip
Copy link
Author

zerovip commented Jan 4, 2021

Thank you. Do we go further? I mean cancel the bound and always return empty string when START is greater than string's length. Because, PHP, javascript, and python all do like this.

moorereason added a commit to moorereason/hugo that referenced this issue Jan 4, 2021
Most other substr implementations don't error out in out-of-bounds cases
but simply return an empty string (or a value that's printed as an empty
string). We'll follow their lead and not exit template execution.  Allow
the user decide what to do with the empty result.

Fixes gohugoio#8113
@moorereason
Copy link
Contributor

PR #8115 submitted. PTAL

@davidsneighbour
Copy link
Contributor

I know it's splitting hairs, but per PHP documentation out of bounds substr returns false, not an empty string. That's probably because in PHP I would like to know WHAT went wrong instead of just going with what's coming back (nothing) from what I've done.

https://www.php.net/manual/en/function.substr.php#refsect1-function.substr-errors

@zerovip
Copy link
Author

zerovip commented Jan 4, 2021

I would like to know WHAT went wrong.

Just as @moorereason explained, echo false prints an empty string in PHP. See, https://www.php.net/manual/en/language.types.string.php#language.types.string.casting.


The PR LGTM. @moorereason , thank you for your work! And, @davidsneighbour , @jmooring , thank you for your participating in the discussion!

@davidsneighbour
Copy link
Contributor

If it's about that casting anything into a string will print an empty string if it's not already a stringable variable. A lot of these tickets here in the issue tracker are about "please don't throw an error if I do this" and half of the other tickets are "hugo is not throwing an error if I do this". Looks like Golang has an issue with typing :) As long as it's documented it's good.

@bep bep closed this as completed in #8115 Jan 8, 2021
bep pushed a commit that referenced this issue Jan 8, 2021
Most other substr implementations don't error out in out-of-bounds cases
but simply return an empty string (or a value that's printed as an empty
string). We'll follow their lead and not exit template execution.  Allow
the user decide what to do with the empty result.

Fixes #8113
gzagatti pushed a commit to gzagatti/hugo that referenced this issue Jan 11, 2021
Most other substr implementations don't error out in out-of-bounds cases
but simply return an empty string (or a value that's printed as an empty
string). We'll follow their lead and not exit template execution.  Allow
the user decide what to do with the empty result.

Fixes gohugoio#8113
gzagatti pushed a commit to gzagatti/hugo that referenced this issue Jan 11, 2021
Most other substr implementations don't error out in out-of-bounds cases
but simply return an empty string (or a value that's printed as an empty
string). We'll follow their lead and not exit template execution.  Allow
the user decide what to do with the empty result.

Fixes gohugoio#8113
gzagatti pushed a commit to gzagatti/hugo that referenced this issue Jan 11, 2021
Most other substr implementations don't error out in out-of-bounds cases
but simply return an empty string (or a value that's printed as an empty
string). We'll follow their lead and not exit template execution.  Allow
the user decide what to do with the empty result.

Fixes gohugoio#8113
@github-actions
Copy link

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants