proposal: strings: add (*Reader).String() #36290

pierrec · 2019-12-27T11:01:21Z

Although it is possible to retrieve the string that a Reader contains, it would be nice to avoid this and have direct access to the string that a Reader contains.

Therefore I propose the following:

// String always returns the string the Reader was initialized or reset with,
// irrelevant of any other method calls.
func (r *Reader) String() string {
  return r.s
}

So the code required to get access to a Reader's content becomes:

Before:

  r := strings.NewReader("hello")
  var buf bytes.Buffer
  r.Seek(0, io.SeekStart) // make sure we get the whole string)
  io.Copy(&buf, r)
  s := buf.String() // hello

After:

  r := strings.NewReader("hello")
  s := r.String() // hello

The text was updated successfully, but these errors were encountered:

smasher164 · 2019-12-27T20:34:09Z

I find it counterintuitive that calling String returns the underlying string instead of the string starting at index i, based on how many bytes one has read from the Reader.

On the other hand, the string can be invalid if it begins at any arbitrary index, so if returning a valid string must be guaranteed, the closest behavior I can think of is

func (r *Reader) String() string {
    return ToValidUTF8(r.s[r.i:], "")
}

pierrec · 2019-12-28T10:41:38Z

@smasher164 I don't think there is the need to worry about returning an invalid string, as there is precedence in (*bytes.Buffer).String().

To be consistent with (*bytes.Buffer).String() however, (*strings.Reader).String() could return the unread portion of the string. To get the whole string, use Seek().

RodionGork · 2019-12-28T10:59:50Z

@pierrec Sorry, which Reader is it about? (from io?) should it always have an underlying string? This doesn't look obvious (and even correct) or I deeply misunderstood the suggestion.

String() could return the unread portion of the string

Should it advance pointer then? So that second call to String() won't return anything (as everything is read)?

deanveloper · 2019-12-28T17:31:40Z

@RodionGork Well it can't be io.Reader since that is an interface, and methods cannot be defined on interfaces. I believe that this proposal is to add it for *strings.Reader since the examples given use strings.NewReader.

I agree with @smasher164 in that it should return the underlying string, however not all strings are utf8 strings (blog), so I disagree that it should be wrapped in ToValidUTF8. It should be the caller's responsibility to call strings.ToValidUTF8 if they want a utf8 string, not the callee's.

RodionGork · 2019-12-28T19:04:38Z

@deanveloper thanks, I believe you are right (just had a momentary suspicion it is about extending interface and implementing method everywhere).

But then, if the Reader should resemble bytes.Buffer more, then besides String it makes sense to copy few more methods, e.g. ReadString and Next - isn't it so?

so I disagree that it should be wrapped in ToValidUTF8

yep, it would be confusing to receive empty string when reader is not at end (or error which is not part of behavior in bytes.Buffer)

pierrec · 2019-12-28T19:24:36Z

@RodionGork this proposal is not about making it resemble bytes.Buffer. It is about being able to easily recover the string that the reader is based on. I have been running into this a few times and think it would be a straight forward addition.

smasher164 · 2019-12-28T19:37:15Z

I agree with @smasher164 in that it should return the underlying string, however not all strings are utf8 strings

I see your point, but the package header specifically states, "package strings implements simple functions to manipulate UTF-8 encoded strings."

That being said, since strings.Builder allows one to construct a malformed string using the package, I don't see why strings.Reader shouldn't be allowed to, so I'm open to simply returning r.s[r.i:].

@pierrec I am still trying to understand the use case of this proposal. Whenever I use strings.Reader, I am trying to pass some input into a function that expects an io.Reader. I tend to always have access to the original string. Is the intended use case where one has to type-assert on an io.Reader to get a strings.Reader and get access to its underlying string?

deanveloper · 2019-12-28T20:17:58Z

I see your point, but the package header [snip]

That's true, I can also see it going either way. Just seems weird to me that strings.NewReader(str).String() wouldn't necessarily return str in the case where it uses ToValidUTF8

rsc · 2020-01-08T18:44:07Z

strings.Reader and bytes.Reader were meant to be the same API, just with different input types in the constructor. They're all about the streaming calls, not the bulk data of, say, bytes.Buffer. There's also ambiguity about which string should be returned - the remaining input or the original. So adding the API might confuse half the users anyway. And for bytes.Reader it would have to make a copy.

As @smasher164 says, usually you use strings.Reader because you already have the string and want to pass it to a function as an io.Reader (or one of the other interfaces).

What is the use case where you are passed a *strings.Reader and want to get the string back? How does that come up?

pierrec · 2020-01-08T19:16:03Z

I had to convert a type which was initially an io.ReadCloser to an io.ReadWriteCloser. That type was implemented using, you guessed it, a strings.Reader, and the hack I mention in the proposal, which seemed bad.
Coming back to it, I have cleaned up the type and split it into 2 types so that the need for this proposal becomes obsolete.
So... as and @rsc mentions, as it stands, the introduction of a String method would be confusing.
Thanks for your attention. Closing.

ianlancetaylor · 2020-01-08T22:26:18Z

For the record, you can retrieve the remainder of the string in a strings.Reader like this. This uses panic so that the reader is left unchanged.

type returnedString string

type fetchString struct{}

func (fetchString) WriteString(s string) (int, error) {
	panic(returnedString(s))
}

func (fetchString) Write([]byte) (int, error) {
	panic("can't happen")
}

func FetchString(sr *strings.Reader) (ret string) {
	defer func() {
		if s, ok := recover().(returnedString); ok {
			ret = string(s)
		} else {
			panic(recover())
		}
	}()
	sr.WriteTo(fetchString{})
	panic("can't happen")
}

pierrec · 2020-01-09T08:19:02Z

@ianlancetaylor yikes!
Shouldn't it just return at the end of FetchString instead of panicing if there is nothing left to read from the strings.Reader?

deanveloper · 2020-01-09T10:03:06Z

The , ok represents if the recovered panic was of type returnedString. If the recovered panic isn't a returnedString, then the panic wasnt from the WriteString function, meaning that it should continue unwrapping the stack.

pierrec · 2020-01-09T10:16:15Z

Yep. I meant this:

type returnedString string

type fetchString struct{}

func (fetchString) WriteString(s string) (int, error) {
	panic(returnedString(s))
}

func (fetchString) Write([]byte) (int, error) {
	panic("can't happen")
}

func FetchString(sr *strings.Reader) (ret string) {
	defer func() {
		if s, ok := recover().(returnedString); ok {
			ret = string(s)
		} else {
			panic(recover())
		}
	}()
	sr.WriteTo(fetchString{})
	return
}

as looking at the strings.Reader.WriteTo method code, it returns before calling WriteString if there is nothing to read.

deanveloper · 2020-01-09T10:24:09Z

Ahh I see that now - my bad. Misinterpreted your initial comment and I thought you meant inside the defer for some reason.

ianlancetaylor · 2020-01-09T16:59:20Z

@pierrec Yes, I think you're right. Oh well.

gopherbot added this to the Proposal milestone Dec 27, 2019

gopherbot added the Proposal label Dec 27, 2019

pierrec closed this as completed Jan 8, 2020

rsc mentioned this issue Jan 8, 2020

proposal: review meeting minutes #33502

Open

golang locked and limited conversation to collaborators Jan 8, 2021

gopherbot added the FrozenDueToAge label Jan 8, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

proposal: strings: add (*Reader).String() #36290

proposal: strings: add (*Reader).String() #36290

pierrec commented Dec 27, 2019

smasher164 commented Dec 27, 2019

pierrec commented Dec 28, 2019

RodionGork commented Dec 28, 2019 •

edited

Loading

deanveloper commented Dec 28, 2019 •

edited

Loading

RodionGork commented Dec 28, 2019

pierrec commented Dec 28, 2019

smasher164 commented Dec 28, 2019

deanveloper commented Dec 28, 2019 •

edited

Loading

rsc commented Jan 8, 2020

pierrec commented Jan 8, 2020

ianlancetaylor commented Jan 8, 2020

pierrec commented Jan 9, 2020

deanveloper commented Jan 9, 2020 •

edited

Loading

pierrec commented Jan 9, 2020

deanveloper commented Jan 9, 2020

ianlancetaylor commented Jan 9, 2020

proposal: strings: add (*Reader).String() #36290

proposal: strings: add (*Reader).String() #36290

Comments

pierrec commented Dec 27, 2019

smasher164 commented Dec 27, 2019

pierrec commented Dec 28, 2019

RodionGork commented Dec 28, 2019 • edited Loading

deanveloper commented Dec 28, 2019 • edited Loading

RodionGork commented Dec 28, 2019

pierrec commented Dec 28, 2019

smasher164 commented Dec 28, 2019

deanveloper commented Dec 28, 2019 • edited Loading

rsc commented Jan 8, 2020

pierrec commented Jan 8, 2020

ianlancetaylor commented Jan 8, 2020

pierrec commented Jan 9, 2020

deanveloper commented Jan 9, 2020 • edited Loading

pierrec commented Jan 9, 2020

deanveloper commented Jan 9, 2020

ianlancetaylor commented Jan 9, 2020

RodionGork commented Dec 28, 2019 •

edited

Loading

deanveloper commented Dec 28, 2019 •

edited

Loading

deanveloper commented Dec 28, 2019 •

edited

Loading

deanveloper commented Jan 9, 2020 •

edited

Loading