encoding/json: allow per-Encoder/per-Decoder registration of marshal/unmarshal functions #5901
I did some work on this sometime back in October, with CL https://go-review.googlesource.com/c/31091 to get the conversation started. In there I introduced Encoder.RegisterEncoder:
func (enc *Encoder) RegisterEncoder(t reflect.Type, fn func(interface{}) ([]byte, error))
and Decoder.RegisterDecoder:
func (dec *Decoder) RegisterDecoder(t reflect.Type, fn func([]byte) (interface{}, error)) |
We have a similar requirement for custom serialization. Our specific use cases are:
We are using CiscoM31@1e9514f. In our case, the client interface is implemented by doing a simple lookup in a map, so there is no need to register hundreds of custom marshallers (we have lots of structs). |
@rsc is there still interest in this feature on your end? I have a use-case for it and would be happy to implement it.
[Edit 1] Spelling
I wonder if the function signature should be
func (enc *Encoder) RegisterMarshaller(t reflect.Type, f func(reflect.Value) ([]byte, error))
with all standard Encoders exposed so that users can leverage them. Example redactor:
package main
import (
	"bytes"
	"encoding/json"
	"os"
	"reflect"
	"strings"
	"unicode/utf8"
)

func main() {
	enc := json.NewEncoder(os.Stdout)
	text := "My password, foo, is totally secure"
	enc.RegisterMarshaller(reflect.TypeOf(""), StringMarshaller)
	enc.Encode(text)
	// Output
	// "My password, foo, is totally secure"
	enc.RegisterMarshaller(reflect.TypeOf(""), func(value reflect.Value) ([]byte, error) {
		return StringMarshaller(reflect.ValueOf(strings.Replace(value.String(), "foo", "[REDACTED]", -1)))
	})
	enc.Encode(text)
	// Output
	// "My password, [REDACTED], is totally secure"
}

// Largely taken from `func (e *encodeState) string(s string, escapeHTML bool)` in `encoding/json/encode.go`
// This would exist in encoding/json.
func StringMarshaller(value reflect.Value) ([]byte, error) {
	e := bytes.Buffer{}
	s := value.String()
	escapeHTML := false // TODO: Refactor StringEncoder into a 'htmlEscaping' one and a non 'htmlEscaping' one.
	e.WriteByte('"')
	start := 0
	for i := 0; i < len(s); {
		if b := s[i]; b < utf8.RuneSelf {
			if json.HTMLSafeSet[b] || (!escapeHTML && json.SafeSet[b]) {
				i++
				continue
			}
			if start < i {
				e.WriteString(s[start:i])
			}
			e.WriteByte('\\')
			switch b {
			case '\\', '"':
				e.WriteByte(b)
			case '\n':
				e.WriteByte('n')
			case '\r':
				e.WriteByte('r')
			case '\t':
				e.WriteByte('t')
			default:
				// This encodes bytes < 0x20 except for \t, \n and \r.
				// If escapeHTML is set, it also escapes <, >, and &
				// because they can lead to security holes when
				// user-controlled strings are rendered into JSON
				// and served to some browsers.
				e.WriteString(`u00`)
				e.WriteByte(json.Hex[b>>4])
				e.WriteByte(json.Hex[b&0xF])
			}
			i++
			start = i
			continue
		}
		c, size := utf8.DecodeRuneInString(s[i:])
		if c == utf8.RuneError && size == 1 {
			if start < i {
				e.WriteString(s[start:i])
			}
			e.WriteString(`\ufffd`)
			i += size
			start = i
			continue
		}
		// U+2028 is LINE SEPARATOR.
		// U+2029 is PARAGRAPH SEPARATOR.
		// They are both technically valid characters in JSON strings,
		// but don't work in JSONP, which has to be evaluated as JavaScript,
		// and can lead to security holes there. It is valid JSON to
		// escape them, so we do so unconditionally.
		// See http://timelessrepo.com/json-isnt-a-javascript-subset for discussion.
		if c == '\u2028' || c == '\u2029' {
			if start < i {
				e.WriteString(s[start:i])
			}
			e.WriteString(`\u202`)
			e.WriteByte(json.Hex[c&0xF])
			i += size
			start = i
			continue
		}
		i += size
	}
	if start < len(s) {
		e.WriteString(s[start:])
	}
	e.WriteByte('"')
	return e.Bytes(), nil
} |
I have a use-case for this as well, but it's a bit specialized. Essentially:
For prior art, the Even though I'd like to have something like this, I do have some concerns:
|
Thanks for the response! Some comments below:
I'd have to run a benchmark but I would expect a map lookup to be fairly quick. Perhaps others are concerned with a different scale of "slow" than I.
Interesting. I hadn't considered supporting interfaces. At the moment I only need concrete type overrides but if there are users out there that would benefit from an interface check I'd be willing to at least prototype it. [Edit] Perhaps we could rename
This actually came up in a discussion with a colleague. My preference would be to require users who want to custom-marshal both T and *T to declare both marshaler overrides. You could use a single custom marshaler, but you would still need to register both overrides.
Alternately perhaps we modify the signature of
where, if the |
@dsnet I imagine you might be busy with the holidays but I wanted to give you a friendly ping on this. Any thoughts on my response? |
Change https://golang.org/cl/212998 mentions this issue: |
While the logic certainly uses
// RegisterFunc registers a custom encoder to use for specialized types.
// The input f must be a function of the type func(T) ([]byte, error).
//
// When marshaling a value of type R, the function f is called
// if R is identical to T for concrete types or
// if R implements T for interface types.
// Precedence is given to registered encoders that operate on concrete types,
// then registered encoders that operate on interface types
// in the order that they are registered, then the MarshalJSON method, and
// lastly the default behavior of Encode.
//
// It panics if T is already registered or if interface{} is assignable to T.
func (e *Encoder) RegisterFunc(f interface{})
// RegisterFunc registers a custom decoder to use for specialized types.
// The input f must be a function of the type func([]byte, T) error.
//
// When unmarshaling a value of type R, the function f is called
// if R is identical to T for concrete types or
// if R implements T for interface types.
// Precedence is given to registered decoders that operate on concrete types,
// then registered decoders that operate on interface types
// in the order that they are registered, then the UnmarshalJSON method, and
// lastly the default behavior of Decode.
//
// It panics if T is already registered or if interface{} is assignable to T.
func (d *Decoder) RegisterFunc(f interface{})
Arguments for this API:
The proposed API, combined with the ability to handle interfaces, allows you to do something like:
e := json.NewEncoder(w)
e.RegisterFunc(protojson.Marshal)
e.Encode(v)
where Since this is something I need for my other work, I uploaded a CL with my prototype implementation that I've been testing with actual code. Some additional thoughts:
|
Adding |
This issue is currently labeled as early-in-cycle for Go 1.17. |
@mvdan and I are doing a systematic review of the entire |
I'm very late to the party, but I believe generics would probably be fine here, might reduce the amount of reflection needed, and like you said would enable greater type safety for the proposed API. As for the type parameterization, I thought up something like this.
// Encoder
func MarshalIP(ip *net.IP) ([]byte, error) { ... }
func Register[T any](f func(T) ([]byte, error)) { ... }
// Decoder
func UnmarshalIP(data []byte) (*net.IP, error) { ... }
func Register[T any](f func([]byte) (T, error)) { ... }
// or
func UnmarshalIP(data []byte, ip *net.IP) error { ... }
func Register[T any](f func([]byte, T) error) { ... }
You'd obviously still need reflection to get the typeof out of |
How's this going? If this doesn't conflict with the overall package direction, it'd be nice to try to get this in for 1.18. |
I apologize, I got side-tracked from |
In implementing this for v2, I think we may want to defer this feature until the release of generics. I propose the following alternative API:
// Marshalers is a list of functions that each marshal a specific type.
// A nil Marshalers is equivalent to an empty list.
// For performance, marshalers should be stored in a global variable.
type Marshalers struct { ... }
// NewMarshalers constructs a list of functions that marshal a specific type.
// Functions that come earlier in the list take precedence.
func NewMarshalers(...*Marshalers) *Marshalers
// MarshalFunc constructs a marshaler that marshals values of type T.
func MarshalFunc[T any](fn func(T) ([]byte, error)) *Marshalers
// Unmarshalers is a list of functions that each unmarshal a specific type.
// A nil Unmarshalers is equivalent to an empty list.
type Unmarshalers struct { ... }
// NewUnmarshalers constructs a list of functions that unmarshal a specific type.
// Functions that come earlier in the list take precedence.
// For performance, unmarshalers should be stored in a global variable.
func NewUnmarshalers(...*Unmarshalers) *Unmarshalers
// UnmarshalFunc constructs an unmarshaler that unmarshals values of type T.
func UnmarshalFunc[T any](fn func([]byte, T) error) *Unmarshalers
// WithMarshalers configures the encoder to use any marshaler in m
// that operates on the current type that the encoder is marshaling.
func (*Encoder) WithMarshalers(m *Marshalers)
// WithUnmarshalers configures the decoder to use any unmarshaler in u
// that operates on the current type that the decoder is unmarshaling.
func (*Decoder) WithUnmarshalers(u *Unmarshalers)
There are several advantages of this API:
Example usage:
var protoMarshalers = json.MarshalFunc(protojson.Marshal)
enc := json.NewEncoder(...)
enc.WithMarshalers(protoMarshalers)
... = enc.Encode(...)
Footnotes:
\cc @mvdan |
Generics are arriving in 1.18. Should we add this (with generics) in 1.18, or delay a release? |
I keep forgetting that generics are coming imminently in 1.18. Assuming the newly proposed API is acceptable, I support a release for 1.18. In the event that generics are delayed, we could go with the following signatures:
func MarshalFunc(fn interface{}) *Marshalers
func UnmarshalFunc(fn interface{}) *Unmarshalers
It must rely on Go reflection for type safety at runtime. We could add a corresponding |
Using a |
There's been work on what a theoretical v2 |
Hi! Golang version 1.18 with generics is already out. Are there any updates now? |
Hi all, we kicked off a discussion for a possible "encoding/json/v2" package that addresses the spirit of this proposal. See the "Caller-specified customization" section of the discussion. |