feat: Add support for inline arrays of nillable types #759

AndrewSisley · 2022-08-24T21:15:53Z

Relevant issue(s)

Resolves #425

Description

Adds support for inline arrays of nillable values. Also adds a very minimal Option struct to hep facilitate this - this struct can be fleshed out further as and when required.

Specify the platform(s) on which this was tested:

Debian Linux

codecov · 2022-08-24T21:16:01Z

Codecov Report

Merging #759 (eed8a32) into develop (3727613) will increase coverage by 0.09%.
The diff coverage is 72.90%.

@@             Coverage Diff             @@
##           develop     #759      +/-   ##
===========================================
+ Coverage    58.25%   58.35%   +0.09%     
===========================================
  Files          146      147       +1     
  Lines        16652    16829     +177     
===========================================
+ Hits          9701     9820     +119     
- Misses        6043     6084      +41     
- Partials       908      925      +17

Impacted Files	Coverage Δ
client/descriptions.go	`83.33% <ø> (ø)`
db/fetcher/encoded_doc.go	`54.03% <53.03%> (-1.11%)`	⬇️
db/collection_update.go	`42.99% <60.65%> (+2.27%)`	⬆️
client/option.go	`100.00% <100.00%> (ø)`
connor/eq.go	`87.50% <100.00%> (+6.94%)`	⬆️
query/graphql/mapper/mapper.go	`86.74% <100.00%> (-1.12%)`	⬇️
query/graphql/planner/count.go	`92.13% <100.00%> (+0.77%)`	⬆️
query/graphql/planner/sum.go	`82.41% <100.00%> (+1.46%)`	⬆️
query/graphql/schema/descriptions.go	`87.57% <100.00%> (+0.73%)`	⬆️
query/graphql/schema/generate.go	`83.96% <100.00%> (+0.03%)`	⬆️
... and 2 more

shahzadlone · 2022-08-24T23:37:38Z

client/option.go

+package client
+
+// Option represents an item that may or may not have a value.
+type Option[T any] struct {


nitpick: Rename to Optional? Option makes me think 'like a config option'.

Option is a noun and Optional an adjective, Option also matches the rust name (and the name of type in the lib John linked)

Interesting to see rust uses Option. I might have been bias because I have seen C++ std library and typescript using Optional.

https://github.com/bromne/typescript-optional
https://en.cppreference.com/w/cpp/utility/optional

shahzadlone · 2022-08-24T23:39:11Z

client/option.go

+type Option[T any] struct {
+	// If HasValue is true, this Option contains a value, if
+	// it is false it contains no value.
+	HasValue bool
+
+	// The Value of this Option. Should be ignored if HasValue is false.
+	Value T
+}
+
+// Some returns an `Option` of type `T` with the given value.
+func Some[T any](value T) Option[T] {
+	return Option[T]{
+		HasValue: true,
+		Value:    value,
+	}
+}
+
+// Some returns an `Option` of type `T` with no value.
+func None[T any]() Option[T] {
+	return Option[T]{}
+}


praise: been looking forward for these magical types!

Cheers, is really nice having access generics again :)

shahzadlone · 2022-08-24T23:46:10Z

db/collection_update.go

@@ -548,6 +579,51 @@ func validateFieldSchema(val interface{}, field client.FieldDescription) (interf
 	return cval, err
 }

+func covertNillableArrayToCval[T any](val any) ([]*T, error) {


question: What does Cval stand for? Also covert as in hide / cover? or did you mean convert?

I can't remember what Cval stands for sorry. covert is a typo :) thanks for spotting, I'll correct it (and check the other places I may have copy-pasted it)

convert typo

shahzadlone · 2022-08-24T23:50:58Z

db/collection_update.go

+	if val == nil {
+		return nil, nil
+	}
+	untypedCollection := val.([]interface{})


question: Why not check the cast performed properly? I remember trying this cast from (any) -> .[]any a few times and the ok value (i.e. _, ok :=...) kept being false.

If it reaches this line and is not of that type we probably have far greater problems (like a corrupt database or something). an ok is a waste of eye space, dead code, and misleading in that it suggests such a thing is possible.

It would still avoid a panic and we can report a proper error no?

I am not particularly fussed about panics that can only happen if the database is already corrupted

I had an instance where the underlying type was []float but it in a switch case with multiple options ([]float, []string etc. all in one case) made it be any type which when casted to []any didn't make it behave like I wanted to. However assuming that was an odd case and you know what your are doing here such that it won't panic.

[]float doesnt cast to []any - arrays/slices are of a particular type and it would be unsafe to allow casts between them, particularly pointers (any == pointer). Think about what is going on size/memory wise if you do so - an array is (mostly) a continuous set in memory, there is usually a pointer to the first element and a count, if the size of each element could change (e.g. when casting from []bool to []any) then you'd end up with real problems :)

Here the underlying type should always be []any, any will be a pointer to something concrete, but it will always be an array of pointers.

quick nitpick: "any == pointer" isn't exactly correct :) - Its implemented as a struct that contains a pointer (technically 2 pointers, one of which is an unsafe pointer)

lol, seriously? 😅 What is the second pointer to, the actual type?

shahzadlone · 2022-08-26T14:02:55Z

db/collection_update.go

+func covertNillableArrayToCval[T any](val any) ([]*T, error) {
+	if val == nil {
+		return nil, nil
+	}
+	untypedCollection := val.([]interface{})
+	// Cbor deals with pointers better than structs by default, however in the future
+	// we may want to write a custom encoder for the Option[T] type
+	resultArray := make([]*T, len(untypedCollection))
+	for i, value := range untypedCollection {
+		if value == nil {
+			resultArray[i] = nil
+			continue
+		}
+		tValue, ok := value.(T)
+		if !ok {
+			return nil, fmt.Errorf("Failed to cast value: %v of type: %T to %T", value, value, *new(T))
+		}
+		resultArray[i] = &tValue
+	}
+	return resultArray, nil
+}


question: Also curious why this function needed to use a generics, could this just not be the following, and after that it can be be used like this cval, err = covertNillableArrayToCval(val) :

Suggested change

func covertNillableArrayToCval[T any](val any) ([]*T, error) {

if val == nil {

return nil, nil

}

untypedCollection := val.([]interface{})

// Cbor deals with pointers better than structs by default, however in the future

// we may want to write a custom encoder for the Option[T] type

resultArray := make([]*T, len(untypedCollection))

for i, value := range untypedCollection {

if value == nil {

resultArray[i] = nil

continue

}

tValue, ok := value.(T)

if !ok {

return nil, fmt.Errorf("Failed to cast value: %v of type: %T to %T", value, value, *new(T))

}

resultArray[i] = &tValue

}

return resultArray, nil

}

func covertNillableArrayToCval(val any) ([]*any, error) {

if val == nil {

return nil, nil

}

untypedCollection := val.([]any)

// Cbor deals with pointers better than structs by default, however in the future

// we may want to write a custom encoder for the Option[T] type

resultArray := make([]*any, len(untypedCollection))

for i, value := range untypedCollection {

if value == nil {

resultArray[i] = nil

continue

}

tValue := value

resultArray[i] = &tValue

}

return resultArray, nil

}

I strongly dislike losing known type information when there is little reason to do so. The generic version should also preserve it's efficiency if/when introducing the custom encoder, whereas the any version would force pointers over the raw structs

shahzadlone · 2022-08-26T14:08:32Z

db/collection_update.go

+
+	case client.FieldKind_NILLABLE_STRING_ARRAY:
+		cval, err = covertNillableArrayToCval[string](val)
+		if err != nil {
+			return nil, err
+		}
+		ok = true
+


thought: If you do update this function to just take any parameter and not be a generic, all these cases can just squeeze into the following two cases:

case client.FieldKind_NILLABLE_STRING_ARRAY, client.FieldKind_NILLABLE_BOOL_ARRAY, client.FieldKind_NILLABLE_FLOAT_ARRAY: cval, err = covertNillableArrayToCval(val) if err != nil { return nil, err } ok = true case client.FieldKind_NILLABLE_INT_ARRAY: cval, err = covertNillableArrayToCvalWithConverter(val, func(in float64) int64 { return int64(in) }) if err != nil { return nil, err } ok = true

This might justify the forcing of pointers and the extra hassle if/when using a custom encoder (and having to re-write to the current), however I do not see it as important, and impossible to know which is better. Will leave as is unless anyone else chimes in here

shahzadlone

LGTM!

jsimnz

Some minor stuff.

Overall I'm concerned about the structure of our dynamic types, but that isn't at all the fault of this PR but a symptom that is expanded a bit in this PR.

Started playing around with a better designed internal type system that may eventually make with like this more succinct.

Will prob try to formally address in 0.4.

But, specific to this PR 👍

jsimnz · 2022-08-24T23:12:49Z

client/option.go

+	// If HasValue is true, this Option contains a value, if
+	// it is false it contains no value.
+	HasValue bool
+
+	// The Value of this Option. Should be ignored if HasValue is false.
+	Value T


suggestion: I feel like these should be private with getters / helpers

:) Why force extra cost on the consumers of this?

Is little/no risk in allowing direct access (I want unions to do this properly, as atm you can access Value if HasValue is false, but in the future that will hopefully be impossible)

I don't see the extra cost, and it ensures correct usage of the API, as right now it's very easy to accidentally break the Option semantics by direct manipulation. There's no need for them to be public.

Calling a function has a (small) cost. It also obfuscates the underlying data structure. It would also be dead internally at the moment.

People have different preferences on this topic (OOP vs DDD/func-prog), and I see this discussion as one of mine vs yours personal preference.

Sorry, not dead if you make them private, but it adds no value to the current code

tiny-go more likely to be used in resource constrained environments

More likely, but not guaranteed, and resource constraints can may this kind of thing even more important.

If its inlined, then there is no "passing" the value between stacks into/outof a function register.

I thought assignment always copied the value (occasional compiler magic aside)? x = option.value always copies the value from option.value into whereever x is? Hiding the source behind a function forces that copying instead of being able to reference option.value directly?

In any case, we are way to far down the optimzation rabbit hole

100% agree, but I think you are misunderstanding me on the below?

The core discussion is the safety/immutability/readonly nature.

Not for me, this is about PR author preference vs reviewer preference. The engineering side has no right answer at the moment as we have no concrete use-cases for defra affected by this and so performance itself does not matter here. What does matter is that based on my personal past experience (typically writing software for the efficient aggregation and display of large datasets in mutation-safer langs) my personal preference (gut feeling saying this is safer) is to use the raw struct, whereas yours is to use a getter.

I do not care so much as to which one is chosen in the short term, but I do care an awful lot RE dev freedom and the priority of the author's personal preference - you earlier said you were happy enough either way and didn't want to play the CTO card. I also said I might even resort to flipping a coin :) I think the convo is now just continuing for fun/learning lol

typically writing software for the efficient aggregation and display of large datasets in mutation-safer langs

Just in case taken the wrong way, I mentioned that as the go-to/default tool of guy who has spent the last 5 years pushing nails into wood is going to be the hammer, even if the problem he is looking at is not necessarily nails-into-wood - we all have our biases.

Not for me, this is about PR author preference vs reviewer preference.

I mentioned further up that I don't believe this is a simple convo of author vs reviewer:
"I still don't think this particular example, knowing Idiomatic Go, and the gurantees/overhead of the language/type system, is a matter of personal preference, and would emphatically suggest implementing immutable gurantees on this type as described."

Obfuscation is a subjective definition, immutability is an objective one.

do not care so much as to which one is chosen in the short term, but I do care an awful lot RE dev freedom and the priority of the author's personal preference

I'm not sure I hold the same opinion here w.r.t freedom. We are all beholdent to the code, and we each individually have our biases and such. The whole point of the reviewer system to (imo) is to hopefully remove said biases by having multiple (one or more) ppl involved in the verification of said code. With all things being equal, a simple (subjective vs subjective) disagreement is fairly easy to resolve by defaulting to author, but I don't see that is a unbreakable rule. I am stubborn, I get tired, I write bad code (at times), I misunderstand goals/constraints, I accidentally leave fmt.Print debug statements in PRs, I have coding styles made up over experience, etc, simply being the author doesn't necessarily imbue stronger powers.

But, as I said before, my approval is already given, merge away 👍, most of my comments thus far have been to articulate my perceived dangers, not preferences. (We could say everything is a matter of preferences, but there has to be a line drawn in the sand, otherwise, noone would ever get anything done ;).)

Obfuscation is a subjective definition, immutability is an objective one.

And which one has greater costs is also highly subjective, and the opinions will vary depending upon which projects each developer has worked on in the past.

The code here is clear, correct (as far as we can see at least), easy to change, and doesn't run like a dog. It is not doing anything weird, is well enough organised, and has been tested. No new linter rules have been requested. I'd lump most things outside of those categories as personal opinion/preference and totally unimportant to the PR/codebase.

If we find ourselves regularly tripping up over mutations to this struct it is easy to change. Likewise performance issues directly resulting from getters are easy enough to solve once found. I would argue that poorly written code that came about due to overly abstracted data-structures is much harder to change, but that argument is only half-fair and loops us back into the subjectivity/personal-opinion topic.

The immutability guarantees provided by getters do not affect the code under review (nowhere in this review cares), and the against them are equally (ir)relevant.

well said!

I might even resort to flipping a coin :)

obligatory...
https://xkcd.com/221/

jsimnz · 2022-08-29T11:58:43Z

db/collection_update.go

+	return resultArray, nil
+}
+
+func convertNillableArrayToCvalWithConverter[TIn any, TOut any](val any, converter func(TIn) TOut) ([]*TOut, error) {


The Cval name isn't necessary, I assume you picked it up from the original variable name.

I thought it was the name of the serialization format value :) Is that incorrect? Will have a look at renaming these in the meantime - might not have to so specific anyway

rename?

Dropped ToCval from both function names

jsimnz · 2022-08-29T11:59:14Z

db/collection_update.go

@@ -548,6 +579,51 @@ func validateFieldSchema(val interface{}, field client.FieldDescription) (interf
 	return cval, err
 }

+func convertNillableArrayToCval[T any](val any) ([]*T, error) {


Cval name isn't necessary. Same as other comment

jsimnz · 2022-08-29T12:08:31Z

Some minor stuff.

Overall I'm concerned about the structure of our dynamic types, but that isn't at all the fault of this PR but a symptom that is expanded a bit in this PR.

Started playing around with a better designed internal type system that may eventually make with like this more succinct.

Will prob try to formally address in 0.4.

But, specific to this PR 👍

Also, huge praise on the scale of tests 🥰

AndrewSisley · 2022-08-29T13:15:00Z

Started playing around with a better designed internal type system that may eventually make with like this more succinct.

Would be interested in seeing that, our current setup is very wasteful in that everything on a doc is a pointer. And the obvious C-like correction of that would probably be quite unsafe and could freak out Go devs 😆

Also, huge praise on the scale of tests

Cheers, the prod code is simple enough, but the aggregates just means there are so many ways in which stuff like this could fail

Was no way to test this before for inline arrays - appart from via explain (which was written with an incorrect assertion), and it was broken :)

Allows for safe(r) and efficient handling of nillable values.

) * Correctly handle inline array nil-filters Was no way to test this before for inline arrays - appart from via explain (which was written with an incorrect assertion), and it was broken :) * Add option type Allows for safe(r) and efficient handling of nillable values. * Add nillable array decode func * Add func to convert to cval type * Add nillable bool array support * Add nillable int array support * Add nillable float array support * Add nillable string array support

AndrewSisley added feature New feature or request area/query Related to the query component area/schema Related to the schema system action/no-benchmark Skips the action that runs the benchmark. labels Aug 24, 2022

AndrewSisley added this to the DefraDB v0.3.1 milestone Aug 24, 2022

AndrewSisley requested a review from a team August 24, 2022 21:15

AndrewSisley self-assigned this Aug 24, 2022

shahzadlone reviewed Aug 24, 2022

View reviewed changes

shahzadlone reviewed Aug 26, 2022

View reviewed changes

AndrewSisley requested a review from shahzadlone August 26, 2022 15:08

shahzadlone approved these changes Aug 29, 2022

View reviewed changes

jsimnz requested changes Aug 29, 2022

View reviewed changes

AndrewSisley force-pushed the sisley/feat/I425-nillable-inline-arrays branch from 956afde to ea2c4cc Compare August 29, 2022 13:31

AndrewSisley added 9 commits August 29, 2022 09:31

Correctly handle inline array nil-filters

018a8ef

Was no way to test this before for inline arrays - appart from via explain (which was written with an incorrect assertion), and it was broken :)

Add option type

cd2d47c

Allows for safe(r) and efficient handling of nillable values.

Add nillable array decode func

4140ddf

Add func to convert to cval type

2468889

Add nillable bool array support

d4b342a

Add nillable int array support

768635c

Add nillable float array support

57a30b4

Add nillable string array support

648dd9e

HIDE - PR suggestions

eed8a32

AndrewSisley force-pushed the sisley/feat/I425-nillable-inline-arrays branch from ea2c4cc to eed8a32 Compare August 29, 2022 13:31

AndrewSisley requested a review from jsimnz August 29, 2022 13:41

jsimnz approved these changes Aug 29, 2022

View reviewed changes

AndrewSisley merged commit 7d072bf into develop Aug 30, 2022

AndrewSisley deleted the sisley/feat/I425-nillable-inline-arrays branch August 30, 2022 14:07

feat: Add support for inline arrays of nillable types #759

feat: Add support for inline arrays of nillable types #759

Conversation

AndrewSisley commented Aug 24, 2022

Relevant issue(s)

Description

codecov bot commented Aug 24, 2022 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shahzadlone Aug 24, 2022 • edited Loading

Choose a reason for hiding this comment

AndrewSisley Aug 25, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jsimnz Aug 29, 2022 • edited Loading

Choose a reason for hiding this comment

AndrewSisley Aug 29, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shahzadlone Aug 26, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shahzadlone left a comment

Choose a reason for hiding this comment

jsimnz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AndrewSisley Aug 29, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AndrewSisley Aug 29, 2022 • edited Loading

Choose a reason for hiding this comment

AndrewSisley Aug 29, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jsimnz commented Aug 29, 2022

AndrewSisley commented Aug 29, 2022

codecov bot commented Aug 24, 2022 •

edited

Loading

shahzadlone Aug 24, 2022 •

edited

Loading

AndrewSisley Aug 25, 2022 •

edited

Loading

jsimnz Aug 29, 2022 •

edited

Loading

AndrewSisley Aug 29, 2022 •

edited

Loading

shahzadlone Aug 26, 2022 •

edited

Loading

AndrewSisley Aug 29, 2022 •

edited

Loading

AndrewSisley Aug 29, 2022 •

edited

Loading

AndrewSisley Aug 29, 2022 •

edited

Loading