feat: Expand enumerable types (queue, socket) #7

AndrewSisley · 2023-06-09T15:24:13Z

Resolves #6

Expands enumerable types, adding a ring-buffer queue, and a socket, as well as allowing the Concat enumerable to receive new sources after enumeration has begun.

These are to be used by the lens-in-defra code.

You can see how they are to be used in the WIP draft here: sourcenetwork/defradb#1564 - the code in the Next function in lens/lens.go is the most relevant, and whilst still fairly unpolished, it should be fairly stable.

AndrewSisley · 2023-06-19T14:28:02Z

enumerable/concat.go

 func (s *enumerableConcat[T]) Next() (bool, error) {
+	startSourceIndex := s.currentSourceIndex
+	hasLooped := false
+
 	for {


thought: I briefly messed up this logic, and I think there is a strong case for adding concat tests within this PR, as this type is now a bit more complicated. It is tested by defra, however. Let me know; otherwise I will just see how I feel once approved.

It would be nice to have it tested within the repo.

I feel like this Next logic is a lot more complicated than it needs to be, and it has a few potential oddities. I'm holding off on specifics as you mentioned it was briefly messed up. Is the current version the corrected version?

To make sure we're on the same page on the intended behavior of Concatenation types, can you add a docstring to the interface describing the behavior of the new one? I think it's straightforward, but I just want to confirm, and make sure others who come across this also know.

> I feel like this Next logic is a lot more complicated than it needs to be

The complication involved is to preserve the order of items, so that Next does not bounce in and out of sources. We want to yield items from source A, then B, then C, then check if A has new items.

> To make sure we're on the same page on the intended behavior of Concatenation types, can you add a docstring to the interface describing the behavior of the new one.

I'm not sure I understand you here. There is only one concrete Concatenation type, and one interface, and they are both documented.

Tests have been added

fredcarle

I have a few suggestions related to return types and a question.

fredcarle · 2023-06-19T16:12:08Z

enumerable/concat.go

-func Concat[T any](sources ...Enumerable[T]) Enumerable[T] {
+//
+// New sources may be added after iteration has begun.
+func Concat[T any](sources ...Enumerable[T]) Concatenation[T] {


suggestion: Not sure if I mentioned this before but we should return concrete type when possible. This way type methods don't need to be strictly tied to the interface.

The concrete types are all private, and the new functions public.

You can make the concrete types public then. I really think that unless an interface method needs an interface return type that we try to avoid returning interface types

> You can make the concrete types public then. I really think that unless an interface method needs an interface return type that we try to avoid returning interface types

I am unsure as to how to express just how much I really do not want to make the concrete types public.

Why? You don't have to make the fields public. It's much better than returning an interface type.

Nope. These are internal types, and I highly value the ability to rip and replace them, or allow for more complex constructors which may not return a single concrete type without having to worry about people directly referencing them.

These are not run of the mill business-logic types, but super generic data structures, and I am somewhat sceptical of anyone's ability to convince me to expose them in the short-medium term. I think that is a very bad thing to do.

There is absolutely no reason for anyone to need to directly interact with the concrete types in this package.

I like the interfaces (granted I'm a little interface happy compared to average Go dev 😉)

> Nope. These are internal types, and I highly value the ability to rip and replace them, or allow for more complex constructors which may not return a single concrete type without having to worry about people directly referencing them.

Returning a concrete type makes zero difference for this statement. You can still rip and replace without worrying about people directly referencing them.

> There is absolutely no reason for anyone to need to directly interact with the concrete types in this package.

Again, returning a concrete type makes zero difference for this statement. All the methods relate to the concrete type and if there are no public fields the limiting factor is the same.

The drawback with returning an interface is that if you want to add functionality to the type and still be able to use the constructor, you have to also change the interface which one might not want to do. It's much more flexible.

> I like the interfaces (granted I'm a little interface happy compared to average Go dev 😉)

The number of times I've hit walls because of us using interfaces for return types 😤... lol

For reference: https://github.com/golang/go/wiki/CodeReviewComments#interfaces
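
The mechanics both sides are referring to can be sketched roughly as below. This does not settle the disagreement; it only shows what changes for a caller when a constructor returns an interface versus a concrete type. All names here (`SliceQueue`, `NewConcrete`, `NewAbstract`, `count`) are hypothetical, and the exported concrete type is an assumption made for the illustration, since the types in this PR are private.

```go
package main

import "fmt"

// Enumerable is a pared-down stand-in for the package's interface (assumed shape).
type Enumerable[T any] interface {
	Next() (bool, error)
	Value() (T, error)
}

// SliceQueue is a hypothetical exported type; its fields stay private.
type SliceQueue[T any] struct {
	values []T
	next   int
}

func (q *SliceQueue[T]) Next() (bool, error) {
	if q.next >= len(q.values) {
		return false, nil
	}
	q.next++
	return true, nil
}

func (q *SliceQueue[T]) Value() (T, error) {
	if q.next == 0 {
		var zero T
		return zero, fmt.Errorf("value requested before the first call to Next")
	}
	return q.values[q.next-1], nil
}

// Size is extra behaviour that is not part of Enumerable.
func (q *SliceQueue[T]) Size() int { return len(q.values) }

// NewConcrete returns the concrete type, so callers can reach Size directly.
func NewConcrete[T any](items ...T) *SliceQueue[T] {
	return &SliceQueue[T]{values: items}
}

// NewAbstract returns the interface, so Size is hidden unless the interface grows.
func NewAbstract[T any](items ...T) Enumerable[T] {
	return &SliceQueue[T]{values: items}
}

// count consumes any Enumerable, whichever constructor produced it.
func count[T any](e Enumerable[T]) int {
	n := 0
	for {
		has, err := e.Next()
		if err != nil || !has {
			return n
		}
		n++
	}
}

func main() {
	c := NewConcrete(1, 2, 3)
	fmt.Println(c.Size(), count[int](c)) // the concrete value still satisfies Enumerable

	a := NewAbstract(1, 2, 3)
	// a.Size() would not compile here; a type assertion is needed instead.
	sized, ok := a.(interface{ Size() int })
	fmt.Println(ok, sized.Size())
}
```

The trade-off on the other side of the argument is also visible: once `*SliceQueue[T]` appears in a public signature, callers can name it, which is the coupling the author wants to avoid.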

fredcarle · 2023-06-19T16:12:42Z

enumerable/queue.go

+// NewQueue creates an empty FIFO queue.
+//
+// It is implemented using a dynamically sized ring-buffer.
+func NewQueue[T any]() Queue[T] {


suggestion: Same as above. Return concrete type instead of interface.

fredcarle · 2023-06-19T16:14:19Z

enumerable/socket.go

+// If enumeration begins before a source has been set it will behave as if empty.
+// Reseting the Socket will reset the source if there is one, and then remove it
+// as the source of this Socket.
+func NewSocket[T any]() Socket[T] {


suggestion: Same as above. Return concrete type.

fredcarle · 2023-06-19T16:18:12Z

enumerable/socket.go

+
+// Socket is an extention of the enumerable interface allowing the source
+// to be replaced after initial construction.
+type Socket[T any] interface {


question: Can you explain why you decided to name this type Socket? At the moment it doesn't jibe with my current definition of a socket.

A socket is a fixture into which other stuff may slot into.

I find it confusing as it leads me to believe that it relates to network sockets. Don't you think?

Network sockets took their name from the same non-computing concept, and I see the two as conceptually quite similar; the only difference is that one is remote/cross-process and this is not.

jsimnz

I have some questions on some of the design choices. There are no immediate "todos" at the moment; depending on the conversations, there may be follow-up suggestions/todos.

Marking as "Request Changes" for now just to prevent merge (sorry).

jsimnz · 2023-06-19T16:22:41Z

enumerable/concat.go

+func Concat[T any](sources ...Enumerable[T]) Concatenation[T] {
 	return &enumerableConcat[T]{
 		sources:            sources,
 		currentSourceIndex: 0,
 	}
 }

+// Append appends a new source to this concatenation.
+//
+// This may be done after enumeration has begun.
+func (s *enumerableConcat[T]) Append(newSource Enumerable[T]) {
+	s.sources = append(s.sources, newSource)
+}


question: I'm a little confused about the balance of having both Append and Concat. Concat takes a list of sources, appends them together, and produces a Concatenation[T], which also allows additional sources to be appended via the Append method.

Would it not be better to have a single top-level Concat or Append function with no methods? This would produce a pattern similar to Go's native append, where you always call append, the first parameter is the target slice, and the rest are the elements to be added. It should be easy enough to define an EnumerableSlice type, for example, which still implements the Enumerable interface.

> Would it not be better to have a single Concat or Append top level function with no methods?

I am not sure I understand what you mean here. Concat is the constructor; Append allows adding a new source to an existing concatenation.

Append needs to live on the concat type, as only a concat type can handle this, and the stuff required for the implementation is internal/private.

jsimnz · 2023-06-19T16:30:14Z

enumerable/concat.go

 func (s *enumerableConcat[T]) Next() (bool, error) {
+	startSourceIndex := s.currentSourceIndex
+	hasLooped := false
+
 	for {


I feel like this Next logic is a lot more complicated than it needs to be, and it has a few potential oddities. I'm holding off on specifics as you mentioned it was briefly messed up. Is the current version the corrected version?

To make sure we're on the same page on the intended behavior of Concatenation types, can you add a docstring to the interface describing the behavior of the new one? I think it's straightforward, but I just want to confirm, and make sure others who come across this also know.

jsimnz · 2023-06-19T16:33:23Z

enumerable/queue.go

+
+// NewQueue creates an empty FIFO queue.
+//
+// It is implemented using a dynamically sized ring-buffer.


question: Why does a FIFO queue need to be implemented as a ring-buffer? Does this imply you can lose elements in the queue?

It is a dynamically sized ring buffer, so no items can be lost. It is just a more efficient way of implementing a queue.

jsimnz · 2023-06-19T16:37:48Z

enumerable/queue.go

+		// For now, increasing the size one at a time is likely optimal
+		// for the only useage of the queue type.  We may wish to change
+		// this at somepoint however.
+		newValues := make([]T, len(q.values)+1)
+		copy(newValues, q.values)
+		q.values = newValues


I don't see why this is the preferred method for growing the values slice. Why not just append to it? The current method is extremely inefficient, as it allocates and copies the whole slice every time (short of the ring-buffer loop back, but I have a separate comment above regarding the ring-buffer design).

The native append is pretty optimized to avoid unnecessary allocations/copies.

The +1 is temporary-ish and is likely to change. copy is much better to use than append when adding more than 1 item at a time.

copy most certainly is not better when adding more than 1 item. append can append as many items as you want, and will determine the ideal allocations/copies when needed.

I think you are mistaken, check the benches in this blog post: https://eremeev.ca/posts/golang-copy-vs-append/
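
One narrow part of this exchange can be checked without benchmarks: how often the backing array is replaced when a slice grows one element at a time. The sketch below counts that for the make+copy pattern from the hunk and for the built-in append. It says nothing about the multi-item case raised in the reply or about the linked post's results; `growByOne` and `countAppendReallocs` are hypothetical helpers written for this illustration.

```go
package main

import "fmt"

// growByOne extends s by one slot using make+copy, as in the hunk above.
// Every call allocates a fresh backing array and copies the old contents.
func growByOne[T any](s []T) []T {
	grown := make([]T, len(s)+1)
	copy(grown, s)
	return grown
}

// countAppendReallocs reports how many times the capacity changes while
// n items are added one at a time with the built-in append.
func countAppendReallocs(n int) int {
	var s []int
	reallocs := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			reallocs++
		}
	}
	return reallocs
}

func main() {
	const n = 1000

	var s []int
	for i := 0; i < n; i++ {
		s = growByOne(s) // one allocation per added item, by construction
		s[i] = i
	}

	fmt.Println("make+copy allocations:", n)
	fmt.Println("append reallocations: ", countAppendReallocs(n))
}
```

Because append over-allocates capacity, it reallocates far less often than once per item; the exact count depends on the Go version's growth policy.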

jsimnz · 2023-06-19T16:38:20Z

enumerable/queue.go

+	values       []T
+	currentIndex int
+	lastSetIndex int
+	zeroIndexSet bool


question: This seems to be a config/behavior flag, can you document it :)

It is not, it is internal stuff required for managing internal state.

documentation would be helpful

Will add, although I'm not sure much can be added to the declarations themselves.

doc private internal state props

jsimnz · 2023-06-19T16:39:06Z

enumerable/queue_test.go

praise: Great tests 👍

jsimnz · 2023-06-19T16:41:00Z

enumerable/socket.go

+
+func (s *socket[T]) Value() (T, error) {
+	if !s.source.HasValue() {
+		var v T


nitpick: Slight preference for var zero T when using the default/zero value in a generic function like this. (SMALL PREFERENCE, feel free to ignore :) )

jsimnz · 2023-06-19T16:42:00Z

enumerable/socket.go

+
+// Socket is an extention of the enumerable interface allowing the source
+// to be replaced after initial construction.
+type Socket[T any] interface {


question: Can you describe the use case for this particular type?

You can see this being used here: https://github.com/sourcenetwork/defradb/pull/1564/files#diff-e7f1317923231a3017c3a580d21ce39346555446638d045a69310b1a1aeab15d in the MigrateUp function. It allows enumerables to be defined before their source, and for sources to be swapped in and out.
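
As a rough illustration of the behaviour described in this thread and in the doc comment on NewSocket (an enumerable that is empty until a source is plugged in, and whose source can be swapped later), consider the sketch below. It is not the PR's implementation: `SetSource` is a hypothetical method name, `sliceSource` is a stand-in source, and the `Enumerable` interface is pared down.

```go
package main

import "fmt"

// Enumerable is a pared-down stand-in for the package's interface (assumed shape).
type Enumerable[T any] interface {
	Next() (bool, error)
	Value() (T, error)
}

// socket forwards to whatever source is currently plugged in.
type socket[T any] struct {
	source Enumerable[T]
}

// SetSource is a hypothetical name for the method that swaps the source.
func (s *socket[T]) SetSource(src Enumerable[T]) { s.source = src }

// Next behaves as if empty when no source has been set.
func (s *socket[T]) Next() (bool, error) {
	if s.source == nil {
		return false, nil
	}
	return s.source.Next()
}

func (s *socket[T]) Value() (T, error) {
	if s.source == nil {
		var zero T
		return zero, nil
	}
	return s.source.Value()
}

// sliceSource is a hypothetical fixed source used to drive the example.
type sliceSource[T any] struct {
	items []T
	next  int
}

func (s *sliceSource[T]) Next() (bool, error) {
	if s.next >= len(s.items) {
		return false, nil
	}
	s.next++
	return true, nil
}

func (s *sliceSource[T]) Value() (T, error) {
	if s.next == 0 {
		var zero T
		return zero, fmt.Errorf("value requested before the first call to Next")
	}
	return s.items[s.next-1], nil
}

func main() {
	s := &socket[string]{}

	has, _ := s.Next()
	fmt.Println(has) // no source yet, so it behaves as if empty

	s.SetSource(&sliceSource[string]{items: []string{"a"}})
	has, _ = s.Next()
	v, _ := s.Value()
	fmt.Println(has, v)

	s.SetSource(&sliceSource[string]{items: []string{"b"}}) // swap the source
	has, _ = s.Next()
	v, _ = s.Value()
	fmt.Println(has, v)
}
```

Downstream code can hold a reference to the socket before any real source exists, which matches the "defined before their source" use case mentioned above.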

islamaliev

in general looks good, but there are some major concerns related to performance.

islamaliev · 2023-06-20T19:04:09Z

enumerable/concat.go

@@ -1,5 +1,15 @@
 package enumerable

+// Concatenation is an extention of the enumerable interface allowing new sources
+// to be added after initial construction.
+type Concatenation[T any] interface {


suggestion: I think the name does not quite reflect its purpose.
If it's an Enumerable that can be expanded, why not call it ExpandableEnumerable or DynamicEnumerable?

expanded or Dynamic is way too vague for my liking as it can mean a whole bunch of stuff. And concat/concatenation is a fairly common term for this behaviour.

islamaliev · 2023-06-21T08:28:55Z

enumerable/concat.go

 func (s *enumerableConcat[T]) Next() (bool, error) {
+	startSourceIndex := s.currentSourceIndex
+	hasLooped := false
+
 	for {
 		if s.currentSourceIndex >= len(s.sources) {


suggestion: I think it's worth mentioning that this scenario is only possible if Next is called on an exhausted enumerable.

Not a problem, will add

doc if case

islamaliev · 2023-06-21T08:48:27Z

enumerable/concat_test.go

+
+	hasNext, err = concat.Next()
+	require.NoError(t, err)
+	require.False(t, hasNext)


suggestion: these repetitive checks can be put in a loop

I kind of dislike that. We are testing a small number of specific items, and in some cases, they are from different sources. I see a loop as a complication and abstraction that detracts from the readability of the test.

I also thought about refactoring it out into a private test func, but that also masks the Act and Assert part of the test and I much prefer them to be highly visible.

islamaliev · 2023-06-21T09:06:10Z

enumerable/queue.go

+	//
+	// This may include empty space where yield items previously resided.
+	// Useful for testing and debugging.
+	Size() int


question: why don't we have Size for Enumerable?

It would mean complicating the core interface with a function that is not used in production code, and implementing it on every type, regardless of whether it makes any sense on that type (I don't think it does for a lot of them; many just have a source enumerable and a predicate/int/etc).

What you just stated, Andy, is one of the reasons why returning the concrete type is better practice in Go. If you want a Size when it makes sense, you don't need to modify the interface for it :)

islamaliev · 2023-06-21T09:35:51Z

enumerable/queue.go

+// For now, increasing the size one at a time is likely optimal
+// for the only useage of the queue type.  We may wish to change
+// this at somepoint however.
+const growthRate int = 1


todo: this is not a rate. Rate should be a floating number. The better name would be growthSize

Agreed, and I felt a little uncomfortable calling it rate initially but failed to come up with an alternative at the time. Size is much more accurate, thanks.

Rename const

islamaliev · 2023-06-21T09:44:11Z

enumerable/queue.go

+	}
+
+	if index >= len(q.values) {
+		newValues := make([]T, len(q.values)+growthRate)


todo: why increment capacity by 1?

This is extremely inefficient for any reasonably sized data structure.
I'd say we need to go with 1.5 ratio. Can't think of any disadvantages

This is documented on the const.

islamaliev · 2023-06-21T09:50:04Z

enumerable/queue.go

+type queue[T any] struct {
+	values       []T
+	currentIndex int
+	lastSetIndex int


suggestion: wouldn't it be more intuitive to call these two readIndex and writeIndex?

I think read/write is less accurate though, especially given that the two props actually differ in terms of temporality: currentIndex is the last index produced by a call to Next, that may or may not be out of bounds, and may or may not have been read via Value. lastSetIndex has been set, was valid, and is essentially a record of the past.

Writing all that out though, I may have missed something whilst testing.

Test post-loop Next=>Put=>Value - I'm pretty sure it is broken, and that a space between these two integers should always be maintained to prevent the premature overwrite of Value

islamaliev · 2023-06-21T10:04:28Z

enumerable/queue.go

+	} else if index == q.currentIndex+growthRate {
+		// If the write index has caught up to the read index
+		// the new value needs to be written between the two
+		// e.g: [3,4,here,1,2]


question: why not rearrange the elements?

I see now the reason why you grow the size by 1: it's easier to handle this scenario.
I would say when this happens we can rearrange from:

[3,4,1,2]
     ^
     |
     read and write index

to this:

[1,2,3,4,_,_,_]
 ^       ^
 |       |
 |       write index
 read index

The queue is not thread-safe anyway

That is an interesting thought; I will look at this some more. I do worry that this might make any future thread safety harder or more expensive to achieve, but maybe not really any more than allowing the buffer to grow already does.

What benefit do you see in this though?

I thought this was the main reason why you chose (for now) to grow the array by 1, so that it's easier to manage indexes. Otherwise your read index will have to keep in mind that there are empty slots for future writing that it will have to jump over.

[1,2,_,_,_,3,4]
   ^ ^     ^
   | |     next read index
   | write index
   read index

But with the rearrangement you can grow it as much as you want.

Ah no, the +1 is because I only want it to grow by one. The problem you just described doesn't exist, as read never crosses the gap (it is a queue: if an item is read it is consumed and never gets read again, and if the read index reaches the gap it means the queue is currently empty).

Note: I actually tried using this setup last week to help solve an edge case. It did not simplify the code unfortunately.

islamaliev · 2023-06-21T10:09:06Z

enumerable/concat_test.go

+	"github.com/stretchr/testify/require"
+)
+
+func TestConcatYieldsNothingGivenEmpty(t *testing.T) {


suggestion: may try the agreed upon test name structure: <category>_<condition>_<result>?

This example would be TestConcat_IfEmpty_YieldNothing

I had already added tests before that conversation took place, and I'd prefer not to mix and match within a PR, and I'd also rather not rename everything. The agreement was more of a 'let's try this for a bit' agreement, not a 'let's convert the entire codebase to this right now' agreement.

islamaliev · 2023-06-21T10:11:53Z

enumerable/queue_test.go

+	require.True(t, hasNext)
+
+	r2, err := queue.Value()
+	// [4, 5, 6, , 3]


praise: helpful visualisation

islamaliev · 2023-06-22T14:16:10Z

enumerable/queue.go

+
+// For now, increasing the size one at a time is likely optimal
+// for the only useage of the queue type.  We may wish to change
+// this at somepoint however.


question: I can't see why we would want to change it in the future and not now. (original comment #7 (comment))

Would be nice to hear some benefits of growing the size by 1.

Because +1 fits the current use case very well, but in the future that may change as both the scope of Lens-in-Defra grows, and/or if other areas wish to use this type.

Please remember that Lens in Defra is currently as simple as it will get, and it will very likely continue to grow in complexity.

I don't think it's a valid argument when developing an independent component, which the whole enumerable is.
We should keep in mind ideally all or at least major use cases, and not only specific ones.
What if we decide (which is highly likely) to use the queue in some other component and stuff it with thousands of items?
The cost of making it more future-proof is low, so I still don't understand why we don't do it now.

I agree with you Islam and I'm not a fan of using copy in this situation either but for the sake of moving this PR ahead, I think we can leave it as is and if we end up using it elsewhere, we can make the change at that time.

fredcarle

LGTM

AndrewSisley requested review from jsimnz, islamaliev, fredcarle and shahzadlone June 9, 2023 15:24

AndrewSisley self-assigned this Jun 9, 2023

AndrewSisley added 3 commits June 16, 2023 15:09

Allow appending new sources to concatenation

4ab8d6d

This will be used by lens within Defra.

Add FIFO ring-buffer queue

d1111f8

Will be used by lens within defra

Add socket enumerable

6063ec4

Will be used by lens-in-defra

AndrewSisley force-pushed the sisley/6-concat-fanciness branch from bf8492a to 6063ec4 Compare June 16, 2023 19:09

AndrewSisley commented Jun 19, 2023

fredcarle requested changes Jun 19, 2023

jsimnz requested changes Jun 19, 2023

AndrewSisley requested review from jsimnz and fredcarle June 19, 2023 17:38

AndrewSisley force-pushed the sisley/6-concat-fanciness branch 4 times, most recently from 639847a to 599e1c8 Compare June 20, 2023 16:53

islamaliev requested changes Jun 21, 2023

islamaliev reviewed Jun 22, 2023

AndrewSisley force-pushed the sisley/6-concat-fanciness branch from 599e1c8 to bda585e Compare June 23, 2023 14:18

AndrewSisley added 2 commits June 23, 2023 12:22

PR FIXUP - Correct post-loop growth behaviour

4ed3981

PR FIXUP - Add Concat tests

ca35901

AndrewSisley force-pushed the sisley/6-concat-fanciness branch from bda585e to ca35901 Compare June 23, 2023 16:22

AndrewSisley requested a review from islamaliev June 23, 2023 16:24

AndrewSisley added 3 commits June 23, 2023 15:29

PR FIXUP - Document concat next loop

326a7c5

PR FIXUP - Rename growthFoo const

ad537cc

PR FIXUP - Document queue private props

463e41f

fredcarle approved these changes Jun 27, 2023

AndrewSisley merged commit 65c38cd into main Jun 30, 2023
