kvstore: simplified stmt exec #467

arnetheduck · 2022-01-25T22:09:47Z

iterator for exec - avoid callbacks etc for simplified row reading

iterator for `exec` - avoid callbacks etc for simplified row reading

zah · 2022-01-26T08:51:57Z

4 yield statements inside an iterator would be quite heavy when it comes to code bloat. In the generated code, each yield statement is replaced by the full body of the consuming loop and the iterator is inlined everywhere where it's used.

arnetheduck · 2022-01-26T09:25:27Z

3 (one is when), but ok - I think I'll note it in a TODO though - the whole machinery for exec is quite unergonomic because it tries to be too clever with the tuple-vs-non-tuple types etc, ie I lost half a day trying to fight it - using the raw sqlite interface would have been a lot easier - I think this code needs a round of simplifications in general, and then the iterator can be improved as well

arnetheduck · 2022-01-26T10:09:28Z

down to 2 with reasonable code: 2079341

zah · 2022-01-27T16:50:20Z

If I understand correctly, the benefits of providing this API are rather superficial (some people will like the syntax better), but there are objective technical downsides (code bloat due to inlining).

Were there any problems with the fact that the existing API is based on closures (e.g. were you running into limitations regarding captured variables)?

arnetheduck · 2022-01-27T19:11:47Z

Were there any problems with the fact that the existing API is based on closures (e.g. were you running into limitations regarding captured variables)?

yes, var parameters need ugly workarounds which means you can't update an existing instance which leads to a host of downstream problems - nim introduces an uncontrolled amount of temporaries along the way which becomes significant when dealing with lots of data - we use pointers in some places, but that's ugly and the compiler is too dumb to understand that a closure is local and therefore really should inline itself (like a function operator in c++ / not create an environment.

regarding bloat, I have sympathy for not introducing unnecessary bloat, but I have more sympathy for readability in general - I think some of your fears about bloat miss the bigger picture a bt: if we're going to combat bloat, we need to look at the real costs, and as we've discussed with actual numbers before, macros (async) and generics (ie Table, serialization) are the actual real sources of bloat - not a yield here and there used in limited contexts - but we use them anyway because of the readability and reasonability benefits they bring.

iterators are really really sad already in nim - if we take away the ability to yield more than once, we might as well remove them entirely from our codebase because at that point, the mental overhead of keeping them in short-term active language feature memory is way too large, and we should advocate for their complete removal from Nim.

finally, compilers are really good at removing bloat through CSE in cases like this when there there is a boolean choice between the branches (works / doesn't work) - the actual bloat in the final binary will be nonexistent.

mratsim · 2022-01-28T07:48:47Z

yes, var parameters need ugly workarounds which means you can't update an existing instance which leads to a host of downstream problems - nim introduces an uncontrolled amount of temporaries along the way which becomes significant when dealing with lots of data - we use pointers in some places, but that's ugly and the compiler is too dumb to understand that a closure is local and therefore really should inline itself (like a function operator in c++ / not create an environment.

I've mentioned to @Araq a couple of times that having something akin to "Heap Allocation Elision" would be quite important, both for async with a future that doesn't escape its scope (and multithreading with Flowvar) and for chained closure iterators:

I can open a RFC.

See: https://discord.com/channels/371759389889003530/371759389889003532/809802135629856808

I don't mind heap elision optimizations and we have the technology ready for it (it's not that different from escape analysis, deep immutability or anything we already do)

arnetheduck · 2022-01-28T09:45:15Z

I've mentioned to @Araq a couple of times that having something akin to "Heap Allocation Elision" would be quite important, both for async with a future that doesn't escape its scope (and multithreading with Flowvar) and for chained closure iterators:

this is not quite relevant here: the reason to pass in a var is not necessarily to avoid allocations or to avoid closures, but because desired semantics dictate that we modify an existing instance, and doing that with captures is not .. safe / possible / bla bla..

the heap allocation is of less importance generally, even though avoiding it would be nice (for orthogonal reasons)

zah · 2022-01-28T13:34:58Z

When you author code, there are many layers of knowledge that you take into account (algorithmic, hardware, compilation theory, etc). Claiming that some of this knowledge is too much overhead, while other parts are acceptable is unreasonable IMO.

Iterators should be used whenever this makes sense. If there are simple ways to optimize them, they should be exploited. If you weighting the two sides on a decision like "should I use a function or an iterator", the size of the body of the iterator and the number of yield statements should just add weight towards selecting the function approach. Technically, it's possible to implement the iterators with multiple code generation strategies in the compiler (e.g. you can pass the body as a closure), so perhaps the code bloat concerns can be addressed in a systematic way in the future without requiring much rework of the code.

arnetheduck · 2022-01-28T16:37:48Z

Claiming that some of this knowledge is too much overhead, while other parts are acceptable is unreasonable IMO.

price vs performance - if a feature has so many caveats that it becomes an exception that you manage to use it correctly, it's the feature that is wrong, not the developer.

arnetheduck · 2022-01-28T16:40:04Z

so perhaps the code bloat concerns can be addressed in a systematic way in the future without requiring much rework of the code.

yes, this is what I would hope for, and as long as there's no measurable downside, it's good to write code as naturally as possible so as to give the right priority signals to the compiler devs: maybe it's an imaginary downside or maybe it's really prevalent and important, and sometimes, you just don't know until you try.

kvstore: simplified stmt exec

eaaa4b0

iterator for `exec` - avoid callbacks etc for simplified row reading

avoid a yield

2079341

oops

49d1d29

zah merged commit ce4acc1 into master Jan 28, 2022

zah deleted the exec-iterator branch January 28, 2022 13:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kvstore: simplified stmt exec #467

kvstore: simplified stmt exec #467

arnetheduck commented Jan 25, 2022

zah commented Jan 26, 2022

arnetheduck commented Jan 26, 2022

arnetheduck commented Jan 26, 2022

zah commented Jan 27, 2022

arnetheduck commented Jan 27, 2022 •

edited

Loading

mratsim commented Jan 28, 2022 •

edited

Loading

arnetheduck commented Jan 28, 2022

zah commented Jan 28, 2022

arnetheduck commented Jan 28, 2022

arnetheduck commented Jan 28, 2022

kvstore: simplified stmt exec #467

kvstore: simplified stmt exec #467

Conversation

arnetheduck commented Jan 25, 2022

zah commented Jan 26, 2022

arnetheduck commented Jan 26, 2022

arnetheduck commented Jan 26, 2022

zah commented Jan 27, 2022

arnetheduck commented Jan 27, 2022 • edited Loading

mratsim commented Jan 28, 2022 • edited Loading

arnetheduck commented Jan 28, 2022

zah commented Jan 28, 2022

arnetheduck commented Jan 28, 2022

arnetheduck commented Jan 28, 2022

arnetheduck commented Jan 27, 2022 •

edited

Loading

mratsim commented Jan 28, 2022 •

edited

Loading