Problem

A potential CPU usage improvement opportunity: when unpacking a record, e.g.
```sql
SELECT (rec).f1, (rec).f2, (rec).f3, (rec).f4, (rec).f5 FROM records
```
we execute this as 5 record_get function calls, and each record_get call gets passed the full record as a Datum, on which it does .unwrap_list().iter().nth(self.0).unwrap(). This means that each record_get call decodes every field that precedes the requested field from bytes to Datums (i.e., read_datum is called a quadratic number of times in total). This could surely be improved somehow.
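To make the quadratic behavior concrete, here is a minimal standalone sketch; the length-prefixed format, FieldIter, and this record_get are made-up stand-ins for Materialize's actual row encoding and mz_repr types:

```rust
// Toy model of the quadratic decoding, assuming a made-up length-prefixed
// byte format in place of Materialize's real row encoding; FieldIter and this
// record_get are illustrative stand-ins, not the actual mz_repr types.

/// Iterates over the fields of one encoded record.
struct FieldIter<'a> {
    buf: &'a [u8],
}

impl<'a> Iterator for FieldIter<'a> {
    type Item = &'a [u8];

    // Decoding is forward-only: finding where field i+1 starts requires
    // reading the length header of field i, so there is no random access.
    fn next(&mut self) -> Option<&'a [u8]> {
        let (&len, rest) = self.buf.split_first()?;
        let (field, rest) = rest.split_at(len as usize);
        self.buf = rest;
        Some(field)
    }
}

// What each record_get(rec, i) call effectively does today: restart at the
// beginning of the record and decode fields 0..=i.
fn record_get(record: &[u8], i: usize) -> Option<&[u8]> {
    FieldIter { buf: record }.nth(i)
}

fn main() {
    // A record with fields "a", "bb", "ccc", each prefixed by its length.
    let record = [1, b'a', 2, b'b', b'b', 3, b'c', b'c', b'c'];
    // Unpacking all n fields via separate record_get calls decodes
    // 1 + 2 + ... + n fields in total, i.e. O(n^2) work.
    for i in 0..3 {
        let field = record_get(&record, i).unwrap();
        println!("field {}: {:?}", i, std::str::from_utf8(field).unwrap());
    }
}
```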
Benefits of solving this

This comes up when unpacking nested Avro records (which are very common, according to @benesch), and also here, and also for every window function call.

Solution

A possible approach is to create an unpack_record table function. Note that table functions are allowed to produce multiple columns (not just multiple rows).
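Continuing the toy sketch from the Problem section (the real unpack_record would operate on Datums and have a different signature, so this only illustrates the complexity win), a single pass can decode every field exactly once:

```rust
// Hypothetical single-pass unpacking, reusing the toy FieldIter from the
// sketch above: each field is decoded exactly once, so unpacking all n
// fields is O(n) total work instead of the O(n^2) of n record_get calls.
// The returned fields would become the columns of one output row.
fn unpack_record(record: &[u8]) -> Vec<&[u8]> {
    FieldIter { buf: record }.collect()
}
```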
The question is where we should introduce calls to unpack_record. One possibility is to introduce it when the user types the special syntax (col).* (this syntax already exists). Another possibility is to make the optimizer capable of recognizing the pattern of multiple record_get calls on the same record, and turning it into an unpack_record. Note that these two possibilities are not mutually exclusive.
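As a rough illustration of the optimizer-based recognition, here is a sketch over toy IR types; Expr and Plan stand in for Materialize's real MirScalarExpr/MirRelationExpr, and an actual transform would also need to handle projections, expressions below the Map, and more:

```rust
// Toy IR standing in for Materialize's MIR; everything here is illustrative.
#[allow(dead_code)]
enum Expr {
    Column(usize),
    // record_get(#record, field): extract one field of the record held in
    // input column `record`.
    RecordGet { record: usize, field: usize },
}

#[allow(dead_code)]
enum Plan {
    Get(String),
    // Map appends one new column per scalar expression.
    Map { input: Box<Plan>, exprs: Vec<Expr> },
    // FlatMap calls a table function; unpack_record would append all of the
    // record's fields as columns in one go.
    FlatMap { input: Box<Plan>, func: String, record: usize },
}

// If every mapped expression is a record_get on the same input column, with
// fields requested in order 0, 1, 2, ..., fuse the Map into one FlatMap.
// (A real transform could insert a projection to handle other field orders.)
fn fuse_record_gets(plan: Plan) -> Plan {
    match plan {
        Plan::Map { input, exprs } if !exprs.is_empty() => {
            let rec = match &exprs[0] {
                Expr::RecordGet { record, .. } => *record,
                _ => return Plan::Map { input, exprs },
            };
            let fusable = exprs.iter().enumerate().all(|(i, e)| {
                matches!(e, Expr::RecordGet { record, field } if *record == rec && *field == i)
            });
            if fusable {
                Plan::FlatMap { input, func: "unpack_record".into(), record: rec }
            } else {
                Plan::Map { input, exprs }
            }
        }
        other => other,
    }
}

fn main() {
    // SELECT (rec).f1, (rec).f2, (rec).f3 over a one-column input.
    let plan = Plan::Map {
        input: Box::new(Plan::Get("records".into())),
        exprs: (0..3).map(|i| Expr::RecordGet { record: 0, field: i }).collect(),
    };
    match fuse_record_gets(plan) {
        Plan::FlatMap { func, record, .. } => println!("fused into {}(#{})", func, record),
        _ => println!("not fused"),
    }
}
```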
I think we should only go with the optimizer-based approach, and do the new transform near the end of the pipeline, because:
- The user might not use the (col).* syntax.
- Window functions also introduce record_get calls. (Although, we could change the window function HIR lowering to introduce unpack_record instead of record_gets.)
- The biggest issue with introducing unpack_record already when planning (col).* is that we'd then have new FlatMaps throughout the optimizer pipeline, and transforms are not very good at handling FlatMap.
The drawback of introducing it only at the end of the optimizer pipeline is that things are a bit delicate near the end of the pipeline, but I think this would be manageable.
There is the question of where exactly to put it in the physical_optimizer. I'd say after JoinImplementation but before NormalizeLets, because if it came before JoinImplementation it could prevent arrangement reuse.
Note that, unfortunately, it can make certain join plans worse in some corner cases even if it comes after JoinImplementation, because it can prevent the lowering from pushing record deconstructions into the Join's LIR plan; but this problem would be present even if we put the new transform before JoinImplementation. If this turns out to be a big problem, we could later either
- implement the pushing of record unpacking into LIR Join plans, or
- make the transform not fire for MFPs that are directly on top of Joins where the MFP being pushed into the Join looks important (e.g., due to filters on record fields).
Slack discussion: https://materializeinc.slack.com/archives/C0761MZ3QD9/p1718655594016239

(The alternative idea of implementing advance_by for DatumListIter also came up, but Moritz said he already tried that, and it didn't make a difference.)
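For completeness, this is roughly what advance_by amounts to on the toy FieldIter from the Problem section (the actual proposal concerned DatumListIter). A plausible reason it made no difference, though this is only my guess: with inline length headers, even a dedicated skip must still walk every skipped field's header, so the cost of reaching field i stays proportional to i.

```rust
// Sketch of the advance_by idea on the toy FieldIter (the real proposal was
// for Materialize's DatumListIter). Even a dedicated skip must read each
// skipped field's length header to find the next field, so reaching field i
// remains O(i) work per record_get call.
impl<'a> FieldIter<'a> {
    fn advance_by(&mut self, n: usize) -> Result<(), usize> {
        for done in 0..n {
            let (&len, rest) = match self.buf.split_first() {
                Some(x) => x,
                None => return Err(done), // fewer than n fields remained
            };
            // Skip the payload without materializing a field value.
            self.buf = &rest[len as usize..];
        }
        Ok(())
    }
}
```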
Update: This turns out to be not so urgent. After #29554, we consolidate before unpacking the records, so the unpacking is only performed on the delta.
An exception is when the MFP above is fused into the Reduce. This happens mostly when it involves a filter, but even then it's probably unpacking just a few fields, so the quadraticness of a full unpack is not really happening.