Problem

A potential CPU usage improvement opportunity: when unpacking a record, e.g.
```sql
SELECT (rec).f1, (rec).f2, (rec).f3, (rec).f4, (rec).f5 FROM records
```
we execute this as 5 record_get function calls, and each record_get call gets passed the full record as a Datum, on which it does .unwrap_list().iter().nth(self.0).unwrap(). This means that each record_get call decodes every field that precedes the requested field from bytes to Datums (i.e., read_datum is called a quadratic number of times in total). This could surely be improved somehow.
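To make the quadratic behavior concrete, here is a minimal standalone sketch; the length-prefixed format, FieldIter, and this record_get are made-up stand-ins for Materialize's actual row encoding and mz_repr types:

```rust
// Toy model of the quadratic decoding, assuming a made-up length-prefixed
// byte format in place of Materialize's real row encoding; FieldIter and this
// record_get are illustrative stand-ins, not the actual mz_repr types.

/// Iterates over the fields of one encoded record.
struct FieldIter<'a> {
    buf: &'a [u8],
}

impl<'a> Iterator for FieldIter<'a> {
    type Item = &'a [u8];

    // Decoding is forward-only: finding where field i+1 starts requires
    // reading the length header of field i, so there is no random access.
    fn next(&mut self) -> Option<&'a [u8]> {
        let (&len, rest) = self.buf.split_first()?;
        let (field, rest) = rest.split_at(len as usize);
        self.buf = rest;
        Some(field)
    }
}

// What each record_get(rec, i) call effectively does today: restart at the
// beginning of the record and decode fields 0..=i.
fn record_get(record: &[u8], i: usize) -> Option<&[u8]> {
    FieldIter { buf: record }.nth(i)
}

fn main() {
    // A record with fields "a", "bb", "ccc", each prefixed by its length.
    let record = [1, b'a', 2, b'b', b'b', 3, b'c', b'c', b'c'];
    // Unpacking all n fields via separate record_get calls decodes
    // 1 + 2 + ... + n fields in total, i.e. O(n^2) work.
    for i in 0..3 {
        let field = record_get(&record, i).unwrap();
        println!("field {}: {:?}", i, std::str::from_utf8(field).unwrap());
    }
}
```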
Benefits of solving this

This comes up when unpacking nested Avro records (which are very common, according to @benesch), and also here, and also for every window function call.

Solution

A possible approach is to create an unpack_record table function. Note that table functions are allowed to produce multiple columns (not just multiple rows).
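Continuing the toy sketch from the Problem section (the real unpack_record would operate on Datums and have a different signature, so this only illustrates the complexity win), a single pass can decode every field exactly once:

```rust
// Hypothetical single-pass unpacking, reusing the toy FieldIter from the
// sketch above: each field is decoded exactly once, so unpacking all n
// fields is O(n) total work instead of the O(n^2) of n record_get calls.
// The returned fields would become the columns of one output row.
fn unpack_record(record: &[u8]) -> Vec<&[u8]> {
    FieldIter { buf: record }.collect()
}
```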
The question is where we should introduce calls to unpack_record. One possibility is to introduce it when the user types the special syntax (col).* (this syntax already exists). Another possibility is to make the optimizer capable of recognizing the pattern of multiple record_get calls on the same record, and turning it into an unpack_record. Note that these two possibilities are not mutually exclusive.
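As a rough illustration of the optimizer-based recognition, here is a sketch over toy IR types; Expr and Plan stand in for Materialize's real MirScalarExpr/MirRelationExpr, and an actual transform would also need to handle projections, expressions below the Map, and more:

```rust
// Toy IR standing in for Materialize's MIR; everything here is illustrative.
#[allow(dead_code)]
enum Expr {
    Column(usize),
    // record_get(#record, field): extract one field of the record held in
    // input column `record`.
    RecordGet { record: usize, field: usize },
}

#[allow(dead_code)]
enum Plan {
    Get(String),
    // Map appends one new column per scalar expression.
    Map { input: Box<Plan>, exprs: Vec<Expr> },
    // FlatMap calls a table function; unpack_record would append all of the
    // record's fields as columns in one go.
    FlatMap { input: Box<Plan>, func: String, record: usize },
}

// If every mapped expression is a record_get on the same input column, with
// fields requested in order 0, 1, 2, ..., fuse the Map into one FlatMap.
// (A real transform could insert a projection to handle other field orders.)
fn fuse_record_gets(plan: Plan) -> Plan {
    match plan {
        Plan::Map { input, exprs } if !exprs.is_empty() => {
            let rec = match &exprs[0] {
                Expr::RecordGet { record, .. } => *record,
                _ => return Plan::Map { input, exprs },
            };
            let fusable = exprs.iter().enumerate().all(|(i, e)| {
                matches!(e, Expr::RecordGet { record, field } if *record == rec && *field == i)
            });
            if fusable {
                Plan::FlatMap { input, func: "unpack_record".into(), record: rec }
            } else {
                Plan::Map { input, exprs }
            }
        }
        other => other,
    }
}

fn main() {
    // SELECT (rec).f1, (rec).f2, (rec).f3 over a one-column input.
    let plan = Plan::Map {
        input: Box::new(Plan::Get("records".into())),
        exprs: (0..3).map(|i| Expr::RecordGet { record: 0, field: i }).collect(),
    };
    match fuse_record_gets(plan) {
        Plan::FlatMap { func, record, .. } => println!("fused into {}(#{})", func, record),
        _ => println!("not fused"),
    }
}
```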
I think we should only go with the optimizer-based approach, and do the new transform near the end of the pipeline, because:
- The user might not use the (col).* syntax.
- Window functions also introduce record_get calls. (Although, we could change the window function HIR lowering to introduce unpack_record instead of record_gets.)
- The biggest issue with introducing unpack_record already when planning (col).* is that we'd then have new FlatMaps throughout the optimizer pipeline, and transforms are not very good at handling FlatMap.
The drawback of introducing it only at the end of the optimizer pipeline is that things are a bit delicate near the end of the pipeline, but I think this would be manageable.
There is the question of where exactly to put it in the physical_optimizer. I'd say after JoinImplementation but before NormalizeLets, because if it came before JoinImplementation it could prevent arrangement reuse.
Note that, unfortunately, it can make certain join plans worse in some corner cases even if it comes after JoinImplementation, because it can prevent the lowering from pushing record deconstructions into the Join's LIR plan; but this problem would be present even if we put the new transform before JoinImplementation. If this turns out to be a big problem, we could later either
- implement the pushing of record unpacking into LIR Join plans, or
- make the transform not fire for MFPs that are directly on top of Joins where the MFP being pushed into the Join looks important (e.g., due to filters on record fields).
Slack discussion: https://materializeinc.slack.com/archives/C0761MZ3QD9/p1718655594016239

(The alternative idea of implementing advance_by for DatumListIter also came up, but Moritz said he already tried that, and it didn't make a difference.)
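For completeness, this is roughly what advance_by amounts to on the toy FieldIter from the Problem section (the actual proposal concerned DatumListIter). A plausible reason it made no difference, though this is only my guess: with inline length headers, even a dedicated skip must still walk every skipped field's header, so the cost of reaching field i stays proportional to i.

```rust
// Sketch of the advance_by idea on the toy FieldIter (the real proposal was
// for Materialize's DatumListIter). Even a dedicated skip must read each
// skipped field's length header to find the next field, so reaching field i
// remains O(i) work per record_get call.
impl<'a> FieldIter<'a> {
    fn advance_by(&mut self, n: usize) -> Result<(), usize> {
        for done in 0..n {
            let (&len, rest) = match self.buf.split_first() {
                Some(x) => x,
                None => return Err(done), // fewer than n fields remained
            };
            // Skip the payload without materializing a field value.
            self.buf = &rest[len as usize..];
        }
        Ok(())
    }
}
```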
Update: This turns out to be not so urgent. After #29554, we consolidate before unpacking the records, so the unpacking is only performed on the delta.
An exception is when the MFP above is fused into the Reduce. This happens mostly when it involves a filter, but even then it's probably unpacking just a few fields, so the quadraticness of a full unpack is not really happening.