The execution time in `pipeline` mode is distortion sometimes #9489

windtalker · 2024-09-29T06:01:03Z

Bug Report

Please answer these questions before submitting your issue. Thanks!
Running the same query in pipeline mode and pull mode, the overall query performance is almost the same, but the execution time reported for each executor differs a lot between the two modes.

In pipeline mode

HashAgg: 2.23s
Projection: 2.71s
Selection: 1.75s

Cpu flame graph

And in pull mode

HashAgg: 4.4s
Projection: 2.8s
Selection: 0.26s

Cpu flame graph

Compared against the CPU flame graphs, the executor times in pull mode are much more reasonable.

1. Minimal reproduce step (Required)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiFlash version? (Required)

windtalker · 2024-09-30T05:40:08Z

After some investigation, I didn't find any obvious bug related to the executors' execution time. One possible cause of this distortion seems to be in PipelineExec::finalizeProfileInfo:

tiflash/dbms/src/Flash/Pipeline/Exec/PipelineExec.cpp

Lines 237 to 250 in 69dd613

    
    void PipelineExec::finalizeProfileInfo(UInt64 queuing_time, UInt64 pipeline_breaker_wait_time)
    {
        // For the pipeline_breaker_wait_time, it should be added to the pipeline breaker operator(AggConvergent and JoinProbe),
        // However, if there are multiple pipeline breaker operators within a single pipeline, it can become very complex.
        // Therefore, to simplify matters, we will include the pipeline schedule duration in the execution time of the source operator.
        //
        // For the queuing_time, it should be evenly distributed across all operators.
        //
        // TODO Refining execution summary, excluding extra time from execution time.
        // For example: [total_time:6s, execution_time:1s, queuing_time:2s, pipeline_breaker_wait_time:3s]
        // The execution time of operator[i] = self_time_from_profile_info + sum(self_time_from_profile_info[i-1, .., 0]) + (i + 1) * extra_time / operator_num.
        source_op->getProfileInfo()->execution_time += pipeline_breaker_wait_time;

It evenly distributes the task queue time to all the operators.
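To make the effect concrete, here is a minimal sketch of that attribution rule as described above. It is not the TiFlash implementation: `OperatorProfile` and `attributeWaitTime` are hypothetical names, and times are in milliseconds purely for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an operator's profile info; not the TiFlash type.
struct OperatorProfile
{
    std::string name;
    std::uint64_t execution_time_ms; // time measured inside the operator itself
};

// Attribute a task's wait times to its operators:
//  - the pipeline breaker wait time goes entirely to the source operator (ops[0]),
//  - the queuing time is split evenly across all operators.
void attributeWaitTime(
    std::vector<OperatorProfile> & ops,
    std::uint64_t queuing_time_ms,
    std::uint64_t pipeline_breaker_wait_time_ms)
{
    if (ops.empty())
        return;
    ops.front().execution_time_ms += pipeline_breaker_wait_time_ms;
    // Integer division: any remainder is dropped in this sketch.
    const std::uint64_t share = queuing_time_ms / ops.size();
    for (auto & op : ops)
        op.execution_time_ms += share;
}
```

With three operators whose self times are 100, 10 and 400 ms, a queuing time of 600 ms and a breaker wait of 50 ms, the reported times become 350, 210 and 600 ms. The cheapest operator is inflated the most in relative terms, which is the kind of distortion seen for Selection in this issue.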

In this specific case, TiDB's executor tree is
TableScan => Selection => Projection => Aggregation
The task queue time is about 28.32 seconds.

In pipeline mode, the executor tree is converted to a pipeline task that looks like

SourceOp(TableScan)=>FilterTransformOp=>ExpressionTransformOp(Selection)=>ExpressionTransformOp=>ExpressionTransformOp(Projection)=>SinkOp(Aggregation)

The concurrency is 8.
Assuming every task has the same queue time, the queue time per task is 28.32/8 = 3.54s, and with 6 operators per task the queue time per operator is 3.54/6 = 0.59s.
If we remove the task queue time, each executor's execution time becomes
HashAgg: 1.64s
Projection: 1.53s
Selection: 0.57s
TableScan: 0.42s

Although still not the same as pull mode, the results are much better than the original ones.

Actually, the output of EXPLAIN ANALYZE is designed on the assumption that the execution model is similar to the volcano model. The pipeline execution model is completely different, so it is somewhat expected that the execution time displayed in EXPLAIN ANALYZE may be distorted. I think we need a comprehensive design for how to adapt the output of EXPLAIN ANALYZE to the pipeline model. cc @yibin87

windtalker · 2024-10-15T07:31:06Z

Change this to an enhancement


windtalker added the type/bug label Sep 29, 2024

windtalker changed the title ~~The execution time in pipeline mode is distortion sometimes~~ The execution time in `pipeline` mode is distortion sometimes Sep 29, 2024

yibin87 self-assigned this Oct 8, 2024

jebter added severity/major component/compute labels Oct 8, 2024

ti-chi-bot bot added may-affects-5.4 may-affects-6.1 may-affects-6.5 may-affects-7.1 may-affects-7.5 may-affects-8.1 labels Oct 8, 2024

windtalker added type/enhancement and removed type/bug labels Oct 15, 2024

This was referenced Oct 21, 2024

Execution summary improvements pingcap/tidb#56232 (Open)

Add tiflash pipeline wait summary and minTSO wait time in execution summary pingcap/tipb#346 (Merged)

This was referenced Oct 30, 2024

Add pipeline wait time of minTSO and pipeline breaker in execution summary #9566 (Merged)

util: add tiflash wait time in execution summary pingcap/tidb#57058 (Merged)

Add tiflash wait info in execution summary pingcap/tidb#57059 (Closed)

ti-chi-bot added the affects-8.5 label Nov 1, 2024

ti-chi-bot bot closed this as completed in #9566 Nov 13, 2024

ti-chi-bot bot pushed a commit to pingcap/tidb that referenced this issue Nov 13, 2024

util: add tiflash wait time in execution summary (#57058)

aa9b6f4

ref pingcap/tiflash#9489, close #57059

This was referenced Nov 14, 2024

Add pipeline wait time of minTSO and pipeline breaker in execution summary (#9566) #9612 (Merged)

util: add tiflash wait time in execution summary (#57058) pingcap/tidb#57371 (Merged)

ti-chi-bot bot pushed a commit to pingcap/tidb that referenced this issue Nov 14, 2024

util: add tiflash wait time in execution summary (#57058) (#57371)

f485d63

ref pingcap/tiflash#9489, close #57059
