FEATURE: Speedup content cache flush by using cte in `findAncestorNodeAggregateIds` #5261

mhsdesign · 2024-09-24T15:46:28Z

Followup to #5221

ContentGraph::findParentNodeAggregates becomes slower on bigger datasets. Because it is executed many times during a cr:replay, the cost adds up very quickly. #5268 will improve the query, but this PR introduces ContentGraph::findAncestorNodeAggregateIds to make this operation as performant as possible.

Adds comments for the CacheFlushingStrategy strategies
Introduces ContentGraphInterface::findAncestorNodeAggregateIds, which uses a native SQL CTE to speed up cache flushing. (see comment)
Moves the test which creates an illegal state to the content graph package and uses native SQL to create that state, so that no catch-up hooks run. Previously we needed to handle the case of infinite loops to not crash:

Prevent infinite loops
NOTE: Normally, the content graph cannot contain cycles. However, during the
testcase "Features/ProjectionIntegrityViolationDetection/AllNodesAreConnectedToARootNodePerSubgraph.feature"
and in case of bugs, it could actually have cycles.
The content cache catchup hook leverages this method and would otherwise hang in an endless loop.
That's why we track the seen NodeAggregateIds to be sure we don't traverse them multiple times.
Fixes a bug where you cannot replay because the workspace is "missing" and no content graph exists
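
To illustrate the loop guard described in the quoted comment, here is a minimal Python sketch of an ancestor traversal that tracks the ids it has already seen. It is not the PR's PHP code: the function name `find_ancestor_ids` and the `parents_by_child` mapping are hypothetical stand-ins for the content graph lookups.

```python
from collections import deque


def find_ancestor_ids(parents_by_child: dict[str, set[str]], entry_id: str) -> set[str]:
    """Collect the ids of all ancestors of entry_id.

    parents_by_child is a hypothetical in-memory stand-in for the hierarchy:
    it maps a node id to the ids of its direct parents.
    """
    seen: set[str] = set()
    queue = deque([entry_id])
    while queue:
        current = queue.popleft()
        for parent in parents_by_child.get(current, ()):
            if parent in seen:
                # Already visited: skipping it is what keeps a cyclic
                # (illegal) graph from causing an endless loop.
                continue
            seen.add(parent)
            queue.append(parent)
    return seen


# Regular tree: c -> b -> a
print(find_ancestor_ids({"c": {"b"}, "b": {"a"}}, "c"))

# Illegal cyclic state: a and b are each other's parent; the call still returns.
print(find_ancestor_ids({"a": {"b"}, "b": {"a"}}, "a"))
```

Note that in the cyclic case the entry node ends up in its own ancestor set, which is a direct consequence of the illegal state rather than of the guard.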

Upgrade instructions

Review instructions

Checklist

Code follows the PSR-2 coding style
Tests have been created, run and adjusted as needed
The PR is created against the lowest maintained branch
Reviewer - PR Title is brief but complete and starts with FEATURE|TASK|BUGFIX
Reviewer - The first section explains the change briefly for change-logs
Reviewer - Breaking Changes are marked with !!! and have upgrade-instructions


dlubitz · 2024-09-25T11:55:39Z

Neos.ContentGraph.DoctrineDbalAdapter/src/Domain/Repository/ContentGraph.php

@@ -188,6 +189,19 @@ public function findParentNodeAggregates(
        return $this->mapQueryBuilderToNodeAggregates($queryBuilder);
    }

+    public function findAncestorNodeAggregateIds(NodeAggregateId $entryNodeAggregateId): NodeAggregateIds


As far as I understood, this function needs to be more complex and return a set of pathways with ancestor nodes.
https://neos-project.slack.com/archives/C04PYL8H3/p1725605274115779?thread_ts=1725460465.658809&cid=C04PYL8H3

Hmm, but to get that right: does that mean that, if implemented in the more complex way, other node ids will also be returned, or just that all the node ids will be nicely ordered by dimension in a special value object? The latter I consider just an improvement and not critical. The idea is to improve the performance of the method via native SQL first. It's marked as @internal, so we can improve the structure later?

> the node ids will be nicely ordered by dimension in a special value object

Yes this one

> Hmm, but to get that right: does that mean that, if implemented in the more complex way, other node ids will also be returned, or just that all the node ids will be nicely ordered by dimension in a special value object? The latter I consider just an improvement and not critical. The idea is to improve the performance of the method via native SQL first. It's marked as @internal, so we can improve the structure later?

But you don't improve that with this PR.
And even if it is marked as internal, I think we shouldn't put that into an interface if we already know that we will change it.

> But you don't improve that with this PR.

I could try to do that ... emphasis on try :D or ask Bernhard. But I get your point. Let's see how we can continue.

cc @nezaniel @bwaidelich @kitsunet

edit: and we def need Behat tests for the method, I forgot :D

Added Behat tests and introduced a CTE query to improve performance. It's now ordered descending by the parentnodeanchor.


In the case of the content cache flusher we do not care about the order. Ordering the result by parentnodeanchor and position (for siblings) is slower, and it is not even correct in all situations, as the parentnodeanchor is just an auto-increment value without meaning.
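
As a rough illustration of the approach discussed here, the following Python/SQLite sketch collects all ancestor ids with a single recursive CTE and returns them as an unordered set. It is not the query from this PR: the `hierarchy(parent, child)` table, its columns, and the function name `find_ancestor_ids_cte` are simplified assumptions, and the real projection tables and SQL dialect differ.

```python
import sqlite3

# Hypothetical, simplified schema: one row per parent/child edge.
# UNION (instead of UNION ALL) discards rows that were already produced,
# so the recursion also terminates if the data contains a cycle.
ANCESTOR_QUERY = """
WITH RECURSIVE ancestors(id) AS (
    SELECT parent FROM hierarchy WHERE child = :entry
    UNION
    SELECT h.parent FROM hierarchy h JOIN ancestors a ON h.child = a.id
)
SELECT id FROM ancestors
"""


def find_ancestor_ids_cte(connection: sqlite3.Connection, entry_id: str) -> set[str]:
    """Return all ancestor ids of entry_id using one query, without ordering."""
    rows = connection.execute(ANCESTOR_QUERY, {"entry": entry_id}).fetchall()
    return {row[0] for row in rows}


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE hierarchy (parent TEXT NOT NULL, child TEXT NOT NULL)")
connection.executemany(
    "INSERT INTO hierarchy (parent, child) VALUES (?, ?)",
    [
        ("root", "a"),
        ("root", "a2"),
        ("a", "b"),
        ("a2", "b"),  # "b" has two parents, as with multiple parent node aggregates
        ("b", "c"),
    ],
)

print(find_ancestor_ids_cte(connection, "c"))
```

Returning a set without an ORDER BY clause mirrors the point above that the cache flusher only needs to know which ancestors exist, not their order.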

mhsdesign added 3 commits September 24, 2024 17:38

TASK: Add docs to CacheFlushingStrategy

1ee4ed2

TASK: introduce ContentGraphInterface::findAncestorNodeAggregateIds

4db7389

WIP: remove hacky loop detection in findAncestorNodeAggregateIds by applying event in test directly

127afd0

github-actions bot added Task 9.0 labels Sep 24, 2024

mhsdesign commented Sep 24, 2024


mhsdesign marked this pull request as draft September 24, 2024 19:19

mhsdesign added 2 commits September 25, 2024 13:38

TASK: Remove apply event hack and change hierarchy relation's parent manually

18ea4ba

TASK: Prefer commands instead publishing events directly

c01a83f

mhsdesign marked this pull request as ready for review September 25, 2024 11:43

mhsdesign mentioned this pull request Sep 25, 2024

FEATURE: Overhaul ContentCacheFlusher #5221

Merged

mhsdesign requested review from skurfuerst, bwaidelich and dlubitz September 25, 2024 11:49

dlubitz reviewed Sep 25, 2024


mhsdesign marked this pull request as draft September 25, 2024 17:33

mhsdesign added 4 commits September 25, 2024 21:20

TASK: Add behat test for findAncestorNodeAggregateIds

1a644ee

TASK: Add behat test for multiple parent node aggregates

9801101

FEATURE: Optimise findAncestorNodeAggregateIds to use CTE and sort result

16c0b30

WIP position ordering for sibling in findAncestorNodeAggregateIds

3cdf712

mhsdesign changed the title ~~TASK: content cache flusher followup~~ FEATURE: Speedup content cache flush by using cte in findAncestorNodeAggregateIds Sep 25, 2024

github-actions bot added the Feature label Sep 25, 2024

mhsdesign marked this pull request as ready for review September 26, 2024 09:17

mhsdesign requested review from dlubitz and kitsunet September 26, 2024 09:17

mhsdesign added 2 commits September 26, 2024 11:36

TASK: Simplify code in GraphProjectorCatchUpHookForCacheFlushing

965cc34

dlubitz mentioned this pull request Sep 27, 2024

BUG: Very slow doctrineDbalContentGraph projection replay #5269

Open
