-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Send vertex label to user log processor for all mutations #3264
Send vertex label to user log processor for all mutations #3264
Conversation
dff3e2c
to
1976b7e
Compare
41855fb
to
875cad6
Compare
String vertexLabel = changes.getVertices(Change.ADDED).iterator().next().label(); | ||
assertEquals(testVertexLabel, vertexLabel); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This two lines would fail without the business logic change because vertexLabel
here would be equal to vertex
even so real label of the vertex label is testVertex
. With this PR this vertex now returns correct vertex label.
String vertexLabel = changes.getVertices(Change.REMOVED).iterator().next().label(); | ||
assertEquals(testVertexLabel, vertexLabel); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This two lines would fail without the business logic change because vertexLabel
here would be equal to vertex
even so real label of the vertex label is testVertex
. With this PR this vertex now returns correct vertex label.
@JanusGraph/committers I have intention to merge this breaking change on Monday using lazy consensus if there are no reviews. In case you need more time, please, let me know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this change 👍 I just have some quick questions
@@ -171,6 +171,24 @@ A new optimization has been added to compute aggregations (min, max, sum and avg | |||
If the index backend is Elasticsearch, a `double` value is used to hold the result. As a result, aggregations on long numbers greater than 2^53 are approximate. | |||
In this case, if the accurate result is essential, the optimization can be disabled by removing the strategy `JanusGraphMixedIndexAggStrategy`: `g.traversal().withoutStrategies(JanusGraphMixedIndexAggStrategy.class)`. | |||
|
|||
##### Breaking change for transaction logs processing | |||
|
|||
[Transaction Log](advanced-topics/transaction-log.md) processing has a breaking change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only affects transaction logs started by users, right? I think so but just wanted to make sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It actually changes the mutation of two transactions logs:
- JanusGraph's write-ahead transaction log which can be enabled by
tx.log-tx = true
. - Configured user's transaction logs.
InternalVertex vertex = rel.getVertex(0); | ||
VariableLong.writePositive(out,vertex.longId()); | ||
VertexLabel vertexLabel = vertex.vertexLabel(); | ||
VariableLong.writePositive(out, vertexLabel.hasId() ? vertexLabel.longId() : 0L); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine, and I believe we should cache vertex labels aggressively. But just to understand more about the intention here: cannot the receiver execute a backend query to fetch vertex label?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not guaranteed that there will be such vertex in the graph because the vertex might be already removed.
- It's guaranteed to be removed if we receive the change for removed vertex.
- Even if we received a change for updated / added vertex - it doesn't guarantee that this vertex still exists because someone could remove that vertex faster then the change is received by the processing JanusGraph instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the vertex is already removed, why do we need the vertex label on the receiver side? I guess my real question is, can you elaborate on the following statement?
Without knowing vertex label of the mutated vertex we won't be able to properly invalidate indices which are bound to specific vertex label.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To invalidate an index which is constrained to a specific label we need to know the label of the removed / updated / added vertex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have db-cache
enabled, and we have an index which was created with indexOnly(vertexLabel)
and in case we cached some query results in indexStore then we will need vertex label + vertex id + vertex properties to invalidate related cached queries like: g.V().hasLabel("someVertexLabel").has("myProperty", "propValue")
We can't invalidate the above query knowing vertex id only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those which use 'indexOnly' features won't be invalidated using vertex id + properties only. We need to have vertex label as well for that invalidation.
Gotcha. Do you have a code snippet showing how we could invalidate normal index entries in DB cache? I thought index entries with indexOnly
constraints are no different from normal index entries (from cache perspective) but from what you said it's not the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@li-boxuan added testIndexWithIndexOnlyConstraintForceInvalidationFromDBCache()
to show why it's not enough to have vertex id and properties to invalidate indexStore cache.
https://github.com/JanusGraph/janusgraph/compare/875cad608d82cd5e825a0fa6f96e4f79e8f261fd..48cd7aeb3cf776db04b34b80c69bd92988a79e2b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not 100% sure but I believe vertex label is not needed here.
In invalidateUpdatedVertexProperty
, you called
Collection<IndexUpdate> indexUpdates = indexSerializer.getIndexUpdates(cacheVertex, Arrays.asList(propertyPreviousVal, propertyNewVal));
to fetch a collection of index records.
In IndexSerializer::getIndexUpdates
, it skips an index entry candidate if it does not conform to the index only constraint:
janusgraph/janusgraph-core/src/main/java/org/janusgraph/graphdb/database/IndexSerializer.java
Line 190 in 11e40fc
if (!indexAppliesTo(index,relation)) continue; |
This makes sense when we want to generate a new index entry, but it is not necessary when we try to remove an existing index entry. If you temporarily comment out this line, I believe your test case would pass even if the LogEntry does not contain vertex label.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@li-boxuan you are right. I guess in this case it's better to add another method to IndexSerializer
which won't filter indices by indexOnly
constraint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@li-boxuan I opened the PR which adds possibility to get Index updates with disabled constraints here #3279
Just one question, what happens during an upgrade? How old janusgraph instances handle these messages? |
After updated you will need to start consuming new messages only. Thus, it's necessary that you stop any mutations during upgrade, consume all old logs and only then upgrade JanusGraph nodes. After that you can enable consumption again on new logs. |
We should add a note to that during upgrade you should stop mutation. |
Fixes JanusGraph#3263 Signed-off-by: Oleksandr Porunov <[email protected]>
875cad6
to
48cd7ae
Compare
Related JanusGraph#3155 JanusGraph#3263 JanusGraph#3264 Signed-off-by: Oleksandr Porunov <[email protected]>
The PR is obsolete because we are able to invalidate cached indexes without knowing vertex label as stated here: #3264 (comment) Thus, closing this PR without merging it. |
Related #3155 #3263 #3264 Signed-off-by: Oleksandr Porunov <[email protected]>
Fixes #3263
Related to #3155
Signed-off-by: Oleksandr Porunov [email protected]
Thank you for contributing to JanusGraph!
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
master
)?For code changes:
For documentation related changes: