[BUG] CTS failures #7148
Comments
The above output is from the graph repo
Will repeat with current code from main
The CTS tests from main are clean. This means it is one of
Repeating test after updating v4 from master
After incorporating updates from main, the CTS (graph) continues to fail in the same way.
Server log: https://gist.github.com/8e242fa5769a8856976122d5e29b7303
Having retested with odpi/egeria-charts#225 (egeria-release-3.14), the graph CTS is failing in the same way. This is going to block the release; will discuss/investigate more on Monday.
@mandy-chessell Do you have any thoughts as to whether the above changes may have resulted in this CTS behaviour?
d669a35 made a change to one of the converters in the OMAS layer to populate extra properties in an OMAS interface bean. It is highly unlikely to impact the CTS, which operates at the OMRS level. f37c0ad made changes to entity/classifications, and all of the failures are in the management of relationships. I had a clean CTS run on my machine before pushing the changes into main. (Built with maven.)
The recent change I made that broke the CTS was to add support for the DEREGISTERED_REPOSITORY instance provenance type. The CTS was storing an instance, retrieving it and using equals() to determine that it had not changed. The problem was that the CTS was using a nonsense metadata collection id in the instance. Before the DEREGISTERED_REPOSITORY instance provenance type fix, no check was made on the validity of the metadata collection id. The local metadata collection now validates the metadata collection id to determine if it is an active member of the cohort. If it is not, the instance provenance type is updated to DEREGISTERED_REPOSITORY before the instance is returned. This change causes the equals() method to fail in the CTS. The fix I added to the CTS was to update the instance provenance type to DEREGISTERED_REPOSITORY in the original instance before the test for equals().
I am thinking that the CTS fixes missed the branch, or somehow I missed fixing the relationship tests. But then I do not know how it worked on my machine. I will run the CTS on main to see if the failure occurs on our latest code. This will isolate whether there is a missing fix and we have a new/different problem, or the CTS fix missed the branch. The fact that it fails on the latest V4.0 branch suggests a new problem.
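The equals() behaviour described above can be sketched as follows. This is a minimal illustration only: `Instance`, `ProvenanceType` and `LocalCollectionSketch` are hypothetical, simplified stand-ins and not the real OMRS classes, and the ids are made up.

```java
import java.util.Objects;
import java.util.Set;

// Hypothetical, simplified stand-ins for the OMRS types (assumption, not Egeria code).
enum ProvenanceType { LOCAL_COHORT, DEREGISTERED_REPOSITORY }

final class Instance {
    final String guid;
    final String metadataCollectionId;
    ProvenanceType provenanceType;

    Instance(String guid, String metadataCollectionId, ProvenanceType provenanceType) {
        this.guid = guid;
        this.metadataCollectionId = metadataCollectionId;
        this.provenanceType = provenanceType;
    }

    Instance copy() {
        return new Instance(guid, metadataCollectionId, provenanceType);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Instance)) return false;
        Instance that = (Instance) other;
        return Objects.equals(guid, that.guid)
                && Objects.equals(metadataCollectionId, that.metadataCollectionId)
                && provenanceType == that.provenanceType;
    }

    @Override
    public int hashCode() {
        return Objects.hash(guid, metadataCollectionId, provenanceType);
    }
}

final class LocalCollectionSketch {
    private final Set<String> activeCohortMembers;

    LocalCollectionSketch(Set<String> activeCohortMembers) {
        this.activeCohortMembers = activeCohortMembers;
    }

    // Returns a copy of the stored instance; if its home collection is not an
    // active cohort member, the copy is re-tagged as DEREGISTERED_REPOSITORY.
    Instance retrieve(Instance stored) {
        Instance result = stored.copy();
        if (!activeCohortMembers.contains(result.metadataCollectionId)) {
            result.provenanceType = ProvenanceType.DEREGISTERED_REPOSITORY;
        }
        return result;
    }
}
```

With a made-up metadata collection id, `original.equals(retrieved)` is false until the test sets `original.provenanceType` to `DEREGISTERED_REPOSITORY` first, which mirrors the CTS fix described in the comment.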
The way to identify the test that is failing is to search the CTS code for the part of the assertion message that does not include the typeName. For example, for failing test message ….
Search for This gives you the assertion message.
Find where it is used in the code and this gives you the condition and profile.
Using this technique, the other failing test is
Both failures are using
Thanks. I had looked at the CTS code and established it was that comparison; I was just wondering what you thought regarding recent changes. It's been failing on v4 for a couple of weeks (ON GRAPH ONLY). I'm concerned about releasing 3.14 until we understand the CTS failure. I'm also looking today at #353, so we can pick up on CTS issues quicker and ensure consistency.
If it is graph only, it is probably a new issue and not related to the DEREGISTERED_REPOSITORY change, which was in the OMRS and not the connectors.
Rerunning CTS on 3.14 (k8s, via egeria-cts chart)
This is the comparison of the two relationships. The retrieved headerVersion is 0 and should be 1 in the relationship and in each entityProxy. In the comparison below, retrievedReferenceCopy is on the left and newRelationship is on the right. The headerVersion property is used to discover whether the OMRS is talking to a future version of OMRS that uses later structures. The value should be 1. 0 is the default value in the bean, which suggests it is not being set. It hints at a problem in the way instances are stored/retrieved from the graph store. I do not know what has changed to cause it to fail now and why it is only affecting relationships.
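To illustrate how a bean default of 0 can surface after a store/retrieve round trip, here is a minimal sketch. It assumes a writer that omits the property; `HeaderBean` and `GraphRoundTripSketch` are hypothetical names, the map stands in for a graph vertex, and none of this is the actual graph connector code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified bean; not the real OMRS instance header.
final class HeaderBean {
    static final long CURRENT_HEADER_VERSION = 1L;

    long headerVersion = 0L;   // default value when nothing sets it
    String guid;
}

final class GraphRoundTripSketch {
    // Assumed faulty writer: persists the guid but omits headerVersion.
    static Map<String, Object> write(HeaderBean bean) {
        Map<String, Object> vertex = new HashMap<>();
        vertex.put("guid", bean.guid);
        return vertex;
    }

    // Reader only sets headerVersion when the stored properties contain it,
    // so a missing property leaves the bean default of 0 in place.
    static HeaderBean read(Map<String, Object> vertex) {
        HeaderBean bean = new HeaderBean();
        bean.guid = (String) vertex.get("guid");
        Object version = vertex.get("headerVersion");
        if (version instanceof Long) {
            bean.headerVersion = (Long) version;
        }
        return bean;
    }
}
```

An instance written with `headerVersion` set to `CURRENT_HEADER_VERSION` comes back from `read(write(...))` with the default, which is the kind of difference a field-by-field equals() would report.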
Further investigation shows that
It is not the reason that the CTS is failing.
This is the comparison of the relationships for a failing case. What we see is that the newRelationship incorrectly includes a null qualifiedName property in the entityProxies for each end of the relationship. This is an invalid property for the type. The error is in the CTS since it is setting up the newRelationship. The question is, why has this changed?
I have located the source of the problem in EntityProxy.java. It is setting up qualifiedName whether it is null or not. Fixing that led to an NPE in Subject Area OMAS, which was dereferencing uniqueProperties without checking that it was not null. The CTS is now running cleanly on graph.
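The two halves of that fix can be sketched roughly as below. This is an assumption-laden illustration: `ProxySketch` and its methods are hypothetical, a plain `Map` stands in for the proxy's uniqueProperties, and returning null when there is nothing to set is assumed from the NPE described above rather than taken from EntityProxy.java.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; a plain Map stands in for the real unique properties object.
final class ProxySketch {
    // Before: the key is always added, so a null qualifiedName ends up in the proxy.
    static Map<String, Object> uniquePropertiesAlways(String qualifiedName) {
        Map<String, Object> props = new HashMap<>();
        props.put("qualifiedName", qualifiedName);
        return props;
    }

    // After: nothing is set when qualifiedName is null (assumed to yield null properties).
    static Map<String, Object> uniquePropertiesIfPresent(String qualifiedName) {
        if (qualifiedName == null) {
            return null;
        }
        Map<String, Object> props = new HashMap<>();
        props.put("qualifiedName", qualifiedName);
        return props;
    }

    // Caller-side guard: check uniqueProperties for null before dereferencing it.
    static String qualifiedNameOf(Map<String, Object> uniqueProperties) {
        if (uniqueProperties == null) {
            return null;
        }
        Object value = uniqueProperties.get("qualifiedName");
        return value == null ? null : value.toString();
    }
}
```

The guard in `qualifiedNameOf` is the kind of null check that a caller needs once the properties object may legitimately be absent.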
CTS tests
Graph Reporting test failures
Summary output:
FAIL [241911/3768] In-memory Summary: yet:
The new, trial CTS test runs (see https://github.com/planetf1/cts): both FAILED. Identical results from graph in terms of profile (https://github.com/planetf1/cts/actions/runs/3715784372/jobs/6301358261). Inmemory is showing a few issues (https://github.com/planetf1/cts/actions/runs/3715784372/jobs/6301358199), also a fail.
Checked with MAIN. The SAME CTS graph failures are reported. Therefore the test may be invalid. Perhaps the container image (tagged 3.14) is cached? Yet this wouldn't affect the trial test runs, since there is no persistent container registry (the k8s cluster is built each time), and yet they had the same results.
The inmemory CTS on 3.14 (k8s/usual way) (which is using latest image) reports one exception:
And 3474 assertion failures, suggesting the overall test count showing failures is correct (whilst the profile results are not).
I reran the tests (k8s) :
Whilst graph consistently fails, I believe CTS failures for inmemory are inconsistent. To avoid more clutter I'll open up a new issue for inmemory.
By reverting the header version fix, CTS graph now passes. PR opened for 3.14. It's possible the error was caused by conflict resolution, but also that it's genuine. @mandy-chessell, your call whether we merge that main fix anyway and then run the CTS, if it is clean in your environment with it. Or we can build a local image to test if needed.
Unfortunately, the exception shown in comment #7148 (comment) is an error in the CTS. It is catching an assertion failure rather than an exception from Egeria. The assertion failure it masks is probably the real problem. I will merge the current PR and retest with the fix to the exception management.
Fix errors in headers and entity proxies detected by the CTS (#7148)
If you want to try running the CTS pipelines (to avoid impact on local environment)
See CTS results after main: https://github.com/planetf1/cts/actions/runs/3751230156 (though note inmem and xtdb are both showing some intermittency)
CTS graph errors continue to be reported by the pipeline above. Additionally, I did a standalone test on a 3 x 16GB/8CPU k8s cluster, and had the same results for graph, which continues to fail in 'main' with:
CTS is clean for inmemory & has been clean for a while, so the issue is addressed
Is there an existing issue for this?
Current Behavior
Seeing CTS failures in both in-mem & graph (but this is on the experimental v4 build), and I understand @dwolfson is also seeing them in XTDB on v3.14
This was seen when testing v4 (java 17/gradle)
In my case I did a test run on Nov 22nd with in-memory using 'main'.
That is currently failing 16 tests of 6105 with 6089 passing
Expected Behavior
CTS to pass
Steps To Reproduce
The result is
Environment
Any Further Information?
No response