[CI] FullClusterRestartIT.testRollupIDSchemeAfterRestart failed #32773

Closed
spinscale opened this issue Aug 10, 2018 · 18 comments

Labels: :StorageEngine/Rollup (Turn fine-grained time-based data into coarser-grained data), >test-failure (Triaged test failures from CI)

@spinscale (Contributor)

This test failed in CI, but I was unable to reproduce it on OS X or Linux.

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+multijob-unix-compatibility/os=amazon/1231/console

Also note that there are still running tasks according to the build output.

  1> [2018-08-10T18:48:19,692][INFO ][o.e.x.r.FullClusterRestartIT] [testRollupIDSchemeAfterRestart] There are still tasks running after this test that might break subsequent tests [indices:data/write/bulk[s], indices:data/write/bulk[s][p], indices:data/write/bulk[s][r], indices:data/write/index, xpack/rollup/job[c]].
  1> [2018-08-10T18:48:19,729][INFO ][o.e.x.r.FullClusterRestartIT] [testRollupIDSchemeAfterRestart] after test
FAILURE 12.2s | FullClusterRestartIT.testRollupIDSchemeAfterRestart <<< FAILURES!
   > Throwable #1: java.lang.AssertionError:
   > Expected: iterable over ["3310683722", "rollup-id-test$ehY4NAyVSy8xxUDZrNXXIA"] in any order
   >      but: Not matched: "621059582"
   >    at __randomizedtesting.SeedInfo.seed([A9D42E18F468D297:F1BA0C257BB524E0]:0)
   >    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   >    at org.elasticsearch.xpack.restart.FullClusterRestartIT.lambda$testRollupIDSchemeAfterRestart$5(FullClusterRestartIT.java:424)
   >    at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:842)
   >    at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:816)
   >    at org.elasticsearch.xpack.restart.FullClusterRestartIT.testRollupIDSchemeAfterRestart(FullClusterRestartIT.java:412)
   >    at java.lang.Thread.run(Thread.java:748)
   >    Suppressed: java.lang.AssertionError:
   > Expected: <2>
   >      but: was <1>
   >            at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   >            at org.elasticsearch.xpack.restart.FullClusterRestartIT.lambda$testRollupIDSchemeAfterRestart$5(FullClusterRestartIT.java:418)
   >            at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:830)
   >            ... 39 more
   >    Suppressed: java.lang.AssertionError:
   > Expected: <2>
   >      but: was <1>
   >            at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   >            at org.elasticsearch.xpack.restart.FullClusterRestartIT.lambda$testRollupIDSchemeAfterRestart$5(FullClusterRestartIT.java:418)
   >            at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:830)
   >            ... 39 more
   >    Suppressed: java.lang.AssertionError:
   > Expected: <2>
   >      but: was <1>
   >            at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   >            at org.elasticsearch.xpack.restart.FullClusterRestartIT.lambda$testRollupIDSchemeAfterRestart$5(FullClusterRestartIT.java:418)
   >            at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:830)
   >            ... 39 more
   >    Suppressed: java.lang.AssertionError:
   > Expected: <2>
  2> NOTE: leaving temporary files on disk at: /var/lib/jenkins/workspace/elastic+elasticsearch+6.x+multijob-unix-compatibility/os/amazon/x-pack/qa/full-cluster-restart/without-system-key/build/testrun/v6.3.3-SNAPSHOT#upgradedClusterTestRunner/J0/temp/org.elasticsearch.xpack.restart.FullClusterRestartIT_A9D42E18F468D297-001
  2> NOTE: test params are: codec=Asserting(Lucene70): {}, docValues:{}, maxPointsInLeafNode=1961, maxMBSortInHeap=7.446664250363229, sim=RandomSimilarity(queryNorm=false): {}, locale=tr-TR, timezone=Pacific/Chuuk
  2> NOTE: Linux 4.9.62-21.56.amzn1.x86_64 amd64/Oracle Corporation 1.8.0_181 (64-bit)/cpus=16,threads=1,free=418041304,total=514850816
  2> NOTE: All tests run in this JVM: [FullClusterRestartIT]
spinscale added the >test-failure and :StorageEngine/Rollup labels on Aug 10, 2018
@elasticmachine (Collaborator)

Pinging @elastic/es-search-aggs

@colings86 (Contributor)

@polyfractal could you take a look at this one?

@polyfractal (Contributor)

Hm, I'm not positive what's going on here but I have a guess. Will open a PR with a test fix and get Jim's opinion.

polyfractal added a commit to polyfractal/elasticsearch that referenced this issue Aug 10, 2018
We only upgrade the ID when the state is saved in one of four scenarios:

- when we reach a checkpoint (every 50 pages)
- when we run out of data
- when explicitly stopped
- on failure

The test was relying on the pre-upgrade job to finish and save its state, and then on the post-upgrade job to start, hit the end of the data, and upgrade the ID, and only THEN get the new doc and apply the new ID.

But I think this is vulnerable to timing issues. If the pre-upgrade portion shut down before it saved the state, then when restarting we would run through all the data from the beginning with the old ID, meaning both docs would still have the old scheme.

This change makes the pre-upgrade wait for the job to go back to STARTED
so that we know it persisted the end point.  Post-upgrade, it stops and
restarts the job to ensure the state was persisted and the ID upgraded.

That _should_ rule out the above timing issue.

Closes elastic#32773
polyfractal added three more commits that referenced this issue Aug 13, 2018, each with the same message as above.
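
For reference, a minimal sketch of the pre-upgrade wait that commit message describes, assuming a 6.x ESRestTestCase context (client(), entityAsMap(), and assertBusy() come from the test framework) and the 6.x GetRollupJobs response shape; the job name and timeout are illustrative, not necessarily the values used in FullClusterRestartIT:

    // Poll the rollup job until its state is back to "started", so we know the
    // indexer persisted its position before the old cluster shuts down.
    assertBusy(() -> {
        Response response = client().performRequest(
                "GET", "_xpack/rollup/job/rollup-id-test");
        Map<String, Object> body = entityAsMap(response);
        List<?> jobs = (List<?>) body.get("jobs");
        Map<?, ?> status = (Map<?, ?>) ((Map<?, ?>) jobs.get(0)).get("status");
        assertThat((String) status.get("job_state"), equalTo("started"));
    }, 30, TimeUnit.SECONDS);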
@andyb-elastic (Contributor)

I'm not sure if this is strictly related, but this failure on 6.x is from one of the lines added in b32fbbe.

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+matrix-java-periodic/ES_BUILD_JAVA=java10,ES_RUNTIME_JAVA=java10,nodes=virtual&&linux/236/console

The full log is too large to upload, sorry. It doesn't reproduce for me:

  1> [2018-08-14T13:17:05,638][INFO ][o.e.x.r.FullClusterRestartIT] [testRollupIDSchemeAfterRestart] after test                                               
ERROR   10.5s | FullClusterRestartIT.testRollupIDSchemeAfterRestart <<< FAILURES!                                                                             
   > Throwable #1: org.elasticsearch.client.ResponseException: method [POST], host [http://[::1]:45233], URI [_xpack/rollup/job/rollup-id-test/_stop], status line [HTTP/1.1 404 Not Found]          
   > {"error":{"root_cause":[{"type":"resource_not_found_exception","reason":"Task for Rollup Job [rollup-id-test] not found"}],"type":"resource_not_found_exception","reason":"Task for Rollup Job [rollup-id-test] not found"},"status":404}                                                                              
   >    at __randomizedtesting.SeedInfo.seed([452BE0CB85C8B7F5:1D45C2F60A154182]:0)                                                                           
   >    at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:920)
   >    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:227)
   >    at org.elasticsearch.xpack.restart.FullClusterRestartIT.testRollupIDSchemeAfterRestart(FullClusterRestartIT.java:416)                                 
   >    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)                                                                     
   >    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)                                                   
   >    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)                                           
   >    at java.base/java.lang.reflect.Method.invoke(Method.java:564)          
   >    at java.base/java.lang.Thread.run(Thread.java:844)                     
   > Caused by: org.elasticsearch.client.ResponseException: method [POST], host [http://[::1]:45233], URI [_xpack/rollup/job/rollup-id-test/_stop], status line [HTTP/1.1 404 Not Found]             
   > {"error":{"root_cause":[{"type":"resource_not_found_exception","reason":"Task for Rollup Job [rollup-id-test] not found"}],"type":"resource_not_found_exception","reason":"Task for Rollup Job [rollup-id-test] not found"},"status":404}                                                                              
   >    at org.elasticsearch.client.RestClient$1.completed(RestClient.java:540)                                                                               
   >    at org.elasticsearch.client.RestClient$1.completed(RestClient.java:529)                                                                               
   >    at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:119)                                                                             
   >    at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:177)                      
   >    at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:436)                                           
   >    at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:326)                                                
   >    at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)                                          
   >    at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)                                                        
   >    at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)                                                        
   >    at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)                                                        
   >    at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)                                                                    
   >    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)                                                        
   >    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)                                                       
   >    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)                                                             
   >    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)                                                                     
   >    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)                                    
   >    ... 1 more 
REPRODUCE WITH: ./gradlew :x-pack:qa:full-cluster-restart:without-system-key:v6.3.3-SNAPSHOT#upgradedClusterTestRunner -Dtests.seed=452BE0CB85C8B7F5 -Dtests.class=org.elasticsearch.xpack.restart.FullClusterRestartIT -Dtests.method="testRollupIDSchemeAfterRestart" -Dtests.security.manager=true -Dtests.locale=id-ID -Dtests.timezone=Africa/Freetown -Dcompiler.java=10 -Druntime.java=10

@andyb-elastic andyb-elastic reopened this Aug 14, 2018
@polyfractal (Contributor)

Grr, I think it's trying to stop the job before the task is fully initialized. I should have left the awaitBusy for the persistent task.

I'll take care of this shortly... if it keeps failing before I get there feel free to mute. Sorry for the noise!

@jpountz (Contributor) commented Aug 16, 2018

I just AwaitFix'ed this test.

polyfractal added a commit that referenced this issue Aug 31, 2018
We need to wait for the job to fully initialize and start before
we can attempt to stop it.  If we don't, it's possible for the stop
API to be called before the persistent task is fully loaded and it'll
throw an exception.

Closes #32773
polyfractal added a second commit that referenced this issue Aug 31, 2018, with the same message.
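
One way to express that guard, sketched under the same assumptions as the earlier snippet (the actual commit's implementation may differ): retry the stop call itself while it keeps returning the 404 "Task for Rollup Job not found" seen in the failure above.

    // assertBusy retries on AssertionError, so turn the 404 into a failed
    // assertion and let any other error surface immediately.
    assertBusy(() -> {
        try {
            client().performRequest(
                    "POST", "_xpack/rollup/job/rollup-id-test/_stop");
        } catch (ResponseException e) {
            // Task not fully initialized yet: fail this iteration and retry.
            assertNotEquals(404, e.getResponse().getStatusLine().getStatusCode());
            throw e;
        }
    });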
@polyfractal (Contributor)

Thanks Adrien.

I think this test is fixed now. But if it fails again, I'll re-evaluate how we're doing things in the test and see if there's a way to rewrite it to be less flaky.

Sorry for the noise everyone!

@DaveCTurner (Contributor) commented Sep 17, 2018

This test is a gift that keeps on giving 😁

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+bwc-tests/3/console failed as follows:

java.util.NoSuchElementException
	at __randomizedtesting.SeedInfo.seed([E7E5A73C8428AF07:BF8B85010BF55970]:0)
	at java.util.HashMap$HashIterator.nextNode(HashMap.java:1444)
	at java.util.HashMap$ValueIterator.next(HashMap.java:1471)
	at org.elasticsearch.xpack.restart.FullClusterRestartIT.assertRollUpJob(FullClusterRestartIT.java:651)
	at org.elasticsearch.xpack.restart.FullClusterRestartIT.testRollupIDSchemeAfterRestart(FullClusterRestartIT.java:392)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at java.lang.Thread.run(Thread.java:748)

The failure allegedly reproduces with the following, but not for me:

REPRODUCE WITH: ./gradlew :x-pack:qa:full-cluster-restart:with-system-key:v6.3.2#upgradedClusterTestRunner \
  -Dtests.seed=E7E5A73C8428AF07 \
  -Dtests.class=org.elasticsearch.xpack.restart.FullClusterRestartIT \
  -Dtests.method="testRollupIDSchemeAfterRestart" \
  -Dtests.security.manager=true \
  -Dtests.locale=vi-VN \
  -Dtests.timezone=Atlantic/Stanley \
  -Dcompiler.java=10 \
  -Druntime.java=8

I wonder if perhaps the following line should have an equalTo("started") in it so it waits for things to settle down after the restart?

@DaveCTurner DaveCTurner reopened this Sep 17, 2018
@polyfractal (Contributor)

Gift that keeps on giving indeed :(

I wonder if perhaps the following line should have an equalTo("started") in it so it waits for things to settle down after the restart?

Unfortunately, it does that in assertRollupJob():

    final Matcher<?> expectedStates = anyOf(equalTo("indexing"), equalTo("started"));

The failure seems to be during the assertion of the job details, as provided by the Tasks API. This occurs after we have confirmed the job exists in the GetRollupJob API (which means the persistent task exists).

I wonder if we need another await in there, to account for the allocated task taking slightly longer to "exist" after the persistent task is created. Not sure, will poke around some today.

Ugh. :(
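
A sketch of that extra await, with the same caveats as the earlier snippets (6.x APIs and ESRestTestCase helpers assumed): once the GetRollupJobs check proves the persistent task exists, the Tasks API assertion would itself move inside an assertBusy so a short lag before the allocated task appears cannot fail the test outright.

    // The persistent task exists, but the allocated task may lag behind it;
    // retry the Tasks API lookup until the rollup task is visible.
    assertBusy(() -> {
        Response response = client().performRequest(
                "GET", "_tasks?detailed=true&actions=xpack/rollup/job*");
        Map<?, ?> nodes = (Map<?, ?>) entityAsMap(response).get("nodes");
        assertNotNull(nodes);
        assertFalse("allocated rollup task not visible yet", nodes.isEmpty());
        // ... then run the existing job-detail assertions from assertRollupJob()
    });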

@DaveCTurner (Contributor)

Unfortunately, it does that in assertRollupJob():

So it does, sorry, I must have misread something.

@polyfractal (Contributor)

No worries, if that'd been the case I would have been overjoyed at a simple fix :)

@original-brownbear (Member)

Another one:

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+bwc-tests/61/console

REPRODUCE WITH: ./gradlew :x-pack:qa:full-cluster-restart:without-system-key:v6.3.2#upgradedClusterTestRunner \
  -Dtests.seed=3E23E8BB6F6FE2BC \
  -Dtests.class=org.elasticsearch.xpack.restart.FullClusterRestartIT \
  -Dtests.method="testRollupIDSchemeAfterRestart" \
  -Dtests.security.manager=true \
  -Dtests.locale=is \
  -Dtests.timezone=America/Metlakatla \
  -Dcompiler.java=11 \
  -Druntime.java=8

@droberts195 (Contributor)

@polyfractal if something were to wipe all the persistent tasks from the cluster state between the upgraded cluster starting up and FullClusterRestartIT.testRollupIDSchemeAfterRestart running, could that explain the failures that have been seen in this issue? I ask because FullClusterRestartIT.testSnapshotRestore sometimes does just that (depending on the order in which the tests run); see #36816 (comment). Also, even when FullClusterRestartIT.testSnapshotRestore doesn't wipe the persistent tasks, it can cause their assignments to become stale, and so cause them to be reassigned to the other node in the cluster partway through the series of assertions that FullClusterRestartIT.testRollupIDSchemeAfterRestart is making. Could that cause the test failures that were observed?

@polyfractal (Contributor)

@droberts195 Oh! Yes, I bet that would/could have caused some of these failures. This test:

  1. Creates a job, indexes some data, makes sure everything looks good
  2. Upgrades and restarts cluster
  3. Ensures the job created in step 1 is active/ready
  4. Stops and restarts the job to check a certain flag has changed

Step 3 would definitely break if the state were wiped between steps 1 and 3, since the job wouldn't exist and the assertions would fail (which is what the failures in this issue are showing).

I don't think shifting assignments would cause failures in this case, since we're just asserting that the job exists somewhere and has been upgraded correctly. But if the state were totally wiped away, it certainly would.
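
If one wanted to confirm that hypothesis on a failing run, a small probe along these lines could be dropped in before step 3; this is purely illustrative debugging code under the same assumptions as the earlier snippets, not part of any fix in this thread:

    // Dump the cluster state metadata: if another test (e.g. testSnapshotRestore
    // restoring a snapshot with global state) wiped the persistent tasks, the
    // rollup job id will be missing from this output.
    Response state = client().performRequest("GET", "_cluster/state/metadata");
    String stateBody = EntityUtils.toString(state.getEntity());
    logger.info("cluster state before rollup assertions: {}", stateBody);
    assertThat(stateBody, containsString("rollup-id-test"));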

@polyfractal (Contributor)

Oops, this was closed via that test fix, which predated the comments above. I think the fix in #38218 probably contributed to some of the failures, but the snapshot/restore issue could easily have contributed to the others.

@jaymode (Member) commented Mar 18, 2019

This has reappeared in the 6.6 BWC tests:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.6+bwc-tests/313/console

Expected: iterable over ["3310683722", "rollup-id-test$ehY4NAyVSy8xxUDZrNXXIA"] in any order
     but: Not matched: "621059582"
		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
		at org.junit.Assert.assertThat(Assert.java:956)
		at org.junit.Assert.assertThat(Assert.java:923)
		at org.elasticsearch.xpack.restart.FullClusterRestartIT.lambda$testRollupIDSchemeAfterRestart$5(FullClusterRestartIT.java:413)
		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
		... 39 more
	Suppressed: java.lang.AssertionError: 
Expected: <2>
     but: was <3>

@jaymode jaymode reopened this Mar 18, 2019
@polyfractal (Contributor)

Thanks for the ping. Looks like it might be a missed backport. Investigating

@polyfractal (Contributor)

Ok, backported to 6.6. I believe that particular fix has been fully backported, so if it fails again it may be the issue @droberts195 mentioned... or something else.

Closing for now. 🤞
