[BUG] Reproducible test failure .RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot #5823

Closed
mch2 opened this issue Jan 11, 2023 · 9 comments
Labels: bug, flaky-test, >test-failure

mch2 commented Jan 11, 2023

Caught this seed while running local checks against the 2.5 branch. The seed fails 100% of the time for me.

./gradlew ':server:test' --tests "org.opensearch.index.translog.RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot" -Dtests.seed=7ED21C571F7C7EBE 

Trace:

~/workspace/OpenSearch (2.5)$ ./gradlew ':server:test' --tests "org.opensearch.index.translog.RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot" -Dtests.seed=7ED21C571F7C7EBE 
Picked up JAVA_TOOL_OPTIONS: -Dlog4j2.formatMsgNoLookups=true
Starting a Gradle Daemon (subsequent builds will be faster)

> Configure project :qa:os
Cannot add task 'destructiveDistroTest.docker' as a task with that name already exists.
=======================================
OpenSearch Build Hamster says Hello!
  Gradle Version        : 7.6
  OS Info               : Mac OS X 12.6.1 (x86_64)
  JDK Version           : 17 (OpenJDK)
  JAVA_HOME             : /Users/handalm/.sdkman/candidates/java/17.0.2-open
  Random Testing Seed   : 7ED21C571F7C7EBE
  In FIPS 140 mode      : false
=======================================

> Task :server:compileJava
Picked up JAVA_TOOL_OPTIONS: -Dlog4j2.formatMsgNoLookups=true
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :test:framework:compileJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :server:compileTestJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :server:test
Picked up JAVA_TOOL_OPTIONS: -Dlog4j2.formatMsgNoLookups=true
OpenJDK 64-Bit Server VM warning: Ignoring option --illegal-access=warn; support was removed in 17.0
WARNING: A terminally deprecated method in java.lang.System has been called
WARNING: System::setSecurityManager has been called by org.opensearch.bootstrap.BootstrapForTesting (file:/Users/handalm/workspace/OpenSearch/test/framework/build/distributions/framework-2.5.0-SNAPSHOT.jar)
WARNING: Please consider reporting this to the maintainers of org.opensearch.bootstrap.BootstrapForTesting
WARNING: System::setSecurityManager will be removed in a future release
WARNING: A terminally deprecated method in java.lang.System has been called
WARNING: System::setSecurityManager has been called by org.gradle.api.internal.tasks.testing.worker.TestWorker (file:/Users/handalm/.gradle/wrapper/dists/gradle-7.6-all/9f832ih6bniajn45pbmqhk2cw/gradle-7.6/lib/plugins/gradle-testing-base-7.6.jar)
WARNING: Please consider reporting this to the maintainers of org.gradle.api.internal.tasks.testing.worker.TestWorker
WARNING: System::setSecurityManager will be removed in a future release

REPRODUCE WITH: ./gradlew ':server:test' --tests "org.opensearch.index.translog.RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot" -Dtests.seed=7ED21C571F7C7EBE -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=zh-Hant-HK -Dtests.timezone=Asia/Manila -Druntime.java=17

org.opensearch.index.translog.RemoteFSTranslogTests > testConcurrentWriteViewsAndSnapshot FAILED
    java.io.IOException: Failed to upload 2 files during transfer
        at __randomizedtesting.SeedInfo.seed([7ED21C571F7C7EBE:5A0BE57AC4EB8D3D]:0)
        at org.opensearch.index.translog.transfer.TranslogTransferManager.transferSnapshot(TranslogTransferManager.java:121)
        at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:212)
        at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:195)
        at org.opensearch.index.translog.RemoteFsTranslog.ensureSynced(RemoteFsTranslog.java:145)
        at org.opensearch.index.translog.RemoteFSTranslogTests$2.doRun(RemoteFSTranslogTests.java:708)


Suite: Test class org.opensearch.index.translog.RemoteFSTranslogTests
  1> [2023-01-11T11:59:59,200][INFO ][o.o.i.t.RemoteFSTranslogTests] [testConcurrentWriteViewsAndSnapshot] before test
  1> [2023-01-11T12:00:01,277][INFO ][o.o.i.t.RemoteFSTranslogTests] [testConcurrentWriteViewsAndSnapshot] using [3] readers. [1] writers. flushing every ~[98] ops.
  1> [2023-01-11T12:00:01,285][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:01,285][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:01,285][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:03,415][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:03,415][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:03,415][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:07,612][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:07,769][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:07,844][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:14,273][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:14,503][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:14,726][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:23,690][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:23,690][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:23,690][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:35,113][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:35,113][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:35,516][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:49,118][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:49,118][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:00:49,560][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:05,716][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:05,716][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:06,227][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:24,912][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:26,053][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:26,053][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:47,606][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> [reader_0] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:48,884][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> [reader_1] min gen after acquiring lock [1]
  1> [2023-01-11T12:01:49,606][INFO ][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> [reader_2] min gen after acquiring lock [1]
  1> [2023-01-11T12:02:00,603][ERROR][o.o.i.t.t.BlobStoreTransferService] [org.opensearch.index.translog.RemoteFSTranslogTests] Failed to upload blob translog-682.ckp
  1> java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>    at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) [main/:?]
  1>    at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) [main/:?]
  1>    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [main/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> [2023-01-11T12:02:00,603][ERROR][o.o.i.t.t.BlobStoreTransferService] [org.opensearch.index.translog.RemoteFSTranslogTests] Failed to upload blob translog-682.tlog
  1> java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>    at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) [main/:?]
  1>    at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) [main/:?]
  1>    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [main/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> [2023-01-11T12:02:00,620][ERROR][o.o.i.t.t.TranslogTransferManager] [org.opensearch.index.translog.RemoteFSTranslogTests] Exception during transfer for file translog-682.tlog
  1> org.opensearch.index.translog.transfer.FileTransferException: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>    at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:58) [main/:?]
  1>    at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) [main/:?]
  1>    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [main/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>    at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) ~[main/:?]
  1>    ... 6 more
  1> [2023-01-11T12:02:00,620][ERROR][o.o.i.t.t.TranslogTransferManager] [org.opensearch.index.translog.RemoteFSTranslogTests] Exception during transfer for file translog-682.ckp
  1> org.opensearch.index.translog.transfer.FileTransferException: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>    at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:58) [main/:?]
  1>    at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) [main/:?]
  1>    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [main/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>    at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>    at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) ~[main/:?]
  1>    ... 6 more
  1> [2023-01-11T12:02:00,629][ERROR][o.o.i.t.t.TranslogTransferManager] [[writer_0]] Transfer failed for snapshot TranslogTransferSnapshot [ primary term = 211153147, generation = 682 ]
  1> java.io.IOException: Failed to upload 2 files during transfer
  1>    at org.opensearch.index.translog.transfer.TranslogTransferManager.transferSnapshot(TranslogTransferManager.java:121) [main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:212) [main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:195) [main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.ensureSynced(RemoteFsTranslog.java:145) [main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$2.doRun(RemoteFSTranslogTests.java:708) [test/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Suppressed: org.opensearch.index.translog.transfer.FileTransferException: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:58) ~[main/:?]
  1>            at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
  1>            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
  1>            at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>            at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) ~[main/:?]
  1>            ... 6 more
  1>    Suppressed: org.opensearch.index.translog.transfer.FileTransferException: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:58) ~[main/:?]
  1>            at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
  1>            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
  1>            at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>            at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) ~[main/:?]
  1>            ... 6 more
  1> [2023-01-11T12:02:00,646][ERROR][o.o.i.t.RemoteFSTranslogTests] [[writer_0]] --> writer [writer_0] had an error
  1> java.io.IOException: Failed to upload 2 files during transfer
  1>    at org.opensearch.index.translog.transfer.TranslogTransferManager.transferSnapshot(TranslogTransferManager.java:121) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:212) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:195) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.ensureSynced(RemoteFsTranslog.java:145) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$2.doRun(RemoteFSTranslogTests.java:708) [test/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Suppressed: org.opensearch.index.translog.transfer.FileTransferException: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:58) ~[main/:?]
  1>            at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
  1>            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
  1>            at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.tlog-8rOYLjgzRcaU_rJcefXo0Q: Too many open files
  1>            at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) ~[main/:?]
  1>            ... 6 more
  1>    Suppressed: org.opensearch.index.translog.transfer.FileTransferException: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:58) ~[main/:?]
  1>            at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:88) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) ~[main/:?]
  1>            at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
  1>            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
  1>            at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-003/repo/SAGfAPSbRN/_na_/1/211153147/pending-translog-682.ckp-z1G2WGflQlKpQFIIIILBew: Too many open files
  1>            at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at java.nio.file.Files.newOutputStream(Files.java:228) ~[?:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeToPath(FsBlobContainer.java:228) ~[main/:?]
  1>            at org.opensearch.common.blobstore.fs.FsBlobContainer.writeBlobAtomic(FsBlobContainer.java:213) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$ThrowingBlobContainer.writeBlobAtomic(RemoteFSTranslogTests.java:1253) ~[test/:?]
  1>            at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$uploadBlobAsync$1(BlobStoreTransferService.java:54) ~[main/:?]
  1>            ... 6 more
  1> [2023-01-11T12:02:01,025][ERROR][o.o.i.t.RemoteFSTranslogTests] [[reader_2]] --> reader [reader_2] had an error
  1> java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-002/translog-2.tlog: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:202) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.channels.FileChannel.open(FileChannel.java:298) ~[?:?]
  1>    at java.nio.channels.FileChannel.open(FileChannel.java:357) ~[?:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot.<init>(FileSnapshot.java:45) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot$TransferFileSnapshot.<init>(FileSnapshot.java:116) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot$TranslogFileSnapshot.<init>(FileSnapshot.java:156) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.TranslogCheckpointTransferSnapshot$Builder.build(TranslogCheckpointTransferSnapshot.java:147) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:209) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:192) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.sync(RemoteFsTranslog.java:255) ~[main/:?]
  1>    at org.opensearch.index.translog.Translog.trimUnreferencedReaders(Translog.java:1748) ~[main/:?]
  1>    at org.opensearch.index.translog.Translog.lambda$acquireTranslogGenFromDeletionPolicy$12(Translog.java:735) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$3.closeRetentionLock(RemoteFSTranslogTests.java:762) ~[test/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$3.doRun(RemoteFSTranslogTests.java:816) [test/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> [2023-01-11T12:02:01,025][ERROR][o.o.i.t.RemoteFSTranslogTests] [[reader_1]] --> reader [reader_1] had an error
  1> org.apache.lucene.store.AlreadyClosedException: translog [684] is already closed (path [/Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-002/translog-684.tlog]
  1>    at org.opensearch.index.translog.TranslogWriter.closeIntoReader(TranslogWriter.java:439) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:167) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.sync(RemoteFsTranslog.java:255) ~[main/:?]
  1>    at org.opensearch.index.translog.Translog.trimUnreferencedReaders(Translog.java:1748) ~[main/:?]
  1>    at org.opensearch.index.translog.Translog.lambda$acquireTranslogGenFromDeletionPolicy$12(Translog.java:735) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$3.closeRetentionLock(RemoteFSTranslogTests.java:762) ~[test/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$3.doRun(RemoteFSTranslogTests.java:816) [test/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> Caused by: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-002/translog-683.ckp: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:202) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.channels.FileChannel.open(FileChannel.java:298) ~[?:?]
  1>    at java.nio.channels.FileChannel.open(FileChannel.java:357) ~[?:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot.<init>(FileSnapshot.java:45) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot$TransferFileSnapshot.<init>(FileSnapshot.java:116) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot$CheckpointFileSnapshot.<init>(FileSnapshot.java:193) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.TranslogCheckpointTransferSnapshot$Builder.build(TranslogCheckpointTransferSnapshot.java:147) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:209) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:190) ~[main/:?]
  1>    ... 7 more
  1>    Suppressed: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-002/translog-2.tlog: Too many open files
  1>            at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:202) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at java.nio.channels.FileChannel.open(FileChannel.java:298) ~[?:?]
  1>            at java.nio.channels.FileChannel.open(FileChannel.java:357) ~[?:?]
  1>            at org.opensearch.index.translog.transfer.FileSnapshot.<init>(FileSnapshot.java:45) ~[main/:?]
  1>            at org.opensearch.index.translog.transfer.FileSnapshot$TransferFileSnapshot.<init>(FileSnapshot.java:116) ~[main/:?]
  1>            at org.opensearch.index.translog.transfer.FileSnapshot$TranslogFileSnapshot.<init>(FileSnapshot.java:156) ~[main/:?]
  1>            at org.opensearch.index.translog.transfer.TranslogCheckpointTransferSnapshot$Builder.build(TranslogCheckpointTransferSnapshot.java:147) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:209) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:192) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFsTranslog.sync(RemoteFsTranslog.java:255) ~[main/:?]
  1>            at org.opensearch.index.translog.Translog.trimUnreferencedReaders(Translog.java:1748) ~[main/:?]
  1>            at org.opensearch.index.translog.Translog.lambda$acquireTranslogGenFromDeletionPolicy$12(Translog.java:735) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$3.closeRetentionLock(RemoteFSTranslogTests.java:762) ~[test/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$3.doRun(RemoteFSTranslogTests.java:816) [test/:?]
  1>            at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>            at java.lang.Thread.run(Thread.java:833) [?:?]
  1> [2023-01-11T12:02:01,036][ERROR][o.o.i.t.RemoteFSTranslogTests] [[reader_0]] --> reader [reader_0] had an error
  1> java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-002/translog-683.ckp: Too many open files
  1>    at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at org.apache.lucene.tests.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:202) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>    at java.nio.channels.FileChannel.open(FileChannel.java:298) ~[?:?]
  1>    at java.nio.channels.FileChannel.open(FileChannel.java:357) ~[?:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot.<init>(FileSnapshot.java:45) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot$TransferFileSnapshot.<init>(FileSnapshot.java:116) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.FileSnapshot$CheckpointFileSnapshot.<init>(FileSnapshot.java:193) ~[main/:?]
  1>    at org.opensearch.index.translog.transfer.TranslogCheckpointTransferSnapshot$Builder.build(TranslogCheckpointTransferSnapshot.java:147) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:209) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:190) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFsTranslog.sync(RemoteFsTranslog.java:255) ~[main/:?]
  1>    at org.opensearch.index.translog.Translog.trimUnreferencedReaders(Translog.java:1748) ~[main/:?]
  1>    at org.opensearch.index.translog.Translog.lambda$acquireTranslogGenFromDeletionPolicy$12(Translog.java:735) ~[main/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$3.closeRetentionLock(RemoteFSTranslogTests.java:762) ~[test/:?]
  1>    at org.opensearch.index.translog.RemoteFSTranslogTests$3.doRun(RemoteFSTranslogTests.java:816) [test/:?]
  1>    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1>    Suppressed: java.nio.file.FileSystemException: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001/tempDir-002/translog-2.tlog: Too many open files
  1>            at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at org.apache.lucene.tests.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:202) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
  1>            at java.nio.channels.FileChannel.open(FileChannel.java:298) ~[?:?]
  1>            at java.nio.channels.FileChannel.open(FileChannel.java:357) ~[?:?]
  1>            at org.opensearch.index.translog.transfer.FileSnapshot.<init>(FileSnapshot.java:45) ~[main/:?]
  1>            at org.opensearch.index.translog.transfer.FileSnapshot$TransferFileSnapshot.<init>(FileSnapshot.java:116) ~[main/:?]
  1>            at org.opensearch.index.translog.transfer.FileSnapshot$TranslogFileSnapshot.<init>(FileSnapshot.java:156) ~[main/:?]
  1>            at org.opensearch.index.translog.transfer.TranslogCheckpointTransferSnapshot$Builder.build(TranslogCheckpointTransferSnapshot.java:147) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:209) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:192) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFsTranslog.sync(RemoteFsTranslog.java:255) ~[main/:?]
  1>            at org.opensearch.index.translog.Translog.trimUnreferencedReaders(Translog.java:1748) ~[main/:?]
  1>            at org.opensearch.index.translog.Translog.lambda$acquireTranslogGenFromDeletionPolicy$12(Translog.java:735) ~[main/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$3.closeRetentionLock(RemoteFSTranslogTests.java:762) ~[test/:?]
  1>            at org.opensearch.index.translog.RemoteFSTranslogTests$3.doRun(RemoteFSTranslogTests.java:816) [test/:?]
  1>            at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
  1>            at java.lang.Thread.run(Thread.java:833) [?:?]
  1> [2023-01-11T12:02:01,066][INFO ][o.o.i.t.RemoteFSTranslogTests] [testConcurrentWriteViewsAndSnapshot] after test
  2> REPRODUCE WITH: ./gradlew ':server:test' --tests "org.opensearch.index.translog.RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot" -Dtests.seed=7ED21C571F7C7EBE -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=zh-Hant-HK -Dtests.timezone=Asia/Manila -Druntime.java=17
  2> java.io.IOException: Failed to upload 2 files during transfer
        at __randomizedtesting.SeedInfo.seed([7ED21C571F7C7EBE:5A0BE57AC4EB8D3D]:0)
        at org.opensearch.index.translog.transfer.TranslogTransferManager.transferSnapshot(TranslogTransferManager.java:121)
        at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:212)
        at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:195)
        at org.opensearch.index.translog.RemoteFsTranslog.ensureSynced(RemoteFsTranslog.java:145)
        at org.opensearch.index.translog.RemoteFSTranslogTests$2.doRun(RemoteFSTranslogTests.java:708)
  2> NOTE: leaving temporary files on disk at: /Users/handalm/workspace/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_7ED21C571F7C7EBE-001
  2> NOTE: test params are: codec=Lucene94, sim=Asserting(RandomSimilarity(queryNorm=true): {}), locale=zh-Hant-HK, timezone=Asia/Manila
  2> NOTE: Mac OS X 12.6.1 x86_64/Eclipse Adoptium 17.0.5 (64-bit)/cpus=8,threads=1,free=255491192,total=536870912
  2> NOTE: All tests run in this JVM: [RemoteFSTranslogTests]

Tests with failures:
 - org.opensearch.index.translog.RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot

1 test completed, 1 failed

> Task :server:test FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':server:test'.
> There were failing tests. See the report at: file:///Users/handalm/workspace/OpenSearch/server/build/reports/tests/test/index.html

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.

* Get more help at https://help.gradle.org

BUILD FAILED in 4m 31s
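
In the traces above, `FileChannel.open` fails with "Too many open files" inside the `FileSnapshot` constructors while the test's `HandleLimitFS` caps the number of open handles. As a general illustration only (this is not OpenSearch code and not the change made in any linked PR), the sketch below shows one way to open a batch of channels so that handles already opened are released if a later open fails. The class `ChannelBatch` and its methods are hypothetical names introduced for this example.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of OpenSearch. */
final class ChannelBatch implements Closeable {
    private final List<FileChannel> channels = new ArrayList<>();

    /**
     * Opens every path for reading. If any open fails, the channels that
     * were already opened are closed before the exception propagates,
     * so a partial failure does not leak file handles.
     */
    static ChannelBatch openAll(List<Path> paths) throws IOException {
        ChannelBatch batch = new ChannelBatch();
        try {
            for (Path path : paths) {
                batch.channels.add(FileChannel.open(path, StandardOpenOption.READ));
            }
            return batch;
        } catch (IOException | RuntimeException e) {
            try {
                batch.close();
            } catch (IOException closeError) {
                // Keep the original failure as the primary exception.
                e.addSuppressed(closeError);
            }
            throw e;
        }
    }

    /** Number of channels currently tracked by this batch. */
    int size() {
        return channels.size();
    }

    /** True when every tracked channel has been closed. */
    boolean allClosed() {
        for (FileChannel channel : channels) {
            if (channel.isOpen()) {
                return false;
            }
        }
        return true;
    }

    /** Closes all channels, attempting each one even if an earlier close fails. */
    @Override
    public void close() throws IOException {
        IOException first = null;
        for (FileChannel channel : channels) {
            try {
                channel.close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```

A caller would typically wrap the batch in try-with-resources, for example `try (ChannelBatch batch = ChannelBatch.openAll(paths)) { ... }`, so the handles are released on both the success and the failure path.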
@mch2 mch2 added bug Something isn't working untriaged >test-failure Test failure from CI, local build, etc. labels Jan 11, 2023
@mch2
Member Author

mch2 commented Jan 11, 2023

@sachinpkale FYI

@sachinpkale
Member

Taking a look

@sachinpkale
Member

Fix is merged to main: #5789
Need to backport to 2.x and 2.5

@sachinpkale
Member

Backport PRs:

#5828
#5829

@sachinpkale
Member

The fix is merged and backported.

@sachinpkale sachinpkale self-assigned this Jan 11, 2023
@nknize
Collaborator

nknize commented Jul 21, 2023

Heads up: I ran into a different error when running :server:test locally before opening #8826. It does not repro on its own. Maybe someone knows the cause and whether we should reopen this issue:

./gradlew ':server:test' --tests "org.opensearch.index.translog.RemoteFSTranslogTests.testConcurrentWriteViewsAndSnapshot" -Dtests.seed=11209ED8456F2E6A -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=sr-ME -Dtests.timezone=Pacific/Johnston -Druntime.java=20
  2> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=3157, name=writer_0, state=RUNNABLE, group=TGRP-RemoteFSTranslogTests]
        at __randomizedtesting.SeedInfo.seed([F8C495824172C1FF:DC1D6CAF9AE5327C]:0)

        Caused by:
        java.lang.AssertionError: [index][1] Expected non-empty readers
            at __randomizedtesting.SeedInfo.seed([F8C495824172C1FF]:0)
            at org.opensearch.index.translog.RemoteFsTranslog.deleteStaleRemotePrimaryTerms(RemoteFsTranslog.java:430)
            at org.opensearch.index.translog.RemoteFsTranslog.trimUnreferencedReaders(RemoteFsTranslog.java:400)
            at org.opensearch.index.translog.RemoteFSTranslogTests$2.doRun(RemoteFSTranslogTests.java:821)
  1> [2023-07-21T22:25:10,111][INFO ][o.o.i.t.RemoteFSTranslogTests] [testReadLocation] before test
  1> [2023-07-21T22:25:10,132][INFO ][o.o.i.t.RemoteFSTranslogTests] [testReadLocation] after test
  1> [2023-07-21T22:25:10,139][INFO ][o.o.i.t.RemoteFSTranslogTests] [testUploadWithPrimaryModeTrue] before test
  1> [2023-07-21T22:25:10,155][INFO ][o.o.i.t.RemoteFSTranslogTests] [testUploadWithPrimaryModeTrue] after test
  1> [2023-07-21T22:25:10,162][INFO ][o.o.i.t.RemoteFSTranslogTests] [testTranslogWriterFsyncDisabledInRemoteFsTranslog] before test
  1> [2023-07-21T22:25:10,190][INFO ][o.o.i.t.RemoteFSTranslogTests] [testTranslogWriterFsyncDisabledInRemoteFsTranslog] after test
  1> [2023-07-21T22:25:10,197][INFO ][o.o.i.t.RemoteFSTranslogTests] [testConcurrentWritesWithVaryingSize] before test
  1> [2023-07-21T22:25:10,207][INFO ][o.o.i.t.RemoteFSTranslogTests] [testConcurrentWritesWithVaryingSize] testing with [7] threads, each doing [14] ops
  1> [2023-07-21T22:25:10,440][INFO ][o.o.i.t.RemoteFSTranslogTests] [testConcurrentWritesWithVaryingSize] after test
  1> [2023-07-21T22:25:10,451][INFO ][o.o.i.t.RemoteFSTranslogTests] [testTranslogWriterCanFlushInAddOrReadCall] before test
  1> [2023-07-21T22:25:10,476][INFO ][o.o.i.t.RemoteFSTranslogTests] [testTranslogWriterCanFlushInAddOrReadCall] after test
  1> [2023-07-21T22:25:10,482][INFO ][o.o.i.t.RemoteFSTranslogTests] [testRangeSnapshot] before test
  1> [2023-07-21T22:25:10,532][INFO ][o.o.i.t.RemoteFSTranslogTests] [testRangeSnapshot] after test
  1> [2023-07-21T22:25:10,538][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperationsUpload] before test
  1> [2023-07-21T22:25:10,561][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperationsUpload] All md files [9223372035702745179__9223372036854775805__9223370346876465252__1, 9223372035702745179__9223372036854775804__9223370346876465248__1]
  1> [2023-07-21T22:25:10,563][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperationsUpload] All data files [translog-3.ckp, translog-1.tlog, translog-2.ckp, translog-3.tlog, translog-1.ckp, translog-2.tlog]
  1> [2023-07-21T22:25:10,563][ERROR][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperationsUpload] Asserting content of 2
  1> [2023-07-21T22:25:10,564][ERROR][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperationsUpload] Asserting content of 3
  1> [2023-07-21T22:25:10,567][INFO ][o.o.i.t.t.TranslogTransferManager] [testSimpleOperationsUpload] [index][1] Deleting primary terms from remote store lesser than 1152030628
  1> [2023-07-21T22:25:10,578][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperationsUpload] after test
  1> [2023-07-21T22:25:10,590][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSyncUpToStream] before test
  1> [2023-07-21T22:25:10,669][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSyncUpToStream] after test
  1> [2023-07-21T22:25:10,675][INFO ][o.o.i.t.RemoteFSTranslogTests] [testCloseIntoReader] before test
  1> [2023-07-21T22:25:10,697][INFO ][o.o.i.t.RemoteFSTranslogTests] [testCloseIntoReader] after test
  1> [2023-07-21T22:25:10,708][INFO ][o.o.i.t.RemoteFSTranslogTests] [testMetadataFileDeletion] before test
  1> [2023-07-21T22:25:10,745][INFO ][o.o.i.t.t.TranslogTransferManager] [testMetadataFileDeletion] [index][1] Deleting primary terms from remote store lesser than 1979622222
  1> [2023-07-21T22:25:10,770][INFO ][o.o.i.t.RemoteFSTranslogTests] [testMetadataFileDeletion] numDocs=7 moreDocs=4
  1> [2023-07-21T22:25:10,813][INFO ][o.o.i.t.t.TranslogTransferManager] [testMetadataFileDeletion] [index][1] Downloading translog files with: Primary Term = 1979622222, Generation = 13, Location = /opt/dev/opensearch-project/opensearch/.worktrees/enhance/mediaTypeParserRegistry/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_F8C495824172C1FF-001/tempDir-044
  1> [2023-07-21T22:25:10,814][INFO ][o.o.i.t.t.TranslogTransferManager] [testMetadataFileDeletion] [index][1] Downloading translog files with: Primary Term = 1979622222, Generation = 12, Location = /opt/dev/opensearch-project/opensearch/.worktrees/enhance/mediaTypeParserRegistry/server/build/testrun/test/temp/org.opensearch.index.translog.RemoteFSTranslogTests_F8C495824172C1FF-001/tempDir-044
  1> [2023-07-21T22:25:10,831][INFO ][o.o.i.t.t.TranslogTransferManager] [testMetadataFileDeletion] [index][1] Deleting primary terms from remote store lesser than 1979622223
  1> [2023-07-21T22:25:10,836][INFO ][o.o.i.t.t.TranslogTransferManager] [org.opensearch.index.translog.RemoteFSTranslogTests] [index][1] Deleted primary term 1979622222
  1> [2023-07-21T22:25:10,863][INFO ][o.o.i.t.RemoteFSTranslogTests] [testMetadataFileDeletion] after test
  1> [2023-07-21T22:25:10,873][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperations] before test
  1> [2023-07-21T22:25:10,898][INFO ][o.o.i.t.RemoteFSTranslogTests] [testSimpleOperations] after test
  2> NOTE: test params are: codec=Asserting(Lucene95): {}, docValues:{}, maxPointsInLeafNode=948, maxMBSortInHeap=5.221747612462641, sim=Asserting(RandomSimilarity(queryNorm=true): {}), locale=he-IL, timezone=America/Danmarkshavn
  2> NOTE: Linux 5.17.0-1033-oem amd64/Eclipse Adoptium 20.0.1 (64-bit)/cpus=24,threads=1,free=424241208,total=536870912
  2> NOTE: All tests run in this JVM: [DynamicActionRegistryTests, AddVotingConfigExclusionsRequestTests, DecommissionResponseTests, CancelTasksRequestTests, ClusterGetSettingsResponseTests, SnapshotIndexShardStatusTests, MappingVisitorTests, GetAliasesResponseTests, CreateIndexResponseTests, GetIndexActionTests, ResolveIndexResponseTests, UpdateSettingsRequestSerializationTests, GetIndexTemplatesResponseTests, BulkRequestModifierTests, DeleteResponseTests, TransportMultiGetActionTests, SimulateProcessorResultTests, CreatePitControllerTests, SearchPhaseExecutionExceptionTests, TransportMultiSearchActionTests, RetryableActionTests, TransportWriteActionForIndexingPressureTests, JavaVersionTests, NodeClientHeadersTests, ShardFailedClusterStateTaskExecutorTests, ClusterBootstrapServiceRenamedSettingTests, LeaderCheckerTests, DecommissionControllerTests, ComponentTemplateTests, IndexAbstractionTests, MetadataIndexStateServiceTests, DiscoveryNodeTests, PrimaryTermsTests, AllocationConstraintsTests, DecisionsImpactOnClusterHealthTests, MaxRetryAllocationDeciderTests, RemoteShardsMoveShardsTests, TenShardsOneReplicaRoutingTests, RestoreInProgressAllocationDeciderTests, TaskBatcherTests, RoundingTests, CompositeBytesReferenceTests, AutoCloseableRefCountedTests, GeometryIndexerTests, PointBuilderTests, HeaderWarningTests, MinScoreScorerTests, NetworkUtilsTests, MemorySizeSettingsTests, JavaDateMathParserTests, ByteUtilsTests, ReorganizingLongHashTests, FutureUtilsTests, SizeBlockingQueueTests, JsonVsCborTests, JacksonLocationTests, SettingsBasedSeedHostsProviderTests, ExtensionActionUtilTests, RegisterCustomSettingsTests, PriorityComparatorTests, IndexingPressureServiceTests, ShardIndexingPressureTests, PreConfiguredTokenFilterTests, EngineConfigFactoryTests, RecoverySourcePruneMergePolicyTests, NoOrdinalsStringFieldDataTests, BinaryFieldMapperTests, DocCountFieldMapperTests, FieldAliasMapperValidationTests, GeoShapeFieldTypeTests, KeywordFieldTypeTests, 
NumberFieldTypeTests, SourceFieldMapperTests, CombineIntervalsSourceProviderTests, GeoBoundingBoxQueryBuilderTests, MatchNoneQueryBuilderTests, QueryStringQueryBuilderTests, SpanFirstQueryBuilderTests, WildcardQueryBuilderTests, DeleteByQueryRequestTests, MultiMatchQueryTests, GlobalCheckpointSyncActionTests, RetentionLeasesTests, PrimaryReplicaSyncerTests, ShardUtilsTests, RemoteBufferedOutputDirectoryTests, FileCacheCleanerTests, RemoteFSTranslogTests]

@mch2
Member Author

mch2 commented Sep 21, 2023

Hit this again in #9743 (comment)

@mch2 mch2 reopened this Sep 21, 2023
@peternied peternied added flaky-test Random test failure that succeeds on second run and removed untriaged labels Nov 30, 2023
@sachinpkale
Member

Taking a look

@sachinpkale
Member

Ran on local env 25K+ times without any failures. Closing.
