Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the topic in fenced state and can not recover. #11737

Merged
merged 3 commits into from
Aug 25, 2021

Conversation

codelipenghui
Copy link
Contributor

@codelipenghui codelipenghui commented Aug 21, 2021

Here is the log when the issue happens. The producer continues to reconnect to the broker, but the fenced state of the topic is always true.

19:01:42.351 [pulsar-io-4-1] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052][persistent://public/default/test-8] Creating producer. producerId=8
19:01:42.352 [Thread-174681] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052] persistent://public/default/test-8 configured with schema false
19:01:42.352 [Thread-174681] WARN  org.apache.pulsar.broker.service.AbstractTopic - [persistent://public/default/test-8] Attempting to add producer to a fenced topic
19:01:42.352 [Thread-174681] ERROR org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052] Failed to add producer to topic persistent://public/default/test-8: producerId=8, org.apache.pulsar.broker.service.BrokerServiceException$TopicFencedException: Topic is temporarily unavailable

After check the heap dump of the broker, the pendingWriteOps is 5, this is the reason why the topic can not recover from the fenced state.

image

The topic will change to unfenced only the pendingWriteOps is 0, details can find at

if (pending == 0 && isFenced && !isClosingOrDeleting) {

But after checking the ML state of the topic, it shows the pendingAddEntries is 0 which not equals to pendingWriteOps of the topic.

image

The root cause is we are polling add entry op from the pendingAddEntries in multiple threads, one is the the ZK callback thread when complete the ledger creating (

,
void clearPendingAddEntries(ManagedLedgerException e) {
),
another one is the ML worker thread when complete the add entry op (
OpAddEntry firstInQueue = ml.pendingAddEntries.poll();
)

After the ongoing add entry op complete, but the corresponding op might been polled by the clearPendingAddEntries method. So it will poll another one, but due to
not equals to the current op, the polled op will not get a chance to be failed, so that the pendingWriteOps will not change to 0.

I have attached the complete logs for the topic: downloaded-logs-20210820-125853.csv

The fix is to complete the add entry op with ManagedLedgerException if the polled op is not equals to the current op.

Here is the log when the issue happens. The producer continues to reconnect to the broker, but the fenced state of the topic is always true.
```
19:01:42.351 [pulsar-io-4-1] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052][persistent://public/default/test-8] Creating producer. producerId=8
19:01:42.352 [Thread-174681] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052] persistent://public/default/test-8 configured with schema false
19:01:42.352 [Thread-174681] WARN  org.apache.pulsar.broker.service.AbstractTopic - [persistent://public/default/test-8] Attempting to add producer to a fenced topic
19:01:42.352 [Thread-174681] ERROR org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052] Failed to add producer to topic persistent://public/default/test-8: producerId=8, org.apache.pulsar.broker.service.BrokerServiceException$TopicFencedException: Topic is temporarily unavailable
```

After check the heap dump of the broker, the `pendingWriteOps` is 5, this is the reason why the topic can not recover from the fenced state.

The topic will change to unfenced only the `pendingWriteOps` is 0, details can find at [PersistentTopic.decrementPendingWriteOpsAndCheck()](https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L463)

But after checking the ML state of the topic, it shows the `pendingAddEntries` is 0 which not equals to `pendingWriteOps` of the topic.
The root cause is we are polling add entry op from the `pendingAddEntries` in multiple threads, one is the the ZK callback thread when complete the ledger creating (https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L1406, https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L1669),
another one is the ML worker thread when complete the add entry op (https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L181)

After the ongoing add entry op complete, but the corresponding op might been polled by the `clearPendingAddEntries` method. So it will poll another one, but due to
not equals to the current op, the polled op will not get a chance to be failed, so that the `pendingWriteOps` will not change to 0.

I have attached the complete logs for the topic:

The fix is to complete the add entry op with ManagedLedgerException if the polled op is not equals to the current op.
@codelipenghui codelipenghui added type/bug The PR fixed a bug or issue reported a bug release/2.8.1 doc-not-needed Your PR changes do not impact docs labels Aug 21, 2021
@codelipenghui codelipenghui added this to the 2.9.0 milestone Aug 21, 2021
@codelipenghui codelipenghui self-assigned this Aug 21, 2021
Copy link
Contributor

@315157973 315157973 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lhotari
Copy link
Member

lhotari commented Aug 23, 2021

@codelipenghui Do you think that this PR could fix #5284 ?

Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lhotari
Copy link
Member

lhotari commented Aug 23, 2021

@codelipenghui Doesn't the problem impact 2.7.x too? Can we cherry-pick to 2.7.4 ?

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I wound if it is the case to add a unit test around this method, in order to prevent regressions in the future. I am not talking about adding something that reproduces the issue, but to add some entry in pendingAddEntries and trigger safeRun

@codelipenghui
Copy link
Contributor Author

@lhotari I have added the label release/2.7.4.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still LGTM
thank you @lhotari for your second pass

Comment on lines 182 to 193
if (firstInQueue == null) {
// The pending op might been polled by others such as cleanup pending op when create Ledger failed.
ReferenceCountUtil.release(data);
this.recycle();
return;
}
if (this != firstInQueue) {
firstInQueue.failed(new ManagedLedgerException("Unexpected add entry op when complete the add entry op."));
ReferenceCountUtil.release(data);
this.recycle();
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one more question about this solution. Would it be necessary to call the callback for the OpAddEntry by using the "failed" method?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. @codelipenghui, can you explain why we call firstInQueue.failed and then release this.data and call this.recycle() without also calling this.failed? Note that the failed method has concurrency controls built in by using the callbackUpdater. Wouldn't we want to use those concurrency controls here as well?

    public void failed(ManagedLedgerException e) {
        AddEntryCallback cb = callbackUpdater.getAndSet(this, null);
        if (cb != null) {
            ReferenceCountUtil.release(data);
            cb.addFailed(e, ctx);
            ml.mbean.recordAddEntryError();
        }
    }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lhotari @michaeljmarshall As I mentioned in the PR description, there are 2 threads polling the add entry op from the queue, but here poll out the op from the queue but does not call the callback method of the op, this will lead to the pendingWriteOps of the topic can't change to 0 (before adding an entry, pendingWriteOps++. after entry add success for failed, pendingWriteOps--), so we should call the callback of the op here to make sure the pendingWriteOps can change to 0. The root cause of the issue is we missed some op callback calls here because the first op in the queue is not equaled to the current OpAddEntry(the previous behavior just throws an exception). So It's not about concurrency controls of the callbackUpdater.

Copy link
Member

@michaeljmarshall michaeljmarshall Aug 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your clarification. I think I understand why we're calling firstInQueue.failed. I don't understand why we're calling this.recycle without also calling addComplete or failed. If I understand correctly, by calling this.recycle, we're going to set callback = null; without calling its callback. If my understanding is incorrect, can you explain why we're recycling this without calling its callback? Thank you.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are 2 threads polling the add entry op from the queue

@codelipenghui The 2 threads polling seem to be part of the problem. Is that so?

One goal of the Managed Ledger design seems to be that there's a single thread per managed ledger executing operations. PR #11387 "Pin executor and scheduled executor threads for ManagedLedgerImpl" fixes some issues where this pinned executor solution wasn't used.

Just curious, would #11387 prevent the 2 threads polling the add entry op from the queue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I think I made a wrong change there, the OpAddEntry.failed() has released the data but looks like missed the object recycle, So we don't need to release it again.

So a thread takes the op from the queue, it should call the callback and release the data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lhotari Yes, I think run the clearPendingAddEntries() in the pinned executor can also fix the problem

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I think I made a wrong change there, the OpAddEntry.failed() has released the data but looks like missed the object recycle, So we don't need to release it again.

So a thread takes the op from the queue, it should call the callback and release the data.

@codelipenghui Why isn't this.failed(...) called when this != firstInQueue ?

I would be expecting something like this:

        if (this != firstInQueue) {
            if (firstInQueue != null) {
                firstInQueue.failed(
                        new ManagedLedgerException("Unexpected add entry op when complete the add entry op."));
            }
            this.failed(new ManagedLedgerException("Unexpected add entry op when complete the add entry op."));
            return;
        }

Is there a reason to not call this.failed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codelipenghui reading your previous answer, I would understand that clearPendingAddEntries is expected to call failed?

Since failed method has concurrency controls, I think it wouldn't cause any harm to call this.failed when this != firstInQueue. No?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, clearPendingAddEntries is expected to call failed.
And yes no harm to call this.failed, I think we only complete the op which comes from the queue is more clear here. All the add entry ops will add the queue, so someone poll the op from the queue, it should make sure the op.callback has been called so that we don't need to care about this.failed, we just need to complete all the ops in the queue.

Comment on lines 182 to 193
if (firstInQueue == null) {
// The pending op might been polled by others such as cleanup pending op when create Ledger failed.
ReferenceCountUtil.release(data);
this.recycle();
return;
}
if (this != firstInQueue) {
firstInQueue.failed(new ManagedLedgerException("Unexpected add entry op when complete the add entry op."));
ReferenceCountUtil.release(data);
this.recycle();
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. @codelipenghui, can you explain why we call firstInQueue.failed and then release this.data and call this.recycle() without also calling this.failed? Note that the failed method has concurrency controls built in by using the callbackUpdater. Wouldn't we want to use those concurrency controls here as well?

    public void failed(ManagedLedgerException e) {
        AddEntryCallback cb = callbackUpdater.getAndSet(this, null);
        if (cb != null) {
            ReferenceCountUtil.release(data);
            cb.addFailed(e, ctx);
            ml.mbean.recordAddEntryError();
        }
    }

Copy link
Contributor

@hangc0276 hangc0276 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

@lhotari lhotari self-requested a review August 24, 2021 12:45
Copy link
Member

@michaeljmarshall michaeljmarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hangc0276 hangc0276 merged commit 1bcbab0 into apache:master Aug 25, 2021
hangc0276 pushed a commit that referenced this pull request Aug 25, 2021
* Fix the topic in fenced state and can not recover.

Here is the log when the issue happens. The producer continues to reconnect to the broker, but the fenced state of the topic is always true.
```
19:01:42.351 [pulsar-io-4-1] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052][persistent://public/default/test-8] Creating producer. producerId=8
19:01:42.352 [Thread-174681] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052] persistent://public/default/test-8 configured with schema false
19:01:42.352 [Thread-174681] WARN  org.apache.pulsar.broker.service.AbstractTopic - [persistent://public/default/test-8] Attempting to add producer to a fenced topic
19:01:42.352 [Thread-174681] ERROR org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052] Failed to add producer to topic persistent://public/default/test-8: producerId=8, org.apache.pulsar.broker.service.BrokerServiceException$TopicFencedException: Topic is temporarily unavailable
```

After check the heap dump of the broker, the `pendingWriteOps` is 5, this is the reason why the topic can not recover from the fenced state.

The topic will change to unfenced only the `pendingWriteOps` is 0, details can find at [PersistentTopic.decrementPendingWriteOpsAndCheck()](https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L463)

But after checking the ML state of the topic, it shows the `pendingAddEntries` is 0 which not equals to `pendingWriteOps` of the topic.
The root cause is we are polling add entry op from the `pendingAddEntries` in multiple threads, one is the the ZK callback thread when complete the ledger creating (https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L1406, https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L1669),
another one is the ML worker thread when complete the add entry op (https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L181)

After the ongoing add entry op complete, but the corresponding op might been polled by the `clearPendingAddEntries` method. So it will poll another one, but due to
not equals to the current op, the polled op will not get a chance to be failed, so that the `pendingWriteOps` will not change to 0.

I have attached the complete logs for the topic:

The fix is to complete the add entry op with ManagedLedgerException if the polled op is not equals to the current op.

* Release buffer.

* Revert

(cherry picked from commit 1bcbab0)
@hangc0276 hangc0276 added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Aug 25, 2021
@codelipenghui codelipenghui deleted the penghui/fix-topic-fenced branch August 25, 2021 06:26
eolivelli pushed a commit to datastax/pulsar that referenced this pull request Sep 14, 2021
* Fix the topic in fenced state and can not recover.

Here is the log when the issue happens. The producer continues to reconnect to the broker, but the fenced state of the topic is always true.
```
19:01:42.351 [pulsar-io-4-1] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052][persistent://public/default/test-8] Creating producer. producerId=8
19:01:42.352 [Thread-174681] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052] persistent://public/default/test-8 configured with schema false
19:01:42.352 [Thread-174681] WARN  org.apache.pulsar.broker.service.AbstractTopic - [persistent://public/default/test-8] Attempting to add producer to a fenced topic
19:01:42.352 [Thread-174681] ERROR org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052] Failed to add producer to topic persistent://public/default/test-8: producerId=8, org.apache.pulsar.broker.service.BrokerServiceException$TopicFencedException: Topic is temporarily unavailable
```

After check the heap dump of the broker, the `pendingWriteOps` is 5, this is the reason why the topic can not recover from the fenced state.

The topic will change to unfenced only the `pendingWriteOps` is 0, details can find at [PersistentTopic.decrementPendingWriteOpsAndCheck()](https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L463)

But after checking the ML state of the topic, it shows the `pendingAddEntries` is 0 which not equals to `pendingWriteOps` of the topic.
The root cause is we are polling add entry op from the `pendingAddEntries` in multiple threads, one is the the ZK callback thread when complete the ledger creating (https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L1406, https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L1669),
another one is the ML worker thread when complete the add entry op (https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L181)

After the ongoing add entry op complete, but the corresponding op might been polled by the `clearPendingAddEntries` method. So it will poll another one, but due to
not equals to the current op, the polled op will not get a chance to be failed, so that the `pendingWriteOps` will not change to 0.

I have attached the complete logs for the topic:

The fix is to complete the add entry op with ManagedLedgerException if the polled op is not equals to the current op.

* Release buffer.

* Revert

(cherry picked from commit 1bcbab0)
codelipenghui added a commit to codelipenghui/incubator-pulsar that referenced this pull request Nov 26, 2021
### Motivation

There are 2 ways to complete the OpAddEntry with exception, one is the bk client callback and another one is `ManagedLedgerImple.clearPendingAddEntries`.
But, if the OpAddEntry be completed more than once, we will get NPE:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.lambda$handleAddFailure$0(OpAddEntry.java:291) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```

Another one:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.addComplete(OpAddEntry.java:153) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.client.AsyncCallback$AddCallback.addCompleteWithLatency(AsyncCallback.java:92) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.PendingAddOp.submitCallback(PendingAddOp.java:431) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle.errorOutPendingAdds(LedgerHandle.java:1799) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle$5.safeRun(LedgerHandle.java:574) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```
apache#12364 tries to fix the NPE by change the state of the OpAddEntry to CLOSED, but the OpAddEntry still will be recyled and be reused.
And when we get the add entry complete callback from the bk client, we will reach here: https://github.com/apache/pulsar/blob/5dbb7d25849f3a037aa522b5d0767801aa0a5096/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L147

But it might to recycle the OpAddEntry already be reused by other entry add operation. It might lead to lost data for this case.

### Modification

So the fix is do not recycle the OpAddEntry when call OpAddEntry.failed() which is introduced by apache#11737.

We should contain this fix in 2.8.2, we have encounter serious problem when unloading the bundles, the topic close will be blocked and never complete
because of LedgerHandle.errorOutPendingAdds() https://github.com/apache/bookkeeper/blob/87579b0a9f18833ee41fcae37582bb68606d68e7/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L574 get NPE but can't throw out and the ledger close callback will never complete.
codelipenghui added a commit that referenced this pull request Nov 27, 2021
### Motivation

There are 2 ways to complete the OpAddEntry with exception, one is the bk client callback and another one is `ManagedLedgerImple.clearPendingAddEntries`.
But, if the OpAddEntry be completed more than once, we will get NPE:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.lambda$handleAddFailure$0(OpAddEntry.java:291) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```

Another one:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.addComplete(OpAddEntry.java:153) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.client.AsyncCallback$AddCallback.addCompleteWithLatency(AsyncCallback.java:92) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.PendingAddOp.submitCallback(PendingAddOp.java:431) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle.errorOutPendingAdds(LedgerHandle.java:1799) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle$5.safeRun(LedgerHandle.java:574) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```
#12364 tries to fix the NPE by change the state of the OpAddEntry to CLOSED, but the OpAddEntry still will be recyled and be reused.
And when we get the add entry complete callback from the bk client, we will reach here: https://github.com/apache/pulsar/blob/5dbb7d25849f3a037aa522b5d0767801aa0a5096/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L147

But it might to recycle the OpAddEntry already be reused by other entry add operation. It might lead to lost data for this case.

### Modification

So the fix is do not recycle the OpAddEntry when call OpAddEntry.failed() which is introduced by #11737.

We should contain this fix in 2.8.2, we have encounter serious problem when unloading the bundles, the topic close will be blocked and never complete
because of LedgerHandle.errorOutPendingAdds() https://github.com/apache/bookkeeper/blob/87579b0a9f18833ee41fcae37582bb68606d68e7/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L574 get NPE but can't throw out and the ledger close callback will never complete.
codelipenghui added a commit that referenced this pull request Nov 27, 2021
### Motivation

There are 2 ways to complete the OpAddEntry with exception, one is the bk client callback and another one is `ManagedLedgerImple.clearPendingAddEntries`.
But, if the OpAddEntry be completed more than once, we will get NPE:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.lambda$handleAddFailure$0(OpAddEntry.java:291) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```

Another one:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.addComplete(OpAddEntry.java:153) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.client.AsyncCallback$AddCallback.addCompleteWithLatency(AsyncCallback.java:92) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.PendingAddOp.submitCallback(PendingAddOp.java:431) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle.errorOutPendingAdds(LedgerHandle.java:1799) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle$5.safeRun(LedgerHandle.java:574) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```
#12364 tries to fix the NPE by change the state of the OpAddEntry to CLOSED, but the OpAddEntry still will be recyled and be reused.
And when we get the add entry complete callback from the bk client, we will reach here: https://github.com/apache/pulsar/blob/5dbb7d25849f3a037aa522b5d0767801aa0a5096/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L147

But it might to recycle the OpAddEntry already be reused by other entry add operation. It might lead to lost data for this case.

### Modification

So the fix is do not recycle the OpAddEntry when call OpAddEntry.failed() which is introduced by #11737.

We should contain this fix in 2.8.2, we have encounter serious problem when unloading the bundles, the topic close will be blocked and never complete
because of LedgerHandle.errorOutPendingAdds() https://github.com/apache/bookkeeper/blob/87579b0a9f18833ee41fcae37582bb68606d68e7/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L574 get NPE but can't throw out and the ledger close callback will never complete.

(cherry picked from commit 3e3622c)
eolivelli pushed a commit to eolivelli/pulsar that referenced this pull request Nov 29, 2021
### Motivation

There are 2 ways to complete the OpAddEntry with exception, one is the bk client callback and another one is `ManagedLedgerImple.clearPendingAddEntries`.
But, if the OpAddEntry be completed more than once, we will get NPE:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.lambda$handleAddFailure$0(OpAddEntry.java:291) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```

Another one:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.addComplete(OpAddEntry.java:153) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.client.AsyncCallback$AddCallback.addCompleteWithLatency(AsyncCallback.java:92) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.PendingAddOp.submitCallback(PendingAddOp.java:431) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle.errorOutPendingAdds(LedgerHandle.java:1799) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle$5.safeRun(LedgerHandle.java:574) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```
apache#12364 tries to fix the NPE by change the state of the OpAddEntry to CLOSED, but the OpAddEntry still will be recyled and be reused.
And when we get the add entry complete callback from the bk client, we will reach here: https://github.com/apache/pulsar/blob/5dbb7d25849f3a037aa522b5d0767801aa0a5096/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L147

But it might to recycle the OpAddEntry already be reused by other entry add operation. It might lead to lost data for this case.

### Modification

So the fix is do not recycle the OpAddEntry when call OpAddEntry.failed() which is introduced by apache#11737.

We should contain this fix in 2.8.2, we have encounter serious problem when unloading the bundles, the topic close will be blocked and never complete
because of LedgerHandle.errorOutPendingAdds() https://github.com/apache/bookkeeper/blob/87579b0a9f18833ee41fcae37582bb68606d68e7/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L574 get NPE but can't throw out and the ledger close callback will never complete.
lhotari pushed a commit to datastax/pulsar that referenced this pull request Dec 3, 2021
### Motivation

There are 2 ways to complete the OpAddEntry with exception, one is the bk client callback and another one is `ManagedLedgerImple.clearPendingAddEntries`.
But, if the OpAddEntry be completed more than once, we will get NPE:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.lambda$handleAddFailure$0(OpAddEntry.java:291) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```

Another one:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.addComplete(OpAddEntry.java:153) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.client.AsyncCallback$AddCallback.addCompleteWithLatency(AsyncCallback.java:92) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.PendingAddOp.submitCallback(PendingAddOp.java:431) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle.errorOutPendingAdds(LedgerHandle.java:1799) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle$5.safeRun(LedgerHandle.java:574) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```
apache#12364 tries to fix the NPE by change the state of the OpAddEntry to CLOSED, but the OpAddEntry still will be recyled and be reused.
And when we get the add entry complete callback from the bk client, we will reach here: https://github.com/apache/pulsar/blob/5dbb7d25849f3a037aa522b5d0767801aa0a5096/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L147

But it might to recycle the OpAddEntry already be reused by other entry add operation. It might lead to lost data for this case.

### Modification

So the fix is do not recycle the OpAddEntry when call OpAddEntry.failed() which is introduced by apache#11737.

We should contain this fix in 2.8.2, we have encounter serious problem when unloading the bundles, the topic close will be blocked and never complete
because of LedgerHandle.errorOutPendingAdds() https://github.com/apache/bookkeeper/blob/87579b0a9f18833ee41fcae37582bb68606d68e7/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L574 get NPE but can't throw out and the ledger close callback will never complete.

(cherry picked from commit 3e3622c)
(cherry picked from commit 628212e)
codelipenghui added a commit that referenced this pull request Dec 11, 2021
* Fix the topic in fenced state and can not recover.

Here is the log when the issue happens. The producer continues to reconnect to the broker, but the fenced state of the topic is always true.
```
19:01:42.351 [pulsar-io-4-1] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052][persistent://public/default/test-8] Creating producer. producerId=8
19:01:42.352 [Thread-174681] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052] persistent://public/default/test-8 configured with schema false
19:01:42.352 [Thread-174681] WARN  org.apache.pulsar.broker.service.AbstractTopic - [persistent://public/default/test-8] Attempting to add producer to a fenced topic
19:01:42.352 [Thread-174681] ERROR org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052] Failed to add producer to topic persistent://public/default/test-8: producerId=8, org.apache.pulsar.broker.service.BrokerServiceException$TopicFencedException: Topic is temporarily unavailable
```

After check the heap dump of the broker, the `pendingWriteOps` is 5, this is the reason why the topic can not recover from the fenced state.

The topic will change to unfenced only the `pendingWriteOps` is 0, details can find at [PersistentTopic.decrementPendingWriteOpsAndCheck()](https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L463)

But after checking the ML state of the topic, it shows the `pendingAddEntries` is 0 which not equals to `pendingWriteOps` of the topic.
The root cause is we are polling add entry op from the `pendingAddEntries` in multiple threads, one is the the ZK callback thread when complete the ledger creating (https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L1406, https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L1669),
another one is the ML worker thread when complete the add entry op (https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L181)

After the ongoing add entry op complete, but the corresponding op might been polled by the `clearPendingAddEntries` method. So it will poll another one, but due to
not equals to the current op, the polled op will not get a chance to be failed, so that the `pendingWriteOps` will not change to 0.

I have attached the complete logs for the topic:

The fix is to complete the add entry op with ManagedLedgerException if the polled op is not equals to the current op.

* Release buffer.

* Revert

(cherry picked from commit 1bcbab0)
@codelipenghui codelipenghui added the cherry-picked/branch-2.7 Archived: 2.7 is end of life label Dec 11, 2021
codelipenghui added a commit that referenced this pull request Dec 11, 2021
### Motivation

There are 2 ways to complete the OpAddEntry with exception, one is the bk client callback and another one is `ManagedLedgerImple.clearPendingAddEntries`.
But, if the OpAddEntry be completed more than once, we will get NPE:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.lambda$handleAddFailure$0(OpAddEntry.java:291) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```

Another one:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.addComplete(OpAddEntry.java:153) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.client.AsyncCallback$AddCallback.addCompleteWithLatency(AsyncCallback.java:92) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.PendingAddOp.submitCallback(PendingAddOp.java:431) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle.errorOutPendingAdds(LedgerHandle.java:1799) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle$5.safeRun(LedgerHandle.java:574) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```
#12364 tries to fix the NPE by change the state of the OpAddEntry to CLOSED, but the OpAddEntry still will be recyled and be reused.
And when we get the add entry complete callback from the bk client, we will reach here: https://github.com/apache/pulsar/blob/5dbb7d25849f3a037aa522b5d0767801aa0a5096/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L147

But it might to recycle the OpAddEntry already be reused by other entry add operation. It might lead to lost data for this case.

### Modification

So the fix is do not recycle the OpAddEntry when call OpAddEntry.failed() which is introduced by #11737.

We should contain this fix in 2.8.2, we have encounter serious problem when unloading the bundles, the topic close will be blocked and never complete
because of LedgerHandle.errorOutPendingAdds() https://github.com/apache/bookkeeper/blob/87579b0a9f18833ee41fcae37582bb68606d68e7/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L574 get NPE but can't throw out and the ledger close callback will never complete.

(cherry picked from commit 3e3622c)
eolivelli pushed a commit that referenced this pull request Dec 15, 2021
### Motivation

There are 2 ways to complete the OpAddEntry with exception, one is the bk client callback and another one is `ManagedLedgerImple.clearPendingAddEntries`.
But, if the OpAddEntry be completed more than once, we will get NPE:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.lambda$handleAddFailure$0(OpAddEntry.java:291) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```

Another one:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.addComplete(OpAddEntry.java:153) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.client.AsyncCallback$AddCallback.addCompleteWithLatency(AsyncCallback.java:92) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.PendingAddOp.submitCallback(PendingAddOp.java:431) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle.errorOutPendingAdds(LedgerHandle.java:1799) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle$5.safeRun(LedgerHandle.java:574) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```
#12364 tries to fix the NPE by change the state of the OpAddEntry to CLOSED, but the OpAddEntry still will be recyled and be reused.
And when we get the add entry complete callback from the bk client, we will reach here: https://github.com/apache/pulsar/blob/5dbb7d25849f3a037aa522b5d0767801aa0a5096/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L147

But it might to recycle the OpAddEntry already be reused by other entry add operation. It might lead to lost data for this case.

### Modification

So the fix is do not recycle the OpAddEntry when call OpAddEntry.failed() which is introduced by #11737.

We should contain this fix in 2.8.2, we have encounter serious problem when unloading the bundles, the topic close will be blocked and never complete
because of LedgerHandle.errorOutPendingAdds() https://github.com/apache/bookkeeper/blob/87579b0a9f18833ee41fcae37582bb68606d68e7/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L574 get NPE but can't throw out and the ledger close callback will never complete.

(cherry picked from commit 3e3622c)
fxbing pushed a commit to fxbing/pulsar that referenced this pull request Dec 19, 2021
### Motivation

There are 2 ways to complete the OpAddEntry with exception, one is the bk client callback and another one is `ManagedLedgerImple.clearPendingAddEntries`.
But, if the OpAddEntry be completed more than once, we will get NPE:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.lambda$handleAddFailure$0(OpAddEntry.java:291) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```

Another one:

```
java.lang.NullPointerException: null
        at org.apache.bookkeeper.mledger.impl.OpAddEntry.addComplete(OpAddEntry.java:153) ~[org.apache.pulsar-managed-ledger-2.8.1.jar:2.8.1]
        at org.apache.bookkeeper.client.AsyncCallback$AddCallback.addCompleteWithLatency(AsyncCallback.java:92) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.PendingAddOp.submitCallback(PendingAddOp.java:431) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle.errorOutPendingAdds(LedgerHandle.java:1799) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.client.LedgerHandle$5.safeRun(LedgerHandle.java:574) ~[org.apache.bookkeeper-bookkeeper-server-4.14.2-2.jar:4.14.2-2]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.2-2.jar:4.14.2-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
```
apache#12364 tries to fix the NPE by change the state of the OpAddEntry to CLOSED, but the OpAddEntry still will be recyled and be reused.
And when we get the add entry complete callback from the bk client, we will reach here: https://github.com/apache/pulsar/blob/5dbb7d25849f3a037aa522b5d0767801aa0a5096/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L147

But it might to recycle the OpAddEntry already be reused by other entry add operation. It might lead to lost data for this case.

### Modification

So the fix is do not recycle the OpAddEntry when call OpAddEntry.failed() which is introduced by apache#11737.

We should contain this fix in 2.8.2, we have encounter serious problem when unloading the bundles, the topic close will be blocked and never complete
because of LedgerHandle.errorOutPendingAdds() https://github.com/apache/bookkeeper/blob/87579b0a9f18833ee41fcae37582bb68606d68e7/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L574 get NPE but can't throw out and the ledger close callback will never complete.
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
* Fix the topic in fenced state and can not recover.

Here is the log when the issue happens. The producer continues to reconnect to the broker, but the fenced state of the topic is always true.
```
19:01:42.351 [pulsar-io-4-1] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052][persistent://public/default/test-8] Creating producer. producerId=8
19:01:42.352 [Thread-174681] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052] persistent://public/default/test-8 configured with schema false
19:01:42.352 [Thread-174681] WARN  org.apache.pulsar.broker.service.AbstractTopic - [persistent://public/default/test-8] Attempting to add producer to a fenced topic
19:01:42.352 [Thread-174681] ERROR org.apache.pulsar.broker.service.ServerCnx - [/10.24.34.151:48052] Failed to add producer to topic persistent://public/default/test-8: producerId=8, org.apache.pulsar.broker.service.BrokerServiceException$TopicFencedException: Topic is temporarily unavailable
```

After check the heap dump of the broker, the `pendingWriteOps` is 5, this is the reason why the topic can not recover from the fenced state.

The topic will change to unfenced only the `pendingWriteOps` is 0, details can find at [PersistentTopic.decrementPendingWriteOpsAndCheck()](https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L463)

But after checking the ML state of the topic, it shows the `pendingAddEntries` is 0 which not equals to `pendingWriteOps` of the topic.
The root cause is we are polling add entry op from the `pendingAddEntries` in multiple threads, one is the the ZK callback thread when complete the ledger creating (https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L1406, https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L1669),
another one is the ML worker thread when complete the add entry op (https://github.com/apache/pulsar/blob/794aa20d9f2a4e668cc36465362d22e042e6e536/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L181)

After the ongoing add entry op complete, but the corresponding op might been polled by the `clearPendingAddEntries` method. So it will poll another one, but due to
not equals to the current op, the polled op will not get a chance to be failed, so that the `pendingWriteOps` will not change to 0.

I have attached the complete logs for the topic:

The fix is to complete the add entry op with ManagedLedgerException if the polled op is not equals to the current op.

* Release buffer.

* Revert
@codelipenghui codelipenghui restored the penghui/fix-topic-fenced branch May 17, 2022 01:23
@codelipenghui codelipenghui deleted the penghui/fix-topic-fenced branch May 17, 2022 01:29
@krishhh5
Copy link

krishhh5 commented Oct 27, 2022

the reason why Topic isTemporarily unavailable is the value “isFenced” is true。the code firstInQueue.failed(newManagedLedgerException) still set isFenced=true because of the addFauled method

public synchronized void addFailed(ManagedLedgerException exception, Object ctx) {
        if (exception instanceof ManagedLedgerFencedException) {
            // If the managed ledger has been fenced, we cannot continue using it. We need to close and reopen
            close();
        } else {
            // fence topic when failed to write a message to BK
            fence();

why not fix like this
firstInQueue.failed(newManagedLedgerFencedException)`
to set isFenced=false

@codelipenghui
Copy link
Contributor Author

@krishhh5

The topic will be closed if we use new ManagedLedgerFencedException. All the producers and consumers will be disconnected. We don't want to close the topic to recover the topic from the wrong state. Instead, we can just make sure the managed ledger will not miss any callback for add entry operations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-picked/branch-2.7 Archived: 2.7 is end of life cherry-picked/branch-2.8 Archived: 2.8 is end of life doc-not-needed Your PR changes do not impact docs release/2.7.4 release/2.8.1 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants