Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][client] Fix consumer can't consume resent chunked messages #21070

Merged
merged 1 commit into from
Aug 29, 2023

Conversation

RobertIndie
Copy link
Member

@RobertIndie RobertIndie commented Aug 26, 2023

Motivation

Current, when the producer resend the chunked message like this:

  • M1: UUID: 0, ChunkID: 0
  • M2: UUID: 0, ChunkID: 0 // Resend the first chunk
  • M3: UUID: 0, ChunkID: 1

When the consumer received the M2, it will find that it's already tracking the UUID:0 chunked messages, and will then discard the message M1 and M2. This will lead to unable to consume the whole chunked message even though it's already persisted in the Pulsar topic.

Here is the code logic:

if (msgMetadata.getChunkId() == 0) {
if (chunkedMsgCtx != null) {
// The first chunk of a new chunked-message received before receiving other chunks of previous
// chunked-message
// so, remove previous chunked-message from map and release buffer
if (chunkedMsgCtx.chunkedMsgBuffer != null) {
ReferenceCountUtil.safeRelease(chunkedMsgCtx.chunkedMsgBuffer);
}
chunkedMsgCtx.recycle();
chunkedMessagesMap.remove(msgMetadata.getUuid());
}
pendingChunkedMessageCount++;
if (maxPendingChunkedMessage > 0 && pendingChunkedMessageCount > maxPendingChunkedMessage) {
removeOldestPendingChunkedMessage();
}
int totalChunks = msgMetadata.getNumChunksFromMsg();
ByteBuf chunkedMsgBuffer = PulsarByteBufAllocator.DEFAULT.buffer(msgMetadata.getTotalChunkMsgSize(),
msgMetadata.getTotalChunkMsgSize());
chunkedMsgCtx = chunkedMessagesMap.computeIfAbsent(msgMetadata.getUuid(),
(key) -> ChunkedMessageCtx.get(totalChunks, chunkedMsgBuffer));
pendingChunkedMessageUuidQueue.add(msgMetadata.getUuid());
}
// discard message if chunk is out-of-order
if (chunkedMsgCtx == null || chunkedMsgCtx.chunkedMsgBuffer == null
|| msgMetadata.getChunkId() != (chunkedMsgCtx.lastChunkedMessageId + 1)) {
// means we lost the first chunk: should never happen
log.info("Received unexpected chunk messageId {}, last-chunk-id{}, chunkId = {}", msgId,
(chunkedMsgCtx != null ? chunkedMsgCtx.lastChunkedMessageId : null), msgMetadata.getChunkId());
if (chunkedMsgCtx != null) {
if (chunkedMsgCtx.chunkedMsgBuffer != null) {
ReferenceCountUtil.safeRelease(chunkedMsgCtx.chunkedMsgBuffer);
}
chunkedMsgCtx.recycle();
}
chunkedMessagesMap.remove(msgMetadata.getUuid());
compressedPayload.release();
increaseAvailablePermits(cnx);
if (expireTimeOfIncompleteChunkedMessageMillis > 0
&& System.currentTimeMillis() > (msgMetadata.getPublishTime()
+ expireTimeOfIncompleteChunkedMessageMillis)) {
doAcknowledge(msgId, AckType.Individual, Collections.emptyMap(), null);
} else {
trackMessage(msgId);
}
return null;
}

The bug can be easily reproduced using the testcase testResendChunkMessages introduced by this PR.

Modifications

  • When receiving the new duplicated first chunk of a chunked message, the consumer discard the current chunked message context and create a new context to track the following messages. For the case mentioned in Motivation, the M1 will be released and the consumer will assemble M2 and M3 as the chunked message.

Verifying this change

This change added tests.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@github-actions
Copy link

@RobertIndie Please add the following content to your PR description and select a checkbox:

- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

@RobertIndie RobertIndie added doc-not-needed Your PR changes do not impact docs release/3.0.2 and removed doc-label-missing labels Aug 26, 2023
@github-actions github-actions bot added doc-label-missing and removed doc-not-needed Your PR changes do not impact docs labels Aug 26, 2023
@RobertIndie RobertIndie added release/3.1.1 doc-not-needed Your PR changes do not impact docs and removed doc-label-missing labels Aug 26, 2023
@github-actions github-actions bot added doc-label-missing and removed doc-not-needed Your PR changes do not impact docs labels Aug 26, 2023
@github-actions github-actions bot added doc-not-needed Your PR changes do not impact docs and removed doc-label-missing labels Aug 28, 2023
@RobertIndie RobertIndie merged commit eb2e3a2 into apache:master Aug 29, 2023
54 of 60 checks passed
@RobertIndie RobertIndie deleted the resend-chunk branch August 29, 2023 16:23
RobertIndie added a commit that referenced this pull request Aug 31, 2023
)

### Motivation

Current, when the producer resend the chunked message like this:
- M1: UUID: 0, ChunkID: 0
- M2: UUID: 0, ChunkID: 0 // Resend the first chunk
- M3: UUID: 0, ChunkID: 1

When the consumer received the M2, it will find that it's already tracking the UUID:0 chunked messages, and will then discard the message M1 and M2. This will lead to unable to consume the whole chunked message even though it's already persisted in the Pulsar topic.

Here is the code logic:
https://github.com/apache/pulsar/blob/44a055b8a55078bcf93f4904991598541aa6c1ee/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1436-L1482

The bug can be easily reproduced using the testcase `testResendChunkMessages` introduced by this PR.

### Modifications

- When receiving the new duplicated first chunk of a chunked message, the consumer discard the current chunked message context and create a new context to track the following messages. For the case mentioned in Motivation, the M1 will be released and the consumer will assemble M2 and M3 as the chunked message.

(cherry picked from commit eb2e3a2)
RobertIndie added a commit that referenced this pull request Aug 31, 2023
)

### Motivation

Current, when the producer resend the chunked message like this:
- M1: UUID: 0, ChunkID: 0
- M2: UUID: 0, ChunkID: 0 // Resend the first chunk
- M3: UUID: 0, ChunkID: 1

When the consumer received the M2, it will find that it's already tracking the UUID:0 chunked messages, and will then discard the message M1 and M2. This will lead to unable to consume the whole chunked message even though it's already persisted in the Pulsar topic.

Here is the code logic:
https://github.com/apache/pulsar/blob/44a055b8a55078bcf93f4904991598541aa6c1ee/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1436-L1482

The bug can be easily reproduced using the testcase `testResendChunkMessages` introduced by this PR.

### Modifications

- When receiving the new duplicated first chunk of a chunked message, the consumer discard the current chunked message context and create a new context to track the following messages. For the case mentioned in Motivation, the M1 will be released and the consumer will assemble M2 and M3 as the chunked message.

(cherry picked from commit eb2e3a2)
liangyepianzhou pushed a commit that referenced this pull request Sep 4, 2023
)

Current, when the producer resend the chunked message like this:
- M1: UUID: 0, ChunkID: 0
- M2: UUID: 0, ChunkID: 0 // Resend the first chunk
- M3: UUID: 0, ChunkID: 1

When the consumer received the M2, it will find that it's already tracking the UUID:0 chunked messages, and will then discard the message M1 and M2. This will lead to unable to consume the whole chunked message even though it's already persisted in the Pulsar topic.

Here is the code logic:
https://github.com/apache/pulsar/blob/44a055b8a55078bcf93f4904991598541aa6c1ee/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1436-L1482

The bug can be easily reproduced using the testcase `testResendChunkMessages` introduced by this PR.

- When receiving the new duplicated first chunk of a chunked message, the consumer discard the current chunked message context and create a new context to track the following messages. For the case mentioned in Motivation, the M1 will be released and the consumer will assemble M2 and M3 as the chunked message.

(cherry picked from commit eb2e3a2)
liangyepianzhou pushed a commit that referenced this pull request Sep 4, 2023
)

Current, when the producer resend the chunked message like this:
- M1: UUID: 0, ChunkID: 0
- M2: UUID: 0, ChunkID: 0 // Resend the first chunk
- M3: UUID: 0, ChunkID: 1

When the consumer received the M2, it will find that it's already tracking the UUID:0 chunked messages, and will then discard the message M1 and M2. This will lead to unable to consume the whole chunked message even though it's already persisted in the Pulsar topic.

Here is the code logic:
https://github.com/apache/pulsar/blob/44a055b8a55078bcf93f4904991598541aa6c1ee/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1436-L1482

The bug can be easily reproduced using the testcase `testResendChunkMessages` introduced by this PR.

- When receiving the new duplicated first chunk of a chunked message, the consumer discard the current chunked message context and create a new context to track the following messages. For the case mentioned in Motivation, the M1 will be released and the consumer will assemble M2 and M3 as the chunked message.

(cherry picked from commit eb2e3a2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants