You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Apr 1, 2024. It is now read-only.
However, there is no common interface to get the specific field like ledger id and entry id. These details might be not much useful to application users, but they are important to developers of Pulsar and its ecosystems. Currently, we can only access the specific implementations directly. So there are a lot of unnecessary type assumptions checks like
Another problem is that when users want to create a MessageId used in seek or acknowledge, they have to create instances of these implementations defined in pulsar-client module and the impl package. Any API change to these implementations could bring a breaking change. However, it should be allowed to make some modifications on the impl package, otherwise differing api and impl would be meaningless.
Goal
All the problems are all caused by the lack of abstraction of MessageIdData. There is no interface to represent the MessageIdData. This proposal aims at adding a common interface to access the fields of MessageIdData so that all usages of msgId instanceof XXXImpl could be simplified to something like (MessageIdAdv) msgId
API Changes
Introduce a new interface to represent a MessageIdData.
packageorg.apache.pulsar.client.api;
/** * The {@link MessageId} interface provided for advanced users. * <p> * It supports retrieving any field of {@link org.apache.pulsar.common.api.proto.MessageIdData}, which is generated * from `PulsarApi.proto`. */publicinterfaceMessageIdAdvextendsMessageId {
longgetLedgerId();
longgetEntryId();
defaultintgetPartition() {
return -1;
}
defaultintgetBatchIndex() {
return -1;
}
default@NullableBitSetgetAckSet() {
returnnull;
}
defaultintgetBatchSize() {
return0;
}
default@NullableMessageIdAdvgetFirstChunkMessageId() {
returnnull;
}
defaultbooleanisBatch() {
returngetBatchIndex() >= 0 && getBatchSize() > 0;
}
}
Since the aimed developers are Pulsar core developers, it's added in the pulsar-common module (PulsarApi.proto is also in this module), not the pulsar-client-api module.
To avoid naming conflicts with proto.MessageIdData, the interface name just adds the PulsarApi prefix to represent it's a representation of the message Id defined in PulsarApi.proto.
Only getLedgerId and getEntryId are required because when seeking to a specific position, the partition field is not needed. For example, if users want to create its own implementation for seek or acknowledge, they can create an implementation like:
@AllArgsConstructorprivatestaticclassNonBatchedMessageIdimplementsMessageIdAdv {
// For non-batched message id in a single topic, only ledger id and entry id are requiredprivatefinallongledgerId;
privatefinallongentryId;
@Overridepublicbyte[] toByteArray() {
returnnewbyte[0]; // dummy implementation
}
@OverridepubliclonggetLedgerId() {
returnledgerId;
}
@OverridepubliclonggetEntryId() {
returnentryId;
}
}
Implementation
Most modifications are replacing the msgId instanceof XXXImpl with (MessageIdAdv) msgId. And some methods like TopicMessageIdImpl#getInnerMessageId will be marked as deprecated. They might need to be retained for one or more major releases for compatibility.
There is a point that since we use a BitSet to represent the ack set, which is a long array defined in PulsarApi.proto.
We have to deprecated the BatchMessageAcker, which is just a wrapper of a BitSet and the batch size. After that, we no longer needs to acknowledge one message of the batch like:
if (msgIdinstanceofBatchMessageIdImpl) {
((BatchMessageIdImpl) msgId).getAcker().ackIndividual();
}
Use the getAckSet() API and clear the specific bits of the BitSet instead.
Original Issue: apache#18950
Motivation
MessageIdData
is defined inPulsarApi.proto
: https://github.com/apache/pulsar/blob/a35670d83b0b3f2fb63d11e4fb222d7f24099c9a/pulsar-common/src/main/proto/PulsarApi.proto#L57-L67However, there is no common interface to get the specific field like ledger id and entry id. These details might be not much useful to application users, but they are important to developers of Pulsar and its ecosystems. Currently, we can only access the specific implementations directly. So there are a lot of unnecessary type assumptions checks like
https://github.com/apache/pulsar/blob/a35670d83b0b3f2fb63d11e4fb222d7f24099c9a/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ResetCursorData.java#L70-L81
And for
TopicMessageIdImpl
, we have to check theMessageId
is aTopicMessageIdImpl
and then call thegetInnerMessageId()
method:https://github.com/apache/pulsar/blob/a35670d83b0b3f2fb63d11e4fb222d7f24099c9a/pulsar-client/src/main/java/org/apache/pulsar/client/impl/MultiTopicsConsumerImpl.java#L457
https://github.com/apache/pulsar/blob/a35670d83b0b3f2fb63d11e4fb222d7f24099c9a/pulsar-client/src/main/java/org/apache/pulsar/client/impl/MultiTopicsConsumerImpl.java#L467
Another problem is that when users want to create a MessageId used in
seek
oracknowledge
, they have to create instances of these implementations defined inpulsar-client
module and theimpl
package. Any API change to these implementations could bring a breaking change. However, it should be allowed to make some modifications on theimpl
package, otherwise differingapi
andimpl
would be meaningless.Goal
All the problems are all caused by the lack of abstraction of
MessageIdData
. There is no interface to represent theMessageIdData
. This proposal aims at adding a common interface to access the fields ofMessageIdData
so that all usages ofmsgId instanceof XXXImpl
could be simplified to something like(MessageIdAdv) msgId
API Changes
Introduce a new interface to represent a
MessageIdData
.Since the aimed developers are Pulsar core developers, it's added in the
pulsar-common
module (PulsarApi.proto
is also in this module), not thepulsar-client-api
module.To avoid naming conflicts with
proto.MessageIdData
, the interface name just adds thePulsarApi
prefix to represent it's a representation of the message Id defined inPulsarApi.proto
.Only
getLedgerId
andgetEntryId
are required because when seeking to a specific position, thepartition
field is not needed. For example, if users want to create its own implementation forseek
oracknowledge
, they can create an implementation like:Implementation
Most modifications are replacing the
msgId instanceof XXXImpl
with(MessageIdAdv) msgId
. And some methods likeTopicMessageIdImpl#getInnerMessageId
will be marked as deprecated. They might need to be retained for one or more major releases for compatibility.There is a point that since we use a
BitSet
to represent the ack set, which is a long array defined inPulsarApi.proto
.https://github.com/apache/pulsar/blob/a35670d83b0b3f2fb63d11e4fb222d7f24099c9a/pulsar-common/src/main/proto/PulsarApi.proto#L62
We have to deprecated the
BatchMessageAcker
, which is just a wrapper of aBitSet
and the batch size. After that, we no longer needs to acknowledge one message of the batch like:Use the
getAckSet()
API and clear the specific bits of theBitSet
instead.Alternatives
Add the getters to the
MessageId
directly. This idea was denied from the discussion here: https://lists.apache.org/thread/rdkqnkohbmkjjs61hvoqplhhngr0b0sdAnything else?
No response
The text was updated successfully, but these errors were encountered: