Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][broker] Fix NPE when reset Replicator's cursor by position. #20597

Conversation

lifepuzzlefun
Copy link
Contributor

@lifepuzzlefun lifepuzzlefun commented Jun 17, 2023

Motivation

When init in PersistentReplicator, the field of PersistentSubscription is null.
see :

this.expiryMonitor = new PersistentMessageExpiryMonitor(localTopicName,
Codec.decode(cursor.getName()), cursor, null);

public PersistentMessageExpiryMonitor(String topicName, String subscriptionName, ManagedCursor cursor,
PersistentSubscription subscription) {
this.topicName = topicName;

so an NPE throws when executing expireMessages(Position messagePosition):

public boolean expireMessages(Position messagePosition) {
// If it's beyond last position of this topic, do nothing.
if (((PositionImpl) subscription.getTopic().getLastPosition()).compareTo((PositionImpl) messagePosition) < 0) {
if (log.isDebugEnabled()) {
log.debug("[{}][{}] Ignore expire-message scheduled task, given position {} is beyond "
+ "current topic's last position {}", topicName, subName, messagePosition,
subscription.getTopic().getLastPosition());
}
return false;
}

Modifications

save PersistentTopic in PersistentMessageExpiryMonitor field.

ensure when call expireMessages(Position messagePosition) only check topic LastPosition by topic. not null subscription.

Verifying this change

add new test to verify the method wont throw npe. but not the logic.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jun 17, 2023
Copy link
Contributor

@poorbarcode poorbarcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch

@poorbarcode poorbarcode added this to the 3.1.0 milestone Jun 28, 2023
@poorbarcode poorbarcode added the type/bug The PR fixed a bug or issue reported a bug label Jun 28, 2023
@Technoboy- Technoboy- force-pushed the fix_npe_when_expire_replicator_message_by_position branch from 5a7dfd1 to 8c9b1d3 Compare June 29, 2023 12:08
@codecov-commenter
Copy link

Codecov Report

Merging #20597 (8c9b1d3) into master (0e60340) will increase coverage by 0.51%.
The diff coverage is 73.07%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #20597      +/-   ##
============================================
+ Coverage     72.60%   73.11%   +0.51%     
+ Complexity    32018    31936      -82     
============================================
  Files          1855     1867      +12     
  Lines        138569   138764     +195     
  Branches      15250    15265      +15     
============================================
+ Hits         100605   101458     +853     
+ Misses        29945    29258     -687     
- Partials       8019     8048      +29     
Flag Coverage Δ
inttests 24.05% <19.23%> (-0.06%) ⬇️
systests 25.04% <15.38%> (?)
unittests 72.39% <73.07%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...pulsar/broker/admin/impl/PersistentTopicsBase.java 64.30% <63.15%> (+0.17%) ⬆️
...ice/persistent/PersistentMessageExpiryMonitor.java 70.10% <100.00%> (+1.35%) ⬆️
...roker/service/persistent/PersistentReplicator.java 69.46% <100.00%> (+0.67%) ⬆️
...ker/service/persistent/PersistentSubscription.java 75.91% <100.00%> (ø)

... and 144 files with indirect coverage changes

Comment on lines -60 to +63
public PersistentMessageExpiryMonitor(String topicName, String subscriptionName, ManagedCursor cursor,
PersistentSubscription subscription) {
this.topicName = topicName;
public PersistentMessageExpiryMonitor(PersistentTopic topic, String subscriptionName, ManagedCursor cursor,
@Nullable PersistentSubscription subscription) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR modifies the public API (even it's in the pulsar-broker module), we should not cherry-pick it to release branches unless we make it compatible. @poorbarcode @Technoboy-

liangyepianzhou added a commit that referenced this pull request Jul 11, 2023
liangyepianzhou pushed a commit that referenced this pull request Jul 11, 2023
BewareMyPower added a commit to BewareMyPower/pulsar that referenced this pull request Jul 13, 2023
### Motivation

When I reviewed apache#20597, the
unrelated changes in `PersistentTopicsBase` are hard to read. The logic
could be simplified to:

```java
PersistentSubscription sub = null;
PersistentReplicator repl = null;
if (metSomeCondition()) {
    repl = /* ... */;
    if (repl == null) {
        /* ... */
        return;
    }
} else {
    sub = /* ... */;
    if (repl == null) {
        /* ... */
        return;
    }
}
final PersistentSubscription finalSub = sub;
final PersistentReplicator finalRepl = repl;
future.thenAccept(__ -> {
    if (metSomeCondition()) {
        repl.expireMessages(/* ... */);
    } else {
        sub.expireMessages(/* ... */);
    }
});
```

The code above is such a mess. It adds two final variables because the
lambda can only capture final variables. The `metSomeCondition` check is
performed unnecessarily twice. The original code is more hard to read
because the logic in `/* ... */` takes a few lines so that the two calls
of `metSomeCondition()` are not near.

From the code search I see all these classes implement two
`expireMessages` methods that accept an integer or a position.

- PersistentMessageExpiryMonitor
- PersistentSubscription
- PersistentReplicator
- NonPersistentSubscription

The code can be simplified to introduce a new interface.

### Modifications

Introduce a `MessageExpirer` interface and change the class hierarchy to:

```
// [I] is interface, [C] is class
[I] MessageExpirer
  [I] Subscription
    [C] PersistentSubscription
    [C] NonPersistentSubscription
  [C] PersistentReplicator
  [C] PersistentMessageExpiryMonitor
```

The method invocation can be simplified much as shown in this patch.

P.S. Inserting such an interface in the type hierarchy does not even
break the ABI compatibility, see https://docs.oracle.com/javase/specs/jls/se7/html/jls-13.html
BewareMyPower added a commit to BewareMyPower/pulsar that referenced this pull request Jul 13, 2023
### Motivation

When I reviewed apache#20597, the
unrelated changes in `PersistentTopicsBase` are hard to read. The logic
could be simplified to:

```java
PersistentSubscription sub = null;
PersistentReplicator repl = null;
if (metSomeCondition()) {
    repl = /* ... */;
    if (repl == null) {
        /* ... */
        return;
    }
} else {
    sub = /* ... */;
    if (repl == null) {
        /* ... */
        return;
    }
}
final PersistentSubscription finalSub = sub;
final PersistentReplicator finalRepl = repl;
future.thenAccept(__ -> {
    if (metSomeCondition()) {
        repl.expireMessages(/* ... */);
    } else {
        sub.expireMessages(/* ... */);
    }
});
```

The code above is such a mess. It adds two final variables because the
lambda can only capture final variables. The `metSomeCondition` check is
performed unnecessarily twice. The original code is more hard to read
because the logic in `/* ... */` takes a few lines so that the two calls
of `metSomeCondition()` are not near.

From the code search I see all these classes implement two
`expireMessages` methods that accept an integer or a position.

- PersistentMessageExpiryMonitor
- PersistentSubscription
- PersistentReplicator
- NonPersistentSubscription

The code can be simplified to introduce a new interface.

### Modifications

Introduce a `MessageExpirer` interface and change the class hierarchy to:

```
// [I] is interface, [C] is class
[I] MessageExpirer
  [I] Subscription
    [C] PersistentSubscription
    [C] NonPersistentSubscription
  [C] PersistentReplicator
  [C] PersistentMessageExpiryMonitor
```

The method invocation can be simplified much as shown in this patch.

P.S. Inserting such an interface in the type hierarchy does not even
break the ABI compatibility, see https://docs.oracle.com/javase/specs/jls/se7/html/jls-13.html
shibd pushed a commit to shibd/pulsar that referenced this pull request Jul 16, 2023
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jul 17, 2023
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jul 17, 2023
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jul 17, 2023
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jul 17, 2023
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Apr 13, 2024
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Apr 17, 2024
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Apr 19, 2024
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Apr 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-picked/branch-2.10 doc-not-needed Your PR changes do not impact docs ready-to-test release/2.10.5 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants