Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feat] PIP-352: Event time based compaction #22517

Merged
merged 7 commits into from
Aug 23, 2024

Conversation

marekczajkowski
Copy link
Contributor

@marekczajkowski marekczajkowski commented Apr 16, 2024

Copy link

@marekczajkowski Please add the following content to your PR description and select a checkbox:

- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @marekczajkowski . I added an initial review.

@lhotari
Copy link
Member

lhotari commented May 15, 2024

One of the challenges with event time order compacting is that it would require some additional configuration and logic for the acceptable time skew (late arrival of events). @marekczajkowski Any thoughts on how to solve that challenge?

@lhotari
Copy link
Member

lhotari commented May 15, 2024

One of the challenges with event time order compacting is that it would require some additional configuration and logic for the acceptable time skew (late arrival of events). @marekczajkowski Any thoughts on how to solve that challenge?

I might be overthinking this. Perhaps it doesn't matter that much. :)

@marekczajkowski
Copy link
Contributor Author

One of the challenges with event time order compacting is that it would require some additional configuration and logic for the acceptable time skew (late arrival of events). @marekczajkowski Any thoughts on how to solve that challenge?

Can you explain what you mean by that ? I don't quite get it

@lhotari
Copy link
Member

lhotari commented May 20, 2024

One of the challenges with event time order compacting is that it would require some additional configuration and logic for the acceptable time skew (late arrival of events). @marekczajkowski Any thoughts on how to solve that challenge?

Can you explain what you mean by that ? I don't quite get it

I think I just wasn't thinking through it. It's not a problem since the last message with the most recent event time "wins". It's eventually consistent.

@marekczajkowski marekczajkowski marked this pull request as ready for review May 29, 2024 08:14
@lhotari lhotari changed the title Event time based compaction PIP-352: Event time based compaction Aug 1, 2024
@lhotari lhotari changed the title PIP-352: Event time based compaction [feat] PIP-352: Event time based compaction Aug 1, 2024
@lhotari
Copy link
Member

lhotari commented Aug 1, 2024

@marekczajkowski Do you have tests for the new behavior? How about the solution for configuring the compactor?

@marekczajkowski
Copy link
Contributor Author

@marekczajkowski Do you have tests for the new behavior? How about the solution for configuring the compactor?

I have added unit tests. As for configuring the compactor there is no new mechanism. It uses existing 'compactionServiceFactoryClassName' property name to switch the factory that creates compactor

@lhotari
Copy link
Member

lhotari commented Aug 13, 2024

I have added unit tests. As for configuring the compactor there is no new mechanism. It uses existing 'compactionServiceFactoryClassName' property name to switch the factory that creates compactor

Thanks. The PR itself is ok.

Configuring the compactor will require some updates in documentation. It would be great if our documention presents this feature, the purpose of it and shows how to enable it. I guess the minimal approach would be to add the configuration key to broker.conf and standalone.conf and document the configuration key in the comments.

@lhotari lhotari added this to the 4.0.0 milestone Aug 13, 2024
@marekczajkowski
Copy link
Contributor Author

I have added unit tests. As for configuring the compactor there is no new mechanism. It uses existing 'compactionServiceFactoryClassName' property name to switch the factory that creates compactor

Thanks. The PR itself is ok.

Configuring the compactor will require some updates in documentation. It would be great if our documention presents this feature, the purpose of it and shows how to enable it. I guess the minimal approach would be to add the configuration key to broker.conf and standalone.conf and document the configuration key in the comments.

Sure. Both configs updated with documenting comments

Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Good work @marekczajkowski ! Thanks for contributing!

@marekczajkowski
Copy link
Contributor Author

/pulsarbot rerun-failure-checks

@codecov-commenter
Copy link

codecov-commenter commented Aug 23, 2024

Codecov Report

Attention: Patch coverage is 70.55394% with 101 lines in your changes missing coverage. Please review.

Project coverage is 74.53%. Comparing base (bbc6224) to head (e0a1c70).
Report is 543 commits behind head on master.

Files Patch % Lines
...e/pulsar/compaction/AbstractTwoPhaseCompactor.java 76.88% 37 Missing and 12 partials ⚠️
...che/pulsar/compaction/EventTimeOrderCompactor.java 36.76% 35 Missing and 8 partials ⚠️
.../compaction/EventTimeCompactionServiceFactory.java 0.00% 5 Missing ⚠️
...he/pulsar/compaction/PublishingOrderCompactor.java 93.18% 2 Missing and 1 partial ⚠️
...g/apache/pulsar/client/impl/RawBatchConverter.java 90.90% 0 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #22517      +/-   ##
============================================
+ Coverage     73.57%   74.53%   +0.96%     
- Complexity    32624    33654    +1030     
============================================
  Files          1877     1924      +47     
  Lines        139502   144548    +5046     
  Branches      15299    15822     +523     
============================================
+ Hits         102638   107743    +5105     
+ Misses        28908    28543     -365     
- Partials       7956     8262     +306     
Flag Coverage Δ
inttests 27.50% <41.98%> (+2.92%) ⬆️
systests 24.70% <31.19%> (+0.37%) ⬆️
unittests 73.89% <70.26%> (+1.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...va/org/apache/pulsar/compaction/CompactorTool.java 68.91% <100.00%> (ø)
...pache/pulsar/compaction/MessageCompactionData.java 100.00% <100.00%> (ø)
...sar/compaction/PulsarCompactionServiceFactory.java 82.60% <100.00%> (ø)
.../pulsar/compaction/StrategicTwoPhaseCompactor.java 73.72% <ø> (-2.97%) ⬇️
...g/apache/pulsar/client/impl/RawBatchConverter.java 94.25% <90.90%> (+0.35%) ⬆️
...he/pulsar/compaction/PublishingOrderCompactor.java 93.18% <93.18%> (ø)
.../compaction/EventTimeCompactionServiceFactory.java 0.00% <0.00%> (ø)
...che/pulsar/compaction/EventTimeOrderCompactor.java 36.76% <36.76%> (ø)
...e/pulsar/compaction/AbstractTwoPhaseCompactor.java 76.88% <76.88%> (ø)

... and 496 files with indirect coverage changes

@lhotari lhotari merged commit 1c495e1 into apache:master Aug 23, 2024
51 checks passed
grssam pushed a commit to grssam/pulsar that referenced this pull request Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants