This repository has been archived by the owner on Jun 4, 2021. It is now read-only.

Kafka and Natss Channels Redelivery #1114

Closed
pierDipi wants to merge 3 commits

Conversation

pierDipi
Member

@pierDipi pierDipi commented Apr 7, 2020

Fixes #933
Fixes #1105
Fixes #753
Fixes #596

Proposed Changes

  • Kafka consumer exponential backoff retries
  • E2E test for Kafka Channel redelivery
  • Fix Natss Channel redelivery
  • E2E test for Natss Channel redelivery

Signed-off-by: Pierangelo Di Pilato <[email protected]>
@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Apr 7, 2020
@knative-prow-robot knative-prow-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Apr 7, 2020
@knative-prow-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pierDipi
To complete the pull request process, please assign harwayne
You can assign the PR to them by writing /assign @harwayne in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot
Contributor

Hi @pierDipi. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot knative-prow-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 7, 2020
Signed-off-by: Pierangelo Di Pilato <[email protected]>
Member

@mattmoor mattmoor left a comment

Produced via:
prettier --write --prose-wrap=always $(find -name '*.md' | grep -v vendor | grep -v .github | grep -v docs/cmd/)

test/test_images/droplogevents/README.md (outdated, resolved)
test/test_images/droplogevents/README.md (resolved)
This was referenced Apr 7, 2020
@lionelvillard
Member

nice!

/ok-to-test

@knative-prow-robot knative-prow-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 7, 2020
session sarama.ConsumerGroupSession,
message *sarama.ConsumerMessage,
) error {
return wait.ExponentialBackoff(consumer.backoff, func() (bool, error) {
Member

We want to rely on the CloudEvents SDK to do the retry. Also look at the delivery spec for additional strategies.

Contributor

Not here: we use the bindings APIs, so we don't rely on the client.

Signed-off-by: Pierangelo Di Pilato <[email protected]>
@slinkydeveloper
Contributor

@pierDipi thanks for starting to tackle this, but I'm not sure this should be solved at the kafkachannel level...
IMO this is a problem that should be solved in the dispatcher, specifically in the message sender in the kncloudevents package.

@slinkydeveloper
Contributor

And I would love to explore a library before implementing this manually

@knative-metrics-robot

The following is the coverage report on the affected files.
Say /test pull-knative-eventing-contrib-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| kafka/channel/pkg/dispatcher/dispatcher.go | 58.9% | 61.2% | 2.3 |
| kafka/common/pkg/kafka/consumer_handler.go | 88.9% | 88.5% | -0.4 |

@slinkydeveloper
Contributor

This also gives us the ability to have fine-grained retries: e.g., if the destination accepts the message but the reply fails, there is no point in retrying the destination.

@knative-prow-robot
Contributor

@pierDipi: The following test failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| pull-knative-eventing-contrib-build-tests | 05ce793 | link | /test pull-knative-eventing-contrib-build-tests |

Full PR test history. Your PR dashboard.

@pierDipi
Member Author

pierDipi commented Apr 7, 2020

> This also gives us the ability to have fine-grained retries: e.g., if the destination accepts the message but the reply fails, there is no point in retrying the destination.

Agree

> I would love to explore a library before implementing this manually

We can use the k8s.io/apimachinery/pkg/util/wait package for linear and exponential back-off.
Note: the package only has a function called ExponentialBackoff, but we can still implement linear back-off.

@slinkydeveloper
Contributor

> We can use the k8s.io/apimachinery/pkg/util/wait package for linear and exponential back-off.
> Note: the package only has a function called ExponentialBackoff, but we can still implement linear back-off.

Yeah, I think it's definitely a good idea, and given that we already depend on this package, it's even better.

@slinkydeveloper
Contributor

I think you should close this and work on the dispatcher directly, so that the retry is transparent to the channel implementation.

@pierDipi
Member Author

pierDipi commented Apr 7, 2020

I think e2e tests here are needed

@pierDipi
Member Author

pierDipi commented Apr 7, 2020

Or should we move them to eventing and use something like TestBrokerChannelFlow?

@slinkydeveloper
Contributor

Yeah, maybe we should have them directly in eventing.

@pierDipi pierDipi closed this Apr 14, 2020
Labels
area/test-and-release cla: yes Indicates the PR's author has signed the CLA. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
7 participants