Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR] Disable replay in session mode when rate limit is active #3854

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

romtsn
Copy link
Member

@romtsn romtsn commented Nov 6, 2024

📜 Description

  • Add mappings for DataCategory <-> ItemType to make replays respect rate-limiting
  • Check if rate-limiting is active for Replays or All before recording a screenshot in session mode to not consume disk space. In session mode we capture segments every 5s so if a rate limit is active for e.g. 10 minutes we'd quickly overflow the envelope cache with replay segments. In buffer mode, however, it's still fine to capture the replay because it'll be only one envelope that we store if an error occurs.

💡 Motivation and Context

Part of getsentry/sentry#74441

💚 How did you test it?

Manually + automated

📝 Checklist

  • I reviewed the submitted code.
  • I added tests to verify the changes.
  • No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled.
  • I updated the docs if needed.
  • Review from the native team if needed.
  • No breaking change or entry added to the changelog.
  • No breaking change for hybrid SDKs or communicated to hybrid SDKs.

🔮 Next steps

@romtsn romtsn changed the title [SR] Do not capture screenshots in session mode when rate limit is active [SR] Disable replay in session mode when rate limit is active Nov 6, 2024
Copy link
Contributor

github-actions bot commented Nov 6, 2024

🚨 Detected changes in high risk code 🚨

High-risk code has higher potential to break the SDK and may be hard to test. To prevent severe bugs, apply the rollout process for releasing such changes and be extra careful when changing and reviewing these files:

  • sentry/src/main/java/io/sentry/transport/RateLimiter.java

@romtsn romtsn requested a review from brustolin November 6, 2024 22:47
Copy link
Contributor

github-actions bot commented Nov 6, 2024

🚨 Detected changes in high risk code 🚨

High-risk code has higher potential to break the SDK and may be hard to test. To prevent severe bugs, apply the rollout process for releasing such changes and be extra careful when changing and reviewing these files:

  • sentry/src/main/java/io/sentry/transport/RateLimiter.java

Copy link
Contributor

github-actions bot commented Nov 6, 2024

Performance metrics 🚀

  Plain With Sentry Diff
Startup time 392.38 ms 452.42 ms 60.04 ms
Size 1.70 MiB 2.35 MiB 668.54 KiB

Comment on lines +83 to +88
val rateLimiter = hub?.rateLimiter
if (rateLimiter?.isActiveForCategory(All) == true || rateLimiter?.isActiveForCategory(Replay) == true) {
options.logger.log(DEBUG, "Skipping screenshot recording, rate limit is active")
bitmap?.recycle()
return
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brustolin and @romtsn I think this is a different behaviour than on iOS getsentry/sentry-cocoa#4496. On iOS, when a limit for SR gets active, we drop everything until the next session starts. Please align.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in our last replay meeting, we agreed to start with the basics and stop session replay altogether for the remainder of the app session.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be more work for me to stop the entire replay than just not recording screenshots tbh. Besides, what if rate-limit is active for let's say 30seconds when we''re just starting the replay and because of that we're going to drop the entire replay? i'd prefer to do our best and start capturing from when the rate-limit is lifted?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd prefer to do our best and start capturing from when the rate-limit is lifted?

That's also my opinion. This would also simplify the logic.

Copy link

@brustolin brustolin Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But your current logic is not taking into account that the envelope with segment 0 could be the one that was lost. It does not matter anymore to send to following segments, they all will be ignored in the server.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, this approach is just throwing away the already-taken screenshot. Maybe Android is different; for iOS, making the screenshot is the most expensive part of the process.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another point to take into account, someone can correct me on this one, is that the most common form of rate limiting is quota exceeded. And this will only be lifted at the end of the month.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But your current logic is not taking into account that the envelope with segment 0 could be the one that was lost. It does not matter anymore to send to following segments, they all will be ignored in the server.

Why would it be lost? We're not capturing envelopes if rate limit is active right? Or do you mean if we get rate-limited upon sending the segment 0 event? In this case on Android it'd be stored in cache until rate-limit is lifted. The only case it gets lost if the cache gets rotated, but it's an edge-case enough to not handle it for now I guess.

Btw, this approach is just throwing away the already-taken screenshot. Maybe Android is different; for iOS, making the screenshot is the most expensive part of the process.

That's a good point, I should probably move this logic to where we capture the screenshot 👍

Another point to take into account, someone can correct me on this one, is that the most common form of rate limiting is quota exceeded. And this will only be lifted at the end of the month.

Probably yes, but not sure about the end of the month (could be on-demand events etc etc). I think this is also mostly relevant for big customers who have spike protection and whatnot in place, so we shouldn't think in "months" here probably :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this will only be lifted at the end of the month.

Nope, spike protection works on an hour basis: https://docs.sentry.io/pricing/quotas/spike-protection/#how-sentry-detects-spikes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have enough insight into SR. I let you two decide this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants