feat(storage): faster InsertObject() uploads #9997
With this change the client library validates message boundaries for `InsertObject()` **only** if requested. The new default makes sense as (1) it saves about 2ms of client CPU time for 128MiB uploads, (2) if there is a collision this will be detected by the service via the upload checksums, and (3) the probability of a collision is small. The current implementation has a population of more than $10^{100}$ strings to pick from. The probability of finding this string in an EB of data is less than $1 / 10^{95}$. You could upload an EB of data per second for a billion years and still not find a collision.
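The behavior described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the names `MakeRandomBoundary`, `BoundaryCollides`, and `PickBoundary`, the 62-symbol alphabet, and the 64-character length are all assumptions made for the example.

```cpp
#include <cstddef>
#include <random>
#include <string>
#include <string_view>

// Hypothetical helper: returns `length` characters drawn uniformly at random
// from a 62-symbol alphanumeric alphabet (an assumption for this sketch).
template <typename Generator>
std::string MakeRandomBoundary(Generator& gen, std::size_t length) {
  static constexpr std::string_view kAlphabet =
      "abcdefghijklmnopqrstuvwxyz"
      "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
      "0123456789";
  std::uniform_int_distribution<std::size_t> pick(0, kAlphabet.size() - 1);
  std::string boundary;
  boundary.reserve(length);
  for (std::size_t i = 0; i != length; ++i) {
    boundary.push_back(kAlphabet[pick(gen)]);
  }
  return boundary;
}

// Hypothetical opt-in check: scans the whole payload for the boundary.
// This linear scan is the client CPU cost that the new default avoids.
inline bool BoundaryCollides(std::string_view payload,
                             std::string_view boundary) {
  return payload.find(boundary) != std::string_view::npos;
}

// Picks a boundary optimistically. The payload is only scanned (and the
// boundary regenerated on a collision) when `validate` is true.
template <typename Generator>
std::string PickBoundary(Generator& gen, std::string_view payload,
                         bool validate) {
  constexpr std::size_t kLength = 64;  // illustrative length, not from the PR
  auto boundary = MakeRandomBoundary(gen, kLength);
  while (validate && BoundaryCollides(payload, boundary)) {
    boundary = MakeRandomBoundary(gen, kLength);
  }
  return boundary;
}
```

With `validate` set to `false` the function never reads the payload, so its cost does not depend on the upload size. An undetected collision would corrupt the multipart framing, which is why the description relies on the service-side checksum as the backstop.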
Google Cloud Build Logs
ℹ️ NOTE: Kokoro logs are linked from "Details" below.
Codecov Report

Base: 94.22% // Head: 94.23% // Increases project coverage by

Additional details and impacted files

Coverage Diff

| | main | #9997 | +/- |
|---|---|---|---|
| Coverage | 94.22% | 94.23% | |
| Files | 1501 | 1501 | |
| Lines | 141086 | 141085 | -1 |
| Hits | 132943 | 132948 | +5 |
| Misses | 8143 | 8137 | -6 |
Obligatory reminder to update the commit message.
With this change the client library will optimistically use a randomly generated string as a message boundary, but will not check whether the string appears in the payload.
The new behavior makes sense as (1) it saves about 2ms of client CPU time for 128MiB uploads, (2) if there is a collision this will be detected by the service via the upload checksums, and (3) the probability of a collision is small.
The current implementation has a population of more than $10^{100}$ strings to pick from. The probability of finding this string in an EB of data is less than $1 / 10^{95}$. You could upload an EB of data per second for a billion years and still not find a collision.
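One way to sanity-check figures like these is a union bound: a payload of $B$ bytes has at most $B$ positions where the boundary could start, and a boundary drawn uniformly from $N$ candidates, independently of the payload, matches any fixed position with probability $1/N$, so the collision probability is at most $B/N$. The sketch below evaluates that bound in log space. The function name and the example parameters (62 symbols, 64 characters) are assumptions for illustration, not values taken from the library.

```cpp
#include <cmath>

// Returns log10 of the union bound B / N on the probability that a random
// boundary appears somewhere in a payload, where B = 10^log10_payload_bytes
// and N = alphabet_size^boundary_length. Assumes the boundary is uniform
// over its population and independent of the payload.
inline double Log10CollisionBound(double log10_payload_bytes,
                                  int alphabet_size, int boundary_length) {
  double const log10_population =
      boundary_length * std::log10(static_cast<double>(alphabet_size));
  return log10_payload_bytes - log10_population;
}
```

For example, `Log10CollisionBound(18.0, 62, 64)` models one EB ($10^{18}$ bytes) against a hypothetical population of $62^{64} \approx 10^{114.7}$ strings and returns roughly $-96.7$, which is below the $10^{-95}$ threshold quoted above. Under this bound, a population of exactly $10^{100}$ would only give $10^{-82}$ for one EB, so the quoted figure presumably reflects a population well above that lower limit.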