Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UploadStreamToBlockBlob: only require body be io.Reader #15958

Conversation

serprex
Copy link
Contributor

@serprex serprex commented Oct 27, 2021

This function only takes an io.Reader in azure-storage-blob-go,
& wal-g abstracts over multiple upload mechanisms which all operate on io.Reader

@ghost ghost added the Storage Storage Service (Queues, Blobs, Files) label Oct 27, 2021
@ghost
Copy link

ghost commented Oct 27, 2021

Thank you for your contribution serprex! We will review the pull request and get back to you soon.

@ghost ghost added the customer-reported Issues that are reported by GitHub users external to the Azure organization. label Oct 27, 2021
@mohsha-msft mohsha-msft changed the base branch from main to dev-azblob November 3, 2021 04:14
@mohsha-msft mohsha-msft changed the base branch from dev-azblob to main November 3, 2021 04:15
Copy link
Contributor

@mohsha-msft mohsha-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Track2 (New SDK) was introduced

  1. To provide more stable SDKs to avoid (or minimize) breaking changes for the end-user.
  2. To introduce big design changes and decouple components such as logging, authentication, etc.

We've kept the similar architecture and use case across all new SDKs. So, I'll have to decline to this PR. Can you tell me why it should be changed?

@@ -33,7 +33,7 @@ type blockWriter interface {
// well, 4 MiB or 8 MiB, and auto-scale to as many goroutines within the memory limit. This gives a single dial to tweak and we can
// choose a max value for the memory setting based on internal transfers within Azure (which will give us the maximum throughput model).
// We can even provide a utility to dial this number in for customer networks to optimize their copies.
func copyFromReader(ctx context.Context, from io.ReadSeekCloser, to blockWriter, o UploadStreamToBlockBlobOptions) (BlockBlobCommitBlockListResponse, error) {
func copyFromReader(ctx context.Context, from io.Reader, to blockWriter, o UploadStreamToBlockBlobOptions) (BlockBlobCommitBlockListResponse, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're moving to use io.ReadSeekCloser across all the SDKs. You can write simple struct like here

Copy link
Contributor Author

@serprex serprex Nov 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I'd implemented that in my PR to workaround for now, but then if this interface does change to use these unused methods that'll introduce a bug rather than a compile time error. Unless your api holds itself to not break callers who rely on the implicit contract that you won't call close/seek, in which case you don't have the flexibility to make use of ReadSeekCloser interface in the future anyways so you're stuck to io.Reader implicitly. It also makes the functions more confusing to use because then I can't tell from the function signature whether the reader will be kept open or not

But alright, if there's some design ideology at play I'll have to accept it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For context, we require a seekable stream to facilitate our retry logic. If the PUT request fails, we can rewind the stream to try again.

@mohsha-msft I'm confused in this case. The API takes a ReadSeekCloser but doesn't actually require a Seeker or a Closer. Is it used elsewhere (maybe via a type-assertion)?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@serprex sorry, just to clarify, this is a private method right? How are you leveraging it exactly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's called from BlockBlobClient.UploadStreamToBlockBlob() which also takes a ReadSeekCloser

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this requirement is hard to meet in services.
Often I stream data right from a client through my service to blob storage. The HTTP connection I am feeding from can't be a Seeker. If azblob wants a retry mechanism, I think it should implement this internally with some kind of (configurable?) buffer. The burden should not be on the caller.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zezha-msft I thought we had an API that requires just an io.Reader for exactly this reason and does what's described above?

This function only takes an io.Reader in azure-storage-blob-go,
& wal-g abstracts over multiple upload mechanisms which all operate on io.Reader
@serprex serprex force-pushed the upload-stream-to-block-blob-takes-io-reader branch from cbdf650 to ba8db25 Compare November 8, 2021 23:03
@mohsha-msft mohsha-msft changed the base branch from main to dev-azblob December 16, 2021 08:00
@mohsha-msft
Copy link
Contributor

Running CI.

@mohsha-msft mohsha-msft merged commit ece5efb into Azure:dev-azblob Dec 16, 2021
serprex pushed a commit to wal-g/wal-g that referenced this pull request Dec 30, 2021
Specific commit used because I'm not sure when a new version will be made,
but there's a particularly important fix:
Azure/azure-sdk-for-go#16354 fixes ListBlobsHierarchy when paging is necessary

There are two other PRs:
Azure/azure-sdk-for-go#16067 fixes a chunk pooling bug
Azure/azure-sdk-for-go#15958 removes the need for FakeIoReadCloser
serprex pushed a commit to wal-g/wal-g that referenced this pull request Dec 30, 2021
Specific commit used because I'm not sure when a new version will be made,
but there's a particularly important fix:
Azure/azure-sdk-for-go#16354 fixes ListBlobsHierarchy when paging is necessary

There are two other PRs:
Azure/azure-sdk-for-go#16067 fixes a chunk pooling bug
Azure/azure-sdk-for-go#15958 removes the need for FakeIoReadCloser
serprex pushed a commit to wal-g/wal-g that referenced this pull request Dec 30, 2021
Specific commit used because I'm not sure when a new version will be made,
but there's a particularly important fix:
Azure/azure-sdk-for-go#16354 fixes ListBlobsHierarchy when paging is necessary

There are two other PRs:
Azure/azure-sdk-for-go#16067 fixes a chunk pooling bug
Azure/azure-sdk-for-go#15958 removes the need for FakeIoReadCloser
serprex pushed a commit to wal-g/wal-g that referenced this pull request Dec 30, 2021
Specific commit used because I'm not sure when a new version will be made,
but there's a particularly important fix:
Azure/azure-sdk-for-go#16354 fixes ListBlobsHierarchy when paging is necessary

There are two other PRs:
Azure/azure-sdk-for-go#16067 fixes a chunk pooling bug
Azure/azure-sdk-for-go#15958 removes the need for FakeIoReadCloser
jhendrixMSFT added a commit that referenced this pull request Jan 13, 2022
* Make azblob.BlobClient.downloadBlobToWriterAt public (#15899)

* azblob.downloadBlobToWriterAt: initialDownloadResponse: remove as unused

* azblob.DownloadBlobToWriterAt: make public

* Replace the advancer (#16354)

* Handle error response XML with whitespace (#16639)

* [chunkwriting] Return original buffer to the pool (#16067)

When the buffer isn't filled to capacity a new slice is created that's
smaller than the original. In that case the smaller slice is returned to
the pool which prevents the rest of the capacity from being used again.

The solution is to pass the original slice through and attach the
length. This allows the original slice to be returned to the pool once
the operation is complete.

This change also simplifies the `sendChunk` method, ensuring that the
buffer is returned to the TransferManager even when no bytes were read
from the reader.

* UploadStreamToBlockBlob: only require body be io.Reader (#15958)

This function only takes an io.Reader in azure-storage-blob-go,
& wal-g abstracts over multiple upload mechanisms which all operate on io.Reader

* uncomment o.Progress(progress) in getDownloadBlobOptions() (#16727)

* Updated azblob README sample code (#16721)

* Updated README sample code

container name should be all lowercase

* Fix typo GetAccountSASToken to GetSASToken

I believe 'GetAccountSASToken' was written by mistake. It's inconsistent with what is described on line 97 and I don't see this method in the documentation.

* [KeyVault] updating examples to use ClientSecretCredential, fix parsing keyID (#16689)

* updating examples to use ClientSecretCredential, fix parsing keyID

* fixing constructor test URL

* adding test for no keyid

* key version, not key ID, better error reporting

* changes from working with maor and daniel

* working for both hsm and non hsm, need to fix up for recorded tests

* fixed implementation, i think...

* working for hsm too

* working challenge policy for hsm and non hsm

* new recordings

* adding all final recordings

* reveerting back to DefaultAzCred

* using streaming package from azcore

* Add spell check warnings (#16656)

* Add spell check warnings

* Basic cspell.json

* Ignore files in .vscode except cspell.json, new line before EOF in cspell.json

* Spell check, ignore thyself

* Adding Smoketests to nightly runs (#16226)

* Adding Smoketests to nightly runs

* updating location, fixes to script

* starting go script

* finishing script

* updating yml file

* formatting

* adding snippet for finding go code

* adding funcitonality for copying examples

* trimming out unused funcs

* fixed regexp, thanks benbp

* fixing smoke test program to create go.mod file correctly, update powershell for nightly

* removing need for argument in go program, updating yml and powershell to reflect

* scripts not common

* smoketests, plural not singular

* finally got the right directory

* fixed script locally, running into permissions issue on ci

* updating script to exit properly, logging an error instead of panicing

* manually set go111module to on

* removing references to go111module

* issue with duplicated function names...

* updating to only pull examples from the service directory if one is provided

* runs samples now too!

* adding 'go run .' step to ps1, triggering for tables

* adding step to analyze.yml file

* adding debugging for ci

* updating to work in ci

* updating to specify go module name, removing print statements

* updating scripts to fmt for prettier printing, find all environment variables

* working on loading environment variables from file

* removing env vars from example_test.go for testing

* adding the environment variable portion to the generated main.go file

* forgot to remove change to nightly script

* adding import to the main file

* cleaning up code, adding comments

* don't import os if no env vars

* small changes for checking all packages

* removing _test suffix on copied files

* converting to use cobra for better support

* formatting

* Sync eng/common directory with azure-sdk-tools for PR 2464 (#16730)

* Support AAD graph and Microsoft Graph service principal APIs

* Consolidate service principal wrapper creation

Co-authored-by: Ben Broderick Phillips <[email protected]>

* Make EndpointToScope resilient to subdomains (#16737)

* Add user assigned managed identity example to docs (#16738)

* [KeyVault] fixing broken live test (#16752)

* fixing broken live test

* coverage

* forcing ci

* Sync eng/common directory with azure-sdk-tools for PR 2484 (#16753)

* Add weekly pipeline generation to prepare-pipelines template

* Add succeeded condition to pipeline generation pipelines

Co-authored-by: Ben Broderick Phillips <[email protected]>

* Release v61.1.0 1641448664 (#16762)

* Generated from specification/automation/resource-manager/readme.md tag package-2020-01-13-preview (commit hash: 3b9b0a930e29cbead33df69ae46c7080408e4c0f)

* Generated from specification/compute/resource-manager/readme.md tag package-2021-08-01 (commit hash: 3b9b0a930e29cbead33df69ae46c7080408e4c0f)

* v61.1.0

* Handle skipping docker build when PushImages is set and there is no dockerfile (#16555)

Co-authored-by: Ben Broderick Phillips <[email protected]>

* add new config (#16774)

* Add offline test for Azure Arc managed identity (#16771)

* Rename armmanagedapplications ci.yml files to avoid pipeline name definition collisions (#16770)

* Refactor IMDS discovery to remove probing, stop caching failures (#16267)

* Sync eng/common directory with azure-sdk-tools for PR 2500 (#16779)

* Update pipeline generator tool feed to azure-sdk-for-net

* Update pipeline generator tool version

Co-authored-by: Ben Broderick Phillips <[email protected]>

* feat: add generator cmd to generate or update readme.go.md file for track2 sdk param (#16780)

* [azservicebus] Updating to handle breaking changes in azcore (#16776)

Updating sb to handle breaking changes in azcore

At the moment we're not using azcore's HTTP library, so we don't need to test for specific errors that come from azcore.

* [Core] Bump version of internal (#16793)

<!--
Thank you for contributing to the Azure SDK for Go.

Please verify the following before submitting your PR, thank you!
-->

- [ ] The purpose of this PR is explained in this or a referenced issue.
- [ ] The PR does not update generated files.
   - These files are managed by the codegen framework at [Azure/autorest.go][].
- [ ] Tests are included and/or updated for code changes.
- [ ] Updates to [CHANGELOG.md][] are included.
- [ ] MIT license headers are included in each file.

[Azure/autorest.go]: https://github.com/Azure/autorest.go
[CHANGELOG.md]: https://github.com/Azure/azure-sdk-for-go/blob/main/CHANGELOG.md

* [Core] Update Changelog.md (#16794)

Update changelog for release

* Update CHANGELOG.md (#16795)

* Increment version for azcore releases (#16798)

Increment package version after release of azcore

* [azidentity] Making ChainedTokenCredential re-use the first successful credential (#16392)

This PR makes it so instances of `ChainedTokenCredential` will now re-use the first successful credential on `GetToken` calls.

Fixed #16268

* Align authentication errors with azcore.ResponseError (#16777)

* Increment version for azidentity releases (#16799)

Increment package version after release of azidentity

* [KeyVault] release ready for keyvault keys (#16731)

* release ready for keyvault keys

* updating the api surface to latest azcore

* updating to ResponseError

* update with latest codegen

* fixing ci

* formatting

* bumping azcore to v0.21.0

* updating azidentity and autorest version

* updating go.sum

* final upgrade for changelog

Co-authored-by: Joel Hendrix <[email protected]>

* [Tables] preparing tables for release (#16733)

* preparing tables for release

* prepping for release with latest azcore

* update with latest code generator

* formatting

* updating to released azcore version

* upgrading azidentity

* updating autorest.go version

* final changes to readme

Co-authored-by: Joel Hendrix <[email protected]>

* [KeyVault] prepare secrets for release (#16732)

* prepare secrets for release

* updating to latest azcore

* update with latest codegen

* fixing secret version issues

* formatting

* manual checkout of azkeys

* updating to azcore v0.21.0

* updating autorest.go and azidentity

* updating Changelog

* updating changelog

* updating changelog

* modifying moduleVersion

* undoing changes to keys and tables

Co-authored-by: Joel Hendrix <[email protected]>

* chore: pump codegen version in scripts (#16802)

* Update azblob with the latest azcore (#16784)

* Update azblob with the latest azcore

This tactially resolves the small number of breaking changes in azcore.

* clean-up

Co-authored-by: Benoit Perrot <[email protected]>
Co-authored-by: Mohit Sharma <[email protected]>
Co-authored-by: adreed-msft <[email protected]>
Co-authored-by: John Stairs <[email protected]>
Co-authored-by: Philip Dubé <[email protected]>
Co-authored-by: Brandon Kurtz <[email protected]>
Co-authored-by: Kim Ying <[email protected]>
Co-authored-by: Sean Kane <[email protected]>
Co-authored-by: Daniel Jurek <[email protected]>
Co-authored-by: Azure SDK Bot <[email protected]>
Co-authored-by: Ben Broderick Phillips <[email protected]>
Co-authored-by: Charles Lowell <[email protected]>
Co-authored-by: Jiahui Peng <[email protected]>
Co-authored-by: Dapeng Zhang <[email protected]>
Co-authored-by: Chenjie Shi <[email protected]>
Co-authored-by: Richard Park <[email protected]>
Co-authored-by: Daniel Rodríguez <[email protected]>
serprex pushed a commit to citusdata/wal-g that referenced this pull request Feb 5, 2022
Specific commit used because I'm not sure when a new version will be made,
but there's a particularly important fix:
Azure/azure-sdk-for-go#16354 fixes ListBlobsHierarchy when paging is necessary

There are two other PRs:
Azure/azure-sdk-for-go#16067 fixes a chunk pooling bug
Azure/azure-sdk-for-go#15958 removes the need for FakeIoReadCloser
serprex pushed a commit to citusdata/wal-g that referenced this pull request Feb 8, 2022
Specific commit used because I'm not sure when a new version will be made,
but there's a particularly important fix:
Azure/azure-sdk-for-go#16354 fixes ListBlobsHierarchy when paging is necessary
which turns out to be a broken fix, so a fork is used in the meantime which includes a proper fix:
Azure/azure-sdk-for-go#16992

On some containers paging happens much sooner than MaxResults would imply,
causing restoring LATEST to either fail because wal-g can't find backups, or wal-g selects an old backup

There are two other notable PRs:
Azure/azure-sdk-for-go#16067 fixes a chunk pooling bug
Azure/azure-sdk-for-go#15958 removes the need for FakeIoReadCloser
serprex pushed a commit to wal-g/wal-g that referenced this pull request Feb 8, 2022
Specific commit used because I'm not sure when a new version will be made,
but there's a particularly important fix:
Azure/azure-sdk-for-go#16354 fixes ListBlobsHierarchy when paging is necessary
which turns out to be a broken fix, so a fork is used in the meantime which includes a proper fix:
Azure/azure-sdk-for-go#16992

On some containers paging happens much sooner than MaxResults would imply,
causing restoring LATEST to either fail because wal-g can't find backups, or wal-g selects an old backup

There are two other notable PRs:
Azure/azure-sdk-for-go#16067 fixes a chunk pooling bug
Azure/azure-sdk-for-go#15958 removes the need for FakeIoReadCloser
serprex pushed a commit to wal-g/wal-g that referenced this pull request Feb 8, 2022
Specific commit used because I'm not sure when a new version will be made,
but there's a particularly important fix:
Azure/azure-sdk-for-go#16354 fixes ListBlobsHierarchy when paging is necessary
which turns out to be a broken fix, so a fork is used in the meantime which includes a proper fix:
Azure/azure-sdk-for-go#16992

On some containers paging happens much sooner than MaxResults would imply,
causing restoring LATEST to either fail because wal-g can't find backups, or wal-g selects an old backup

There are two other notable PRs:
Azure/azure-sdk-for-go#16067 fixes a chunk pooling bug
Azure/azure-sdk-for-go#15958 removes the need for FakeIoReadCloser
serprex pushed a commit to wal-g/wal-g that referenced this pull request Feb 14, 2022
Specific commit used because I'm not sure when a new version will be made,
but there's a particularly important fix:
Azure/azure-sdk-for-go#16354 fixes ListBlobsHierarchy when paging is necessary
which turns out to be a broken fix, so a fork is used in the meantime which includes a proper fix:
Azure/azure-sdk-for-go#16992

On some containers paging happens much sooner than MaxResults would imply,
causing restoring LATEST to either fail because wal-g can't find backups, or wal-g selects an old backup

There are two other notable PRs:
Azure/azure-sdk-for-go#16067 fixes a chunk pooling bug
Azure/azure-sdk-for-go#15958 removes the need for FakeIoReadCloser
serprex pushed a commit to wal-g/wal-g that referenced this pull request Feb 14, 2022
Specific commit used because I'm not sure when a new version will be made,
but there's a particularly important fix:
Azure/azure-sdk-for-go#16354 fixes ListBlobsHierarchy when paging is necessary
which turns out to be a broken fix, so a fork is used in the meantime which includes a proper fix:
Azure/azure-sdk-for-go#16992

On some containers paging happens much sooner than MaxResults would imply,
causing restoring LATEST to either fail because wal-g can't find backups, or wal-g selects an old backup

There are two other notable PRs:
Azure/azure-sdk-for-go#16067 fixes a chunk pooling bug
Azure/azure-sdk-for-go#15958 removes the need for FakeIoReadCloser
serprex pushed a commit to wal-g/wal-g that referenced this pull request Feb 14, 2022
Specific commit used because I'm not sure when a new version will be made,
but there's a particularly important fix:
Azure/azure-sdk-for-go#16354 fixes ListBlobsHierarchy when paging is necessary
which turns out to be a broken fix, so a fork is used in the meantime which includes a proper fix:
Azure/azure-sdk-for-go#16992

On some containers paging happens much sooner than MaxResults would imply,
causing restoring LATEST to either fail because wal-g can't find backups, or wal-g selects an old backup

There are two other notable PRs:
Azure/azure-sdk-for-go#16067 fixes a chunk pooling bug
Azure/azure-sdk-for-go#15958 removes the need for FakeIoReadCloser
serprex pushed a commit to wal-g/wal-g that referenced this pull request Feb 15, 2022
Specific commit used because I'm not sure when a new version will be made,
but there's a particularly important fix:
Azure/azure-sdk-for-go#16354 fixes ListBlobsHierarchy when paging is necessary
which turns out to be a broken fix, so a fork is used in the meantime which includes a proper fix:
Azure/azure-sdk-for-go#16992

On some containers paging happens much sooner than MaxResults would imply,
causing restoring LATEST to either fail because wal-g can't find backups, or wal-g selects an old backup

There are two other notable PRs:
Azure/azure-sdk-for-go#16067 fixes a chunk pooling bug
Azure/azure-sdk-for-go#15958 removes the need for FakeIoReadCloser
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
customer-reported Issues that are reported by GitHub users external to the Azure organization. Storage Storage Service (Queues, Blobs, Files)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants