S3 sync recursively with per-object metadata #2045
Comments
I'm -1 on adding that. I don't think providing that kind of mapping is a very good experience. At that point you're effectively setting everything manually anyway, so it would take just as much time to perform all those requests. As far as using our code, we don't guarantee we won't break internals. However, it is MIT licensed so feel free to vendor or copy it.
In my use case, the metadata is precomputed against the objects I'm trying to store and placed in a storage backend (details not important, but e.g. MongoDB). All I'm doing is retrieving the data from that backend and storing it with the objects. If I do this object-by-object, then I need to recreate threaded uploads, multipart handling, sync strategies, etc -- all of the things that the |
+1 for me. I think this is a reasonable request. Out of all the proposed solutions, I like the metadata JSON file the best. I'm inclined to mark this as a feature request. @jcmcken One other thing worth considering is the work @kyleknap's been doing for s3transfer. It's still under active development so I wouldn't recommend it for general use just yet, but the idea is to create a good Python API for the functionality that's currently exposed in the AWS CLI.
@jamesls Since s3transfer is still very much in active development, do you have a recommendation for syncing with per-file metadata? node-s3-client is the most promising library I've come across, but the project seems to be having problems with the underlying AWS SDK, see andrewrk/node-s3-client#129
Good Morning! We're closing this issue here on GitHub, as part of our migration to UserVoice for feature requests involving the AWS CLI. This will let us get the most important features to you, by making it easier to search for and show support for the features you care the most about, without diluting the conversation with bug reports. As a quick UserVoice primer (if not already familiar): after an idea is posted, people can vote on the ideas, and the product team will be responding directly to the most popular suggestions. We’ve imported existing feature requests from GitHub - Search for this issue there! And don't worry, this issue will still exist on GitHub for posterity's sake. As it’s a text-only import of the original post into UserVoice, we’ll still be keeping in mind the comments and discussion that already exist here on the GitHub issue. GitHub will remain the channel for reporting bugs. Once again, this issue can now be found by searching for the title on: https://aws.uservoice.com/forums/598381-aws-command-line-interface -The AWS SDKs & Tools Team
Based on community feedback, we have decided to return feature requests to GitHub issues.
I'd like this to exist and am willing to spend some time building it. What is the best way to proceed here? I can jump right to submitting a PR for the single metadata JSON file, but would it be helpful to discuss design / implementation strategy first? I've never committed to this repo before, so if there are any pointers to related code / suggested supporting infrastructure, I'm all ears.
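For anyone following along, the "single metadata JSON file" idea could be prototyped outside the CLI roughly as below. This is only a sketch under assumptions, not a design the maintainers agreed on: the manifest layout (a JSON object mapping each path relative to the sync root to a flat metadata object) and the helper names `load_manifest` and `plan_uploads` are hypothetical.

```python
import json
import os


def load_manifest(path):
    """Load the manifest: a JSON object of {relative_path: {meta_key: value}}.

    The layout is an assumption made for this sketch.
    """
    with open(path, encoding="utf-8") as fh:
        manifest = json.load(fh)
    if not isinstance(manifest, dict):
        raise ValueError("manifest must be a JSON object")
    return manifest


def plan_uploads(root, manifest, prefix=""):
    """Return (local_path, key, metadata) tuples for every file under root.

    Files that have no manifest entry get empty metadata.
    """
    plan = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            local_path = os.path.join(dirpath, name)
            # Use forward slashes so keys look the same on every platform.
            rel = os.path.relpath(local_path, root).replace(os.sep, "/")
            # S3 user metadata is string-valued, so coerce keys and values.
            metadata = {str(k): str(v) for k, v in manifest.get(rel, {}).items()}
            plan.append((local_path, prefix + rel, metadata))
    # Sort by key so the plan is deterministic.
    return sorted(plan, key=lambda item: item[1])
```

The plan only decides which metadata goes with which key. It does not compare against the bucket, so it is not a replacement for the sync strategies. Keeping the manifest outside the sync root avoids it being planned for upload itself.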
Hi @pgriess, thanks for your willingness to contribute. If you want to create a PR then I recommend reading the contributing guide here: https://github.com/aws/aws-cli/blob/master/CONTRIBUTING.md You can expand on your proposed implementation here or in a PR. I think looking through these s3 sync customizations is a good place to start: https://github.com/aws/aws-cli/tree/develop/awscli/customizations/s3/syncstrategy
I'm looking to take advantage of the `aws s3 sync` command, but provide per-object metadata (i.e. metadata that can change per object) rather than provide global metadata with `--metadata`.
Right now, I have basically a couple of options:
`sync` command.
What would be nice is if I could somehow indicate to the CLI that I want to map each object to a set of metadata, and then upload each object with that metadata. A couple of solutions come to mind:
`sync` command read the metadata for each object prior to uploading. For example, I could have a local directory:
(So when this is run, the `$filename.meta` files would just be read for metadata, and would not be transferred)
Alternatively, what would be really great is if the syncing functionality were available independently of the CLI from within Python (without requiring me to figure out the internals of how to properly initialize the CLI environment, etc.), so that I could subclass and customize the process. I started going down this route somewhat, but am worried that this API is not for public consumption and would break in the future.
Any thoughts?
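To make the `$filename.meta` idea concrete, here is a minimal sketch of how it might look from Python. It rests on several assumptions: each sidecar file holds a flat JSON object, the helper names (`read_sidecar`, `iter_objects`, `upload_all`) are hypothetical, and `client` is anything exposing boto3's `upload_file(Filename, Bucket, Key, ExtraArgs=...)` method, e.g. `boto3.client("s3")`. It uploads every file unconditionally, so it does not reproduce the CLI's sync comparison.

```python
import json
import os

META_SUFFIX = ".meta"  # assumed sidecar naming: "<filename>.meta"


def read_sidecar(path):
    """Return the metadata from path + ".meta", or {} if there is no sidecar."""
    sidecar = path + META_SUFFIX
    if not os.path.isfile(sidecar):
        return {}
    with open(sidecar, encoding="utf-8") as fh:
        data = json.load(fh)  # assumed format: a flat JSON object
    # S3 user metadata is string-valued, so coerce keys and values.
    return {str(k): str(v) for k, v in data.items()}


def iter_objects(root):
    """Yield (local_path, key, metadata), skipping the sidecar files themselves."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.endswith(META_SUFFIX):
                continue  # sidecars are read for metadata, never transferred
            path = os.path.join(dirpath, name)
            key = os.path.relpath(path, root).replace(os.sep, "/")
            yield path, key, read_sidecar(path)


def upload_all(client, bucket, root):
    """Upload every non-sidecar file under root with its own metadata."""
    count = 0
    for path, key, metadata in iter_objects(root):
        client.upload_file(path, bucket, key, ExtraArgs={"Metadata": metadata})
        count += 1
    return count
```

With a real boto3 client, `upload_file` switches to multipart uploads for large files on its own, which covers part of what would otherwise need to be rebuilt by hand. Skipping unchanged objects would still have to be added separately.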