Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: cannot set object properties #536

Closed
pdknsk opened this issue Jan 12, 2015 · 14 comments · Fixed by #629
Closed

storage: cannot set object properties #536

pdknsk opened this issue Jan 12, 2015 · 14 comments · Fixed by #629
Assignees
Labels
api: storage Issues related to the Cloud Storage API. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@pdknsk
Copy link

pdknsk commented Jan 12, 2015

Or at least I haven't figured out how. This is how the documentation suggests it may work.

>>> from gcloud import storage
>>> bucket = storage.get_bucket('bucket', 'id')
>>> key = storage.key.Key(bucket=bucket, name='dir/file.webp', 
...                       properties={'contentType': 'image/webp'})
>>> key.upload_from_filename('dir/file.webp')

Success?

>>> key.properties['contentType']
'image/webp'
>>> key.content_type
'image/webp'

It's a lie unfortunately. (Also verified with gsutil).

>>> webp = bucket.get_key('dir/file.webp')
>>> webp.properties['contentType']
u'application/unknown'
>>> webp.content_type
u'application/unknown'
@pdknsk pdknsk changed the title cannot set object properties storage: cannot set object properties Jan 12, 2015
@dhermes dhermes added api: datastore Issues related to the Datastore API. api: storage Issues related to the Cloud Storage API. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. and removed api: datastore Issues related to the Datastore API. labels Jan 12, 2015
@pdknsk
Copy link
Author

pdknsk commented Jan 12, 2015

Just an additional minor remark. gsutil prefers application/octet-stream to application/unknown.

@dhermes
Copy link
Contributor

dhermes commented Jan 12, 2015

@pdknsk Thanks for the feedback, we'll look into it. (Also thanks for hunting down a more sensible default.)

@tseaver
Copy link
Contributor

tseaver commented Jan 12, 2015

Note that upload_from_filename overwrites any previously-set contentType (via upload_from_file). So, the fix would be to call self._reload_properties() at the end of upload_from_file.

@pdknsk
Copy link
Author

pdknsk commented Jan 15, 2015

I applied the fix and the result is this.

>>> key.content_type
'image/webp'
>>> key.upload_from_filename('dir/file.webp')
>>> key.content_type
u'application/unknown'

Which is correct in a way, because that's how the object is stored online, even if incorrectly so.

I tried Cache-Control (or cacheControl rather) and it doesn't work either.

@pdknsk
Copy link
Author

pdknsk commented Jan 28, 2015

I've figured out that in order to send metadata with the request it has to be set on the request body before upload.ConfigureRequest is called. This works for Content-Type and Cache-Control in this case.

--- a/gcloud/storage/blob.py
+++ b/gcloud/storage/blob.py
@@ -296,7 +296,7 @@ class Blob(_PropertyMixin):
         }

         upload = transfer.Upload(file_obj,
-                                 content_type or 'application/unknown',
+                                 content_type or self.properties.get('contentType', 'application/unknown'),
                                  total_bytes, auto_transfer=False,
                                  chunksize=self.CHUNK_SIZE)

@@ -311,6 +311,11 @@ class Blob(_PropertyMixin):
         # Use apitools 'Upload' facility.
         request = http_wrapper.Request(upload_url, 'POST', headers)

+        if self.properties:
+            import json
+            headers['content-type'] = 'application/json'
+            request.body = json.dumps(self.properties)
+
         upload.ConfigureRequest(upload_config, request, url_builder)
         query_params = url_builder.query_params
         request.url = conn.build_api_url(path=self.bucket.path + '/o',

@jgeewax jgeewax modified the milestone: Storage Stable Jan 30, 2015
@dhermes
Copy link
Contributor

dhermes commented Feb 13, 2015

So the core issue here is that Blob.upload_from_filename uses mimetypes rather than the given content_type. I tested with a real webp file and mimetypes doesn't know what to think.

You could "get around this" by using upload_from_file directly.

@dhermes
Copy link
Contributor

dhermes commented Feb 13, 2015

@tseaver I'm going to send a fix and we should discuss the "right approach" in the PR.

@dhermes dhermes self-assigned this Feb 13, 2015
dhermes added a commit to dhermes/google-cloud-python that referenced this issue Feb 13, 2015
This is instead of using the mimetype for all files (the Python
mimetypes.guess_type method is a bit limited).

Fixes googleapis#536.
@pdknsk
Copy link
Author

pdknsk commented Feb 13, 2015

I haven't tried but from what I can tell 83c0fc1 only fixes Content-Type but not the other properties, called Optional Properties in the docs. The underlying library nicely does all the work on this when it detects that request.body is set.

@dhermes
Copy link
Contributor

dhermes commented Feb 13, 2015

@pdknsk Yeah, per my comments in #629 I think we're better off having a clear Blob.create() and Blob.update(). Still trying to figure out what the best API surface is.

@pdknsk
Copy link
Author

pdknsk commented Feb 13, 2015

The only difference I noticed in setting the metadata separately afterwards is that it increases metageneration version. So you wouldn't be able to make an object with metageneration 1. I have no clue if anyone would need it — I very much doubt it.

And another remark on this bug. The gsutil documentation has an interesting comment stating that if Content-Type is not set, it will be set server-side.

For example, at the time of writing this, the filename extension ”.webp” was recognized by the server-side content detection system, but not by gsutil.

I haven't been able to get this to work, using gsutil at least.

@dhermes
Copy link
Contributor

dhermes commented Feb 13, 2015

Thanks for digging more. I filed #632 and hopefully (if we move forward on it) something like Blob.create() will handle this correctly.

@dhermes
Copy link
Contributor

dhermes commented Feb 13, 2015

@pdknsk So I merged the fix, but you indicated other things are not addressed.

Essentially upload doesn't send properties is the issue?

It seems instead of blob._reload_properties() (as @tseaver said) we would want to call

blob._patch_properties(blob._properties)

at the end of upload. WDYT? Shall we open a new bug (not specific to content type)?

Or instead do you think the plan put forth in #632 is sufficient?

@pdknsk
Copy link
Author

pdknsk commented Feb 15, 2015

I tried and it works well.

It does however also set metageneration version to 2. As I already said, I have no clue if anyone needs metageneration set to 1.

Another potential problem is that between both separate requests, the object is very briefly online with incorrect properties, I think. This could theoretically bite anyone who uses permissive acl by default on the bucket, and sets a restrictive acl per object. There is also the freak chance that your connection fails just after the first request, which makes it not so brief.

@dhermes
Copy link
Contributor

dhermes commented Feb 15, 2015

Yeah it needs a lot of work, thanks for verifying. Hopefully #632 leads to something useable.

parthea pushed a commit that referenced this issue Aug 15, 2023
Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>
vchudnov-g pushed a commit that referenced this issue Sep 20, 2023
parthea pushed a commit that referenced this issue Sep 22, 2023
Source-Link: https://togithub.com/googleapis/synthtool/commit/d0f51a0c2a9a6bcca86911eabea9e484baadf64b
Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-python:latest@sha256:240b5bcc2bafd450912d2da2be15e62bc6de2cf839823ae4bf94d4f392b451dc
parthea pushed a commit that referenced this issue Oct 21, 2023
Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>
parthea pushed a commit that referenced this issue Oct 21, 2023
Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>
parthea pushed a commit that referenced this issue Oct 22, 2023
Source-Link: googleapis/synthtool@352b9d4
Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-python:latest@sha256:3e3800bb100af5d7f9e810d48212b37812c1856d20ffeafb99ebe66461b61fc7

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the Cloud Storage API. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
4 participants