External Storage: Google Drive: 403 User Rate Limit Exceeded #20481

Closed
swurzinger opened this issue Nov 12, 2015 · 42 comments
Comments

@swurzinger

swurzinger commented Nov 12, 2015

Steps to reproduce

  1. Set up Google Drive as external storage. In my case I was accessing it via a shared (sub)folder.
  2. Upload a lot of files to that folder, e.g. using the ownCloud client (downloading probably triggers it as well).
  3. Errors occur once the rate exceeds roughly 10 requests/second.

Expected behaviour

Files should be uploaded without errors.

Actual behaviour

Error: (403) User Rate Limit Exceeded

Google Drive limits the maximum number of requests per second; according to Google's API documentation and the (maximum) value set in the Google Developer Console, that limit is 10 requests/second.

According to Google's documentation, an application should implement exponential backoff when it receives this error; see https://developers.google.com/drive/web/handle-errors#implementing_exponential_backoff
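For illustration, the retry-with-backoff pattern from that page looks roughly like this in PHP (a minimal sketch; `retryWithBackoff()` and the rate-limit detection are made up for illustration, not ownCloud or SDK code):

```php
<?php
// Minimal sketch of the exponential backoff Google recommends.
// retryWithBackoff() and the rate-limit check are illustrative only,
// not part of ownCloud or the Google SDK.
function retryWithBackoff(callable $request, $maxRetries = 5)
{
    for ($attempt = 0; $attempt <= $maxRetries; $attempt++) {
        try {
            return $request();
        } catch (Exception $e) {
            // Give up on the last attempt, or on errors that are not rate limits.
            if ($attempt === $maxRetries || stripos($e->getMessage(), 'rate limit') === false) {
                throw $e;
            }
            // Wait 2^attempt seconds plus up to 1s of random jitter.
            $delay = pow(2, $attempt) + mt_rand(0, 1000) / 1000;
            usleep((int) ($delay * 1000000));
        }
    }
}

// Usage: wrap each Drive call, e.g.
// $result = retryWithBackoff(function () use ($service) { /* Drive API call here */ });
```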

Although ownCloud/Google returns a 403 error, the upload sometimes succeeds anyway; see also http://stackoverflow.com/questions/18578768/403-rate-limit-on-insert-sometimes-succeeds

In the end the ownCloud client uploaded all my files successfully; I guess it retried the ones that failed initially.

Server configuration

Operating system:
Linux af91f 2.6.32-504.8.1.el6.x86_64
Web server:
Apache (unknown version; shared hosting provider)
Database:
5.5.46 - MySQL Community Server
PHP version:
PHP Version 5.4.45
ownCloud version: (see ownCloud admin page)
ownCloud 8.2.0 (stable)
Updated from an older ownCloud or fresh install:
fresh install
List of activated apps:
default + external storage

The content of config/config.php:
config.txt

Are you using external storage, if yes which one: local/smb/sftp/...
Yes, Google Drive.

Are you using encryption: yes/no
No

Are you using an external user-backend, if yes which one: LDAP/ActiveDirectory/Webdav/...
No

Client configuration

Browser:
irrelevant

Operating system:
irrelevant; in my case it was a Windows XP Virtual Machine

Logs

Web server error log

empty

ownCloud log (data/owncloud.log)

owncloud.txt

Browser log

irrelevant

@PVince81
Contributor

The only way to fix this is to reduce the number of API calls; currently it is likely that too many calls are made.

Such repeated calls could be buffered/prevented by using a local stat cache, similar to #7897
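Something like the following could work (just a sketch; `StatCachingWrapper` and `fetchRemoteMetadata()` are hypothetical names, not actual files_external classes):

```php
<?php
// Sketch of a per-request stat cache in front of the Google Drive backend.
// StatCachingWrapper and fetchRemoteMetadata() are hypothetical names.
class StatCachingWrapper
{
    /** @var array path => metadata, kept for the lifetime of the request */
    private $statCache = array();

    public function stat($path)
    {
        if (!isset($this->statCache[$path])) {
            // Only one API call per path per request; repeated stat()
            // calls are answered from the local cache.
            $this->statCache[$path] = $this->fetchRemoteMetadata($path);
        }
        return $this->statCache[$path];
    }

    public function invalidate($path)
    {
        // Drop the cached entry after writes so stale data isn't returned.
        unset($this->statCache[$path]);
    }

    private function fetchRemoteMetadata($path)
    {
        // Placeholder for the actual Google Drive API lookup.
        return array('size' => 0, 'mtime' => time());
    }
}
```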

@swurzinger
Author

I really like the idea of buffering the API requests. That would both improve speed and reduce the number of requests.

Another way to deal with this behavior is to retry the request when it fails with that error, as Google suggests. The question is whether that's the task of files_external or of the client. For more abstract things I'd say it's the client's task; for internal things it can be the task of files_external.

@PVince81
Contributor

A bit of research shows that the AWS library and others refer to a curl plugin called "BackoffStrategy".

That might be it: https://github.com/Guzzle3/plugin-backoff/blob/master/CurlBackoffStrategy.php
It seems to react to certain error codes and retry after a delay.
That plugin isn't available in ownCloud's 3rdparty libs, so it would need to be included.
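If I read the Guzzle 3 docs right, the plugin is attached as an event subscriber, roughly like this (untested sketch, assuming Guzzle 3's BackoffPlugin API; adding 403 to the retried status codes is my assumption):

```php
<?php
// Untested sketch, assuming Guzzle 3's BackoffPlugin API.
use Guzzle\Http\Client;
use Guzzle\Plugin\Backoff\BackoffPlugin;

$client = new Client('https://www.googleapis.com');

// Retry up to 5 times with truncated exponential backoff; 403 is added
// here on the assumption that rate-limit responses should be retried too.
$backoff = BackoffPlugin::getExponentialBackoff(5, array(403, 500, 502, 503));
$client->addSubscriber($backoff);
```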

@swurzinger
Author

BackoffStrategy seems to implement what I mentioned in the first post and is what Google recommends. Here's the link again (it's from the REST API docs, but it applies to the other APIs as well): https://developers.google.com/drive/web/handle-errors#implementing_exponential_backoff

Batching will probably not help with this issue, as "A set of n requests batched together counts toward your usage limit as n requests, not as one request." (from https://developers.google.com/drive/v2/web/batch)

@PVince81
Contributor

Unfortunately it seems the library OC uses doesn't go through Guzzle (or any pluggable HTTP layer) but calls PHP's curl_* functions directly: https://github.com/owncloud/core/blob/v8.2.2/apps/files_external/3rdparty/google-api-php-client/src/Google/IO/Curl.php#L85

So even if that plugin were added, I'm not sure it would fit in.
We'd need to find a library that actually uses Guzzle.

@PVince81 PVince81 added this to the 9.1-next milestone Feb 19, 2016
@PVince81
Contributor

@davitol this is what you observed yesterday

@davitol
Contributor

davitol commented Feb 19, 2016

@PVince81 thanks 😺

@PVince81
Contributor

Setting this to critical; it has already been observed 3-4 times in different environments.

@LukasReschke
Member

Considering that the limit is 1000 requests per 100 seconds per user, we probably need some change detection here. Otherwise this seems like something that one can easily run into again.

@thefirstofthe300

I too am experiencing this issue on OC 9. I am attempting to upload a music library to ownCloud so I have a lot of little files being synced. If you need any more logs, I will happily provide them.

@Spacefish

I have the same problem; it should be fixed somehow. Maybe we could cache the API calls, especially for single files? Or introduce some sort of rate-limit counter that postpones API calls that aren't absolutely necessary?

@PVince81
Contributor

I think there is already some caching inside the library; I remember seeing code that "remembers" calls made to the same URLs. Not sure if it works, though.

@JDrewes

JDrewes commented Apr 16, 2016

I am experiencing issues involving the 403 User Rate Limit Exceeded message.
My problem is that, as far as I know, part of my GDrive has been copied to the ownCloud server (and is accessible there through the Files web frontend), but of my many files and folders, only 2 empty folders have actually made their way to my hard drive through my ownCloud client (Linux, 2.1.1).
Also, even in the web frontend of my ownCloud server, many of the files in the Google Drive folder are still marked as "Pending", even after more than a day has passed. I would like to help get Google Drive sync working: what kind of information can I provide?

ownCloud 9.0.1 running on Debian 7.
I use a free personal Google Drive account.

@PVince81
Contributor

I'm not sure, but from what I've heard, in some setups people seem to hit the limit less often.
And some people have reported hitting the limit more quickly before realizing they hadn't set up their GDrive app properly. So if you're sure your GDrive app was configured properly on the GDrive side (API keys, etc.), then I'm not aware of any workaround at this time.

In theory one could add a few usleep(100) calls in the GDrive library to make it slower and less likely to run into limits, but that's really not a proper solution. I haven't tried it, just guessing.

The proper solution is to implement exponential backoff, which would use adaptive sleep to sleep longer and longer until the request goes through, retrying several times.

@guruz
Contributor

guruz commented Apr 18, 2016

(I haven't checked how our GDrive stuff works; I don't even know how often we "list" the remote directory.)

Maybe it's interesting to incorporate this into our caching or even ETag logic: https://developers.google.com/drive/v2/web/manage-changes#retrieving_changes

@PVince81
Contributor

@guruz yeah, that would be part of looking into update detection: #11797

@PVince81
Contributor

PVince81 commented Jun 1, 2016

Good news! We've updated the Google SDK library, and from grepping the code I saw that there are parts that will automatically do exponential backoff!

* A task runner with exponential backoff support.
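If I read it correctly, the retries can be tuned through the client's retry config, something along these lines (sketch only; the exact config keys are my assumption, so double-check against the bundled SDK version):

```php
<?php
// Sketch only: tuning retries in google-api-php-client v2.
// The 'retry' config key and the 'retries' option are assumptions based on
// the task runner; verify against the SDK version actually bundled.
$client = new Google_Client(array(
    'retry' => array(
        'retries' => 5, // number of attempts before giving up
    ),
));
$service = new Google_Service_Drive($client);
```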

@PVince81
Contributor

PVince81 commented Jun 1, 2016

From looking at the code it seems that it should already work without any additional configs, so I'm going to close this.

If you are able to, please try the 9.1beta1 build that contains this update and let me know if you're still getting "403 User Rate Limit Exceeded", as I'm not able to reproduce this locally.

CC @davitol @SergioBertolinSG

@PVince81
Contributor

PVince81 commented Jun 2, 2016

Kudos to @Altyr for submitting the library update PR #24516 😄

@PVince81 PVince81 removed this from the 9.1 milestone Aug 9, 2016
@stevenmcastano

Most of them are fairly small: some tiny spreadsheets and Word documents. I'd say there are about 225 files total, with maybe 1 or 2 of them being between 75 and 100 MB, some big Photoshop .PSD files.

It hasn't even gotten to the big files yet... it's synced about 40 or so files, and every time I try to pause the sync and then resume it to grab some more, by the time it verifies the files it already has, it hits the user rate limit before it can download any more.

@Trefex

Trefex commented Oct 11, 2016

Would it be possible to return a different error code to the client in addition to trying to fix the rate limit issue?

@PVince81
Contributor

You are probably referring to owncloud/client#5187 (comment).
In the case of a PROPFIND, a 503 would be returned.

In the case of PUT or any write operation, we could change Webdav to translate the exception to another failure code. @ogoffart, any suggestions?

@ogoffart

The problem is that a 503 for PUT will stop the sync.
I guess the 502 error code might work better in this case if it's only for one file.

An alternative is to distinguish the codes that should block the sync using another header or something.
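For what it's worth, in Sabre\DAV the response status comes from the exception's getHTTPCode(), so one option would be a dedicated exception type for this case (sketch; RetryableStorageException is a made-up name, not existing code):

```php
<?php
// Sketch: a dedicated exception whose HTTP status tells the sync client to
// retry just this file instead of aborting the whole sync.
// RetryableStorageException is an illustrative name, not existing code.
class RetryableStorageException extends \Sabre\DAV\Exception
{
    public function getHTTPCode()
    {
        // 502 rather than 503, per the discussion above, so only the
        // affected file fails instead of the entire sync run.
        return 502;
    }
}
```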

@PVince81 PVince81 modified the milestones: 9.1.3, 9.1.2 Oct 20, 2016
@PVince81 PVince81 self-assigned this Nov 21, 2016
@PVince81 PVince81 modified the milestones: 9.1.4, 9.1.3 Nov 30, 2016
@PVince81
Contributor

I need to build a good test case where this issue is reproducible every time. Any suggestions?

@Trefex

Trefex commented Nov 30, 2016

@PVince81 upload a lot of files to Google Drive and sync everything to your local machine.

I guess that should do it: a lot of small files, perhaps 100,000 or so.

@atlcell

atlcell commented Dec 13, 2016

I store my entire music library in Google Drive (~65 GB).

This includes mixtapes and other stuff you cannot find on Spotify and the like. I also tend to keep high-quality files like 320 kbps MP3s and FLACs/ALACs.

I tried to use Drive as my MASTER copy, but the Google Drive client, along with syncing between NTFS and HFS+, has messed it up at times: created duplicates, failed to sync files, etc.

I'm paying for the 1 TB premium plan on Google.

I just got this error, and I assumed I would; I have moved 500 GB of data through Google's network, so they have to have some sort of internal gauge for consumers, correct?

@PVince81 PVince81 modified the milestones: 9.1.5, 9.1.4 Feb 6, 2017
@DejaVu

DejaVu commented Mar 7, 2017

Another way I've found to put this to the test is to upload/sync a folder with 1,000 files of any size, then rename them locally and resync.

The expected behaviour is that the remote files are simply renamed and the operation completes correctly.

When renaming at the rate GoodSync does on Google Drive (quick, and we all prefer quick), these 403 user rate limits are hit and none of the files that are already uploaded get renamed either.

@PVince81
Contributor

PVince81 commented Mar 7, 2017

I suspect that GoodSync doesn't rename the way the desktop client renames. Instead of a single MOVE it might be copying every file first. Or maybe it does a MKCOL on the new folder and then recursively moves every file there instead of doing it in a single operation. Looking at the web server access log should tell.

In general I'd expect a simple Webdav MOVE to not cause any rate limit issue.

@PVince81
Contributor

Here you go, a one-liner to enable retries in the GDrive lib: #27530

@PVince81
Contributor

Please help test this and let us know if it solves your problem.

You might still bump into API limits from time to time, but it shouldn't be as bad as before.

@PVince81
Contributor

PVince81 commented Apr 3, 2017

This will be in 9.1.5

@lmiol

lmiol commented Feb 18, 2019

Hi there,
I need an explanation of some limits (thanks in advance).
The limit is 1000 requests per 100 seconds, and I make API-key requests to get files.

If I make these requests from 5 devices, do I count as 1 user
or as 5 users?

@lock lock bot locked as resolved and limited conversation to collaborators Feb 18, 2020