gcs_list_objects seems limited to return 1000 rows max #58

Closed
G3rtjan opened this issue Dec 30, 2016 · 6 comments


G3rtjan commented Dec 30, 2016

Calling gcs_list_objects() on my Google Cloud bucket returns exactly 1000 rows, even after adding some additional files just to make sure that there are more than 1000 files in the bucket.

Maybe the maxResults parameter in one of the API calls you use could help:
https://cloud.google.com/storage/docs/json_api/v1/objects/list

Perhaps this could be implemented as a parameter in the function :)
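
For reference, a minimal sketch of what that maxResults parameter looks like against the raw JSON API, assuming an httr/googleAuthR OAuth token is already available; the bucket name and `my_token` object are illustrative, not part of the package:

```r
library(httr)

# Illustrative only: list up to 500 objects from the bucket in a single call.
# `my_token` is assumed to be an existing httr/googleAuthR OAuth token.
resp <- GET("https://www.googleapis.com/storage/v1/b/my-bucket/o",
            query = list(maxResults = 500),
            config(token = my_token))
stop_for_status(resp)
objects <- content(resp)$items
```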

MarkEdmondson1234 (Collaborator) commented

@G3rtjan thanks, that's a bug. I'll look to add paging ASAP.

G3rtjan (Author) commented Jan 2, 2017

@MarkEdmondson1234 I've created a workaround for myself for the time being and in doing so I found out that:

  • The maxResults parameter can only limit results BELOW 1000; it doesn't enable you to get more than 1000 rows ...
  • However, within the first set of (1000) results a nextPageToken is included (within content$nextPageToken), which can be passed as the pageToken parameter in the next call to get the next set of (1000) results
  • You can continue this within a loop until the nextPageToken is NULL, which means you have all available results
  • This way, I was able to get all +/- 40k results from my Google Cloud storage bucket

I hope this helps :)
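
For anyone hitting the same limit, here is a minimal sketch of that paging loop, assuming an httr/googleAuthR OAuth token is already available; the function name, bucket and `token` objects are illustrative, not part of googleCloudStorageR:

```r
library(httr)

# Illustrative paging loop: keep requesting pages until nextPageToken is NULL.
list_all_objects <- function(bucket, token) {
  url <- sprintf("https://www.googleapis.com/storage/v1/b/%s/o", bucket)
  results <- list()
  page_token <- NULL

  repeat {
    resp <- GET(url,
                query = list(maxResults = 1000, pageToken = page_token),
                config(token = token))
    stop_for_status(resp)
    parsed <- content(resp, as = "parsed", type = "application/json")

    results <- c(results, parsed$items)

    # nextPageToken is only present while more pages remain
    page_token <- parsed$nextPageToken
    if (is.null(page_token)) break
  }

  results
}
```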

MarkEdmondson1234 (Collaborator) commented

Thanks @G3rtjan, I will handle it in a very similar manner to what you outline above.

MarkEdmondson1234 (Collaborator) commented

@G3rtjan I have published a fix for this; if you have time, please test it and let me know if it works as expected.

MarkEdmondson1234 (Collaborator) commented

OK, it works, but if you have a lot of objects it takes ages. Going to add a limit to it.

MarkEdmondson1234 (Collaborator) commented

No limit in the end, as the objects come back in a random order, so a limit doesn't make much sense; use the new prefix and delimiter arguments instead to limit what is downloaded (#68).
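
A short sketch of the prefix/delimiter approach mentioned above, assuming the arguments added for #68; the bucket and folder names are illustrative:

```r
library(googleCloudStorageR)

# Only list objects whose names start with "logs/2016/"
objs_2016 <- gcs_list_objects(bucket = "my-bucket", prefix = "logs/2016/")

# With delimiter = "/", the listing stops at the next "folder" level instead
# of returning every object nested under the prefix
objs_top <- gcs_list_objects(bucket = "my-bucket",
                             prefix = "logs/",
                             delimiter = "/")
```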
