Extend bulk-data homepage to show more entries #424

Open
metachris opened this issue May 22, 2023 · 5 comments
Labels
feature request · help wanted (Extra attention is needed) · up for grabs (anyone welcome to PR)

Comments

@metachris
Collaborator

The relay bulk data export shows only the first 1,000 files: https://flashbots-boost-relay-public.s3.us-east-2.amazonaws.com/index.html

This is because the AWS S3 XML listing returns at most 1,000 entries per page.

Task: Update https://github.com/flashbots/mev-boost-relay/blob/main/static/s3/index.html to either have pagination or simply query as many pages as there are (maybe listing in reverse order?)

@metachris metachris added up for grabs anyone welcome to PR help wanted Extra attention is needed feature request labels May 22, 2023
@xrchz

xrchz commented Aug 16, 2023

Are they still accessible if you guess the correct URL? As a workaround...

@Valdorff

This bulk data would be extremely helpful for Rocket Pool if it could be made available.

We wouldn't need to hit it too much - would mostly grab once and then run locally.

@metachris
Collaborator Author

Actually, you can list all the items by querying the XML page (instead of the HTML) at https://flashbots-boost-relay-public.s3.us-east-2.amazonaws.com and using the marker query parameter:

https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html

Marker is where you want Amazon S3 to start listing from. Amazon S3 starts listing after this specified key. Marker can be any key in the bucket.

I.e.: https://flashbots-boost-relay-public.s3.us-east-2.amazonaws.com/?marker=data/1_payloads-delivered/monthly/2022-09.json
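
As a rough illustration of the marker approach, here is a minimal Python sketch that keeps requesting pages until the listing is no longer truncated. It assumes the bucket stays publicly listable over plain HTTP and that the response follows the standard ListObjects XML layout (IsTruncated and Contents/Key elements). The function names (fetch_page, parse_page, list_all_keys) are made up for this example and are not part of the relay codebase.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BUCKET_URL = "https://flashbots-boost-relay-public.s3.us-east-2.amazonaws.com"
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"  # namespace of the listing XML


def fetch_page(bucket_url, params):
    """Fetch one listing page as raw XML bytes (assumes a public bucket)."""
    url = bucket_url + "/"
    if params:
        url += "?" + urllib.parse.urlencode(params, safe="/")
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def parse_page(xml_bytes):
    """Return (keys, is_truncated) for one ListObjects XML page."""
    root = ET.fromstring(xml_bytes)
    keys = [el.text for el in root.findall(f"{S3_NS}Contents/{S3_NS}Key")]
    truncated = root.findtext(f"{S3_NS}IsTruncated", default="false") == "true"
    return keys, truncated


def list_all_keys(bucket_url=BUCKET_URL, prefix="", fetch=fetch_page):
    """Collect every key by passing the last key of each page as the next marker."""
    keys = []
    marker = ""
    while True:
        params = {}
        if prefix:
            params["prefix"] = prefix
        if marker:
            params["marker"] = marker
        page_keys, truncated = parse_page(fetch(bucket_url, params))
        keys.extend(page_keys)
        if not truncated or not page_keys:
            break
        marker = page_keys[-1]  # S3 starts listing after this key
    return keys
```

Calling list_all_keys(prefix="data/1_payloads-delivered/monthly") would then walk all pages under that prefix. Since S3 returns keys in ascending order, showing the newest entries first would mean reversing the collected list on the client side.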


We'd welcome a code contribution that adds pagination to the HTML, or makes it keep querying until there are no more entries.

You can reproduce the issue by:

  1. Create an S3 bucket with 1,100 files (each can be as small as 1 byte).
  2. Use this HTML to serve an index: https://github.com/flashbots/mev-boost-relay/blob/main/static/s3/index.html (no changes to that file are necessary).
  3. Implement any changes needed to show more entries.
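
Step 1 could be scripted roughly as follows. This is only a sketch: it assumes you pass in an S3 client object with a put_object(Bucket=..., Key=..., Body=...) method (boto3's S3 client provides one), and the bucket name, key prefix, and the helper name upload_test_files are all hypothetical.

```python
def upload_test_files(client, bucket, count=1100, prefix="data/test/"):
    """Upload `count` one-byte objects so the listing spans more than one page."""
    # Zero-padded names keep the keys in a predictable lexicographic order.
    keys = [f"{prefix}file-{i:04d}.txt" for i in range(count)]
    for key in keys:
        client.put_object(Bucket=bucket, Key=key, Body=b"x")
    return keys


# Example usage (assumes boto3 is installed and AWS credentials are configured;
# "my-test-bucket" is a placeholder name):
#
#   import boto3
#   upload_test_files(boto3.client("s3"), "my-test-bucket")
```

With 1,100 objects in the bucket, the unmodified index page should stop at the first 1,000 entries, which makes it easy to check whether a change actually loads the rest.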

@metachris
Collaborator Author

The only dataset that's fully up-to-date is the delivered payloads by month: https://flashbots-boost-relay-public.s3.us-east-2.amazonaws.com/?prefix=data/1_payloads-delivered/monthly
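
For anyone scripting against the bucket, the prefix and marker query parameters can be combined in one URL. A small sketch of building such URLs follows; the helper name listing_url is invented for this example, and it assumes the bucket URL stays as it is today.

```python
import urllib.parse

BUCKET_URL = "https://flashbots-boost-relay-public.s3.us-east-2.amazonaws.com"


def listing_url(prefix=None, marker=None, bucket_url=BUCKET_URL):
    """Build an XML listing URL, optionally restricted by prefix and marker."""
    params = {}
    if prefix:
        params["prefix"] = prefix  # only list keys starting with this prefix
    if marker:
        params["marker"] = marker  # start listing after this key
    if not params:
        return bucket_url + "/"
    # safe="/" keeps slashes readable instead of encoding them as %2F
    return bucket_url + "/?" + urllib.parse.urlencode(params, safe="/")


print(listing_url(prefix="data/1_payloads-delivered/monthly"))
print(listing_url(prefix="data/1_payloads-delivered/monthly",
                  marker="data/1_payloads-delivered/monthly/2022-09.json"))
```

The first call reproduces the prefix URL above. The second adds a marker, so the listing would resume after the 2022-09 file within the same prefix.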

@Valdorff

Ok - thanks for that. Unfortunately we're looking at the builder submission side, so that doesn't help our immediate use case. Is the intent to add more and some bit of the pipeline broke, or were bulk builder submissions removed?

Thanks in any case 🙏


3 participants