Visor walk changes for backfill #347

Closed
thattommyhall opened this issue Jan 6, 2021 · 2 comments

Comments

@thattommyhall

thattommyhall commented Jan 6, 2021

It is incredibly handy that we can pass flags either as environment variables or on the CLI, but there is an unfortunate limitation in how commands are passed to ECS tasks that I would like to avoid having to work around.

Can't pass empty LOTUS_DB in ECS

On the CLI I can use --db "" or LOTUS_DB="", but I can't set them to empty in ECS. I'd like the default changed to empty rather than postgres://postgres:password@localhost:5432/postgres?sslmode=disable, but we could allow LOTUS_DB=false or something similar if we don't want to change the default.
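To make the `LOTUS_DB=false` idea concrete, here is a minimal sketch of a resolver that treats an unset, empty, or sentinel value as "no database". It is written in Python for brevity and is not visor's actual code; the function name `resolve_db`, the sentinel set, and the rule that the CLI value wins over the environment are all assumptions for illustration.

```python
from typing import Mapping, Optional

# Hypothetical sentinel values meaning "do not connect to a database".
DISABLED_VALUES = {"", "false"}


def resolve_db(env: Mapping[str, str], cli_value: Optional[str] = None) -> Optional[str]:
    """Return a connection string, or None when the database is disabled.

    Assumption: a CLI value, when given, takes precedence over LOTUS_DB.
    """
    raw = cli_value if cli_value is not None else env.get("LOTUS_DB", "")
    if raw.strip().lower() in DISABLED_VALUES:
        return None
    return raw
```

With this shape, an ECS task definition could set `LOTUS_DB=false` and get the same effect as `--db ""` on the command line.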

--csv improvements

At the moment, we dump files into the folder passed:

actors.csv         block_messages.csv   derived_gas_outputs.csv  messages.csv                      miner_locked_funds.csv      miner_sector_events.csv  receipts.csv
actor_states.csv   block_parents.csv    drand_block_entries.csv  miner_current_deadline_infos.csv  miner_pre_commit_infos.csv  miner_sector_infos.csv   visor_processing_reports.csv
block_headers.csv  chain_economics.csv  message_gas_economy.csv  miner_fee_debts.csv               miner_sector_deals.csv      parsed_messages.csv

I could make folders like $CSV_PATH/$START_$STOP or something, but I was thinking it would be better to use a single folder as the target and write a file per tipset, named by height. The downside would be lots of files, but EFS is quite forgiving here. We could use it to avoid rework if the walk skipped a height when the file for it already existed (maybe add a --force flag, or we delete the file or target a new output folder if we want to rerun some), and it would be a simple way of spotting failures without having to look at the logs.

e.g. <CSVPATH>/<TABLENAME>/####.csv
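The skip-if-exists behaviour proposed above could look roughly like the following. This is a sketch in Python, not visor code; the helper names `tipset_csv_path` and `heights_to_process` are hypothetical, and `force` stands in for the suggested `--force` flag.

```python
from pathlib import Path
from typing import List


def tipset_csv_path(csv_root: Path, table: str, height: int) -> Path:
    """Build <CSVPATH>/<TABLENAME>/<height>.csv for one tipset."""
    return csv_root / table / f"{height}.csv"


def heights_to_process(
    csv_root: Path, table: str, start: int, stop: int, force: bool = False
) -> List[int]:
    """Return heights in [start, stop] whose file is missing.

    When force is True, every height is returned so existing files get rewritten.
    """
    return [
        h
        for h in range(start, stop + 1)
        if force or not tipset_csv_path(csv_root, table, h).exists()
    ]
```

A gap in the returned list after a run would also be the quick visual signal of a failed height, without opening the logs.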

@iand
Contributor

iand commented Jan 7, 2021

We can remove the default for the db parameter.

I don't think writing a file per tipset is what most people would expect when using walk, and it seems specific to the particular configuration of the backfill.

The visor_processing_reports table that is also output will contain one row for each tipset+task combination, so it should be easy to grep to spot failures. Some tipsets will contain reproducible failures that indicate missing functionality in visor (#273, for example).
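As an alternative to grep, a small script can pull out the failing tipset+task rows. The sketch below assumes the CSV has a header row with `height`, `task`, and `status` columns and that a successful row has status `OK`; those names and values are assumptions here, so check them against the actual file before relying on this.

```python
import csv
from typing import FrozenSet, Iterable, Iterator, Tuple


def failed_rows(
    lines: Iterable[str], ok_statuses: FrozenSet[str] = frozenset({"OK"})
) -> Iterator[Tuple[int, str, str]]:
    """Yield (height, task, status) for every report row whose status is not OK.

    Assumed columns: height, task, status (not confirmed by the thread).
    """
    for row in csv.DictReader(lines):
        if row["status"] not in ok_statuses:
            yield int(row["height"]), row["task"], row["status"]
```

Usage would be along the lines of opening `visor_processing_reports.csv` with `open(path, newline="")` and iterating over `failed_rows(f)`.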

@thattommyhall
Author

thattommyhall commented Jan 7, 2021

Yay, I'd not noticed that you output the report. I'll do some wrangling to make a folder per backfill job and slurp up all the reports and write some sort of meta-report 😄
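A meta-report of that kind might be sketched as below: one folder per backfill job, each holding its own `visor_processing_reports.csv`, summarised into status counts per job. The folder layout, the function name `meta_report`, and the `status` column are assumptions for illustration, not something the thread specifies.

```python
import csv
from collections import Counter
from pathlib import Path
from typing import Dict


def meta_report(
    backfill_root: Path, report_name: str = "visor_processing_reports.csv"
) -> Dict[str, Counter]:
    """Map each job folder name to a Counter of report statuses.

    Assumes a layout of <backfill_root>/<job>/<report_name> with a header row.
    """
    summary: Dict[str, Counter] = {}
    for report in sorted(backfill_root.glob(f"*/{report_name}")):
        with report.open(newline="") as f:
            summary[report.parent.name] = Counter(
                row["status"] for row in csv.DictReader(f)
            )
    return summary
```

Jobs whose counter contains anything other than the success status are the ones worth a closer look.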
