This repository has been archived by the owner on Jun 23, 2022. It is now read-only.

feat(bulk-load): bulk load ingestion part5 - replica handle bulk load request during ingestion #496

Merged

5 commits merged on Jun 15, 2020

Conversation

hycdong
Contributor

@hycdong hycdong commented Jun 10, 2020

The whole bulk load ingestion process is as follows:

  1. meta server sets the app's and all partitions' bulk load status to ingesting (feat(bulk-load): bulk load ingestion part1 - meta start ingestion #486)
  2. meta sends two requests to the primary
  3. primary handles the two requests above
  4. secondary handles two cases
  5. primary reports the group ingestion status to the meta server
  6. meta handles the two responses

This pull request is about how replicas handle bulk_load_request during ingestion.

  • implement function do_bulk_load
    • this function is called when the primary receives a bulk_load_request and when a secondary receives a group_bulk_load_request. It first checks the primary status, then compares the bulk load status from the request with the replica's local bulk load status (in function validate_bulk_load_status, not implemented in this pull request), and finally calls different functions according to the bulk load status:
    • when the request status is downloading and the local status is invalid, it calls start_download to start downloading files from the file provider
    • when the request status is ingestion and the local status is downloaded, it calls start_ingestion
    • when the request status is ingestion and the local status is ingestion, it calls check_ingestion_finish
  • ingestion is treated as a write request, but it differs from a normal write, so we add some special checks for it:
    • write an empty put after ingestion. For a write request, the primary commits the mutation when it receives the secondaries' prepare replies, while secondaries commit the mutation only when they prepare the next mutation, so we send an empty prepare to guarantee that the secondaries commit the ingestion request.
    • pop all committed mutations when preparing the empty put. For the ingestion mutation, replaying the private log cannot recover the ingested data. To keep the data correct, we will create a checkpoint (not implemented in this pull request) and must guarantee that learners learn data from the checkpoint. To satisfy this condition, we pop all committed mutations when preparing the empty put, so the prepare list is empty and the learn type is always LT_APP, which learns the checkpoint first.
      • add ingestion_is_empty_prepare_sent to primary_context so the empty put is written only once
      • add a pop_all_committed_mutations parameter to the functions init_prepare, send_prepare_message, and prepare, and add pop_all to the thrift structure replica_configuration
  • report ingestion status
    • a secondary reports its ingestion status in function report_bulk_load_states_to_primary
    • the primary reports the group ingestion status in function report_group_ingestion_status; when group ingestion succeeds, normal writes are restored

@hycdong hycdong marked this pull request as ready for review June 10, 2020 06:51
Review comment on:

// ThreadPool: THREAD_POOL_REPLICATION
void replica_bulk_loader::check_ingestion_finish()
Contributor
I understand that the method checks whether the ingestion is finished, but the logic seems to guarantee that it can be finished?

Contributor Author

Well, it depends on when ingestion is considered finished. In my view, this function checks whether the primary replica's ingestion_status is ingestion_succeed; in addition, if the primary has not yet sent the empty prepare to the secondaries, it sends it to make the secondaries commit this mutation, which I think guarantees data integrity. Actually, when the primary sends the empty prepare to the secondaries, ingestion is not yet finished; ingestion is finished only when all replicas' ingestion_status are ingestion_succeed.

5 participants