Fix HTTP connection error for long running transfers #842
This change introduces an API for Skyplane Broadcast.

Todos:
- [x] Fix provisioning in BroadcastDataplane
  - Reuse provision loop via inheritance
  - Move `_start_gateway` to a class method and override it
  - Adapt broadcast to use `bound_nodes`
- [x] Add BroadcastCopyJob (ideally extend CopyJob)
- [x] Update tracker to monitor broadcast jobs
- [x] Add multipart support
- [x] Fix dependency issue via adding dockerfile and bc_requirements
- [x] Integrate with gateway and test the monitoring side

Co-authored-by: Paras Jain <[email protected]>
Co-authored-by: Sarah Wooders <[email protected]>
…ct#829) Transfers to/from local paths need to fall back on cloud provider tools until we provide full support for this. I cleaned up the `skyplane cp/sync` code so that the fallback options and the logic for initiating transfers share a single function.
except Exception as e:
    UsageClient.log_exception(
        "dispatch job",
        e,
        args,
        self.dataplane.topology.src_region_tag,
        self.dataplane.topology.dest_region_tags,
        self.dataplane.topology.dest_region_tags[0],  # TODO: support multiple destinations
Suggested change:
- self.dataplane.topology.dest_region_tags[0],  # TODO: support multiple destinations
+ *self.dataplane.topology.dest_region_tags,
You can just "spread" this list as arguments
what does this mean?
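For reference, "spreading" here refers to Python's `*` argument unpacking, which passes each element of a list as its own positional argument. The sketch below is only an illustration: `log_exception` is a hypothetical stand-in with a variadic parameter, not the real `UsageClient.log_exception` signature, and the region tags are made-up values.

```python
def log_exception(location, exc, args, src_region, *dest_regions):
    """Hypothetical stand-in; NOT the real UsageClient.log_exception signature."""
    return {
        "location": location,
        "error": str(exc),
        "args": args,
        "src": src_region,
        "dests": list(dest_regions),  # every unpacked destination lands here
    }


# Made-up region tags, for illustration only.
dest_region_tags = ["aws:us-west-2", "gcp:us-central1"]

# The leading * unpacks the list, so this call is equivalent to passing
# "aws:us-west-2" and "gcp:us-central1" as two separate positional arguments.
record = log_exception("dispatch job", ValueError("boom"), {}, "aws:us-east-1", *dest_region_tags)
```

Whether the suggestion applies as written depends on the callee accepting a variable number of destination arguments, which this page does not show.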
skyplane/api/transfer_job.py
Outdated
    headers={"Content-Type": "application/json"},
)
reply_json = json.loads(reply.data.decode("utf-8"))
print(server, min_idx, "added", n_added, len(chunk_batch), reply_json)
print(server, min_idx, "added", n_added, len(chunk_batch), reply_json)
log debug messages?
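One way to act on this comment is to route the message through a logger at debug level instead of `print`. The following is a minimal sketch using the standard library `logging` module; the logger name and the `report_batch` helper are assumptions for illustration, and the project may well have its own logging utility that should be preferred.

```python
import logging

# Logger name is an assumption, not taken from the Skyplane code base.
logger = logging.getLogger("transfer_job")


def report_batch(server, min_idx, n_added, batch_len, reply_json):
    """Hypothetical helper: emit the dispatch status at DEBUG level."""
    # Lazy %-formatting: the message is only built if DEBUG is enabled.
    logger.debug(
        "%s idx=%s added %d of %d chunks, reply=%s",
        server, min_idx, n_added, batch_len, reply_json,
    )
```

With this shape the output stays silent by default and can be switched on with `logger.setLevel(logging.DEBUG)` when debugging a transfer.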
@@ -78,11 +78,14 @@ def worker_loop(self, worker_id: int, *args):
    self.worker_id = worker_id
    while not self.exit_flags[worker_id].is_set() and not self.error_event.is_set():
        try:
            # print(f"[{self.handle}:{self.worker_id}] Waiting for chunk, queue size {self.input_queue.size()}")
clean up this file a bit?
if chunk_req.chunk.chunk_length_bytes == 0:
    # nothing to do
    # create empty file
    open(fpath, "a").close()
Path('path/to/file.txt').touch()
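A small sketch of the `pathlib` approach suggested above, run against a temporary directory. The helper name `create_empty_file` and the file name `chunk_0.bin` are made up for the example.

```python
import tempfile
from pathlib import Path


def create_empty_file(fpath):
    """Create fpath if it is missing; existing contents are left untouched."""
    Path(fpath).touch(exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "chunk_0.bin"  # hypothetical chunk file name
    create_empty_file(target)
    created = target.exists()
    size = target.stat().st_size
```

Like `open(fpath, "a").close()`, `touch()` does not truncate a file that already exists; it only updates its modification time.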
Implements fixes for a few bugs that caused errors in large transfers:
- The `chunk_requests` POST request will return how many chunks were added and the current queue size, so the HTTP client making the request knows to send the remaining chunks (those not added) to a different gateway or to wait and try again.

With this change, I was able to transfer 1TB. There are still issues with SSH connections for long-running transfers, and listing files can take an extremely long time on the client (#841), so these issues need to be fixed for very large transfers.
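The client-side behavior described above could look roughly like the sketch below. It is an in-memory illustration, not the code in this PR: `FakeGateway`, `dispatch`, and the reply field names `n_added` and `qsize` are all assumptions, and a real client would issue HTTP POST requests and back off when every gateway is full.

```python
class FakeGateway:
    """In-memory stand-in for a gateway's chunk_requests endpoint (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = []

    def post_chunk_requests(self, batch):
        # Accept only as many chunks as fit, then report what happened.
        free = max(self.capacity - len(self.queue), 0)
        accepted = batch[:free]
        self.queue.extend(accepted)
        return {"n_added": len(accepted), "qsize": len(self.queue)}  # assumed field names


def dispatch(chunks, gateways, max_rounds=100):
    """Send chunks round-robin, re-routing whatever a gateway did not accept."""
    pending = list(chunks)
    for round_idx in range(max_rounds):
        if not pending:
            return True
        gateway = gateways[round_idx % len(gateways)]
        reply = gateway.post_chunk_requests(pending)
        # Drop the chunks that were added; the rest go to the next gateway.
        pending = pending[reply["n_added"]:]
        # A real client would sleep here if no gateway accepted anything.
    return not pending


gateways = [FakeGateway(capacity=3), FakeGateway(capacity=4)]
ok = dispatch(list(range(6)), gateways)
```

In this run the first gateway takes three chunks and the remaining three are re-sent to the second gateway, so no chunk is dropped when a queue fills up.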