
bulkwriter sample read csv #1673

Merged · 1 commit merged into milvus-io:2.2 · Sep 12, 2023
Conversation

@yhmo (Contributor) commented on Sep 7, 2023

No description provided.


fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="path", dtype=DataType.VARCHAR, max_length=512),
Review comment: Can this field be the primary key?
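For reference, Milvus does accept a VARCHAR field as the primary key, provided auto_id is turned off and the application supplies unique values. A minimal sketch, assuming that is the intent (the vector field and its dimension below are hypothetical, not taken from this PR):

```python
from pymilvus import CollectionSchema, FieldSchema, DataType

fields = [
    # VARCHAR primary key: auto_id must be False and values must be unique
    FieldSchema(name="path", dtype=DataType.VARCHAR, max_length=512,
                is_primary=True, auto_id=False),
    # hypothetical vector field, dim chosen only for illustration
    FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=768),
]
schema = CollectionSchema(fields=fields, description="sketch of a VARCHAR primary key")
```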

with LocalBulkWriter(
    schema=schema,
    local_path="/tmp/bulk_writer",
    segment_size=4*1024*1024,
Review comment: Any reason to hardcode 4 MB here?
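If the 4 MB value is only there to keep the demo's output files small, one option is to give it a name so the intent is explicit. A sketch under that assumption, mirroring the snippet above and reusing the sample's `schema` and `read_sample_data`:

```python
# 4 MB is assumed to be a demo-friendly value so the sample emits several small
# files; real workloads would normally use a much larger segment size.
DEMO_SEGMENT_SIZE = 4 * 1024 * 1024

with LocalBulkWriter(
    schema=schema,
    local_path="/tmp/bulk_writer",
    segment_size=DEMO_SEGMENT_SIZE,
) as local_writer:
    read_sample_data("./data/train_embeddings.csv", local_writer)
```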

    local_path="/tmp/bulk_writer",
    segment_size=4*1024*1024,
) as local_writer:
    read_sample_data("./data/train_embeddings.csv", local_writer)
Review comment: Is it possible that the CSV is too big (say, 100 GB) to be loaded into the process?
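On the size concern: if read_sample_data pulls the entire CSV into memory (e.g., via pandas), a 100 GB file will not fit. A hedged alternative is to stream the file row by row and hand each row to the writer; this sketch assumes the writer exposes append_row and that the CSV has "path" and "vector" columns, neither of which is confirmed by the excerpt above:

```python
import csv

def stream_sample_data(csv_path, writer):
    # Stream one row at a time so the whole file never has to be resident in memory.
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            writer.append_row({
                "path": row["path"],
                # assume the embedding is stored as a bracketed, comma-separated string
                "vector": [float(x) for x in row["vector"].strip("[]").split(",")],
            })
```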

threads = []
thread_count = 100
-rows_per_thread = 1000
+rows_per_thread = 100
Review comment: Is there any limit on the size per row here?
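For context, the constants above drive a fan-out roughly like the sketch below (worker name, field values, and dimension are illustrative, not the PR's exact code). The loop itself imposes no per-row cap; the practical limits come from the schema (e.g., max_length=512 on the VARCHAR field) and from segment_size, which bounds how large each generated file may grow.

```python
import random
import threading

def append_rows(writer, begin, end):
    # Illustrative worker: appends [begin, end) synthetic rows; assumes append_row
    # tolerates being called from several threads, as the sample's fan-out implies.
    for i in range(begin, end):
        writer.append_row({
            "path": f"path_{i}",                              # hypothetical value
            "vector": [random.random() for _ in range(768)],  # hypothetical dim
        })

threads = []
thread_count = 100
rows_per_thread = 100
for k in range(thread_count):
    t = threading.Thread(target=append_rows,
                         args=(local_writer, k * rows_per_thread, (k + 1) * rows_per_thread))
    threads.append(t)
    t.start()
for t in threads:
    t.join()
```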

@xiaofan-luan (Contributor) commented:

/lgtm
/approve

@XuanYang-cn (Contributor) commented:

/approve

@sre-ci-robot commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: xiaofan-luan, XuanYang-cn, yhmo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sre-ci-robot merged commit b5b4db3 into milvus-io:2.2 on Sep 12, 2023
8 checks passed
@@ -161,11 +307,11 @@ def test_cloud_bulkinsert():
        access_key=object_url_access_key,
        secret_key=object_url_secret_key,
        cluster_id=cluster_id,
-       collection_name=COLLECTION_NAME,
+       collection_name=CSV_COLLECTION_NAME,


Review comment: bulk_import is missing the api_key parameter.
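A sketch of the call with api_key passed through; the import path and keyword names are assumptions based on the pymilvus bulk_writer helpers of this era, and CLOUD_API_ENDPOINT / API_KEY are hypothetical config names, not the PR's:

```python
# Assumed import path; adjust to match the sample's actual imports.
from pymilvus.bulk_writer import bulk_import

resp = bulk_import(
    url=CLOUD_API_ENDPOINT,                  # hypothetical endpoint constant
    api_key=API_KEY,                         # the parameter flagged as missing
    object_url=object_url,
    access_key=object_url_access_key,
    secret_key=object_url_secret_key,
    cluster_id=cluster_id,
    collection_name=CSV_COLLECTION_NAME,
)
```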

    )
    print(resp)

-   print(f"===================== get import job progress ====================")
+   print(f"\n===================== get import job progress ====================")
    job_id = resp['data']['jobId']


Review comment: this should be json.loads(resp.text)['data']['jobId'].
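A sketch of the suggested fix, assuming the helper returns a raw requests.Response rather than an already-decoded dict:

```python
import json

# resp is assumed to be a requests.Response, so decode the body before indexing it.
job_id = json.loads(resp.text)['data']['jobId']
print(f"Created import job: {job_id}")
```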

    )
    print(resp)

-   print(f"===================== get import job progress ====================")
+   print(f"\n===================== get import job progress ====================")
    job_id = resp['data']['jobId']
    resp = get_import_progress(


Review comment: missing the api_key parameter.

@@ -174,7 +320,7 @@ def test_cloud_bulkinsert():
    )
    print(resp)

-   print(f"===================== list import jobs ====================")
+   print(f"\n===================== list import jobs ====================")
    resp = list_import_jobs(


Review comment: missing the api_key parameter.
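A combined sketch for the two calls flagged above, with api_key included; the keyword names (and the page_size / current_page values) are assumptions based on the pymilvus bulk_writer helpers and may not match this branch exactly:

```python
import json

# Assumed import path; adjust to the sample's actual imports.
from pymilvus.bulk_writer import get_import_progress, list_import_jobs

resp = get_import_progress(
    url=CLOUD_API_ENDPOINT,   # hypothetical endpoint constant
    api_key=API_KEY,          # the parameter flagged as missing
    job_id=job_id,
    cluster_id=cluster_id,
)
print(json.loads(resp.text))

resp = list_import_jobs(
    url=CLOUD_API_ENDPOINT,
    api_key=API_KEY,          # the parameter flagged as missing
    cluster_id=cluster_id,
    page_size=10,
    current_page=1,
)
print(json.loads(resp.text))
```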

@yhmo deleted the sss branch on September 14, 2023, 03:15.