fix(vectorStores): correctly handle missing files in uploadAndPoll #926

Merged (1 commit) on Jul 10, 2024
15 changes: 9 additions & 6 deletions src/resources/beta/vector-stores/file-batches.ts
@@ -155,19 +155,22 @@ export class FileBatches extends APIResource {
     { files, fileIds = [] }: { files: Uploadable[]; fileIds?: string[] },
     options?: Core.RequestOptions & { pollIntervalMs?: number; maxConcurrency?: number },
   ): Promise<VectorStoreFileBatch> {
-    if (files === null || files.length == 0) {
-      throw new Error('No files provided to process.');
+    if (files == null || files.length == 0) {
+      throw new Error(
+        `No \`files\` provided to process. If you've already uploaded files you should use \`.createAndPoll()\` instead`,
+      );
     }
 
     const configuredConcurrency = options?.maxConcurrency ?? 5;
-    //We cap the number of workers at the number of files (so we don't start any unnecessary workers)
+
+    // We cap the number of workers at the number of files (so we don't start any unnecessary workers)
     const concurrencyLimit = Math.min(configuredConcurrency, files.length);
 
     const client = this._client;
     const fileIterator = files.values();
     const allFileIds: string[] = [...fileIds];
 
-    //This code is based on this design. The libraries don't accommodate our environment limits.
+    // This code is based on this design. The libraries don't accommodate our environment limits.
     // https://stackoverflow.com/questions/40639432/what-is-the-best-way-to-limit-concurrency-when-using-es6s-promise-all
     async function processFiles(iterator: IterableIterator<Uploadable>) {
       for (let item of iterator) {
@@ -176,10 +179,10 @@ export class FileBatches extends APIResource {
       }
     }
 
-    //Start workers to process results
+    // Start workers to process results
     const workers = Array(concurrencyLimit).fill(fileIterator).map(processFiles);
 
-    //Wait for all processing to complete.
+    // Wait for all processing to complete.
     await allSettledWithThrow(workers);
 
     return await this.createAndPoll(vectorStoreId, {
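For callers, the practical effect of the new `files == null` check is that `undefined`, `null`, and empty arrays are all rejected up front with a pointer to `.createAndPoll()`. Below is a minimal usage sketch, assuming the `client.beta.vectorStores.fileBatches` surface this file backs; the vector store ID, file ID, and path are hypothetical.

```ts
import fs from 'fs';
import OpenAI from 'openai';

const client = new OpenAI();

async function main() {
  // Uploads the files, then creates a file batch and polls it to completion.
  // With this fix, passing `undefined` or `[]` for `files` fails fast with the
  // clearer error instead of slipping past the old `files === null` check.
  const batch = await client.beta.vectorStores.fileBatches.uploadAndPoll('vs_hypothetical_123', {
    files: [fs.createReadStream('docs/report.pdf')],
  });
  console.log(batch.status);

  // If the files are already uploaded, follow the new error message's advice
  // and build the batch from their IDs with `.createAndPoll()` instead:
  await client.beta.vectorStores.fileBatches.createAndPoll('vs_hypothetical_123', {
    file_ids: ['file_hypothetical_abc'],
  });
}

main();
```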
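The concurrency cap in the surrounding code works because every worker pulls from the same `files.values()` iterator, so at most `concurrencyLimit` uploads are in flight at once (the linked Stack Overflow answer describes the same idea). Here is a standalone sketch of that pattern; `mapWithConcurrency` and its parameters are illustrative names, not part of the library.

```ts
// Illustrative sketch of the shared-iterator concurrency pattern; not library code.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  const iterator = items.values();

  // Each worker keeps taking the next item from the shared iterator until it is
  // exhausted, so no more than `limit` calls to `fn` run concurrently.
  async function worker(iter: IterableIterator<T>) {
    for (const item of iter) {
      results.push(await fn(item));
    }
  }

  // As in the diff: cap the worker count at the item count so no idle workers start.
  const workers = Array(Math.min(limit, items.length)).fill(iterator).map(worker);
  await Promise.all(workers);

  // Note: results are collected in completion order, not input order, which is
  // fine when you only need the gathered values (like the file IDs above).
  return results;
}
```

One difference from the real method: the library awaits `allSettledWithThrow(workers)` so every in-flight upload settles before any error propagates, whereas `Promise.all` in this sketch rejects as soon as one worker fails.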