feat: allow target MAX_SIZE_DEFAULT to be externally configurable #2155

Closed
pnadolny13 opened this issue Jan 18, 2024 · 1 comment · Fixed by #2248

@pnadolny13

In the base sink class we define a `MAX_SIZE_DEFAULT` that's set to 10,000 (`MAX_SIZE_DEFAULT = 10000`). That's a fine base default, and target developers can override it if their target prefers smaller or larger batches.
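For illustration, a minimal sketch of that developer-side override. It uses a stand-in `BaseSink` class rather than the SDK's real sink, so the class names and the `max_size` and `is_full` properties are assumptions made for this example only:

```python
class BaseSink:
    """Stand-in for the SDK's base sink class (not the real implementation)."""

    # Base default discussed in this issue.
    MAX_SIZE_DEFAULT = 10000

    def __init__(self):
        self.records = []

    @property
    def max_size(self):
        # Number of records to hold before the batch is considered full.
        return self.MAX_SIZE_DEFAULT

    @property
    def is_full(self):
        return len(self.records) >= self.max_size


class SmallBatchSink(BaseSink):
    """Hypothetical target sink whose developer prefers smaller batches."""

    # Overriding the class attribute changes the batch size for this target only.
    MAX_SIZE_DEFAULT = 500
```

The override is fixed in the target's code, so a user of the target cannot change it without forking or patching.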

What isn't supported today is the use case where I want to configure a target, say target-postgres, to write output more frequently because my tap's records are slow or expensive to retrieve. In that case, waiting until 10k records have been retrieved, the batch is old enough to drain, or the sync completes is undesirable. I might want to drain after every 50 records or so in some cases.
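A rough sketch of what a user-configurable batch size could look like. The `ConfigurableSink` class, its methods, and the `batch_size_rows` setting name are all hypothetical stand-ins chosen for this example, not the SDK's confirmed API:

```python
class ConfigurableSink:
    """Stand-in sink that reads its batch size from user config (illustrative only)."""

    MAX_SIZE_DEFAULT = 10000

    def __init__(self, config):
        self.config = config
        self.records = []
        self.drained_batches = []  # Records each drained batch, for demonstration.

    @property
    def max_size(self):
        # "batch_size_rows" is an assumed setting name; fall back to the class default.
        return int(self.config.get("batch_size_rows", self.MAX_SIZE_DEFAULT))

    def process_record(self, record):
        self.records.append(record)
        # Drain as soon as the configured batch size is reached.
        if len(self.records) >= self.max_size:
            self.drain()

    def drain(self):
        # A real target would write the batch to its destination here.
        if self.records:
            self.drained_batches.append(self.records)
            self.records = []


# Usage: drain after every 50 records instead of waiting for 10,000.
sink = ConfigurableSink({"batch_size_rows": 50})
for i in range(120):
    sink.process_record({"id": i})
sink.drain()  # Final drain at the end of the sync flushes the remainder.
```

With this shape the target developer's default still applies whenever the user leaves the setting unset.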

One example that came up a while ago is the map-gpt-embeddings plugin, which makes requests to OpenAI's API. It takes a while to embed all the input data and each request costs money, so I'd prefer to drain as frequently as possible.

cc @edgarrmondragon

@BuzzCutNorman

@pnadolny13 if you're able to, would you mind giving PR #1876 a try with a target to see if it satisfies your scenario?
