Replies: 1 comment
Performing the batch based on event count is consistent with many other batched processing systems I've seen, and can be made more flexible by also offering a method to trigger execution on-demand (or, in more standard batching terms, to "flush" the queue). However, that does give the task a variable response time: instead of a guaranteed fire within 60 seconds of the event, it now depends on how frequently a given system schedules poll events. This is generally acceptable in other systems I've seen, so long as a configurable maximum wait is available.
One consideration may be to expand on the "backoff" strategy configuration already in use. This kind of logic could alternatively be implemented by accepting a function for the backoff delay. If the number of attempted runs for a poll is exposed at all to the job/run, it also offers the opportunity to set another configuration value, such as a maximum attempt count. Apologies if this overlooks any of the existing options or limitations in place for Tasks/Runs.
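As a sketch of what that might look like (all names below are hypothetical, not existing Trigger.dev APIs):

```ts
// Hypothetical sketch only: none of these names are real Trigger.dev APIs.
type BackoffFn = (attempt: number) => number; // returns the delay in seconds before the next poll

interface PollConfig {
  // declarative form, mirroring common retry/backoff options
  backoff?: { type: "fixed" | "exponential"; baseSeconds: number; maxSeconds?: number };
  // function form, for full control over the schedule
  backoffFn?: BackoffFn;
  // a further option this enables: give up after this many attempts
  maxAttempts?: number;
}

// Example: exponential backoff starting at 60s, capped at 15 minutes
const pollConfig: PollConfig = {
  backoffFn: (attempt) => Math.min(60 * 2 ** attempt, 900),
  maxAttempts: 100,
};
```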
This idea comes from thinking through the scaling implications of the `io.waitUntil` task, which would "poll" the task on an interval to determine if some condition inside a callback was true, and then only move to the next task once that happens. This would be extremely useful, but unfortunately it could also cause a massive amount of wasted function execution time for Trigger.dev clients. Imagine the condition runs every 60 seconds (the minimum interval) but never returns true, and the timeout is 14 days: that's an additional 20,160 executions. And that's only for a single task in a single run.
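For concreteness, here is a usage sketch (the `io.waitUntil` signature isn't finalized; the option names, `getOrder`, and `orderId` below are illustrative placeholders):

```ts
// Hypothetical usage of the proposed io.waitUntil task.
await io.waitUntil("order-settled", {
  pollInterval: 60,           // seconds between condition checks (60 is the minimum)
  timeout: 60 * 60 * 24 * 14, // give up after 14 days
  condition: async () => {
    const order = await getOrder(orderId); // placeholder user code
    return order.status === "settled";
  },
});
// execution only continues past this point once the condition returns true
```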
Clearly, this does not scale well, and the situation would get even worse when we implement polling triggers (e.g. for Notion, which doesn't have webhooks).
I think we'll need to develop a "Batch Polling" system before these features can be released, which would aggregate all the polling work over the last interval and perform it with a single request.
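As a rough illustration, the aggregated request body might be shaped something like this (all type and field names are invented for the sketch):

```ts
// Invented shapes for illustration; not an actual Trigger.dev payload.
interface PollItem {
  runId: string;    // the run that is waiting on this poll
  taskId: string;   // the waitUntil task (or polling trigger) to evaluate
  payload: unknown; // whatever the condition callback needs
}

interface BatchPollRequest {
  polls: PollItem[]; // every poll scheduled since the last interval
}
```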
One consideration with this system is that we'd have to make sure the request body payload isn't too large, and if it is, we should be able to split the polling work across multiple requests.
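A minimal sketch of that splitting, reusing the `PollItem` shape above and assuming an illustrative 1 MB body limit:

```ts
// Split a batch into chunks whose serialized size stays under the limit.
const MAX_BODY_BYTES = 1_000_000; // assumed example cap, not a known platform limit

function chunkBySize(polls: PollItem[]): PollItem[][] {
  const chunks: PollItem[][] = [];
  let current: PollItem[] = [];
  let currentBytes = 0;
  for (const poll of polls) {
    const bytes = Buffer.byteLength(JSON.stringify(poll), "utf8");
    // start a new chunk when adding this item would exceed the cap
    if (current.length > 0 && currentBytes + bytes > MAX_BODY_BYTES) {
      chunks.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(poll);
    currentBytes += bytes;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```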
Another issue this system could run into (and I'm open to ideas on how to solve this) is not being able to finish all the required work within the function execution time limit.
Maybe instead of performing the Batch Polling request every X seconds, it would fire after X number of polling tasks had been scheduled. This could drastically reduce the work required while getting around the issues listed above; even flushing after every 10 polling tasks would cut the number of requests substantially. A sketch of this count-based approach follows below.
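A sketch of that count-based batching, reusing the `PollItem` shape above, with a fallback timer so a poll never waits indefinitely when traffic is low (again, all names are invented):

```ts
// Flush when maxCount polls accumulate, or after maxWaitMs, whichever comes first.
class PollBatcher {
  private queue: PollItem[] = [];
  private timer?: ReturnType<typeof setTimeout>;

  constructor(
    private flushFn: (polls: PollItem[]) => Promise<void>,
    private maxCount = 10,      // flush every 10 scheduled polling tasks...
    private maxWaitMs = 60_000, // ...or after 60s, whichever comes first
  ) {}

  add(poll: PollItem): void {
    this.queue.push(poll);
    if (this.queue.length >= this.maxCount) {
      void this.flush();
    } else if (!this.timer) {
      this.timer = setTimeout(() => void this.flush(), this.maxWaitMs);
    }
  }

  async flush(): Promise<void> {
    if (this.timer) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    if (this.queue.length === 0) return;
    const batch = this.queue;
    this.queue = [];
    await this.flushFn(batch); // e.g. POST one BatchPollRequest (split by size if needed)
  }
}
```

The timer fallback keeps an upper bound on how long any single poll can wait, which also addresses the configurable maximum wait raised in the reply above.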