fix: increase timeout rate of lora metadata/model downloads #211

tazlin · 2024-03-07T12:45:44Z

The new multiprocess worker paradigm is very sensitive to timing in certain respects. This PR addresses the tendency for LoRa metadata or model downloads to hang at certain points in the model manager, which tends to make the worker detect the working process as stale and causes all sorts of issues.

In short:

- Adjusts timeouts to be more in line with the real-time generation expectations of the worker.
- Gives up metadata collection sooner in certain probably unrecoverable circumstances, such as getting a 500 or an HTML document back.
- Increases the number of download threads to improve the chances that things move along.
  - During testing, this has been increased to 5x the previous number, which greatly increases the rate at which LoRa-based tests run.
- Increases the polling frequency of threads (THREAD_WAIT_TIME).
- Binds several other time.sleep(...) calls to the value of THREAD_WAIT_TIME.
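
For illustration only, the thread-related points above could be sketched roughly as follows. This is not the code in hordelib/model_manager/lora.py. `NUM_DOWNLOAD_THREADS`, `download_worker`, `run_downloads`, and the value chosen for `THREAD_WAIT_TIME` are assumptions made for this sketch, and the "download" is a stand-in string.

```python
import queue
import threading
import time

THREAD_WAIT_TIME = 0.1    # assumed value; the PR text does not state the real one
NUM_DOWNLOAD_THREADS = 5  # hypothetical count, not taken from hordelib


def download_worker(jobs: queue.Queue, results: list, stop: threading.Event) -> None:
    """Poll the job queue until asked to stop."""
    while not stop.is_set():
        try:
            name = jobs.get_nowait()
        except queue.Empty:
            # Idle polling is bound to the single shared interval.
            time.sleep(THREAD_WAIT_TIME)
            continue
        results.append(f"downloaded:{name}")  # stand-in for a real download


def run_downloads(names: list, timeout: float = 5.0) -> list:
    """Run the stand-in downloads on several threads, bounded by a deadline."""
    jobs: queue.Queue = queue.Queue()
    results: list = []
    stop = threading.Event()
    threads = [
        threading.Thread(target=download_worker, args=(jobs, results, stop), daemon=True)
        for _ in range(NUM_DOWNLOAD_THREADS)
    ]
    for thread in threads:
        thread.start()
    for name in names:
        jobs.put(name)

    # The caller waits on the same interval, so it never blocks indefinitely.
    deadline = time.monotonic() + timeout
    while len(results) < len(names) and time.monotonic() < deadline:
        time.sleep(THREAD_WAIT_TIME)

    stop.set()
    for thread in threads:
        thread.join()
    return sorted(results)
```

Using one constant for every sleep means a shorter polling interval tightens all waits at once, and the deadline in `run_downloads` keeps a stuck download from holding the caller longer than `timeout`.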

hordelib/model_manager/lora.py

db0 · 2024-03-07T12:53:06Z

Looks OK. My comment is more that I don't think increasing the threads would do much, but it won't hurt.

tazlin added 2 commits March 7, 2024 07:36

fix: increase timeout rate of lora metadata/model downloads

878d8bf

fix: give longer metadata queries more time

8e67a3b

db0 reviewed Mar 7, 2024


hordelib/model_manager/lora.py

tazlin added 3 commits March 7, 2024 08:15

fix: retry less often with TI model manager also

30966b4

fix: retry 500s a few times on lora/ti metadata dl timeout

51d9541

tests: corrects non existing lora test to new logic

820b1c3
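
As a rough sketch of the behaviour these commits and the description point at (retry a 500 a few times, give up right away on an HTML document), assuming nothing about the actual hordelib implementation: `FakeResponse`, `fetch_metadata`, `MAX_500_RETRIES`, and the ordering of the checks are all hypothetical, and the injected `fetch` callable stands in for the real HTTP request.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

THREAD_WAIT_TIME = 0.01  # assumed value; the PR text does not state the real one
MAX_500_RETRIES = 3      # hypothetical reading of "a few times"


@dataclass
class FakeResponse:
    """Hypothetical stand-in for an HTTP response object."""
    status_code: int
    content_type: str
    body: str


def fetch_metadata(
    fetch: Callable[[], FakeResponse],
    max_500_retries: int = MAX_500_RETRIES,
) -> Optional[str]:
    """Return the metadata body, or None once the request looks unrecoverable."""
    server_errors = 0
    while True:
        response = fetch()
        if response.status_code >= 500:
            server_errors += 1
            if server_errors > max_500_retries:
                return None  # give up after a few server errors
            time.sleep(THREAD_WAIT_TIME)  # wait on the shared interval, then retry
            continue
        if response.status_code != 200:
            return None  # other failures are not retried in this sketch
        if "text/html" in response.content_type:
            return None  # an HTML page instead of metadata: give up immediately
        return response.body
```

Passing the request in as a callable keeps the sketch independent of any particular HTTP library and makes the give-up paths easy to exercise in a test.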

tazlin merged commit e0fb477 into main Mar 7, 2024
2 checks passed

tazlin deleted the lora-metadata-timeout branch March 7, 2024 16:44

This was referenced Mar 7, 2024

fix: increase timeout rate of lora metadata/model downloads #213

Merged

fix: performance improvements; use hordelib w/ lora/ti metadata download fixes Haidra-Org/horde-worker-reGen#155

Merged
