VRAM is not freed on errors #303
Labels: bug (Something isn't working), downstream-fix (Likely requires fixing another project downstream), enhancement (New feature or request)

Comments
rsxdalv replied:

Thanks for the tip, the models should definitely free memory on failure. I think some of them would free it on the next load, but that's not ideal.

As for the issue itself, do you see any error messages in the console? I don't think musicgen should be doing telemetry, and I have disabled Gradio telemetry. (By the way, I have almost no idea what people are *actually* using.)
On Thu, Apr 11, 2024, 8:48 AM rofoto wrote:

> When using musicgen the process completes and all files are created, but I have blocked outbound network traffic, and this causes an error when (I assume) musicgen tries to send out telemetry.
> This error does not affect the outputs, but it also puts the GPU in a state where VRAM is not freed, forcing a restart.
> This is not the only error that puts the GPU in this state. It appears that pretty much any error, including but not limited to torch.cuda.OutOfMemoryError and errors when trying to download models, puts the GPU in this state.
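A minimal sketch of what "free memory on failure" could look like, assuming a PyTorch-backed model; load_musicgen and generate are hypothetical placeholders, not this project's actual API:

```python
import gc

import torch

def generate_safely(prompt):
    model = None
    try:
        model = load_musicgen()          # hypothetical loader
        return generate(model, prompt)   # hypothetical generation call
    finally:
        # Runs on success and on any exception (blocked telemetry, OOM,
        # failed downloads, ...), so a failed run cannot leave the
        # weights pinned in VRAM.
        del model
        gc.collect()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
```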
rsxdalv replied:

> this is the only other error I am seeing
> That's why I thought telemetry was the potential issue.

That error is because you still have a frontend and a server; it generally comes from Gradio. Although it's not impossible that this could come from telemetry at some point, it's a different situation.
On Fri, Apr 12, 2024, 2:59 PM rofoto wrote:

> this is the only other error I am seeing:
>
>     "tts-6.0_webui\installer_files\env\lib\asyncio\proactor_events.py", line 165, in _call_connection_lost
>       self._sock.shutdown(socket.SHUT_RDWR)
>     ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.
>
> That's why I thought telemetry was the potential issue.
>
>> I think some of them would free it on the next load, but that's not ideal.
>
> After looking, it would appear that in some situations the VRAM is freed on the next run, but ideally it can be cleared at the end of generation, just in case.
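For the "cleared at the end of generation" idea, a minimal sketch, assuming the model is kept in a module-level cache between runs; the _model cache and the unload_model name are assumptions for illustration, not this codebase's internals:

```python
import gc

import torch

_model = None  # hypothetical module-level cache reused between runs

def unload_model():
    """Drop the cached model and hand its VRAM back to the driver."""
    global _model
    _model = None    # drop the last reference to the weights
    gc.collect()     # collect any lingering reference cycles
    if torch.cuda.is_available():
        # empty_cache() only releases blocks held by the caching
        # allocator; it cannot free tensors that are still referenced,
        # which is why the reference is dropped first.
        torch.cuda.empty_cache()
```

Calling unload_model() at the end of every run trades a model reload on the next run for the guarantee that a later pass (such as the multi-band step mentioned in the report) starts with a clean VRAM budget.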
rsxdalv added the bug (Something isn't working), enhancement (New feature or request), and downstream-fix (Likely requires fixing another project downstream) labels on Sep 11, 2024.
Original issue description (rofoto):

When using musicgen the process completes and all files are created, but I have blocked outbound network traffic, and this causes an error. If you enable multi-band after this, there is a chance that there will not be enough VRAM for it. Not sure this is intended behavior, since there seems to be a delayed cleanup between runs.

This error does not affect the outputs, but it also puts the GPU in a state where VRAM is not freed, forcing a restart.

This is not the only error that puts the GPU in this state. It appears that pretty much any error, including but not limited to torch.cuda.OutOfMemoryError and errors when trying to download models, puts the GPU in this state.
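A minimal sketch of the kind of guard this report suggests, assuming PyTorch: on any failure, release what the run allocated and re-raise so the error still surfaces. run_with_vram_guard is a hypothetical wrapper, not something the project ships:

```python
import gc

import torch

def run_with_vram_guard(fn, *args, **kwargs):
    try:
        return fn(*args, **kwargs)
    except Exception:
        # Any failure (torch.cuda.OutOfMemoryError, blocked telemetry,
        # failed model downloads, ...) lands here: free what we can and
        # flush the CUDA caching allocator before propagating the error.
        gc.collect()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        raise
```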