excessive memory usage and crashes with long duration files #70
Comments
Dear @realies,

Thank you for your message.

First of all, according to the error message you provided, your problem does not seem to be related only to the amount of RAM, but also to the inability of the Docker image to initialize CUDA. This may happen when the NVIDIA drivers are not compatible with the TensorFlow version used. Could you provide me with the output of the command? Could you also provide the whole output of your docker command for a 1-hour-long file and for a shorter file?

As you noticed, the current design of inaSpeechSegmenter requires a large amount of RAM, which depends on the duration of the file being processed. Up to now, these requirements haven't been a problem for my use-cases, as long as the hardware has enough RAM: from my point of view, 4.5 GB is not a large amount of RAM on laptops or GPU servers.

I guess a RAM-friendly implementation of inaSpeechSegmenter could be done using Keras data generator structures. While this would be useful, I currently don't have the time to make this improvement. If this feature is necessary for your use-case, and you're willing to contribute to inaSpeechSegmenter's code base, please let me know and we can plan a meeting to discuss these issues.

Kind regards,
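For illustration, here is a minimal sketch of the Keras data generator idea. It is not taken from inaSpeechSegmenter itself: the feature array, batch size and model are placeholder assumptions, and it only shows how a tf.keras.utils.Sequence can feed a network batch by batch instead of materialising all features in RAM at once.

```python
# Sketch only, not inaSpeechSegmenter code: the feature source, batch size
# and model are hypothetical placeholders.
import numpy as np
import tensorflow as tf


class FeatureSequence(tf.keras.utils.Sequence):
    """Serve fixed-size batches from a (possibly memory-mapped) feature array."""

    def __init__(self, features, batch_size=256):
        self.features = features      # e.g. np.load('features.npy', mmap_mode='r')
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch / prediction pass.
        return int(np.ceil(len(self.features) / self.batch_size))

    def __getitem__(self, idx):
        lo = idx * self.batch_size
        hi = lo + self.batch_size
        # Only this slice is copied into RAM for the current batch.
        return np.asarray(self.features[lo:hi])


# Hypothetical usage: predictions are produced batch by batch, so peak memory
# is bounded by the batch size rather than by the file duration.
# model = tf.keras.models.load_model('some_model.h5')
# predictions = model.predict(FeatureSequence(np.load('features.npy', mmap_mode='r')))
```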
Thanks for the prompt reply, @DavidDoukhan! I think the error message is unrelated to the issue and is expected, as this Docker container runs without a GPU, and the machine it runs on does not have one.
Here's the output of segmenting a 1-minute file vs a 1-hour file when the container is limited to 1 GB of system memory:
I'm trying to use the library on a non-GPU system, and those are usually available with less RAM than GPU systems. I think lowering the memory footprint of inaSpeechSegmenter would be a good optimisation.
Appreciate the suggestion on using Keras data generator structures. I have very little experience writing neural networks outside of Matlab and was wondering if you might be interested in doing this refactoring for a fee, or in helping guide me through the high-level changes that need to happen for this to work (might require extra noob patience).
Update: using 1h_file.wav as the input file made inaSpeechSegmenter peak at around 7 GB at the beginning before stabilising at around 4.5 GB for most of its run.
One quick and dirty solution is to probe the length of your media file with ffprobe (which is in the ffmpeg package), then split it into shorter chunks, run the segmenter piece by piece with the start_sec and stop_sec arguments of Segmenter.__call__, and finally join the result arrays together. Make sure to import the gc module and run a garbage collection after each segmenter call.

From my experience, 10-minute chunks are perfectly fine for a 1 GB RAM, 5 GB swap Docker container; the swap size is very much exaggerated because I really don't want to get a signal 9. Segmentation quality is presumably worse, but that is not critical to my use scenario. See my wrapper around Segmenter.__call__ at

The same idea applies to limited-VRAM scenarios, although I have to warn about processing speeds on limited hardware: an Oracle 1-vCore, 1 GB RAM instance runs at 230 ms/step, while a GTX 1070 runs at 2 ms/step by comparison.
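For reference, here is a minimal sketch of that chunked approach. It assumes the start_sec/stop_sec keyword arguments of Segmenter.__call__ mentioned above; the 10-minute chunk length, the media_duration/segment_in_chunks helper names and the 1h_file.wav file name are illustrative placeholders, and whether the returned timestamps are absolute or chunk-relative should be verified before joining the results.

```python
# Sketch of the chunked approach described above; helper names are hypothetical.
import gc
import subprocess

from inaSpeechSegmenter import Segmenter


def media_duration(path):
    """Probe the media duration in seconds with ffprobe (part of ffmpeg)."""
    out = subprocess.check_output(
        ['ffprobe', '-v', 'error',
         '-show_entries', 'format=duration',
         '-of', 'default=noprint_wrappers=1:nokey=1',
         path],
        text=True,
    )
    return float(out.strip())


def segment_in_chunks(path, chunk_sec=600):
    """Run the segmenter on consecutive chunks and join the results."""
    seg = Segmenter()
    duration = media_duration(path)
    results = []
    start = 0.0
    while start < duration:
        stop = min(start + chunk_sec, duration)
        # Process only [start, stop) of the file to bound peak memory usage.
        # Assumption: returned (label, start, stop) tuples refer to the
        # original file's timeline, so they can simply be concatenated.
        results.extend(seg(path, start_sec=start, stop_sec=stop))
        gc.collect()  # free per-chunk buffers before the next iteration
        start = stop
    return results


if __name__ == '__main__':
    for label, seg_start, seg_stop in segment_in_chunks('1h_file.wav'):
        print(label, seg_start, seg_stop)
```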
Using a rolling window for analysis instead of buffering the whole file into RAM/VRAM would be much better than workarounds that make the library more inaccurate and complicated to use.
Trying to process files that are around 1 hour long makes inaSpeechSegmenter want to use about 4.5 GB of system memory. If that can't be allocated, it crashes with this output and exit code 137:

This happens when inaSpeechSegmenter is used with:

Limiting the Docker container to a fixed amount of memory (e.g. --memory=1g) makes it crash with the same error message as above. Can the memory footprint be controlled in any way?

System information
Expected Behavior
Use as much memory as available, and slow down the process if it can't allocate everything.
Current Behavior
Crash