The 'seldon-container-engine' sidecar that is attached to every service runs as a Spring application that monitors and routes incoming prediction requests to the underlying wrapper service (in Python, that's the wrapped Flask application that calls your model class).
When the JVM runs out of heap, all sorts of hard-to-diagnose problems occur. For example, I had one model that processed large 4K images for classification, and the JSON result failed to serialize into a protobuf for lack of heap. In another instance the model failed almost silently.
Raising the maximum heap size should not degrade slim deployments either, IMO. All it does is allow the JVM to use more heap space when the container is given more memory. But I admit I am not a Java expert, nor do I drink coffee.
With a max heap set, the JVM may keep growing to fill it before doing a big GC. We need to check how this interacts with Kubernetes memory requests and limits.
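For what it's worth, recent JVMs can do percentage-based heap sizing from the container limit themselves. A minimal sketch, assuming JDK 10+ (or 8u191+, where these flags were backported); `engine.jar` is just a placeholder:

```sh
# Let the JVM read the cgroup memory limit and cap the heap at a
# percentage of it, instead of hard-coding -Xmx. UseContainerSupport
# is on by default from JDK 10 on Linux.
java -XX:+UseContainerSupport \
     -XX:MaxRAMPercentage=75.0 \
     -jar engine.jar
```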
Don't you want that rather than crashing in unexpected ways?
How about this:
Can you auto discovery the amount of memory k8s allocates on container startup via some wrapper script that ultimately calls 'java' with the "right" CLI parameters? The container engine could I suppose just set a percentage of memory as a default for the heap.
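Something along those lines, perhaps. A minimal sketch of such a wrapper, assuming a cgroup v1 container (the path differs under cgroup v2) and an illustrative `/app/engine.jar`:

```sh
#!/bin/sh
# Hypothetical entrypoint: derive -Xmx from the cgroup memory limit
# that Kubernetes sets from the pod's resource limit.
LIMIT_FILE=/sys/fs/cgroup/memory/memory.limit_in_bytes
HEAP_OPT=""
if [ -r "$LIMIT_FILE" ]; then
    LIMIT_BYTES=$(cat "$LIMIT_FILE")
    # An unlimited cgroup reports a huge sentinel value; only size
    # the heap when the limit looks like a real container limit.
    if [ "$LIMIT_BYTES" -lt 68719476736 ]; then   # < 64 GiB
        # Give the heap ~70% of the limit, leaving headroom for
        # metaspace, thread stacks, and native buffers.
        HEAP_MB=$((LIMIT_BYTES / 1024 / 1024 * 70 / 100))
        HEAP_OPT="-Xmx${HEAP_MB}m"
    fi
fi
# $HEAP_OPT is intentionally unquoted so it disappears when empty.
exec java $HEAP_OPT -jar /app/engine.jar "$@"
```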