We want to deploy a large number of models (modelling different things, not different versions of the same thing) using the Python wrapper, and in many cases we just want the model, without any transformer or fancy A/B routing. In our case the Python model takes roughly 200 MB of memory, while the Java Spring application that is deployed alongside it takes roughly 700 MB. Is it possible to write a SeldonDeployment that does not deploy the Java application for the cases where we just want to deploy a "naked" model? The relevant part of our SeldonDeployment is:
Allowing direct access via the API for single-model scenarios like yours, where a service orchestrator is not required, is on our roadmap. We first need to make the internal and external APIs the same so they can be used in this way, and then add an annotation to wire things up for the case where no service orchestrator is used.
As for the size of the running images: yes, we want to ensure smaller image footprints. In the meantime, you can set JAVA_OPTS for the engine to reduce the default JVM memory usage.
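As a rough illustration of the JAVA_OPTS suggestion, here is a minimal Python sketch that builds a heap-limiting value from the standard JVM flags `-Xms` (initial heap) and `-Xmx` (maximum heap) and wraps it in a Kubernetes-style environment variable entry. The helper names and the 64/256 MB figures are hypothetical and not tuned for the engine, and how the entry is attached to the engine container depends on your Seldon Core version, so that part is not shown.

```python
def java_opts(max_heap_mb: int, initial_heap_mb: int) -> str:
    """Build a JAVA_OPTS string that caps the JVM heap.

    -Xms sets the initial heap size and -Xmx the maximum heap size;
    both are standard JVM options. The sizes are caller-supplied.
    """
    if initial_heap_mb > max_heap_mb:
        raise ValueError("initial heap must not exceed maximum heap")
    return f"-Xms{initial_heap_mb}m -Xmx{max_heap_mb}m"


def engine_env(max_heap_mb: int, initial_heap_mb: int) -> dict:
    """Return a Kubernetes-style env var entry (name/value pair).

    Hypothetical helper: where this entry goes in the deployment
    spec is an assumption left to the reader.
    """
    return {"name": "JAVA_OPTS", "value": java_opts(max_heap_mb, initial_heap_mb)}


if __name__ == "__main__":
    # Illustrative values only.
    print(engine_env(max_heap_mb=256, initial_heap_mb=64))
```

Note that the heap limit is only part of the JVM's footprint (metaspace, thread stacks and native memory add to it), so the container's memory limit should leave headroom above `-Xmx`.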