deployment code

from sagemaker.huggingface.model import HuggingFaceModel

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    model_data=model_data,  # path to your model and script
    role=role,              # IAM role with permissions to create an endpoint
    image_uri='763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference-neuron:1.10.2-transformers4.20.1-neuron-py37-sdk1.19.1-ubuntu18.04'
)

# Let SageMaker know that we've already compiled the model via neuron-cc
huggingface_model._is_compiled_model = True

# deploy the endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,           # number of instances
    instance_type="ml.inf1.24xlarge"    # AWS Inferentia instance
)
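For context, the model_data artifact above is expected to contain a model that was already compiled with neuron-cc. A minimal sketch of that compilation step, assuming a text-classification model and a fixed sequence length of 128 (the model id, sequence length, and file names here are placeholders, not details from this issue):

# Sketch of the neuron-cc compilation step (model id, sequence length, and file name are placeholders)
import torch
import torch.neuron
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, torchscript=True)

# Neuron compiles for static shapes, so trace with a fixed-length example input.
example = tokenizer("a sample input", max_length=128, padding="max_length",
                    truncation=True, return_tensors="pt")

# Compile via neuron-cc and save the TorchScript artifact.
neuron_model = torch.neuron.trace(model, (example["input_ids"], example["attention_mask"]))
neuron_model.save("model_neuron.pt")

The saved artifact, together with the inference script, would then be packaged into the model.tar.gz that model_data points to.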
When I use inf1.xlarge, my endpoint works as expected. The moment I switch to ml.inf1.24xlarge, ml.inf1.6xlarge, or ml.inf1.2xlarge, I get hit with the following error.
What am I missing here?
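The full error output is not reproduced above; one way to retrieve it is from the endpoint's CloudWatch logs. A sketch using boto3, assuming the standard /aws/sagemaker/Endpoints/<endpoint-name> log group naming:

import boto3

logs = boto3.client("logs")
# SageMaker endpoints stream container logs to this log group by default.
log_group = f"/aws/sagemaker/Endpoints/{predictor.endpoint_name}"

# Read the most recently active log stream and print its events.
streams = logs.describe_log_streams(logGroupName=log_group,
                                    orderBy="LastEventTime", descending=True)
for stream in streams["logStreams"][:1]:
    events = logs.get_log_events(logGroupName=log_group,
                                 logStreamName=stream["logStreamName"],
                                 startFromHead=False)
    for event in events["events"]:
        print(event["message"])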
aj2622 changed the title from "Inference works on ml.inf1.xlarge but fails on ml.inf1.xlarge with 'The PyTorch Neuron Runtime could not be initialized'" to "Inference works on ml.inf1.xlarge but fails on ml.inf1.24xlarge with 'The PyTorch Neuron Runtime could not be initialized'" on Aug 18, 2022.