Deploy an endpoint for a custom EfficientNet model
Once a custom model has been trained, you can create a SageMaker Endpoint to run inference on new data.
1. Create Model Package
Under Training > Training Jobs select the target training job,
click on Actions and then Create Model Package.

In the Create model package page:
- Fill the Model package name e.g. ants-and-bees
- Verify that in Inference specification options: "Provide the algorithm used for training and its model artifacts" is selected
- The Algorithm and model artifacts will be filled automatically including the Algorithm ARN and Location of model artifacts
  - The Location of model artifacts will be the output of the Training Job e.g. s3://my-bucket/ants-bees-output/output/model.tar.gz
- Click on Next
- Under Validate this resource select No
- Click Create model package
A new Model Package will be created.
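The console steps above can also be approximated from code. The following is a minimal sketch using boto3's `create_model_package` call, not the documented workflow for this product. The algorithm ARN is a hypothetical placeholder, and the helper names `build_model_package_request` and `create_model_package` are introduced here for illustration only. No validation specification is passed, which is assumed to correspond to selecting No under Validate this resource.

```python
def build_model_package_request(name, algorithm_arn, model_data_url):
    """Build the keyword arguments for SageMaker's CreateModelPackage call."""
    return {
        "ModelPackageName": name,
        # Equivalent of "Provide the algorithm used for training and its
        # model artifacts" in the console.
        "SourceAlgorithmSpecification": {
            "SourceAlgorithms": [
                {
                    "AlgorithmName": algorithm_arn,
                    "ModelDataUrl": model_data_url,
                }
            ]
        },
    }


def create_model_package(request):
    """Send the request to AWS and return the new Model Package ARN."""
    import boto3  # imported lazily so the builder works without AWS access

    client = boto3.client("sagemaker")
    return client.create_model_package(**request)["ModelPackageArn"]


request = build_model_package_request(
    name="ants-and-bees",
    # Hypothetical ARN: replace with the Algorithm ARN shown in the console.
    algorithm_arn="arn:aws:sagemaker:us-east-1:123456789012:algorithm/example-algorithm",
    model_data_url="s3://my-bucket/ants-bees-output/output/model.tar.gz",
)
# Uncomment to create the Model Package in your account:
# print(create_model_package(request))
```

Keeping the request builder separate from the AWS call makes it possible to inspect the payload before anything is created.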
2. Create Endpoint
Select the Model package, click on Action and then on Create endpoint.

In the Create model and endpoint page:
- Under Model settings
  - Select a Model name e.g. ants-and-bees
  - Select an IAM role that has access to S3 (where the trained model was saved)
- Under Container definition
  - Verify "Use a model package subscription from AWS Marketplace" is selected
  - Verify Selected model package ARN is pointing to the newly created Model Package
- Click on Next
- Select an Endpoint name e.g. ants-and-bees
- Under Attach endpoint configuration select Create a new endpoint configuration
- Under New endpoint configuration and Production variants
  - Click Edit in the Actions column and select the instance type for the endpoint. The minimum recommended is ml.c5.xlarge
  - Click on Create endpoint configuration
- Click on Submit
A new endpoint will be created.
Note
It might take a couple of minutes for the endpoint to be available.
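For scripted deployments, the same flow can be sketched with boto3: create a model from the Model Package, create an endpoint configuration, create the endpoint, then wait until it is in service. This is an illustrative sketch under assumptions, not the only way to do it. The role ARN and Model Package ARN are hypothetical placeholders, the variant name `AllTraffic` and the helper names are arbitrary choices, and `EnableNetworkIsolation` is set on the assumption that the package is backed by an AWS Marketplace algorithm.

```python
def build_endpoint_requests(name, role_arn, model_package_arn,
                            instance_type="ml.c5.xlarge"):
    """Build the payloads for create_model, create_endpoint_config and create_endpoint."""
    return {
        "model": {
            "ModelName": name,
            "ExecutionRoleArn": role_arn,
            "PrimaryContainer": {"ModelPackageName": model_package_arn},
            # Assumption: Marketplace-based packages need network isolation.
            "EnableNetworkIsolation": True,
        },
        "endpoint_config": {
            "EndpointConfigName": name,
            "ProductionVariants": [
                {
                    "VariantName": "AllTraffic",  # arbitrary variant name
                    "ModelName": name,
                    "InitialInstanceCount": 1,
                    "InstanceType": instance_type,
                }
            ],
        },
        "endpoint": {"EndpointName": name, "EndpointConfigName": name},
    }


def deploy(reqs):
    """Create the resources in AWS and block until the endpoint is in service."""
    import boto3  # imported lazily so the builder works without AWS access

    client = boto3.client("sagemaker")
    client.create_model(**reqs["model"])
    client.create_endpoint_config(**reqs["endpoint_config"])
    client.create_endpoint(**reqs["endpoint"])
    # The waiter polls the endpoint status, which can take several minutes.
    waiter = client.get_waiter("endpoint_in_service")
    waiter.wait(EndpointName=reqs["endpoint"]["EndpointName"])


reqs = build_endpoint_requests(
    name="ants-and-bees",
    # Hypothetical ARNs: replace with the values from your account.
    role_arn="arn:aws:iam::123456789012:role/example-sagemaker-role",
    model_package_arn="arn:aws:sagemaker:us-east-1:123456789012:model-package/ants-and-bees",
)
# Uncomment to deploy in your account:
# deploy(reqs)
```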
3. Making a query
With the endpoint ready you will have a URL to make predictions, for example:
How to query the Invocations endpoint
For complete documentation on how to query this endpoint see the InvokeEndpoint documentation in the AWS Docs.
The key part is handling AWS Signature Version 4 request signing, for example using Python.
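As a rough illustration of what the signing step involves when you call the URL directly instead of going through boto3, the sketch below signs a POST request with botocore's `SigV4Auth` and sends it with the standard library. The region `us-east-1`, the helper names and the general URL pattern (the standard SageMaker runtime form) are assumptions made for this example. It also assumes AWS credentials are already configured locally.

```python
def invocation_url(region, endpoint_name):
    """Return the SageMaker runtime URL for an endpoint (general AWS pattern)."""
    return (
        f"https://runtime.sagemaker.{region}.amazonaws.com"
        f"/endpoints/{endpoint_name}/invocations"
    )


def signed_headers(url, body, region, content_type="image/jpeg"):
    """Sign a POST request with Signature Version 4 and return its headers."""
    # Lazy imports so the URL helper can be used without boto3 installed.
    import boto3
    from botocore.auth import SigV4Auth
    from botocore.awsrequest import AWSRequest

    credentials = boto3.Session().get_credentials()
    request = AWSRequest(
        method="POST",
        url=url,
        data=body,
        headers={"Content-Type": content_type},
    )
    # "sagemaker" is the service name used when signing runtime requests.
    SigV4Auth(credentials, "sagemaker", region).add_auth(request)
    return dict(request.headers)


def invoke(url, body, region):
    """POST the signed request and return the raw response bytes."""
    import urllib.request

    req = urllib.request.Request(
        url, data=body, headers=signed_headers(url, body, region), method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


url = invocation_url("us-east-1", "ants-and-bees")  # region is an assumption
# Uncomment to send a real request:
# with open("validation/ant1.jpg", "rb") as f:
#     print(invoke(url, f.read(), "us-east-1"))
```

In most Python code it is simpler to let boto3 handle the signing, as in the examples that follow.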
As a quick example let's use these two images:


Querying the endpoint:
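import json

import boto3

client = boto3.client("sagemaker-runtime")
endpoint_name = "ants-and-bees"

# Read the image as raw bytes; it is sent to the endpoint as image/jpeg.
with open("validation/ant1.jpg", "rb") as f:
    payload = bytearray(f.read())

response = client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="image/jpeg",
    CustomAttributes='{"decode": true}',
    Body=payload,
)

# response["Body"] is a streaming object without a .json() method,
# so read the bytes and parse them as JSON.
print(json.loads(response["Body"].read()))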
We can see the first label is ants.
import json

import boto3

client = boto3.client("sagemaker-runtime")
endpoint_name = "ants-and-bees"

# Same request as before, this time with a bee image.
with open("validation/bee1.jpg", "rb") as f:
    payload = bytearray(f.read())

response = client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="image/jpeg",
    CustomAttributes='{"decode": true}',
    Body=payload,
)

# Read the streaming body and parse it as JSON.
print(json.loads(response["Body"].read()))
We can see the first predicted label is bees.
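Since both queries differ only in the image, they can be wrapped in a small helper. The sketch below is one possible shape for it. The function name `classify_image` and the `StubRuntimeClient` class are hypothetical, and the stub's JSON payload is made up purely so the helper can be exercised offline. It does not reflect the real response format of the endpoint.

```python
import io
import json


def classify_image(client, endpoint_name, payload):
    """Send JPEG bytes to the endpoint and return the parsed JSON response."""
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="image/jpeg",
        CustomAttributes='{"decode": true}',
        Body=payload,
    )
    # The body is a stream: read it once, then decode the JSON.
    return json.loads(response["Body"].read())


class StubRuntimeClient:
    """Offline stand-in for boto3's sagemaker-runtime client (testing only)."""

    def invoke_endpoint(self, **kwargs):
        self.last_call = kwargs
        # Made-up payload; the real endpoint defines its own response schema.
        return {"Body": io.BytesIO(b'{"stub": true}')}


stub = StubRuntimeClient()
result = classify_image(stub, "ants-and-bees", b"not-a-real-jpeg")
print(result)
```

With a real client, pass `boto3.client("sagemaker-runtime")` and the bytes read from each image file.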
Full API docs
For the complete documentation of the API including the different inputs and responses and more ways to query the Invocations endpoint see the API page.