Azure Execution Provider (Preview)
The Azure Execution Provider enables ONNX Runtime to invoke a remote Azure endpoint for inference. The endpoint must be deployed beforehand. To consume the endpoint, a model with the same inputs and outputs must first be loaded locally.
One use case for the Azure Execution Provider is the small-big model pattern: a smaller model deployed on an edge device for faster inference, and a bigger model deployed on Azure for higher precision. With the Azure Execution Provider, switching between the two can be easily achieved. Again, the two models must have the same inputs and outputs.
The Azure Execution Provider is in the preview stage; all APIs and usage are subject to change.
Limitations
So far, the Azure Execution Provider:
- only supports Triton server on AML.
- only builds and runs on Windows and Linux.
- is available only as a Python package, though users can also build from source and consume the feature through the C/C++ APIs.
Requirements
For Windows, please install zlib and re2, and add their binaries to the system path. If built from source, the zlib and re2 binaries can be easily located with:
cd <build_output_path>
dir /s zlib1.dll re2.dll
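Once located, the folder containing the DLLs can be appended to the path for the current cmd session; the `<build_output_path>` placeholder below is the same one used in the commands above.

```shell
:: Append the folder containing zlib1.dll and re2.dll to PATH for the
:: current cmd session; replace <build_output_path> with your build directory.
set PATH=%PATH%;<build_output_path>
```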
For Linux, please make sure OpenSSL is installed.
Known Issue
On certain Ubuntu versions, HTTPS calls made by AzureEP might report an error - “error setting certificate verify location …”. To silence it, please create the file “/etc/pki/tls/certs/ca-bundles.crt” as a link to “/etc/ssl/certs/ca-certificates.crt”.
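On affected systems, the link can be created as follows (paths taken from the known issue above; root privileges assumed):

```shell
# Create the expected certificate bundle path as a symlink to the
# system CA bundle (requires root; paths from the known issue above).
sudo mkdir -p /etc/pki/tls/certs
sudo ln -s /etc/ssl/certs/ca-certificates.crt /etc/pki/tls/certs/ca-bundles.crt
```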
Build
For build instructions, please see the BUILD page.
Usage
Python
from onnxruntime import InferenceSession, RunOptions, SessionOptions
import numpy as np

sess_opt = SessionOptions()
sess_opt.add_session_config_entry('azure.endpoint_type', 'triton')  # only Triton server is supported for now
sess_opt.add_session_config_entry('azure.uri', 'https://...')
sess_opt.add_session_config_entry('azure.model_name', 'a_simple_model')
sess_opt.add_session_config_entry('azure.model_version', '1')  # optional, default 1
sess_opt.add_session_config_entry('azure.verbose', 'true')  # optional, default false

sess = InferenceSession('a_simple_model.onnx', sess_opt, providers=['CPUExecutionProvider', 'AzureExecutionProvider'])

run_opt = RunOptions()
run_opt.add_run_config_entry('use_azure', '1')  # optional, default '0' to run inference locally
run_opt.add_run_config_entry('azure.auth_key', '...')  # required only when use_azure is set to '1'

x = np.array([1, 2, 3, 4]).astype(np.float32)
y = np.array([4, 3, 2, 1]).astype(np.float32)
z = sess.run(None, {'X': x, 'Y': y}, run_opt)[0]
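The small-big switch boils down to the per-run config entries above. As a sketch, they can be wrapped in a small helper; `run_config` is a hypothetical function and not part of ONNX Runtime, only the key names come from the example.

```python
# Hypothetical helper for the small-big switch: build the per-run config
# entries that select local (small model) or Azure (big model) inference.
# Key names 'use_azure' and 'azure.auth_key' are from the usage above.
def run_config(use_azure, auth_key=None):
    if not use_azure:
        return {'use_azure': '0'}  # run the locally loaded model
    if auth_key is None:
        raise ValueError("azure.auth_key is required when use_azure is '1'")
    return {'use_azure': '1', 'azure.auth_key': auth_key}
```

Each returned entry would then be applied with `run_opt.add_run_config_entry(key, value)` before calling `sess.run`.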