Azure Execution Provider (Preview)
The Azure Execution Provider enables ONNX Runtime to invoke a remote Azure endpoint for inference. The endpoint must be deployed beforehand. To consume the endpoint, a model with the same inputs and outputs must first be loaded locally.
One use case for the Azure Execution Provider is the small-big model pattern: a smaller model deployed on an edge device for faster inference, and a bigger model deployed on Azure for higher precision. With the Azure Execution Provider, switching between the two can be easily achieved. Again, the two models must have the same inputs and outputs.
The Azure Execution Provider is in the preview stage; all APIs and usage are subject to change.
Limitations
So far, the Azure Execution Provider:
- only supports Triton server on AML.
- only builds and runs on Windows and Linux.
- is available only as a Python package, though users can also build from source and consume the feature through the C/C++ APIs.
Requirements
For Windows, please install zlib and re2, and add their binaries to the system path. If built from source, the zlib and re2 binaries can be easily located with:
cd <build_output_path>
dir /s zlib1.dll re2.dll
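Once located, the folder containing the DLLs can be appended to the path for the current cmd session; the `<build_output_path>` placeholder below is the same one used in the commands above.

```shell
:: Append the folder containing zlib1.dll and re2.dll to PATH for the
:: current cmd session; replace <build_output_path> with your build directory.
set PATH=%PATH%;<build_output_path>
```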
For Linux, please make sure OpenSSL is installed.
Known Issue
On certain Ubuntu versions, HTTPS calls made by AzureEP might report an error - “error setting certificate verify location …”. To silence it, please create the file “/etc/pki/tls/certs/ca-bundles.crt” as a link to “/etc/ssl/certs/ca-certificates.crt”.
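On affected systems, the link can be created as follows (paths taken from the known issue above; root privileges assumed):

```shell
# Create the expected certificate bundle path as a symlink to the
# system CA bundle (requires root; paths from the known issue above).
sudo mkdir -p /etc/pki/tls/certs
sudo ln -s /etc/ssl/certs/ca-certificates.crt /etc/pki/tls/certs/ca-bundles.crt
```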
Build
For build instructions, please see the BUILD page.
Usage
Python
from onnxruntime import InferenceSession, RunOptions, SessionOptions
import numpy as np

sess_opt = SessionOptions()
sess_opt.add_session_config_entry('azure.endpoint_type', 'triton')  # only Triton server is supported for now
sess_opt.add_session_config_entry('azure.uri', 'https://...')
sess_opt.add_session_config_entry('azure.model_name', 'a_simple_model')
sess_opt.add_session_config_entry('azure.model_version', '1')  # optional, default 1
sess_opt.add_session_config_entry('azure.verbose', 'true')  # optional, default false

sess = InferenceSession('a_simple_model.onnx', sess_opt, providers=['CPUExecutionProvider', 'AzureExecutionProvider'])

run_opt = RunOptions()
run_opt.add_run_config_entry('use_azure', '1')  # optional, default '0' to run inference locally
run_opt.add_run_config_entry('azure.auth_key', '...')  # required only when use_azure is set to '1'

x = np.array([1, 2, 3, 4]).astype(np.float32)
y = np.array([4, 3, 2, 1]).astype(np.float32)
z = sess.run(None, {'X': x, 'Y': y}, run_opt)[0]
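The small-big switch boils down to the per-run config entries above. As a sketch, they can be wrapped in a small helper; `run_config` is a hypothetical function and not part of ONNX Runtime, only the key names come from the example.

```python
# Hypothetical helper for the small-big switch: build the per-run config
# entries that select local (small model) or Azure (big model) inference.
# Key names 'use_azure' and 'azure.auth_key' are from the usage above.
def run_config(use_azure, auth_key=None):
    if not use_azure:
        return {'use_azure': '0'}  # run the locally loaded model
    if auth_key is None:
        raise ValueError("azure.auth_key is required when use_azure is '1'")
    return {'use_azure': '1', 'azure.auth_key': auth_key}
```

Each returned entry would then be applied with `run_opt.add_run_config_entry(key, value)` before calling `sess.run`.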