Preface

To turn a classification model's output into human-readable class names, you only need to add a label file to the model repository on the Triton Server side. This article uses Fashion MNIST as an example.
For more details on the features of Triton Inference Server, refer to this article.


Example

Server

System Environment

  • OS: Ubuntu 20.04
  • GPU Driver: 450.119.04
  • Docker: 19.03.14
  • Docker Image: nvcr.io/nvidia/tritonserver:21.05-py3

Train the Model

import tensorflow as tf

# Load the Fashion MNIST dataset
fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# Scale pixel values to the [0, 1] range
train_images = train_images / 255.0
test_images = test_images / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=10)

# Export as a TensorFlow SavedModel for Triton
model.save('model.savedmodel')
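
The input and output tensor names required later in config.pbtxt ('flatten_input' and 'dense_1' here) come from the SavedModel's serving signature. A minimal sketch for checking them; the exact names depend on how Keras named the layers:

import tensorflow as tf

# Load the exported SavedModel and inspect its default serving signature
loaded = tf.saved_model.load('model.savedmodel')
sig = loaded.signatures['serving_default']
print(sig.structured_input_signature)  # input tensor name(s), e.g. 'flatten_input'
print(sig.structured_outputs)          # output tensor name(s), e.g. 'dense_1'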

Edit the Configuration File

name: "fashion"
platform: "tensorflow_savedmodel"
max_batch_size: 32
input [
{
name: "flatten_input"
data_type: TYPE_FP32
dims: [28, 28]
}
]
output [
{
name: "dense_1"
data_type: TYPE_FP32
dims: [10]
label_filename: "labels.txt"
}
]
instance_group [
{
kind: KIND_GPU
count: 2
}
]

optimization { execution_accelerators {
gpu_execution_accelerator : [ {
name : "tensorrt"
parameters { key: "precision_mode" value: "FP16" }}]
}}

version_policy { latest { num_versions: 1 } }

dynamic_batching {
preferred_batch_size: [ 4, 8 ]
max_queue_delay_microseconds: 100
}

Edit the Label File

The file name must match the one set in the model configuration file; in this example it is labels.txt.
Fill in the class names in the same order they were indexed during training.

T-shirt/top
Trouser
Pullover
Dress
Coat
Sandal
Shirt
Sneaker
Bag
Ankle boot
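
If you prefer to generate the file rather than type it by hand, a minimal sketch (the list simply mirrors the class order above):

# Write the Fashion MNIST class names to labels.txt, one per line,
# in the same order as the training labels (0-9).
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

with open('labels.txt', 'w') as f:
    f.write('\n'.join(class_names) + '\n')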

Build the Model Repository

The file layout is as follows.
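
A sketch of the expected layout, inferred from the configuration above (model name fashion, labels.txt beside config.pbtxt, and a single version 1 holding the SavedModel), mounted into the container as /models:

models/
└── fashion/
    ├── config.pbtxt
    ├── labels.txt
    └── 1/
        └── model.savedmodel/
            ├── saved_model.pb
            └── variables/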

Run the Server

sudo docker run -d --gpus all --name Triton_Server --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p 8000:8000 -p 8001:8001 -p 8002:8002 -v /home/user/Documents/models:/models nvcr.io/nvidia/tritonserver:21.05-py3 tritonserver --model-store=/models
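
Once the container is running, you can confirm that the model loaded correctly by querying the server with the tritonclient package (available in the SDK image used in the Client section below). A minimal sketch, assuming the gRPC port 8001 mapped above is reachable at localhost:8001:

import tritonclient.grpc as grpcclient

# Connect to the gRPC endpoint published by the server container (-p 8001:8001)
client = grpcclient.InferenceServerClient(url='localhost:8001')

print(client.is_server_ready())          # True once the server is up
print(client.is_model_ready('fashion'))  # True once the "fashion" model is loaded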

Client

System Environment

  • OS: Ubuntu 20.04
  • GPU Driver: 450.119.04
  • Docker: 19.03.14
  • Docker Image: nvcr.io/nvidia/tritonserver:21.05-py3-sdk

Write the Client Code

from PIL import Image
import numpy as np
import tritonclient.grpc as grpcclient
from tritonclient.utils import triton_to_np_dtype

## Preprocessing
img = Image.open('input.JPG').convert('L')
img = img.resize((28, 28))
imgArr = np.asarray(img) / 255
imgArr = np.expand_dims(imgArr, axis=0)
imgArr = imgArr.astype(triton_to_np_dtype('FP32'))

## Client-server communication
triton_client = grpcclient.InferenceServerClient(url='localhost:8001', verbose=0)
inputs = []
inputs.append(grpcclient.InferInput('flatten_input', imgArr.shape, 'FP32'))
inputs[0].set_data_from_numpy(imgArr)
outputs = []
outputs.append(grpcclient.InferRequestedOutput('dense_1', class_count=10))
responses = []
responses.append(triton_client.infer('fashion', inputs,
                                     request_id=str(1),
                                     model_version='1',
                                     outputs=outputs))

## Postprocessing
print(responses[0].as_numpy("dense_1")[0][0])

Run the Client

sudo docker run -it --rm --name Triton_Client -v /home/user/data:/data nvcr.io/nvidia/tritonserver:21.05-py3-sdk bash -c 'python /data/client.py'

Image to be inferred

Output

The format is (model output value):(label index):(class name).
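
If you need the fields individually, a minimal parsing sketch (assuming each classification entry is returned as a byte string in the format above; the variable names are illustrative):

# Each entry has the form "value:label:class_name"
top1 = responses[0].as_numpy("dense_1")[0][0]
value, label, class_name = top1.decode().split(':', 2)
print(value, label, class_name)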