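Introduction

To turn a classification model's output into human-readable class names, you only need to add a class-name (label) file to the model repository on the Triton Server side. This post uses Fashion MNIST as the example. For details on the features of Triton Inference Server, see this article.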
Example

Server system environment
OS: Ubuntu 20.04
GPU Driver: 450.119.04
Docker: 19.03.14
Docker Image: nvcr.io/nvidia/tritonserver:21.05-py3
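Train the model

    import tensorflow as tf

    # Load Fashion MNIST; the images and labels are used below.
    fashion_mnist = tf.keras.datasets.fashion_mnist
    (train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

    # Scale pixel values from [0, 255] to [0, 1].
    train_images = train_images / 255.0
    test_images = test_images / 255.0

    # The last layer outputs 10 logits, one per class.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10)
    ])

    model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])

    model.fit(train_images, train_labels, epochs=10)

    # Export in SavedModel format to a directory named model.savedmodel.
    model.save('model.savedmodel')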
Edit the configuration file

    name: "fashion"
    platform: "tensorflow_savedmodel"
    max_batch_size: 32
    input [
      {
        name: "flatten_input"
        data_type: TYPE_FP32
        dims: [28, 28]
      }
    ]
    output [
      {
        name: "dense_1"
        data_type: TYPE_FP32
        dims: [10]
        label_filename: "labels.txt"
      }
    ]
    instance_group [
      {
        kind: KIND_GPU
        count: 2
      }
    ]
    optimization { execution_accelerators {
      gpu_execution_accelerator : [ {
        name : "tensorrt"
        parameters { key: "precision_mode" value: "FP16" }}]
    }}
    version_policy { latest { num_versions: 1 } }
    dynamic_batching {
      preferred_batch_size: [4, 8]
      max_queue_delay_microseconds: 100
    }
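Edit the label file

The file name must match the one set in the model configuration file; in this example it is labels.txt.
List the class names one per line, in the same class order that was used during training.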
    T-shirt/top
    Trouser
    Pullover
    Dress
    Coat
    Sandal
    Shirt
    Sneaker
    Bag
    Ankle boot
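Because the line order in the label file has to match the training class order, it can be convenient to generate the file from the same list of names used in training rather than typing it by hand. The following is a minimal sketch of that idea; the helper `write_labels` and its `expected_count` check are hypothetical names introduced here for illustration, and the count of 10 is taken from the output `dims: [10]` in the configuration file.

```python
from pathlib import Path

# Class names in Fashion MNIST label order (index 0 to 9), as listed above.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def write_labels(path, class_names, expected_count=10):
    """Write one class name per line; line i corresponds to output index i.

    Hypothetical helper: expected_count should equal the size of the
    model's output tensor (10 in this example).
    """
    if len(class_names) != expected_count:
        raise ValueError(
            f"expected {expected_count} class names, got {len(class_names)}"
        )
    target = Path(path)
    target.write_text("\n".join(class_names) + "\n", encoding="utf-8")
    return target
```

Calling `write_labels("labels.txt", CLASS_NAMES)` produces the file shown above; a mismatch between the number of names and the output size is reported before anything is written.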
Create the model repository

The file structure is shown in the figure below.
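The figure itself is not reproduced in this text. As a rough sketch, assuming the standard Triton model repository layout (a directory per model containing `config.pbtxt`, the label file, and one numbered subdirectory per version holding `model.savedmodel`), the skeleton could be created as follows. The function name `create_model_repository` is hypothetical; the model name `fashion` and version `1` come from the configuration file and the client code in this post.

```python
from pathlib import Path


def create_model_repository(root, model_name="fashion", version="1"):
    """Create the directory skeleton and return where each file belongs.

    Assumed layout (standard Triton convention, not taken from the figure):
        <root>/<model_name>/config.pbtxt
        <root>/<model_name>/labels.txt
        <root>/<model_name>/<version>/model.savedmodel/
    """
    model_dir = Path(root) / model_name
    version_dir = model_dir / version
    version_dir.mkdir(parents=True, exist_ok=True)
    return {
        "config": model_dir / "config.pbtxt",
        "labels": model_dir / "labels.txt",
        "saved_model": version_dir / "model.savedmodel",
    }
```

The returned paths only indicate where to copy the configuration file, the label file, and the exported SavedModel directory; the function does not create those files itself.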
Run the server

    sudo docker run -d --gpus all --name Triton_Server \
        --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
        -p 8000:8000 -p 8001:8001 -p 8002:8002 \
        -v /home/user/Documents/models:/models \
        nvcr.io/nvidia/tritonserver:21.05-py3 \
        tritonserver --model-store=/models
Client system environment
OS: Ubuntu 20.04
GPU Driver: 450.119.04
Docker: 19.03.14
Docker Image: nvcr.io/nvidia/tritonserver:21.05-py3-sdk
Write the client code

    from PIL import Image
    import numpy as np
    import tritonclient.grpc as grpcclient
    from tritonclient.utils import triton_to_np_dtype

    # Load the image as grayscale and resize it to the model's 28x28 input.
    img = Image.open('input.JPG').convert('L')
    img = img.resize((28, 28))

    # Scale to [0, 1], add the batch dimension, and cast to FP32.
    imgArr = np.asarray(img) / 255
    imgArr = np.expand_dims(imgArr, axis=0)
    imgArr = imgArr.astype(triton_to_np_dtype('FP32'))

    triton_client = grpcclient.InferenceServerClient(url='localhost:8001', verbose=False)

    inputs = []
    inputs.append(grpcclient.InferInput('flatten_input', imgArr.shape, 'FP32'))
    inputs[0].set_data_from_numpy(imgArr)

    # class_count=10 requests classification results (with class names)
    # for up to 10 classes instead of the raw output tensor.
    outputs = []
    outputs.append(grpcclient.InferRequestedOutput('dense_1', class_count=10))

    responses = []
    responses.append(triton_client.infer('fashion',
                                         inputs,
                                         request_id=str(1),
                                         model_version='1',
                                         outputs=outputs))

    # Print the first entry for the first image in the batch.
    print(responses[0].as_numpy("dense_1")[0][0])
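Run the client

    sudo docker run -it --rm --name Triton_Client \
        -v /home/user/data:/data \
        nvcr.io/nvidia/tritonserver:21.05-py3-sdk \
        bash -c 'python /data/client.py'

Note that localhost:8001 inside the client container reaches the server only if the container shares the host network (for example, by adding --net=host to the command); otherwise, use the host's address in the client code.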
Image to run inference on

Output

The result has the form (model output value):(label index):(class name).
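To use the result in code, the string can be split back into its three parts. The following is a minimal sketch; `parse_classification` is a hypothetical helper, and the sample string in the usage note is made up for illustration, not actual output from this model. It assumes the entry arrives as bytes (as in the NumPy array returned by the client) or as a plain string.

```python
def parse_classification(entry):
    """Split a '<value>:<index>:<class name>' entry into typed fields.

    Hypothetical helper. Splitting at most twice keeps any colon that
    might appear inside the class name itself.
    """
    if isinstance(entry, bytes):
        entry = entry.decode("utf-8")
    value, index, label = entry.split(":", 2)
    return float(value), int(index), label
```

For example, `parse_classification(b"12.5:9:Ankle boot")` gives the tuple `(12.5, 9, "Ankle boot")`, so the class name can be read from the last element without a separate lookup table on the client side.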