Here we revisit the object detection previously run, without fully understanding it, as "YOLO v3 + TensorFlow", this time using the OpenVINO™ pretrained model "yolo-v3-tiny-tf" together with an NCS2.
The model is converted to IR format with the Model Optimizer on Ubuntu 20.04 LTS running in a Hyper-V virtual environment with the OpenVINO™ toolkit already installed.
mizutu@ubuntu2004dk:~/work$ python3 $INTEL_OPENVINO_DIR/deployment_tools/tools/model_downloader/downloader.py --name yolo-v3-tiny-tf
################|| Downloading yolo-v3-tiny-tf ||################
========== Downloading /home/mizutu/work/public/yolo-v3-tiny-tf/yolo-v3-tiny-tf.zip
... 100%, 32066 KB, 9046 KB/s, 3 seconds passed
========== Unpacking /home/mizutu/work/public/yolo-v3-tiny-tf/yolo-v3-tiny-tf.zip
mizutu@ubuntu2004dk:~/work$ ls ./public/yolo-v3-tiny-tf/*
README.txt  yolo-v3-tiny-tf.json  yolo-v3-tiny-tf.pb
mizutu@ubuntu2004dk:~/work$ python3 $INTEL_OPENVINO_DIR/deployment_tools/tools/model_downloader/converter.py --name yolo-v3-tiny-tf --precisions FP16
========== Converting yolo-v3-tiny-tf to IR (FP16)
Conversion command: /bin/python3 -- /opt/intel/openvino_2021/deployment_tools/model_optimizer/mo.py --framework=tf --data_type=FP16 --output_dir=/home/mizutu/work/public/yolo-v3-tiny-tf/FP16 --model_name=yolo-v3-tiny-tf '--input_shape=[1,416,416,3]' --input=image_input '--scale_values=image_input[255]' --reverse_input_channels --transformations_config=/home/mizutu/work/public/yolo-v3-tiny-tf/yolo-v3-tiny-tf/yolo-v3-tiny-tf.json --input_model=/home/mizutu/work/public/yolo-v3-tiny-tf/yolo-v3-tiny-tf/yolo-v3-tiny-tf.pb

/opt/intel/openvino_2021.2.185/deployment_tools/model_optimizer/mo/main.py:85: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if op is 'k':
Model Optimizer arguments:
Common parameters:
	- Path to the Input Model: 	/home/mizutu/work/public/yolo-v3-tiny-tf/yolo-v3-tiny-tf/yolo-v3-tiny-tf.pb
	- Path for generated IR: 	/home/mizutu/work/public/yolo-v3-tiny-tf/FP16
	- IR output name: 	yolo-v3-tiny-tf
	- Log level: 	ERROR
	- Batch: 	Not specified, inherited from the model
	- Input layers: 	image_input
	- Output layers: 	Not specified, inherited from the model
	- Input shapes: 	[1,416,416,3]
	- Mean values: 	Not specified
	- Scale values: 	image_input[255]
	- Scale factor: 	Not specified
	- Precision of IR: 	FP16
	- Enable fusing: 	True
	- Enable grouped convolutions fusing: 	True
	- Move mean values to preprocess section: 	None
	- Reverse input channels: 	True
TensorFlow specific parameters:
	- Input model in text protobuf format: 	False
	- Path to model dump for TensorBoard: 	None
	- List of shared libraries with TensorFlow custom layers implementation: 	None
	- Update the configuration file with input/output node names: 	None
	- Use configuration file used to generate the model with Object Detection API: 	None
	- Use the config file: 	None
Model Optimizer version: 	2021.2.0-1877-176bdf51370-releases/2021/2

[ SUCCESS ] Total execution time: 12.81 seconds.
[ SUCCESS ] Memory consumed: 480 MB.
mizutu@ubuntu2004dk:~/work$ ls ./public/yolo-v3-tiny-tf/FP16/*
./public/yolo-v3-tiny-tf/FP16/yolo-v3-tiny-tf.bin
./public/yolo-v3-tiny-tf/FP16/yolo-v3-tiny-tf.mapping
./public/yolo-v3-tiny-tf/FP16/yolo-v3-tiny-tf.xml
index (coco.names) | Japanese (coco.names_jp) |
person | 人 |
bicycle | 自転車 |
car | 車 |
motorbike | バイク |
aeroplane | 飛行機 |
bus | バス |
train | 列車 |
truck | トラック |
boat | ボート |
traffic light | 信号機 |
fire hydrant | 消火栓 |
stop sign | 一時停止標識 |
parking meter | パーキングメーター |
bench | ベンチ |
bird | 鳥 |
cat | 猫 |
dog | 犬 |
horse | 馬 |
sheep | 羊 |
cow | 牛 |
elephant | 象 |
bear | 熊 |
zebra | シマウマ |
giraffe | キリン |
backpack | バックパック |
umbrella | 傘 |
handbag | ハンドバック |
tie | ネクタイ |
suitcase | スーツケース |
frisbee | フリスビー |
skis | スキー板 |
snowboard | スノーボード |
sports ball | スポーツボール |
kite | 凧 |
baseball bat | 野球のバット |
baseball glove | 野球のグローブ |
skateboard | スケートボード |
surfboard | サーフボード |
tennis racket | テニスラケット |
bottle | 瓶 |
wine glass | ワイングラス |
cup | カップ |
fork | フォーク |
knife | ナイフ |
spoon | スプーン |
bowl | 丼鉢 |
banana | バナナ |
apple | リンゴ |
sandwich | サンドイッチ |
orange | オレンジ |
broccoli | ブロッコリー |
carrot | 人参 |
hot dog | ホットドッグ |
pizza | ピザ |
donut | ドーナッツ |
cake | ケーキ |
chair | 椅子 |
sofa | ソファー |
pottedplant | 鉢植え |
bed | ベッド |
diningtable | ダイニングテーブル |
toilet | トイレ |
tvmonitor | テレビ |
laptop | ラップトップコンピューター |
mouse | マウス |
remote | リモコン |
keyboard | キーボード |
cell phone | 携帯電話 |
microwave | 電子レンジ |
oven | オーブン |
toaster | トースター |
sink | キッチン・シンク |
refrigerator | 冷蔵庫 |
book | 本 |
clock | 時計 |
vase | 花瓶 |
scissors | ハサミ |
teddy bear | テディベア |
hair drier | ヘアドライヤー |
toothbrush | 歯ブラシ |
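The label files map a class index (the line number) to a display string. A minimal sketch of loading one, assuming one label per line as in coco.names / coco.names_jp (the function name load_labels is ours):

```python
def load_labels(path):
    """Read a label file: one label per line, line number = class index."""
    with open(path, encoding='utf-8') as f:
        return [line.strip() for line in f if line.strip()]

# usage:
#   labels = load_labels('coco.names_jp')
#   print(labels[0])   # class 0 -> 人 (person)
```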
Input (converted model)
  Image, name - image_input, shape - 1,3,416,416, format is B,C,H,W where:
    B - batch size
    C - channel
    H - height
    W - width
  Channel order is BGR.
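The B,C,H,W layout described above can be produced from an OpenCV-style HWC array as follows. This is a minimal numpy-only sketch (the function name to_bchw is ours); resizing and BGR channel handling are left to the caller:

```python
import numpy as np

def to_bchw(img_hwc):
    """Convert a 416x416x3 HWC image array to the 1,3,416,416 BCHW layout."""
    assert img_hwc.shape == (416, 416, 3), "resize to 416x416 first"
    chw = img_hwc.transpose(2, 0, 1)      # HWC -> CHW
    return np.expand_dims(chw, axis=0)    # CHW -> BCHW (add batch dim)
```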
Output (converted model)
  The array of detection summary info, name - conv2d_9/BiasAdd/YoloRegion, shape - 1,255,13,13. The anchor values are 81,82, 135,169, 344,319.
  The array of detection summary info, name - conv2d_12/BiasAdd/YoloRegion, shape - 1,255,26,26. The anchor values are 23,27, 37,58, 81,82.
  For each case format is B,N*85,Cx,Cy, where:
    B - batch size
    N - number of detection boxes for cell
    Cx, Cy - cell index
  Detection box has format [x,y,h,w,box_score,class_no_1, ..., class_no_80], where:
    (x,y) - coordinates of box center relative to the cell
    h,w - raw height and width of box, apply exponential function and multiply by corresponding anchors to get absolute height and width values
    box_score - confidence of detection box in [0,1] range
    class_no_1,...,class_no_80 - probability distribution over the classes in the [0,1] range, multiply by confidence value to get confidence of each class
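The decoding rules above can be sketched in Python. This is a minimal illustration, not the demo's actual code: the function name decode_yolo_region is ours, and it assumes that x, y, box_score and the class scores come out of the RegionYolo layer already in [0,1] while h and w are raw (the value ranges in the raw dump further down suggest exactly this):

```python
import numpy as np

def decode_yolo_region(blob, anchors, img_w, img_h, threshold=0.6, side=416):
    """Decode one YoloRegion output (shape 1,255,Cy,Cx) into detection boxes.

    anchors: list of 3 (w, h) anchor pairs for this output scale.
    Returns (x_center, y_center, w, h, confidence, class_id) tuples.
    """
    n_boxes = 3                                   # 255 = 3 boxes * 85 values
    _, _, grid_h, grid_w = blob.shape
    preds = blob.reshape(n_boxes, 85, grid_h, grid_w)
    boxes = []
    for n in range(n_boxes):
        aw, ah = anchors[n]
        for cy in range(grid_h):
            for cx in range(grid_w):
                p = preds[n, :, cy, cx]
                obj = p[4]                        # box_score
                if obj < threshold:
                    continue
                x = (cx + p[0]) / grid_w * img_w  # cell-relative -> absolute
                y = (cy + p[1]) / grid_h * img_h
                w = np.exp(p[2]) * aw / side * img_w   # raw -> absolute size
                h = np.exp(p[3]) * ah / side * img_h
                class_id = int(np.argmax(p[5:]))
                boxes.append((float(x), float(y), float(w), float(h),
                              float(obj * p[5 + class_id]), class_id))
    return boxes
```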
./public/yolo-v3-tiny-tf/FP16/yolo-v3-tiny-tf.bin
./public/yolo-v3-tiny-tf/FP16/yolo-v3-tiny-tf.xml
coco.names
coco.names_jp
vi yolo-v3-tiny-tf.py

# -*- coding: utf-8 -*-
# imports
import cv2
import numpy as np
# Load the Inference Engine module
from openvino.inference_engine import IECore

# Load the model
ie = IECore()
model = './public/yolo-v3-tiny-tf/FP16/yolo-v3-tiny-tf'
net = ie.read_network(model=model+'.xml', weights=model+'.bin')
exec_net = ie.load_network(network=net, device_name="MYRIAD")

# Get the input/output data keys
input_blob = net.input_info['image_input'].name
out_blob = next(iter(net.outputs))
print('input blob: name="{}", output blob: name="{}"'.format(input_blob, out_blob))

# Read the image
frame = cv2.imread('image/cat.jpg')

# Convert to the model's input format
img = cv2.resize(frame, (416, 416))   # resize height and width to 416x416
img = img.transpose((2, 0, 1))        # HWC -> CHW
img = np.expand_dims(img, axis=0)     # CHW -> BCHW

# Run inference
out = exec_net.infer(inputs={input_blob: img})

# Extract only the needed output
out = out[out_blob]
# Drop singleton dimensions
out = np.squeeze(out)
# Print the contents
print(out)
pi@raspberrypi-mas:~/workspace $ python3 yolo-v3-tiny-tf.py
input blob: name="image_input", output blob: name="conv2d_12/Conv2D/YoloRegion"
[[[ 6.0791016e-01  5.0830078e-01  4.3676758e-01 ...  6.9677734e-01
    7.3242188e-01  4.0942383e-01]
  [ 4.1870117e-01  4.8095703e-01  5.3027344e-01 ...  4.9096680e-01
    6.4990234e-01  6.3427734e-01]
  ...  (snip: 255x26x26 raw YoloRegion values)  ...
  [ 4.4136047e-03  4.7607422e-03  5.2032471e-03 ...  2.6817322e-03
    2.1247864e-03  3.0841827e-03]]]
E: [global] [    558361] [python3] XLink_sem_wait:94	XLink_sem_inc(sem) method call failed with an error: -1
E: [global] [    558361] [python3] XLinkResetRemote:257	can't wait dispatcherClosedSem
yolo-v3-tiny-tf.py:17: DeprecationWarning: 'inputs' property of IENetwork class is deprecated. To access DataPtrs user need to use 'input_data' property of InputInfoPtr objects which can be accessed by 'input_info' property.
  input_blob = next(iter(net.inputs))
Classification3.py:15: DeprecationWarning: The 'inputs' property of the IENetwork class is deprecated. To access DataPtrs, use the 'input_data' property of the InputInfoPtr objects, which can be accessed through the 'input_info' property.
Old (deprecated):
# Get the keys for the input and output data
input_blob = next(iter(net.inputs))

New:
# Get the keys for the input and output data
input_blob = net.input_info['image_input'].name
Command option | Default | Description |
--ir | yolo-v3-tiny-tf.xml | Trained IR model file |
-l, --label | coco.names_jp | Label file |
-i, --image | ./desk-image.jpg | Input image file path |
--threshold | 0.6 ※ | Score threshold for displaying detections |
--iou | 0.25 | Overlap (IoU) threshold allowed between boxes |
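The --iou option controls the usual non-maximum-suppression overlap test: two boxes whose intersection-over-union exceeds the threshold are treated as duplicates and the lower-scoring one is dropped. A minimal sketch of the idea (the function names iou and nms are ours; boxes are given as (x_min, y_min, x_max, y_max)):

```python
def iou(a, b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.25):
    """Keep highest-scoring boxes; drop any box overlapping a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```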
pi@raspberrypi:~/workspace $ python3 tiny_yolo_v3_a.py
--- ** tiny_Yolo_v3_a ** Object identification ---
4.5.1-openvino
OpenVINO inference_engine: 2.1.2021.2.0-1877-176bdf51370-releases/2021/2
Running OpenVINO NCS Tensorflow TinyYolo v3 example...
Displaying image with objects detected in GUI...
Click in the GUI window and hit any key to exit.
tiny_yolo_v3_a.py:246: DeprecationWarning: 'inputs' property of IENetwork class is deprecated. To access DataPtrs user need to use 'input_data' property of InputInfoPtr objects which can be accessed by 'input_info' property.
  input_blob = next(iter(net.inputs))
Tiny Yolo v3: Starting application...
   - Plugin:       Myriad
   - IR File:      ./public/yolo-v3-tiny-tf/FP16/yolo-v3-tiny-tf.xml
   - Input Shape:  [1, 3, 416, 416]
   - Output Shapes:
       - output #0 name: conv2d_12/Conv2D/YoloRegion
         output shape: [1, 255, 26, 26]
       - output #1 name: conv2d_9/Conv2D/YoloRegion
         output shape: [1, 255, 13, 13]
   - Labels File:  coco.names_jp
   - Image File:   ../image/desk-image.jpg
   - Threshold:    0.4
   - Intersection Over Union: 0.25
tiny_yolo_v3_a.py:281: DeprecationWarning: 'outputs' property of InferRequest is deprecated. Please instead use 'output_blobs' property.
  all_output_results = req_handle.outputs
Finished.
pi@raspberrypi:~/workspace $ python3 tiny_yolo_v3_b.py
--- ** tiny_Yolo_v3_b ** Object identification ---
4.5.1-openvino
OpenVINO inference_engine: 2.1.2021.2.0-1877-176bdf51370-releases/2021/2
Running OpenVINO NCS Tensorflow TinyYolo v3 example...
Displaying image with objects detected in GUI...
Click in the GUI window and hit any key to exit.
tiny_yolo_v3_b.py:248: DeprecationWarning: 'inputs' property of IENetwork class is deprecated. To access DataPtrs user need to use 'input_data' property of InputInfoPtr objects which can be accessed by 'input_info' property.
  input_blob = next(iter(net.inputs))
Tiny Yolo v3: Starting application...
   - Plugin:       Myriad
   - IR File:      ./public/yolo-v3-tiny-tf/FP16/yolo-v3-tiny-tf.xml
   - Input Shape:  [1, 3, 416, 416]
   - Output Shapes:
       - output #0 name: conv2d_12/Conv2D/YoloRegion
         output shape: [1, 255, 26, 26]
       - output #1 name: conv2d_9/Conv2D/YoloRegion
         output shape: [1, 255, 13, 13]
   - Labels File:  coco.names_jp
   - Image File:   0
   - Threshold:    0.6
   - Intersection Over Union: 0.25
tiny_yolo_v3_b.py:293: DeprecationWarning: 'outputs' property of InferRequest is deprecated. Please instead use 'output_blobs' property.
  all_output_results = req_handle.outputs
Finished.