From the archive of pretrained model files known as Open Model Zoo, convert a pretrained model with the Model Optimizer and practice deep-learning inference.
Using the Caffe pretrained object-detection model "mobilenet-ssd" from the Open Model Zoo public archive, build an application that draws a red box around detected people and a yellow box around detected cars.
~/work$ python3 $INTEL_OPENVINO_DIR/deployment_tools/tools/model_downloader/downloader.py --name mobilenet-ssd
################|| Downloading mobilenet-ssd ||################
========== Downloading /home/mizutu/work/public/mobilenet-ssd/mobilenet-ssd.prototxt
... 100%, 28 KB, 100614 KB/s, 0 seconds passed
========== Downloading /home/mizutu/work/public/mobilenet-ssd/mobilenet-ssd.caffemodel
... 100%, 22605 KB, 13653 KB/s, 1 seconds passed
~/work$ ls ./public/mobilenet-ssd/*
./public/mobilenet-ssd/mobilenet-ssd.caffemodel  ./public/mobilenet-ssd/mobilenet-ssd.prototxt
~/work$ python3 $INTEL_OPENVINO_DIR/deployment_tools/tools/model_downloader/converter.py --name mobilenet-ssd --precisions FP16
========== Converting mobilenet-ssd to IR (FP16)
Model Optimizer arguments:
Common parameters:
    - Path to the Input Model:      /home/mizutu/work/public/mobilenet-ssd/mobilenet-ssd.caffemodel
    - Path for generated IR:        /home/mizutu/work/public/mobilenet-ssd/FP16
    - IR output name:       mobilenet-ssd
    - Log level:    ERROR
    - Batch:        Not specified, inherited from the model
    - Input layers:         data
    - Output layers:        detection_out
    - Input shapes:         [1,3,300,300]
    - Mean values:          data[127.5,127.5,127.5]
    - Scale values:         data[127.5]
    - Scale factor:         Not specified
    - Precision of IR:      FP16
    - Enable fusing:        True
    - Enable grouped convolutions fusing:   True
    - Move mean values to preprocess section:       None
    - Reverse input channels:       False
Caffe specific parameters:
    - Path to Python Caffe* parser generated from caffe.proto:      /opt/intel/openvino_2021/deployment_tools/model_optimizer/mo/front/caffe/proto
    - Enable resnet optimization:   True
    - Path to the Input prototxt:   /home/mizutu/work/public/mobilenet-ssd/mobilenet-ssd.prototxt
    - Path to CustomLayersMapping.xml:      Default
    - Path to a mean file:  Not specified
    - Offsets for a mean file:      Not specified
Model Optimizer version:
[ SUCCESS ] Total execution time: 8.14 seconds.
[ SUCCESS ] Memory consumed: 365 MB.
~/work$ ls ./public/mobilenet-ssd/FP16/*
mobilenet-ssd.bin  mobilenet-ssd.mapping  mobilenet-ssd.xml
~/work$ python3 $INTEL_OPENVINO_DIR/deployment_tools/tools/model_downloader/downloader.py --all
~/work$ cp $INTEL_OPENVINO_DIR/deployment_tools/open_model_zoo/demos/python_demos/voc_labels.txt .
~/work$ cat voc_labels.txt
background
aeroplane
bicycle
bird
boat
bottle
bus
car
cat
chair
cow
diningtable
dog
horse
motorbike
person
pottedplant
sheep
sofa
train
tvmonitor
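The scripts below trim the trailing newline from each label with `label[...][:-1]`. A small sketch of my own (not from the article) showing why `str.strip()` is the safer choice: `readlines()` keeps `\n` on every line except, possibly, the last one, so slicing off the final character can silently truncate the last label.

```python
# Sketch: readlines() keeps '\n' on every line except possibly the last.
# Slicing with [:-1] then eats a real character from the final label, while
# str.strip() handles both cases. Shortened in-memory list for illustration.
raw = ['background\n', 'aeroplane\n', 'tvmonitor']   # no newline on the last line

sliced = [s[:-1] for s in raw]     # last entry becomes 'tvmonito'
stripped = [s.strip() for s in raw]

print(sliced[-1])    # tvmonito
print(stripped[-1])  # tvmonitor
```

This matters here because `voc_labels.txt` has no newline after `tvmonitor`, as the `cat` output above shows.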
Input

Original model
    Image, name - prob, shape - 1,3,300,300, format is B,C,H,W, where:
        B - batch size
        C - channel
        H - height
        W - width
    Channel order is BGR. Mean values - [127.5, 127.5, 127.5], scale value - 127.5.

Converted model
    Image, name - prob, shape - 1,3,300,300, format is B,C,H,W, where:
        B - batch size
        C - channel
        H - height
        W - width
    Channel order is BGR.

Note: the input name here appears to be wrong: 'prob' should be 'data'.
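The layout conversion this spec implies (an H,W,C image into a 1,3,300,300 B,C,H,W blob) is exactly what the scripts below do with `transpose`/`reshape`. A numpy-only sketch with a placeholder frame standing in for real camera input; note that the mean/scale normalization listed above was passed to Model Optimizer during conversion (see the converter log), so it is baked into the IR and must not be repeated at runtime.

```python
import numpy as np

# Placeholder for a 300x300 BGR frame as cv2.resize would produce it (H,W,C uint8)
frame = np.zeros((300, 300, 3), dtype=np.uint8)

blob = frame.transpose((2, 0, 1))        # H,W,C -> C,H,W
blob = blob.reshape((1, 3, 300, 300))    # add the batch axis -> B,C,H,W
print(blob.shape)                        # (1, 3, 300, 300)
```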
Output

Original model
    The array of detection summary info, name - detection_out, shape - 1, 1, N, 7, where N is the number of detected bounding boxes. For each detection, the description has the format: [image_id, label, conf, x_min, y_min, x_max, y_max], where:
        image_id - ID of the image in the batch
        label - predicted class ID
        conf - confidence for the predicted class
        (x_min, y_min) - coordinates of the top left bounding box corner (coordinates are in normalized format, in range [0, 1])
        (x_max, y_max) - coordinates of the bottom right bounding box corner (coordinates are in normalized format, in range [0, 1])

Converted model
    Same as the original model: name - detection_out, shape - 1, 1, N, 7, with each detection in the format [image_id, label, conf, x_min, y_min, x_max, y_max] as above.
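A small sketch of decoding one detection row as described above; `to_pixel_box` and the sample values are my own illustration, not part of the demo code, but the 0.6 confidence cutoff matches the scripts below.

```python
import numpy as np

def to_pixel_box(obj, img_w, img_h, conf_threshold=0.6):
    # One row: [image_id, label, conf, x_min, y_min, x_max, y_max],
    # coordinates normalized to [0, 1]
    image_id, label_id, conf, x1, y1, x2, y2 = obj
    if conf <= conf_threshold:
        return None   # same cutoff the demo scripts use
    return (int(label_id),
            int(x1 * img_w), int(y1 * img_h),
            int(x2 * img_w), int(y2 * img_h))

row = np.array([0.0, 7.0, 0.93, 0.25, 0.5, 0.75, 1.0], dtype=np.float32)  # made-up values; class 7 = 'car'
print(to_pixel_box(row, img_w=640, img_h=480))   # (7, 160, 240, 480, 480)
```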
Note: the following steps are performed on the Raspberry Pi.
vi object_detect.py

# -*- coding: utf-8 -*-
#%matplotlib inline
import cv2
import matplotlib.pyplot as plt
import numpy as np
from openvino.inference_engine import IECore

label = open('voc_labels.txt').readlines()
print(label)

# Create the Inference Engine core object
ie = IECore()

# Read the IR model files
model = './public/mobilenet-ssd/FP16/mobilenet-ssd'
net = ie.read_network(model=model+'.xml', weights=model+'.bin')

# Get the input/output blob names and the input blob shape
input_blob_name = net.input_info['data'].name
output_blob_name = next(iter(net.outputs))
batch, channel, height, width = net.input_info[input_blob_name].input_data.shape

exec_net = ie.load_network(network=net, device_name='MYRIAD', num_requests=1)

def infer(path):
    print('input blob: name="{}", N={}, C={}, H={}, W={}'.format(input_blob_name, batch, channel, height, width))
    img = cv2.imread(path)
    in_img = cv2.resize(img, (width, height))
    in_img = in_img.transpose((2, 0, 1))
    in_img = in_img.reshape((1, channel, height, width))
    return img, exec_net.infer(inputs={input_blob_name: in_img})

def show(img, res):
    print('output blob: name="{}", shape={}'.format(output_blob_name, net.outputs[output_blob_name].shape))
    result = res[output_blob_name][0][0]
    img_h, img_w, _ = img.shape
    for obj in result:
        imgid, clsid, confidence, x1, y1, x2, y2 = obj
        if confidence > 0.6:
            x1 = int(x1 * img_w)
            y1 = int(y1 * img_h)
            x2 = int(x2 * img_w)
            y2 = int(y2 * img_h)
            color = (0, 255, 0)
            if label[int(clsid)][:-1] == 'car':
                color = (0, 255, 255)
            elif label[int(clsid)][:-1] == 'person':
                color = (0, 0, 255)
            cv2.rectangle(img, (x1, y1), (x2, y2), color, thickness=4)
            cv2.putText(img, label[int(clsid)][:-1], (x1, y1), cv2.FONT_HERSHEY_PLAIN, fontScale=4, color=color, thickness=4)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(img)
    plt.show()

img, res = infer('./image/car.jpg')
show(img, res)
pi@raspberrypi:~/workspace $ python3 object_detect.py
['background\n', 'aeroplane\n', 'bicycle\n', 'bird\n', 'boat\n', 'bottle\n', 'bus\n', 'car\n', 'cat\n', 'chair\n', 'cow\n', 'diningtable\n', 'dog\n', 'horse\n', 'motorbike\n', 'person\n', 'pottedplant\n', 'sheep\n', 'sofa\n', 'train\n', 'tvmonitor']
input blob: name="data", N=1, C=3, H=300, W=300
output blob: name="detection_out", shape=[1, 1, 100, 7]
Rebuild the quoted program in the style used in the previous sections.
vi object_detect1.py

# -*- coding: utf-8 -*-
##------------------------------------------
## OpenVINO™ model -mobilenet-ssd-
##   ** Object Detect **
##   2021.01.18 Masahiro Izutsu
##
##   2021.02.10 warning error
##------------------------------------------
import cv2
import numpy as np

# Load the module
from openvino.inference_engine import IECore

# Load the labels
label = open('voc_labels.txt').readlines()
print(label)

# Create the Inference Engine core object
ie = IECore()

# Read the IR model files
model = './public/mobilenet-ssd/FP16/mobilenet-ssd'
net = ie.read_network(model=model+'.xml', weights=model+'.bin')

# Get the input/output blob names and the input blob shape
input_blob_name = net.input_info['data'].name
output_blob_name = next(iter(net.outputs))
batch, channel, height, width = net.input_info[input_blob_name].input_data.shape

exec_net = ie.load_network(network=net, device_name='MYRIAD', num_requests=1)

# Read the input image
frame = cv2.imread('./image/car-person.jpg')

# Convert to the input data format
img = cv2.resize(frame, (width, height))
img = img.transpose((2, 0, 1))
img = img.reshape((1, channel, height, width))

# Run inference
out = exec_net.infer(inputs={input_blob_name: img})

# Extract only the needed data from the output
print('output blob: name="{}", shape={}'.format(output_blob_name, net.outputs[output_blob_name].shape))
result = out[output_blob_name][0][0]
img_h, img_w, _ = frame.shape

# Process each detected object in turn
for obj in result:
    imgid, clsid, confidence, x1, y1, x2, y2 = obj
    if confidence > 0.6:
        x1 = int(x1 * img_w)
        y1 = int(y1 * img_h)
        x2 = int(x2 * img_w)
        y2 = int(y2 * img_h)
        color = (0, 255, 0)
        if label[int(clsid)][:-1] == 'car':
            color = (0, 255, 255)
        elif label[int(clsid)][:-1] == 'person':
            color = (0, 0, 255)
        cv2.rectangle(frame, (x1, y1), (x2, y2), color, thickness=2)
        cv2.putText(frame, label[int(clsid)][:-1], (x1, y1), cv2.FONT_HERSHEY_PLAIN, fontScale=2, color=color, thickness=2)

# Show the image
cv2.imshow('Object-Detect', frame)

# Exit when a key is pressed
cv2.waitKey(0)
cv2.destroyAllWindows()
~/workspace $ python3 object_detect1.py
['background\n', 'aeroplane\n', 'bicycle\n', 'bird\n', 'boat\n', 'bottle\n', 'bus\n', 'car\n', 'cat\n', 'chair\n', 'cow\n', 'diningtable\n', 'dog\n', 'horse\n', 'motorbike\n', 'person\n', 'pottedplant\n', 'sheep\n', 'sofa\n', 'train\n', 'tvmonitor\n']
object_detect1.py:18: DeprecationWarning: 'inputs' property of IENetwork class is deprecated. To access DataPtrs user need to use 'input_data' property of InputInfoPtr objects which can be accessed by 'input_info' property.
  input_blob_name = list(net.inputs.keys())[0]
output blob: name="detection_out", shape=[1, 1, 100, 7]
Display the objects detected by object_detect1.py with Japanese labels.
vi object_detect1_jp.py

# -*- coding: utf-8 -*-
##------------------------------------------
## OpenVINO™ model -mobilenet-ssd-
##   ** Object Detect ** Japanese
##   2021.01.18 Masahiro Izutsu
##
##   2021.02.10 warning error
##------------------------------------------
import cv2
import numpy as np
import myfunction

# Load the module
from openvino.inference_engine import IECore

# Load the labels
label = open('voc_labels.txt').readlines()
label_jp = open('voc_labels_jp.txt').readlines()

# Japanese font
fontPIL = 'NotoSansCJK-Bold.ttc'

# Create the Inference Engine core object
ie = IECore()

# Read the IR model files
model = './public/mobilenet-ssd/FP16/mobilenet-ssd'
net = ie.read_network(model=model+'.xml', weights=model+'.bin')

# Get the input/output blob names and the input blob shape
input_blob_name = net.input_info['data'].name
output_blob_name = next(iter(net.outputs))
batch, channel, height, width = net.input_info[input_blob_name].input_data.shape

exec_net = ie.load_network(network=net, device_name='MYRIAD', num_requests=1)

# Read the input image
frame = cv2.imread('./image/car-person.jpg')

# Convert to the input data format
img = cv2.resize(frame, (width, height))
img = img.transpose((2, 0, 1))
img = img.reshape((1, channel, height, width))

# Run inference
out = exec_net.infer(inputs={input_blob_name: img})

# Extract only the needed data from the output
print('output blob: name="{}", shape={}'.format(output_blob_name, net.outputs[output_blob_name].shape))
result = out[output_blob_name][0][0]
img_h, img_w, _ = frame.shape

# Process each detected object in turn
for obj in result:
    imgid, clsid, confidence, x1, y1, x2, y2 = obj
    if confidence > 0.6:
        x1 = int(x1 * img_w)
        y1 = int(y1 * img_h)
        x2 = int(x2 * img_w)
        y2 = int(y2 * img_h)
        color = (0, 255, 0)
        if label[int(clsid)][:-1] == 'car':
            color = (0, 255, 255)
        elif label[int(clsid)][:-1] == 'person':
            color = (0, 0, 255)
        cv2.rectangle(frame, (x1, y1), (x2, y2), color, thickness=2)
#        cv2.putText(frame, label[int(clsid)][:-1], (x1, y1), cv2.FONT_HERSHEY_PLAIN, fontScale=2, color=color, thickness=2)
        myfunction.cv2_putText(img=frame, text=label_jp[int(clsid)][:-1], org=(x1, y1), fontFace=fontPIL, fontScale=12, color=color, mode=0)

# Show the image
cv2.imshow('Object-Detect', frame)

# Exit when a key is pressed
cv2.waitKey(0)
cv2.destroyAllWindows()
pi@raspberrypi:~/workspace $ python3 object_detect1_jp.py
input blob: name="data", N=1, C=3, H=300, W=300
output blob: name="detection_out", shape=[1, 1, 100, 7]
Perform real-time object recognition with the built-in camera.
labels | Japanese |
-- | -- |
background | 背景 |
aeroplane | 飛行機 |
bicycle | 自転車 |
bird | 鳥 |
boat | ボート |
bottle | ボトル |
bus | バス |
car | 車 |
cat | 猫 |
chair | 椅子 |
cow | 牛 |
diningtable | ダイニングテーブル |
dog | 犬 |
horse | 馬 |
motorbike | バイク |
person | 人 |
pottedplant | 鉢植え |
sheep | 羊 |
sofa | ソファー |
train | 列車 |
tvmonitor | テレビ |
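The `*_jp.py` scripts pair the two label files implicitly by indexing both lists with the same class ID. The same pairing can be made explicit with a dictionary; a sketch using short in-memory lists standing in for voc_labels.txt and voc_labels_jp.txt:

```python
# Shortened stand-ins for voc_labels.txt and voc_labels_jp.txt
en = ['background', 'car', 'person']
jp = ['背景', '車', '人']

# Both files list classes in the same ID order, so zip pairs them correctly
en_to_jp = dict(zip(en, jp))
print(en_to_jp['car'])   # 車
```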
vi object_detect2.py

# -*- coding: utf-8 -*-
##------------------------------------------
## OpenVINO™ model -mobilenet-ssd-
##   ** Object Detect ** camera
##   2021.01.18 Masahiro Izutsu
##
##   2021.02.10 warning error
##------------------------------------------
import cv2
import numpy as np

# Load the module
from openvino.inference_engine import IECore

# Load the labels
label = open('voc_labels.txt').readlines()
print(label)

# Create the Inference Engine core object
ie = IECore()

# Read the IR model files
model = './public/mobilenet-ssd/FP16/mobilenet-ssd'
net = ie.read_network(model=model+'.xml', weights=model+'.bin')

# Get the input/output blob names and the input blob shape
input_blob_name = net.input_info['data'].name
output_blob_name = list(net.outputs.keys())[0]
batch, channel, height, width = net.input_info[input_blob_name].input_data.shape

exec_net = ie.load_network(network=net, device_name='MYRIAD', num_requests=1)

# Set up the camera
cap = cv2.VideoCapture(0)

# Main loop
while True:
    ret, frame = cap.read()

    # Retry on read error
    if ret == False:
        continue

    # Convert to the input data format
    img = cv2.resize(frame, (width, height))
    img = img.transpose((2, 0, 1))
    img = img.reshape((1, channel, height, width))

    # Run inference
    out = exec_net.infer(inputs={input_blob_name: img})

    # Extract only the needed data from the output
    result = out[output_blob_name][0][0]
    img_h, img_w, _ = frame.shape

    # Process each detected object in turn
    for obj in result:
        imgid, clsid, confidence, x1, y1, x2, y2 = obj
        # Draw a bounding box only when the confidence exceeds 0.6
        if confidence > 0.6:
            x1 = int(x1 * img_w)
            y1 = int(y1 * img_h)
            x2 = int(x2 * img_w)
            y2 = int(y2 * img_h)
            color = (0, 255, 0)
            if label[int(clsid)][:-1] == 'car':
                color = (0, 255, 255)
            elif label[int(clsid)][:-1] == 'person':
                color = (0, 0, 255)
            cv2.rectangle(frame, (x1, y1), (x2, y2), color, thickness=2)
            cv2.putText(frame, label[int(clsid)][:-1], (x1, y1), cv2.FONT_HERSHEY_PLAIN, fontScale=2, color=color, thickness=2)

    # Show the image
    cv2.imshow('Object-Detect', frame)

    # Exit when any key is pressed
    key = cv2.waitKey(1)
    if key != -1:
        break

# Clean up
cap.release()
cv2.destroyAllWindows()
pi@raspberrypi:~/workspace $ python3 object_detect2.py
['background\n', 'aeroplane\n', 'bicycle\n', 'bird\n', 'boat\n', 'bottle\n', 'bus\n', 'car\n', 'cat\n', 'chair\n', 'cow\n', 'diningtable\n', 'dog\n', 'horse\n', 'motorbike\n', 'person\n', 'pottedplant\n', 'sheep\n', 'sofa\n', 'train\n', 'tvmonitor']
vi object_detect2_jp.py

# -*- coding: utf-8 -*-
##------------------------------------------
## OpenVINO™ model -mobilenet-ssd-
##   ** Object Detect ** camera Japanese
##   2021.01.18 Masahiro Izutsu
##
##   2021.02.10 warning error
##------------------------------------------
import cv2
import numpy as np
import myfunction

# Load the module
from openvino.inference_engine import IECore

# Load the labels
label = open('voc_labels.txt').readlines()
label_jp = open('voc_labels_jp.txt').readlines()
print(label, label_jp)

# Japanese font (this line was missing in the original listing; without it
# fontPIL is undefined below, as in object_detect1_jp.py)
fontPIL = 'NotoSansCJK-Bold.ttc'

# Create the Inference Engine core object
ie = IECore()

# Read the IR model files
model = './public/mobilenet-ssd/FP16/mobilenet-ssd'
net = ie.read_network(model=model+'.xml', weights=model+'.bin')

# Get the input/output blob names and the input blob shape
input_blob_name = net.input_info['data'].name
output_blob_name = list(net.outputs.keys())[0]
batch, channel, height, width = net.input_info[input_blob_name].input_data.shape

exec_net = ie.load_network(network=net, device_name='MYRIAD', num_requests=1)

# Set up the camera
cap = cv2.VideoCapture(0)

# Main loop
while True:
    ret, frame = cap.read()

    # Retry on read error
    if ret == False:
        continue

    # Convert to the input data format
    img = cv2.resize(frame, (width, height))
    img = img.transpose((2, 0, 1))
    img = img.reshape((1, channel, height, width))

    # Run inference
    out = exec_net.infer(inputs={input_blob_name: img})

    # Extract only the needed data from the output
    result = out[output_blob_name][0][0]
    img_h, img_w, _ = frame.shape

    # Process each detected object in turn
    for obj in result:
        imgid, clsid, confidence, x1, y1, x2, y2 = obj
        # Draw a bounding box only when the confidence exceeds 0.6
        if confidence > 0.6:
            x1 = int(x1 * img_w)
            y1 = int(y1 * img_h)
            x2 = int(x2 * img_w)
            y2 = int(y2 * img_h)
            color = (0, 255, 0)
            if label[int(clsid)][:-1] == 'car':
                color = (0, 255, 255)
            elif label[int(clsid)][:-1] == 'person':
                color = (0, 0, 255)
            cv2.rectangle(frame, (x1, y1), (x2, y2), color, thickness=2)
            myfunction.cv2_putText(img=frame, text=label_jp[int(clsid)][:-1], org=(x1, y1), fontFace=fontPIL, fontScale=12, color=color, mode=0)

    # Show the image
    cv2.imshow('Object-Detect', frame)

    # Exit when any key is pressed
    key = cv2.waitKey(1)
    if key != -1:
        break

# Clean up
cap.release()
cv2.destroyAllWindows()
pi@raspberrypi:~/workspace $ python3 object_detect2_jp.py
['background\n', 'aeroplane\n', 'bicycle\n', 'bird\n', 'boat\n', 'bottle\n', 'bus\n', 'car\n', 'cat\n', 'chair\n', 'cow\n', 'diningtable\n', 'dog\n', 'horse\n', 'motorbike\n', 'person\n', 'pottedplant\n', 'sheep\n', 'sofa\n', 'train\n', 'tvmonitor'] ['背景\n', '飛行機\n', '自転車\n', '鳥\n', 'ボート\n', 'ボトル\n', 'バス\n', '車\n', '猫\n', '椅子\n', '牛\n', 'ダイニングテーブル\n', '犬\n', '馬\n', 'バイク\n', '人\n', '鉢植え\n', '羊\n', 'ソファー\n', '列車\n', 'テレビ']
Perform object recognition on a video file, using the sample video from "YOLO v3" (champs-elysees.mp4).
$ cp object_detect2.py object_detect3.py
# Set up the capture source (video file instead of the camera)
cap = cv2.VideoCapture('../Videos/champs-elysees.mp4')
    # Exit at end of file
    if ret == False:
        print('File End')
        break
pi@raspberrypi:~/workspace $ python3 object_detect3.py
['background\n', 'aeroplane\n', 'bicycle\n', 'bird\n', 'boat\n', 'bottle\n', 'bus\n', 'car\n', 'cat\n', 'chair\n', 'cow\n', 'diningtable\n', 'dog\n', 'horse\n', 'motorbike\n', 'person\n', 'pottedplant\n', 'sheep\n', 'sofa\n', 'train\n', 'tvmonitor']
[ WARN:0] global ../opencv/modules/videoio/src/cap_gstreamer.cpp (919) open OpenCV | GStreamer warning: unable to query duration of stream
[ WARN:0] global ../opencv/modules/videoio/src/cap_gstreamer.cpp (956) open OpenCV | GStreamer warning: Cannot query video position: status=1, value=0, duration=-1
$ cp object_detect2_jp.py object_detect3_jp.py
# Set up the capture source (video file instead of the camera)
cap = cv2.VideoCapture('../Videos/champs-elysees.mp4')
    # Exit at end of file
    if ret == False:
        print('File End')
        break
pi@raspberrypi:~/workspace $ python3 object_detect3_jp.py
['background\n', 'aeroplane\n', 'bicycle\n', 'bird\n', 'boat\n', 'bottle\n', 'bus\n', 'car\n', 'cat\n', 'chair\n', 'cow\n', 'diningtable\n', 'dog\n', 'horse\n', 'motorbike\n', 'person\n', 'pottedplant\n', 'sheep\n', 'sofa\n', 'train\n', 'tvmonitor'] ['背景\n', '飛行機\n', '自転車\n', '鳥\n', 'ボート\n', 'ボトル\n', 'バス\n', '車\n', '猫\n', '椅子\n', '牛\n', 'ダイニングテーブル\n', '犬\n', '馬\n', 'バイク\n', '人\n', '鉢植え\n', '羊\n', 'ソファー\n', '列車\n', 'テレビ']
[ WARN:0] global ../opencv/modules/videoio/src/cap_gstreamer.cpp (919) open OpenCV | GStreamer warning: unable to query duration of stream
[ WARN:0] global ../opencv/modules/videoio/src/cap_gstreamer.cpp (956) open OpenCV | GStreamer warning: Cannot query video position: status=1, value=0, duration=-1
classification3.py:15: DeprecationWarning: 'inputs' property of IENetwork class is deprecated. To access DataPtrs user need to use 'input_data' property of InputInfoPtr objects which can be accessed by 'input_info' property.
# Get the input/output data keys (deprecated style that triggers the warning)
input_blob = next(iter(net.inputs))
# Get the input/output data keys (corrected style using input_info)
input_blob = net.input_info['data'].name