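Install the latest release of the OpenVINO™ toolkit, 2021.3, on Ubuntu 20.04 running on a Celeron® N3050.
The machine was previously used as a web server. Its OS is replaced, and we test whether it is usable for this purpose.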
$ sudo apt install openssh-server
$ sudo apt install net-tools
$ vi ~/.vimrc
set nocompatible
set backspace=indent,eol,start
set expandtab
set tabstop=4
set shiftwidth=4
set autoindent
$ sudo ln -s ~/.vimrc /root/.vimrc
$ sudo ls -la /root
total 24
drwx------  4 root root 4096 Mar 19 05:34 .
drwxr-xr-x 21 root root 4096 Mar 19 04:49 ..
-rw-r--r--  1 root root  570 Sep  9  2019 .bashrc
drwx------  2 root root 4096 Sep 26 09:13 .cache
-rw-r--r--  1 root root  148 Sep  9  2019 .profile
lrwxrwxrwx  1 root root   15 Mar 19 05:34 .vimrc -> /home/pi/.vimrc
drwx------  3 root root 4096 Mar 17 15:37 .vnc
$ tar -xvzf l_openvino_toolkit_p_2021.3.394.tgz
$ cd l_openvino_toolkit_p_2021.3.394
$ sudo ./install_GUI.sh
$ cd /opt/intel/openvino_2021/install_dependencies
$ sudo -E ./install_openvino_dependencies.sh
$ source /opt/intel/openvino_2021/bin/setupvars.sh
[setupvars.sh] OpenVINO environment initialized
To set the environment variables automatically whenever a shell starts, append the single line "source /opt/intel/openvino_2021/bin/setupvars.sh" to the end of the "~/.bashrc" file.
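The ~/.bashrc edit can also be scripted. The following is a minimal sketch, not part of the OpenVINO installer: `append_once` is a hypothetical helper name, and it adds the line only when an identical line is not already in the file, so running it repeatedly does not pile up duplicates.

```shell
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper name, not an OpenVINO command.
append_once() {
    file="$1"
    line="$2"
    # Make sure the file exists so grep does not fail on a missing path
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: fixed string instead of a regex
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (path taken from the text above):
# append_once ~/.bashrc "source /opt/intel/openvino_2021/bin/setupvars.sh"
```

The sketch assumes the target file already ends with a newline, which is the usual case for ~/.bashrc.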
$ cd /opt/intel/openvino_2021/deployment_tools/model_optimizer/install_prerequisites
$ sudo ./install_prerequisites.sh
※ The first run ended with an error, so the script was run again.
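Re-running a script after a failed first attempt can be wrapped in a small retry loop. This is only a sketch: `retry` is a hypothetical helper name, and whether a second attempt succeeds depends on why the first one failed.

```shell
# Run a command up to N times, stopping at the first success.
# retry is a hypothetical helper name, not part of OpenVINO.
retry() {
    attempts="$1"
    shift
    n=1
    while [ "$n" -le "$attempts" ]; do
        # Run the remaining arguments as a command; return on success
        "$@" && return 0
        echo "attempt $n failed" >&2
        n=$((n + 1))
    done
    return 1
}

# Usage (script name taken from the text above):
# retry 2 sudo ./install_prerequisites.sh
```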
pip3 install torch==1.8.0+cpu torchvision==0.9.0+cpu torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
$ pip3 install torch==1.8.1+cpu torchvision==0.9.1+cpu torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
$ pip3 install scipy
$ sudo usermod -a -G users "$(whoami)"
$ sudo cp /opt/intel/openvino_2021/inference_engine/external/97-myriad-usbboot.rules /etc/udev/rules.d/
$ sudo udevadm control --reload-rules
$ sudo udevadm trigger
$ sudo ldconfig
$ lsusb
Bus 001 Device 002: ID 03e7:2485 Intel Movidius MyriadX
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 002: ID 80ee:0021 VirtualBox USB Tablet
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
$ id mizutu
uid=1000(mizutu) gid=1000(mizutu) groups=1000(mizutu),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),100(users),120(lpadmin),131(lxd),132(sambashare)
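The lsusb check can be automated. The sketch below only looks for the vendor:product ID 03e7:2485 that appears in the listing above; `has_myriad` is a hypothetical helper name, and the ID is taken from this page rather than from a device database.

```shell
# Report whether an lsusb listing contains the Myriad X (NCS2) device.
# 03e7:2485 is the vendor:product ID shown in the lsusb output above.
# has_myriad is a hypothetical helper name; it reads the listing from stdin.
has_myriad() {
    grep -qi 'ID 03e7:2485'
}

# Usage:
# lsusb | has_myriad && echo "NCS2 found" || echo "NCS2 not found"
```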
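$ cd /opt/intel/openvino_2021/deployment_tools/demo
$ ./demo_security_barrier_camera.sh
$ ./demo_squeezenet_download_convert_run.sh
※ This ended with an error, so the already-executed directory was copied from another machine.
$ ./demo_benchmark_app.sh
※ This ended with an error, so the already-executed directory was copied from another machine.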
$ tar xvzf model_xxxxxxxx.tar.gz
Location of the pre-trained models
$ tar xvzf Images_xxxxxxxx.tar.gz
$ tar xvzf Videos_xxxxxxxx.tar.gz
$ tar xvzf run_app_xxxxxxxx.tar.gz
$ tar xvzf workspace_xxxxxxxx.tar.gz
$ tar xvzf omz_demos_build_xxxxxxxx.tar.gz
$ cd ~/run_app/
$ ./_benchmark_app.sh <MYRIAD/NCS2>
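When several archives have to be unpacked in one go, a loop over a glob avoids typing each file name. This is a sketch under the assumption that all archives sit in one directory and should be extracted there; `extract_all` is a hypothetical helper name.

```shell
# Extract every .tar.gz archive found in a directory into that directory.
# extract_all is a hypothetical helper name.
extract_all() {
    dir="$1"
    for archive in "$dir"/*.tar.gz; do
        # Skip the literal pattern when no archive matches the glob
        [ -e "$archive" ] || continue
        tar xzf "$archive" -C "$dir"
    done
}

# Usage: extract_all "$HOME"
```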
Item | Core™ i7-1185G7 GPU(32) | Core™ i7-1185G7 GPU(16) | Core™ i7-1185G7 CPU(32) | Core™ i7-1185G7 CPU(16) | Core™ i3-1115G4 | Core™ i5-10210U | Core™ i7-6700 | Core™ i7-2620M | Celeron® N3050 |
Duration (ms) | 1332 | 2263 | 2342 | 2389 | 4332 | 2119 | 2933 | 24671 | 60666 |
Latency (ms) | 5.17 | 7.85 | 9.42 | 9.60 | 3.89 | 8.41 | 9.90 | 22.21 | 57.54 |
Throughput (fps) | 751 | 442 | 427 | 419 | 230.8 | 472 | 341 | 40.5 | 16.48 |
Item (NCS2 attached to each host) | Core™ i7-1185G7 | Core™ i3-1115G4 | Core™ i5-10210U | Core™ i7-6700 | Core™ i7-2620M | Raspberry Pi4※ | Celeron® N3050 |
Duration (ms) | 3505 | 3601 | 3477 | 3716 | 10659 | 3533 | 8176 |
Latency (ms) | 14.0 | 14.29 | 13.86 | 14.69 | 41.04 | 14.08 | 33.23 |
Throughput (fps) | 285 | 278 | 288 | 269 | 93.8 | 283.07 | 122.31 |
※ CPU: Broadcom BCM2711, quad-core 1.5 GHz Arm Cortex-A72
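To put the two benchmark tables side by side, it helps to express one throughput as a multiple of another. The sketch below uses awk for the floating-point division; `speedup` is a hypothetical helper name, and the two figures are the Celeron® N3050 throughput values from the tables above (122.31 fps with the NCS2, 16.48 fps on the CPU alone).

```shell
# Ratio of two throughput figures, printed with two decimals.
# speedup is a hypothetical helper name; awk does the floating-point math.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Celeron N3050: NCS2 (122.31 fps) versus CPU alone (16.48 fps)
speedup 122.31 16.48   # prints 7.42
```

On these numbers the NCS2 gives the N3050 roughly 7.4 times its CPU-only throughput in this benchmark.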
$ cd ~/run_app/
Configuration | CPU/GPU |
① | GPU (FP16) Core™ i7-1185G7 |
② | GPU (FP32) Core™ i7-1185G7 |
③ | CPU Core™ i7-1185G7 |
④ | CPU Core™ i5-10210U |
⑤ | CPU Core™ i3-1115G4 |
⑥ | CPU Core™ i7-6700 |
⑦ | NCS2 (on Core™ i7-1185G7) |
⑧ | CPU Core™ i7-2620M |
⑨ | Celeron® N3050 |
$ ./_human_pose_estimation_3d_demo.sh
・ NCS2
$ ./_human_pose_estimation_3d_demo_16.sh
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
fps | 41 | 30.2 | 13.8 | 13.5 | 4.7 | 8.1 | 4.1 | 2.0 | 0.6 |
$ ./_action_recognition.sh
・ NCS2
$ ./_action_recognition_16.sh
Item | Unit | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
Data total | fps | 642 | 528 | 552 | 1152 | 365 | 532 | 435 | 573 | 216 |
Data total | ms | 1.56 | 1.89 | 1.81 | 0.87 | 2.74 | 1.88 | 2.30 | 1.75 | 4.63 |
Encoder total | fps | 197 | 186 | 184 | 235 | 42 | 118 | 125 | 13.9 | 7.4 |
Encoder total | ms | 5.1 | 5.38 | 5.43 | 4.3 | 23.7 | 8.44 | 7.99 | 71.9 | 135.1 |
Decoder total | fps | 1239 | 76.4 | 65.8 | 44 | 10.7 | 26.3 | 580 | 15.2 | 2.46 |
Decoder total | ms | 0.81 | 13.1 | 15.2 | 23 | 93.1 | 38.0 | 1.72 | 65.8 | 406.8 |
Render total | fps | 30.4 | 21.7 | 23.6 | 27 | 6.39 | 11.2 | 31.8 | 3.51 | 1.03 |
Render total | ms | 33 | 46.1 | 42.4 | 37 | 156 | 90.0 | 31.4 | 285 | 374.6 |
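In the table above, most fps and ms pairs look like reciprocals of each other (fps ≈ 1000 / ms). That is an observation from the numbers, not something the demo's documentation is quoted as saying here. A minimal sketch of the conversion, with `ms_to_fps` as a hypothetical helper name:

```shell
# Convert a per-frame time in milliseconds to frames per second.
# ms_to_fps is a hypothetical helper name; awk does the floating-point math.
ms_to_fps() {
    awk -v ms="$1" 'BEGIN { printf "%.1f\n", 1000 / ms }'
}

# "Data total" for configuration ①: 1.56 ms per frame
ms_to_fps 1.56   # prints 641.0 (the table lists 642)
```

Small differences from the table are expected because the listed ms values are already rounded.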
$ ./_object_detection_python_demo.sh
・ NCS2
$ ./_object_detection_python_demo_16.sh
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
Latency(ms) | 33.0 | 124 | 235 | 219.2 | 670 | 439 | 488 | 2236 | 8590 |
fps | 10.5 | 7.9 | 4.2 | 4.5 | 1.2 | 2.2 | 2.1 | 0.4 | 0.1 |
$ ./_human_pose_estimation.sh
・ NCS2
$ ./_human_pose_estimation_16.sh
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
Latency (ms) | 23.2 | 28.5 | 51.4 | 214.6 | 162.3 | 88.4 | 214.6 | 406 | 1194.8 |
FPS | 38.8 | 32.2 | 18.5 | 4.5 | 5.3 | 10.7 | 4.5 | 0.4 | 0.8 |
$ ./_gesture_recognition_demo.sh
・ NCS2
-- Not supported --
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
fps | 13.5 | 13.5 | 14.3 | 14.3 | 3.34 | 8.2 | X | 1.5 | 0.36 |
$ ./_handwritten_text_recognition_demo.sh
・ NCS2
$ ./_handwritten_text_recognition_demo_16.sh
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
Average throughput(ms) | 52.7 | 91.2 | 373 | 277 | 1129 | 495 | 314 | 1928 | 1148 |
$ ./_text_detection_demo.sh
・ NCS2
$ ./_text_detection_demo_16.sh
Item | Unit | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
detection model inference | ms | 40.9 | 62 | 103 | 102 | 386 | 194 | 648 | 1000 | 1907 |
detection model inference | fps | 24.5 | 16.1 | 9.68 | 9.8 | 2.59 | 9.13 | 1.54 | 1 | 0.52 |
detection model postprocessing | ms | 73.7 | 90.7 | 56.9 | 40.4 | 70.2 | 109.5 | 86.6 | 198.9 | 3183 |
detection model postprocessing | fps | 13.6 | 11.0 | 17.6 | 24.7 | 14.3 | 9.1 | 11.5 | 5.0 | 3.14 |
recognition model inference | ms | 54.3 | 59.0 | 8.14 | 7.1 | 25.2 | 14.14 | 76.6 | 59.6 | 278.6 |
recognition model inference | fps | 18.4 | 16.9 | 122.9 | 140 | 40.0 | 70.7 | 13.1 | 16.8 | 3.59 |
recognition model postprocessing | ms | 0.0091 | 0.0091 | 0.0094 | 0.007 | 0.0175 | 0.0205 | 0.018 | 0.0416 | 0.049 |
recognition model postprocessing | fps | 109375 | 108834 | 106020 | 136364 | 57143 | 488829 | 58852 | 24055 | 20423 |
$ ./_crossroad_camera_demo.sh
・ NCS2
$ ./_crossroad_camera_demo_16.sh
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
detection time(ms) | 20 | 17.6 | 24.3 | 30 | 81.0 | 58.6 | 269.2 | 319.5 | 679 |
fps | 74 | 56 | 40.1 | 32 | 12.34 | 17.6 | 3.71 | 3.13 | 1.9 |
$ ./_crossroad_camera_demo.sh
・ NCS2
$ ./_crossroad_camera_demo_16.sh
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
Latency (ms) | 23.5 | 28.7 | 51.8 | 43.4 | 161.0 | 93.5 | 43.4 | 427.2 | 1252.6 |
FPS | 37.9 | 31.3 | 18.5 | 21.5 | 6.0 | 9.8 | 21.5 | 2.2 | 0.8 |
$ ./_object_detection_demo.sh
・ NCS2
$ ./_object_detection_demo_16.sh
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
Latency(ms) | 20.3 | 30.9 | 44.6 | 22 | 89.5 | 98.9 | 22 | 214.9 | 432.6 |
fps | 214.7 | 151.7 | 104.5 | 73.0 | 21.0 | 52.6 | 73.0 | 8.6 | 4.3 |
$ ./_smart_classroom_demo.sh
・ NCS2
$ ./_smart_classroom_demo_16.sh
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
Mean Fps | 39.2 | 22.5 | 20.3 | 20 | 6.3 | 12.4 | 3.7 | 2.44 | 2.74 |
Frames processed | 363 | 87 | 117 | 168 | 61 | 125 | 25 | 23 | 18 |
$ ./_pedestrian_tracker_demo.sh
・ NCS2
-- Not supported --
$ ./_super_resolution_demo.sh
・ NCS2
$ ./_super_resolution_demo_16.sh
$ ./_single_human_pose_estimation_demo.sh
・ NCS2
$ ./_single_human_pose_estimation_demo_16.sh
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
summary (fps) | 4.5 | 3.2 | 1.5 | 0.9 | 0.5 | 0.5 | 0.3 | 0.2 | 0.0 |
estimation (fps) | 18.1 | 10.0 | 4.7 | 5.6 | 1.5 | 2.7 | 1.5 | 0.6 | 1.4 |
detection (fps) | 140.2 | 118 | 78.3 | 109.9 | 20.1 | 67.6 | 26.3 | 7.0 | 15.1 |
$ ./_interactive_face_detection_demo.sh
・ NCS2
$ ./_interactive_face_detection_demo_16.sh
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
Number of processed frames | 218 | 467 | 379 | 86 | 41 | 40 | 86 | 19 | 20 |
Total image throughput (fps) | 19.2 | 16.5 | 17.6 | 17.4 | 4.48 | 9.64 | 17.4 | 2.16 | 0.98 |
$ ./_gaze_estimation_demo.sh
・ NCS2
$ ./_gaze_estimation_demo_16.sh
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
Overall (fps) | 40 | 32 | 62 | 110 | 23 | 43 | 110 | 12 | 6 |
Inference (fps) | 46 | 41 | 75 | 202 | 35 | 55 | 202 | 14 | 8 |
$ ./_security_barrier_camera_demo.sh
・ NCS2
$ ./_security_barrier_camera_demo_16.sh
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
fps | 157.7 | 161.4 | 128.9 | 164.5 | 46.7 | 91.2 | 164.5 | 23.6 | 11.7 |
$ ./_image_inpainting_demo.sh
・ NCS2
$ ./_image_inpainting_demo_16.sh
$ ./_colorization_demo.sh
・ NCS2
$ ./_colorization_demo_16.sh
$ ./_deblurring_demo.sh
・ NCS2
$ ./_deblurring_demo_16.sh
Item | ①GPU(16) | ②GPU(32) | ③CPU(i7-11) | ④CPU(i5-10) | ⑤CPU(i3-11) | ⑥CPU(i7-6) | ⑦NCS2 | ⑧CPU(i7-2) | ⑨CPU(N3050) |
Latency(ms) | 160.0 | 182.9 | 164.3 | 213.4 | 629.6 | 298.4 | 213.4 | 1175 | 2952 |
fps | 5.9 | 5.1 | 5.8 | 4.5 | 1.6 | 3.2 | 4.5 | 0.8 | 0.3 |
Query all available Inference Engine devices and print their supported metrics and default configuration values.
$ cd ~/run_app/
$ ./_hello_query_device.sh
[hello_query_device.sh] 'hello_query_device' Run !!
Available devices:
[E:] [BSL] found 0 ioexpander device
  Device: CPU
  Metrics:
    AVAILABLE_DEVICES:
    SUPPORTED_METRICS: AVAILABLE_DEVICES, SUPPORTED_METRICS, FULL_DEVICE_NAME, OPTIMIZATION_CAPABILITIES, SUPPORTED_CONFIG_KEYS, RANGE_FOR_ASYNC_INFER_REQUESTS, RANGE_FOR_STREAMS
    FULL_DEVICE_NAME: Intel(R) Celeron(R) CPU N3050 @ 1.60GHz
    OPTIMIZATION_CAPABILITIES: FP32, FP16, INT8, BIN
    SUPPORTED_CONFIG_KEYS: CPU_BIND_THREAD, CPU_THREADS_NUM, CPU_THROUGHPUT_STREAMS, DUMP_EXEC_GRAPH_AS_DOT, DYN_BATCH_ENABLED, DYN_BATCH_LIMIT, ENFORCE_BF16, EXCLUSIVE_ASYNC_REQUESTS, PERF_COUNT
    RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 1, 1
    RANGE_FOR_STREAMS: 1, 2
  Default values for device configuration keys:
    CPU_BIND_THREAD: YES
    CPU_THREADS_NUM: 0
    CPU_THROUGHPUT_STREAMS: 1
    DUMP_EXEC_GRAPH_AS_DOT:
    DYN_BATCH_ENABLED: NO
    DYN_BATCH_LIMIT: 0
    ENFORCE_BF16: NO
    EXCLUSIVE_ASYNC_REQUESTS: NO
    PERF_COUNT: NO
  Device: GNA
  Metrics:
    GNA_LIBRARY_FULL_VERSION: 2.0.0.1047
    FULL_DEVICE_NAME: GNA_SW
    OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
    SUPPORTED_CONFIG_KEYS: EXCLUSIVE_ASYNC_REQUESTS, GNA_COMPACT_MODE, GNA_DEVICE_MODE, GNA_FIRMWARE_MODEL_IMAGE, GNA_FIRMWARE_MODEL_IMAGE_GENERATION, GNA_LIB_N_THREADS, GNA_PRECISION, GNA_PWL_UNIFORM_DESIGN, GNA_SCALE_FACTOR, GNA_SCALE_FACTOR_0, PERF_COUNT, SINGLE_THREAD
    SUPPORTED_METRICS: GNA_LIBRARY_FULL_VERSION, FULL_DEVICE_NAME, OPTIMAL_NUMBER_OF_INFER_REQUESTS, SUPPORTED_CONFIG_KEYS, SUPPORTED_METRICS, AVAILABLE_DEVICES
    AVAILABLE_DEVICES: GNA_SW
  Default values for device configuration keys:
    EXCLUSIVE_ASYNC_REQUESTS: NO
    GNA_COMPACT_MODE: NO
    GNA_DEVICE_MODE: GNA_SW_EXACT
    GNA_FIRMWARE_MODEL_IMAGE:
    GNA_FIRMWARE_MODEL_IMAGE_GENERATION:
    GNA_LIB_N_THREADS: 1
    GNA_PRECISION: I16
    GNA_PWL_UNIFORM_DESIGN: NO
    GNA_SCALE_FACTOR: 1.000000
    GNA_SCALE_FACTOR_0: 1.000000
    PERF_COUNT: NO
    SINGLE_THREAD: YES
  Device: MYRIAD
  Metrics:
    DEVICE_THERMAL: UNSUPPORTED TYPE
    OPTIMIZATION_CAPABILITIES: FP16
    RANGE_FOR_ASYNC_INFER_REQUESTS: 3, 6, 1
    SUPPORTED_METRICS: DEVICE_THERMAL, OPTIMIZATION_CAPABILITIES, RANGE_FOR_ASYNC_INFER_REQUESTS, SUPPORTED_METRICS, SUPPORTED_CONFIG_KEYS, FULL_DEVICE_NAME, AVAILABLE_DEVICES
    SUPPORTED_CONFIG_KEYS: DEVICE_ID, EXCLUSIVE_ASYNC_REQUESTS, LOG_LEVEL, VPU_MYRIAD_FORCE_RESET, VPU_MYRIAD_PLATFORM, VPU_CUSTOM_LAYERS, PERF_COUNT, VPU_PRINT_RECEIVE_TENSOR_TIME, CONFIG_FILE, VPU_HW_STAGES_OPTIMIZATION, MYRIAD_THROUGHPUT_STREAMS, MYRIAD_ENABLE_FORCE_RESET, MYRIAD_ENABLE_RECEIVING_TENSOR_TIME, MYRIAD_CUSTOM_LAYERS, MYRIAD_ENABLE_HW_ACCELERATION
    FULL_DEVICE_NAME: Intel Movidius Myriad X VPU
    AVAILABLE_DEVICES: 1.4-ma2480
  Default values for device configuration keys:
    DEVICE_ID:
    EXCLUSIVE_ASYNC_REQUESTS: NO
    LOG_LEVEL: LOG_NONE
    VPU_MYRIAD_FORCE_RESET: NO
    VPU_MYRIAD_PLATFORM:
    VPU_CUSTOM_LAYERS:
    PERF_COUNT: NO
    VPU_PRINT_RECEIVE_TENSOR_TIME: NO
    CONFIG_FILE:
    VPU_HW_STAGES_OPTIMIZATION: YES
    MYRIAD_THROUGHPUT_STREAMS: -1
    MYRIAD_ENABLE_FORCE_RESET: NO
    MYRIAD_ENABLE_RECEIVING_TENSOR_TIME: NO
    MYRIAD_CUSTOM_LAYERS:
    MYRIAD_ENABLE_HW_ACCELERATION: YES
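The full listing is long, so a short filter can reduce it to one line per device. This sketch assumes the tool prints each "Device:" and "FULL_DEVICE_NAME:" entry on its own line, as in a normal terminal session; `summarize_devices` is a hypothetical helper name, and the exact spacing around the colon may differ between OpenVINO releases.

```shell
# Print one line per device: "<device>: <full device name>".
# summarize_devices is a hypothetical helper; it reads the tool's output
# from stdin and relies on the "Device:" and "FULL_DEVICE_NAME:" labels.
summarize_devices() {
    awk '
        # Remember the most recent device name
        /Device:/ { sub(/^.*Device:[ \t]*/, ""); dev = $0; next }
        # Only the metric line itself has a colon right after the key;
        # SUPPORTED_METRICS lists the key followed by a comma instead
        /FULL_DEVICE_NAME[ \t]*:/ && dev != "" {
            sub(/^.*FULL_DEVICE_NAME[ \t]*:[ \t]*/, "")
            print dev ": " $0
            dev = ""
        }
    '
}

# Usage: ./_hello_query_device.sh | summarize_devices
```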