OpenVINOtest - PukiWiki

OpenVINO™ Operation Test


※ Last updated: 2021/07/15

OpenVINO™ Benchmark Test

OpenVINO™ Benchmark Test Overview

　Running the sample demo "demo_benchmark_app.sh", which is provided with the OpenVINO™ installation, installs the benchmark test tool.

Install location: ~/inference_engine_samples_build/intel64/Release/
Executable      : benchmark_app
Command         : ./benchmark_app -m <model> -i <input> -d <device CPU/GPU/MYRIAD>
Tool details    : Benchmark C++ Tool

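※ benchmark_app command options (excerpt)

Option	Description
-m PATH_TO_MODEL	IR model file (.xml) used for inference
-d TARGET_DEVICE	Inference device. CPU, GPU, MYRIAD, HDDL, HETERO:FPGA,CPU and so on can be specified
-niter NUMBER_ITERATIONS	Number of inference iterations to run. If omitted, inference runs for about 1 minute (the help text below states that the count is derived per device)
-nireq NUMBER_INFER_REQUESTS	Number of concurrent infer requests. For example, 4 submits four inference requests to the same device at once. To raise throughput, it is important to choose a request count that suits the device. If omitted, benchmark_app automatically picks a suitable count for the inference device
-b BATCH_SIZE	Batch size for inference
-i PATH_TO_INPUT	Input image file used for inference. Optional for benchmark_app (benchmarking works without input data)
-pc	Prints a detailed per-layer execution report, including per-layer execution times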
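
As a rough illustration of how the options in the table combine, the sketch below assembles a command line from a device, a model path, an iteration count and an optional request count. The function name `build_benchmark_cmd` and its defaults are hypothetical; only the flags come from the table above. It prints the command rather than running it, so the result can be checked first.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a benchmark_app command line.
# Only -d, -m, -niter and -nireq are used; all appear in the option table.
build_benchmark_cmd() {
    device=${1:-CPU}     # target device, e.g. CPU, GPU, MYRIAD
    model=$2             # path to the IR model (.xml)
    niter=${3:-1000}     # number of inference iterations
    nireq=$4             # optional: number of concurrent infer requests

    cmd="./benchmark_app -d $device -m $model -niter $niter"
    if [ -n "$nireq" ]; then
        cmd="$cmd -nireq $nireq"
    fi
    echo "$cmd"
}

build_benchmark_cmd MYRIAD "$HOME/model/public/FP16/squeezenet1.1.xml" 1000 4
```

Leaving out the fourth argument omits -nireq, which lets benchmark_app choose the request count itself.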

▼ benchmark_app command options in detail

$ cd ~/inference_engine_samples_build/intel64/Release/
$ ./benchmark_app -h
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters

benchmark_app [OPTION]
Options:

    -h, --help                Print a usage message
    -m "<path>"               Required. Path to an .xml/.onnx/.prototxt file with a trained model or to a .blob files with a trained compiled model.
    -i "<path>"               Optional. Path to a folder with images and/or binaries or to specific image or binary file.
    -d "<device>"             Optional. Specify a target device to infer on (the list of available devices is shown below). Default value is CPU. Use "-d HETERO:<comma-separated_devices_list>" format to specify HETERO plugin. Use "-d MULTI:<comma-separated_devices_list>" format to specify MULTI plugin. The application looks for a suitable plugin for the specified device.
    -l "<absolute_path>"      Required for CPU custom layers. Absolute path to a shared library with the kernels implementations.
          Or
    -c "<absolute_path>"      Required for GPU custom kernels. Absolute path to an .xml file with the kernels description.
    -api "<sync/async>"       Optional. Enable Sync/Async API. Default value is "async".
    -niter "<integer>"        Optional. Number of iterations. If not specified, the number of iterations is calculated depending on a device.
    -nireq "<integer>"        Optional. Number of infer requests. Default value is determined automatically for device.
    -b "<integer>"            Optional. Batch size value. If not specified, the batch size value is determined from Intermediate Representation.
    -stream_output            Optional. Print progress as a plain text. When specified, an interactive progress bar is replaced with a multiline output.
    -t                        Optional. Time in seconds to execute topology.
    -progress                 Optional. Show progress bar (can affect performance measurement). Default values is "false".
    -shape                    Optional. Set shape for input. For example, "input1[1,3,224,224],input2[1,4]" or "[1,3,224,224]" in case of one input size.
    -layout                   Optional. Prompts how network layouts should be treated by application. For example, "input1[NCHW],input2[NC]" or "[NCHW]" in case of one input size.

  device-specific performance options:
    -nstreams "<integer>"     Optional. Number of streams to use for inference on the CPU, GPU or MYRIAD devices (for HETERO and MULTI device cases use format <dev1>:<nstreams1>,<dev2>:<nstreams2> or just <nstreams>). Default value is determined automatically for a device.Please note that although the automatic selection usually provides a reasonable performance, it still may be non - optimal for some cases, especially for very small networks. See sample's README for more details. Also, using nstreams>1 is inherently throughput-oriented option, while for the best-latency estimations the number of streams should be set to 1.
    -nthreads "<integer>"     Optional. Number of threads to use for inference on the CPU (including HETERO and MULTI cases).
    -enforcebf16              Optional. Enforcing of floating point operations execution in bfloat16 precision where it is acceptable.
    -pin "YES"/"NO"/"NUMA"    Optional. Enable threads->cores ("YES", default), threads->(NUMA)nodes ("NUMA") or completely disable ("NO") CPU threads pinning for CPU-involved inference.

  Statistics dumping options:
    -report_type "<type>"     Optional. Enable collecting statistics report. "no_counters" report contains configuration options specified, resulting FPS and latency. "average_counters" report extends "no_counters" report and additionally includes average PM counters values for each layer from the network. "detailed_counters" report extends "average_counters" report and additionally includes per-layer PM counters and latency for each executed infer request.
    -report_folder            Optional. Path to a folder where statistics report is stored.
    -exec_graph_path          Optional. Path to a file where to store executable graph information serialized.
    -pc                       Optional. Report performance counters.
    -dump_config              Optional. Path to XML/YAML/JSON file to dump IE parameters, which were set by application.
    -load_config              Optional. Path to XML/YAML/JSON file to load custom IE parameters. Please note, command line parameters have higher priority then parameters from configuration file.
    -qb                       Optional. Weight bits for quantization:  8 or 16 (default)
[E:] [BSL] found 0 ioexpander device

Available target devices:  CPU  GNA  MYRIAD

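Benchmark Test Run Script

　Create a wrapper script in "~/run_app/" so that the parameters can be omitted.

Command

 ./_benchmark_app.sh <device> [<model path>] [<input file path>]

Default parameter values

 Device         : CPU
 Model path     : ~/model/public/FP32/squeezenet1.1.xml
 Input file path: none

Shorthand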

$ cd ~/run_app/
$ ./_benchmark_app.sh <GPU32/GPU16/CPU32/CPU16/MYRIAD/NCS2>

▼ "_benchmark_app.sh"

#!/bin/sh
# Wrapper around benchmark_app with a default device and model.
# Usage: ./_benchmark_app.sh <device> [<model path>] [<input file path>]

PROG_NAME='benchmark_app'
RUN_DIR="$HOME/inference_engine_samples_build/intel64/Release/"
DEVICE="CPU"
MODEL_FILE="$HOME/model/public/FP32/squeezenet1.1.xml"
INPUT_FILE=""

echo [$PROG_NAME.sh] \'$PROG_NAME\' Run !!
cd "$RUN_DIR" || exit 1

if [ $# = 1 ]
then
    # One argument: a device name, or a shorthand that also selects the model precision.
    DEVICE=$1
    if [ "$1" = 'GPU32' ]
    then
        DEVICE="GPU"
    elif [ "$1" = 'GPU16' ]
    then
        DEVICE="GPU"
        MODEL_FILE="$HOME/model/public/FP16/squeezenet1.1.xml"
    elif [ "$1" = 'CPU32' ]
    then
        DEVICE="CPU"
        MODEL_FILE="$HOME/model/public/FP32/squeezenet1.1.xml"
    elif [ "$1" = 'CPU16' ]
    then
        DEVICE="CPU"
        MODEL_FILE="$HOME/model/public/FP16/squeezenet1.1.xml"
    elif [ "$1" = 'MYRIAD' ]
    then
        DEVICE="MYRIAD"
        MODEL_FILE="$HOME/model/public/FP16/squeezenet1.1.xml"
    elif [ "$1" = 'NCS2' ]
    then
        DEVICE="MYRIAD"
        MODEL_FILE="$HOME/model/public/FP16/squeezenet1.1.xml"
    fi
elif [ $# = 2 ]
then
    # Two arguments: device and model path.
    DEVICE=$1
    MODEL_FILE=$2
elif [ $# = 3 ]
then
    # Three arguments: device, model path and input file.
    DEVICE=$1
    MODEL_FILE=$2
    # Quoted on purpose: unquoted, the shell would try to run $3 as a command.
    INPUT_FILE="-i $3"
fi

echo \'command: ./$PROG_NAME -d $DEVICE -m $MODEL_FILE $INPUT_FILE -pc -niter 1000\'

# $INPUT_FILE is left unquoted so that "-i <path>" splits into two arguments.
./$PROG_NAME -d $DEVICE -m $MODEL_FILE $INPUT_FILE -pc -niter 1000
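
The shorthand handling in the script above could also be written as a single `case` statement. The following is only a sketch of that alternative, not the author's script: the function name `resolve_alias` is hypothetical, and it returns the device and model precision as one string instead of setting the variables directly. Names it does not recognize are passed through as the device with FP32 assumed, which mirrors the script's FP32 default model.

```shell
#!/bin/sh
# Sketch: map a shorthand (GPU32, CPU16, NCS2, ...) to "<device> <precision>".
# Unknown names are passed through as the device, with FP32 assumed.
resolve_alias() {
    case "$1" in
        GPU32)        echo "GPU FP32" ;;
        GPU16)        echo "GPU FP16" ;;
        CPU32)        echo "CPU FP32" ;;
        CPU16)        echo "CPU FP16" ;;
        MYRIAD|NCS2)  echo "MYRIAD FP16" ;;
        *)            echo "$1 FP32" ;;
    esac
}

resolve_alias NCS2
resolve_alias GPU16
```

The precision part can then be substituted into the model path, for example ~/model/public/<precision>/squeezenet1.1.xml.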

OpenVINO™ Benchmark Test Results

Item	Core™ i7-1185G7 GPU (FP32)	Core™ i7-1185G7 GPU (FP16)	Core™ i7-1185G7 CPU (FP32)	Core™ i7-1185G7 CPU (FP16)	Core™ i3-1115G4	Core™ i5-10210U	Core™ i7-6700	Core™ i7-2620M	Core™ i5-M520
Duration (ms)	1332	2263	2342	2389	4332	2119	2933	24671	32341
Latency (ms)	5.17	7.85	9.42	9.60	3.89	8.41	9.90	22.21	30.36
Throughput (fps)	751	442	427	419	230.8	472	341	40.5	30.92
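
The Duration and Throughput rows above appear to be related by throughput = iterations / duration, with every run using 1000 iterations. A minimal sketch of that arithmetic follows; the function name `fps` is hypothetical, and `awk` is used only because plain sh has no floating-point arithmetic.

```shell
#!/bin/sh
# Throughput in frames per second from an iteration count and a duration in ms.
fps() {
    awk -v n="$1" -v ms="$2" 'BEGIN { printf "%.2f\n", n * 1000 / ms }'
}

fps 1000 2119.29    # Core i5-10210U run
fps 1000 24671.35   # Core i7-2620M run
```

Small differences from the logged FPS values are possible, because the logged durations are themselves rounded.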

▼ benchmark_app run commands

　Change to the run directory

$ cd ~/inference_engine_samples_build/intel64/Release

● Core™ i7-1185G7
・GPU (FP32)

$ ./benchmark_app -d GPU -m ~/model/public/FP32/squeezenet1.1.xml -pc -niter 1000
    :
Full device name: Intel(R) Gen12LP HD Graphics (iGPU)

Count:      1000 iterations
Duration:   1332.08 ms
Latency:    5.17 ms
Throughput: 750.70 FPS

・GPU (FP16)

$ ./benchmark_app -d GPU -m ~/model/public/FP16/squeezenet1.1.xml -pc -niter 1000
    :
Full device name: Intel(R) Gen12LP HD Graphics (iGPU)

Count:      1000 iterations
Duration:   2263.99 ms
Latency:    7.85 ms
Throughput: 441.70 FPS

・CPU (FP32)

$ ./benchmark_app -d CPU -m ~/model/public/FP32/squeezenet1.1.xml -pc -niter 1000
    :
Full device name: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz

Count:      1000 iterations
Duration:   2341.66 ms
Latency:    9.42 ms
Throughput: 427.05 FPS

・CPU (FP16)

$ ./benchmark_app -d CPU -m ~/model/public/FP16/squeezenet1.1.xml -pc -niter 1000
    :
Full device name: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz

Count:      1000 iterations
Duration:   2389.01 ms
Latency:    9.60 ms
Throughput: 418.58 FPS

● Core™ i3-1115G4

./benchmark_app -d CPU -m ~/model/public/FP32/squeezenet1.1.xml -pc -niter 1000

● Core™ i5-10210U

./benchmark_app -d CPU -m ~/model/public/FP32/squeezenet1.1.xml -pc -niter 1000
    :
Full device name: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz

Count:      1000 iterations
Duration:   2119.29 ms
Latency:    8.41 ms
Throughput: 471.86 FPS

● Core™ i7-6700

./benchmark_app -d CPU -m ~/model/public/FP32/squeezenet1.1.xml -pc -niter 1000
    :
Full device name: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz

Count:      1000 iterations
Duration:   2932.73 ms
Latency:    9.90 ms
Throughput: 340.98 FPS

● Core™ i7-2620M

./benchmark_app -d CPU -m ~/model/public/FP32/squeezenet1.1.xml -pc -niter 1000
    :
Full device name:        Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz

Count:      1000 iterations
Duration:   24671.35 ms
Latency:    22.21 ms
Throughput: 40.53 FPS
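
With -pc enabled the full output is long, so it can help to keep only the summary lines shown in the excerpts above. The sketch below filters standard input for them; the function name `summarize` is hypothetical, and the pattern assumes the summary labels appear at the start of a line, as they do in these logs.

```shell
#!/bin/sh
# Keep only the summary lines (Count, Duration, Latency, Throughput)
# from benchmark_app output read on standard input.
summarize() {
    grep -E '^(Count|Duration|Latency|Throughput):'
}

# Demonstration on a fragment copied from the Core i7-6700 run above.
summarize <<'EOF'
Full device name: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz

Count:      1000 iterations
Duration:   2932.73 ms
Latency:    9.90 ms
Throughput: 340.98 FPS
EOF
```

In practice the benchmark output would be piped into it, for example `./benchmark_app ... | summarize`.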

Neural Compute Stick 2 (NCS2) Execution Speed

Item (NCS2 on host)	Core™ i7-1185G7	Core™ i3-1115G4	Core™ i5-10210U	Core™ i7-6700	Core™ i7-2620M	Raspberry Pi 4 ※
Duration (ms)	3505	3601	3477	3716	10659	3533
Latency (ms)	14.0	14.29	13.86	14.69	41.04	14.08
Throughput (fps)	285	278	288	269	93.8	283.07

　　※ CPU: Broadcom BCM2711, 4 cores, 1.5 GHz Arm Cortex-A72

▼ Running benchmark_app on the Neural Compute Stick 2 (NCS2)

　Change to the run directory

$ cd ~/inference_engine_samples_build/intel64/Release

● Core™ i7-1185G7

./benchmark_app -d MYRIAD -m ~/model/public/FP16/squeezenet1.1.xml -pc -niter 1000
    :
Full device name: Intel Movidius Myriad X VPU

Count:      1000 iterations
Duration:   3504.72 ms
Latency:    14.00 ms
Throughput: 285.33 FPS

● Core™ i3-1115G4

./benchmark_app -d MYRIAD -m ~/model/public/FP16/squeezenet1.1.xml -pc -niter 1000

● Core™ i5-10210U

./benchmark_app -d MYRIAD -m ~/model/public/FP16/squeezenet1.1.xml -pc -niter 1000
    :
Full device name: Intel Movidius Myriad X VPU

Count:      1000 iterations
Duration:   3476.72 ms
Latency:    13.86 ms
Throughput: 287.63 FPS

● Core™ i7-6700

./benchmark_app -d MYRIAD -m ~/model/public/FP16/squeezenet1.1.xml -pc -niter 1000
    :
Full device name: Intel Movidius Myriad X VPU

Count:      1000 iterations
Duration:   3716.35 ms
Latency:    14.69 ms
Throughput: 269.08 FPS

● Core™ i7-2620M

./benchmark_app -d MYRIAD -m ~/model/public/FP16/squeezenet1.1.xml -pc -niter 1000
    :
Full device name: Intel Movidius Myriad X VPU

Count:      1000 iterations
Duration:   10659.16 ms
Latency:    41.04 ms
Throughput: 93.82 FPS

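On reasonably fast hosts the NCS2 runs at its native speed, but on an underpowered host its speed is limited by the host's specs (the Core™ i7-2620M reaches only 93.8 fps, against roughly 270 to 290 fps on the other hosts).
When running on something like a Raspberry Pi, driving multiple sticks may be worth considering.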

Neural Compute Stick 2 (NCS2) Execution Speed in Parallel Operation

Item (NCS2 on Core™ i7-1185G7)	3 sticks	2 sticks	1 stick
Duration (ms)	1181	1755	3505
Latency (ms)	-	-	14.0
Throughput (fps)	853.6	569.7	285
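
To make the scaling in the table above easier to read, the throughput with several sticks can be divided by the single-stick throughput. The sketch below does this with the logged FPS values; the function name `speedup` is hypothetical.

```shell
#!/bin/sh
# Ratio of multi-stick throughput to single-stick throughput.
speedup() {
    awk -v multi="$1" -v single="$2" 'BEGIN { printf "%.2f\n", multi / single }'
}

speedup 853.58 285.33   # 3 sticks vs. 1 stick
speedup 569.72 285.33   # 2 sticks vs. 1 stick
```

Both ratios come out close to the number of sticks, which suggests near-linear scaling on this host for this model.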

▼ Running benchmark_app with multiple Neural Compute Stick 2 (NCS2) devices

　Change to the script directory

$ cd ~/run_app/

● 3 sticks

$ ./_benchmark_app.sh MULTI:MYRIAD.3.4.1-ma2480,MYRIAD.3.4.3-ma2480,MYRIAD.3.4.4-ma2480 ~/model/public/FP16/squeezenet1.1.xml
[benchmark_app.sh] 'benchmark_app' Run !!
'command: ./benchmark_app -d MULTI:MYRIAD.3.4.1-ma2480,MYRIAD.3.4.3-ma2480,MYRIAD.3.4.4-ma2480 -m /home/mizutu/model/public/FP16/squeezenet1.1.xml -pc -niter 1000'
    :
    :
Full device name: 

Count:      1008 iterations
Duration:   1180.91 ms
Throughput: 853.58 FPS

● 2 sticks

$ ./_benchmark_app.sh MULTI:MYRIAD.3.4.1-ma2480,MYRIAD.3.4.3-ma2480 ~/model/public/FP16/squeezenet1.1.xml
[benchmark_app.sh] 'benchmark_app' Run !!
'command: ./benchmark_app -d MULTI:MYRIAD.3.4.1-ma2480,MYRIAD.3.4.3-ma2480 -m /home/mizutu/model/public/FP16/squeezenet1.1.xml -pc -niter 1000'
    :
    :
Full device name: 

Count:      1000 iterations
Duration:   1755.24 ms
Throughput: 569.72 FPS

▼ hello_query_device output

● 3 sticks

$ ./_hello_query_device.sh 
[hello_query_device.sh] 'hello_query_device' Run !!
Available devices:
[E:] [BSL] found 0 ioexpander device
	Device: CPU
	Metrics:
		AVAILABLE_DEVICES: 
		SUPPORTED_METRICS: AVAILABLE_DEVICES, SUPPORTED_METRICS, FULL_DEVICE_NAME, OPTIMIZATION_CAPABILITIES, SUPPORTED_CONFIG_KEYS, RANGE_FOR_ASYNC_INFER_REQUESTS, RANGE_FOR_STREAMS
		FULL_DEVICE_NAME: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
		OPTIMIZATION_CAPABILITIES: WINOGRAD, FP32, FP16, INT8, BIN
		SUPPORTED_CONFIG_KEYS: CPU_BIND_THREAD, CPU_THREADS_NUM, CPU_THROUGHPUT_STREAMS, DUMP_EXEC_GRAPH_AS_DOT, DYN_BATCH_ENABLED, DYN_BATCH_LIMIT, ENFORCE_BF16, EXCLUSIVE_ASYNC_REQUESTS, PERF_COUNT
		RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 1, 1
		RANGE_FOR_STREAMS: 1, 8

	Default values for device configuration keys:
		CPU_BIND_THREAD: YES
		CPU_THREADS_NUM: 0
		CPU_THROUGHPUT_STREAMS: 1
		DUMP_EXEC_GRAPH_AS_DOT: 
		DYN_BATCH_ENABLED: NO
		DYN_BATCH_LIMIT: 0
		ENFORCE_BF16: NO
		EXCLUSIVE_ASYNC_REQUESTS: NO
		PERF_COUNT: NO
	Device: GNA
	Metrics:
		GNA_LIBRARY_FULL_VERSION: 2.0.0.1047
		FULL_DEVICE_NAME: GNA_SW
		OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
		SUPPORTED_CONFIG_KEYS: EXCLUSIVE_ASYNC_REQUESTS, GNA_COMPACT_MODE, GNA_DEVICE_MODE, GNA_FIRMWARE_MODEL_IMAGE, GNA_FIRMWARE_MODEL_IMAGE_GENERATION, GNA_LIB_N_THREADS, GNA_PRECISION, GNA_PWL_UNIFORM_DESIGN, GNA_SCALE_FACTOR, GNA_SCALE_FACTOR_0, PERF_COUNT, SINGLE_THREAD
		SUPPORTED_METRICS: GNA_LIBRARY_FULL_VERSION, FULL_DEVICE_NAME, OPTIMAL_NUMBER_OF_INFER_REQUESTS, SUPPORTED_CONFIG_KEYS, SUPPORTED_METRICS, AVAILABLE_DEVICES
		AVAILABLE_DEVICES: GNA_SW

	Default values for device configuration keys:
		EXCLUSIVE_ASYNC_REQUESTS: NO
		GNA_COMPACT_MODE: NO
		GNA_DEVICE_MODE: GNA_SW_EXACT
		GNA_FIRMWARE_MODEL_IMAGE: 
		GNA_FIRMWARE_MODEL_IMAGE_GENERATION: 
		GNA_LIB_N_THREADS: 1
		GNA_PRECISION: I16
		GNA_PWL_UNIFORM_DESIGN: NO
		GNA_SCALE_FACTOR: 1.000000
		GNA_SCALE_FACTOR_0: 1.000000
		PERF_COUNT: NO
		SINGLE_THREAD: YES
	Device: GPU
	Metrics:
		AVAILABLE_DEVICES: 0
		SUPPORTED_METRICS: AVAILABLE_DEVICES, SUPPORTED_METRICS, FULL_DEVICE_NAME, OPTIMIZATION_CAPABILITIES, SUPPORTED_CONFIG_KEYS, RANGE_FOR_ASYNC_INFER_REQUESTS, RANGE_FOR_STREAMS
		FULL_DEVICE_NAME: Intel(R) Gen12LP HD Graphics (iGPU)
		OPTIMIZATION_CAPABILITIES: FP32, BIN, FP16, INT8
		SUPPORTED_CONFIG_KEYS: CACHE_DIR, CLDNN_ENABLE_FP16_FOR_QUANTIZED_MODELS, CLDNN_GRAPH_DUMPS_DIR, CLDNN_MEM_POOL, CLDNN_NV12_TWO_INPUTS, CLDNN_PLUGIN_PRIORITY, CLDNN_PLUGIN_THROTTLE, CLDNN_SOURCES_DUMPS_DIR, CONFIG_FILE, DEVICE_ID, DUMP_KERNELS, DYN_BATCH_ENABLED, EXCLUSIVE_ASYNC_REQUESTS, GPU_THROUGHPUT_STREAMS, PERF_COUNT, TUNING_FILE, TUNING_MODE
		RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 2, 1
		RANGE_FOR_STREAMS: 1, 2

	Default values for device configuration keys:
		CACHE_DIR: 
		CLDNN_ENABLE_FP16_FOR_QUANTIZED_MODELS: YES
		CLDNN_GRAPH_DUMPS_DIR: 
		CLDNN_MEM_POOL: YES
		CLDNN_NV12_TWO_INPUTS: NO
		CLDNN_PLUGIN_PRIORITY: 0
		CLDNN_PLUGIN_THROTTLE: 0
		CLDNN_SOURCES_DUMPS_DIR: 
		CONFIG_FILE: 
		DEVICE_ID: 
		DUMP_KERNELS: NO
		DYN_BATCH_ENABLED: NO
		EXCLUSIVE_ASYNC_REQUESTS: NO
		GPU_THROUGHPUT_STREAMS: 1
		PERF_COUNT: NO
		TUNING_FILE: 
		TUNING_MODE: TUNING_DISABLED
	Device: MYRIAD.3.4.1-ma2480
	Metrics:
		DEVICE_THERMAL: UNSUPPORTED TYPE
		OPTIMIZATION_CAPABILITIES: FP16
		RANGE_FOR_ASYNC_INFER_REQUESTS: 3, 6, 1
		SUPPORTED_METRICS: DEVICE_THERMAL, OPTIMIZATION_CAPABILITIES, RANGE_FOR_ASYNC_INFER_REQUESTS, SUPPORTED_METRICS, SUPPORTED_CONFIG_KEYS, FULL_DEVICE_NAME, AVAILABLE_DEVICES
		SUPPORTED_CONFIG_KEYS: DEVICE_ID, EXCLUSIVE_ASYNC_REQUESTS, LOG_LEVEL, VPU_MYRIAD_FORCE_RESET, VPU_MYRIAD_PLATFORM, VPU_CUSTOM_LAYERS, PERF_COUNT, VPU_PRINT_RECEIVE_TENSOR_TIME, CONFIG_FILE, VPU_HW_STAGES_OPTIMIZATION, MYRIAD_THROUGHPUT_STREAMS, MYRIAD_ENABLE_FORCE_RESET, MYRIAD_ENABLE_RECEIVING_TENSOR_TIME, MYRIAD_CUSTOM_LAYERS, MYRIAD_ENABLE_HW_ACCELERATION
		FULL_DEVICE_NAME: Intel Movidius Myriad X VPU
		AVAILABLE_DEVICES: 3.4.1-ma2480, 3.4.3-ma2480, 3.4.4-ma2480

	Default values for device configuration keys:
		DEVICE_ID: 
		EXCLUSIVE_ASYNC_REQUESTS: NO
		LOG_LEVEL: LOG_NONE
		VPU_MYRIAD_FORCE_RESET: NO
		VPU_MYRIAD_PLATFORM: 
		VPU_CUSTOM_LAYERS: 
		PERF_COUNT: NO
		VPU_PRINT_RECEIVE_TENSOR_TIME: NO
		CONFIG_FILE: 
		VPU_HW_STAGES_OPTIMIZATION: YES
		MYRIAD_THROUGHPUT_STREAMS: -1
		MYRIAD_ENABLE_FORCE_RESET: NO
		MYRIAD_ENABLE_RECEIVING_TENSOR_TIME: NO
		MYRIAD_CUSTOM_LAYERS: 
		MYRIAD_ENABLE_HW_ACCELERATION: YES
	Device: MYRIAD.3.4.3-ma2480
	Metrics:
		DEVICE_THERMAL: UNSUPPORTED TYPE
		OPTIMIZATION_CAPABILITIES: FP16
		RANGE_FOR_ASYNC_INFER_REQUESTS: 3, 6, 1
		SUPPORTED_METRICS: DEVICE_THERMAL, OPTIMIZATION_CAPABILITIES, RANGE_FOR_ASYNC_INFER_REQUESTS, SUPPORTED_METRICS, SUPPORTED_CONFIG_KEYS, FULL_DEVICE_NAME, AVAILABLE_DEVICES
		SUPPORTED_CONFIG_KEYS: DEVICE_ID, EXCLUSIVE_ASYNC_REQUESTS, LOG_LEVEL, VPU_MYRIAD_FORCE_RESET, VPU_MYRIAD_PLATFORM, VPU_CUSTOM_LAYERS, PERF_COUNT, VPU_PRINT_RECEIVE_TENSOR_TIME, CONFIG_FILE, VPU_HW_STAGES_OPTIMIZATION, MYRIAD_THROUGHPUT_STREAMS, MYRIAD_ENABLE_FORCE_RESET, MYRIAD_ENABLE_RECEIVING_TENSOR_TIME, MYRIAD_CUSTOM_LAYERS, MYRIAD_ENABLE_HW_ACCELERATION
		FULL_DEVICE_NAME: Intel Movidius Myriad X VPU
		AVAILABLE_DEVICES: 3.4.1-ma2480, 3.4.3-ma2480, 3.4.4-ma2480

	Default values for device configuration keys:
		DEVICE_ID: 
		EXCLUSIVE_ASYNC_REQUESTS: NO
		LOG_LEVEL: LOG_NONE
		VPU_MYRIAD_FORCE_RESET: NO
		VPU_MYRIAD_PLATFORM: 
		VPU_CUSTOM_LAYERS: 
		PERF_COUNT: NO
		VPU_PRINT_RECEIVE_TENSOR_TIME: NO
		CONFIG_FILE: 
		VPU_HW_STAGES_OPTIMIZATION: YES
		MYRIAD_THROUGHPUT_STREAMS: -1
		MYRIAD_ENABLE_FORCE_RESET: NO
		MYRIAD_ENABLE_RECEIVING_TENSOR_TIME: NO
		MYRIAD_CUSTOM_LAYERS: 
		MYRIAD_ENABLE_HW_ACCELERATION: YES
	Device: MYRIAD.3.4.4-ma2480
	Metrics:
		DEVICE_THERMAL: UNSUPPORTED TYPE
		OPTIMIZATION_CAPABILITIES: FP16
		RANGE_FOR_ASYNC_INFER_REQUESTS: 3, 6, 1
		SUPPORTED_METRICS: DEVICE_THERMAL, OPTIMIZATION_CAPABILITIES, RANGE_FOR_ASYNC_INFER_REQUESTS, SUPPORTED_METRICS, SUPPORTED_CONFIG_KEYS, FULL_DEVICE_NAME, AVAILABLE_DEVICES
		SUPPORTED_CONFIG_KEYS: DEVICE_ID, EXCLUSIVE_ASYNC_REQUESTS, LOG_LEVEL, VPU_MYRIAD_FORCE_RESET, VPU_MYRIAD_PLATFORM, VPU_CUSTOM_LAYERS, PERF_COUNT, VPU_PRINT_RECEIVE_TENSOR_TIME, CONFIG_FILE, VPU_HW_STAGES_OPTIMIZATION, MYRIAD_THROUGHPUT_STREAMS, MYRIAD_ENABLE_FORCE_RESET, MYRIAD_ENABLE_RECEIVING_TENSOR_TIME, MYRIAD_CUSTOM_LAYERS, MYRIAD_ENABLE_HW_ACCELERATION
		FULL_DEVICE_NAME: Intel Movidius Myriad X VPU
		AVAILABLE_DEVICES: 3.4.1-ma2480, 3.4.3-ma2480, 3.4.4-ma2480

	Default values for device configuration keys:
		DEVICE_ID: 
		EXCLUSIVE_ASYNC_REQUESTS: NO
		LOG_LEVEL: LOG_NONE
		VPU_MYRIAD_FORCE_RESET: NO
		VPU_MYRIAD_PLATFORM: 
		VPU_CUSTOM_LAYERS: 
		PERF_COUNT: NO
		VPU_PRINT_RECEIVE_TENSOR_TIME: NO
		CONFIG_FILE: 
		VPU_HW_STAGES_OPTIMIZATION: YES
		MYRIAD_THROUGHPUT_STREAMS: -1
		MYRIAD_ENABLE_FORCE_RESET: NO
		MYRIAD_ENABLE_RECEIVING_TENSOR_TIME: NO
		MYRIAD_CUSTOM_LAYERS: 
		MYRIAD_ENABLE_HW_ACCELERATION: YES

● 2 sticks

$ ./_hello_query_device.sh 
[hello_query_device.sh] 'hello_query_device' Run !!
Available devices:
[E:] [BSL] found 0 ioexpander device
	Device: CPU
	Metrics:
		AVAILABLE_DEVICES: 
		SUPPORTED_METRICS: AVAILABLE_DEVICES, SUPPORTED_METRICS, FULL_DEVICE_NAME, OPTIMIZATION_CAPABILITIES, SUPPORTED_CONFIG_KEYS, RANGE_FOR_ASYNC_INFER_REQUESTS, RANGE_FOR_STREAMS
		FULL_DEVICE_NAME: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
		OPTIMIZATION_CAPABILITIES: WINOGRAD, FP32, FP16, INT8, BIN
		SUPPORTED_CONFIG_KEYS: CPU_BIND_THREAD, CPU_THREADS_NUM, CPU_THROUGHPUT_STREAMS, DUMP_EXEC_GRAPH_AS_DOT, DYN_BATCH_ENABLED, DYN_BATCH_LIMIT, ENFORCE_BF16, EXCLUSIVE_ASYNC_REQUESTS, PERF_COUNT
		RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 1, 1
		RANGE_FOR_STREAMS: 1, 8

	Default values for device configuration keys:
		CPU_BIND_THREAD: YES
		CPU_THREADS_NUM: 0
		CPU_THROUGHPUT_STREAMS: 1
		DUMP_EXEC_GRAPH_AS_DOT: 
		DYN_BATCH_ENABLED: NO
		DYN_BATCH_LIMIT: 0
		ENFORCE_BF16: NO
		EXCLUSIVE_ASYNC_REQUESTS: NO
		PERF_COUNT: NO
	Device: GNA
	Metrics:
		GNA_LIBRARY_FULL_VERSION: 2.0.0.1047
		FULL_DEVICE_NAME: GNA_SW
		OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
		SUPPORTED_CONFIG_KEYS: EXCLUSIVE_ASYNC_REQUESTS, GNA_COMPACT_MODE, GNA_DEVICE_MODE, GNA_FIRMWARE_MODEL_IMAGE, GNA_FIRMWARE_MODEL_IMAGE_GENERATION, GNA_LIB_N_THREADS, GNA_PRECISION, GNA_PWL_UNIFORM_DESIGN, GNA_SCALE_FACTOR, GNA_SCALE_FACTOR_0, PERF_COUNT, SINGLE_THREAD
		SUPPORTED_METRICS: GNA_LIBRARY_FULL_VERSION, FULL_DEVICE_NAME, OPTIMAL_NUMBER_OF_INFER_REQUESTS, SUPPORTED_CONFIG_KEYS, SUPPORTED_METRICS, AVAILABLE_DEVICES
		AVAILABLE_DEVICES: GNA_SW

	Default values for device configuration keys:
		EXCLUSIVE_ASYNC_REQUESTS: NO
		GNA_COMPACT_MODE: NO
		GNA_DEVICE_MODE: GNA_SW_EXACT
		GNA_FIRMWARE_MODEL_IMAGE: 
		GNA_FIRMWARE_MODEL_IMAGE_GENERATION: 
		GNA_LIB_N_THREADS: 1
		GNA_PRECISION: I16
		GNA_PWL_UNIFORM_DESIGN: NO
		GNA_SCALE_FACTOR: 1.000000
		GNA_SCALE_FACTOR_0: 1.000000
		PERF_COUNT: NO
		SINGLE_THREAD: YES
	Device: GPU
	Metrics:
		AVAILABLE_DEVICES: 0
		SUPPORTED_METRICS: AVAILABLE_DEVICES, SUPPORTED_METRICS, FULL_DEVICE_NAME, OPTIMIZATION_CAPABILITIES, SUPPORTED_CONFIG_KEYS, RANGE_FOR_ASYNC_INFER_REQUESTS, RANGE_FOR_STREAMS
		FULL_DEVICE_NAME: Intel(R) Gen12LP HD Graphics (iGPU)
		OPTIMIZATION_CAPABILITIES: FP32, BIN, FP16, INT8
		SUPPORTED_CONFIG_KEYS: CACHE_DIR, CLDNN_ENABLE_FP16_FOR_QUANTIZED_MODELS, CLDNN_GRAPH_DUMPS_DIR, CLDNN_MEM_POOL, CLDNN_NV12_TWO_INPUTS, CLDNN_PLUGIN_PRIORITY, CLDNN_PLUGIN_THROTTLE, CLDNN_SOURCES_DUMPS_DIR, CONFIG_FILE, DEVICE_ID, DUMP_KERNELS, DYN_BATCH_ENABLED, EXCLUSIVE_ASYNC_REQUESTS, GPU_THROUGHPUT_STREAMS, PERF_COUNT, TUNING_FILE, TUNING_MODE
		RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 2, 1
		RANGE_FOR_STREAMS: 1, 2

	Default values for device configuration keys:
		CACHE_DIR: 
		CLDNN_ENABLE_FP16_FOR_QUANTIZED_MODELS: YES
		CLDNN_GRAPH_DUMPS_DIR: 
		CLDNN_MEM_POOL: YES
		CLDNN_NV12_TWO_INPUTS: NO
		CLDNN_PLUGIN_PRIORITY: 0
		CLDNN_PLUGIN_THROTTLE: 0
		CLDNN_SOURCES_DUMPS_DIR: 
		CONFIG_FILE: 
		DEVICE_ID: 
		DUMP_KERNELS: NO
		DYN_BATCH_ENABLED: NO
		EXCLUSIVE_ASYNC_REQUESTS: NO
		GPU_THROUGHPUT_STREAMS: 1
		PERF_COUNT: NO
		TUNING_FILE: 
		TUNING_MODE: TUNING_DISABLED
	Device: MYRIAD.3.4.1-ma2480
	Metrics:
		DEVICE_THERMAL: UNSUPPORTED TYPE
		OPTIMIZATION_CAPABILITIES: FP16
		RANGE_FOR_ASYNC_INFER_REQUESTS: 3, 6, 1
		SUPPORTED_METRICS: DEVICE_THERMAL, OPTIMIZATION_CAPABILITIES, RANGE_FOR_ASYNC_INFER_REQUESTS, SUPPORTED_METRICS, SUPPORTED_CONFIG_KEYS, FULL_DEVICE_NAME, AVAILABLE_DEVICES
		SUPPORTED_CONFIG_KEYS: DEVICE_ID, EXCLUSIVE_ASYNC_REQUESTS, LOG_LEVEL, VPU_MYRIAD_FORCE_RESET, VPU_MYRIAD_PLATFORM, VPU_CUSTOM_LAYERS, PERF_COUNT, VPU_PRINT_RECEIVE_TENSOR_TIME, CONFIG_FILE, VPU_HW_STAGES_OPTIMIZATION, MYRIAD_THROUGHPUT_STREAMS, MYRIAD_ENABLE_FORCE_RESET, MYRIAD_ENABLE_RECEIVING_TENSOR_TIME, MYRIAD_CUSTOM_LAYERS, MYRIAD_ENABLE_HW_ACCELERATION
		FULL_DEVICE_NAME: Intel Movidius Myriad X VPU
		AVAILABLE_DEVICES: 3.4.1-ma2480, 3.4.3-ma2480

	Default values for device configuration keys:
		DEVICE_ID: 
		EXCLUSIVE_ASYNC_REQUESTS: NO
		LOG_LEVEL: LOG_NONE
		VPU_MYRIAD_FORCE_RESET: NO
		VPU_MYRIAD_PLATFORM: 
		VPU_CUSTOM_LAYERS: 
		PERF_COUNT: NO
		VPU_PRINT_RECEIVE_TENSOR_TIME: NO
		CONFIG_FILE: 
		VPU_HW_STAGES_OPTIMIZATION: YES
		MYRIAD_THROUGHPUT_STREAMS: -1
		MYRIAD_ENABLE_FORCE_RESET: NO
		MYRIAD_ENABLE_RECEIVING_TENSOR_TIME: NO
		MYRIAD_CUSTOM_LAYERS: 
		MYRIAD_ENABLE_HW_ACCELERATION: YES
	Device: MYRIAD.3.4.3-ma2480
	Metrics:
		DEVICE_THERMAL: UNSUPPORTED TYPE
		OPTIMIZATION_CAPABILITIES: FP16
		RANGE_FOR_ASYNC_INFER_REQUESTS: 3, 6, 1
		SUPPORTED_METRICS: DEVICE_THERMAL, OPTIMIZATION_CAPABILITIES, RANGE_FOR_ASYNC_INFER_REQUESTS, SUPPORTED_METRICS, SUPPORTED_CONFIG_KEYS, FULL_DEVICE_NAME, AVAILABLE_DEVICES
		SUPPORTED_CONFIG_KEYS: DEVICE_ID, EXCLUSIVE_ASYNC_REQUESTS, LOG_LEVEL, VPU_MYRIAD_FORCE_RESET, VPU_MYRIAD_PLATFORM, VPU_CUSTOM_LAYERS, PERF_COUNT, VPU_PRINT_RECEIVE_TENSOR_TIME, CONFIG_FILE, VPU_HW_STAGES_OPTIMIZATION, MYRIAD_THROUGHPUT_STREAMS, MYRIAD_ENABLE_FORCE_RESET, MYRIAD_ENABLE_RECEIVING_TENSOR_TIME, MYRIAD_CUSTOM_LAYERS, MYRIAD_ENABLE_HW_ACCELERATION
		FULL_DEVICE_NAME: Intel Movidius Myriad X VPU
		AVAILABLE_DEVICES: 3.4.1-ma2480, 3.4.3-ma2480

	Default values for device configuration keys:
		DEVICE_ID: 
		EXCLUSIVE_ASYNC_REQUESTS: NO
		LOG_LEVEL: LOG_NONE
		VPU_MYRIAD_FORCE_RESET: NO
		VPU_MYRIAD_PLATFORM: 
		VPU_CUSTOM_LAYERS: 
		PERF_COUNT: NO
		VPU_PRINT_RECEIVE_TENSOR_TIME: NO
		CONFIG_FILE: 
		VPU_HW_STAGES_OPTIMIZATION: YES
		MYRIAD_THROUGHPUT_STREAMS: -1
		MYRIAD_ENABLE_FORCE_RESET: NO
		MYRIAD_ENABLE_RECEIVING_TENSOR_TIME: NO
		MYRIAD_CUSTOM_LAYERS: 
		MYRIAD_ENABLE_HW_ACCELERATION: YES
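
The device string passed to -d in the MULTI runs can be derived from the AVAILABLE_DEVICES line that hello_query_device prints for MYRIAD. The sketch below shows that string manipulation only; the function name `multi_myriad` is hypothetical, and it assumes the list is comma-separated exactly as in the output above.

```shell
#!/bin/sh
# Turn an AVAILABLE_DEVICES list such as "3.4.1-ma2480, 3.4.3-ma2480"
# into "MULTI:MYRIAD.3.4.1-ma2480,MYRIAD.3.4.3-ma2480".
multi_myriad() {
    echo "$1" | awk -F', *' '{
        out = "MULTI:"
        for (i = 1; i <= NF; i++) {
            out = out (i > 1 ? "," : "") "MYRIAD." $i
        }
        print out
    }'
}

multi_myriad "3.4.1-ma2480, 3.4.3-ma2480, 3.4.4-ma2480"
```

The result can be passed as the first argument of _benchmark_app.sh, as in the runs shown in this section.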

▼ benchmark_app output

● 3 sticks

$ ./_benchmark_app.sh MULTI:MYRIAD.3.4.1-ma2480,MYRIAD.3.4.3-ma2480,MYRIAD.3.4.4-ma2480 ~/model/public/FP16/squeezenet1.1.xml
[benchmark_app.sh] 'benchmark_app' Run !!
'command: ./benchmark_app -d MULTI:MYRIAD.3.4.1-ma2480,MYRIAD.3.4.3-ma2480,MYRIAD.3.4.4-ma2480 -m /home/mizutu/model/public/FP16/squeezenet1.1.xml -pc -niter 1000'
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading Inference Engine
[ INFO ] InferenceEngine: 
	API version ............ 2.1
	Build .................. 2021.3.0-2787-60059f2c755-releases/2021/3
	Description ....... API
[ INFO ] Device info: 
	MULTI
	MultiDevicePlugin version ......... 2.1
	Build ........... 2021.3.0-2787-60059f2c755-releases/2021/3
	MYRIAD
	myriadPlugin version ......... 2.1
	Build ........... 2021.3.0-2787-60059f2c755-releases/2021/3

[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for MYRIAD device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[ WARNING ] -nstreams default value is determined automatically for MYRIAD device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[ WARNING ] -nstreams default value is determined automatically for MYRIAD device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ INFO ] Loading network files
[ INFO ] Read network took 11.91 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 7218.07 ms
[Step 8/11] Setting optimal runtime parameters
[ WARNING ] Number of iterations was aligned by request number from 1000 to 1008 using number of requests 12
[Step 9/11] Creating infer requests and filling input blobs with images
[ INFO ] Network input 'data' precision U8, dimensions (NCHW): 1 3 227 227 
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Infer Request 0 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 1 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 2 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 3 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 4 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 5 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 6 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 7 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 8 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 9 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 10 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 11 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 12 inference requests, limits: 1008 iterations)
[ INFO ] First inference took 9.10 ms

[Step 11/11] Dumping statistics report
[ INFO ] Performance counts for 0-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 1-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 2-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 3-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 4-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 5-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 6-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 7-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 8-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 9-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 10-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 11-th infer request:
Total time: 0        microseconds

Full device name: 

Count:      1008 iterations
Duration:   1180.91 ms
Throughput: 853.58 FPS
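
The Step 8 warning in the log above reports that the iteration count was aligned from 1000 to 1008 using 12 requests, which is why Count shows 1008. That is consistent with rounding the count up to the next multiple of the request count. The sketch below reproduces the arithmetic as an assumption based on the log, not as a copy of the tool's code; the function name `align_iterations` is hypothetical.

```shell
#!/bin/sh
# Round an iteration count up to the next multiple of the infer-request count.
align_iterations() {
    niter=$1
    nireq=$2
    echo $(( (niter + nireq - 1) / nireq * nireq ))
}

align_iterations 1000 12   # 3 sticks: 12 requests
align_iterations 1000 8    # 2 sticks: 8 requests
```

With 8 requests, 1000 is already a multiple, which matches the two-stick log reporting no alignment and a Count of 1000.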

● 2 sticks

$ ./_benchmark_app.sh MULTI:MYRIAD.3.4.1-ma2480,MYRIAD.3.4.3-ma2480 ~/model/public/FP16/squeezenet1.1.xml
[benchmark_app.sh] 'benchmark_app' Run !!
'command: ./benchmark_app -d MULTI:MYRIAD.3.4.1-ma2480,MYRIAD.3.4.3-ma2480 -m /home/mizutu/model/public/FP16/squeezenet1.1.xml -pc -niter 1000'
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading Inference Engine
[ INFO ] InferenceEngine: 
	API version ............ 2.1
	Build .................. 2021.3.0-2787-60059f2c755-releases/2021/3
	Description ....... API
[ INFO ] Device info: 
	MULTI
	MultiDevicePlugin version ......... 2.1
	Build ........... 2021.3.0-2787-60059f2c755-releases/2021/3
	MYRIAD
	myriadPlugin version ......... 2.1
	Build ........... 2021.3.0-2787-60059f2c755-releases/2021/3

[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for MYRIAD device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[ WARNING ] -nstreams default value is determined automatically for MYRIAD device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ INFO ] Loading network files
[ INFO ] Read network took 16.48 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 4903.99 ms
[Step 8/11] Setting optimal runtime parameters
[Step 9/11] Creating infer requests and filling input blobs with images
[ INFO ] Network input 'data' precision U8, dimensions (NCHW): 1 3 227 227 
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Infer Request 0 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 1 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 2 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 3 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 4 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 5 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 6 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[ INFO ] Infer Request 7 filling
[ INFO ] Fill input 'data' with random values (image is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 8 inference requests, limits: 1000 iterations)
[ INFO ] First inference took 8.91 ms

[Step 11/11] Dumping statistics report
[ INFO ] Performance counts for 0-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 1-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 2-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 3-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 4-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 5-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 6-th infer request:
Total time: 0        microseconds

Full device name: 

[ INFO ] Performance counts for 7-th infer request:
Total time: 0        microseconds

Full device name: 

Count:      1000 iterations
Duration:   1755.24 ms
Throughput: 569.72 FPS
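
The three summary lines at the end of a run appear to be related by Throughput = Count / Duration. The short Python sketch below checks that against the figures printed above. The helper name `throughput_fps` is invented for this illustration and is not part of benchmark_app.

```python
def throughput_fps(count: int, duration_ms: float) -> float:
    """Frames per second from an iteration count and a duration in milliseconds."""
    return count / (duration_ms / 1000.0)


# Values copied from the benchmark_app summary above (two NCS2 devices via MULTI).
fps = throughput_fps(1000, 1755.24)
print(f"{fps:.2f} FPS")
```

The same relation can be used to sanity-check any other Count/Duration pair on this page.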

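Using VirtualBox

Tuning the VirtualBox Environment

● In the virtual machine's "Settings", select "System".
● On the "Motherboard" tab, set "Base Memory" to the maximum recommended value.
● On the "Processor" tab, set the number of processors to the maximum recommended value.
● On the "Display" tab, set "Video Memory" to the maximum recommended value.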

Item	Core™ i7-6700		Core™ i7-2620M		NCS2 (Core™ i7-6700)		NCS2 (Core™ i7-2620M)
Item	Before tuning	After tuning	Before tuning	After tuning	Before tuning	After tuning	Before tuning	After tuning
Duration (ms)	9760	2933	34791	24671	3657	3716	11061	10659
Latency (ms)	8.54	9.90	33.33	22.21	14.44	14.69	42.54	41.04
Throughput (fps)	102	341	28.7	40.5	273	269	90.41	93.8
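
To make the effect of tuning easier to read, the throughput row can be turned into speedup ratios. The Python sketch below does only that arithmetic on the numbers from the table above; the names `throughput` and `speedup` are illustrative.

```python
# Throughput (fps) before and after tuning, copied from the table above.
throughput = {
    "Core i7-6700": (102, 341),
    "Core i7-2620M": (28.7, 40.5),
    "NCS2 (Core i7-6700)": (273, 269),
    "NCS2 (Core i7-2620M)": (90.41, 93.8),
}


def speedup(before: float, after: float) -> float:
    """Ratio of tuned throughput to untuned throughput."""
    return after / before


for name, (before, after) in throughput.items():
    print(f"{name}: x{speedup(before, after):.2f}")
```

In these measurements the CPU runs gain noticeably from the extra memory and processors, while the NCS2 runs stay almost unchanged, presumably because inference is offloaded to the stick.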

▼ benchmark_app results for VirtualBox tuning

Change to the execution directory:

$ cd ~/inference_engine_samples_build/intel64/Release

● Core™ i7-6700

Before tuning
・main memory: 2048MB
・processor: 1

./benchmark_app -d CPU -m ~/model/public/FP32/squeezenet1.1.xml -pc -niter 1000
    :
Full device name: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz

Count:      1000 iterations
Duration:   9760.39 ms
Latency:    8.54 ms
Throughput: 102.45 FPS

After tuning
・main memory: 11288MB
・processor: 4

./benchmark_app -d CPU -m ~/model/public/FP32/squeezenet1.1.xml -pc -niter 1000
    :
Full device name: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz

Count:      1000 iterations
Duration:   2932.73 ms
Latency:    9.90 ms
Throughput: 340.98 FPS

● Core™ i7-2620M

Before tuning
・main memory: 2048MB
・processor: 1

./benchmark_app -d CPU -m ~/model/public/FP32/squeezenet1.1.xml -pc -niter 1000
    :
Full device name:        Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz

Count:      1000 iterations
Duration:   34790.98 ms
Latency:    33.33 ms
Throughput: 28.74 FPS

After tuning
・main memory: 11288MB
・processor: 4

./benchmark_app -d CPU -m ~/model/public/FP32/squeezenet1.1.xml -pc -niter 1000
    :
Full device name:        Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz

Count:      1000 iterations
Duration:   24671.35 ms
Latency:    22.21 ms
Throughput: 40.53 FPS

Running the Open Model Zoo Demos (Summary)

Change to the script directory:
```
$ cd ~/run_app/
```

Execution environments

Machine	Generation	CPU/GPU	OS
①	11th gen	GPU (FP16) Core™ i7-1185G7	Ubuntu 20.04
②	11th gen	GPU (FP32) Core™ i7-1185G7	Ubuntu 20.04
③	11th gen	CPU Core™ i7-1185G7	Ubuntu 20.04
④	10th gen	CPU Core™ i5-10210U	Ubuntu 20.04
⑤	11th gen	CPU Core™ i3-1115G4	Ubuntu 20.04 (VirtualBox)
⑥	6th gen	CPU Core™ i7-6700	Ubuntu 20.04 (VirtualBox)
⑦	-	VPU NCS2 (on Core™ i7-1185G7)	Ubuntu 20.04
⑧	2nd gen	CPU Core™ i7-2620M	Ubuntu 20.04 (VirtualBox)
⑨	1st gen	CPU Core™ i5-M520	Ubuntu 20.04 (VirtualBox)

3D Human Pose Estimation Python* Demo

Running the demo
・ GPU (FP16)

$ ./_human_pose_estimation_3d_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_human_pose_estimation_3d_demo_g32.sh

・ CPU

$ ./_human_pose_estimation_3d_demo.sh

・ NCS2

$ ./_human_pose_estimation_3d_demo_16.sh

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
fps	41	30.2	13.8	13.5	4.7	8.1	4.1	2.0	0.7

Action Recognition Python* Demo

Running the demo
・ GPU (FP16)

$ ./_action_recognition_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_action_recognition_g32.sh

・ CPU

$ ./_action_recognition.sh

・ NCS2

$ ./_action_recognition_16.sh

Speed comparison

Item		①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
Data total	fps	642	528	552	1152	365	532	435	573	586
Data total	ms	1.56	1.89	1.81	0.87	2.74	1.88	2.30	1.75	1.71
Encoder total	fps	197	186	184	235	42	118	125	13.9	14.7
Encoder total	ms	5.1	5.38	5.43	4.3	23.7	8.44	7.99	71.9	68.2
Decoder total	fps	1239	76.4	65.8	44	10.7	26.3	580	15.2	3.92
Decoder total	ms	0.81	13.1	15.2	23	93.1	38.0	1.72	65.8	255.3
Render total	fps	30.4	21.7	23.6	27	6.39	11.2	31.8	3.51	1.48
Render total	ms	33	46.1	42.4	37	156	90.0	31.4	285	676
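
In this table the fps and ms rows of each stage look like reciprocals of one another (fps ≈ 1000 / ms), up to rounding. The Python sketch below shows that conversion for the first row of the table; `fps_from_ms` is a name chosen only for this example.

```python
def fps_from_ms(ms_per_frame: float) -> float:
    """Convert a per-frame time in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


# Per-frame times (ms) of the first table row for machines ① and ②.
for ms in (1.56, 1.89):
    print(f"{ms} ms -> {fps_from_ms(ms):.0f} fps")
```

The results land within a frame or two of the tabulated 642 and 528 fps, which is consistent with the demo rounding the two figures independently.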

Object Detection Python* Demo

Running the demo
・ GPU (FP16)

$ ./_object_detection_python_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_object_detection_python_demo_g32.sh

・ CPU

$ ./_object_detection_python_demo.sh

・ NCS2

$ ./_object_detection_python_demo_16.sh

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
Latency(ms)	33.0	124	235	219.2	670	439	488	2236	5149
fps	10.5	7.9	4.2	4.5	1.2	2.2	2.1	0.4	0.2

Human Pose Estimation Python* Demo

Running the demo
・ GPU (FP16)

$ ./_human_pose_estimation_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_human_pose_estimation_g32.sh

・ CPU

$ ./_human_pose_estimation.sh

・ NCS2

$ ./_human_pose_estimation_16.sh

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
Latency (ms)	23.2	28.5	51.4	214.6	162.3	88.4	214.6	406	704.1
FPS	38.8	32.2	18.5	4.5	5.3	10.7	4.5	0.4	1.4

Gesture Recognition Python* Demo

Running the demo
・ GPU (FP16)

$ ./_gesture_recognition_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_gesture_recognition_demo_g32.sh

・ CPU

$ ./_gesture_recognition_demo.sh

・ NCS2

 -- Not supported --

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
fps	13.5	13.5	14.3	14.3	3.34	8.2	X	1.5	0.79

Handwritten Text Recognition Demo

Running the demo
・ GPU (FP16)

$ ./_handwritten_text_recognition_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_handwritten_text_recognition_demo_g32.sh

・ CPU

$ ./_handwritten_text_recognition_demo.sh

・ NCS2

$ ./_handwritten_text_recognition_demo_16.sh

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
Average throughput(ms)	52.7	91.2	373	277	1129	495	314	1928	4512

Text Detection C++ Demo

Running the demo
・ GPU (FP16)

$ ./_text_detection_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_text_detection_demo_g32.sh

・ CPU

$ ./_text_detection_demo.sh

・ NCS2

$ ./_text_detection_demo_16.sh

Speed comparison

Item		①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
detection model inference	ms	40.9	62	103	102	386	194	648	1000	1266
detection model inference	fps	24.5	16.1	9.68	9.8	2.59	9.13	1.54	1	0.79
detection model postprocessing	ms	73.7	90.7	56.9	40.4	70.2	109.5	86.6	198.9	199.4
detection model postprocessing	fps	13.6	11.0	17.6	24.7	14.3	9.1	11.5	5.0	5.0
recognition model inference	ms	54.3	59.0	8.14	7.1	25.2	14.14	76.6	59.6	120.5
recognition model inference	fps	18.4	16.9	122.9	140	40.0	70.7	13.1	16.8	8.30
recognition model postprocessing	ms	0.0091	0.0091	0.0094	0.007	0.0175	0.0205	0.018	0.0416	0.0193
recognition model postprocessing	fps	109375	108834	106020	136364	57143	488829	58852	24055	51871

Crossroad Camera C++ Demo

Running the demo
・ GPU (FP16)

$ ./_crossroad_camera_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_crossroad_camera_demo_g32.sh

・ CPU

$ ./_crossroad_camera_demo.sh

・ NCS2

$ ./_crossroad_camera_demo_16.sh

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
detection time(ms)	20	17.6	24.3	30	81.0	58.6	269.2	319.5	310.5
fps	74	56	40.1	32	12.34	17.6	3.71	3.13	3.22

Human Pose Estimation C++ Demo

Running the demo
・ GPU (FP16)

$ ./_human_pose_estimation_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_human_pose_estimation_g32.sh

・ CPU

$ ./_human_pose_estimation.sh

・ NCS2

$ ./_human_pose_estimation_16.sh

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
Latency (ms)	23.5	28.7	51.8	43.4	161.0	93.5	43.4	427.2	699.9
FPS	37.9	31.3	18.5	21.5	6.0	9.8	21.5	2.2	1.4

Object Detection C++ Demo

Running the demo
・ GPU (FP16)

$ ./_object_detection_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_object_detection_demo_g32.sh

・ CPU

$ ./_object_detection_demo.sh

・ NCS2

$ ./_object_detection_demo_16.sh

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
Latency(ms)	20.3	30.9	44.6	22	89.5	98.9	22	214.9	285.4
fps	214.7	151.7	104.5	73.0	21.0	52.6	73.0	8.6	6.9

Smart Classroom C++ Demo

Running the demo
・ GPU (FP16)

$ ./_smart_classroom_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_smart_classroom_demo_g32.sh

・ CPU

$ ./_smart_classroom_demo.sh

・ NCS2

$ ./_smart_classroom_demo_16.sh

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
Mean Fps	39.2	22.5	20.3	20	6.3	12.4	3.7	2.44	1.44
Frames processed	363	87	117	168	61	125	25	23	17

Pedestrian Tracker C++ Demo

Running the demo
・ GPU (FP16)

$ ./_pedestrian_tracker_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_pedestrian_tracker_demo_g32.sh

・ CPU

$ ./_pedestrian_tracker_demo.sh

・ NCS2

 -- Not supported --

Super Resolution C++ Demo

Running the demo
・ GPU (FP16)

$ ./_super_resolution_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_super_resolution_demo_g32.sh

・ CPU

$ ./_super_resolution_demo.sh

・ NCS2

$ ./_super_resolution_demo_16.sh

Single Human Pose Estimation Demo (top-down pipeline)

Running the demo
・ GPU (FP16)

$ ./_single_human_pose_estimation_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_single_human_pose_estimation_demo_g32.sh

・ CPU

$ ./_single_human_pose_estimation_demo.sh

・ NCS2

$ ./_single_human_pose_estimation_demo_16.sh

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
summary (fps)	4.5	3.2	1.5	0.9	0.5	0.5	0.3	0.2	0.1
estimation (fps)	18.1	10.0	4.7	5.6	1.5	2.7	1.5	0.6	0.4
detection (fps)	140.2	118	78.3	109.9	20.1	67.6	26.3	7.0	8.3

Interactive Face Detection C++ Demo

Running the demo
・ GPU (FP16)

$ ./_interactive_face_detection_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_interactive_face_detection_demo_g32.sh

・ CPU

$ ./_interactive_face_detection_demo.sh

・ NCS2

$ ./_interactive_face_detection_demo_16.sh

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
Number of processed frames	218	467	379	86	41	40	86	19	16
Total image throughput (fps)	19.2	16.5	17.6	17.4	4.48	9.64	17.4	2.16	1.64

Gaze Estimation Demo

Running the demo
・ GPU (FP16)

$ ./_gaze_estimation_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_gaze_estimation_demo_g32.sh

・ CPU

$ ./_gaze_estimation_demo.sh

・ NCS2

$ ./_gaze_estimation_demo_16.sh

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
Overall (fps)	40	32	62	110	23	43	110	12	10
Inference (fps)	46	41	75	202	35	55	202	14	11

Security Barrier Camera C++ Demo

Running the demo
・ GPU (FP16)

$ ./_security_barrier_camera_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_security_barrier_camera_demo_g32.sh

・ CPU

$ ./_security_barrier_camera_demo.sh

・ NCS2

$ ./_security_barrier_camera_demo_16.sh

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
fps	157.7	161.4	128.9	164.5	46.7	91.2	164.5	23.6	19.5

Image Inpainting Python Demo

Running the demo
・ GPU (FP16)

$ ./_image_inpainting_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_image_inpainting_demo_g32.sh

・ CPU

$ ./_image_inpainting_demo.sh

・ NCS2

$ ./_image_inpainting_demo_16.sh

Colorization Python Demo

Running the demo
・ GPU (FP16)

$ ./_colorization_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_colorization_demo_g32.sh

・ CPU

$ ./_colorization_demo.sh

・ NCS2

$ ./_colorization_demo_16.sh

Image Deblurring Python* Demo

Running the demo
・ GPU (FP16)

$ ./_deblurring_demo_g16.sh

・ GPU (FP32)

$ cd ~/run_app/
$ ./_deblurring_demo_g32.sh

・ CPU

$ ./_deblurring_demo.sh

・ NCS2

$ ./_deblurring_demo_16.sh

Speed comparison

Item	①GPU(16)	②GPU(32)	③CPU i7-11	④CPU i5-10	⑤CPU i3-11	⑥CPU i7-6	⑦NCS2	⑧CPU i7-2	⑨CPU i5-1
Latency(ms)	160.0	182.9	164.3	213.4	629.6	298.4	213.4	1175	1796
fps	5.9	5.1	5.8	4.5	1.6	3.2	4.5	0.8	0.5

Hello Query Device Python* Sample

Queries all available Inference Engine devices and prints their supported metrics and default configuration values.

Running the demo

$ cd ~/run_app/
$ ./_hello_query_device.sh
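
As a rough idea of what such a query involves, the Python sketch below lists each available device with its full name through the OpenVINO 2021.x `IECore` API (`available_devices` and `get_metric`). It is not the sample's own source: the helper `full_device_names` is invented here, and the sketch assumes every reported device supports the `FULL_DEVICE_NAME` metric.

```python
def full_device_names(core):
    """Map each device reported by an IECore-like object to its full name.

    Assumption: every device supports the FULL_DEVICE_NAME metric.
    """
    return {
        device: core.get_metric(device, "FULL_DEVICE_NAME")
        for device in core.available_devices
    }


def main():
    try:
        # Needs the OpenVINO 2021.x Python API on the path (setupvars.sh sourced).
        from openvino.inference_engine import IECore
    except ImportError:
        print("openvino.inference_engine is not available in this environment")
        return
    for device, name in full_device_names(IECore()).items():
        print(f"{device}: {name}")


if __name__ == "__main__":
    main()
```

Passing the core object in as a parameter keeps the lookup logic testable on a machine without OpenVINO installed.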

Test Environment

DELL Latitude 7520 (notebook with 11th-gen Core™ i7 CPU/GPU)
- CPU Intel® Core™ i7-1185G7 CPU
- Processor Graphics Intel® Iris® Xe Graphics
- 32GB Memory
- 1TB M.2 SSD
- OS Ubuntu20.04LTS
- OpenVINO™ Toolkit for Linux 2021.3

DELL Vostro 3500 (notebook with 11th-gen Core™ i3 CPU)
- CPU 11th Gen Intel® Core™ i3-1115G4 @ 3.00GHz
- 8GB Memory
- 1TB HDD
- OS Windows10 Pro
- Ubuntu20.04LTS on Oracle VM VirtualBox
- OpenVINO™ Toolkit for Linux 2021.3

Intel® NUC Kit BXNUC10I5FNH (mini PC with 10th-gen Core™ i5 CPU)
- CPU Intel® Core™ i5-10210U CPU
- 62GB Memory
- 1TB M.2 SSD
- OS Ubuntu20.04LTS
- OpenVINO™ Toolkit for Linux 2021.3

HP EliteDesk 800 G2 SFF (desktop with 6th-gen Core™ i7 CPU)
- CPU Intel® Core™ i7-6700 CPU @ 3.40GHz
- OS Windows10 Pro
- Ubuntu20.04LTS on Oracle VM VirtualBox
- OpenVINO™ Toolkit for Linux 2021.3

Panasonic CF-B10BWJYS (notebook with 2nd-gen Core™ i7 CPU)
- CPU Intel® Core™ i7-2620M vPro CPU @ 2.70GHz
- OS Windows10 Pro
- Ubuntu20.04LTS on Oracle VM VirtualBox
- OpenVINO™ Toolkit for Linux 2021.3

DELL Studio 1558 (notebook with 1st-gen Core™ i5 CPU)
- CPU Intel® Core™ i5-M520 @ 2.40GHz
- 4GB Memory
- OS Windows10 Pro
- Ubuntu20.04LTS on Oracle VM VirtualBox
- OpenVINO™ Toolkit for Linux 2021.3

Intel® Neural Compute Stick 2 (Intel® NCS2)
- Processor: Intel Movidius Myriad X Vision Processing Unit (VPU)

Change History

2021/06/14 First edition
2021/06/27 NCS2 parallel operation
2021/07/08 Added demo verification ⑨

References

Official site
