#author("2024-04-16T05:00:26+00:00","default:mizutu","mizutu")
#author("2024-05-01T06:06:49+00:00","default:mizutu","mizutu")
[[私的AI研究会]] > RevYOLOv5
*【復習】物体検出アルゴリズム「YOLO V5」 [#r795e317]
#ref(img01x.jpg,right,around,36%,img01x.jpg)
#ref(img06x.jpg,right,around,36%,img06x.jpg)
「PyTorch ではじめる AI開発」Chapter04 で使用する「YOLO V5」について復習する。~
以前の作成ページ [[物体検出アルゴリズム「YOLO V5」>YOLOv5]] を全面改定する~
#divregion( 目 次,open)
#contents
#enddivregion
#clear
RIGHT:&size(12){※ 最終更新:2024/04/16 };
RIGHT:&size(12){※ 最終更新:2024/05/01 };

** [[Official YOLOv5>+https://github.com/ultralytics/yolov5]] 考察1 推論/モデル変換編 [#rc6dc52e]
- 下記のプロジェクト・パッケージをダウンロード~
[[update_20240405.zip>https://izutsu.aa0.netvolante.jp/download/linux/update_20240405.zip]] (60.7MB) <アップデートファイル>~
解凍してできた「workspace_pylearn/」内のディレクトリは、&color(red){下記の「git clone」コマンド実行後に};作成されたディレクトリ「yolov5」「yolov5_demo」へそれぞれコピーする~

*** 物体検出とは [#d397fbfd]
- 物体検出とは画像の中から「犬」や「自転車」といった特定のオブジェクトを検出する技術。~
- 物体検知モデルは画像を入力として Bounding Boxという物体を囲む矩形とそれに対応するクラスラベルを出力する。~
- 物体検知は画像処理技術の中では基本的なタスクの一つで、物体追跡や姿勢検知など様々な応用タスクの土台となる技術である。~
- 近年では Yolo、Faster-RCNN、SSD、RetinaNet、CenterNet等様々な手法が提案されており、多くの研究者が高精度で高速な物体検知モデルを発表している。~
- 物体検知の精度としては既に実用に足る水準に達しつつあり、実際、画像処理技術を応用したソリューションが次々と発表されている。~
引用 → https://www.ariseanalytics.com/activities/report/20210521/ より~
参考 → [[画像認識 (Image Recognition) とは>+https://izutsu.aa0.netvolante.jp/pukiwiki/?PyTorch5#vf0920b8]]~
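参考までに、物体検出モデルの出力(矩形・クラスラベル・信頼度の組)を Python のデータ構造で表すと、おおよそ次のようなイメージになる(キー名や数値は説明用の仮のもの)。~
#codeprettify(){{
# 物体検出モデルの出力イメージ(説明用のスケッチ。キー名・数値は仮のもの)
# 1 枚の画像に対して「矩形 + クラスラベル + 信頼度」のリストが返る
detections = [
    {"bbox": (48, 240, 195, 372),   # (xmin, ymin, xmax, ymax) ピクセル座標
     "label": "dog",                # クラスラベル
     "score": 0.91},                # 信頼度 (0.0 - 1.0)
    {"bbox": (120, 130, 560, 420),
     "label": "bicycle",
     "score": 0.84},
]
for det in detections:
    print(det["label"], det["score"], det["bbox"])
}}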

*** YOLO について [#wbad4914]
 ''『YOLO』とは "You only live once"「人生一度きり」を引用した "You Only Look Once"「見るのは一度きり」が名の由来。''~
- リアルタイム画像認識を行うアルゴリズムで Darknet というフレームワークを使用して実装している。~
- FCN というネットワークを使用しているが、これは darknet 以外の機械学習フレームワークでも実現可能であり、すでに Yolo の Tensorflow版 や PyTorch版 などが実装されている。~
-「人間のように一目見ただけで物体検出ができることが強み」だそう。~

- モデルが学習しているラベルファイル(プロジェクト・パッケージ「update_20240405.zip」に同梱)~
-- 80 クラスの [[COCO* データセット>https://cocodataset.org/#home]] で学習されている~

-- サイトから coco.names をダウンロードする~
→ https://raw.githubusercontent.com/pjreddie/darknet/master/data/coco.names~
-- coco.names をテキストエディタで開き、最終行のスペースだけになっている 81行目を削除して上書き保存(下の一覧表の後にスクリプトによる処理例を示す)~
-- coco.names を翻訳して coco.names_jp を作成~

-- ラベル・インデックス一覧~
|RIGHT:|LEFT:|LEFT:|RIGHT:|LEFT:|LEFT:|c
|CENTER:ID|CENTER:coco.names|CENTER:coco.names_jp|CENTER:ID|CENTER:coco.names|CENTER:coco.names_jp|h
|0|person|人|40|wine glass|ワイングラス|
|1|bicycle|自転車|41|cup|カップ|
|2|car|車|42|fork|フォーク|
|3|motorbike|バイク|43|knife|ナイフ|
|4|aeroplane|飛行機|44|spoon|スプーン|
|5|bus|バス|45|bowl|丼鉢|
|6|train|列車|46|banana|バナナ|
|7|truck|トラック|47|apple|リンゴ|
|8|boat|ボート|48|sandwich|サンドイッチ|
|9|traffic light|信号機|49|orange|オレンジ|
|10|fire hydrant|消火栓|50|broccoli|ブロッコリー|
|11|stop sign|一時停止標識|51|carrot|人参|
|12|parking meter|パーキングメーター|52|hot dog|ホットドッグ|
|13|bench|ベンチ|53|pizza|ピザ|
|14|bird|鳥|54|donut|ドーナッツ|
|15|cat|猫|55|cake|ケーキ|
|16|dog|犬|56|chair|椅子|
|17|horse|馬|57|sofa|ソファー|
|18|sheep|羊|58|pottedplant|鉢植え|
|19|cow|牛|59|bed|ベッド|
|20|elephant|象|60|diningtable|ダイニングテーブル|
|21|bear|熊|61|toilet|トイレ|
|22|zebra|シマウマ|62|tvmonitor|テレビ|
|23|giraffe|キリン|63|laptop|ラップトップコンピューター|
|24|backpack|バックパック|64|mouse|マウス|
|25|umbrella|傘|65|remote|リモコン|
|26|handbag|ハンドバック|66|keyboard|キーボード|
|27|tie|ネクタイ|67|cell phone|携帯電話|
|28|suitcase|スーツケース|68|microwave|電子レンジ|
|29|frisbee|フリスビー|69|oven|オーブン|
|30|skis|スキー板|70|toaster|トースター|
|31|snowboard|スノーボード|71|sink|キッチン・シンク|
|32|sports ball|スポーツボール|72|refrigerator|冷蔵庫|
|33|kite|凧|73|book|本|
|34|baseball bat|野球のバット|74|clock|時計|
|35|baseball glove|野球のグローブ|75|vase|花瓶|
|36|skateboard|スケートボード|76|scissors|ハサミ|
|37|surfboard|サーフボード|77|teddy bear|テディベア|
|38|tennis racket|テニスラケット|78|hair drier|ヘアドライヤー|
|39|bottle|瓶|79|toothbrush|歯ブラシ|
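-- 上記の coco.names のダウンロードと 81行目(末尾の空行)の削除は、たとえば次のようなスクリプトでもまとめて行える(参考スケッチ。保存先はカレントディレクトリを想定)~
#codeprettify(){{
# coco.names をダウンロードし、末尾の空行を取り除いて保存する参考スケッチ
import urllib.request

URL = "https://raw.githubusercontent.com/pjreddie/darknet/master/data/coco.names"

with urllib.request.urlopen(URL) as res:
    text = res.read().decode("utf-8")

# 空行(スペースのみの行を含む)を取り除いて 80 クラス分にそろえる
lines = [line for line in text.splitlines() if line.strip()]
print(len(lines), "classes")            # 80 と表示されるはず

with open("coco.names", "w", encoding="utf-8") as f:
    f.write("\n".join(lines) + "\n")
}}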

*** YOLOv5 をローカルマシンにインストール [#n23094f3]
+ 仮想環境「py_learn」をアクティブにする~
#codeprettify(){{
(base) conda activate py_learn
}}
+ プロジェクトの実行ディレクトリに切り替える~
~
&color(white,black){'' Windows の場合 ''};
#codeprettify(){{
(py_learn) PS > cd /anaconda_win/workspace_pylearn
}}
&color(white,black){'' Linux の場合 ''};
#codeprettify(){{
(py_learn) $ cd ~/workspace_pylearn
}}
+ 次のコマンドで サイト https://github.com/ultralytics/yolov5 から「YOLOv5」をインストール~
#codeprettify(){{
(py_learn) git clone https://github.com/ultralytics/yolov5
}}
・パッケージ構成ファイル「requirements.txt」は使わず現在の環境で不足パッケージのみインストールする~
・プロジェクトのディレクトリ~
#codeprettify(){{
c:\anaconda_win\workspace_pylearn\     ← Windows の場合
~/workspace_pylearn/                   ← Linux   の場合
  ├ chapter01
  ├ chapter02 
  ├ forest-path-movie-dataset
  ├ sample
  │    :
  └ yolov5
}}
+ 冒頭の [[update_20240405.zip>https://izutsu.aa0.netvolante.jp/download/linux/update_20240405.zip]] を解凍してできた「workspace_pylearn/yolov5」を「git clone」でできた「yolov5」にコピーする~

*** YOLOv5 推論プログラムの実行 [#a1321a14]
- プロジェクトの実行ディレクトリ「workspace_pylearn/yolov5/」~
+ カメラ画像を推論する~
#codeprettify(){{
(py_learn) cd yolov5
(py_learn) python detect.py --source 0
}}
・実行結果~
#codeprettify(){{
(py_learn) python detect.py --source 0
Traceback (most recent call last):
  File "C:\anaconda_win\workspace_pylearn\yolov5\detect.py", line 46, in <module>
    from ultralytics.utils.plotting import Annotator, colors, save_one_box
ModuleNotFoundError: No module named 'ultralytics'
}}
+ パッケージ「ultralytics」が無いようなのでインストール~
#codeprettify(){{
(py_learn) pip install ultralytics
}}
+ もう一度 カメラ画像で実行~
・終了はターミナル画面で 'Ctrl' + 'c' を押す~
#codeprettify(){{
(py_learn) python detect.py --source 0
}}
・実行結果~
#ref(camtest01.gif,right,around,60%,camtest01.gif)
#codeprettify(){{
(py_learn) python detect.py --source 0
detect: weights=yolov5s.pt, source=0, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5  v7.0-294-gdb125a20 Python-3.11.7 torch-2.2.0+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Ti, 12282MiB)

Fusing layers...
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
1/1: 0...  Success (inf frames 640x480 at 30.00 FPS)

0: 480x640 1 person, 1 chair, 198.1ms
0: 480x640 1 person, 8.0ms
0: 480x640 1 person, 1 chair, 5.0ms
0: 480x640 1 person, 1 chair, 4.0ms
    :
    :
0: 480x640 1 person, 2 chairs, 6.0ms
0: 480x640 1 person, 1 chair, 16.0ms
Traceback (most recent call last):
    :
    :
KeyboardInterrupt
}}

*** 実行プログラムの修正「detect.py」→「detect2.py」 [#gb64b0df]
+ 入力ソースをカメラ('0') に指定したとき、終了する手段がないので正常に実行結果を保存できない。~
'Esc'キー入力で終了できるように変更する(中断処理だけを取り出した最小例を修正済みソースコードの後に示す)。~
(修正済みプログラムを プロジェクト・パッケージ「update_20240405.zip」に同梱)~
#codeprettify(){{
## Official YOLOv5 https://github.com/ultralytics/yolov5
##
## detect2.py        (original: detect.py)
##  ver 0.01    2024.03.12      'Esc' key Break
        :
}}
#codeprettify(){{
        :
    # Run inference
    model.warmup(imgsz=(1 if pt or model.triton else bs, 3, *imgsz))                    # warmup
    seen, windows, dt = 0, [], (Profile(device=device), Profile(device=device), Profile(device=device))

    break_flag = False                                      # 'Esc' key Break     2024/03/12

    for path, im, im0s, vid_cap, s in dataset:

        if break_flag:                                      # 'Esc' key Break     2024/03/12
            break

        with dt[0]:
        :
}}
#codeprettify(){{
        :
            # Stream results
            im0 = annotator.result()
            if view_img:
#                if platform.system() == "Linux" and p not in windows:
#                    windows.append(p)
#                    cv2.namedWindow(str(p), cv2.WINDOW_NORMAL | cv2.WINDOW_KEEPRATIO)  # allow window resize (Linux)
#                    cv2.resizeWindow(str(p), im0.shape[1], im0.shape[0])
                cv2.namedWindow(str(p), flags=cv2.WINDOW_AUTOSIZE | cv2.WINDOW_GUI_NORMAL) # 2024/03/12
                cv2.imshow(str(p), im0)
#                cv2.waitKey(1)                             # 1 millisecond

                ## 'Esc' key Break    2023/06/18
                c = cv2.waitKey(1)                          # 1 millisecond
                if c == 27: 
                    break_flag = True
                    break 

            # Save results (image with detections)
        :
}}
+ 推論実行結果の実行中ログをターミナルに出力しないようにする~
#codeprettify(){{
        # Print time (inference-only)
        # 途中表示なし 2024/03/12
#        LOGGER.info(f"{s}{'' if len(det) else '(no detections), '}{dt[1].dt * 1E3:.1f}ms")
}}
+ 修正済みソースコード~
#divregion(「detect2.py」)
#codeprettify(){{
# -*- coding: utf-8 -*-
##------------------------------------------
##  Object detection YOLO V5      Ver 0.01
##    Inference program
##
##               2024.03.12 Masahiro Izutsu
##------------------------------------------
## Official YOLOv5 https://github.com/ultralytics/yolov5
##
## detect2.py        (original: detect.py)
##  ver 0.01    2024.03.12      'Esc' key Break

# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license
"""
Run YOLOv5 detection inference on images, videos, directories, globs, YouTube, webcam, streams, etc.

Usage - sources:
    $ python detect.py --weights yolov5s.pt --source 0                               # webcam
                                                     img.jpg                         # image
                                                     vid.mp4                         # video
                                                     screen                          # screenshot
                                                     path/                           # directory
                                                     list.txt                        # list of images
                                                     list.streams                    # list of streams
                                                     'path/*.jpg'                    # glob
                                                     'https://youtu.be/LNwODJXcvt4'  # YouTube
                                                     'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream

Usage - formats:
    $ python detect.py --weights yolov5s.pt                 # PyTorch
                                 yolov5s.torchscript        # TorchScript
                                 yolov5s.onnx               # ONNX Runtime or OpenCV DNN with --dnn
                                 yolov5s_openvino_model     # OpenVINO
                                 yolov5s.engine             # TensorRT
                                 yolov5s.mlmodel            # CoreML (macOS-only)
                                 yolov5s_saved_model        # TensorFlow SavedModel
                                 yolov5s.pb                 # TensorFlow GraphDef
                                 yolov5s.tflite             # TensorFlow Lite
                                 yolov5s_edgetpu.tflite     # TensorFlow Edge TPU
                                 yolov5s_paddle_model       # PaddlePaddle
"""

import argparse
import csv
import os
import platform
import sys
from pathlib import Path

import torch

FILE = Path(__file__).resolve()
ROOT = FILE.parents[0]                                      # YOLOv5 root directory
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))                              # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd()))              # relative

from ultralytics.utils.plotting import Annotator, colors, save_one_box

from models.common import DetectMultiBackend
from utils.dataloaders import IMG_FORMATS, VID_FORMATS, LoadImages, LoadScreenshots, LoadStreams
from utils.general import (
    LOGGER,
    Profile,
    check_file,
    check_img_size,
    check_imshow,
    check_requirements,
    colorstr,
    cv2,
    increment_path,
    non_max_suppression,
    print_args,
    scale_boxes,
    strip_optimizer,
    xyxy2xywh,
)
from utils.torch_utils import select_device, smart_inference_mode


@smart_inference_mode()
def run(
    weights=ROOT / "yolov5s.pt",                            # model path or triton URL
    source=ROOT / "data/images",                            # file/dir/URL/glob/screen/0(webcam)
    data=ROOT / "data/coco128.yaml",                        # dataset.yaml path
    imgsz=(640, 640),                                       # inference size (height, width)
    conf_thres=0.25,                                        # confidence threshold
    iou_thres=0.45,                                         # NMS IOU threshold
    max_det=1000,                                           # maximum detections per image
    device="",                                              # cuda device, i.e. 0 or 0,1,2,3 or cpu
    view_img=False,                                         # show results
    save_txt=False,                                         # save results to *.txt
    save_csv=False,                                         # save results in CSV format
    save_conf=False,                                        # save confidences in --save-txt labels
    save_crop=False,                                        # save cropped prediction boxes
    nosave=False,                                           # do not save images/videos
    classes=None,                                           # filter by class: --class 0, or --class 0 2 3
    agnostic_nms=False,                                     # class-agnostic NMS
    augment=False,                                          # augmented inference
    visualize=False,                                        # visualize features
    update=False,                                           # update all models
    project=ROOT / "runs/detect",                           # save results to project/name
    name="exp",                                             # save results to project/name
    exist_ok=False,                                         # existing project/name ok, do not increment
    line_thickness=3,                                       # bounding box thickness (pixels)
    hide_labels=False,                                      # hide labels
    hide_conf=False,                                        # hide confidences
    half=False,                                             # use FP16 half-precision inference
    dnn=False,                                              # use OpenCV DNN for ONNX inference
    vid_stride=1,                                           # video frame-rate stride
):
    source = str(source)
    save_img = not nosave and not source.endswith(".txt")   # save inference images
    is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
    is_url = source.lower().startswith(("rtsp://", "rtmp://", "http://", "https://"))
    webcam = source.isnumeric() or source.endswith(".streams") or (is_url and not is_file)
    screenshot = source.lower().startswith("screen")
    if is_url and is_file:
        source = check_file(source)                         # download

    # Directories
    save_dir = increment_path(Path(project) / name, exist_ok=exist_ok)                  # increment run
    (save_dir / "labels" if save_txt else save_dir).mkdir(parents=True, exist_ok=True)  # make dir

    # Load model
    device = select_device(device)
    model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data, fp16=half)
    stride, names, pt = model.stride, model.names, model.pt
    imgsz = check_img_size(imgsz, s=stride)                 # check image size

    # Dataloader
    bs = 1  # batch_size
    if webcam:
        view_img = check_imshow(warn=True)
        dataset = LoadStreams(source, img_size=imgsz, stride=stride, auto=pt, vid_stride=vid_stride)
        bs = len(dataset)
    elif screenshot:
        dataset = LoadScreenshots(source, img_size=imgsz, stride=stride, auto=pt)
    else:
        dataset = LoadImages(source, img_size=imgsz, stride=stride, auto=pt, vid_stride=vid_stride)
    vid_path, vid_writer = [None] * bs, [None] * bs

    # Run inference
    model.warmup(imgsz=(1 if pt or model.triton else bs, 3, *imgsz))                    # warmup
    seen, windows, dt = 0, [], (Profile(device=device), Profile(device=device), Profile(device=device))

    break_flag = False                                      # 'Esc' key Break     2024/03/12

    for path, im, im0s, vid_cap, s in dataset:

        if break_flag:                                      # 'Esc' key Break     2024/03/12
            break

        with dt[0]:
            im = torch.from_numpy(im).to(model.device)
            im = im.half() if model.fp16 else im.float()    # uint8 to fp16/32
            im /= 255  # 0 - 255 to 0.0 - 1.0
            if len(im.shape) == 3:
                im = im[None]  # expand for batch dim
            if model.xml and im.shape[0] > 1:
                ims = torch.chunk(im, im.shape[0], 0)

        # Inference
        with dt[1]:
            visualize = increment_path(save_dir / Path(path).stem, mkdir=True) if visualize else False
            if model.xml and im.shape[0] > 1:
                pred = None
                for image in ims:
                    if pred is None:
                        pred = model(image, augment=augment, visualize=visualize).unsqueeze(0)
                    else:
                        pred = torch.cat((pred, model(image, augment=augment, visualize=visualize).unsqueeze(0)), dim=0)
                pred = [pred, None]
            else:
                pred = model(im, augment=augment, visualize=visualize)
        # NMS
        with dt[2]:
            pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det)

        # Second-stage classifier (optional)
        # pred = utils.general.apply_classifier(pred, classifier_model, im, im0s)

        # Define the path for the CSV file
        csv_path = save_dir / "predictions.csv"

        # Create or append to the CSV file
        def write_to_csv(image_name, prediction, confidence):
            """Writes prediction data for an image to a CSV file, appending if the file exists."""
            data = {"Image Name": image_name, "Prediction": prediction, "Confidence": confidence}
            with open(csv_path, mode="a", newline="") as f:
                writer = csv.DictWriter(f, fieldnames=data.keys())
                if not csv_path.is_file():
                    writer.writeheader()
                writer.writerow(data)

        # Process predictions
        for i, det in enumerate(pred):                      # per image
            seen += 1
            if webcam:                                      # batch_size >= 1
                p, im0, frame = path[i], im0s[i].copy(), dataset.count
                s += f"{i}: "
            else:
                p, im0, frame = path, im0s.copy(), getattr(dataset, "frame", 0)

            p = Path(p)                                     # to Path
            save_path = str(save_dir / p.name)              # im.jpg
            txt_path = str(save_dir / "labels" / p.stem) + ("" if dataset.mode == "image" else f"_{frame}")  # im.txt
            s += "%gx%g " % im.shape[2:]                    # print string
            gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]      # normalization gain whwh
            imc = im0.copy() if save_crop else im0          # for save_crop
            annotator = Annotator(im0, line_width=line_thickness, example=str(names))
            if len(det):
                # Rescale boxes from img_size to im0 size
                det[:, :4] = scale_boxes(im.shape[2:], det[:, :4], im0.shape).round()

                # Print results
                for c in det[:, 5].unique():
                    n = (det[:, 5] == c).sum()              # detections per class
                    s += f"{n} {names[int(c)]}{'s' * (n > 1)}, "  # add to string

                # Write results
                for *xyxy, conf, cls in reversed(det):
                    c = int(cls)                            # integer class
                    label = names[c] if hide_conf else f"{names[c]}"
                    confidence = float(conf)
                    confidence_str = f"{confidence:.2f}"

                    if save_csv:
                        write_to_csv(p.name, label, confidence_str)

                    if save_txt:                            # Write to file
                        xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
                        line = (cls, *xywh, conf) if save_conf else (cls, *xywh)  # label format
                        with open(f"{txt_path}.txt", "a") as f:
                            f.write(("%g " * len(line)).rstrip() % line + "\n")

                    if save_img or save_crop or view_img:   # Add bbox to image
                        c = int(cls)                        # integer class
                        label = None if hide_labels else (names[c] if hide_conf else f"{names[c]} {conf:.2f}")
                        annotator.box_label(xyxy, label, color=colors(c, True))
                    if save_crop:
                        save_one_box(xyxy, imc, file=save_dir / "crops" / names[c] / f"{p.stem}.jpg", BGR=True)

            # Stream results
            im0 = annotator.result()
            if view_img:
#                if platform.system() == "Linux" and p not in windows:
#                    windows.append(p)
#                    cv2.namedWindow(str(p), cv2.WINDOW_NORMAL | cv2.WINDOW_KEEPRATIO)  # allow window resize (Linux)
#                    cv2.resizeWindow(str(p), im0.shape[1], im0.shape[0])
                cv2.namedWindow(str(p), flags=cv2.WINDOW_AUTOSIZE | cv2.WINDOW_GUI_NORMAL) # 2024/03/12
                cv2.imshow(str(p), im0)
#                cv2.waitKey(1)                             # 1 millisecond

                ## 'Esc' key Break    2023/06/18
                c = cv2.waitKey(1)                          # 1 millisecond
                if c == 27: 
                    break_flag = True
                    break 


            # Save results (image with detections)
            if save_img:
                if dataset.mode == "image":
                    cv2.imwrite(save_path, im0)
                else:                                       # 'video' or 'stream'
                    if vid_path[i] != save_path:            # new video
                        vid_path[i] = save_path
                        if isinstance(vid_writer[i], cv2.VideoWriter):
                            vid_writer[i].release()         # release previous video writer
                        if vid_cap:                         # video
                            fps = vid_cap.get(cv2.CAP_PROP_FPS)
                            w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
                            h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
                        else:                               # stream
                            fps, w, h = 30, im0.shape[1], im0.shape[0]
                        save_path = str(Path(save_path).with_suffix(".mp4"))  # force *.mp4 suffix on results videos
                        vid_writer[i] = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
                    vid_writer[i].write(im0)

        # Print time (inference-only)
        # 途中表示なし 2024/03/12
#        LOGGER.info(f"{s}{'' if len(det) else '(no detections), '}{dt[1].dt * 1E3:.1f}ms")

    # Print results
    t = tuple(x.t / seen * 1e3 for x in dt)                 # speeds per image
    LOGGER.info(f"Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {(1, 3, *imgsz)}" % t)
    if save_txt or save_img:
        s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ""
        LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
    if update:
        strip_optimizer(weights[0])                         # update model (to fix SourceChangeWarning)


def parse_opt():
    """Parses command-line arguments for YOLOv5 detection, setting inference options and model configurations."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--weights", nargs="+", type=str, default=ROOT / "yolov5s.pt", help="model path or triton URL")
    parser.add_argument("--source", type=str, default=ROOT / "data/images", help="file/dir/URL/glob/screen/0(webcam)")
    parser.add_argument("--data", type=str, default=ROOT / "data/coco128.yaml", help="(optional) dataset.yaml path")
    parser.add_argument("--imgsz", "--img", "--img-size", nargs="+", type=int, default=[640], help="inference size h,w")
    parser.add_argument("--conf-thres", type=float, default=0.25, help="confidence threshold")
    parser.add_argument("--iou-thres", type=float, default=0.45, help="NMS IoU threshold")
    parser.add_argument("--max-det", type=int, default=1000, help="maximum detections per image")
    parser.add_argument("--device", default="", help="cuda device, i.e. 0 or 0,1,2,3 or cpu")
    parser.add_argument("--view-img", action="store_true", help="show results")
    parser.add_argument("--save-txt", action="store_true", help="save results to *.txt")
    parser.add_argument("--save-csv", action="store_true", help="save results in CSV format")
    parser.add_argument("--save-conf", action="store_true", help="save confidences in --save-txt labels")
    parser.add_argument("--save-crop", action="store_true", help="save cropped prediction boxes")
    parser.add_argument("--nosave", action="store_true", help="do not save images/videos")
    parser.add_argument("--classes", nargs="+", type=int, help="filter by class: --classes 0, or --classes 0 2 3")
    parser.add_argument("--agnostic-nms", action="store_true", help="class-agnostic NMS")
    parser.add_argument("--augment", action="store_true", help="augmented inference")
    parser.add_argument("--visualize", action="store_true", help="visualize features")
    parser.add_argument("--update", action="store_true", help="update all models")
    parser.add_argument("--project", default=ROOT / "runs/detect", help="save results to project/name")
    parser.add_argument("--name", default="exp", help="save results to project/name")
    parser.add_argument("--exist-ok", action="store_true", help="existing project/name ok, do not increment")
    parser.add_argument("--line-thickness", default=3, type=int, help="bounding box thickness (pixels)")
    parser.add_argument("--hide-labels", default=False, action="store_true", help="hide labels")
    parser.add_argument("--hide-conf", default=False, action="store_true", help="hide confidences")
    parser.add_argument("--half", action="store_true", help="use FP16 half-precision inference")
    parser.add_argument("--dnn", action="store_true", help="use OpenCV DNN for ONNX inference")
    parser.add_argument("--vid-stride", type=int, default=1, help="video frame-rate stride")
    opt = parser.parse_args()
    opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1  # expand
    print_args(vars(opt))
    return opt


def main(opt):
    """Executes YOLOv5 model inference with given options, checking requirements before running the model."""
    check_requirements(ROOT / "requirements.txt", exclude=("tensorboard", "thop"))
    run(**vars(opt))


if __name__ == "__main__":
    opt = parse_opt()
    main(opt)
}}
#enddivregion
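- 参考:上記の修正の要点である『Esc』キーによる中断処理だけを取り出すと、最小構成では次のようなループになる(カメラ番号 0 を仮定した参考スケッチ)。~
#codeprettify(){{
# OpenCV の表示ループを 'Esc' キー (キーコード 27) で抜ける最小例
import cv2

cap = cv2.VideoCapture(0)                   # カメラ番号 0 を仮定
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow("camera", frame)
    if cv2.waitKey(1) == 27:                # 'Esc' キーが押されたらループを抜ける
        break
cap.release()
cv2.destroyAllWindows()
}}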

#ref(YOLOv5/yolov5_bus_result.jpg,right,around,12%,yolov5_bus_result.jpg)
#ref(YOLOv5/yolov5_openvino3_m.jpg,right,around,12%,yolov5_openvino_m3.jpg)
*** 推論プログラム「detect2.py」の実行 [#dac40883]
- 実行ディレクトリは「workspace_pylearn/yolov5/」~
- 実行結果は「yolov5/runs/detect/exp(2・3・4 …)」ディレクトリに保存~
「exp*」ディレクトリは実行のたびに番号が増えて新しく作成される~

- カメラ画像入力('Esc'キー入力で終了)~
#codeprettify(){{
(py_learn) python detect2.py --source 0
}}
・実行結果~
#codeprettify(){{
(py_learn) python detect2.py --source 0
detect2: weights=yolov5s.pt, source=0, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5  v7.0-294-gdb125a20 Python-3.11.7 torch-2.2.0+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Ti, 12282MiB)

Fusing layers...
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
1/1: 0...  Success (inf frames 640x480 at 30.00 FPS)

Speed: 0.4ms pre-process, 8.9ms inference, 3.6ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs\detect\exp10
}}
- 静止画サンプル画像入力~
#codeprettify(){{
(py_learn) python detect2.py
}}
・実行結果~
#codeprettify(){{
(py_learn) python detect2.py
detect: weights=yolov5s.pt, source=data\images, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5  v7.0-294-gdb125a20 Python-3.11.7 torch-2.2.0+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Ti, 12282MiB)

Fusing layers...
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
image 1/2 C:\anaconda_win\workspace_pylearn\yolov5\data\images\bus.jpg: 640x480 4 persons, 1 bus, 48.9ms
image 2/2 C:\anaconda_win\workspace_pylearn\yolov5\data\images\zidane.jpg: 384x640 2 persons, 2 ties, 52.8ms
Speed: 0.0ms pre-process, 50.8ms inference, 74.6ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs\detect\exp3
}}

-「detect2.py」実行時のコマンドパラメータ1~
・--source <入力ソース名>~
|LEFT:220|LEFT:220|c
|CENTER:入力ソース名|CENTER:種類|h
|BGCOLOR(lightyellow):0|BGCOLOR(lightyellow):webcam(0,1,...)|
|BGCOLOR(lightyellow):img.jpg|BGCOLOR(lightyellow):image|
|BGCOLOR(lightyellow):vid.mp4|BGCOLOR(lightyellow):video|
|screen|screenshot|
|path/|directory|
|list.txt|list of images|
|list.streams|list of streams|
|'path/*.jpg'|glob|
|'https://youtu.be/LNwODJXcvt4'|YouTube|
|'rtsp://example.com/media.mp4'|RTSP, RTMP, HTTP stream|
・--weights <学習モデル名>~
|LEFT:220|LEFT:220|c
|CENTER:学習モデル名|CENTER:種類|h
|BGCOLOR(lightyellow):yolov5s.pt|BGCOLOR(lightyellow):PyTorch|
|yolov5s.torchscript|TorchScript|
|BGCOLOR(lightyellow):yolov5s.onnx|BGCOLOR(lightyellow):ONNX Runtime or OpenCV DNN with --dnn|
|BGCOLOR(lightyellow):yolov5s_openvino_model|BGCOLOR(lightyellow):OpenVINO|
|yolov5s.engine|TensorRT|
|yolov5s.mlmodel|CoreML (macOS-only)|
|yolov5s_saved_model|TensorFlow SavedModel|
|yolov5s.pb|TensorFlow GraphDef|
|yolov5s.tflite|TensorFlow Lite|
|yolov5s_edgetpu.tflite|TensorFlow Edge TPU|
|yolov5s_paddle_model|PaddlePaddle|

-「detect2.py」実行時のコマンドパラメータ2(詳細)~
|LEFT:128|CENTER:38|CENTER:90|LEFT:|c
|CENTER:コマンドオプション|引数|初期値|CENTER:意味|h
|BGCOLOR(lightyellow):--weights|BGCOLOR(lightyellow):str|BGCOLOR(lightyellow):yolov5s.pt|BGCOLOR(lightyellow):学習済み重みモデルファイル|
|BGCOLOR(lightyellow):--source|BGCOLOR(lightyellow):str|BGCOLOR(lightyellow):data/images|BGCOLOR(lightyellow):推論対象の画像ソース(file/folder) のパス(0,1,... = Webカメラ)|
|BGCOLOR(lightyellow):--imgsz|BGCOLOR(lightyellow):int|BGCOLOR(lightyellow):(640, 640)|BGCOLOR(lightyellow):推論対象の画像のサイズ(pixel)|
|--conf-thres|float|0.25|クラス判定の閾値 (数値が小さい程オブジェクトは増えるが、ノイズも増える)|
|--iou-thres|float|0.45|iou は Intersection Over Union (検出領域が重なっている割合、数値が大きいほど重なり度合いが高い)|
|--max-det|int|1000|maximum detections per image|
|BGCOLOR(lightyellow):--device|BGCOLOR(lightyellow):str|BGCOLOR(lightyellow):|BGCOLOR(lightyellow):使用プロセッサの指定(0 or 0,1,2,3 or cpu) (指定なしの場合は CUDA が使用可能なら cuda、なければ cpu)|
|BGCOLOR(lightyellow):--view-img|BGCOLOR(lightyellow):なし|BGCOLOR(lightyellow):False|BGCOLOR(lightyellow):推論結果の表示 (指定すれば表示する)|
|--save-txt|なし|False|推論結果(検出座標と予測クラス)をテキストファイルで残す (*.txt)|
|--save-conf|なし|False|推論結果(クラスの確率)をテキストファイルで残す (*.txt)|
|--save-crop|なし|False|save cropped prediction boxes|
|--nosave|なし|False|推論結果の記録 (指定すれば残さない)|
|--classes|int|None|クラスフィルタ(--classes 0, or --classes 0 2 3)|
|--agnostic-nms|なし|False|class-agnostic NMS|
|--augment|なし|False|拡張推論|
|--visualize|なし|False|visualize features|
|--update|なし|False|モデルをアップデートする|
|BGCOLOR(lightyellow):--project|BGCOLOR(lightyellow):str|BGCOLOR(lightyellow):runs/detect|BGCOLOR(lightyellow):推論結果の記録フォルダパス|
|BGCOLOR(lightyellow):--name|BGCOLOR(lightyellow):str|BGCOLOR(lightyellow):exp|BGCOLOR(lightyellow):推論結果の記録フォルダの下のフォルダ名(推論ごとにインクリメント)|
|BGCOLOR(lightyellow):--exist-ok|BGCOLOR(lightyellow):なし|BGCOLOR(lightyellow):False|BGCOLOR(lightyellow):推論結果を上書き保存(指定すれば上書き)|
|--line-thickness|int|3|bounding box thickness (pixels)|
|--hide-labels|なし|False|hide labels|
|--hide-conf|なし|False|hide confidences|
|--half|なし|False|use FP16 half-precision inference|
|--dnn|なし|False|use OpenCV DNN for ONNX inference|
|--vid-stride|int|1|video frame-rate stride|
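
- なお、これらのオプションは detect2.py をモジュールとして import し、run() を直接呼び出す形でも指定できる(yolov5 ディレクトリ内の Python から実行する想定の参考スケッチ。オプション名は上の表の引数に対応)。~
#codeprettify(){{
# detect2.py の run() を Python から直接呼び出す参考スケッチ
# (カレントディレクトリが workspace_pylearn/yolov5/ であることを想定)
from detect2 import run

# 「python detect2.py --source 0 --view-img --conf-thres 0.4」に相当する指定
run(
    weights="yolov5s.pt",     # 学習済みモデル
    source="0",               # Web カメラ
    view_img=True,            # 推論結果を表示
    conf_thres=0.4,           # クラス判定の閾値
)
}}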

*** 学習済みモデルのフォーマット変換「export.py」 [#v414f3bf]
-「export.py」対応モデル~
|CENTER:フォーマット|CENTER:パラメータ `export.py --include`|CENTER:変換モデルファイル名称|h
|PyTorch|CENTER:-|yolov5s.pt|
|TorchScript|`torchscript`|yolov5s.torchscript|
|ONNX|`onnx`|yolov5s.onnx|
|OpenVINO|`openvino`|yolov5s_openvino_model/ ※|
|TensorRT|`engine`|yolov5s.engine|
|CoreML|`coreml`|yolov5s.mlmodel|
|TensorFlow SavedModel|`saved_model`|yolov5s_saved_model/ ※|
|TensorFlow GraphDef|`pb`|yolov5s.pb|
|TensorFlow Lite|`tflite`|yolov5s.tflite|
|TensorFlow Edge TPU|`edgetpu`|yolov5s_edgetpu.tflite|
|TensorFlow.js|`tfjs`|yolov5s_web_model/ ※|
|PaddlePaddle|`paddle`|yolov5s_paddle_model/ ※|
 ※ フォルダ名(フォルダ内に変換したモデルファイル)

- ONNX, OpenVINO™ に変換する~
#codeprettify(){{
(py_learn2) python export.py --weights yolov5s.pt --include onnx openvino
}}
・実行結果~
#codeprettify(){{
(py_learn2) python export.py --weights yolov5s.pt --include onnx openvino
export: data=C:\anaconda_win\workspace_pylearn\yolov5\data\coco128.yaml, weights=['yolov5s.pt'], imgsz=[640, 640], batch_size=1, device=cpu, half=False, inplace=False, keras=False, optimize=False, int8=False, per_tensor=False, dynamic=False, simplify=False, opset=17, verbose=False, workspace=4, nms=False, agnostic_nms=False, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['onnx', 'openvino']
YOLOv5  v7.0-294-gdb125a20 Python-3.11.8 torch-2.2.1+cu121 CPU

Fusing layers...
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs

PyTorch: starting from yolov5s.pt with output shape (1, 25200, 85) (14.1 MB)

ONNX: starting export with onnx 1.15.0...
ONNX: export success  0.8s, saved as yolov5s.onnx (28.0 MB)

OpenVINO: starting export with openvino 2024.0.0-14509-34caeefd078-releases/2024/0...
OpenVINO: export success  1.4s, saved as yolov5s_openvino_model\ (28.2 MB)

Export complete (2.8s)
Results saved to C:\anaconda_win\workspace_pylearn\yolov5
Detect:          python detect.py --weights yolov5s_openvino_model\
Validate:        python val.py --weights yolov5s_openvino_model\
PyTorch Hub:     model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s_openvino_model\')
Visualize:       https://netron.app
}}
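- エクスポートされた「yolov5s.onnx」が読み込めるかどうかは、たとえば次のように onnxruntime で確認できる(入出力名・形状はモデルにより異なるため参考スケッチ)~
#codeprettify(){{
# yolov5s.onnx を onnxruntime で読み込み、入出力の名前と形状を確認するスケッチ
import onnxruntime as ort

sess = ort.InferenceSession("yolov5s.onnx", providers=["CPUExecutionProvider"])
print("inputs :", [(i.name, i.shape) for i in sess.get_inputs()])
print("outputs:", [(o.name, o.shape) for o in sess.get_outputs()])
}}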

- パッケージ環境(OpenVINO™ のバージョンによっては動作しないことがある)~
#codeprettify(){{
(py_learn2) python -V
Python 3.11.8

(py_learn2) conda list
    :
onnx                      1.15.0                   pypi_0    pypi
onnxruntime               1.17.1                   pypi_0    pypi
opencv                    4.6.0           py311h5d08a89_5
opencv-python             4.9.0.80                 pypi_0    pypi
openjpeg                  2.4.0                h4fc8c34_0
openssl                   3.0.13               h2bbff1b_0
openvino                  2024.0.0                 pypi_0    pypi
openvino-dev              2024.0.0                 pypi_0    pypi
openvino-telemetry        2023.2.1                 pypi_0    pypi
    :
}}

- (参考) OpenVINO™ のコンバートコマンドで変換~
#codeprettify(){{
(py_learn) mo  --input_model yolov5s.onnx
}}
・実行結果~
#codeprettify(){{
(py_learn) mo  --input_model yolov5s.onnx
[ INFO ] Generated IR will be compressed to FP16. If you get lower accuracy, please consider disabling compression explicitly by adding argument --compress_to_fp16=False.
Find more information about compression to FP16 at https://docs.openvino.ai/2023.0/openvino_docs_MO_DG_FP16_Compression.html
[ INFO ] The model was converted to IR v11, the latest model format that corresponds to the source DL framework input/output format. While IR v11 is backwards compatible with OpenVINO Inference Engine API v1.0, please use API v2.0 (as of 2022.1) to take advantage of the latest improvements in IR v11.
Find more information about API v2.0 and IR v11 at https://docs.openvino.ai/2023.0/openvino_2_0_transition_guide.html
[ INFO ] MO command line tool is considered as the legacy conversion API as of OpenVINO 2023.2 release. Please use OpenVINO Model Converter (OVC). OVC represents a lightweight alternative of MO and provides simplified model conversion API.
Find more information about transition from MO to OVC at https://docs.openvino.ai/2023.2/openvino_docs_OV_Converter_UG_prepare_model_convert_model_MO_OVC_transition.html
[ SUCCESS ] Generated IR version 11 model.
[ SUCCESS ] XML file: C:\anaconda_win\workspace_pylearn\yolov5\yolov5s.xml
[ SUCCESS ] BIN file: C:\anaconda_win\workspace_pylearn\yolov5\yolov5s.bin
}}

- OpenVINO™ 対応プログラムについて~
・この方法でコンバートされたモデルは、従来の方法によるアクセスプログラムでは動作しない → [[サンプルデモを動かす>#n0915dc0]]~
・「openvino-dev」パッケージ付属のモデルオプティマイザーによる変換が必要~
 → [[「export.py」で得られた ONNXファイルを OpenVINO™ IR に変換>#o96f0a22]]~
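・参考:mo で生成された IR(yolov5s.xml / yolov5s.bin)が読み込めることは、次のように OpenVINO™ ランタイムだけでも確認できる(openvino 2024.0 を想定した参考スケッチ)~
#codeprettify(){{
# OpenVINO ランタイムで IR モデルを読み込み、入出力形状を確認するスケッチ
from openvino import Core        # 2022.x/2023.x では from openvino.runtime import Core

core = Core()
model = core.read_model("yolov5s.xml")           # 同名の .bin が自動的に読み込まれる
compiled = core.compile_model(model, "CPU")      # "GPU" や "AUTO" も指定可能
print("input :", compiled.input(0).shape)
print("output:", compiled.output(0).shape)
}}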

*** 変換した学習済みモデルで 推論プログラム「detect2.py」の実行 [#oa24b4bc]
- PyTorch(オリジナル)モデル「yolov5s.pt」~
#codeprettify(){{
(py_learn2) python detect2.py --source ../../Videos/car_m.mp4 --view-img
}}
・実行結果~
#ref(rev_yolov5_02_m.jpg,right,around,25%,rev_yolov5_02_m.jpg)
#codeprettify(){{
(py_learn2) python detect2.py --source ../../Videos/car_m.mp4 --view-img

detect2: weights=yolov5s.pt, source=../../Videos/car_m.mp4, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=True, save_txt=False, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5  v7.0-294-gdb125a20 Python-3.11.8 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Ti, 12282MiB)

Fusing layers...
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
Speed: 0.4ms pre-process, 5.7ms inference, 2.6ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs\detect\exp17
}}

- ONNX モデル「yolov5s.onnx」~
#codeprettify(){{
(py_learn2) python detect2.py --source ../../Videos/car_m.mp4 --view-img --weights yolov5s.onnx
}}
・実行結果~
#codeprettify(){{
(py_learn2) python detect2.py --source ../../Videos/car_m.mp4 --view-img --weights yolov5s.onnx

detect2: weights=['yolov5s.onnx'], source=../../Videos/car_m.mp4, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=True, save_txt=False, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5  v7.0-294-gdb125a20 Python-3.11.8 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Ti, 12282MiB)

Loading yolov5s.onnx for ONNX Runtime inference...
requirements: Ultralytics requirement ['onnxruntime-gpu'] not found, attempting AutoUpdate...
ERROR: Could not install packages due to an OSError: [WinError 5] アクセスが拒否されました。: 'C:\\Users\\izuts\\anaconda3\\envs\\py_learn2\\Lib\\site-packages\\onnxruntime\\capi\\onnxruntime_providers_shared.dll'
Consider using the `--user` option or check the permissions.

requirements: ❌ Command 'pip install --no-cache "onnxruntime-gpu" ' returned non-zero exit status 1.
Speed: 1.3ms pre-process, 29.4ms inference, 4.1ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs\detect\exp18
}}

- OpenVINO™ モデル「yolov5s_openvino_model」~
#codeprettify(){{
(py_learn2) python detect2.py --source ../../Videos/car_m.mp4 --view-img --weights yolov5s_openvino_model
}}
・実行結果~
#codeprettify(){{
(py_learn2) python detect2.py --source ../../Videos/car_m.mp4 --view-img --weights yolov5s_openvino_model

detect2: weights=['yolov5s_openvino_model'], source=../../Videos/car_m.mp4, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=True, save_txt=False, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5  v7.0-294-gdb125a20 Python-3.11.8 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Ti, 12282MiB)

Loading yolov5s_openvino_model for OpenVINO inference...
Speed: 1.3ms pre-process, 29.3ms inference, 3.8ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs\detect\exp19
}}

** YOLO V5 を「PyTorch」で使う [#sc033781]
*** YOLO V5 テストプログラム [#qf4e4736]
- プロジェクト・パッケージ「update_20240405.zip」に同梱~
- プロジェクトの実行ディレクトリ「workspace_pylearn/yolov5/」~
#codeprettify(){{
(py_learn) python yolov5-test2.py
}}
・実行結果~
#ref(rev_yolov5_01_m.jpg,right,around,25%,rev_yolov5_01_m.jpg)
#codeprettify(){{
(py_learn) python yolov5-test2.py
Using cache found in C:\Users\<User>/.cache\torch\hub\ultralytics_yolov5_master
YOLOv5  2024-3-13 Python-3.11.7 torch-2.2.0+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Ti, 12282MiB)

Fusing layers...
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
Adding AutoShape...
Saved 2 images to runs\detect\exp13
image 1/2: 720x1280 2 persons, 2 ties
image 2/2: 1080x810 4 persons, 1 bus
Speed: 8.7ms pre-process, 30.0ms inference, 79.0ms NMS per image at shape (2, 3, 640, 640)
最初の画像からの検出
tensor([[7.42863e+02, 4.79508e+01, 1.14113e+03, 7.16857e+02, 8.80750e-01, 0.00000e+00],
        [4.42037e+02, 4.37341e+02, 4.96715e+02, 7.09926e+02, 6.87170e-01, 2.70000e+01],
        [1.25252e+02, 1.93575e+02, 7.10963e+02, 7.13103e+02, 6.41552e-01, 0.00000e+00],
        [9.82882e+02, 3.08400e+02, 1.02733e+03, 4.20228e+02, 2.62887e-01, 2.70000e+01]], device='cuda:0')
2番目の画像からの検出
tensor([[2.20872e+02, 4.07374e+02, 3.45721e+02, 8.74728e+02, 8.35223e-01, 0.00000e+00],
        [6.62591e+02, 3.86202e+02, 8.10000e+02, 8.80324e+02, 8.28926e-01, 0.00000e+00],
        [5.75802e+01, 3.97293e+02, 2.14777e+02, 9.18263e+02, 7.85060e-01, 0.00000e+00],
        [1.47090e+01, 2.22154e+02, 7.98415e+02, 7.84966e+02, 7.81528e-01, 5.00000e+00],
        [0.00000e+00, 5.53392e+02, 7.24685e+01, 8.74691e+02, 4.64727e-01, 0.00000e+00]], device='cuda:0')
全てのクラス
{0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
}}
・結果画像は「runs/detect/exp(2・3・4 …)」ディレクトリに保存されている。~
#clear
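- 参考:results.xyxy のテンソルは [xmin, ymin, xmax, ymax, confidence, class] の並びになっている。次のように pandas 形式に変換するとクラス名付きで確認しやすい(参考スケッチ)~
#codeprettify(){{
# 推論結果をクラス名付きの DataFrame で確認する参考スケッチ
import torch

model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
results = model(['data/images/bus.jpg'])

df = results.pandas().xyxy[0]      # 最初の画像の検出結果を表形式に変換
print(df[["name", "confidence", "xmin", "ymin", "xmax", "ymax"]])
}}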
~
- ソースファイル~
#divregion(「yolov5-test2.py」)
#codeprettify(){{
# -*- coding: utf-8 -*-
##------------------------------------------
## 【復習】「PyTorch で始める AI開発」
##   exercise / YOLOv5で物体検出    Ver. 0.02
##       (PyTorch Hubからダウンロード)
##
##               2024.09.13 Masahiro Izutsu
##------------------------------------------
## https://kikaben.com/yolov5-starter/
## yolov5-test2.py

import torch

# Torch Hub から YOLO V5 をダウンロード
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
# 画像のURL
base_url = 'data/images/'

# 画像二つのバッチ
imgs = [base_url + f for f in ('zidane.jpg', 'bus.jpg')]

# 推論の実行
results = model(imgs)

# 結果を表示
results.show()

# 画像をセーブ
results.save()

# 検出されたクラスと数を表示
results.print()

# Bounding Boxなどの表示
print('最初の画像からの検出')
print(results.xyxy[0])

print('2番目の画像からの検出')
print(results.xyxy[1])

# サポートされているクラス
print('全てのクラス')
print(model.names)
}}
#enddivregion

*** YOLO V5 物体検出プログラム「detect2_yolov5.py」の作成 [#n3cc0566]
#ref(rev_yolov5_08_m.jpg,right,around,30%,rev_yolov5_08_m.jpg)
#ref(rev_yolov5_07_m.jpg,right,around,30%,rev_yolov5_07_m.jpg)
- 作成プログラムの仕様~
・オンライン/オフライン(ローカル)どちらでも動作する~
・検出された 80種類のオブジェクトを領域と文字で表示する。~
・文字の表示は「日本語/英語」の表記が可能。~
・オブジェクトの種類によって色分け表示する。~
・入力ソースは「WEBカメラ/動画ファイル/静止画ファイル」に対応する。~
・結果を画像出力できる。~
・プロジェクト・パッケージ「update_20240405.zip」に同梱~
~
-「yolov5」ディレクトリ直下にラベルファイルを用意しておく~
#codeprettify(){{
coco.names   ← 英語版
coco.names_jp  ← 日本語版
}}
~
- コマンドパラメータ~
|LEFT:|CENTER:|LEFT:|c
|CENTER:コマンドオプション|初期値|CENTER:意味|h
| -i , --image|'../../Videos/car_m.mp4'|入力ソースのパス またはカメラ(cam/cam0~cam9)|
|BGCOLOR(lightyellow): -y , --yolov5|BGCOLOR(lightyellow):'ultralytics/yolov5'|BGCOLOR(lightyellow):yolov5ディレクトリのパス(ローカルの場合は yolov5 のパス)|
| -m , --models|'yolov5s'|モデル名(ローカルの場合は モデルファイルのパス)※1|
| -l , --labels|'coco.names_jp'|ラベルファイルのパス(coco.names, coco.names_jp)|
| -c , --conf|0.25|オブジェクト検出レベルの閾値|
| -t , --title|'y'|タイトルの表示(y/n)|
| -s , --speed|'y'|速度の表示(y/n)|
| -o , --out|'non'|出力結果の保存パス <path/filename> ※2|
| -cpu|-|CPUフラグ(指定すれば 常に CPU動作)|
※1 オンライン動作の場合に指定できるモデルの種類は「yolov5n」「yolov5s」「yolov5m」「yolov5l」「yolov5x」~
※2 出力ファイル名までのディレクトリ・パスは必ず存在すること(存在しない場合は保存しない)~
~
・「-y , --yolov5」パラメータ指定の例~
#codeprettify(){{
-y ultralytics/yolov5                                       ← オンライン(TorchHub)<default>
-y ./                                                       ← オフライン(ローカル)
}}
 ※ 初回起動時にキャッシュにダウンロードされ以後はキャッシュで動作する~
~
・「-m , --models」パラメータ指定の例~
#codeprettify(){{
-m yolov5s                                                  ← オンライン(TorchHub)<default>
-m ./test/yolov5s.pt                                        ← オフライン(ローカル)
}}
 ※ モデルが指定場所にない場合は、初回実行時に自動的にダウンロードされる~
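・参考:-y / -m の組み合わせは、内部的には torch.hub.load() の呼び出し方の違いに対応している(おおよその対応を示す参考スケッチ。パスは例)~
#codeprettify(){{
import torch

# オンライン (Torch Hub): -y ultralytics/yolov5 -m yolov5s に相当
model_online = torch.hub.load('ultralytics/yolov5', 'yolov5s')

# オフライン (ローカル): -y ./ -m ./yolov5s.pt に相当
# (カレントディレクトリが git clone した yolov5 リポジトリであることを想定)
model_local = torch.hub.load('./', 'custom', './yolov5s.pt', source='local')
}}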
~
#divregion( コマンドパラメータ詳細)
#codeprettify(){{
(py_learn) python detect2_yolov5.py -h
usage: detect2_yolov5.py [-h] [-i IMAGE_FILE] [-y YOLOV5] [-m MODELS] [-c CONFIDENCE] [-l LABELS] [-t TITLE]
                         [-s SPEED] [-o IMAGE_OUT] [-cpu]

options:
  -h, --help            show this help message and exit
  -i IMAGE_FILE, --image IMAGE_FILE
                        Absolute path to image file or cam/cam0/cam1 for camera stream.
  -y YOLOV5, --yolov5 YOLOV5
                        YOLO V5 directory absolute path.
  -m MODELS, --models MODELS
                        yolov5n/yolov5m/yolov5l/yolov5x or model file absolute path.
  -c CONFIDENCE, --conf CONFIDENCE
                        confidences labels Default value is 0.25
  -l LABELS, --labels LABELS
                        Language.(jp/en) Default value is 'jp'
  -t TITLE, --title TITLE
                        Program title flag.(y/n) Default value is 'y'
  -s SPEED, --speed SPEED
                        Speed display flag.(y/n) Default value is 'y'
  -o IMAGE_OUT, --out IMAGE_OUT
                        Processed image file path. Default value is 'non'
  -cpu                  Optional. CPU only!
}}
#enddivregion
~
- オンライン実行例~
#codeprettify(){{
(py_learn) python detect2_yolov5.py
}}
・実行結果~
#ref(rev_yolov5_03_m.jpg,right,around,25%,rev_yolov5_03_m.jpg)
#codeprettify(){{
(py_learn) python detect2_yolov5.py

Object detection YoloV5 in PyTorch Ver. 0.05: Starting application...
   OpenCV version : 4.9.0

   - Image File   :  ../../Videos/car_m.mp4
   - YOLO v5      :  ultralytics/yolov5
   - Pretrained   :  yolov5s
   - Confidence lv:  0.25
   - Label file   :  coco.names_jp
   - Program Title:  y
   - Speed flag   :  y
   - Processed out:  non
   - Use device   :  cuda:0

Using cache found in C:\Users\izuts/.cache\torch\hub\ultralytics_yolov5_master
YOLOv5  2024-3-13 Python-3.11.8 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Ti, 12282MiB)

Fusing layers...
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
Adding AutoShape...

FPS average:      30.90

 Finished.
}}
#clear
- オフライン実行例~
#codeprettify(){{
(py_learn) python detect2_yolov5.py -y ./
}}
・実行結果~
#codeprettify(){{
(py_learn) python detect2_yolov5.py -y ./

Object detection YoloV5 in PyTorch Ver. 0.05: Starting application...
   OpenCV version : 4.9.0

   - Image File   :  ../../Videos/car_m.mp4
   - YOLO v5      :  ./
   - Pretrained   :  yolov5s
   - Confidence lv:  0.25
   - Label file   :  coco.names_jp
   - Program Title:  y
   - Speed flag   :  y
   - Processed out:  non
   - Use device   :  cuda:0

YOLOv5  v7.0-294-gdb125a20 Python-3.11.8 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Ti, 12282MiB)

Fusing layers...
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
Adding AutoShape...

FPS average:      20.80

 Finished.
}}

- ソースコード~
#divregion(「detect2_yolov5.py」)
#codeprettify(){{
# -*- coding: utf-8 -*-
##------------------------------------------
## 【復習】「PyTorch で始める AI開発」
##   Chapter 04 / Extra edition     Ver. 0.05
##   Chapter 04 / Extra edition     Ver. 0.06
##       YoloV5 in PyTorch による物体検出
##
##               2024.09.13 Masahiro Izutsu
##------------------------------------------
## detect2_yolov5.py
##  Ver. 0.03   2024/04/09  classID=119 まで対応
##  Ver. 0.04   2024/04/13  クラウド/ローカル切り替え
##  Ver. 0.05   2024/04/15  confidence 閾値設定/カメラ入力(cam0-cam9)
##  Ver. 0.06   2024/04/30  ラベルファイルなしの対応

# -y <YOLOv5>                                   -m <Pretrained model>
#    'ultralytics/yolov5'                          'yolov5s' [yolov5n][yolov5m][yolov5l][yolov5x]      Torch Hub on line
#    '/anaconda_win/workspace_pylearn/yolov5'      '/anaconda_win/workspace_pylearn/yolov5/yolov5s'              off line
#
# 例:Windows
#       python detect2_yolov5.py                (Torch Hub on line )
#       python detect2_yolov5.py -y '/anaconda_win/workspace_pylearn/yolov5' -m '/anaconda_win/workspace_pylearn/yolov5/yolov5s'
#
# 例:Linux
#       python detect2_yolov5.py                (Torch Hub on line)
#       python detect2_yolov5.py -y '~/workspace_pylearn/yolov5' -m '~/workspace_pylearn/yolov5/yolov5s'

# Color Escape Code
GREEN = '\033[1;32m'
RED = '\033[1;31m'
NOCOLOR = '\033[0m'
YELLOW = '\033[1;33m'

# 定数定義
WINDOW_WIDTH = 640

from os.path import expanduser
INPUT_DEF = expanduser('../../Videos/car_m.mp4')
LANG_DEF = 'coco.names_jp'                                    # 2024/04/09

# import処理
import sys
import cv2
import numpy as np
import argparse
import torch
from torch import nn
from torchvision import transforms, models
from PIL import Image
import platform

import my_puttext                                           # my library 2024.03.13
import my_fps                                               # my library 2024.03.13
import my_color80                                           # my library 2024.03.13
from os.path import isfile                                  # 2024/04/30 Ver. 0.06

TEXT_COLOR = my_color80.CR_white

# タイトル
title = 'Object detection YoloV5 in PyTorch Ver. 0.05'
title = 'Object detection YoloV5 in PyTorch Ver. 0.06'

# Parses arguments for the application
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--image', metavar = 'IMAGE_FILE', type=str,
            default = INPUT_DEF,
            help = 'Absolute path to image file or cam/cam0/cam1 for camera stream.')
    parser.add_argument('-y', '--yolov5', metavar = 'YOLOV5', type=str,
            default = 'ultralytics/yolov5',
            help = 'YOLO V5 directory absolute path.')
    parser.add_argument('-m', '--models', metavar = 'MODELS', type=str,
            default = 'yolov5s',
            help = 'yolov5n/yolov5m/yolov5l/yolov5x or model file absolute path.')
    parser.add_argument('-c', '--conf', metavar = 'CONFIDENCE',
            default = 0.25,                                 # 2024/04/14
            help = 'confidences labels Default value is 0.25')
    parser.add_argument('-l', '--labels', metavar = 'LABELS',
            default = LANG_DEF,                             # 2024/04/09
            help = 'Language.(jp/en) Default value is \'jp\'')
    parser.add_argument('-t', '--title', metavar = 'TITLE',
            default = 'y',
            help = 'Program title flag.(y/n) Default value is \'y\'')
    parser.add_argument('-s', '--speed', metavar = 'SPEED',
            default = 'y',
            help = 'Speed display flag.(y/n) Default value is \'y\'')
    parser.add_argument('-o', '--out', metavar = 'IMAGE_OUT',
            default = 'non',
            help = 'Processed image file path. Default value is \'non\'')
    parser.add_argument("-cpu", default = False, action = 'store_true',
            help="Optional. CPU only!")
    return parser

# モデル基本情報の表示
def display_info(image, yolov5, models, conf, labels, titleflg, speedflg, outpath, use_device):
    print('\n' + GREEN + title + ': Starting application...' + NOCOLOR)
    print('   OpenCV version :',cv2.__version__)
    print('\n   - ' + YELLOW + 'Image File   : ' + NOCOLOR, image)
    print('   - ' + YELLOW + 'YOLO v5      : ' + NOCOLOR, yolov5)
    print('   - ' + YELLOW + 'Pretrained   : ' + NOCOLOR, models)
    print('   - ' + YELLOW + 'Confidence lv: ' + NOCOLOR, conf)
    print('   - ' + YELLOW + 'Label file   : ' + NOCOLOR, labels)
    print('   - ' + YELLOW + 'Program Title: ' + NOCOLOR, titleflg)
    print('   - ' + YELLOW + 'Speed flag   : ' + NOCOLOR, speedflg)
    print('   - ' + YELLOW + 'Processed out: ' + NOCOLOR, outpath)
    print('   - ' + YELLOW + 'Use device   : ' + NOCOLOR, use_device, '\n')

# 画像の種類を判別する
#   戻り値: 'jeg''png'... 画像ファイル
#           'None'        画像ファイル以外 (動画ファイル)
#           'NotFound'    ファイルが存在しない
import os
def is_pict(filename):
    '''
    try:
        imgtype = imghdr.what(filename)
    except FileNotFoundError as e:
        imgtype = 'NotFound'
    return str(imgtype)
    '''
    if not os.path.isfile(filename):
        return 'NotFound'

    types = ['.bmp','.png','.jpg','.jpeg','.JPG','.tif']
    for ss in types:
        if filename.endswith(ss):
            return ss
    return 'None'

# ** main関数 **
def main():
    # 日本語フォント指定
    fontPIL = my_puttext.get_font()                         # 2024/03/13

    # Argument parsing and parameter setting
    ARGS = parse_args().parse_args()
    input_stream = ARGS.image
    labels = ARGS.labels                                    # 2024/04/09
    titleflg = ARGS.title
    speedflg = ARGS.speed

    # 入力 cam/cam0-cam9 対応                               # 2024/04/15
    if input_stream.find('cam') == 0 and len(input_stream) < 5:
        input_stream = 0 if input_stream == 'cam' else int(input_stream[3])
        isstream = True
    else:
        filetype = is_pict(input_stream)
        isstream = filetype == 'None'
        if (filetype == 'NotFound'):
            print(RED + "\ninput file Not found." + NOCOLOR)
            quit()
    outpath = ARGS.out
    conf = ARGS.conf
    yolov5 = ARGS.yolov5 if platform.system()=='Windows' else expanduser(ARGS.yolov5)
    models = ARGS.models if platform.system()=='Windows' else expanduser(ARGS.models)
    

    # GPUが使用できるか調べる
    use_device = 'cuda:0' if not ARGS.cpu and torch.cuda.is_available() else 'cpu'

    # 情報表示
    display_info(input_stream, yolov5, models, conf, labels, titleflg, speedflg, outpath, use_device)

    # TorchHubからモデルを読み込む (クラウド/ローカル切り替え)    2024/04/13
    cust = 'custom' if 0 < models.find('yolo') else ''
    if yolov5 == 'ultralytics/yolov5':
        if cust == '':
            if -1 == models.find('.'):
                model = torch.hub.load(yolov5, models)
            else:
                model = torch.hub.load(yolov5, 'custom', models)
        else:
            model = torch.hub.load(yolov5, cust, models)
    else:
        if cust == '':
            if -1 == models.find('.'):
                model = torch.hub.load(yolov5, models, source='local')
            else:
                model = torch.hub.load(yolov5, 'custom', models, source='local')
        else:
            model = torch.hub.load(yolov5, cust, models, source='local')

    # モデルを推論用に設定する
    model.eval()
    model.to(use_device)

    # 判定ラベル                                            # 2024/04/30 Ver. 0.06
    if isfile(labels):
        with open(labels, 'r', encoding="utf-8") as labels_file:
            label_list = labels_file.read().splitlines()
    else:
        label_list = model.names

    # 入力準備
    if (isstream):
        # カメラ 
        cap = cv2.VideoCapture(input_stream)
        ret, frame = cap.read()
        loopflg = cap.isOpened()
    else:
        # 画像ファイル読み込み
        frame = cv2.imread(input_stream)
        if frame is None:
            print(RED + "\nUnable to read the input." + NOCOLOR)
            quit()

        # アスペクト比を固定してリサイズ
        img_h, img_w = frame.shape[:2]
        if (img_w > WINDOW_WIDTH):
            height = round(img_h * (WINDOW_WIDTH / img_w))
            frame = cv2.resize(frame, dsize = (WINDOW_WIDTH, height))
        loopflg = True                                      # 1回ループ

    # 処理結果の記録 step1
    if (outpath != 'non'):
        if (isstream):
            fps = int(cap.get(cv2.CAP_PROP_FPS))
            out_w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
            out_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
            fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
            outvideo = cv2.VideoWriter(outpath, fourcc, fps, (out_w, out_h))

    # 計測値初期化
    fpsWithTick = my_fps.fpsWithTick()
    fps_total = 0
    fpsWithTick.get()                                       # fps計測開始

    # メインループ 
    while (loopflg):
        if frame is None:
            print(RED + "\nUnable to read the input." + NOCOLOR)
            quit()

        # ニューラルネットワークを実行する
        results = model(frame, size=640)
        message = []                                        # 表示メッセージ
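        # results.xyxy[0] is an (N, 6) array: [xmin, ymin, xmax, ymax, confidence, class_id]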
        bbox = results.xyxy[0].detach().cpu().numpy()
        for preds in bbox:
            xmin = int(preds[0])
            ymin = int(preds[1])
            xmax = int(preds[2])
            ymax = int(preds[3])
            confidence  = preds[4]
            class_id  = int(preds[5])
            color_id = class_id if class_id < 80 else class_id - 40 # 2024/04/09
            
            if (confidence > conf):                         # 低い確率を除外
                # オブジェクト別の色指定
                BOX_COLOR = my_color80.get_boder_bgr80(color_id)
                LABEL_BG_COLOR = my_color80.get_back_bgr80(color_id)

                # ラベル描画領域を得る
                x0,y0,x1,y1 = my_puttext.cv2_putText(img = frame,
                                       text = label_list[class_id] + ': %.2f' % confidence,
                                       org = (xmin+5, ymin+18), fontFace = fontPIL,
                                       fontScale = 14,
                                       color = TEXT_COLOR,
                                       mode = 0,
                                       areaf=True)
                xx = xmax if xmax > x1 else x1              # 横が領域を超える場合は超えた値にする
                cv2.rectangle(frame,(xmin, ymin), (xx, ymin+20), LABEL_BG_COLOR, -1)
                my_puttext.cv2_putText(img = frame,
                                       text = label_list[class_id] + ': %.2f' % confidence,
                                       org = (xmin+5, ymin+18), fontFace = fontPIL,
                                       fontScale = 14,
                                       color = TEXT_COLOR,
                                       mode = 0)
                # 画像に枠を描く
                cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), BOX_COLOR, 1)

        # FPSを計算する
        fps = fpsWithTick.get()
        st_fps = 'fps: {:>6.2f}'.format(fps)
        if (speedflg == 'y'):
            cv2.rectangle(frame, (10, 38), (95, 55), (90, 90, 90), -1)
            cv2.putText(frame, st_fps, (15, 50), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.4, color=(255, 255, 255), lineType=cv2.LINE_AA)

        # タイトル描画
        if (titleflg == 'y'):
            cv2.putText(frame, title, (12, 32), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.8, color=(0, 0, 0), lineType=cv2.LINE_AA)
            cv2.putText(frame, title, (10, 30), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.8, color=(200, 200, 0), lineType=cv2.LINE_AA)

        # 画像表示 
        window_name = title + "  (hit 'q' or 'esc' key to exit)"
        cv2.namedWindow(window_name, flags=cv2.WINDOW_AUTOSIZE | cv2.WINDOW_GUI_NORMAL) 
        cv2.imshow(window_name, frame)

        # 処理結果の記録 step2
        if (outpath != 'non'):
            if (isstream):
                outvideo.write(frame)
            else:
                cv2.imwrite(outpath, frame)

        # 何らかのキーが押されたら終了 
        breakflg = False
        while(True):
            key = cv2.waitKey(1)
            prop_val = cv2.getWindowProperty(window_name, cv2.WND_PROP_ASPECT_RATIO)
            if cv2.getWindowProperty(window_name, cv2.WND_PROP_VISIBLE) < 1:        
                print('\n Window close !!')
                sys.exit(0)
            if key == 27 or key == 113 or (prop_val < 0.0):     # 'esc' or 'q'
                breakflg = True
                break
            if (isstream):
                break

        if ((breakflg == False) and isstream):
            # 次のフレームを読み出す
            ret, frame = cap.read()
            if ret == False:
                break
            loopflg = cap.isOpened()
        else:
            loopflg = False

    # 終了処理 
    if (isstream):
        cap.release()

        # 処理結果の記録 step3
        if (outpath != 'non'):
            if (isstream):
                outvideo.release()

    cv2.destroyAllWindows()

    print('\nFPS average: {:>10.2f}'.format(fpsWithTick.get_average()))
    print('\n Finished.')

# main関数エントリーポイント(実行開始)
if __name__ == "__main__":
    sys.exit(main())
}}
#enddivregion
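- Reference: minimal TorchHub usage sketch~
・The torch.hub.load() pattern used by the script above can also be exercised on its own. The sketch below is illustrative only: the hub entry 'ultralytics/yolov5', the model name and the image path are assumptions that mirror the defaults above, not part of the original script.~
#codeprettify(){{
import torch

# Load the pretrained "yolov5s" weights from the Ultralytics hub (downloaded on first use)
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
model.eval()

# Run inference on one image; size=640 matches the script above
results = model('zidane.jpg', size=640)

# results.xyxy[0] holds one row per detection: xmin, ymin, xmax, ymax, confidence, class_id
for xmin, ymin, xmax, ymax, conf, cls in results.xyxy[0].tolist():
    print(model.names[int(cls)], round(conf, 2), int(xmin), int(ymin), int(xmax), int(ymax))
}}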

*** PyTorch モデル実行速度 [#n33ed9a0]
- 実行プログラム「python detect2_yolov5.py」 (単位:fps)~
|CENTER:100|CENTER:100|CENTER:|CENTER:|CENTER:|CENTER:|CENTER:|CENTER:|c
|マシン・OS|モデル|>|car_m.mp4|>|car1_m.mp4|>|car2_m.mp4|h
|~|~|>|#ref(rev_yolov5_03_m.jpg,right,around,12%,rev_yolov5_03_m.jpg)|>|#ref(rev_yolov5_07_m.jpg,right,around,24%,rev_yolov5_07_m.jpg)|>|#ref(rev_yolov5_08_m.jpg,right,around,24%,rev_yolov5_08_m.jpg)|h
|~|~|GPU|CPU|GPU|CPU|GPU|CPU|h
|'''HP ENVY'''&br;&br;Windows&br;11|yolov5n|BGCOLOR(lightyellow):32.2|15.9|BGCOLOR(lightyellow):49.7|17.9|BGCOLOR(lightyellow):56.0|20.7|
|~|BGCOLOR(ivory):yolov5s|BGCOLOR(lightyellow):31.3|12.7|BGCOLOR(lightyellow):38.7|14.8|BGCOLOR(lightyellow):48.9|15.3|
|~|yolov5m|BGCOLOR(lightyellow):28.8|8.7|BGCOLOR(lightyellow):31.8|9.4|BGCOLOR(lightyellow):42.4|9.5|
|~|yolov5l|BGCOLOR(lightyellow):25.1|5.7|BGCOLOR(lightyellow):31.5|5.9|BGCOLOR(lightyellow):32.0|6.0|
|~|yolov5x|BGCOLOR(lightyellow):23.8|3.9|BGCOLOR(lightyellow):30.8|4.0|BGCOLOR(lightyellow):31.8|4.1|
|BGCOLOR(gold):'''HP ENVY'''&br;&br;Ubuntu&br;22.04LTS|yolov5n|BGCOLOR(lightyellow):64.0|30.0|BGCOLOR(lightyellow):86.0|34.0|BGCOLOR(lightyellow):91.0|37.0|
|~|BGCOLOR(ivory):yolov5s|BGCOLOR(lightyellow):53.9|21.1|BGCOLOR(lightyellow):72.3|25.5|BGCOLOR(lightyellow):87.5|27.1|
|~|yolov5m|BGCOLOR(lightyellow):49.8|13.0|BGCOLOR(lightyellow):63.2|14.7|BGCOLOR(lightyellow):78.0|16.0|
|~|yolov5l|BGCOLOR(lightyellow):44.3|8.3|BGCOLOR(lightyellow):54.7|8.1|BGCOLOR(lightyellow):70.7|8.9|
|~|yolov5x|BGCOLOR(lightyellow):37.7|5.4|BGCOLOR(lightyellow):46.4|4.9|BGCOLOR(lightyellow):57.4|5.1|
|'''HP ELITE'''&br;&br;Windows&br;10|yolov5n|BGCOLOR(lightyellow):27.3|9.4|BGCOLOR(lightyellow):39.4|10.6|BGCOLOR(lightyellow):48.3|11.1|
|~|BGCOLOR(ivory):yolov5s|BGCOLOR(lightyellow):19.6|5.4|BGCOLOR(lightyellow):27.9|5.9|BGCOLOR(lightyellow):30.2|6.3|
|~|yolov5m|BGCOLOR(lightyellow):15.4|2.9|BGCOLOR(lightyellow):18.5|3.1|BGCOLOR(lightyellow):22.3|3.2|
|~|yolov5l|BGCOLOR(lightyellow):11.1|1.7|BGCOLOR(lightyellow):12.7|1.7|BGCOLOR(lightyellow):14.8|1.8|
|~|yolov5x|BGCOLOR(lightyellow):7.6|1.0|BGCOLOR(lightyellow):8.3|1.0|BGCOLOR(lightyellow):9.2|1.4|
|'''DELL Latitude'''&br;&br;Ubuntu&br;20.04LTS|yolov5n|BGCOLOR(lightyellow):-|10.8|BGCOLOR(lightyellow):-|13.7|BGCOLOR(lightyellow):-|14.7|
|~|BGCOLOR(ivory):yolov5s|BGCOLOR(lightyellow):-|8.9|BGCOLOR(lightyellow):-|8.6|BGCOLOR(lightyellow):-|9.4|
|~|yolov5m|BGCOLOR(lightyellow):-|3.9|BGCOLOR(lightyellow):-|4.2|BGCOLOR(lightyellow):-|4.5|
|~|yolov5l|BGCOLOR(lightyellow):-|2.3|BGCOLOR(lightyellow):-|2.4|BGCOLOR(lightyellow):-|2.7|
|~|yolov5x|BGCOLOR(lightyellow):-|1.5|BGCOLOR(lightyellow):-|1.5|BGCOLOR(lightyellow):-|1.6|
・テストコマンド:yolov5n モデルの例~
#codeprettify(){{
(py_learn) python detect2_yolov5.py -i ../../Videos/car_m.mp4 -m yolov5n
(py_learn) python detect2_yolov5.py -i ../../Videos/car_m.mp4 -m yolov5n -cpu
(py_learn) python detect2_yolov5.py -i ../../Videos/car1_m.mp4 -m yolov5n
(py_learn) python detect2_yolov5.py -i ../../Videos/car1_m.mp4 -m yolov5n -cpu
(py_learn) python detect2_yolov5.py -i ../../Videos/car2_m.mp4 -m yolov5n
(py_learn) python detect2_yolov5.py -i ../../Videos/car2_m.mp4 -m yolov5n -cpu
}}
- テスト環境(Intel® CPU / NVIDIA GPU)~
|CENTER:機種|CENTER:OS|CENTER:CPU|CENTER:GPU|h
|HP ENVY Desktop TE02-1097jp|Windows11/Ubuntu22.04LTS|13th Gen Core™ i9-13900|GeForce RTX 4070 Ti 12GB|
|HP EliteDesk 800 G2 SFF|Windows10|6th Gen Core™ i7-6700|GeForce GTX 1050 Ti 4GB|
|DELL Latitude 7520 NoteBook|Ubuntu20.04LTS|11th Gen Core™ i7-1185G7|CENTER:-|

*** モデルによる推論結果の違い [#zff23b95]
- 学習済みモデル 「yolov5n(軽)」→「yolov5x(重)」(「detect2.py」による実行例)~
|CENTER:|CENTER:|CENTER:|CENTER:|CENTER:|CENTER:|c
|元画像|yolov5n|yolov5s|yolov5m|yolov5l|yolov5x|h
|#ref(img01.jpg,right,around,18%,img01.jpg)|#ref(img01n.jpg,right,around,18%,img01n.jpg)|#ref(img01s.jpg,right,around,18%,img01s.jpg)|#ref(img01m.jpg,right,around,18%,img01m.jpg)|#ref(img01l.jpg,right,around,18%,img01l.jpg)|#ref(img01x.jpg,right,around,18%,img01x.jpg)|
|#ref(img06.jpg,right,around,18%,img06.jpg)|#ref(img06n.jpg,right,around,18%,img06n.jpg)|#ref(img06s.jpg,right,around,18%,img06s.jpg)|#ref(img06m.jpg,right,around,18%,img06m.jpg)|#ref(img06l.jpg,right,around,18%,img06l.jpg)|#ref(img06x.jpg,right,around,18%,img06x.jpg)|
|#ref(img07.jpg,right,around,18%,img07.jpg)|#ref(img07n.jpg,right,around,18%,img07n.jpg)|#ref(img07s.jpg,right,around,18%,img07s.jpg)|#ref(img07m.jpg,right,around,18%,img07m.jpg)|#ref(img07l.jpg,right,around,18%,img07l.jpg)|#ref(img07x.jpg,right,around,18%,img07x.jpg)|
|#ref(img13.jpg,right,around,18%,img13.jpg)|#ref(img13n.jpg,right,around,18%,img13n.jpg)|#ref(img13s.jpg,right,around,18%,img13s.jpg)|#ref(img13m.jpg,right,around,18%,img13m.jpg)|#ref(img13l.jpg,right,around,18%,img13l.jpg)|#ref(img13x.jpg,right,around,18%,img13x.jpg)|
|#ref(img20.jpg,right,around,18%,img20.jpg)|#ref(img20n.jpg,right,around,18%,img20n.jpg)|#ref(img20s.jpg,right,around,18%,img20s.jpg)|#ref(img20m.jpg,right,around,18%,img20m.jpg)|#ref(img20l.jpg,right,around,18%,img20l.jpg)|#ref(img20x.jpg,right,around,18%,img20x.jpg)|

*** YOLO V5 / YOLO V3 比較 [#b38887ee]
- 今回(V5)の結果と以前(V3)の結果を比較する~
~
&tinyvideo(https://izutsu.aa0.netvolante.jp/video/ai_result/detect_yolov5.mp4,320 180,controls,loop,muted,autoplay);
&tinyvideo(https://izutsu.aa0.netvolante.jp/video/ai_result/object_detect_yolo3.mp4,320 180,controls,loop,muted,autoplay);~
信号機など V3で検出されなかったオブジェクトが検出できている。~
#clear
#br

** YOLO V5 を「OpenVINO™」で使う [#wf6af861]
*** OpenVINO™ API 2.0 対応方法を調べる [#n0915dc0]
+ サンプルデモのインストール~
・&color(red){実行ディレクトリ「workspace_pylearn/」};~
#codeprettify(){{
(py_learn) git clone https://github.com/violet17/yolov5_demo.git
}}
・実行ログ~
#codeprettify(){{
(py_learn) git clone https://github.com/violet17/yolov5_demo.git
Cloning into 'yolov5_demo'...
remote: Enumerating objects: 31, done.
remote: Counting objects: 100% (31/31), done.
remote: Compressing objects: 100% (30/30), done.
remote: Total 31 (delta 13), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (31/31), 59.87 KiB | 5.99 MiB/s, done.
Resolving deltas: 100% (13/13), done.
}}
+ 冒頭の [[update_20240405.zip>https://izutsu.aa0.netvolante.jp/download/linux/update_20240405.zip]] を解凍してできた「workspace_pylearn/yolov5_demo」を「git clone」でできた「yolov5_demo」にコピーする~
~
+ オリジナルのデモプログラムを動かす~
・&color(red){実行ディレクトリ「workspace_pylearn/yolov5_demo/」};~
#codeprettify(){{
(py_learn2) python yolov5_demo_sync_ov2023.py -i ../yolov5/data/images/zidane.jpg -m ../yolov5/yolov5s_openvino_model/yolov5s.xml
}}
・オリジナルの API2.0 対応デモプログラム「yolov5_demo_sync_ov2023.py」~
#codeprettify(){{
(py_learn2) python yolov5_demo_sync_ov2023.py -i ../yolov5/data/images/zidane.jpg -m ../yolov5/yolov5s_openvino_model/yolov5s.xml
[ INFO ] Creating OpenVINO Runtime Core...
[ INFO ] Reading the model:
        ../yolov5/yolov5s_openvino_model/yolov5s.xml
[ INFO ] Preparing inputs
*********** [1,3,640,640]
--------- ../yolov5/data/images/zidane.jpg
[ INFO ] Loading model to the plugin
[ INFO ] Starting inference...
[ INFO ]          classes : 80
[ INFO ]          num     : 3
[ INFO ]          coords  : 4
[ INFO ]          anchors : [10.0, 13.0, 16.0, 30.0, 33.0, 23.0, 30.0, 61.0, 62.0, 45.0, 59.0, 119.0, 116.0, 90.0, 156.0, 198.0, 373.0, 326.0]
Traceback (most recent call last):
  File "C:\anaconda_win\workspace_pylearn\yolov5_demo\yolov5_demo_sync_ov2023.py", line 349, in <module>
    sys.exit(main() or 0)
             ^^^^^^
  File "C:\anaconda_win\workspace_pylearn\yolov5_demo\yolov5_demo_sync_ov2023.py", line 281, in main
    objects += parse_yolo_region(out_blob, in_frame.shape[2:],
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\anaconda_win\workspace_pylearn\yolov5_demo\yolov5_demo_sync_ov2023.py", line 153, in parse_yolo_region
    out_blob_n, out_blob_c, out_blob_h, out_blob_w = blob.shape
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: not enough values to unpack (expected 4, got 3)
}}
 ※ &color(red){変換した学習済みモデルには対応できないよう};~
~
+ 以前 [[物体検出アルゴリズム「YOLO V5」>YOLOv5]] で使用したモデル(V3) で動かしてみる~
・学習済みモデルを「workspace_pylearn/yolov5_demo/」内に「yolov5s_v3.xml」「yolov5s_v3.bin」の名前で用意しておく~
 → [['''GitHub: ultralytics/yolov5 V3'''>+https://github.com/ultralytics/yolov5/releases/tag/v3.0]]~
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023.py -i ../yolov5/data/images/zidane.jpg -m yolov5s_v3.xml -show
}}
・実行結果~
#ref(YOLOv5/yolov5_openvino3_m.jpg,right,around,25%,yolov5_openvino_m3.jpg)
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023.py -i ../yolov5/data/images/zidane.jpg -m yolov5s_v3.xml -show
[ INFO ] Creating OpenVINO Runtime Core...
[ INFO ] Reading the model:
        yolov5s_v3.xml
[ INFO ] Preparing inputs
*********** [1,3,640,640]
--------- ../yolov5/data/images/zidane.jpg
[ INFO ] Loading model to the plugin
[ INFO ] Starting inference...
[ INFO ]          classes : 80
[ INFO ]          num     : 3
[ INFO ]          coords  : 4
[ INFO ]          anchors : [10.0, 13.0, 16.0, 30.0, 33.0, 23.0, 30.0, 61.0, 62.0, 45.0, 59.0, 119.0, 116.0, 90.0, 156.0, 198.0, 373.0, 326.0]
[ INFO ]          classes : 80
[ INFO ]          num     : 3
[ INFO ]          coords  : 4
[ INFO ]          anchors : [10.0, 13.0, 16.0, 30.0, 33.0, 23.0, 30.0, 61.0, 62.0, 45.0, 59.0, 119.0, 116.0, 90.0, 156.0, 198.0, 373.0, 326.0]
[ INFO ]          classes : 80
[ INFO ]          num     : 3
[ INFO ]          coords  : 4
[ INFO ]          anchors : [10.0, 13.0, 16.0, 30.0, 33.0, 23.0, 30.0, 61.0, 62.0, 45.0, 59.0, 119.0, 116.0, 90.0, 156.0, 198.0, 373.0, 326.0]
(720, 1280)
}}
 ※ &color(green){以前の学習済みモデル(V3) では問題なく動作する};
#clear
~
+ プログラムを改良する ''「yolov5_demo_sync_ov2023x.py」''~
・「yolov5_demo_sync_ov2023.py」を「yolov5_demo_sync_ov2023x.py」としてコピーし修正する~
 (プロジェクト・パッケージ「update_20240405.zip」に同梱)~
・表示出力を日本語対応にする~
・不具合修正(キー入力による中断など)~
・コマンドパラメータを修正して使いやすくする~
~
・ラベルファイルをコピーしておく~
#codeprettify(){{
(py_learn) cp ../yolov5/coco.names ./
(py_learn) cp ../yolov5/coco.names_jp ./
}}
・修正した「yolov5_demo_sync_ov2023x.py」の実行~
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023x.py -i ../yolov5/data/images/zidane.jpg -r
}}
・実行結果~
#ref(YOLOv5/yolov5_openvino_m.jpg,right,around,25%,yolov5_openvino_m.jpg)
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023x.py -i ../yolov5/data/images/zidane.jpg -r

--- YOLO V5 OpenVINO(API 2.0) demoprogram Ver 0.01 ---
OpenCV: 4.9.0
OpenVINO inference_engine: 2024.0.0-14509-34caeefd078-releases/2024/0

 Creating OpenVINO Runtime Core...
 Reading the model: yolov5s_v3.xml
 Label file  : coco.names_jp
 Input source: ../yolov5/data/images/zidane.jpg
 Starting inference...
[ INFO ]
Detected boxes for batch 1:
[ INFO ]  Class ID      | Confidence | XMIN | YMIN | XMAX | YMAX | COLOR
[ INFO ]     人         |   0.873057 |  747 |   39 | 1148 |  711 | (0, 80, 0)
[ INFO ]     人         |   0.816089 |  116 |  197 | 1003 |  711 | (0, 80, 0)
[ INFO ]   ネクタイ     |   0.778782 |  422 |  430 |  517 |  719 | (128, 0, 128)

 FPS average:      11.80

 Finished.
}}
#clear
・動画入力の実行~
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023x.py -i ../../Videos/car1_m.mp4
}}
・実行結果~
#ref(YOLOv5/yolov5_openvino2_m.jpg,right,around,15%,yolov5_openvino2_m.jpg)
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023x.py -i ../../Videos/car1_m.mp4

--- YOLO V5 OpenVINO(API 2.0) demoprogram Ver 0.01 ---
OpenCV: 4.9.0
OpenVINO inference_engine: 2024.0.0-14509-34caeefd078-releases/2024/0

 Creating OpenVINO Runtime Core...
 Reading the model: yolov5s_v3.xml
 Label file  : coco.names_jp
 Input source: ../../Videos/car1_m.mp4
 Starting inference...

 FPS average:       9.20

 Finished.
}}
#clear
・カメラ入力の実行~
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023x.py
}}
・実行結果~
#ref(rev_yolov5_04_m.jpg,right,around,25%,rev_yolov5_04_m.jpg)
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023x.py

--- YOLO V5 OpenVINO(API 2.0) demoprogram Ver 0.01 ---
OpenCV: 4.9.0
OpenVINO inference_engine: 2024.0.0-14509-34caeefd078-releases/2024/0

 Creating OpenVINO Runtime Core...
 Reading the model: yolov5s_v3.xml
 Label file  : coco.names_jp
 Input source: 0
 Starting inference...

 FPS average:      10.40

 Finished.
}}
#clear

- 主なコマンドパラメータ~
|LEFT:|CENTER:|LEFT:|c
|CENTER:コマンド・オプション|初期値|CENTER:意味|h
| -i , --input|'cam'|入力ソースのパス or cam/cam0/cam1|
| -m , --model|'yolov5s_v3.xml'|学習済みモデルのパス|
| -d , --device|'CPU'|推論デバイス(CPU, GPU, FPGA, HDDL or MYRIAD)|
| --labels|'coco.names_jp'|ラベルファイルのパス(coco.names, coco.names_jp)|
| -show|-|表示禁止フラグ(指定すると画面表示をしない)|
| -r, --raw_output_message|-|メッセージ出力フラグ|
| -x, --debug_message|-|デバッグ・メッセージ出力フラグ|
#divregion( コマンドパラメータ詳細)
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023x.py -h

--- YOLO V5 OpenVINO(API 2.0) demoprogram Ver 0.01 ---
OpenCV: 4.9.0
OpenVINO inference_engine: 2024.0.0-14509-34caeefd078-releases/2024/0

usage: yolov5_demo_sync_ov2023x.py [-h] [-m MODEL] [-i INPUT] [-l CPU_EXTENSION] [-d DEVICE] [--labels LABELS]
                                   [-t PROB_THRESHOLD] [-iout IOU_THRESHOLD] [-ni NUMBER_ITER] [-pc] [-r] [-x] [-show]

Options:
  -h, --help            Show this help message and exit.
  -m MODEL, --model MODEL
                        Required. Path to an .xml file with a trained model.
  -i INPUT, --input INPUT
                        Required. Path to an image/video file. (Specify 'cam' to work with camera)
  -l CPU_EXTENSION, --cpu_extension CPU_EXTENSION
                        Optional. Required for CPU custom layers. Absolute path to a shared library with the kernels
                        implementations.
  -d DEVICE, --device DEVICE
                        Optional. Specify the target device to infer on; CPU, GPU, FPGA, HDDL or MYRIAD is acceptable.
                        The sample will look for a suitable plugin for device specified. Default value is CPU
  --labels LABELS       Optional. Labels mapping file
  -t PROB_THRESHOLD, --prob_threshold PROB_THRESHOLD
                        Optional. Probability threshold for detections filtering
  -iout IOU_THRESHOLD, --iou_threshold IOU_THRESHOLD
                        Optional. Intersection over union threshold for overlapping detections filtering
  -ni NUMBER_ITER, --number_iter NUMBER_ITER
                        Optional. Number of inference iterations
  -pc, --perf_counts    Optional. Report performance counters
  -r, --raw_output_message
                        Optional. Output inference results raw values showing
  -x, --debug_message   Optional. Output debug values showing
  -show                 Optional. Hide output view
}}
#enddivregion
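・A combined example with explicit thresholds (illustrative values; -d GPU assumes an Intel GPU and the corresponding OpenVINO plugin are available):~
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023x.py -i ../../Videos/car1_m.mp4 -d GPU -t 0.6 -iout 0.5 -r
}}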

- ソースコード~
#divregion( 「yolov5_demo_sync_ov2023x.py」)
#codeprettify(){{
#!/usr/bin/env python
# -*- coding: utf-8 -*-
##------------------------------------------
## YOLO V5 OpenVINO demoprogram  Ver 0.01 
##   GitHub https://github.com/violet17/yolov5_demo
##
##               2024.03.18 Masahiro Izutsu
##------------------------------------------
## yolov5_demo_sync_ov2023x.py  (original: yolov5_demo_sync_ov2023.py)
##
## 修正箇所:
## ・検出したオブジェクトの表示の日本語対応と表示色
## ・キー入力による中断の不具合修正
## ・コンソール出力、ログ出力の変更
## ・入力パラメータの改良

## from (original: yolov5_demo_sync_ov2023.py)
"""
 Copyright (C) 2018-2019 Intel Corporation

 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
"""

## --- yolov5_demo_sync_ov2023x.py ---

# インポート処理
from __future__ import print_function, division

import logging
import os
import sys
from argparse import ArgumentParser, SUPPRESS
from math import exp as exp
from time import time
import numpy as np

import cv2
from openvino.preprocess import PrePostProcessor, ResizeAlgorithm
from openvino.runtime import Core, Layout, Type
import openvino.runtime as ov

import my_puttext                                               # 2024/03/18
import my_color80                                               # 2024/03/18
import my_fps                                                   # 2024/03/18

#import object_check                                             # 2024/03/20

# Color Escape Code
GREEN = '\033[1;32m'
RED = '\033[1;31m'
NOCOLOR = '\033[0m'
YELLOW = '\033[1;33m'
CYAN = '\033[1;36m'

# 定数定義
TEXT_COLOR = my_color80.CR_white                                # 2024/03/18
DEF_MODEL_PATH = os.path.expanduser('yolov5s_v3.xml')
DEF_LABEL_PATH = os.path.expanduser('coco.names_jp')
DEF_INPUT_PATH = os.path.expanduser('cam')

# タイトル・バージョン情報
title = 'YOLO V5 OpenVINO(API 2.0) demoprogram Ver 0.01'
print(CYAN + '\n--- {} ---'.format(title))
print(GREEN + 'OpenCV:',cv2.__version__)
print("OpenVINO inference_engine:", ov.get_version())
print(NOCOLOR)

logging.basicConfig(format="[ %(levelname)s ] %(message)s", level=logging.INFO, stream=sys.stdout)
log = logging.getLogger()

def build_argparser():
    parser = ArgumentParser(add_help=False)
    args = parser.add_argument_group('Options')
    args.add_argument('-h', '--help', action='help', default=SUPPRESS, help='Show this help message and exit.')
    args.add_argument("-m", "--model", help="Required. Path to an .xml file with a trained model.",
                      default=DEF_MODEL_PATH, type=str)
    args.add_argument("-i", "--input", help="Required. Path to an image/video file. (Specify 'cam' to work with "
                                            "camera)", default=DEF_INPUT_PATH, type=str)
    args.add_argument("-l", "--cpu_extension",
                      help="Optional. Required for CPU custom layers. Absolute path to a shared library with "
                           "the kernels implementations.", type=str, default=None)
    args.add_argument("-d", "--device",
                      help="Optional. Specify the target device to infer on; CPU, GPU, FPGA, HDDL or MYRIAD is"
                           " acceptable. The sample will look for a suitable plugin for device specified. "
                           "Default value is CPU", default="CPU", type=str)
    args.add_argument("--labels", help="Optional. Labels mapping file", default=DEF_LABEL_PATH, type=str)
    args.add_argument("-t", "--prob_threshold", help="Optional. Probability threshold for detections filtering",
                      default=0.5, type=float)
    args.add_argument("-iout", "--iou_threshold", help="Optional. Intersection over union threshold for overlapping "
                                                       "detections filtering", default=0.4, type=float)
    args.add_argument("-ni", "--number_iter", help="Optional. Number of inference iterations", default=1, type=int)
    args.add_argument("-pc", "--perf_counts", help="Optional. Report performance counters", default=False,
                      action="store_true")
    args.add_argument("-r", "--raw_output_message", help="Optional. Output inference results raw values showing",
                      default=False, action="store_true")                   # 2024/03/18
    args.add_argument("-x", "--debug_message", help="Optional. Output debug values showing",
                      default=False, action="store_true")
    args.add_argument("-show", help="Optional. Hide output view", default=True, action='store_false')
    return parser


class YoloParams:
    # ------------------------------------------- Extracting layer parameters ------------------------------------------
    # Magic numbers are copied from yolo samples
    def __init__(self,  side):
        self.num = 3 #if 'num' not in param else int(param['num'])
        self.coords = 4 #if 'coords' not in param else int(param['coords'])
        self.classes = 80 #if 'classes' not in param else int(param['classes'])
        self.side = side
        self.anchors = [10.0, 13.0, 16.0, 30.0, 33.0, 23.0, 30.0, 61.0, 62.0, 45.0, 59.0, 119.0, 116.0, 90.0, 156.0,
                        198.0,
                        373.0, 326.0] #if 'anchors' not in param else [float(a) for a in param['anchors'].split(',')]

    def log_params(self):
        params_to_print = {'classes': self.classes, 'num': self.num, 'coords': self.coords, 'anchors': self.anchors}
        [log.info("         {:8}: {}".format(param_name, param)) for param_name, param in params_to_print.items()]


def letterbox(img, size=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True):
    # Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232
    shape = img.shape[:2]  # current shape [height, width]
    w, h = size

    # Scale ratio (new / old)
    r = min(h / shape[0], w / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better test mAP)
        r = min(r, 1.0)

    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = w - new_unpad[0], h - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, 64), np.mod(dh, 64)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (w, h)
        ratio = w / shape[1], h / shape[0]  # width, height ratios

    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border

    top2, bottom2, left2, right2 = 0, 0, 0, 0
    if img.shape[0] != h:
        top2 = (h - img.shape[0])//2
        bottom2 = top2
        img = cv2.copyMakeBorder(img, top2, bottom2, left2, right2, cv2.BORDER_CONSTANT, value=color)  # add border
    elif img.shape[1] != w:
        left2 = (w - img.shape[1])//2
        right2 = left2
        img = cv2.copyMakeBorder(img, top2, bottom2, left2, right2, cv2.BORDER_CONSTANT, value=color)  # add border
    return img


def scale_bbox(x, y, height, width, class_id, confidence, im_h, im_w, resized_im_h=640, resized_im_w=640):
    gain = min(resized_im_w / im_w, resized_im_h / im_h)  # gain  = old / new
    pad = (resized_im_w - im_w * gain) / 2, (resized_im_h - im_h * gain) / 2  # wh padding
    x = int((x - pad[0])/gain)
    y = int((y - pad[1])/gain)

    w = int(width/gain)
    h = int(height/gain)
 
    xmin = max(0, int(x - w / 2))
    ymin = max(0, int(y - h / 2))
    xmax = min(im_w, int(xmin + w))
    ymax = min(im_h, int(ymin + h))
    # Method item() used here to convert NumPy types to native types for compatibility with functions, which don't
    # support Numpy types (e.g., cv2.rectangle doesn't support int64 in color parameter)
    return dict(xmin=xmin, xmax=xmax, ymin=ymin, ymax=ymax, class_id=class_id.item(), confidence=confidence.item())


def entry_index(side, coord, classes, location, entry):
    side_power_2 = side ** 2
    n = location // side_power_2
    loc = location % side_power_2
    return int(side_power_2 * (n * (coord + classes + 1) + entry) + loc)


def parse_yolo_region(blob, resized_image_shape, original_im_shape, params, threshold):
    # --- Validating output parameters ---
    out_blob_n, out_blob_c, out_blob_h, out_blob_w = blob.shape
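    # element-wise sigmoid 1/(1+exp(-x)) applied to the raw output blob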
    predictions = 1.0/(1.0+np.exp(np.zeros(blob.shape)-blob)) 
                   
    assert out_blob_w == out_blob_h, "Invalid size of output blob. It should be in NCHW layout and height should " \
                                     "be equal to width. Current height = {}, current width = {}" \
                                     "".format(out_blob_h, out_blob_w)

    # --- Extracting layer parameters ---
    orig_im_h, orig_im_w = original_im_shape
    resized_image_h, resized_image_w = resized_image_shape
    objects = list()

    side_square = params.side * params.side

    # --- Parsing YOLO Region output ---
    bbox_size = int(out_blob_c/params.num) #4+1+num_classes

    for row, col, n in np.ndindex(params.side, params.side, params.num):
        bbox = predictions[0, n*bbox_size:(n+1)*bbox_size, row, col]
        
        x, y, width, height, object_probability = bbox[:5]
        class_probabilities = bbox[5:]
        if object_probability < threshold:
            continue
        x = (2*x - 0.5 + col)*(resized_image_w/out_blob_w)
        y = (2*y - 0.5 + row)*(resized_image_h/out_blob_h)
        if int(resized_image_w/out_blob_w) == 8 & int(resized_image_h/out_blob_h) == 8: #80x80, 
            idx = 0
        elif int(resized_image_w/out_blob_w) == 16 & int(resized_image_h/out_blob_h) == 16: #40x40
            idx = 1
        elif int(resized_image_w/out_blob_w) == 32 & int(resized_image_h/out_blob_h) == 32: # 20x20
            idx = 2

        width = (2*width)**2* params.anchors[idx * 6 + 2 * n]
        height = (2*height)**2 * params.anchors[idx * 6 + 2 * n + 1]
        class_id = np.argmax(class_probabilities)
        confidence = object_probability
        objects.append(scale_bbox(x=x, y=y, height=height, width=width, class_id=class_id, confidence=confidence,
                                  im_h=orig_im_h, im_w=orig_im_w, resized_im_h=resized_image_h, resized_im_w=resized_image_w))
    return objects


def intersection_over_union(box_1, box_2):
    width_of_overlap_area = min(box_1['xmax'], box_2['xmax']) - max(box_1['xmin'], box_2['xmin'])
    height_of_overlap_area = min(box_1['ymax'], box_2['ymax']) - max(box_1['ymin'], box_2['ymin'])
    if width_of_overlap_area < 0 or height_of_overlap_area < 0:
        area_of_overlap = 0
    else:
        area_of_overlap = width_of_overlap_area * height_of_overlap_area
    box_1_area = (box_1['ymax'] - box_1['ymin']) * (box_1['xmax'] - box_1['xmin'])
    box_2_area = (box_2['ymax'] - box_2['ymin']) * (box_2['xmax'] - box_2['xmin'])
    area_of_union = box_1_area + box_2_area - area_of_overlap
    if area_of_union == 0:
        return 0
    return area_of_overlap / area_of_union


def main():
    # 日本語フォント指定
    fontPIL = my_puttext.get_font()                             # 2024/03/18

    args = build_argparser().parse_args()


    # --- 1. Plugin initialization for specified device and load extensions library if specified ---
    print(' Creating OpenVINO Runtime Core...')
    core = Core()

    # --- 2. Reading the IR generated by the Model Optimizer (.xml and .bin files) ---
    model = args.model
    print(f" Reading the model: {model}")
    model = core.read_model(model)

    assert len(model.inputs) == 1, "Sample supports only single input topologies"

    # --- 4. Preparing inputs ---
    if args.debug_message:                                      # 2024/03/18
        log.info("Preparing inputs")

    # Read and pre-process input images
    n, c, h, w = model.inputs[0].shape

    if args.labels and os.path.isfile(args.labels):
        with open(args.labels, 'r', encoding="utf-8") as f:     # 2024/03/18
            labels_map = [x.strip() for x in f]
    else:
        labels_map = None
    print(f" Label file  : {args.labels}")                     # 2024/03/18

#    input_stream = 0 if args.input == "cam" else args.input
    if args.input.lower() == "cam" or args.input.lower() == "cam0":
        input_stream = 0
    elif args.input.lower() == "cam1":
        input_stream = 1
    else:
        input_stream = args.input

    print(f" Input source: {input_stream}")                     # 2024/03/18
    cap = cv2.VideoCapture(input_stream)
    number_input_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    number_input_frames = 1 if number_input_frames != -1 and number_input_frames < 0 else number_input_frames

    wait_key_code = 1                                           # 2024/03/18

    # Number of frames in picture is 1 and this will be read in cycle. Sync mode is default value for this case
    if number_input_frames != 1:
        ret, frame = cap.read()
    else:
        wait_key_code = 0                                       # 2024/03/18

    # --- 5. Loading model to the plugin ---
    if args.debug_message:                                      # 2024/03/18
        log.info("Loading model to the plugin")
    compiled_model = core.compile_model(model, device_name=args.device)

    render_time = 0
    parsing_time = 0

    # --- 6. Doing inference ---
    print(" Starting inference...")

    # 計測値初期化
    fpsWithTick = my_fps.fpsWithTick()
    fpsWithTick.get()                                            # fps計測開始

    # メインループ 
    while cap.isOpened():

        ret, frame = cap.read()
        if not ret:
            break
        in_frame = letterbox(frame, (w, h))

        in_frame0 = in_frame
        # resize input_frame to network size
        in_frame = in_frame.transpose((2, 0, 1))  # Change data layout from HWC to CHW
        in_frame = in_frame.reshape((n, c, h, w))

        # Start inference
        start_time = time()
        results = compiled_model.infer_new_request({0: in_frame})
        det_time = time() - start_time

        objects = list()
        for idx in range(len(results)):
            out_blob = results[idx]
            layer_params = YoloParams(side=out_blob.shape[2])

            # オブジェクト・チェック(DEBUG)                   # 2024/03/20
#            if args.debug_message:
#                object_check.chk_object(results, 'results')
#                object_check.chk_object(out_blob, 'out_blob')


            # ログを表示    2024/03/18
            if args.debug_message:                              # 2024/03/18
                layer_params.log_params()

            objects += parse_yolo_region(out_blob, in_frame.shape[2:],
                                            frame.shape[:-1], layer_params,
                                            args.prob_threshold)
            parsing_time = time() - start_time

        # Filtering overlapping boxes with respect to the --iou_threshold CLI parameter
        objects = sorted(objects, key=lambda obj : obj['confidence'], reverse=True)
        for i in range(len(objects)):
            if objects[i]['confidence'] == 0:
                continue
            for j in range(i + 1, len(objects)):
                if intersection_over_union(objects[i], objects[j]) > args.iou_threshold:
                    objects[j]['confidence'] = 0

        # Drawing objects with respect to the --prob_threshold CLI parameter
        objects = [obj for obj in objects if obj['confidence'] >= args.prob_threshold]

        if len(objects) and args.raw_output_message:
            log.info("\nDetected boxes for batch {}:".format(1))
            log.info(" Class ID \t| Confidence | XMIN | YMIN | XMAX | YMAX | COLOR ")

        origin_im_size = frame.shape[:-1]
        for obj in objects:
            # Validation bbox of detected object
            if obj['xmax'] > origin_im_size[1] or obj['ymax'] > origin_im_size[0] or obj['xmin'] < 0 or obj['ymin'] < 0:
                continue
#            color = (int(min(obj['class_id'] * 12.5, 255)),
#                     min(obj['class_id'] * 7, 255), min(obj['class_id'] * 5, 255))

            # オブジェクト別の色指定
            BOX_COLOR = my_color80.get_boder_bgr80(obj['class_id'])
            LABEL_BG_COLOR = my_color80.get_back_bgr80(obj['class_id'])

            det_label = labels_map[obj['class_id']] if labels_map and len(labels_map) >= obj['class_id'] else \
                str(obj['class_id'])

            if args.raw_output_message:
                log.info(
                    "{:^9} \t| {:10f} | {:4} | {:4} | {:4} | {:4} | {} ".format(det_label, obj['confidence'], obj['xmin'],
                                                                              obj['ymin'], obj['xmax'], obj['ymax'],
                                                                              BOX_COLOR))
            # ラベル描画領域を得る
            x0,y0,x1,y1 = my_puttext.cv2_putText(img = frame,
                   text = det_label + ' ' + str(round(obj['confidence'] * 100, 1)) + ' %',
                   org = (obj['xmin'], obj['ymin'] - 7), fontFace = fontPIL,
                   fontScale = 14,
                   color = TEXT_COLOR,
                   mode = 0,
                   areaf=True)
            xx = obj['xmax'] if obj['xmax'] > x1 else x1              # 横が領域を超える場合は超えた値にする
            cv2.rectangle(frame, (obj['xmin'], obj['ymin']-26), (xx, obj['ymin']), LABEL_BG_COLOR, -1)

            my_puttext.cv2_putText(img = frame,
                   text = det_label + ' ' + str(round(obj['confidence'] * 100, 1)) + ' %',
                   org = (obj['xmin'], obj['ymin'] - 7), fontFace = fontPIL,
                   fontScale = 14,
                   color = TEXT_COLOR,
                   mode = 0)

            # 画像に枠を描く
            cv2.rectangle(frame, (obj['xmin'], obj['ymin']), (obj['xmax'], obj['ymax']), BOX_COLOR, 2)

        # FPSを計算する
        fps = fpsWithTick.get()

        # Draw performance stats over frame
        inf_time_message = "Inference time: {:.3f} ms".format(det_time * 1e3)
        render_time_message = "OpenCV rendering time: {:.3f} ms".format(render_time * 1e3)
        parsing_message = "YOLO parsing time is {:.3f} ms".format(parsing_time * 1e3)

        # 文字の影
        cv2.putText(frame, inf_time_message, (15+1, 15+1), cv2.FONT_HERSHEY_COMPLEX, 0.5, (255, 255, 255), 1)
        cv2.putText(frame, render_time_message, (15+1, 45+1), cv2.FONT_HERSHEY_COMPLEX, 0.5, (255, 255, 255), 1)
        cv2.putText(frame, parsing_message, (15+1, 30+1), cv2.FONT_HERSHEY_COMPLEX, 0.5, (255, 255, 255), 1)
        # 文字描画
        cv2.putText(frame, inf_time_message, (15, 15), cv2.FONT_HERSHEY_COMPLEX, 0.5, (200, 10, 10), 1)
        cv2.putText(frame, render_time_message, (15, 45), cv2.FONT_HERSHEY_COMPLEX, 0.5, (10, 10, 200), 1)
        cv2.putText(frame, parsing_message, (15, 30), cv2.FONT_HERSHEY_COMPLEX, 0.5, (10, 10, 200), 1)

        start_time = time()
        if args.show:
            window_name = "DetectionResults (hit 'q' or 'esc' key to exit)"
            cv2.namedWindow(window_name, flags=cv2.WINDOW_AUTOSIZE | cv2.WINDOW_GUI_NORMAL) 
            cv2.imshow(window_name, frame)
        render_time = time() - start_time
        cv2.imwrite("results.jpg", frame)

        if args.show:
            key = cv2.waitKey(wait_key_code)        # 2024/03/18

            # ESC key
            if key == 27 or key == 113:             # 'esc' or 'q'
                break

    cv2.destroyAllWindows()

    print('\n FPS average: {:>10.2f}'.format(fpsWithTick.get_average()))
    print('\n Finished.')


if __name__ == '__main__':
    sys.exit(main() or 0)
}}
#enddivregion

*** YOLO V5 学習済みモデル バージョンによる違い [#l89a7f4d]
 「yolov5_demo_sync_ov2023x.py」がエラーとなる原因を調べる~
- オブジェクトの属性エラーのようなのでチェックプログラムを作成~
 (プロジェクト・パッケージ「update_20240405.zip」に同梱)~
#divregion( 「object_check.py」)
#codeprettify(){{
# -*- coding: utf-8 -*-
##------------------------------------------
##   My Library Object Check  Ver 0.01
##
##               2024.03.15 Masahiro Izutsu
##------------------------------------------
## object_check.py

# オブジェクトの表示
def chk_obj(obj):
    print(f'\n■■ obj  ■■\n{obj}')

# オブジェクトの型
def chk_type(obj):
    print(f'\n■■ type ■■\n{type(obj)}')

# 次元の確認
def chk_shape(obj):
    print(f'\n■■ shape ■■')
    try:
        print(f'\n{obj.shape}')
    except AttributeError as e:
        print(e)

# サイズの確認
def chk_size(obj):
    print(f'\n■■ size ■■')
    try:
        print(f'\n{obj.size}')
    except AttributeError as e:
        print(e)

# 辞書のキーを出力
def chk_keys(obj):
    print(f'\n■■ keys ■■')
    try:
        print(f'\n{obj.keys}')
    except AttributeError as e:
        print(e)

# オブジェクトの全ての属性(メソッドやインスタンス変数)
def chk_dir(obj):
    print(f'\n■■ dir ■■\n{dir(obj)}')

# 属性と中に入ってる変数を出力
def chk_vars(obj):
    print(f'\n■■ vars ■■')
    try:
        print(f'\n{obj.vars}')
    except AttributeError as e:
        print(e)

# オブジェクトのチェック
def chk_object(obj, obj_str):
    print(f'↓↓↓↓↓↓↓↓↓↓ 「{obj_str} 」CHECK START... ↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓')
    chk_obj(obj)
    chk_type(obj)
    chk_shape(obj)
    chk_size(obj)
    chk_keys(obj)
    chk_dir(obj)
    chk_vars(obj)
    print(f'↑↑↑↑↑↑↑↑↑↑ 「{obj_str} 」CHECK END ... ↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑')

if __name__ == '__main__':
    class Sample:
        def __init__(self, value):
            self.value = value

        def show_value(self):
            print(f'Value: {self.value}')

    sample_object = Sample(3)

    chk_object(sample_object, 'sample_object')
}}
#enddivregion
~
-「yolov5_demo_sync_ov2023x.py」のエラー箇所の前に挿入~
#codeprettify(){{
    :
# インポート処理

import object_check                                             # 2024/03/20
    :
}}
#codeprettify(){{
    :
        objects = list()
        for idx in range(len(results)):
            out_blob = results[idx]
            layer_params = YoloParams(side=out_blob.shape[2])

            # オブジェクト・チェック(DEBUG)                   # 2024/03/20
            if args.debug_message:
                object_check.chk_object(results, 'results')
                object_check.chk_object(out_blob, 'out_blob')
    :
}}
・学習済みモデル(V3)「yolov5s_v3.xml」の推論結果で得られるオブジェクト~
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023x.py -i ../../Images/cat.jpg  -x
}}
 '''results object'''
 {<ConstOutput: names[668, Conv_487] shape[1,255,20,20] type: f32>: array([[[[ ..., ]]]], dtype=float32),
  <ConstOutput: names[648, Conv_471] shape[1,255,40,40] type: f32>: array([[[[ ..., ]]]], dtype=float32),
  <ConstOutput: names[628, Conv_455] shape[1,255,80,80] type: f32>: array([[[[ ..., ]]]], dtype=float32)}
・学習済みモデル(V7)「yolov5s.xml」の推論結果で得られるオブジェクト~
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023x.py -i ../../Images/cat.jpg -m ../yolov5/yolov5s.xml -x
}}
 '''results object'''
 {<ConstOutput: names[output0] shape[1,25200,85] type: f32>: array([[[ ..., ]]], dtype=float32)}
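・The same difference can be confirmed without the demo program by reading both IR files with the OpenVINO Runtime and printing their output shapes. A minimal sketch (paths follow the directory layout used above):~
#codeprettify(){{
from openvino.runtime import Core

core = Core()
for xml in ('yolov5s_v3.xml', '../yolov5/yolov5s.xml'):
    model = core.read_model(xml)
    print(xml)
    for out in model.outputs:
        # V3: three outputs shaped [1,255,S,S] / V7 export: one output shaped [1,25200,85]
        print('   ', out.get_any_name(), out.shape)
}}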

-[[「Netron (Web版)」>+https://netron.app/]] で視覚化して確認「yolov5s_v3.onnx」/「yolov5s.onnx」~
#ref(netron_yolov5s_3_onnx.png,left,around,25%,netron_yolov5s_3_onnx.png)
#ref(netron_yolov5s_onnx.png,left,around,25%,netron_yolov5s_onnx.png)
#clear

***「export.py」で得られた ONNXファイルを OpenVINO™ IR に変換 [#o96f0a22]
  参考サイト:→ [['''Object Detection & YOLOs'''>+https://github.com/bethusaisampath/YOLOv5_Openvino]]~
+ ONNXファイルからモデルオプティマイザーを使用してIRファイルに変換できる~
・モデルオプティマイザーを使用して YOLOv5 モデルを変換するときに、IR の出力ノードを指定する必要がある~
・YOLOv5 には 3 つの出力ノードがある~
~
+[[「Netron (Web版)」>+https://netron.app/]] で YOLOv5 ONNX の重みを視覚化する~
・Netronでキーワード「Transpose」を検索して出力ノードを見つける~
・前図赤印 ① の畳み込みノードをダブルクリックし、右のプロパティパネルで、名前「/model.24/m.0/Conv」を読み取ることができる (ノード名をプログラムで列挙する例は後述のスケッチを参照)~
・同様に ②「/model.24/m.1/Conv」, ③「/model.24/m.2/Conv」を得る~
・モデルオプティマイザーの出力ノード パラメーターとして「/model.24/m.0/Conv」「/model.24/m.1/Conv」「/model.24/m.2/Conv」を使用する~
~
+ モデルオプティマイザーを使用してコンバートする~
・「workspace_pylearn/yolov5/」ディレクトリで実行する~
・「yolov5s.onnx」→「yolov5s_v7.xml」「yolov5s_v7.bin」
#codeprettify(){{
(py_learn) mo --input_model yolov5s.onnx --model_name yolov5s_v7 -s 255 --reverse_input_channels --output '/model.24/m.0/Conv','/model.24/m.1/Conv','/model.24/m.2/Conv'
}}
・実行結果~
#codeprettify(){{
(py_learn) mo --input_model yolov5s.onnx --model_name yolov5s_v7 -s 255 --reverse_input_channels --output '/model.24/m.0/Conv','/model.24/m.1/Conv','/model.24/m.2/Conv'
[ INFO ] Generated IR will be compressed to FP16. If you get lower accuracy, please consider disabling compression explicitly by adding argument --compress_to_fp16=False.
Find more information about compression to FP16 at https://docs.openvino.ai/2023.0/openvino_docs_MO_DG_FP16_Compression.html
[ INFO ] MO command line tool is considered as the legacy conversion API as of OpenVINO 2023.2 release. Please use OpenVINO Model Converter (OVC). OVC represents a lightweight alternative of MO and provides simplified model conversion API.
Find more information about transition from MO to OVC at https://docs.openvino.ai/2023.2/openvino_docs_OV_Converter_UG_prepare_model_convert_model_MO_OVC_transition.html
[ SUCCESS ] Generated IR version 11 model.
[ SUCCESS ] XML file: C:\anaconda_win\workspace_pylearn\yolov5\yolov5s_v7.xml
[ SUCCESS ] BIN file: C:\anaconda_win\workspace_pylearn\yolov5\yolov5s_v7.bin
}}
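・If Netron is not at hand, the three output node names can also be listed from the ONNX graph itself. A minimal sketch (assumes the onnx Python package is installed; run in 「workspace_pylearn/yolov5/」):~
#codeprettify(){{
import onnx

model = onnx.load('yolov5s.onnx')
# Conv nodes of the last detection layer (model.24) -> the names passed to mo via --output
names = [n.name for n in model.graph.node
         if n.op_type == 'Conv' and '/model.24/' in n.name]
print(names)    # expected: ['/model.24/m.0/Conv', '/model.24/m.1/Conv', '/model.24/m.2/Conv']
}}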

- 前項で作成した「yolov5_demo_sync_ov2023x.py」を実行~
・「workspace_pylearn/yolov5_demo/」ディレクトリで実行する~
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023x.py -i ../../Images/cat.jpg -m ../yolov5/yolov5s_v7.xml
}}
・実行結果~
#ref(rev_yolov5_05_m.jpg,right,around,30%,rev_yolov5_05_m.jpg)
#codeprettify(){{
(py_learn) python yolov5_demo_sync_ov2023x.py -i ../../Images/cat.jpg -m ../yolov5/yolov5s_v7.xml

--- YOLO V5 OpenVINO(API 2.0) demoprogram Ver 0.01 ---
OpenCV: 4.9.0
OpenVINO inference_engine: 2024.0.0-14509-34caeefd078-releases/2024/0

 Creating OpenVINO Runtime Core...
 Reading the model: ../yolov5/yolov5s_v7.xml
 Label file  : coco.names_jp
 Input source: ../../Images/cat.jpg
 Starting inference...

 FPS average:      11.90

 Finished.
}}

*** OpenVINO™ API 2.0 対応プログラムを作成 [#zcabfe44]
- プログラム概要~
・サイト [['''YOLOv5_OpenVINO_demo'''>+https://github.com/violet17/yolov5_demo?tab=readme-ov-file]] のサンプルプログラムを参考に、物体認識プログラムを作成する~
 (修正済み プロジェクト・パッケージ「update_20240405.zip」に同梱)~
・入力ソースとして、単一の静止画/動画ファイル指定・カメラ(0/1)、が選べるようにする~
・OpenVINO™ API 2.0 に準拠する (基本的な推論の流れは下記のスケッチを参照)~
~
#ref(car-person_o_s.jpg,left,around,30%,car-person_o_s.jpg)
#ref(desk-image_o_m.jpg,left,around,19%,desk-image_o_m.jpg)
#ref(photo1_o_s.jpg,left,around,30%,photo1_o_s.jpg)
#clear
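- Reference: the OpenVINO API 2.0 flow used by「yolov5_OV2.py」(read_model → compile_model → infer_new_request) reduced to a minimal sketch. Illustrative only: the model path and test image are taken from the examples on this page, preprocessing is a plain resize instead of the letterbox used by the real program, and the raw outputs are only printed, not decoded.~
#codeprettify(){{
import cv2
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model('yolov5s_v7.xml')             # IR converted above
compiled = core.compile_model(model, device_name='CPU')

n, c, h, w = model.inputs[0].shape                    # e.g. [1, 3, 640, 640]
frame = cv2.imread('../../Images/bus.jpg')
blob = cv2.resize(frame, (w, h)).transpose(2, 0, 1).reshape(n, c, h, w).astype(np.float32)

results = compiled.infer_new_request({0: blob})       # {output node: ndarray}
for out, data in results.items():
    print(out.get_any_name(), data.shape)
}}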

- プロジェクトの実行ディレクトリ~
~
&color(white,black){'' Windows の場合 ''};
#codeprettify(){{
(py_learn) PS > cd /anaconda_win/workspace_pylearn/yolov5
}}
&color(white,black){'' Linux の場合 ''};
#codeprettify(){{
(py_learn) $ cd ~/workspace_pylearn/yolov5
}}

- 実行手順~
・コマンドラインから起動する~
#codeprettify(){{
(py_learn) python yolov5_OV2.py
}}
・コマンドライン引数
|コマンドオプション|デフォールト設定|意味|h
|-h, --help|CENTER:-|ヘルプ表示|
|BGCOLOR(lightyellow):-i, --input|BGCOLOR(lightyellow):CENTER:cam|BGCOLOR(lightyellow):カメラ(cam/cam0~cam9)または動画・静止画像ファイル ※|
|BGCOLOR(lightyellow):-m, --model|BGCOLOR(lightyellow):CENTER:yolov5s_v7.xml|BGCOLOR(lightyellow):学習済みモデル(IR)|
|BGCOLOR(lightyellow):-d, --device|BGCOLOR(lightyellow):CENTER:CPU|BGCOLOR(lightyellow):デバイス指定 (CPU/GPU/MYRIAD)|
|--labels|CENTER:coco.names_jp|ラベル・ファイル|
|-t, --prob_threshold|CENTER:0.5|クラス判定の閾値 (数値が小さい程オブジェクトは増えるが、ノイズも増える)|
|-iout, --iou_threshold|CENTER:0.4|Intersection Over Union(検出領域が重なっている割合、数値が大きいほど重なり度合いが高い。計算例は後述)|
|--titlef|CENTER:y|タイトル表示 (y/n)|
|--speedf|CENTER:y|スピード計測表示 (y/n)|
|-o, --out|CENTER:non|処理結果を出力する場合のファイル名|
 ※ 入力ソースの指定~
   ・cam    :カメラ入力~
   ・ファイルパス:動画ファイル(.mp4) / 静止画ファイル(.jpg, .png, .bmp, ....)~
#codeprettify(){{
(py_learn) python yolov5_OV2.py -h
usage: yolov5_OV2.py [-h] [-i INPUT] [-m MODEL] [-d DEVICE] [--labels LABELS] [-t PROB_THRESHOLD]
                     [-iout IOU_THRESHOLD] [--titlef TITLE] [--speedf SPEED] [-o IMAGE_OUT]

Options:
  -h, --help            Show this help message and exit.
  -i INPUT, --input INPUT
                        Required. Path to an image/video file. (Specify 'cam','cam0','cam1')
  -m MODEL, --model MODEL
                        Required. Path to an .xml file with a trained model.
  -d DEVICE, --device DEVICE
                        Optional. Specify the target device to infer on; CPU, GPU, FPGA, HDDL or MYRIAD is acceptable.
                        The sample will look for a suitable plugin for device specified. Default value is CPU
  --labels LABELS       Optional. Labels mapping file
  -t PROB_THRESHOLD, --prob_threshold PROB_THRESHOLD
                        Optional. Probability threshold for detections filtering
  -iout IOU_THRESHOLD, --iou_threshold IOU_THRESHOLD
                        Optional. Intersection over union threshold for overlapping detections filtering
  --titlef TITLE        Program title flag.(y/n) Default value is 'y'
  --speedf SPEED        Speed display flag.(y/n) Default value is 'y'
  -o IMAGE_OUT, --out IMAGE_OUT
                        Output image file path. Default value is 'non'
}}
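・As a quick illustration of -iout / --iou_threshold: the sketch below computes IoU the same way as the intersection_over_union() helper in the demo source. With the default threshold 0.4, the lower-confidence box is suppressed only when the IoU of two overlapping boxes exceeds 0.4.~
#codeprettify(){{
def iou(a, b):
    # a, b: (xmin, ymin, xmax, ymax)
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    inter = w * h if w > 0 and h > 0 else 0
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

print(iou((0, 0, 100, 100), (50, 0, 150, 100)))    # 0.333... -> below 0.4, both boxes are kept
}}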

- 実行例~
・入力パラメータなしの場合~
#codeprettify(){{
(py_learn) python yolov5_OV2.py
}}
・実行結果~
#ref(rev_yolov5_06_m.jpg,right,around,30%,rev_yolov5_06_m.jpg)
#codeprettify(){{
(py_learn) python yolov5_OV2.py

YOLO V5 in OpenVINO(API 2.0)  Ver 0.01: Starting application...
   OpenVINO inference_engine: 2024.0.0-14509-34caeefd078-releases/2024/0
   OpenCV version : 4.9.0

   - Input source   :  cam
   - Pretrained     :  yolov5s_v7.xml
   - Label file     :  coco.names_jp
   - Use device     :  CPU
   - prob threshold :  0.5
   - iou  threshold :  0.4

   - Output path    :  non
   - Program Title  :  y
   - Speed flag     :  y

 FPS average:       5.80

 Finished.
}}
・カメラ・デバイス1を使う場合~
#codeprettify(){{
(py_learn) python yolov5_OV2.py -i cam1
}}
・動画ファイル指定の場合~
#codeprettify(){{
(py_learn) python yolov5_OV2.py -i ../../Videos/car1_m.mp4
}}
#codeprettify(){{
(py_learn) python yolov5_OV2.py -i ../../Videos/car2_m.mp4
}}
・静止画ファイル指定の場合~
#codeprettify(){{
(py_learn) python yolov5_OV2.py -i ../../Images/desk-image.jpg
}}
#codeprettify(){{
(py_learn) python yolov5_OV2.py -i ../../Images/car-person.jpg
}}
#codeprettify(){{
(py_learn) python yolov5_OV2.py -i ../../Images/bus.jpg
}}
#codeprettify(){{
(py_learn) python yolov5_OV2.py -i ../../Images/zidane.jpg
}}
- ソースコード~
・ファイルの場所 /workspace_pylearn/yolov5
#divregion(「yolov5_OV2.py」)
#codeprettify(){{
# -*- coding: utf-8 -*-
##------------------------------------------
## YOLO V5 in OpenVINO(API 2.0)  Ver 0.03
##
##               2024.03.31 Masahiro Izutsu
##------------------------------------------
## yolov5_OV2.py  (original: yolov5_demo_sync_ov2023.py)
##  Ver. 0.02   2024/04/09  classID=119 まで対応
##  Ver. 0.03   2024/04/15  カメラ入力(cam0-cam9)

# インポート処理
from __future__ import print_function, division

import logging
import os
import sys
from argparse import ArgumentParser, SUPPRESS
from math import exp as exp
from time import time
import numpy as np

import cv2
from openvino.preprocess import PrePostProcessor, ResizeAlgorithm
from openvino.runtime import Core, Layout, Type
import openvino.runtime as ov

import my_puttext
import my_color80
import my_fps

# Color Escape Code
GREEN = '\033[1;32m'
RED = '\033[1;31m'
NOCOLOR = '\033[0m'
YELLOW = '\033[1;33m'
CYAN = '\033[1;36m'

# 定数定義
TEXT_COLOR = my_color80.CR_white
DEF_MODEL_PATH = os.path.expanduser('yolov5s_v7.xml')
DEF_LABEL_PATH = os.path.expanduser('coco.names_jp')
DEF_INPUT_PATH = os.path.expanduser('cam')

# タイトル・バージョン情報
title = 'YOLO V5 in OpenVINO(API 2.0)  Ver 0.03'

#logging.basicConfig(format="[ %(levelname)s ] %(message)s", level=logging.INFO, stream=sys.stdout)
#log = logging.getLogger()

def build_argparser():
    parser = ArgumentParser(add_help=False)
    args = parser.add_argument_group('Options')
    args.add_argument('-h', '--help', action = 'help', default = SUPPRESS, help = 'Show this help message and exit.')
    args.add_argument("-i", "--input", default = DEF_INPUT_PATH, type=str,
                        help="Required. Path to an image/video file. (Specify 'cam','cam0','cam1')")
    args.add_argument("-m", "--model", help = "Required. Path to an .xml file with a trained model.",
                        default=DEF_MODEL_PATH, type=str)
    args.add_argument("-d", "--device", default = "CPU", type = str,
                        help="Optional. Specify the target device to infer on; CPU, GPU, FPGA, HDDL or MYRIAD is"
                           " acceptable. The sample will look for a suitable plugin for device specified. "
                           "Default value is CPU")
    args.add_argument("--labels", help = "Optional. Labels mapping file", default = DEF_LABEL_PATH, type = str)
    args.add_argument("-t", "--prob_threshold", default = 0.5, type = float,
                        help = "Optional. Probability threshold for detections filtering")
    args.add_argument("-iout", "--iou_threshold", default=0.4, type=float,
                        help="Optional. Intersection over union threshold for overlapping detections filtering")
    args.add_argument('--titlef', metavar = 'TITLE', default = 'y',
                        help = 'Program title flag.(y/n) Default value is \'y\'')
    args.add_argument('--speedf', metavar = 'SPEED', default = 'y',
                        help = 'Speed display flag.(y/n) Default value is \'y\'')
    args.add_argument('-o', '--out', metavar = 'IMAGE_OUT', default = 'non',
                        help = 'Output image file path. Default value is \'non\'')
    return parser

# 基本情報の表示
def display_info(args):
    print('\n' + GREEN + title + ': Starting application...' + NOCOLOR)
    print("   OpenVINO inference_engine:", ov.get_version())
    print('   OpenCV version :',cv2.__version__)
    print('\n   - ' + YELLOW + 'Input source   : ' + NOCOLOR, args.input)
    print('   - ' + YELLOW + 'Pretrained     : ' + NOCOLOR, args.model)
    print('   - ' + YELLOW + 'Label file     : ' + NOCOLOR, args.labels)
    print('   - ' + YELLOW + 'Use device     : ' + NOCOLOR, args.device)
    print('   - ' + YELLOW + 'prob threshold : ' + NOCOLOR, args.prob_threshold)
    print('   - ' + YELLOW + 'iou  threshold : ' + NOCOLOR, args.iou_threshold, '\n')

    print('   - ' + YELLOW + 'Output path    : ' + NOCOLOR, args.out)
    print('   - ' + YELLOW + 'Program Title  : ' + NOCOLOR, args.titlef)
    print('   - ' + YELLOW + 'Speed flag     : ' + NOCOLOR, args.speedf)


class YoloParams:
    # ------------------------------------------- Extracting layer parameters ------------------------------------------
    # Magic numbers are copied from yolo samples
    def __init__(self,  side):
        self.num = 3 #if 'num' not in param else int(param['num'])
        self.coords = 4 #if 'coords' not in param else int(param['coords'])
        self.classes = 80 #if 'classes' not in param else int(param['classes'])
        self.side = side
        self.anchors = [10.0, 13.0, 16.0, 30.0, 33.0, 23.0, 30.0, 61.0, 62.0, 45.0, 59.0, 119.0, 116.0, 90.0, 156.0,
                        198.0,
                        373.0, 326.0] #if 'anchors' not in param else [float(a) for a in param['anchors'].split(',')]

    def log_params(self):
        params_to_print = {'classes': self.classes, 'num': self.num, 'coords': self.coords, 'anchors': self.anchors}
        [log.info("         {:8}: {}".format(param_name, param)) for param_name, param in params_to_print.items()]


def letterbox(img, size=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True):
    # Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232
    shape = img.shape[:2]  # current shape [height, width]
    w, h = size

    # Scale ratio (new / old)
    r = min(h / shape[0], w / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better test mAP)
        r = min(r, 1.0)

    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = w - new_unpad[0], h - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, 64), np.mod(dh, 64)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (w, h)
        ratio = w / shape[1], h / shape[0]  # width, height ratios

    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border

    top2, bottom2, left2, right2 = 0, 0, 0, 0
    if img.shape[0] != h:
        top2 = (h - img.shape[0])//2
        bottom2 = top2
        img = cv2.copyMakeBorder(img, top2, bottom2, left2, right2, cv2.BORDER_CONSTANT, value=color)  # add border
    elif img.shape[1] != w:
        left2 = (w - img.shape[1])//2
        right2 = left2
        img = cv2.copyMakeBorder(img, top2, bottom2, left2, right2, cv2.BORDER_CONSTANT, value=color)  # add border
    return img


def scale_bbox(x, y, height, width, class_id, confidence, im_h, im_w, resized_im_h=640, resized_im_w=640):
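    # Map a detection from the letterboxed/resized image back to the original image:
    # remove the symmetric padding, divide by the resize gain, then clip to the image size.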
    gain = min(resized_im_w / im_w, resized_im_h / im_h)  # gain  = old / new
    pad = (resized_im_w - im_w * gain) / 2, (resized_im_h - im_h * gain) / 2  # wh padding
    x = int((x - pad[0])/gain)
    y = int((y - pad[1])/gain)

    w = int(width/gain)
    h = int(height/gain)

    xmin = max(0, int(x - w / 2))
    ymin = max(0, int(y - h / 2))
    xmax = min(im_w, int(xmin + w))
    ymax = min(im_h, int(ymin + h))
    # Method item() used here to convert NumPy types to native types for compatibility with functions, which don't
    # support Numpy types (e.g., cv2.rectangle doesn't support int64 in color parameter)
    return dict(xmin=xmin, xmax=xmax, ymin=ymin, ymax=ymax, class_id=class_id.item(), confidence=confidence.item())


def entry_index(side, coord, classes, location, entry):
    side_power_2 = side ** 2
    n = location // side_power_2
    loc = location % side_power_2
    return int(side_power_2 * (n * (coord + classes + 1) + entry) + loc)


def parse_yolo_region(blob, resized_image_shape, original_im_shape, params, threshold):
    # --- Validating output parameters ---
    out_blob_n, out_blob_c, out_blob_h, out_blob_w = blob.shape
    predictions = 1.0/(1.0+np.exp(np.zeros(blob.shape)-blob)) 

    assert out_blob_w == out_blob_h, "Invalid size of output blob. It should be in NCHW layout and height should " \
                                     "be equal to width. Current height = {}, current width = {}" \
                                     "".format(out_blob_h, out_blob_w)

    # --- Extracting layer parameters ---
    orig_im_h, orig_im_w = original_im_shape
    resized_image_h, resized_image_w = resized_image_shape
    objects = list()

    side_square = params.side * params.side

    # --- Parsing YOLO Region output ---
    bbox_size = int(out_blob_c/params.num) #4+1+num_classes

    for row, col, n in np.ndindex(params.side, params.side, params.num):
        bbox = predictions[0, n*bbox_size:(n+1)*bbox_size, row, col]
        
        x, y, width, height, object_probability = bbox[:5]
        class_probabilities = bbox[5:]
        if object_probability < threshold:
            continue
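        # YOLOv5 decode (predictions were already passed through a sigmoid above):
        # centre = (2*p - 0.5 + grid_cell) * stride, size = (2*p)**2 * anchor.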
        x = (2*x - 0.5 + col)*(resized_image_w/out_blob_w)
        y = (2*y - 0.5 + row)*(resized_image_h/out_blob_h)
        if int(resized_image_w/out_blob_w) == 8 and int(resized_image_h/out_blob_h) == 8:      # stride 8: 80x80 grid
            idx = 0
        elif int(resized_image_w/out_blob_w) == 16 and int(resized_image_h/out_blob_h) == 16:  # stride 16: 40x40 grid
            idx = 1
        elif int(resized_image_w/out_blob_w) == 32 and int(resized_image_h/out_blob_h) == 32:  # stride 32: 20x20 grid
            idx = 2

        width = (2*width)**2* params.anchors[idx * 6 + 2 * n]
        height = (2*height)**2 * params.anchors[idx * 6 + 2 * n + 1]
        class_id = np.argmax(class_probabilities)
        confidence = object_probability
        objects.append(scale_bbox(x=x, y=y, height=height, width=width, class_id=class_id, confidence=confidence,
                                  im_h=orig_im_h, im_w=orig_im_w, resized_im_h=resized_image_h, resized_im_w=resized_image_w))
    return objects


def intersection_over_union(box_1, box_2):
    width_of_overlap_area = min(box_1['xmax'], box_2['xmax']) - max(box_1['xmin'], box_2['xmin'])
    height_of_overlap_area = min(box_1['ymax'], box_2['ymax']) - max(box_1['ymin'], box_2['ymin'])
    if width_of_overlap_area < 0 or height_of_overlap_area < 0:
        area_of_overlap = 0
    else:
        area_of_overlap = width_of_overlap_area * height_of_overlap_area
    box_1_area = (box_1['ymax'] - box_1['ymin']) * (box_1['xmax'] - box_1['xmin'])
    box_2_area = (box_2['ymax'] - box_2['ymin']) * (box_2['xmax'] - box_2['xmin'])
    area_of_union = box_1_area + box_2_area - area_of_overlap
    if area_of_union == 0:
        return 0
    return area_of_overlap / area_of_union

# Determine the type of an input file
#   Return value: '.jpg', '.png', ...  image file (the matched extension)
#                 'None'               not an image file (assumed to be a video file)
#                 'NotFound'           file does not exist
def is_pict(filename):
    if not os.path.isfile(filename):
        return 'NotFound'

    types = ['.bmp','.png','.jpg','.jpeg','.JPG','.tif']
    for ss in types:
        if filename.endswith(ss):
            return ss
    return 'None'

def main():
    # 日本語フォント指定
    fontPIL = my_puttext.get_font()

    # 入力パラメータ
    args = build_argparser().parse_args()
    display_info(args)
    outpath = args.out

    # --- 1. Plugin initialization for specified device and load extensions library if specified ---
    core = Core()

    # --- 2. Reading the IR generated by the Model Optimizer (.xml and .bin files) ---
    model = args.model
    model = core.read_model(model)

    assert len(model.inputs) == 1, "Sample supports only single input topologies"

    # --- 4. Preparing inputs ---

    # Read and pre-process input images
    n, c, h, w = model.inputs[0].shape

    # 判定ラベル
    if args.labels and os.path.isfile(args.labels):
        with open(args.labels, 'r', encoding="utf-8") as f:     # 2024/03/18
            labels_map = [x.strip() for x in f]
    else:
        labels_map = None

    # 入力 cam/cam0-cam9 対応                               # 2024/04/15
    input_stream = args.input
    if input_stream.find('cam') == 0 and len(input_stream) < 5:
        input_stream = 0 if input_stream == 'cam' else int(input_stream[3])
        isstream = True
    else:
        filetype = is_pict(input_stream)
        isstream = filetype == 'None'
        if (filetype == 'NotFound'):
            print(RED + "\ninput file Not found." + NOCOLOR)
            quit()

    # 入力準備
    cap = cv2.VideoCapture(input_stream)
    number_input_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    number_input_frames = 1 if number_input_frames != -1 and number_input_frames < 0 else number_input_frames

    wait_key_code = 1

    # Number of frames in picture is 1 and this will be read in cycle. Sync mode is default value for this case
    if number_input_frames != 1:
        ret, frame = cap.read()
    else:
        wait_key_code = 0

    # --- 5. Loading model to the plugin ---
    compiled_model = core.compile_model(model, device_name=args.device)

    parsing_time = 0

    # --- 6. Doing inference ---

    # 処理結果の記録 step1
    if (outpath != 'non'):
        if (isstream):
            fps = int(cap.get(cv2.CAP_PROP_FPS))
            out_w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
            out_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
            fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
            outvideo = cv2.VideoWriter(outpath, fourcc, fps, (out_w, out_h))

    # 計測値初期化
    fpsWithTick = my_fps.fpsWithTick()
    fpsWithTick.get()                                            # fps計測開始

    # メインループ 
    while cap.isOpened():

        ret, frame = cap.read()
        if not ret:
            break
        in_frame = letterbox(frame, (w, h))

        in_frame0 = in_frame
        # resize input_frame to network size
        in_frame = in_frame.transpose((2, 0, 1))  # Change data layout from HWC to CHW
        in_frame = in_frame.reshape((n, c, h, w))

        # Start inference
        start_time = time()
        results = compiled_model.infer_new_request({0: in_frame})
        det_time = time() - start_time

        objects = list()
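        # One output blob per detection head (80x80 / 40x40 / 20x20 grids for a
        # 640x640 input); decode each head and collect the candidate boxes.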
        for idx in range(len(results)):
            out_blob = results[idx]
            layer_params = YoloParams(side=out_blob.shape[2])

            objects += parse_yolo_region(out_blob, in_frame.shape[2:],
                                            frame.shape[:-1], layer_params,
                                            args.prob_threshold)
            parsing_time = time() - start_time

        # Filtering overlapping boxes with respect to the --iou_threshold CLI parameter
        objects = sorted(objects, key=lambda obj : obj['confidence'], reverse=True)
        for i in range(len(objects)):
            if objects[i]['confidence'] == 0:
                continue
            for j in range(i + 1, len(objects)):
                if intersection_over_union(objects[i], objects[j]) > args.iou_threshold:
                    objects[j]['confidence'] = 0

        # Drawing objects with respect to the --prob_threshold CLI parameter
        objects = [obj for obj in objects if obj['confidence'] >= args.prob_threshold]

        origin_im_size = frame.shape[:-1]
        for obj in objects:
            # Validation bbox of detected object
            if obj['xmax'] > origin_im_size[1] or obj['ymax'] > origin_im_size[0] or obj['xmin'] < 0 or obj['ymin'] < 0:
                continue

            # オブジェクト別の色指定
            color_id = obj['class_id'] if obj['class_id'] < 80 else obj['class_id'] - 40    # 2024/04/09
            BOX_COLOR = my_color80.get_boder_bgr80(color_id)
            LABEL_BG_COLOR = my_color80.get_back_bgr80(color_id)

            det_label = labels_map[obj['class_id']] if labels_map and len(labels_map) >= obj['class_id'] else \
                str(obj['class_id'])

            # ラベル描画領域を得る
            x0,y0,x1,y1 = my_puttext.cv2_putText(img = frame,
                   text = det_label + ' ' + str(round(obj['confidence'] * 100, 1)) + ' %',
                   org = (obj['xmin'], obj['ymin'] - 7), fontFace = fontPIL,
                   fontScale = 14,
                   color = TEXT_COLOR,
                   mode = 0,
                   areaf=True)
            xx = obj['xmax'] if obj['xmax'] > x1 else x1              # 横が領域を超える場合は超えた値にする
            cv2.rectangle(frame, (obj['xmin'], obj['ymin']-26), (xx, obj['ymin']), LABEL_BG_COLOR, -1)

            my_puttext.cv2_putText(img = frame,
                   text = det_label + ' ' + str(round(obj['confidence'] * 100, 1)) + ' %',
                   org = (obj['xmin'], obj['ymin'] - 7), fontFace = fontPIL,
                   fontScale = 14,
                   color = TEXT_COLOR,
                   mode = 0)

            # 画像に枠を描く
            cv2.rectangle(frame, (obj['xmin'], obj['ymin']), (obj['xmax'], obj['ymax']), BOX_COLOR, 2)

        # FPSを計算する
        fps = fpsWithTick.get()
        st_fps = 'fps: {:>6.2f}'.format(fps)
        if (args.speedf == 'y'):
            cv2.rectangle(frame, (10, 38), (95, 55), (90, 90, 90), -1)
            cv2.putText(frame, st_fps, (15, 50), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.4, color=(255, 255, 255), lineType=cv2.LINE_AA)

        # タイトル描画
        if (args.titlef == 'y'):
            cv2.putText(frame, title, (12, 32), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.8, color=(0, 0, 0), lineType=cv2.LINE_AA)
            cv2.putText(frame, title, (10, 30), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.8, color=(200, 200, 0), lineType=cv2.LINE_AA)

        # 画像表示 
        window_name = title + "  (hit 'q' or 'esc' key to exit)"
        cv2.namedWindow(window_name, flags=cv2.WINDOW_AUTOSIZE | cv2.WINDOW_GUI_NORMAL) 
        cv2.imshow(window_name, frame)

        # 処理結果の記録 step2
        if (outpath != 'non'):
            if (isstream):
                outvideo.write(frame)
            else:
                cv2.imwrite(outpath, frame)

        # 何らかのキーが押されたら終了 
        key = cv2.waitKey(wait_key_code)
        if key == 27 or key == 113:             # 'esc' or 'q'
            break

        # ウインドウのクローズボタン
        if cv2.getWindowProperty(window_name, cv2.WND_PROP_VISIBLE) < 1:        
            print('\n Window close !!')
            break

    cv2.destroyAllWindows()

    print('\n FPS average: {:>10.2f}'.format(fpsWithTick.get_average()))
    print('\n Finished.\n')

if __name__ == '__main__':
    sys.exit(main() or 0)
}}
#enddivregion
・ファイルの場所 /workspace_py37/mylib~
□ [[Python 私的汎用ライブラリ>MyLibrary]]~


*** OpenVINO™ API 2.0 対応プログラム実行速度 [#ha3f5fea]
- 実行プログラム「python yolov5_OV2.py」 (単位:fps) 学習済みモデル「yolov5s_v7.xml」~
|CENTER:240|CENTER:|CENTER:|CENTER:|CENTER:|CENTER:|CENTER:|c
|Machine / OS|>|car_m.mp4|>|car1_m.mp4|>|car2_m.mp4|h
|~|>|#ref(rev_yolov5_03_m.jpg,right,around,12%,rev_yolov5_03_m.jpg)|>|#ref(rev_yolov5_07_m.jpg,right,around,24%,rev_yolov5_07_m.jpg)|>|#ref(rev_yolov5_08_m.jpg,right,around,24%,rev_yolov5_08_m.jpg)|h
|~|GPU|CPU|GPU|CPU|GPU|CPU|h
|'''HP ENVY  windows11'''|BGCOLOR(lightyellow):11.5|11.9|BGCOLOR(lightyellow):12.7|12.4|BGCOLOR(lightyellow):14.2|12.9|
|'''HP ENVY  Ubuntu22.04LTS'''|BGCOLOR(lightyellow):10.6|10.3|BGCOLOR(lightyellow):11.3|10.5|BGCOLOR(lightyellow):12.4|11.0|
|'''DELL XPS  windows11'''|BGCOLOR(lightyellow):10.7|7.4|BGCOLOR(lightyellow):12.0|7.9|BGCOLOR(lightyellow):12.7|8.0|
|'''DELL Latitude  Ubuntu20.04LTS'''|BGCOLOR(lightyellow):9.7|6.5|BGCOLOR(lightyellow):10.7|6.8|BGCOLOR(lightyellow):11.2|7.1|
|'''HP ELITE  windows10'''|BGCOLOR(lightyellow):6.6|4.5|BGCOLOR(lightyellow):7.2|4.7|BGCOLOR(lightyellow):7.7|5.0|
・テストコマンド~
#codeprettify(){{
(py_learn) python yolov5_OV2.py -i ../../Videos/car_m.mp4
(py_learn) python yolov5_OV2.py -i ../../Videos/car_m.mp4 -d GPU
(py_learn) python yolov5_OV2.py -i ../../Videos/car1_m.mp4
(py_learn) python yolov5_OV2.py -i ../../Videos/car1_m.mp4 -d GPU
(py_learn) python yolov5_OV2.py -i ../../Videos/car2_m.mp4
(py_learn) python yolov5_OV2.py -i ../../Videos/car2_m.mp4 -d GPU
}}

- テスト環境(Intel® CPU / GPU)~
|CENTER:機種|CENTER:OS|CENTER:CPU|CENTER:GPU|h
|HP ENVY Desktop TE02-1097jp|Windows11/Ubuntu22.04LTS|13th Gen Core™ i9-13900|UHD Graphics 770|
|DELL XPS Plus 9320 NoteBook|Windows11|12th Gen Core™ i7-1260P|Iris® Xe Graphics|
|DELL Latitude 7520 NoteBook|Ubuntu20.04LTS|11th Gen Core™ i7-1185G7|Iris® Xe Graphics|
|HP EliteDesk 800 G2 SFF|Windows10|6 th Gen Core™ i7-6700|HD Graphics 530|

#br

** 対処したエラー詳細 [#y7abfe2b]
*** '''UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument.''' [#k89c4aad]
- エラー内容~
#codeprettify(){{
(py_test) python detect2_yolov5.py -i ../../Videos/car_m.mp4 -m yolov5x
    :
Using cache found in /home/USER/.cache/torch/hub/ultralytics_yolov5_master
YOLOv5 噫 2021-9-16 torch 2.2.1+cpu CPU
YOLOv5  2021-9-16 torch 2.2.1+cpu CPU

Fusing layers... 
/home/USER/anaconda3/envs/py_learn/lib/python3.11/site-packages/torch/functional.py:507: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3549.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Model Summary: 444 layers, 86705005 parameters, 0 gradients
Adding AutoShape... 
Traceback (most recent call last):
    :
}}
- 対処方法~
1. エラーメッセージからキャッシュデータのディレクトリを調べ削除~
 /home/USER/.cache/torch/hub/ultralytics_yolov5_master
2. 再度実行する~

- コメント~
以前に実行したキャッシュが残っていて、つじつまが合わなくなることがあるらしい~
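If deleting the cache directory by hand is inconvenient, torch.hub can also be told to refresh it. Below is a minimal sketch (assumptions: the same 'ultralytics/yolov5' hub entry point; the 'yolov5s' model and the sample image URL are taken from the official PyTorch Hub example):~
#codeprettify(){{
import torch

# force_reload=True makes torch.hub clone the ultralytics/yolov5 repository again
# instead of reusing the stale cache under ~/.cache/torch/hub/ultralytics_yolov5_master
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)

results = model('https://ultralytics.com/images/zidane.jpg')  # sample image from the official hub example
results.print()
}}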

*** '''ImportError: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found''' [#m268d157]
 【Ubuntu20.04LTSで発生】
- エラー内容~
#codeprettify(){{
(py_test) python detect2.py --source 0
Traceback (most recent call last):
  File "/home/mizutu/workspace_pylearn/yolov5/detect2.py", line 58, in <module>
    :
ImportError: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /home/mizutu/anaconda3/envs/py_test/lib/python3.11/site-packages/cv2/python-3.11/cv2.cpython-311-x86_64-linux-gnu.so)
}}
- 対処方法~
1.「libstdc++.so.6」を調べる~
#codeprettify(){{
$ls -l ~/anaconda3/envs/py_test/lib/
    :
lrwxrwxrwx   1 mizutu mizutu        19  3月 16 17:01 libstdc++.so.6 -> libstdc++.so.6.0.29
-rwxrwxr-x   4 mizutu mizutu  17981480  6月  1  2022 libstdc++.so.6.0.29
    :
}}
#codeprettify(){{
$ ls -l /lib/x86_64-linux-gnu/
    :
lrwxrwxrwx  1 root root       19  7月  9  2023 libstdc++.so.6 -> libstdc++.so.6.0.28
-rw-r--r--  1 root root  1956992  7月  9  2023 libstdc++.so.6.0.28
    :
}}
2.「libstdc++.so.6.0.29」をシステム側にコピーしリンクを再作成~
#codeprettify(){{
$sudo cp /home/mizutu/anaconda3/envs/py_test/lib/libstdc++.so.6.0.29 /lib/x86_64-linux-gnu
$cd /lib/x86_64-linux-gnu
$sudo ln -sb libstdc++.so.6.0.29 libstdc++.so.6
$sudo chmod 644 libstdc++.so.6.0.29
}}
3. ファイルの確認~
#codeprettify(){{
$ ls -l /lib/x86_64-linux-gnu/
    :
lrwxrwxrwx  1 root root       19  3月 17 05:37 libstdc++.so.6 -> libstdc++.so.6.0.29
-rw-r--r--  1 root root  1956992  7月  9  2023 libstdc++.so.6.0.28
-rw-r--r--  1 root root 17981480  3月 17 05:34 libstdc++.so.6.0.29
lrwxrwxrwx  1 root root       19  7月  9  2023 libstdc++.so.6~ -> libstdc++.so.6.0.28
    :
}}
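Whether the replacement is actually visible from Python can be verified with a small sketch like the one below (assumptions: Linux, the default library path shown above, and the binutils "strings" command being installed):~
#codeprettify(){{
import ctypes
import subprocess

# Check that the system libstdc++ now exports the GLIBCXX_3.4.29 version symbol
# required by the cv2 build inside the conda environment.
path = '/lib/x86_64-linux-gnu/libstdc++.so.6'
ctypes.CDLL(path)  # raises OSError if the library itself cannot be loaded
symbols = subprocess.run(['strings', path], capture_output=True, text=True).stdout.split()
print('GLIBCXX_3.4.29' in symbols)  # True once the newer library is in place
}}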
- コメント~
システムのアップデートでシンボリックリンクが書き変わった場合再度リンクを作成する~

- 参考サイト:~
・https://github.com/pybind/pybind11/discussions/3453~
#codeprettify(){{
"This undoes exactly what a virtual environment like anaconda is meant to achieve: not having to replace system libraries in order to satisfy dependencies."
}}

#br

** 更新履歴 [#l98a3f0e]
- 2024/03/12 初版
- 2024/04/08 内容を全面的に更新

#br

* 参考資料 [#nbfb082e]
- [[物体検出アルゴリズム「YOLO V5」>+https://izutsu.aa0.netvolante.jp/pukiwiki/?YOLOv5]](旧版)~

- YOLO V5~
-- [[YOLOV5 By Ultralytics>+https://pytorch.org/hub/ultralytics_yolov5/]]~
-- [[''Load YOLOv5 with PyTorch Hub''>+https://docs.ultralytics.com/yolov5/tutorials/pytorch_hub_model_loading/]]~
-- [[''PyTorch Hub -Ultralytics YOLOv8 ドキュメント''>+https://docs.ultralytics.com/ja/yolov5/tutorials/pytorch_hub_model_loading/]]~
-- [[''NVIDIA の GPU に最適化された YOLOv5 の実装で物体検出アプリケーションを高速化する''>+https://developer.nvidia.com/ja-jp/blog/nvidia-yolo-v5-gpu-optimization/]]~

-- [[yolov5のモデルをオフラインで使用する>+https://qiita.com/Decwest/items/6ef2383787baa7b83143]]~
-- [[【PyTorch】TorchHub+YOLOv5でリアルタイム認識して遊ぶ>+https://zenn.dev/opamp/articles/b4005309740fa6]]~
-- [[PytorchでMobileNetV2を使うときに詰まったこと>+https://zenn.dev/kmiura55/articles/pytorch-use-mobilenetv2]] ''(パラメータ pretrained=True の意味)''~
-- [[YOLOv5 で物体検出をしてみよう>+https://rinsaka.com/python/yolov5/index.html]]~

-- [[物体検出,物体検出のための追加学習の実行(YOLOv5,PyTorch,Python を使用)(Windows 上)>+https://www.kkaneko.jp/ai/win/yolov5.html]]~
-- [[YOLOv5で実装する物体検出入門|第5回:PyTorch hubまとめとYouTube動画の物体検出>+https://tt-tsukumochi.com/archives/1933]]~

-- [[【やってみた】ONNX・OpenVINOでYOLOv5の高速化!>+https://kdl-di.hatenablog.com/entry/2023/01/30/090000]]~
-- [[YOLOv5を利用した学習と物体検出>+https://qiita.com/shinya_sun_sun/items/61205f83ea4873c0993d]]~
-- [[YOLO V5 の使い方>+https://note.com/npaka/n/n371912b48ee2]]~
-- [[YOLOv5を使った物体検出>+https://www.alpha.co.jp/blog/202108_02]]~
-- [[【YOLO V5】AIでじゃんけん検出>+https://qiita.com/PoodleMaster/items/5f2cc3248c03b03821b8]]~

- YOLOv5のONNXへのエクスポート~
-- [[YOLOv5 : 物体検出の最新モデル>+https://medium.com/axinc/yolov5-%E7%89%A9%E4%BD%93%E6%A4%9C%E5%87%BA%E3%81%AE%E6%9C%80%E6%96%B0%E3%83%A2%E3%83%87%E3%83%AB-5b7316d1e54d]]~
-- [[convert yolov5 to openvino #891>+https://github.com/ultralytics/yolov5/issues/891]]~
-- [[YOLOv5_OpenVINO_demo>+https://github.com/violet17/yolov5_demo]]~
-- [[物体検知モデルを実用向けに速度チューニングする>+https://www.ariseanalytics.com/activities/report/20210521/]]~

- Python の基本~
-- [[Pythonのpprintの使い方(リストや辞書を整形して出力)>+https://note.nkmk.me/python-pprint-pretty-print/]]~
-- [[Python オブジェクトの属性と中身の確認方法>+https://trends.codecamp.jp/blogs/media/column337]]~
-- [[オブジェクトの中身を確認したい場合に試すこと(Python)>+https://zenn.dev/ynakashi/articles/15b2b7c0a3cd89]]~
-- [[pandas.DataFrameの構造とその作成方法>+https://note.nkmk.me/python-pandas-dataframe-values-columns-index/]]~
-- [[Python 'タプル(tuple)'>+https://www.python.jp/train/tuple/index.html]]~

- OpenVINO™~
-- [[OpenVINO™ Python API Exclusives>+https://docs.openvino.ai/2024/openvino-workflow/running-inference/integrate-openvino-with-your-application/python-api-exclusives.html]]~
-- [[openvino.runtime.CompiledModel>+https://docs.openvino.ai/2024/api/ie_python_api/_autosummary/openvino.runtime.CompiledModel.html]]~
-- [[AE2100 OpenVINO API2.0移行ガイド>+https://qiita.com/TWAT/items/38f1bb8fb4fc42d4fd3b]]~
-- [['''YOLOv5_OpenVINO_demo'''>+https://github.com/violet17/yolov5_demo?tab=readme-ov-file]]~
-- [['''GitHub: ultralytics/yolov5 V3'''>+https://github.com/ultralytics/yolov5/releases/tag/v3.0]]~
-- [['''Object Detection & YOLOs'''>+https://github.com/bethusaisampath/YOLOv5_Openvino]]~

- Netron~
-- [[Netron>+https://netron.app/]] (Web版)~
-- [[GitHub: Netron>+https://github.com/lutzroeder/netron#models]]~
-- [[機械学習モデル可視化ツール「Netron」を使ってみる>+https://developer.mamezou-tech.com/blogs/2023/02/06/ml-model-visualizer-netron/]]~
-- [[【ONNX / PyTorch】機械学習モデルをグラフィカルに可視化するnetronを使ってみた>+https://yiskw713.hatenablog.com/entry/2022/01/17/223548]]~


#br