Running inference with "detect3_number.py"
・Specify the trained models (set the log level to '0' to display the recognition steps)
(py_learn) python detect3_number.py --log 0
・Execution log
(py_learn) python detect3_number.py --log 0
Starting..
Number plate detection YOLOv5 Ver. 0.08: Starting application...
OpenCV version : 4.9.0
- Image File : ../number/test_data/japan78.jpg
- YOLO v5 : ultralytics/yolov5
- Pretrained : ./runs/train/vd_yolov5s_ep100/weights/best.pt
- Pretrained 2 : ./runs/train/nm_yolov5s_ep50/weights/best.pt
- Confidence lv: 0.25
- Label file : ./data/nm_dataset/vd_names_jp
- Program Title: y
- Speed flag : y
- Processed out: non
- Use device : cuda:0
- Log Level : 0
Using cache found in C:\Users\izuts/.cache\torch\hub\ultralytics_yolov5_master
YOLOv5 2024-4-9 Python-3.11.8 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Ti, 12282MiB)
Fusing layers...
Model summary: 157 layers, 7015519 parameters, 0 gradients, 15.8 GFLOPs
Adding AutoShape...
Using cache found in C:\Users\izuts/.cache\torch\hub\ultralytics_yolov5_master
YOLOv5 2024-4-9 Python-3.11.8 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Ti, 12282MiB)
Fusing layers...
Model summary: 157 layers, 7182733 parameters, 0 gradients, 16.3 GFLOPs
Adding AutoShape...
** Plate labels:
['ナンバー', 'ナンバープレート', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'あ', 'い', 'う', 'え', 'か', 'き', 'く', 'け', 'こ', 'さ', 'す', 'せ', 'そ', 'た', 'ち', 'つ', 'て', 'と', 'な', 'に', 'ぬ', 'ね', 'の', 'は', 'ひ', 'ふ', 'ほ', 'ま', 'み', 'む', 'め', 'も', 'や', 'ゆ', 'よ', 'ら', 'り', 'る', 'れ', 'ろ', 'わ', 'を', '京都', 'なにわ', '大阪', '和泉', '堺', '奈良', '和歌山', '神戸', '徳島', '香川', '愛媛', '千葉']
** Bounding Box:
[[ 61.591 38.322 510.12 262.66 0.4888 0]
[ 352.12 59.349 392.61 129.97 0.98521 2]
[ 315.72 59.296 354.99 131.48 0.98443 3]
[ 391.33 60.289 431.3 129.43 0.98215 2]
[ 344.3 148.63 433.98 289.92 0.9785 7]
[ 426.44 146.62 513.85 303.11 0.96134 10]
[ 84.134 152.66 159.09 231.37 0.58646 22]
[ 1.3103 181.38 54.892 325.32 0.57503 60]
[ 156.16 49.026 321.75 166.41 0.35428 55]]
** Sorted ascending by minimum Y:
[[ 61.591 38.322 510.12 262.66 0.4888 0]
[ 156.16 49.026 321.75 166.41 0.35428 55]
[ 315.72 59.296 354.99 131.48 0.98443 3]
[ 352.12 59.349 392.61 129.97 0.98521 2]
[ 391.33 60.289 431.3 129.43 0.98215 2]
[ 426.44 146.62 513.85 303.11 0.96134 10]
[ 344.3 148.63 433.98 289.92 0.9785 7]
[ 84.134 152.66 159.09 231.37 0.58646 22]
[ 1.3103 181.38 54.892 325.32 0.57503 60]]
** Upper/lower split point:
bbox = (61.59,38.32)-(510.12,262.66) ylimit = 106.30 index = 5
** Upper part of the plate:
[[ 61.591 38.322 510.12 262.66 0.4888 0]
[ 156.16 49.026 321.75 166.41 0.35428 55]
[ 315.72 59.296 354.99 131.48 0.98443 3]
[ 352.12 59.349 392.61 129.97 0.98521 2]
[ 391.33 60.289 431.3 129.43 0.98215 2]]
** Lower part of the plate:
[[ 426.44 146.62 513.85 303.11 0.96134 10]
[ 344.3 148.63 433.98 289.92 0.9785 7]
[ 84.134 152.66 159.09 231.37 0.58646 22]
[ 1.3103 181.38 54.892 325.32 0.57503 60]]
** Upper part sorted ascending by minimum X:
[[ 61.591 38.322 510.12 262.66 0.4888 0]
[ 156.16 49.026 321.75 166.41 0.35428 55]
[ 315.72 59.296 354.99 131.48 0.98443 3]
[ 352.12 59.349 392.61 129.97 0.98521 2]
[ 391.33 60.289 431.3 129.43 0.98215 2]]
** Lower part sorted ascending by minimum X:
[[ 1.3103 181.38 54.892 325.32 0.57503 60]
[ 84.134 152.66 159.09 231.37 0.58646 22]
[ 344.3 148.63 433.98 289.92 0.9785 7]
[ 426.44 146.62 513.85 303.11 0.96134 10]]
** Upper and lower parts merged:
[[ 61.591 38.322 510.12 262.66 0.4888 0]
[ 156.16 49.026 321.75 166.41 0.35428 55]
[ 315.72 59.296 354.99 131.48 0.98443 3]
[ 352.12 59.349 392.61 129.97 0.98521 2]
[ 391.33 60.289 431.3 129.43 0.98215 2]
[ 1.3103 181.38 54.892 325.32 0.57503 60]
[ 84.134 152.66 159.09 231.37 0.58646 22]
[ 344.3 148.63 433.98 289.92 0.9785 7]
[ 426.44 146.62 513.85 303.11 0.96134 10]]
** Plate analysis result: なにわ100 す58
FPS average: 7.80
Finished.
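The trace above shows the ordering step at the heart of the script: detections are sorted by their minimum Y, split into the plate's upper and lower rows at a Y threshold derived from the plate box, each row is sorted by minimum X, and the rows are concatenated so the labels can be read off in natural order. Below is a minimal standalone sketch of that step (the helper name order_plate_boxes is ours, not the original code; rows are assumed to be [xmin, ymin, xmax, ymax, conf, class_id] with the plate box sorting first).

import numpy as np

def order_plate_boxes(bbox, y_ratio=50/165):
    """Order detections into reading order: upper row left-to-right, then lower row."""
    ys = bbox[np.argsort(bbox[:, 1])]          # ascending by minimum Y
    x0, y0, x1, y1 = ys[0, :4]                 # the outer plate box
    ylimit = (y1 - y0) * y_ratio + y0          # boundary between the two rows
    up_n = int(np.sum(ys[:, 1] < ylimit))      # boxes belonging to the upper row
    up, dn = np.split(ys, [up_n])
    up = up[np.argsort(up[:, 0])]              # each row ascending by minimum X
    dn = dn[np.argsort(dn[:, 0])]
    return np.concatenate([up, dn])

Applied to the nine boxes in the log, this yields the merged order that produces "なにわ100 す58".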
Running inference with "detect_number.py"
・Specify the trained models (set the log level to '0' to display the recognition steps)
(py_learn) python detect_number.py --log 0
・Execution log
(py_learn) python detect_number.py --log 0
Starting..
Number plate detection Ver. 0.09: Starting application...
OpenCV version : 4.9.0
- Image File : ../number/test_data/japan74.jpg
- YOLO v5 : ultralytics/yolov5
- Pretrained : ./runs/train/vd_yolov5s_ep100/weights/best.pt
- Pretrained 2 : ./runs/train/nm_yolov5s_ep50/weights/best.pt
- Confidence lv: 0.4
- Label file : ./data/nm_dataset/names_jp
- Program Title: y
- Speed flag : y
- Processed out: non
- Use device : cuda:0
- Log Level : 0
Using cache found in C:\Users\izuts/.cache\torch\hub\ultralytics_yolov5_master
YOLOv5 2024-4-9 Python-3.11.8 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Ti, 12282MiB)
Fusing layers...
Model summary: 157 layers, 7015519 parameters, 0 gradients, 15.8 GFLOPs
Adding AutoShape...
** Plate labels:
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'あ', 'い', 'う', 'え', 'か', 'き', 'く', 'け', 'こ', 'さ', 'す', 'せ', 'そ', 'た', 'ち', 'つ', 'て', 'と', 'な', 'に', 'ぬ', 'ね', 'の', 'は', 'ひ', 'ふ', 'ほ', 'ま', 'み', 'む', 'め', 'も', 'や', 'ゆ', 'よ', 'ら', 'り', 'る', 'れ', 'ろ', 'わ', 'を', '京都', 'なにわ', '大阪', '和泉', '堺', '奈良', '和歌山', '神戸', '徳島', '香川', '愛媛', '千葉']
Using cache found in C:\Users\izuts/.cache\torch\hub\ultralytics_yolov5_master
YOLOv5 2024-4-9 Python-3.11.8 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Ti, 12282MiB)
Fusing layers...
Model summary: 157 layers, 7182733 parameters, 0 gradients, 16.3 GFLOPs
Adding AutoShape...
** Plate Bounding Box:
[[ 168.33 60.04 414.33 173.39 0.85431 0]]
** Plate image Bounding Box:
[[ 174.72 13.63 194.84 47.704 0.98788 0]
[ 155.33 12.984 175.65 47.85 0.98763 0]
[ 136.1 12.444 156.45 47.14 0.98613 3]
[ 92.386 54.675 140.81 113 0.98496 0]
[ 56.594 53.658 97.797 113 0.98449 7]
[ 153.8 56.805 193.96 113 0.98116 4]
[ 193.16 57.049 234.76 113 0.98103 5]
[ 21.854 59.884 55.78 93.946 0.58157 30]
[ 57.435 5.9446 138.67 45.611 0.57542 53]
[ 193.21 0 243.2 11.105 0.41337 2]
[ 61.161 10.435 133.51 45.012 0.38094 54]]
** Sorted ascending by minimum Y:
[[ 193.21 0 243.2 11.105 0.41337 2]
[ 57.435 5.9446 138.67 45.611 0.57542 53]
[ 61.161 10.435 133.51 45.012 0.38094 54]
[ 136.1 12.444 156.45 47.14 0.98613 3]
[ 155.33 12.984 175.65 47.85 0.98763 0]
[ 174.72 13.63 194.84 47.704 0.98788 0]
[ 56.594 53.658 97.797 113 0.98449 7]
[ 92.386 54.675 140.81 113 0.98496 0]
[ 153.8 56.805 193.96 113 0.98116 4]
[ 193.16 57.049 234.76 113 0.98103 5]
[ 21.854 59.884 55.78 93.946 0.58157 30]]
** Upper/lower split point:
bbox = (0.00,0.00)-(246.00,113.00) ylimit = 34.24 index = 6
** Upper part of the plate:
[[ 193.21 0 243.2 11.105 0.41337 2]
[ 57.435 5.9446 138.67 45.611 0.57542 53]
[ 61.161 10.435 133.51 45.012 0.38094 54]
[ 136.1 12.444 156.45 47.14 0.98613 3]
[ 155.33 12.984 175.65 47.85 0.98763 0]
[ 174.72 13.63 194.84 47.704 0.98788 0]]
** Lower part of the plate:
[[ 56.594 53.658 97.797 113 0.98449 7]
[ 92.386 54.675 140.81 113 0.98496 0]
[ 153.8 56.805 193.96 113 0.98116 4]
[ 193.16 57.049 234.76 113 0.98103 5]
[ 21.854 59.884 55.78 93.946 0.58157 30]]
** Upper part sorted ascending by minimum X:
[[ 57.435 5.9446 138.67 45.611 0.57542 53]
[ 61.161 10.435 133.51 45.012 0.38094 54]
[ 136.1 12.444 156.45 47.14 0.98613 3]
[ 155.33 12.984 175.65 47.85 0.98763 0]
[ 174.72 13.63 194.84 47.704 0.98788 0]
[ 193.21 0 243.2 11.105 0.41337 2]]
** Lower part sorted ascending by minimum X:
[[ 21.854 59.884 55.78 93.946 0.58157 30]
[ 56.594 53.658 97.797 113 0.98449 7]
[ 92.386 54.675 140.81 113 0.98496 0]
[ 153.8 56.805 193.96 113 0.98116 4]
[ 193.16 57.049 234.76 113 0.98103 5]]
** Upper and lower parts merged:
[[ 57.435 5.9446 138.67 45.611 0.57542 53]
[ 61.161 10.435 133.51 45.012 0.38094 54]
[ 136.1 12.444 156.45 47.14 0.98613 3]
[ 155.33 12.984 175.65 47.85 0.98763 0]
[ 174.72 13.63 194.84 47.704 0.98788 0]
[ 193.21 0 243.2 11.105 0.41337 2]
[ 21.854 59.884 55.78 93.946 0.58157 30]
[ 56.594 53.658 97.797 113 0.98449 7]
[ 92.386 54.675 140.81 113 0.98496 0]
[ 153.8 56.805 193.96 113 0.98116 4]
[ 193.16 57.049 234.76 113 0.98103 5]]
** Plate analysis result: なにわ300 ぬ7045
FPS average: 5.90
Finished.
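Where detect3_number.py runs both models over the full frame, detect_number.py first detects the plate, crops it, and runs the character model on the crop alone, which is what the "** Plate image Bounding Box" stage above reflects. A minimal sketch of that two-stage flow (the weight paths are placeholders; only the standard YOLOv5 Torch Hub calls the scripts themselves use are assumed):

import torch

vd_model = torch.hub.load('ultralytics/yolov5', 'custom', 'vd_best.pt')  # plate detector (placeholder path)
nm_model = torch.hub.load('ultralytics/yolov5', 'custom', 'nm_best.pt')  # character detector (placeholder path)

def read_plate(frame):
    boxes = vd_model(frame, size=640).xyxy[0].cpu().numpy()
    if len(boxes) == 0:
        return None
    best = boxes[boxes[:, 4].argmax()]                       # highest-confidence plate box
    xmin, ymin, xmax, ymax = best[:4].astype(int)
    crop = frame[ymin:ymax, xmin:xmax]                       # plate image
    return nm_model(crop, size=640).xyxy[0].cpu().numpy()    # character boxes, crop-relative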
Source code
"detect3_number.py"
# -*- coding: utf-8 -*-
##------------------------------------------
## Number plate recognition Ver. 0.08
## Object detection with YOLOv5 in PyTorch
## ** Recognizes the plate region and plate text from the same image **
##
## 2024.05.20 Masahiro Izutsu
##------------------------------------------
## detect3_number.py (based on detect3_yolov5.py Ver. 0.08)
# -y <YOLOv5> -m <Pretrained model>
#   'ultralytics/yolov5' 'yolov5s' [yolov5n][yolov5m][yolov5l][yolov5x]   Torch Hub (online)
#   '/anaconda_win/workspace_pylearn/yolov5' '/anaconda_win/workspace_pylearn/yolov5/yolov5s'   (offline)
#
# Example: Windows
#   python detect2_yolov5.py       (Torch Hub, online)
#   python detect2_yolov5.py -y '/anaconda_win/workspace_pylearn/yolov5' -m '/anaconda_win/workspace_pylearn/yolov5/yolov5s'
#
# Example: Linux
#   python detect2_yolov5.py       (Torch Hub, online)
#   python detect2_yolov5.py -y '~/workspace_pylearn/yolov5' -m '~/workspace_pylearn/yolov5/yolov5s'
# Color Escape Code
GREEN = '\033[1;32m'
RED = '\033[1;31m'
NOCOLOR = '\033[0m'
YELLOW = '\033[1;33m'
# Constant definitions
WINDOW_WIDTH = 640
from os.path import expanduser
INPUT_DEF = expanduser('../number/test_data/japan78.jpg')
LANG_DEF = expanduser('./data/nm_dataset/vd_names_jp')
MODEL1_DEF = expanduser('./runs/train/vd_yolov5s_ep100/weights/best.pt')
MODEL2_DEF = expanduser('./runs/train/nm_yolov5s_ep50/weights/best.pt')
NUMBER_YL = float(50/165)    # raise the Y boundary slightly (70 -> 50)
LOCATE_IDMIN = 52
# Imports
import sys
import cv2
import numpy as np
import argparse
import torch
from torch import nn
from torchvision import transforms, models
from PIL import Image
import platform
import my_puttext
import my_fps
import my_color80
import my_logging
from os.path import isfile
from ultralytics.utils.plotting import colors
TEXT_COLOR = my_color80.CR_white
# Title
title = 'Number plate detection YOLOv5 Ver. 0.08'
# Parses arguments for the application
def parse_args():
parser = argparse.ArgumentParser()
parser.add_argument('-i', '--image', metavar = 'IMAGE_FILE', type=str,
default = INPUT_DEF,
help = 'Absolute path to image file or cam/cam0/cam1 for camera stream.')
parser.add_argument('-y', '--yolov5', metavar = 'YOLOV5', type=str,
default = 'ultralytics/yolov5',
        help = 'YOLO v5 directory absolute path.')
parser.add_argument('-m', '--models', metavar = 'MODELS', type=str,
default = MODEL1_DEF,
help = 'yolov5n/yolov5m/yolov5l/yolov5x or model file absolute path.')
parser.add_argument('-ms', '--models2', metavar = 'MODELS2', type=str,
default = MODEL2_DEF,
help = 'second model file absolute path.')
    parser.add_argument('-c', '--conf', metavar = 'CONFIDENCE', type=float,
        default = 0.25,    # 2024/05/21
        help = 'Confidence threshold. Default value is 0.25')
    parser.add_argument('-l', '--labels', metavar = 'LABELS',
        default = LANG_DEF,    # 2024/04/09
        help = 'Label file path.')
parser.add_argument('-t', '--title', metavar = 'TITLE',
default = 'y',
help = 'Program title flag.(y/n) Default value is \'y\'')
parser.add_argument('-s', '--speed', metavar = 'SPEED',
default = 'y',
        help = 'Speed display flag.(y/n) Default value is \'y\'')
parser.add_argument('-o', '--out', metavar = 'IMAGE_OUT',
default = 'non',
help = 'Processed image file path. Default value is \'non\'')
parser.add_argument("-cpu", default = False, action = 'store_true',
help="Optional. CPU only!")
parser.add_argument('--log', metavar = 'LOG', default = '3',
help = 'Log level(-1/0/1/2/3/4/5) Default value is \'3\'')
parser.add_argument("--ucr", default=False, action="store_true",
help="use Ultralytics color")
return parser
# Display basic model information
def display_info(args, image, yolov5, models, models2, conf, labels, titleflg, speedflg, outpath, use_device, log):
print('\n' + GREEN + title + ': Starting application...' + NOCOLOR)
    print(' OpenCV version :', cv2.__version__)
print('\n - ' + YELLOW + 'Image File : ' + NOCOLOR, image)
print(' - ' + YELLOW + 'YOLO v5 : ' + NOCOLOR, yolov5)
print(' - ' + YELLOW + 'Pretrained : ' + NOCOLOR, models)
if models2 != '':
print(' - ' + YELLOW + 'Pretrained 2 : ' + NOCOLOR, models2)
print(' - ' + YELLOW + 'Confidence lv: ' + NOCOLOR, conf)
print(' - ' + YELLOW + 'Label file : ' + NOCOLOR, labels)
print(' - ' + YELLOW + 'Program Title: ' + NOCOLOR, titleflg)
print(' - ' + YELLOW + 'Speed flag : ' + NOCOLOR, speedflg)
print(' - ' + YELLOW + 'Processed out: ' + NOCOLOR, outpath)
print(' - ' + YELLOW + 'Use device : ' + NOCOLOR, use_device)
if args.ucr:
print(' - ' + YELLOW + 'Class color : ' + NOCOLOR, 'Ultralytics')
print(' - ' + YELLOW + 'Log Level : ' + NOCOLOR, log, '\n')
# Determine the type of an image file
#   Returns: '.jpg', '.png', ...  image file extension
#            'None'      not an image file (e.g. a video file)
#            'NotFound'  the file does not exist
import os
def is_pict(filename):
if not os.path.isfile(filename):
return 'NotFound'
types = ['.bmp','.png','.jpg','.jpeg','.JPG','.tif']
for ss in types:
if filename.endswith(ss):
return ss
return 'None'
# Load the model from Torch Hub (remote/local switch)  2024/04/15
def load_model(yolov5, models):
cust = 'custom' if 0 < models.find('yolo') else ''
if yolov5 == 'ultralytics/yolov5':
if cust == '':
if -1 == models.find('.'):
model = torch.hub.load(yolov5, models)
else:
model = torch.hub.load(yolov5, 'custom', models)
else:
model = torch.hub.load(yolov5, cust, models)
else:
if cust == '':
if -1 == models.find('.'):
model = torch.hub.load(yolov5, models, source='local')
else:
model = torch.hub.load(yolov5, 'custom', models, source='local')
else:
model = torch.hub.load(yolov5, cust, models, source='local')
return model
# ** main function **
def main():
    # Japanese font for on-image text
fontPIL = my_puttext.get_font() # 2024.03.13
# Argument parsing and parameter setting
args = parse_args().parse_args()
input_stream = args.image
labels = args.labels # 2024/04/09
titleflg = args.title
speedflg = args.speed
    # Application logging setup
module = os.path.basename(__file__)
module_name = os.path.splitext(module)[0]
logger = my_logging.get_module_logger_sel(module_name, int(args.log))
logger.info(' Starting..')
    # Input: supports cam / cam0-cam9    # 2024/04/15
if input_stream.find('cam') == 0 and len(input_stream) < 5:
input_stream = 0 if input_stream == 'cam' else int(input_stream[3])
isstream = True
else:
filetype = is_pict(input_stream)
isstream = filetype == 'None'
if (filetype == 'NotFound'):
print(RED + "\ninput file Not found." + NOCOLOR)
quit()
outpath = args.out
conf = args.conf
yolov5 = args.yolov5 if platform.system()=='Windows' else expanduser(args.yolov5)
models = args.models if platform.system()=='Windows' else expanduser(args.models)
models2 = args.models2 if platform.system()=='Windows' else expanduser(args.models2)
    # Check whether a GPU is available
use_device = 'cuda:0' if not args.cpu and torch.cuda.is_available() else 'cpu'
    # Display settings
display_info(args, input_stream, yolov5, models, models2, conf, labels, titleflg, speedflg, outpath, use_device, args.log)
    # Load the model from Torch Hub
model = load_model(yolov5, models)
    # Set the model up for inference
model.eval()
model.to(use_device)
class_num = len(model.names)
class_ofs = 0
    # Class labels    # 2024/04/30 Ver. 0.07
if isfile(labels):
with open(labels, 'r', encoding="utf-8-sig") as labels_file:
label_list = labels_file.read().splitlines()
else:
label_list = model.names
    for i in range(len(label_list)):
        s = label_list[i]
        if len(s) >= 2 and s[0] == "'" and s[-1] == "'":
            label_list[i] = s[1:-1]    # strip surrounding quotes  2024/05/21 Ver. 0.08
    # When a second model is specified
model2 = None
if models2 != '':
model2 = load_model(yolov5, models2)
model2.eval()
model2.to(use_device)
class_ofs = class_num
        # Check whether every class ID of the second model is included in the first model's classes
key = list(model.names)
val = list(model.names.values())
eq = True
for i in range(len(key)):
if (key[i], val[i]) not in model2.names.items():
eq = False
if eq:
            class_ofs = 0    # all IDs are already included
        if not isfile(labels):    # 2024/04/30 Ver. 0.07  append the second model's labels
for i in range(len(model2.names)):
label_list[i + class_ofs] = model2.names[i]
    logger.debug(f'\n** Plate labels:\n{label_list}')
    # Prepare the input
if (isstream):
        # Camera
cap = cv2.VideoCapture(input_stream)
ret, frame = cap.read()
loopflg = cap.isOpened()
else:
        # Read the image file
frame = cv2.imread(input_stream)
if frame is None:
print(RED + "\nUnable to read the input." + NOCOLOR)
quit()
        # Resize while keeping the aspect ratio
img_h, img_w = frame.shape[:2]
if (img_w > WINDOW_WIDTH):
height = round(img_h * (WINDOW_WIDTH / img_w))
frame = cv2.resize(frame, dsize = (WINDOW_WIDTH, height))
        loopflg = True    # loop once
    # Record the results: step 1
if (outpath != 'non'):
if (isstream):
fps = int(cap.get(cv2.CAP_PROP_FPS))
out_w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
out_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
outvideo = cv2.VideoWriter(outpath, fourcc, fps, (out_w, out_h))
    # Initialize measurements
fpsWithTick = my_fps.fpsWithTick()
fps_total = 0
    fpsWithTick.get()    # start FPS measurement
    # Main loop
while (loopflg):
if frame is None:
print(RED + "\nUnable to read the input." + NOCOLOR)
quit()
        # Run the neural network
results = model(frame, size=640)
        message = []    # display messages
bbox = results.xyxy[0].detach().cpu().numpy()
        if models2 != '':    # when a second model is specified
results2 = model2(frame, size=640)
bbox2 = results2.xyxy[0].detach().cpu().numpy()
            for i in range(len(bbox2)):
                bbox2[i][5] += class_ofs    # adjust the class IDs
            bbox = np.append(bbox, bbox2, axis=0)    # append the results
        # ndarray of bounding boxes
logger.debug(f'\n** Bounding Box:\n{bbox}')
        # Process only when there are at least three bounding boxes
if len(bbox) < 3:
num_string = '読み取れません !!'
color_id = 0
xys_bbox = bbox
else:
            # Sort ascending by the second column (minimum Y)
col_num = 1
ys_bbox = bbox[np.argsort(bbox[:, col_num])]
            logger.debug(f'\n** Sorted ascending by minimum Y:\n{ys_bbox}')
            # From the first box, find the index that splits the plate into upper and lower rows
x0 = ys_bbox[0,0]
y0 = ys_bbox[0,1]
x1 = ys_bbox[0,2]
y1 = ys_bbox[0,3]
ys = (y1 - y0) * NUMBER_YL + y0
up_n = 0
for i in ys_bbox[:, 1]:
up_n = up_n + 1 if i < ys else up_n
            logger.debug(f'\n** Upper/lower split point:\nbbox = ({x0:.2f},{y0:.2f})-({x1:.2f},{y1:.2f}) ylimit = {ys:.2f} index = {up_n}')
            # Split into upper and lower parts
up_bbox, dn_bbox = np.split(ys_bbox, [up_n])
            logger.debug(f'\n** Upper part of the plate:\n{up_bbox}')
            logger.debug(f'\n** Lower part of the plate:\n{dn_bbox}')
            # Sort each part separately, ascending by the first column (minimum X)
col_num = 0
            ups_bbox = up_bbox[np.argsort(up_bbox[:, col_num])]
            logger.debug(f'\n** Upper part sorted ascending by minimum X:\n{ups_bbox}')
            dns_bbox = dn_bbox[np.argsort(dn_bbox[:, col_num])]
            logger.debug(f'\n** Lower part sorted ascending by minimum X:\n{dns_bbox}')
            # Concatenate the upper and lower parts
xys_bbox = np.concatenate([ups_bbox, dns_bbox])
            logger.debug(f'\n** Upper and lower parts merged:\n{xys_bbox}')
            # Place name
            nm_id = int(xys_bbox[1, 5])
            up_string = '□□' if xys_bbox[1, 4] <= conf or nm_id < LOCATE_IDMIN + class_num else label_list[nm_id]
            # Hiragana
dn_string = '□'
for bb in xys_bbox:
nm_id = int(bb[5])
                if bb[4] > conf and nm_id < LOCATE_IDMIN + class_num and nm_id > 9 + class_num:
dn_string = label_list[nm_id]
i = 0
for bb in xys_bbox:
nm_id = int(bb[5])
                if i < up_n:    # upper row
if bb[4] > conf and nm_id >= class_num and nm_id < 10 + class_num:
up_string = up_string + label_list[nm_id]
                else:    # lower row
if bb[4] > conf and nm_id >= class_num and nm_id < 10 + class_num:
dn_string = dn_string + label_list[nm_id]
i += 1
num_string = up_string + ' ' + dn_string
        logger.debug(f'\n** Plate analysis result: {num_string}')
for preds in xys_bbox:
xmin = int(preds[0])
ymin = int(preds[1])
xmax = int(preds[2])
ymax = int(preds[3])
confidence = preds[4]
class_id = int(preds[5])
color_id = class_id
            if (confidence > conf):    # skip low-confidence detections
                # per-class color    # 2024/04/16
if args.ucr:
BOX_COLOR = colors(color_id, True)
LABEL_BG_COLOR = BOX_COLOR
else:
BOX_COLOR = my_color80.get_boder_bgr80(color_id)
LABEL_BG_COLOR = my_color80.get_back_bgr80(color_id)
                # Get the label drawing area
if class_id < class_num:
x0,y0,x1,y1 = my_puttext.cv2_putText(img = frame,
text = num_string,
org = (xmin+5, ymin), fontFace = fontPIL,
fontScale = 14,
color = TEXT_COLOR,
mode = 0,
areaf=True)
                    xx = xmax if xmax > x1 else x1    # widen when the text extends past the box
cv2.rectangle(frame,(xmin, ymin), (xx, ymin-18), LABEL_BG_COLOR, -1)
my_puttext.cv2_putText(img = frame,
text = num_string,
org = (xmin+5, ymin-3), fontFace = fontPIL,
fontScale = 14,
color = TEXT_COLOR,
mode = 0)
                # Draw the box on the image
ss = 2 if class_id < class_num else 1
cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), BOX_COLOR, ss)
        # Compute the FPS
fps = fpsWithTick.get()
st_fps = 'fps: {:>6.2f}'.format(fps)
if (speedflg == 'y'):
cv2.rectangle(frame, (10, 38), (95, 55), (90, 90, 90), -1)
cv2.putText(frame, st_fps, (15, 50), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.4, color=(255, 255, 255), lineType=cv2.LINE_AA)
        # Draw the title
if (titleflg == 'y'):
cv2.putText(frame, title, (12, 32), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.8, color=(0, 0, 0), lineType=cv2.LINE_AA)
cv2.putText(frame, title, (10, 30), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.8, color=(200, 200, 0), lineType=cv2.LINE_AA)
        # Show the image
window_name = title + " (hit 'q' or 'esc' key to exit)"
cv2.namedWindow(window_name, flags=cv2.WINDOW_AUTOSIZE | cv2.WINDOW_GUI_NORMAL)
cv2.imshow(window_name, frame)
        # Record the results: step 2
if (outpath != 'non'):
if (isstream):
outvideo.write(frame)
else:
cv2.imwrite(outpath, frame)
        # Exit when a key is pressed
breakflg = False
while(True):
key = cv2.waitKey(1)
prop_val = cv2.getWindowProperty(window_name, cv2.WND_PROP_ASPECT_RATIO)
if cv2.getWindowProperty(window_name, cv2.WND_PROP_VISIBLE) < 1:
print('\n Window close !!')
sys.exit(0)
if key == 27 or key == 113 or (prop_val < 0.0): # 'esc' or 'q'
breakflg = True
break
if (isstream):
break
if ((breakflg == False) and isstream):
            # Read the next frame
ret, frame = cap.read()
if ret == False:
break
loopflg = cap.isOpened()
else:
loopflg = False
    # Cleanup
if (isstream):
cap.release()
    # Record the results: step 3
if (outpath != 'non'):
if (isstream):
outvideo.release()
cv2.destroyAllWindows()
print('\nFPS average: {:>10.2f}'.format(fpsWithTick.get_average()))
print('\n Finished.')
# Entry point
if __name__ == "__main__":
sys.exit(main())
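A note on detect3_number.py above: it merges the two models' detections over the same frame by offsetting the second model's class IDs by the first model's class count (class_ofs, which drops to 0 when the second model's IDs are already contained in the first), so that a single combined label list covers both models. A minimal sketch of that merge under the same row layout ([xmin, ymin, xmax, ymax, conf, class_id]; the helper name is ours):

import numpy as np

def merge_detections(bbox1, bbox2, class_ofs):
    """Combine two models' detections into one array with a shared label space."""
    merged = bbox2.copy()
    merged[:, 5] += class_ofs                 # shift the second model's class IDs
    return np.append(bbox1, merged, axis=0)   # one array indexed by the combined label list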
"detect_number.py"
# -*- coding: utf-8 -*-
##------------------------------------------
## Number plate recognition Ver. 0.09
## Object detection with YOLOv5 in PyTorch
## ** Re-recognizes using the cropped plate-region image **
##
## 2024.05.20 Masahiro Izutsu
##------------------------------------------
## detect_number.py (based on detect3_yolov5.py Ver. 0.08)
# -y <YOLOv5> -m <Pretrained model>
#   'ultralytics/yolov5' 'yolov5s' [yolov5n][yolov5m][yolov5l][yolov5x]   Torch Hub (online)
#   '/anaconda_win/workspace_pylearn/yolov5' '/anaconda_win/workspace_pylearn/yolov5/yolov5s'   (offline)
#
# Example: Windows
#   python detect2_yolov5.py       (Torch Hub, online)
#   python detect2_yolov5.py -y '/anaconda_win/workspace_pylearn/yolov5' -m '/anaconda_win/workspace_pylearn/yolov5/yolov5s'
#
# Example: Linux
#   python detect2_yolov5.py       (Torch Hub, online)
#   python detect2_yolov5.py -y '~/workspace_pylearn/yolov5' -m '~/workspace_pylearn/yolov5/yolov5s'
# Color Escape Code
GREEN = '\033[1;32m'
RED = '\033[1;31m'
NOCOLOR = '\033[0m'
YELLOW = '\033[1;33m'
# Constant definitions
WINDOW_WIDTH = 640
from os.path import expanduser
INPUT_DEF = expanduser('../number/test_data/japan74.jpg')
LANG_DEF = expanduser('./data/nm_dataset/names_jp')
MODEL1_DEF = expanduser('./runs/train/vd_yolov5s_ep100/weights/best.pt')
MODEL2_DEF = expanduser('./runs/train/nm_yolov5s_ep50/weights/best.pt')
NUMBER_YL = float(50/165)    # raise the Y boundary slightly (70 -> 50)
LOCATE_IDMIN = 52
# Imports
import sys
import cv2
import numpy as np
import argparse
import torch
from torch import nn
from torchvision import transforms, models
from PIL import Image
import platform
import my_puttext
import my_fps
import my_color80
import my_logging
from os.path import isfile
from ultralytics.utils.plotting import colors
TEXT_COLOR = my_color80.CR_white
# Title
title = 'Number plate detection Ver. 0.09'
# Parses arguments for the application
def parse_args():
parser = argparse.ArgumentParser()
parser.add_argument('-i', '--image', metavar = 'IMAGE_FILE', type=str,
default = INPUT_DEF,
help = 'Absolute path to image file or cam/cam0/cam1 for camera stream.')
parser.add_argument('-y', '--yolov5', metavar = 'YOLOV5', type=str,
default = 'ultralytics/yolov5',
        help = 'YOLO v5 directory absolute path.')
parser.add_argument('-m', '--models', metavar = 'MODELS', type=str,
default = MODEL1_DEF,
help = 'yolov5n/yolov5m/yolov5l/yolov5x or model file absolute path.')
parser.add_argument('-ms', '--models2', metavar = 'MODELS2', type=str,
default = MODEL2_DEF,
help = 'second model file absolute path.')
    parser.add_argument('-c', '--conf', metavar = 'CONFIDENCE', type=float,
        default = 0.40,    # 2024/04/14
        help = 'Confidence threshold. Default value is 0.40')
    parser.add_argument('-l', '--labels', metavar = 'LABELS',
        default = LANG_DEF,    # 2024/04/09
        help = 'Label file path.')
parser.add_argument('-t', '--title', metavar = 'TITLE',
default = 'y',
help = 'Program title flag.(y/n) Default value is \'y\'')
parser.add_argument('-s', '--speed', metavar = 'SPEED',
default = 'y',
        help = 'Speed display flag.(y/n) Default value is \'y\'')
parser.add_argument('-o', '--out', metavar = 'IMAGE_OUT',
default = 'non',
help = 'Processed image file path. Default value is \'non\'')
parser.add_argument("-cpu", default = False, action = 'store_true',
help="Optional. CPU only!")
parser.add_argument('--log', metavar = 'LOG', default = '3',
help = 'Log level(-1/0/1/2/3/4/5) Default value is \'3\'')
parser.add_argument("--ucr", default=False, action="store_true",
help="use Ultralytics color")
return parser
# Display basic model information
def display_info(args, image, yolov5, models, models2, conf, labels, titleflg, speedflg, outpath, use_device, log):
print('\n' + GREEN + title + ': Starting application...' + NOCOLOR)
    print(' OpenCV version :', cv2.__version__)
print('\n - ' + YELLOW + 'Image File : ' + NOCOLOR, image)
print(' - ' + YELLOW + 'YOLO v5 : ' + NOCOLOR, yolov5)
print(' - ' + YELLOW + 'Pretrained : ' + NOCOLOR, models)
if models2 != '':
print(' - ' + YELLOW + 'Pretrained 2 : ' + NOCOLOR, models2)
print(' - ' + YELLOW + 'Confidence lv: ' + NOCOLOR, conf)
print(' - ' + YELLOW + 'Label file : ' + NOCOLOR, labels)
print(' - ' + YELLOW + 'Program Title: ' + NOCOLOR, titleflg)
print(' - ' + YELLOW + 'Speed flag : ' + NOCOLOR, speedflg)
print(' - ' + YELLOW + 'Processed out: ' + NOCOLOR, outpath)
print(' - ' + YELLOW + 'Use device : ' + NOCOLOR, use_device)
if args.ucr:
print(' - ' + YELLOW + 'Class color : ' + NOCOLOR, 'Ultralytics')
print(' - ' + YELLOW + 'Log Level : ' + NOCOLOR, log, '\n')
# Determine the type of an image file
#   Returns: '.jpg', '.png', ...  image file extension
#            'None'      not an image file (e.g. a video file)
#            'NotFound'  the file does not exist
import os
def is_pict(filename):
if not os.path.isfile(filename):
return 'NotFound'
types = ['.bmp','.png','.jpg','.jpeg','.JPG','.tif']
for ss in types:
if filename.endswith(ss):
return ss
return 'None'
# Load the model from Torch Hub (remote/local switch)  2024/04/15
def load_model(yolov5, models):
cust = 'custom' if 0 < models.find('yolo') else ''
if yolov5 == 'ultralytics/yolov5':
if cust == '':
if -1 == models.find('.'):
model = torch.hub.load(yolov5, models)
else:
model = torch.hub.load(yolov5, 'custom', models)
else:
model = torch.hub.load(yolov5, cust, models)
else:
if cust == '':
if -1 == models.find('.'):
model = torch.hub.load(yolov5, models, source='local')
else:
model = torch.hub.load(yolov5, 'custom', models, source='local')
else:
model = torch.hub.load(yolov5, cust, models, source='local')
return model
# ** main function **
def main():
    # Japanese font for on-image text
fontPIL = my_puttext.get_font() # 2024.03.13
# Argument parsing and parameter setting
args = parse_args().parse_args()
input_stream = args.image
labels = args.labels # 2024/04/09
titleflg = args.title
speedflg = args.speed
    # Application logging setup
module = os.path.basename(__file__)
module_name = os.path.splitext(module)[0]
logger = my_logging.get_module_logger_sel(module_name, int(args.log))
logger.info(' Starting..')
    # Input: supports cam / cam0-cam9    # 2024/04/15
if input_stream.find('cam') == 0 and len(input_stream) < 5:
input_stream = 0 if input_stream == 'cam' else int(input_stream[3])
isstream = True
else:
filetype = is_pict(input_stream)
isstream = filetype == 'None'
if (filetype == 'NotFound'):
print(RED + "\ninput file Not found." + NOCOLOR)
quit()
outpath = args.out
conf = args.conf
yolov5 = args.yolov5 if platform.system()=='Windows' else expanduser(args.yolov5)
models = args.models if platform.system()=='Windows' else expanduser(args.models)
models2 = args.models2 if platform.system()=='Windows' else expanduser(args.models2)
    # Check whether a GPU is available
use_device = 'cuda:0' if not args.cpu and torch.cuda.is_available() else 'cpu'
    # Display settings
display_info(args, input_stream, yolov5, models, models2, conf, labels, titleflg, speedflg, outpath, use_device, args.log)
    # Load the model from Torch Hub
model = load_model(yolov5, models)
    # Set the model up for inference
model.eval()
model.to(use_device)
class_num = len(model.names)
class_ofs = 0
    # Class labels
if isfile(labels):
with open(labels, 'r', encoding="utf-8-sig") as labels_file:
label_list = labels_file.read().splitlines()
else:
label_list = model.names
    for i in range(len(label_list)):
        s = label_list[i]
        if len(s) >= 2 and s[0] == "'" and s[-1] == "'":
            label_list[i] = s[1:-1]    # strip surrounding quotes  2024/05/21 Ver. 0.08
    logger.debug(f'\n** Plate labels:\n{label_list}')
    # Second model
model2 = None
if models2 != '':
model2 = load_model(yolov5, models2)
model2.eval()
model2.to(use_device)
class_ofs = class_num
        if not isfile(labels):    # append the second model's labels
for i in range(len(model2.names)):
label_list[i + class_ofs] = model2.names[i]
    # Prepare the input
if (isstream):
        # Camera
cap = cv2.VideoCapture(input_stream)
ret, frame = cap.read()
loopflg = cap.isOpened()
else:
        # Read the image file
frame = cv2.imread(input_stream)
if frame is None:
print(RED + "\nUnable to read the input." + NOCOLOR)
quit()
        # Resize while keeping the aspect ratio
img_h, img_w = frame.shape[:2]
if (img_w > WINDOW_WIDTH):
height = round(img_h * (WINDOW_WIDTH / img_w))
frame = cv2.resize(frame, dsize = (WINDOW_WIDTH, height))
        loopflg = True    # loop once
    # Record the results: step 1
if (outpath != 'non'):
if (isstream):
fps = int(cap.get(cv2.CAP_PROP_FPS))
out_w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
out_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
outvideo = cv2.VideoWriter(outpath, fourcc, fps, (out_w, out_h))
    # Initialize measurements
fpsWithTick = my_fps.fpsWithTick()
fps_total = 0
    fpsWithTick.get()    # start FPS measurement
    # Main loop
while (loopflg):
if frame is None:
print(RED + "\nUnable to read the input." + NOCOLOR)
quit()
        # Run the neural network
results = model(frame, size=640)
        message = []    # display messages
bbox = results.xyxy[0].detach().cpu().numpy()
        logger.debug(f'\n** Plate Bounding Box:\n{bbox}')
        # Find the class ID of the highest-confidence plate box
confidence = 0.0
class_id_select = -1
select_n = 0
n = 0
for preds in bbox:
if confidence < preds[4]:
confidence = preds[4]
class_id_select = int(preds[5])
select_n = n
n += 1
        number_n = 0    # index of the current box (matched against select_n)
for preds in bbox:
xmin = int(preds[0])
ymin = int(preds[1])
xmax = int(preds[2])
ymax = int(preds[3])
confidence = preds[4]
class_id = int(preds[5])
color_id = 0
num_string = '----'
            # Process only the highest-confidence plate box; exclude low confidences and non-plate IDs (the box threshold is a constant)
if confidence > 0.25 and class_id < class_num and class_id == class_id_select and number_n == select_n:
                # Crop the plate image
img_nm = frame[ymin:ymax,xmin:xmax]
                cv2.imwrite(f'plate{number_n}.jpg', img_nm)    # temporary save
                # Run inference on the plate image to get an ndarray of bounding boxes
results2 = model2(img_nm, size=640)
bbox2 = results2.xyxy[0].detach().cpu().numpy()
if len(bbox2) > 0:
                    logger.debug(f'\n** Plate image Bounding Box:\n{bbox2}')
                    # Sort ascending by the second column (minimum Y)
col_num = 1
ys_bbox = bbox2[np.argsort(bbox2[:, col_num])]
                    logger.debug(f'\n** Sorted ascending by minimum Y:\n{ys_bbox}')
                    # From the plate image, find the index that splits it into upper and lower rows
nm_h, nm_w = img_nm.shape[:2]
ys = nm_h * NUMBER_YL
up_n = 0
for i in ys_bbox[:, 1]:
up_n = up_n + 1 if i < ys else up_n
                    logger.debug(f'\n** Upper/lower split point:\nbbox = ({0:.2f},{0:.2f})-({nm_w:.2f},{nm_h:.2f}) ylimit = {ys:.2f} index = {up_n}')
                    # Split into upper and lower parts
up_bbox, dn_bbox = np.split(ys_bbox, [up_n])
                    logger.debug(f'\n** Upper part of the plate:\n{up_bbox}')
                    logger.debug(f'\n** Lower part of the plate:\n{dn_bbox}')
                    # Sort each part separately, ascending by the first column (minimum X)
col_num = 0
ups_bbox = up_bbox[np.argsort(up_bbox[:, col_num])]
                    logger.debug(f'\n** Upper part sorted ascending by minimum X:\n{ups_bbox}')
dns_bbox = dn_bbox[np.argsort(dn_bbox[:, col_num])]
                    logger.debug(f'\n** Lower part sorted ascending by minimum X:\n{dns_bbox}')
                    # Concatenate the upper and lower parts
xys_bbox = np.concatenate([ups_bbox, dns_bbox])
                    logger.debug(f'\n** Upper and lower parts merged:\n{xys_bbox}')
                    # Place name
nm_id = int(xys_bbox[0, 5])
up_string = '□□' if xys_bbox[0, 4] <= conf or nm_id < LOCATE_IDMIN else label_list[nm_id]
color_id = nm_id
                    # Hiragana
dn_string = '□'
for bb in xys_bbox:
nm_id = int(bb[5])
if bb[4] > conf and nm_id < LOCATE_IDMIN and nm_id >9:
dn_string = label_list[nm_id]
i = 0
for bb in xys_bbox:
nm_id = int(bb[5])
                        if i < up_n:    # upper row
if bb[4] > conf and nm_id < 10 and bb[1] > 0:
up_string = up_string + label_list[nm_id]
                        else:    # lower row
if bb[4] > conf and nm_id < 10 and bb[1] > 0:
dn_string = dn_string + label_list[nm_id]
i += 1
num_string = up_string + ' ' + dn_string
                logger.debug(f'\n** Plate analysis result: {num_string}')
                # Detected regions inside the plate
for bx in bbox2:
xmin2 = int(bx[0]) + xmin
ymin2 = int(bx[1]) + ymin
xmax2 = int(bx[2]) + xmin
ymax2 = int(bx[3]) + ymin
confidence2 = bx[4]
class_id2 = int(bx[5])
box_cr = my_color80.get_boder_bgr80(class_id2)
if confidence2 > conf:
cv2.rectangle(frame, (xmin2, ymin2), (xmax2, ymax2), box_cr, 1)
            # Per-class color
BOX_COLOR = my_color80.get_boder_bgr80(color_id)
LABEL_BG_COLOR = my_color80.get_back_bgr80(color_id)
            # Get the label drawing area
x0,y0,x1,y1 = my_puttext.cv2_putText(img = frame,
text = num_string,
org = (xmin+5, ymin), fontFace = fontPIL,
fontScale = 14,
color = TEXT_COLOR,
mode = 0,
areaf=True)
            xx = xmax if xmax > x1 else x1    # widen when the text extends past the box
cv2.rectangle(frame,(xmin, ymin-18), (xx, ymin), LABEL_BG_COLOR, -1)
my_puttext.cv2_putText(img = frame,
text = num_string,
org = (xmin+5, ymin-3), fontFace = fontPIL,
fontScale = 14,
color = TEXT_COLOR,
mode = 0)
            # Draw the box on the image
cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), BOX_COLOR, 2)
number_n += 1
        # Compute the FPS
fps = fpsWithTick.get()
st_fps = 'fps: {:>6.2f}'.format(fps)
if (speedflg == 'y'):
cv2.rectangle(frame, (10, 38), (95, 55), (90, 90, 90), -1)
cv2.putText(frame, st_fps, (15, 50), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.4, color=(255, 255, 255), lineType=cv2.LINE_AA)
        # Draw the title
if (titleflg == 'y'):
cv2.putText(frame, title, (12, 32), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.8, color=(0, 0, 0), lineType=cv2.LINE_AA)
cv2.putText(frame, title, (10, 30), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.8, color=(200, 200, 0), lineType=cv2.LINE_AA)
        # Show the image
window_name = title + " (hit 'q' or 'esc' key to exit)"
cv2.namedWindow(window_name, flags=cv2.WINDOW_AUTOSIZE | cv2.WINDOW_GUI_NORMAL)
cv2.imshow(window_name, frame)
        # Record the results: step 2
if (outpath != 'non'):
if (isstream):
outvideo.write(frame)
else:
cv2.imwrite(outpath, frame)
        # Exit when a key is pressed
breakflg = False
while(True):
key = cv2.waitKey(1)
prop_val = cv2.getWindowProperty(window_name, cv2.WND_PROP_ASPECT_RATIO)
if cv2.getWindowProperty(window_name, cv2.WND_PROP_VISIBLE) < 1:
print('\n Window close !!')
sys.exit(0)
if key == 27 or key == 113 or (prop_val < 0.0): # 'esc' or 'q'
breakflg = True
break
if (isstream):
break
if ((breakflg == False) and isstream):
            # Read the next frame
ret, frame = cap.read()
if ret == False:
break
loopflg = cap.isOpened()
else:
loopflg = False
    # Cleanup
if (isstream):
cap.release()
    # Record the results: step 3
if (outpath != 'non'):
if (isstream):
outvideo.release()
cv2.destroyAllWindows()
print('\nFPS average: {:>10.2f}'.format(fpsWithTick.get_average()))
print('\n Finished.')
# Entry point
if __name__ == "__main__":
sys.exit(main())
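A note on detect_number.py above: because the character model runs on the cropped plate, its boxes come back relative to the crop, and the drawing loop maps them into frame coordinates by adding the plate's top-left corner (xmin, ymin) to every coordinate. A minimal sketch of that mapping (the helper name is ours):

def to_frame_coords(box, xmin, ymin):
    """box = (x0, y0, x1, y1) relative to the cropped plate image."""
    x0, y0, x1, y1 = box
    return (int(x0) + xmin, int(y0) + ymin, int(x1) + xmin, int(y1) + ymin)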
"detect_number2.py"
# -*- coding: utf-8 -*-
##------------------------------------------
## Number plate recognition Ver. 0.10
## Object detection with YOLOv5 in PyTorch
## ** Recognizes the plate-region image with OCR **
##
## 2024.05.28 Masahiro Izutsu
##------------------------------------------
## detect_number2.py (based on detect_number.py Ver. 0.09)
# -y <YOLOv5> -m <Pretrained model>
#   'ultralytics/yolov5' 'yolov5s' [yolov5n][yolov5m][yolov5l][yolov5x]   Torch Hub (online)
#   '/anaconda_win/workspace_pylearn/yolov5' '/anaconda_win/workspace_pylearn/yolov5/yolov5s'   (offline)
#
# Example: Windows
#   python detect2_yolov5.py       (Torch Hub, online)
#   python detect2_yolov5.py -y '/anaconda_win/workspace_pylearn/yolov5' -m '/anaconda_win/workspace_pylearn/yolov5/yolov5s'
#
# Example: Linux
#   python detect2_yolov5.py       (Torch Hub, online)
#   python detect2_yolov5.py -y '~/workspace_pylearn/yolov5' -m '~/workspace_pylearn/yolov5/yolov5s'
# Color Escape Code
GREEN = '\033[1;32m'
RED = '\033[1;31m'
NOCOLOR = '\033[0m'
YELLOW = '\033[1;33m'
# Constant definitions
WINDOW_WIDTH = 640
from os.path import expanduser
INPUT_DEF = expanduser('../number/test_data/japan78.jpg')
MODEL1_DEF = expanduser('./runs/train/vd_yolov5s_ep100/weights/best.pt')
NUMBER_Y2 = float(70/165)      # vertical position of the 4-digit number
NUMBER_X2 = float(70/330)      # horizontal position of the 4-digit number
NUMBER_UX0 = float(80/330)     # start of the place name
NUMBER_UX1 = float(270/330)    # end of the 3-digit class code
NUMBER_UX2 = float(15/330)     # start of the hiragana
# Imports
import sys
import cv2
import numpy as np
import argparse
import torch
from torch import nn
from torchvision import transforms, models
from PIL import Image
import platform
import my_puttext
import my_fps
import my_color80
import my_logging
from os.path import isfile
from ultralytics.utils.plotting import colors
import pyocr
import my_imgprocess
TEXT_COLOR = my_color80.CR_white
# Title
title = 'Number plate detection with OCR Ver. 0.10'
# Parses arguments for the application
def parse_args():
parser = argparse.ArgumentParser()
parser.add_argument('-i', '--image', metavar = 'IMAGE_FILE', type=str,
default = INPUT_DEF,
help = 'Absolute path to image file or cam/cam0/cam1 for camera stream.')
parser.add_argument('-y', '--yolov5', metavar = 'YOLOV5', type=str,
default = 'ultralytics/yolov5',
        help = 'YOLO v5 directory absolute path.')
parser.add_argument('-m', '--models', metavar = 'MODELS', type=str,
default = MODEL1_DEF,
help = 'yolov5n/yolov5m/yolov5l/yolov5x or model file absolute path.')
    parser.add_argument('-c', '--conf', metavar = 'CONFIDENCE', type=float,
        default = 0.40,    # 2024/04/14
        help = 'Confidence threshold. Default value is 0.40')
parser.add_argument('-t', '--title', metavar = 'TITLE',
default = 'y',
help = 'Program title flag.(y/n) Default value is \'y\'')
parser.add_argument('-s', '--speed', metavar = 'SPEED',
default = 'y',
        help = 'Speed display flag.(y/n) Default value is \'y\'')
parser.add_argument('-o', '--out', metavar = 'IMAGE_OUT',
default = 'non',
help = 'Processed image file path. Default value is \'non\'')
parser.add_argument("-cpu", default = False, action = 'store_true',
help="Optional. CPU only!")
parser.add_argument('--log', metavar = 'LOG', default = '3',
help = 'Log level(-1/0/1/2/3/4/5) Default value is \'3\'')
parser.add_argument("--ucr", default=False, action="store_true",
help="use Ultralytics color")
return parser
# Display basic model information
def display_info(args, image, yolov5, models, conf, titleflg, speedflg, outpath, use_device, log):
print('\n' + GREEN + title + ': Starting application...' + NOCOLOR)
    print(' OpenCV version :', cv2.__version__)
print('\n - ' + YELLOW + 'Image File : ' + NOCOLOR, image)
print(' - ' + YELLOW + 'YOLO v5 : ' + NOCOLOR, yolov5)
print(' - ' + YELLOW + 'Pretrained : ' + NOCOLOR, models)
print(' - ' + YELLOW + 'Confidence lv: ' + NOCOLOR, conf)
print(' - ' + YELLOW + 'Program Title: ' + NOCOLOR, titleflg)
print(' - ' + YELLOW + 'Speed flag : ' + NOCOLOR, speedflg)
print(' - ' + YELLOW + 'Processed out: ' + NOCOLOR, outpath)
print(' - ' + YELLOW + 'Use device : ' + NOCOLOR, use_device)
if args.ucr:
print(' - ' + YELLOW + 'Class color : ' + NOCOLOR, 'Ultralytics')
print(' - ' + YELLOW + 'Log Level : ' + NOCOLOR, log, '\n')
# Determine the type of an image file
#   Returns: '.jpg', '.png', ...  image file extension
#            'None'      not an image file (e.g. a video file)
#            'NotFound'  the file does not exist
import os
def is_pict(filename):
if not os.path.isfile(filename):
return 'NotFound'
types = ['.bmp','.png','.jpg','.jpeg','.JPG','.tif']
for ss in types:
if filename.endswith(ss):
return ss
return 'None'
# Load the model from Torch Hub (remote/local switch)  2024/04/15
def load_model(yolov5, models):
cust = 'custom' if 0 < models.find('yolo') else ''
if yolov5 == 'ultralytics/yolov5':
if cust == '':
if -1 == models.find('.'):
model = torch.hub.load(yolov5, models)
else:
model = torch.hub.load(yolov5, 'custom', models)
else:
model = torch.hub.load(yolov5, cust, models)
else:
if cust == '':
if -1 == models.find('.'):
model = torch.hub.load(yolov5, models, source='local')
else:
model = torch.hub.load(yolov5, 'custom', models, source='local')
else:
model = torch.hub.load(yolov5, cust, models, source='local')
return model
# ** main function **
def main():
    # Japanese font for on-image text
fontPIL = my_puttext.get_font() # 2024.03.13
# Argument parsing and parameter setting
args = parse_args().parse_args()
input_stream = args.image
titleflg = args.title
speedflg = args.speed
    # Application logging setup
module = os.path.basename(__file__)
module_name = os.path.splitext(module)[0]
logger = my_logging.get_module_logger_sel(module_name, int(args.log))
logger.info(' Starting..')
# OCR
tools = pyocr.get_available_tools()
if len(tools) == 0:
print(RED + "\nOCR tool Not found." + NOCOLOR)
quit()
tool = tools[0]
    # Image-processing helper class
my_process = my_imgprocess.ImagePreprocess()
    # Character recognition (OCR) helper
def ocr_process(img_pil, lang, layout):
content = ''
xmin = 0
ymin = 0
xmax = 0
ymax = 0
line_and_word_boxes = tool.image_to_string(img_pil, lang=lang,builder=pyocr.builders.LineBoxBuilder(tesseract_layout=layout))
        confidence = 0
        for lw_box in line_and_word_boxes:
            content = lw_box.content
            position = lw_box.position
            box = []
            txt = []
            for w_box in lw_box.word_boxes:    # renamed to avoid shadowing the outer loop variable
                txt.append(w_box.content)
                box.append(w_box.position)
                confidence = w_box.confidence
xmin = position[0][0]
ymin = position[0][1]
xmax = position[1][0]
ymax = position[1][1]
logger.debug(f'\n contents: \t{content}')
logger.debug(f' position: \t{position}')
logger.debug(f' confidence: \t{confidence}')
return content, xmin, ymin, xmax, ymax
    # Input: supports cam / cam0-cam9    # 2024/04/15
if input_stream.find('cam') == 0 and len(input_stream) < 5:
input_stream = 0 if input_stream == 'cam' else int(input_stream[3])
isstream = True
else:
filetype = is_pict(input_stream)
isstream = filetype == 'None'
if (filetype == 'NotFound'):
print(RED + "\ninput file Not found." + NOCOLOR)
quit()
outpath = args.out
conf = args.conf
yolov5 = args.yolov5 if platform.system()=='Windows' else expanduser(args.yolov5)
models = args.models if platform.system()=='Windows' else expanduser(args.models)
    # Check whether a GPU is available
use_device = 'cuda:0' if not args.cpu and torch.cuda.is_available() else 'cpu'
    # Display settings
display_info(args, input_stream, yolov5, models, conf, titleflg, speedflg, outpath, use_device, args.log)
    # Load the model from Torch Hub
model = load_model(yolov5, models)
    # Set the model up for inference
model.eval()
model.to(use_device)
class_num = len(model.names)
class_ofs = 0
    # Prepare the input
if (isstream):
        # Camera
cap = cv2.VideoCapture(input_stream)
ret, frame = cap.read()
loopflg = cap.isOpened()
else:
        # Read the image file
frame = cv2.imread(input_stream)
if frame is None:
print(RED + "\nUnable to read the input." + NOCOLOR)
quit()
        # Resize while keeping the aspect ratio
img_h, img_w = frame.shape[:2]
if (img_w > WINDOW_WIDTH):
height = round(img_h * (WINDOW_WIDTH / img_w))
frame = cv2.resize(frame, dsize = (WINDOW_WIDTH, height))
        loopflg = True    # loop once
    # Record the results: step 1
if (outpath != 'non'):
if (isstream):
fps = int(cap.get(cv2.CAP_PROP_FPS))
out_w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
out_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
outvideo = cv2.VideoWriter(outpath, fourcc, fps, (out_w, out_h))
    # Initialize measurements
fpsWithTick = my_fps.fpsWithTick()
fps_total = 0
    fpsWithTick.get()    # start FPS measurement
    # Main loop
while (loopflg):
if frame is None:
print(RED + "\nUnable to read the input." + NOCOLOR)
quit()
        # Run the neural network
results = model(frame, size=640)
        message = []    # display messages
bbox = results.xyxy[0].detach().cpu().numpy()
        logger.debug(f'\n** Plate Bounding Box:\n{bbox}')
        # Find the class ID of the highest-confidence plate box
confidence = 0.0
class_id_select = -1
select_n = 0
n = 0
for preds in bbox:
if confidence < preds[4]:
confidence = preds[4]
class_id_select = int(preds[5])
select_n = n
n += 1
        number_n = 0    # index of the current box (matched against select_n)
for preds in bbox:
xmin = int(preds[0])
ymin = int(preds[1])
xmax = int(preds[2])
ymax = int(preds[3])
confidence = preds[4]
class_id = int(preds[5])
color_id = 0
num_string = '----'
layout = 6
lang = 'jpn'
y2 = int ((preds[3] - preds[1]) * NUMBER_Y2 + preds[1])
x2 = int ((preds[2] - preds[0]) * NUMBER_X2 + preds[0])
            ux0 = int((preds[2] - preds[0]) * NUMBER_UX0 + preds[0])    # start of the place name
            ux1 = int((preds[2] - preds[0]) * NUMBER_UX1 + preds[0])    # end of the 3-digit class code
            ux2 = int((preds[2] - preds[0]) * NUMBER_UX2 + preds[0])    # start of the hiragana
            # Process only the highest-confidence plate box; exclude low confidences and non-plate IDs (the box threshold is a constant)
if confidence > 0.25 and class_id < class_num and class_id == class_id_select and number_n == select_n:
                # Crop the plate image
img_nm = frame[ymin:ymax,xmin:xmax]
                cv2.imwrite(f'plate{number_n}.jpg', img_nm)    # temporary save
                logger.debug('\n** Cropping the plate image for OCR')
                # Upper part of the plate
img_n0 = frame[ymin:y2,ux0:ux1]
img_n0 = my_process.img_binarization(img_n0)
                cv2.imwrite(f'plate{number_n}_0.jpg', img_n0)    # temporary save
img0_pil = Image.fromarray(img_n0)
str0, xx0, yy0, xx1, yy1 = ocr_process(img0_pil, 'jpn', 7)
cv2.rectangle(frame, (xx0 + ux0, yy0 + ymin), (xx1 + ux0, yy1 + ymin), (0, 0, 240), 1)
                # Hiragana on the lower part of the plate
img_n1 = frame[y2:ymax,ux2:x2 + 10]
img_n1 = my_process.img_binarization(img_n1)
                cv2.imwrite(f'plate{number_n}_1.jpg', img_n1)    # temporary save
img1_pil = Image.fromarray(img_n1)
str1, xx0, yy0, xx1, yy1 = ocr_process(img1_pil, 'jpn', 7)
cv2.rectangle(frame, (xx0 + ux2, yy0 + y2), (xx1 + ux2, yy1 + y2), (0, 0, 240), 1)
                # 4-digit number on the lower part of the plate
img_n2 = frame[y2:ymax,x2:xmax]
img_n2 = my_process.img_binarization(img_n2)
                cv2.imwrite(f'plate{number_n}_2.jpg', img_n2)    # temporary save
img2_pil = Image.fromarray(img_n2)
str2, xx0, yy0, xx1, yy1 = ocr_process(img2_pil, 'jpn', 7)
cv2.rectangle(frame, (xx0 + x2, yy0 + y2), (xx1 + x2, yy1 + y2), (0, 0, 240), 1)
num_string = str0 + ' ' + str1 + ' ' + str2
                logger.info(f'\n** Plate analysis result: {num_string}')
            # Per-class color
BOX_COLOR = my_color80.get_boder_bgr80(color_id)
LABEL_BG_COLOR = my_color80.get_back_bgr80(color_id)
            # Get the label drawing area
x0,y0,x1,y1 = my_puttext.cv2_putText(img = frame,
text = num_string,
org = (xmin+5, ymin), fontFace = fontPIL,
fontScale = 14,
color = TEXT_COLOR,
mode = 0,
areaf=True)
            xx = xmax if xmax > x1 else x1    # widen when the text extends past the box
cv2.rectangle(frame,(xmin, ymin-18), (xx, ymin), LABEL_BG_COLOR, -1)
my_puttext.cv2_putText(img = frame,
text = num_string,
org = (xmin+5, ymin-3), fontFace = fontPIL,
fontScale = 14,
color = TEXT_COLOR,
mode = 0)
            # Draw the box on the image
cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), BOX_COLOR, 2)
number_n += 1
        # Compute the FPS
fps = fpsWithTick.get()
st_fps = 'fps: {:>6.2f}'.format(fps)
if (speedflg == 'y'):
cv2.rectangle(frame, (10, 38), (95, 55), (90, 90, 90), -1)
cv2.putText(frame, st_fps, (15, 50), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.4, color=(255, 255, 255), lineType=cv2.LINE_AA)
        # Draw the title
if (titleflg == 'y'):
cv2.putText(frame, title, (12, 32), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.8, color=(0, 0, 0), lineType=cv2.LINE_AA)
cv2.putText(frame, title, (10, 30), cv2.FONT_HERSHEY_DUPLEX, fontScale=0.8, color=(200, 200, 0), lineType=cv2.LINE_AA)
        # Show the image
window_name = title + " (hit 'q' or 'esc' key to exit)"
cv2.namedWindow(window_name, flags=cv2.WINDOW_AUTOSIZE | cv2.WINDOW_GUI_NORMAL)
cv2.imshow(window_name, frame)
        # Record the results: step 2
if (outpath != 'non'):
if (isstream):
outvideo.write(frame)
else:
cv2.imwrite(outpath, frame)
        # Exit when a key is pressed
breakflg = False
while(True):
key = cv2.waitKey(1)
prop_val = cv2.getWindowProperty(window_name, cv2.WND_PROP_ASPECT_RATIO)
if cv2.getWindowProperty(window_name, cv2.WND_PROP_VISIBLE) < 1:
print('\n Window close !!')
sys.exit(0)
if key == 27 or key == 113 or (prop_val < 0.0): # 'esc' or 'q'
breakflg = True
break
if (isstream):
break
if ((breakflg == False) and isstream):
            # Read the next frame
ret, frame = cap.read()
if ret == False:
break
loopflg = cap.isOpened()
else:
loopflg = False
    # Cleanup
if (isstream):
cap.release()
    # Record the results: step 3
if (outpath != 'non'):
if (isstream):
outvideo.release()
cv2.destroyAllWindows()
print('\nFPS average: {:>10.2f}'.format(fpsWithTick.get_average()))
print('\n Finished.')
# Entry point
if __name__ == "__main__":
sys.exit(main())
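A closing note on detect_number2.py: it swaps the character model for OCR through pyocr. A minimal sketch of the underlying call it builds on (assumes Tesseract and its 'jpn' language data are installed; the image file name is a placeholder):

from PIL import Image
import pyocr
import pyocr.builders

tool = pyocr.get_available_tools()[0]       # e.g. Tesseract
img = Image.open('plate0.jpg')              # a cropped plate image (placeholder)
line_boxes = tool.image_to_string(
    img, lang='jpn',
    builder=pyocr.builders.LineBoxBuilder(tesseract_layout=7))
for lb in line_boxes:
    print(lb.content, lb.position)          # recognized text and ((x0, y0), (x1, y1))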