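Swapping parts of a video: Motion Supervised co-part Segmentation

Verifying on a local machine "Motion Supervised co-part Segmentation", a technique that uses image segmentation to replace parts of a video with parts taken from a still image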

* Last updated: 2024/07/27

Motion Supervised co-part Segmentation

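Overview

The model is based on the paper "Motion-supervised Co-Part Segmentation", published in 2020
Given a still image (source) and a video (target frames), it segments the still image and generates a video in which the selected parts of the input video are swapped
Paper: "Motion-supervised Co-Part Segmentation"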
<paper>
・https://arxiv.org/pdf/2004.03234
<framework>
・https://github.com/AliaksandrSiarohin/motion-cosegmentation
・https://github.com/AliaksandrSiarohin/face-makeup.PyTorch

Setting up the execution environment

Download the project from GitHub

cd /anaconda_win/workspace_2    ← on Windows
cd ~/workspace_2                ← on Linux

git clone https://github.com/AliaksandrSiarohin/motion-cosegmentation motion-co-seg

・Download one more project inside the "motion-co-seg" folder

cd motion-co-seg
git clone https://github.com/AliaksandrSiarohin/face-makeup.PyTorch face_parsing

Download the project package update_20240717.zip (2.91GB) <update file>
・Folders created by extracting it

update
└─workspace_2
    └─motion-co-seg          ← overwrite the project cloned from GitHub with this
        ├─face_parsing       ← overwrite the project cloned from GitHub with this
        │  ├─results
        │  └─result_save
        ├─results
        ├─results_save
        └─sample
            ├─images
            └─videos

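・Copy everything under the extracted "update/" folder into the following folder, overwriting existing files
　On Windows → "anaconda_win/"　On Linux → "~/"

Run the steps below in the virtual environment "py38_learn"
If it has not been created yet → create it by following the "Virtual environment (py38_learn)" procedure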

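Preparation

A demo is published on Google Colab; confirm that it works there first, then make it run on the local machine
→ Part-swap demo for paper "Motion Supervised co-part Segmentation"
Run the cells in order from the first one
If an error occurs, handle it as appropriate
・After mounting Google Drive, apply the fixes described at the link below to "logger.py" and "part_swap.py" and overwrite the originals
　→ Problems addressed and error details
・Download the sample images and pretrained models from the site below in advance, create a "motion-supervised-co-segmentation/" folder on Google Drive, and upload them there
　→ https://drive.google.com/drive/folders/1SsBifjoM_qO0iFzb8wLlsz_4qW2j8dZe
Add the following right after the first cell of the "Loading checkpoints with 10 parts" section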
```
%matplotlib inline
```
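Once a few minor problems caused by package version upgrades and the like are fixed, the demo runs without trouble
Confirm that every cell runs through to the last one, then move on to the next step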

Trying the provided demo "part_swap.py"

Summary of the per-category option settings verified on Google Colab

Category	--config	--checkpoint	--source_image	--target_video	--swap_index	Output example
0:10seg	config/vox-256-sem-10segments.yaml	./sample/vox-10segments.pth.tar	./sample/images/16.png	./sample/videos/04.mp4	[2]	Trump with red lips
0:10seg	config/vox-256-sem-10segments.yaml	./sample/vox-10segments.pth.tar	./sample/images/26.png	./sample/videos/11.mp4	[7,9]	Actress with blue eyes
1:5seg	config/vox-256-sem-5segments.yaml	./sample/vox-5segments.pth.tar	./sample/images/27.png	./sample/videos/02.mp4	[3,4,5]	Actress with blonde hair
1:5seg	config/vox-256-sem-5segments.yaml	./sample/vox-5segments.pth.tar	./sample/images/27.png	./sample/videos/04.mp4	[3,4,5]	Actress with Trump's face
1:5seg	config/vox-256-sem-5segments.yaml	./sample/vox-5segments.pth.tar	./sample/images/23.png	./sample/videos/07.mp4	[1]	Actor with a beard
2:super	config/vox-256-sem-10segments.yaml	./sample/vox-first-order.pth.tar	./sample/images/anim16.png	./sample/videos/04.mp4	[1,2,3,...,14,15]	Trump with the actress's face

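Run the demo program with the pretrained models (already included in the project package)
・The provided "part_swap.py" has a few bugs; the fixed version is named "part_swap2.py"
・If a CUDA error occurs because there is no GPU or not enough memory, add the "--cpu" option
・The result is written to the file given with the "--result_video <filepath>" option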
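
The "--cpu" rule can be sketched as a small decision function. This is a minimal illustration, assuming the same policy that the "motion_seg.py" listing on this page applies (CPU mode is forced whenever `torch.cuda.is_available()` is false); the function names `resolve_cpu_mode` and `detect_cuda` are hypothetical and do not exist in the project.

```python
def resolve_cpu_mode(cpu_requested: bool, cuda_available: bool) -> bool:
    """Return True when inference should run on the CPU."""
    # CPU is used if the user asked for it, or if no CUDA device is usable.
    return cpu_requested or not cuda_available


def detect_cuda() -> bool:
    """Report CUDA availability; a missing PyTorch install counts as 'no GPU'."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


if __name__ == "__main__":
    print("cpu mode:", resolve_cpu_mode(cpu_requested=False, cuda_available=detect_cuda()))
```

Keeping the decision separate from the detection makes it easy to check all combinations without a GPU at hand.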

Creating "Trump with red lips" (make trump with red lips)

(Original text) Identify index of the part that you want to swap. For example to make trump with red lips part 2 should be used

・Identify the index of the part to swap; in this case use 2

(py38_learn) python part_swap2.py  --config ./config/vox-256-sem-10segments.yaml --target_video ./sample/videos/04.mp4 --source_image ./sample/images/16.png --checkpoint sample/vox-10segments.pth.tar --swap_index 2 --result_video ./result_lips.mp4
100%|████████████████████████████████████████████████████████████████████████████████| 211/211 [00:03<00:00, 66.93it/s]

・When running on the CPU ("--cpu" option specified)

100%|████████████████████████████████████████████████████████████████████████████████| 211/211 [01:25<00:00,  2.46it/s]

Changing eye color
・From the part (segment) display at the upper right, specify indices 7 and 9

(py38_learn) python part_swap2.py  --config ./config/vox-256-sem-10segments.yaml --target_video ./sample/videos/11.mp4 --source_image ./sample/images/26.png --checkpoint sample/vox-10segments.pth.tar --swap_index 7,9 --result_video ./result_eye_cor.mp4
100%|████████████████████████████████████████████████████████████████████████████████| 109/109 [00:01<00:00, 67.05it/s]

・When running on the CPU ("--cpu" option specified)

100%|████████████████████████████████████████████████████████████████████████████████| 109/109 [00:48<00:00,  2.26it/s]
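
As a rough worked calculation, the throughput figures in the progress lines above can be turned into elapsed time and a speed ratio between the faster and the slower run. The numbers are copied from the logs; the helper names `seconds_for` and `speedup` are made up for this sketch.

```python
def seconds_for(frames: int, its_per_sec: float) -> float:
    """Elapsed seconds for a run that processes `frames` at `its_per_sec`."""
    return frames / its_per_sec


def speedup(fast_its: float, slow_its: float) -> float:
    """How many times faster the fast run is than the slow run."""
    return fast_its / slow_its


# (frames, faster run it/s, slower run it/s) taken from the two examples above
runs = {
    "04.mp4": (211, 66.93, 2.46),
    "11.mp4": (109, 67.05, 2.26),
}

for name, (frames, fast, slow) in runs.items():
    print(
        f"{name}: {seconds_for(frames, fast):.1f}s vs "
        f"{seconds_for(frames, slow):.1f}s, ratio {speedup(fast, slow):.1f}x"
    )
```

On this machine the faster run is therefore somewhere between 25 and 30 times quicker per frame.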

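Creating "motion_seg.py", a program that can be operated through a GUI

Main features
・Specifying a category automatically selects the matching pretrained model settings
・The still image and video needed as input are chosen through file dialogs
・The original option parameters can still be used as they are
・If a CUDA error occurs because there is no GPU or not enough memory, add the "--cpu" option
・After processing, it generates and displays the result video and a combined video showing the source still image / original video / result video

Output file location and names (when --result_video './results/result_10seg.mp4' is specified)
・Files are saved in the "./results" folder (the "./results" folder must already exist)
・Video generated from the still image → 'result_10seg_' + <still image> + '_' + <source video> + '.mp4'
・Still image / original video / result video side by side → 'result_10seg_' + <still image> + '_' + <source video> + '_a' + '.mp4'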
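
The naming rule can be written as a small helper. This sketch mirrors the path handling in the `part_swap` function of the "motion_seg.py" listing on this page; the helper name `build_output_paths` is introduced here only for illustration.

```python
import os


def build_output_paths(result_video: str, source_image: str, target_video: str):
    """Derive the two output file names from the inputs.

    Returns (result video path, side-by-side video path).
    """
    # Base names without directory or extension, e.g. '16' and '04'
    s_name = os.path.splitext(os.path.basename(source_image))[0]
    d_name = os.path.splitext(os.path.basename(target_video))[0]

    # Split './results/result_10seg.mp4' into './results/result_10seg' and '.mp4'
    name, ext = os.path.splitext(result_video)
    stem = f"{name}_{s_name}_{d_name}"
    return stem + ext, stem + "_a" + ext


if __name__ == "__main__":
    print(build_output_paths(
        "./results/result_10seg.mp4",
        "./sample/images/16.png",
        "./sample/videos/04.mp4",
    ))
```

Because the extension is taken from `--result_video`, passing a `.gif` path yields `.gif` outputs with the same name pattern.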

List of command options

Command option	Argument	Default	Meaning
-c, --category	str	'0'	Category (0: 10-segment, 1: 5-segment, 2: supervised)
--config	str	internal default if omitted (*)	Configuration file (.yaml) of the pretrained model
--checkpoint	str	internal default if omitted (*)	Pretrained model file
--source_image	str	internal default if omitted (*)	Path to the still image
--target_video	str	internal default if omitted (*)	Path to the video
--result_video	str	internal default if omitted (*)	Path of the output file
--swap_index	list	[-1]	index of swapped parts
--hard	bool	False	use hard segmentation labels for blending
--use_source_segmentation	bool	False	use source segmentation for swapping
--first_order_motion_model	bool	False	use first order model for alignment
--supervised	bool	False	use supervised segmentation labels for blending. Only for faces.
--cpu	bool	False	cpu mode
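
How a value such as `7,9` becomes the list `[7, 9]` can be shown with a cut-down parser. The sketch below follows the `--swap_index` definition in the "motion_seg.py" listing (a string default of `"-1"` passed through the same converter); the function names are illustrative only.

```python
import argparse


def parse_index_list(text: str) -> list:
    """Convert a comma-separated string such as '7,9' into [7, 9]."""
    return [int(token) for token in text.split(",")]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # argparse runs string defaults through `type`, so "-1" becomes [-1].
    parser.add_argument("--swap_index", default="-1", type=parse_index_list,
                        help="index of swapped parts")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args(["--swap_index", "7,9"]).swap_index)
    print(build_parser().parse_args([]).swap_index)
```

The `[-1]` default is what the script later tests to decide between showing the segmentation and generating a video.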

* Internal defaults used when an option is not specified

Command option	-c 0 (10seg)	-c 1 (5seg)	-c 2 (super)
--config	./config/vox-256-sem-10segments.yaml	./config/vox-256-sem-5segments.yaml	./config/vox-256-sem-10segments.yaml
--checkpoint	./sample/vox-10segments.pth.tar	./sample/vox-5segments.pth.tar	./sample/vox-first-order.pth.tar
--source_image	Still image chosen through a dialog
--target_video	Video chosen through a dialog (not requested when swap_index=[-1])
--result_video	./results/result_10seg.mp4	./results/result_5seg.mp4	./results/result_super.mp4
--result_video	./results/result_10seg.png (note 1)	./results/result_5seg.png (note 1)	./results/result_super.png (note 1)

　Note 1: when swap_index=[-1], the output is an image showing the segmentation
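
The per-category defaults can also be expressed as a lookup table. This is a sketch under stated assumptions: the paths are the constants from the "motion_seg.py" listing on this page, where category '2' is the supervised mode, while `CATEGORY_DEFAULTS` and `apply_defaults` are hypothetical names, not part of the script.

```python
CATEGORY_DEFAULTS = {
    "0": {  # 10-segment model
        "config": "./config/vox-256-sem-10segments.yaml",
        "checkpoint": "./sample/vox-10segments.pth.tar",
        "result_video": "./results/result_10seg.mp4",
        "supervised": False,
    },
    "1": {  # 5-segment model
        "config": "./config/vox-256-sem-5segments.yaml",
        "checkpoint": "./sample/vox-5segments.pth.tar",
        "result_video": "./results/result_5seg.mp4",
        "supervised": False,
    },
    "2": {  # supervised part swap (face parser + first order motion model)
        "config": "./config/vox-256-sem-10segments.yaml",
        "checkpoint": "./sample/vox-first-order.pth.tar",
        "result_video": "./results/result_super.mp4",
        "supervised": True,
    },
}


def apply_defaults(category: str, config: str = "", checkpoint: str = "",
                   result_video: str = "") -> dict:
    """Fill in every option the user left empty with the category default."""
    defaults = CATEGORY_DEFAULTS[category[0]]
    return {
        "config": config or defaults["config"],
        "checkpoint": checkpoint or defaults["checkpoint"],
        "result_video": result_video or defaults["result_video"],
        "supervised": defaults["supervised"],
    }


if __name__ == "__main__":
    print(apply_defaults("1"))
```

A table like this keeps the three categories in one place, which is one possible alternative to the chain of `if` blocks used in the script.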

← Select the files from a dialog while the program is running

Segmentation by category

-c 0: 10-segment -c 1: 5-segment -c 2: supervised
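
When only the segmentation is displayed (swap_index=[-1]), "motion_seg.py" sets `hard = True`, which turns the soft per-part scores into a one-hot mask per pixel. The sketch below reproduces that step with NumPy instead of PyTorch; the `(K, H, W)` layout without a batch dimension and the name `harden` are assumptions made for this illustration.

```python
import numpy as np


def harden(mask: np.ndarray) -> np.ndarray:
    """Turn soft scores of shape (K, H, W) into a one-hot mask over K.

    Each pixel keeps a 1.0 only in the part(s) holding its maximum score.
    """
    return (mask == mask.max(axis=0, keepdims=True)).astype(np.float32)


# K=3 parts, H=1, W=2 pixels
soft = np.array(
    [[[0.7, 0.2]],
     [[0.2, 0.5]],
     [[0.1, 0.3]]],
    dtype=np.float32,
)
hard = harden(soft)
print(hard.argmax(axis=0))  # winning part index for each pixel
```

If two parts tie exactly at a pixel, both keep a 1.0 there, which is the same behavior as comparing against the maximum in the listing.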

Command example (Trump with red lips)

　・Image: './sample/images/16.png' Video: './sample/videos/04.mp4'

(py38_learn) python motion_seg.py -c 0 --swap_index 2

Motion Supervised co-part Segmentation Ver. 0.01: Starting application...

   - Category                :  0: ** 10-segments model **
   - config                  :  ./config/vox-256-sem-10segments.yaml
   - checkpoint              :  ./sample/vox-10segments.pth.tar
   - source_image            :  C:/anaconda_win/workspace_2/motion-co-seg/sample/images/16.png
   - target_video            :  C:/anaconda_win/workspace_2/motion-co-seg/sample/videos/04.mp4
   - result_video            :  ./results/result_10seg.mp4
   - swap_index              :  [2]
   - hard                    :  False
   - use_source_segmentation :  False
   - first_order_motion_model:  False
   - supervised              :  False
   - cpu                     :  False

100%|████████████████████████████████████████████████████████████████████████████████| 211/211 [00:02<00:00, 71.21it/s]
 Saving... → './results/result_10seg_16_04.mp4'
 Saving... → './results/result_10seg_16_04_a.mp4'

 Finished.

・When running on the CPU ("--cpu" option specified)

100%|████████████████████████████████████████████████████████████████████████████████| 211/211 [01:26<00:00,  2.44it/s]

Command example (changing eye color)

　・Image: './sample/images/26.png' Video: './sample/videos/11.mp4'

(py38_learn) python motion_seg.py -c 0 --swap_index 7,9

Motion Supervised co-part Segmentation Ver. 0.01: Starting application...

   - Category                :  0: ** 10-segments model **
   - config                  :  ./config/vox-256-sem-10segments.yaml
   - checkpoint              :  ./sample/vox-10segments.pth.tar
   - source_image            :  C:/anaconda_win/workspace_2/motion-co-seg/sample/images/26.png
   - target_video            :  C:/anaconda_win/workspace_2/motion-co-seg/sample/videos/11.mp4
   - result_video            :  ./results/result_10seg.mp4
   - swap_index              :  [7, 9]
   - hard                    :  False
   - use_source_segmentation :  False
   - first_order_motion_model:  False
   - supervised              :  False
   - cpu                     :  False

100%|████████████████████████████████████████████████████████████████████████████████| 109/109 [00:01<00:00, 69.68it/s]
 Saving... → './results/result_10seg_26_11.mp4'
 Saving... → './results/result_10seg_26_11_a.mp4'

 Finished.

・When running on the CPU ("--cpu" option specified)

100%|████████████████████████████████████████████████████████████████████████████████| 109/109 [00:44<00:00,  2.46it/s]

Other examples

① Changing hair　Image: './sample/images/27.png' Video: './sample/videos/02.mp4'
```
(py38_learn) python motion_seg.py -c 1 --swap_index 3,4,5
```
② Swapping everything except the face　Image: './sample/images/27.png' Video: './sample/videos/04.mp4'
```
(py38_learn) python motion_seg.py -c 1 --swap_index 3,4,5 --use_source_segmentation
```
③ Adding Beard　Image: './sample/images/23.png' Video: './sample/videos/07.mp4'
```
(py38_learn) python motion_seg.py -c 1 --swap_index 1
```
④ Swapping the face　Image: './sample/images/16.png' Video: './sample/videos/04.mp4'
```
(py38_learn) python motion_seg.py -c 2 --swap_index 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
```

Example with audio (the audio track of the original video is carried over)
　・Image: './sample/images/26.png' Video: './sample/videos/asahi_cm_out01.mp4'
```
(py38_learn) python motion_seg.py -c 0 --swap_index 7,9
```
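
The page does not show how `my_videotool.add_audio` carries the audio track over. One common way to do it is to remux with the ffmpeg command-line tool, and the sketch below only builds such a command. Treat it as an assumption about one possible approach, not as the actual implementation; the function name and file names are placeholders.

```python
def build_audio_mux_command(silent_video: str, audio_source: str, output: str) -> list:
    """Build an ffmpeg command that copies the video stream from `silent_video`
    and takes the first audio stream (if any) from `audio_source`."""
    return [
        "ffmpeg", "-y",          # overwrite the output without asking
        "-i", silent_video,      # input 0: generated video without sound
        "-i", audio_source,      # input 1: original video holding the audio
        "-map", "0:v:0",         # video from input 0
        "-map", "1:a:0?",        # audio from input 1; '?' tolerates a missing track
        "-c:v", "copy",          # do not re-encode the video
        "-c:a", "aac",
        "-shortest",             # stop at the shorter of the two streams
        output,
    ]


if __name__ == "__main__":
    cmd = build_audio_mux_command("result_silent.mp4", "original.mp4", "result_with_audio.mp4")
    print(" ".join(cmd))
    # To actually run it (requires ffmpeg on PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```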

Module source code

▼ "motion_seg.py"

# -*- coding: utf-8 -*-
##------------------------------------------
##   Motion Supervised co-part Segmentation
##                           Demo Ver 0.02
##
##               2024.07.13 Masahiro Izutsu
##------------------------------------------
## motion_seg.py
##
##  Ver 0.02    2024.07.23      Added GPU detection

# Examples with 10-segments model
#   10seg:  source_image = './sample/images/16.png'
#           target_video = './sample/videos/04.mp4'
#           index        = [2]          (lips)
#
#   10seg:  source_image = './sample/images/26.png'
#           target_video = './sample/videos/11.mp4'
#           index        = [7,9]        (eyes)
#
#   5seg:   source_image = './sample/images/27.png'
#           target_video = './sample/videos/02.mp4'
#           index        = [3,4,5]      (hair)
#
#   5seg:   source_image = './sample/images/27.png'
#           target_video = './sample/videos/04.mp4'
#           use_source_segmentation = True
#           index        = [3,4,5]      (face)
#
#   5seg:   source_image = './sample/images/23.png'
#           target_video = './sample/videos/07.mp4'
#           index        = [1]          (beard)
#
#   super:  source_image = './sample/images/16.png'
#           target_video = './sample/videos/04.mp4'
#           index        = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]    (everything except hair)

# Color Escape Code
GREEN = '\033[1;32m'
RED = '\033[1;31m'
NOCOLOR = '\033[0m'
YELLOW = '\033[1;33m'

# Constants
DEF_CONFIG_10SEG = './config/vox-256-sem-10segments.yaml'
DEF_CHECKPOINT_10SEG = './sample/vox-10segments.pth.tar'
DEF_CONFIG_5SEG = './config/vox-256-sem-5segments.yaml'
DEF_CHECKPOINT_5SEG = './sample/vox-5segments.pth.tar'
DEF_CONFIG_SUPER = './config/vox-256-sem-10segments.yaml'
DEF_CHECKPOINT_SUPER = './sample/vox-first-order.pth.tar'

DEF_RESULT_10SEG_VIDEO = './results/result_10seg.mp4'
DEF_RESULT_5SEG_VIDEO = './results/result_5seg.mp4'
DEF_RESULT_SUPER_VIDEO = './results/result_super.mp4'
DEF_RESULT_10SEG_IMAGE = './results/result_10seg.png'
DEF_RESULT_5SEG_IMAGE = './results/result_5seg.png'
DEF_RESULT_SUPER_IMAGE = './results/result_super.png'

DEF_RESULT_10SEG_INDEX_LIP = [2]

# import
import os
import argparse
import imageio.v2 as imageio
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import matplotlib.patches as mpatches
import torch
import torch.nn.functional as F
from skimage.transform import resize
from skimage import img_as_ubyte

from part_swap2 import load_checkpoints
from part_swap2 import load_face_parser
from part_swap2 import make_video
import my_dialog
import my_imagetool
import my_videotool
import my_movieplay

import warnings
warnings.simplefilter('ignore', UserWarning)            # suppress UserWarning messages

# Title
title = 'Motion Supervised co-part Segmentation Ver. 0.02'
sub_title = '10-segments model', '5-segments model', 'supervised  part-swaps'

# Parses arguments for the application
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('-c', '--category', default = '0', type = str, help = 'Category (0:10-segment, 1:5-segment, 2:supervised) Default value is 0')
    parser.add_argument("--config", default='', help="path to config")
    parser.add_argument("--checkpoint", default='', help="path to checkpoint to restore")

    parser.add_argument("--source_image", default='', help="path to source image")
    parser.add_argument("--target_video", default='', help="path to target video")
    parser.add_argument("--result_video", default='', help="path to output")

    parser.add_argument("--swap_index", default="-1", type=lambda x: list(map(int, x.split(','))), help='index of swaped parts')
    parser.add_argument("--hard", action="store_true", help="use hard segmentation labels for blending")
    parser.add_argument("--use_source_segmentation", action="store_true", help="use source segmentation for swaping")
    parser.add_argument("--first_order_motion_model", action="store_true", help="use first order model for alignment")
    parser.add_argument("--supervised", action="store_true", help="use supervised segmentation labels for blending. Only for faces.")
    parser.add_argument("--cpu", action="store_true", help="cpu mode")

    return parser

# Print the run configuration
def display_info(opt, title):
    if opt.category[0] == '0':
        cat = f'{opt.category}: ** {sub_title[0]} **'
    elif opt.category[0] == '1':
        cat = f'{opt.category}: ** {sub_title[1]} **'
    elif opt.category[0] == '2':
        cat = f'{opt.category}: ** {sub_title[2]} **'
    else:
        cat = f'{opt.category}: ** setup **'

    print('\n' + GREEN + title + ': Starting application...' + NOCOLOR)
    print('\n   - ' + YELLOW + 'category                : ' + NOCOLOR, cat)
    print('   - ' + YELLOW + 'config                  : ' + NOCOLOR, opt.config)
    print('   - ' + YELLOW + 'checkpoint              : ' + NOCOLOR, opt.checkpoint)
    print('   - ' + YELLOW + 'source_image            : ' + NOCOLOR, opt.source_image)
    print('   - ' + YELLOW + 'target_video            : ' + NOCOLOR, opt.target_video)
    print('   - ' + YELLOW + 'result_video            : ' + NOCOLOR, opt.result_video)

    print('   - ' + YELLOW + 'swap_index              : ' + NOCOLOR, opt.swap_index)
    print('   - ' + YELLOW + 'hard                    : ' + NOCOLOR, opt.hard)
    print('   - ' + YELLOW + 'use_source_segmentation : ' + NOCOLOR, opt.use_source_segmentation)
    print('   - ' + YELLOW + 'first_order_motion_model: ' + NOCOLOR, opt.first_order_motion_model)
    print('   - ' + YELLOW + 'supervised              : ' + NOCOLOR, opt.supervised)
    print('   - ' + YELLOW + 'cpu                     : ' + NOCOLOR, opt.cpu)
    print(' ')

# Visualize the segmentation
def visualize_segmentation(image, network, supervised=False, hard=True, colormap='gist_rainbow', cpu = False):
    with torch.no_grad():
        if not cpu:
            inp = torch.tensor(image[np.newaxis].astype(np.float32)).permute(0, 3, 1, 2).cuda()
        else:
            inp = torch.tensor(image[np.newaxis].astype(np.float32)).permute(0, 3, 1, 2).cpu()

        if supervised:
            inp = F.interpolate(inp, size=(512, 512))
            inp = (inp - network.mean) / network.std
            mask = torch.softmax(network(inp)[0], dim=1)
            mask = F.interpolate(mask, size=image.shape[:2])
        else:
            mask = network(inp)['segmentation']
            mask = F.interpolate(mask, size=image.shape[:2], mode='bilinear')
    
    if hard:
        mask = (torch.max(mask, dim=1, keepdim=True)[0] == mask).float()
    
    colormap = plt.get_cmap(colormap)
    num_segments = mask.shape[1]
    mask = mask.squeeze(0).permute(1, 2, 0).cpu().numpy()
    color_mask = 0
    patches = []
    for i in range(num_segments):
        if i != 0:
            color = np.array(colormap((i - 1) / (num_segments - 1)))[:3]
        else:
            color = np.array((0, 0, 0))
        patches.append(mpatches.Patch(color=color, label=str(i)))
        color_mask += mask[..., i:(i+1)] * color.reshape(1, 1, 3)
    
    fig, ax = plt.subplots(1, 2, figsize=(12,6), dpi=64)
    fig.subplots_adjust(left=0, right=1, bottom=0, top=1) # 2024.07.13

    ax[0].imshow(color_mask)
    ax[1].imshow(0.3 * image + 0.7 * color_mask)
    ax[1].legend(handles=patches)
    ax[0].axis('off')
    ax[1].axis('off')

# Finalization (save and optionally show the figure)
def end_Process(save_path, dispf = False):
    if len(save_path) > 0:
        plt.savefig(save_path)
    if dispf:
        plt.show()
    plt.close()
    return


# Display the segments
def segment_disp(opt, msg = 'Visualize segmentation', maxsize = 0, loop_f = True):
    source_image = imageio.imread(opt.source_image)
    source_image = resize(source_image, (256, 256))[..., :3]

    if opt.supervised:
        face_parser = load_face_parser(opt.cpu)
        visualize_segmentation(source_image, face_parser, supervised = True, hard = opt.hard, colormap = 'tab20', cpu = opt.cpu)
    else :                                                          # CPU support
        reconstruction_module, segmentation_module = load_checkpoints(opt.config, checkpoint = opt.checkpoint, blend_scale = 1, cpu = opt.cpu)
        visualize_segmentation(source_image, segmentation_module, hard = opt.hard, cpu = opt.cpu)

    # Output file name
    out_path = ''                                                   # result image
    base_dir_pair = os.path.split(opt.source_image)
    s_name, ext = os.path.splitext(base_dir_pair[1])
    name, ext = os.path.splitext(opt.result_video)
    out_path = name + '_' + s_name + ext

    end_Process(out_path)
    my_imagetool.image2disp(out_path, winname = msg, maxsize = my_imagetool.WINDOW_WIDTH, loop_f = loop_f)


# Generate the video
def part_swap(opt, maxsize = 0, loop_f = True):

    # Check that the input files exist
    if not os.path.isfile(opt.source_image):
        print(RED + f"File not found !! '{opt.source_image}' " + NOCOLOR)
        return
    if not os.path.isfile(opt.target_video):
        print(RED + f"File not found !! '{opt.target_video}' " + NOCOLOR)
        return

    # Read the still image and the video
    source_image = imageio.imread(opt.source_image)
    target_video, fps = my_videotool.read_video(opt.target_video)

    # Resize to 256x256
    source_image = resize(source_image, (256, 256))[..., :3]
    target_video = [resize(frame, (256, 256))[..., :3] for frame in target_video]

    # Process the still image and the video
    blend_scale = (256 / 4) / 512 if opt.supervised else 1
    reconstruction_module, segmentation_module = load_checkpoints(opt.config, opt.checkpoint, blend_scale=blend_scale,
                                                                first_order_motion_model=opt.first_order_motion_model, cpu=opt.cpu)
    if opt.supervised:
        face_parser = load_face_parser(opt.cpu)
    else:
        face_parser = None

    predictions = make_video(opt.swap_index, source_image, target_video, reconstruction_module, segmentation_module,
                                        face_parser, hard=opt.hard, use_source_segmentation=opt.use_source_segmentation, cpu=opt.cpu)

    # Output file names
    out_path1 = ''                                                  # result video
    out_path2 = ''                                                  # still image / original video / result video
    ext = ''
    if len(opt.result_video) > 0:
        base_dir_pair = os.path.split(opt.source_image)
        s_name, ext = os.path.splitext(base_dir_pair[1])
        base_dir_pair = os.path.split(opt.target_video)
        d_name, ext = os.path.splitext(base_dir_pair[1])
        
        name, ext = os.path.splitext(opt.result_video)
        out_path1 = name + '_' + s_name + '_' + d_name + ext
        out_path2 = name + '_' + s_name + '_' + d_name + '_a' + ext

    # Save result 1
    if out_path1 != '':
        if ext == '.gif':
            imageio.mimsave(out_path1, [img_as_ubyte(frame) for frame in predictions], fps = fps, loop = 0)
        else:
            imageio.mimsave(out_path1, [img_as_ubyte(frame) for frame in predictions], fps = fps)

        print(f" Saving... → '{out_path1}'")

        # Add the audio track
        my_videotool.add_audio(opt.target_video, out_path1, log_f = False)

        # Play the generated video (1)
        my_movieplay.movie_play(out_path1, title = 'Processed result image 1')

    # Build the still image / original video / result video composite
    ani = my_videotool.img_movie3x1(source_image, target_video, predictions, interval = fps)

    # Save result 2
    if out_path2 != '':
        my_videotool.save_video(ani, out_path2)
        print(f" Saving... → '{out_path2}'")

        # Add the audio track
        my_videotool.add_audio(opt.target_video, out_path2, log_f = False)

        # Play the input image / original video / generated video composite (2)
        my_movieplay.movie_play(out_path2, title = 'Processed result image 2')

    print('\n Finished.')
    return out_path1, out_path2


# Entry point of the main routine (execution starts here)
if __name__ == "__main__":
    parser = parse_args()
    opt = parser.parse_args()

    # Check for a GPU (the --cpu choice only matters when a GPU is available)
    gpu_d = torch.cuda.is_available()
    cpu_d = not gpu_d
    opt.cpu = opt.cpu or cpu_d                                      # 2024.07.23 force CPU mode when there is no GPU

    segment_f = opt.swap_index == [-1]                              # visualizing the segmentation flag

    if len(opt.source_image) == 0:
        opt.source_image = my_dialog.select_image_file('静止画像　', './sample/images')
        if len(opt.source_image) == 0:
            exit(0)

    if len(opt.target_video) == 0 and not segment_f:
        opt.target_video = my_dialog.select_movie_file('参照動画　', './sample/videos')
        if len(opt.target_video) == 0:
            exit(0)

    # Per-category preprocessing
    if opt.category[0] == '0':                                      # 0: 10-segment
        opt.supervised = False
        opt.config = DEF_CONFIG_10SEG if len(opt.config) == 0 else opt.config
        opt.checkpoint = DEF_CHECKPOINT_10SEG if len(opt.checkpoint) == 0 else opt.checkpoint

        if segment_f:
            opt.result_video = DEF_RESULT_10SEG_IMAGE
            opt.hard = True

        else:
            opt.result_video = DEF_RESULT_10SEG_VIDEO if len(opt.result_video) == 0 else opt.result_video

    elif opt.category[0] == '1':                                    # 1: 5-segment
        opt.supervised = False
        opt.config = DEF_CONFIG_5SEG if len(opt.config) == 0 else opt.config
        opt.checkpoint = DEF_CHECKPOINT_5SEG if len(opt.checkpoint) == 0 else opt.checkpoint

        if segment_f:
            opt.result_video = DEF_RESULT_5SEG_IMAGE
            opt.hard = True

        else:
            opt.result_video = DEF_RESULT_5SEG_VIDEO if len(opt.result_video) == 0 else opt.result_video

    elif opt.category[0] == '2':                                    # 2: supervised
        opt.supervised = True
        opt.config = DEF_CONFIG_SUPER if len(opt.config) == 0 else opt.config
        opt.checkpoint = DEF_CHECKPOINT_SUPER if len(opt.checkpoint) == 0 else opt.checkpoint
        opt.first_order_motion_model = True

        if segment_f:
            opt.result_video = DEF_RESULT_SUPER_IMAGE
            opt.hard = True

        else:
            opt.result_video = DEF_RESULT_SUPER_VIDEO if len(opt.result_video) == 0 else opt.result_video

    display_info(opt, title)

    if segment_f:
        segment_disp(opt, msg = sub_title[int(opt.category[0])], maxsize = my_imagetool.WINDOW_WIDTH, loop_f = False)
    else:
        part_swap(opt, maxsize = my_imagetool.WINDOW_WIDTH, loop_f = False)

face-parsing: obtaining a pixel-level label map of facial parts

Overview

"face-parsing" is a semantic segmentation model trained on masks for each facial part
It is used by category '2:supervised' of the "Motion Supervised co-part Segmentation" project above

<framework>
・ https://github.com/AliaksandrSiarohin/face-makeup.PyTorch
・ https://github.com/zllrunning/face-parsing.PyTorch
・ https://github.com/VisionSystemsInc/face-parsing.PyTorch

Trying the provided demo "makeup.py"

Change the current directory
```
(py38_learn) cd face_parsing
```

Hair and lip makeup demo
```
(py38_learn) python makeup.py
```
・"model.py" had a bug, so it was modified
・"test.py" had a bug, so it was modified
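
The recoloring idea behind the makeup demo is to replace the hue of the selected pixels while keeping their brightness. The "makeup2.py" listing on this page does this on whole images with OpenCV in BGR order; the sketch below illustrates the same idea for a single pixel using only the standard library `colorsys` module, with RGB floats in the range 0 to 1. The function name `recolor_pixel` is hypothetical.

```python
import colorsys


def recolor_pixel(rgb, target_rgb, replace_saturation=False):
    """Give `rgb` the hue of `target_rgb` while keeping its brightness (V).

    With replace_saturation=True the saturation is taken from the target too,
    which corresponds to how the listing treats the lip parts (12 and 13).
    """
    _, s, v = colorsys.rgb_to_hsv(*rgb)
    target_h, target_s, _ = colorsys.rgb_to_hsv(*target_rgb)
    if replace_saturation:
        s = target_s
    return colorsys.hsv_to_rgb(target_h, s, v)


if __name__ == "__main__":
    brown_hair = (0.4, 0.3, 0.2)
    blue = (0.1, 0.2, 0.9)
    print(recolor_pixel(brown_hair, blue))
```

Keeping V unchanged is what preserves the shading and highlights of the original hair or lips.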

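Creating "makeup2.py", a program that can be operated through a GUI

Main features
・The still image needed as input is chosen through a file dialog
・The input image should be square or close to square
・The original option parameters can still be used as they are
・If a CUDA error occurs because there is no GPU or not enough memory, add the "--cpu" option
・After processing, the original image and the three processed images are combined into one image and displayed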
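
The combined image is a 2x2 grid: the original in the top-left cell, then the results after each makeup step. The sketch below shows the same index arithmetic as the "makeup2.py" listing (`x = (n % 2) * size`, `y = (n // 2) * size`) on tiny arrays. The helper name `tile_2x2` and the assumption that all images are already resized to the cell size are made for illustration.

```python
import numpy as np


def tile_2x2(images, cell: int) -> np.ndarray:
    """Place up to four (cell, cell, 3) images on a 2x2 canvas, row by row."""
    canvas = np.zeros((2 * cell, 2 * cell, 3), dtype=np.uint8)
    for n, img in enumerate(images[:4]):
        x = (n % 2) * cell    # column offset: 0 or cell
        y = (n // 2) * cell   # row offset: 0 or cell
        canvas[y:y + cell, x:x + cell] = img
    return canvas


# Four dummy 2x2 images filled with the values 1..4
cell = 2
images = [np.full((cell, cell, 3), value, dtype=np.uint8) for value in (1, 2, 3, 4)]
canvas = tile_2x2(images, cell)
print(canvas[:, :, 0])
```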

Output file location and name (when --result_image './results/result.jpg' is specified)
・The file is saved in the "./results" folder (the "./results" folder must already exist)
・Result image → 'result_' + <still image> + '.jpg'

List of command options

Command option	Argument	Default	Meaning
--source_image	str	'' (chosen through a dialog)	Path to the still image
--result_image	str	'./results/result.jpg'	Path of the output file
--checkpoint	str	'./cp/79999_iter.pth'	Pretrained model file
--cpu	bool	False	cpu mode

Command example

(py38_learn) python makeup2.py

Motion Supervised co-part Segmentation Ver. 0.01: Starting application...

   - source_image            :  C:/anaconda_win/workspace_2/motion-co-seg/face_parsing/imgs/116.jpg
   - result_image            :  ./result_116.jpg
   - checkpoint              :  ./cp/79999_iter.pth
   - cpu                     :  False

Other command examples

Module source code

▼ "makeup2.py"

# -*- coding: utf-8 -*-
##------------------------------------------
##   Face-makeup demo           Ver 0.01
##
##               2024.07.15 Masahiro Izutsu
##------------------------------------------
## makeup2.py           (original:  makeup.py)

# Color Escape Code
GREEN = '\033[1;32m'
RED = '\033[1;31m'
NOCOLOR = '\033[0m'
YELLOW = '\033[1;33m'

# import
import cv2
import os
import numpy as np
from skimage.filters import gaussian
import argparse
import torch

from parsing import evaluate
import my_imagetool
import my_dialog

# Title
title = 'Face-makeup demo Ver. 0.01'


# Parses arguments for the application
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--source_image", default='', help="path to source image")
    parser.add_argument("--result_image", default='./results/result.jpg', help="path to output")
    parser.add_argument("--checkpoint", default='./cp/79999_iter.pth', help="path to checkpoint to restore")
    parser.add_argument("--cpu", action="store_true", help="cpu mode")  # 2024.07.16

    return parser

# Print the run configuration
def display_info(opt, title):
    print('\n' + GREEN + title + ': Starting application...' + NOCOLOR)
    print('\n   - ' + YELLOW + 'source_image            : ' + NOCOLOR, opt.source_image)
    print('   - ' + YELLOW + 'result_image            : ' + NOCOLOR, opt.result_image)
    print('   - ' + YELLOW + 'checkpoint              : ' + NOCOLOR, opt.checkpoint)
    print('   - ' + YELLOW + 'cpu                     : ' + NOCOLOR, opt.cpu)
    print(' ')


def sharpen(img):
    img = img * 1.0
    gauss_out = gaussian(img, sigma=5, multichannel=True)

    alpha = 1.5
    img_out = (img - gauss_out) * alpha + img

    img_out = img_out / 255.0

    mask_1 = img_out < 0
    mask_2 = img_out > 1

    img_out = img_out * (1 - mask_1)
    img_out = img_out * (1 - mask_2) + mask_2
    img_out = np.clip(img_out, 0, 1)
    img_out = img_out * 255
    return np.array(img_out, dtype=np.uint8)

def get_key_from_value(d, val):
    keys = [k for k, v in d.items() if v == val]
    if keys:
        return keys[0]
    return None

def hair(image, parsing, part=17, color=[230, 50, 20]):
    b, g, r = color      #[10, 50, 250]       # [10, 250, 10]
    tar_color = np.zeros_like(image)
    tar_color[:, :, 0] = b
    tar_color[:, :, 1] = g
    tar_color[:, :, 2] = r

    image_hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    tar_hsv = cv2.cvtColor(tar_color, cv2.COLOR_BGR2HSV)

    if part == 12 or part == 13:
        image_hsv[:, :, 0:2] = tar_hsv[:, :, 0:2]
    else:
        image_hsv[:, :, 0:1] = tar_hsv[:, :, 0:1]

    changed = cv2.cvtColor(image_hsv, cv2.COLOR_HSV2BGR)

    if part == 17:
        changed = sharpen(changed)

    changed[parsing != part] = image[parsing != part]
    return changed


# Entry point of the main routine (execution starts here)
if __name__ == '__main__':
    # 1  face
    # 11 teeth
    # 12 upper lip
    # 13 lower lip
    # 17 hair

    parser = parse_args()
    opt = parser.parse_args()

    if len(opt.source_image) == 0:
        opt.source_image = my_dialog.select_image_file('静止画像　', './imgs')
        if len(opt.source_image) == 0:
            exit(0)

        # Output file name
        base_dir_pair = os.path.split(opt.source_image)
        s_name, ext = os.path.splitext(base_dir_pair[1])
        name, ext = os.path.splitext(opt.result_image)
        opt.result_image = name + '_' + s_name + ext

    # Check for a GPU (the --cpu choice only matters when a GPU is available)
    gpu_d = torch.cuda.is_available()
    cpu_d = not gpu_d
    opt.cpu = opt.cpu or cpu_d                                                  # force CPU mode when there is no GPU

    display_info(opt, title)

    table = {
        'hair': 17,
        'upper_lip': 12,
        'lower_lip': 13
    }

    image_path = opt.source_image
    cp = opt.checkpoint

    image = cv2.imread(image_path)

    # Upscale the image if it is small
    img_h, img_w = image.shape[:2]
    if img_h < 512 or img_w < 512:
        image = cv2.resize(image, (512, 512),interpolation = cv2.INTER_CUBIC)
        image_path = './temp.jpg'
        cv2.imwrite(image_path, image)

    ori = image.copy()
    parsing = evaluate(image_path, cp, cpu = opt.cpu)            # 2024.07.16
    # cv2.resize expects dsize as (width, height), so swap the shape order
    parsing = cv2.resize(parsing, (image.shape[1], image.shape[0]), interpolation=cv2.INTER_NEAREST)

    parts = [table['hair'], table['upper_lip'], table['lower_lip']]
    colors = [[230, 50, 20], [20, 70, 180], [20, 70, 180]]

    ds_image = np.zeros((1024,1024,3), dtype = np.uint8)
    n = 1
    win_name = 'original + '
    for part, color in zip(parts, colors):
        image = hair(image, parsing, part, color)
        img = image.copy()

        x = (n % 2) * 512
        y = (n // 2) * 512
        img = cv2.resize(img, (512, 512))
        ds_image[y:y + 512, x:x + 512] = img
        n = n + 1

    ds_image[0:512, 0:512] = cv2.resize(ori, (512, 512))

    win_name = 'original + hair + upper_lip + lower_lip'

    my_imagetool.image_disp(ds_image, win_name, save_path = opt.result_image)

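Creating "parsing.py", a program that separates the facial parts

Main features
・A program called from "makeup2.py" that returns the per-pixel index map of the facial parts
・It appears to derive from the project's test program "test.py"; that code was analyzed and a revised version was written
・The visualized result for an input image can be inspected
・If a CUDA error occurs because there is no GPU or not enough memory, add the "--cpu" option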
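
For the visualization, "parsing.py" converts between Matplotlib-style RGB floats (0 to 1) and OpenCV-style BGR integers (0 to 255). The conversion can be sketched without either library as follows. The helper names are hypothetical; the truncation with `int()` follows the `colormap2bgrcolors` function in the listing.

```python
def rgb_float_to_bgr255(rgb):
    """(r, g, b) floats in 0..1 -> [b, g, r] integers in 0..255 (truncated)."""
    r, g, b = rgb
    return [int(b * 255), int(g * 255), int(r * 255)]


def bgr255_to_rgb_float(bgr):
    """[b, g, r] integers in 0..255 -> (r, g, b) floats in 0..1."""
    b, g, r = bgr
    return (r / 255, g / 255, b / 255)


if __name__ == "__main__":
    orange = (1.0, 0.5, 0.0)
    bgr = rgb_float_to_bgr255(orange)
    print(bgr, bgr255_to_rgb_float(bgr))
```

Because of the truncation, a round trip can lose up to one step of the 0 to 255 range per channel, which is not visible in a legend patch.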

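Output file location and names (when --result_image './results/pars.jpg' is specified)
・Files are saved in the "./results" folder (the "./results" folder must already exist)
・Index image → 'pars_' + <still image> + '_a' + '.jpg'
・Mask overlay image → 'pars_' + <still image> + '.jpg'
・Mask image → 'pars_' + <still image> + '.png'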

List of command options

Command option	Argument	Default	Meaning
--source_image	str	'' (chosen through a dialog)	Path to the still image
--result_image	str	'./results/pars.jpg'	Path of the output file
--checkpoint	str	'./cp/79999_iter.pth'	Pretrained model file
--cpu	bool	False	cpu mode

Command example

(py38_learn) python parsing.py

Face-makeup demo Ver. 0.01: Starting application...

   - source_image            :  C:/anaconda_win/workspace_2/motion-co-seg/face_parsing/imgs/116.jpg
   - result_image            :  ./results/result_116.jpg
   - checkpoint              :  ./cp/79999_iter.pth
   - cpu                     :  False

Module source code

▼ "parsing.py"

# -*- coding: utf-8 -*-
##------------------------------------------
##   Face parsing check program   Ver 0.01
##      Examines the face-parsing procedure
##
##               2024.07.17 Masahiro Izutsu
##------------------------------------------
## parsing.py           (original:  test.py)

# Color Escape Code
GREEN = '\033[1;32m'
RED = '\033[1;31m'
NOCOLOR = '\033[0m'
YELLOW = '\033[1;33m'

# import
import torch
import os
from model import BiSeNet
import os.path as osp
import numpy as np
from PIL import Image
import torchvision.transforms as transforms
import cv2

import argparse
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import my_imagetool
import my_dialog

# Title
title = 'Face parsing check program Ver. 0.01'

# Parses arguments for the application
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--source_image", default='', help="path to source image")
    parser.add_argument("--result_image", default='./results/pars.jpg', help="path to output")
    parser.add_argument("--checkpoint", default='./cp/79999_iter.pth', help="path to checkpoint to restore")
    parser.add_argument("--cpu", action="store_true", help="cpu mode")

    return parser

# Display basic information
def display_info(opt, title):
    print('\n' + GREEN + title + ': Starting application...' + NOCOLOR)
    print('\n   - ' + YELLOW + 'source_image            : ' + NOCOLOR, opt.source_image)
    print('   - ' + YELLOW + 'result_image            : ' + NOCOLOR, opt.result_image)
    print('   - ' + YELLOW + 'checkpoint              : ' + NOCOLOR, opt.checkpoint)
    print('   - ' + YELLOW + 'cpu                     : ' + NOCOLOR, opt.cpu)
    print(' ')

# Get BGR color values from a matplotlib colormap
def colormap2bgrcolors(num_segments = 19, colormap='tab20'):
    colormap = plt.get_cmap(colormap)
    bgr_colors = []
    for i in range(num_segments):
        if i != 0:
            color = np.array(colormap((i - 1) / (num_segments - 1)))[:3]
        else:
            color = np.array((0, 0, 0))

        c = [int(color[2] * 255), int(color[1] * 255), int(color[0]* 255)]
        bgr_colors.append(c)

    return bgr_colors

# Add a color-index legend to the image
def add_patch(cv_image, bgr_colors, save_path = '', dispf = False, id_num = 19):
    patches = []
    for i in range(min(len(bgr_colors), id_num)):
        bgr = bgr_colors[i]
        rgb = [float(bgr[2] / 255), float(bgr[1] / 255), float(bgr[0] / 255)]
        patches.append(mpatches.Patch(color = rgb, label = str(i)))

    pil_image = cv_image[:, :, ::-1]
    fig, ax = plt.subplots(1, 1, figsize=(8,8), dpi=64)
    fig.subplots_adjust(left=0, right=1, bottom=0, top=1)
    ax.imshow(pil_image)
    ax.legend(handles=patches)
    ax.axis('off')
    if len(save_path) > 0:
        plt.savefig(save_path)
    if dispf:
        plt.show()
    plt.close()

# Visualize the segmentation
#   im:             <class 'PIL.Image.Image'>
#   parsing_anno:   <class 'numpy.ndarray'>     int64   (512, 512)
def vis_parsing_maps(im, parsing_anno, stride, save_path = ''):
    '''
    # The color table below was replaced by a matplotlib colormap definition
    # Colors for all 20 parts
    part_colors = [[255, 0, 0], [255, 85, 0], [255, 170, 0],
                   [255, 0, 85], [255, 0, 170],
                   [0, 255, 0], [85, 255, 0], [170, 255, 0],
                   [0, 255, 85], [0, 255, 170],
                   [0, 0, 255], [85, 0, 255], [170, 0, 255],
                   [0, 85, 255], [0, 170, 255],
                   [255, 255, 0], [255, 255, 85], [255, 255, 170],
                   [255, 0, 255], [255, 85, 255], [255, 170, 255],
                   [0, 255, 255], [85, 255, 255], [170, 255, 255]]
    '''
    part_colors = colormap2bgrcolors()

    im = np.array(im)
    vis_im = im.copy().astype(np.uint8)
    vis_parsing_anno = parsing_anno.copy().astype(np.uint8)
    vis_parsing_anno = cv2.resize(vis_parsing_anno, None, fx=stride, fy=stride, interpolation=cv2.INTER_NEAREST)
#    vis_parsing_anno_color = np.zeros((vis_parsing_anno.shape[0], vis_parsing_anno.shape[1], 3)) + 255     # make index 0 white
    vis_parsing_anno_color = np.zeros((vis_parsing_anno.shape[0], vis_parsing_anno.shape[1], 3))

    num_of_class = np.max(vis_parsing_anno)

    for pi in range(1, num_of_class + 1):
        index = np.where(vis_parsing_anno == pi)
        vis_parsing_anno_color[index[0], index[1], :] = part_colors[pi]

    # segment mask
    vis_parsing_anno_color = vis_parsing_anno_color.astype(np.uint8)

    # segment-maps + image (blended image)
    vis_im1 = cv2.addWeighted(cv2.cvtColor(vis_im, cv2.COLOR_RGB2BGR), 0.4, vis_parsing_anno_color, 0.6, 0)

    # segment-maps
    vis_im2 = cv2.addWeighted(cv2.cvtColor(vis_im, cv2.COLOR_RGB2BGR), 0, vis_parsing_anno_color, 1, 0)

    # Save result or not
    if len(save_path) > 0:
        save_paths = get_outpath2(save_path)
        cv2.imwrite(save_paths[0], vis_parsing_anno)
#        cv2.imwrite(save_paths[1], vis_im1, [int(cv2.IMWRITE_JPEG_QUALITY), 100])
        add_patch(vis_im1, part_colors, save_path = save_paths[1])
        cv2.imwrite(save_paths[2], vis_im2, [int(cv2.IMWRITE_JPEG_QUALITY), 100])
    return vis_parsing_anno

# Run the segmentation
def evaluate(source_img, cp, save_path = '', cpu = False, maps_f = False):
    n_classes = 19
    net = BiSeNet(n_classes=n_classes)

    if cpu:
        net.cpu()
        net.load_state_dict(torch.load(cp, torch.device('cpu')))
    else:
        net.cuda()
        net.load_state_dict(torch.load(cp))

    net.eval()

    to_tensor = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
    ])

    with torch.no_grad():

        img = Image.open(source_img) if type(source_img) is str else source_img
        image = img.resize((512, 512), Image.BILINEAR)
        img = to_tensor(image)
        img = torch.unsqueeze(img, 0)

        if cpu:
            img = img.cpu()
        else:
            img = img.cuda()

        out = net(img)[0]
        parsing = out.squeeze(0).cpu().numpy().argmax(0)
        # print(parsing)
        # print(np.unique(parsing))

        if maps_f:
            vis_parsing_maps(image, parsing, stride = 1, save_path = save_path)

        return parsing

# Build an output file name that includes the source file name
def get_outpath(source_image, result_image):
    base_dir_pair = os.path.split(source_image)
    s_name, ext = os.path.splitext(base_dir_pair[1])
    name, ext = os.path.splitext(result_image)
    result_image = name + '_' + s_name + ext
    return result_image

# Build the three output file names (name.png, name<ext>, name_a<ext>)
def get_outpath2(result_image):
    save_path = []
    name, ext = os.path.splitext(result_image)
    save_path.append(name + '.png')
    save_path.append(name + ext)
    save_path.append(name + '_a' + ext)
    return save_path


# Entry point (execution starts here)
if __name__ == "__main__":
    parser = parse_args()
    opt = parser.parse_args()

    if len(opt.source_image) == 0:
        opt.source_image = my_dialog.select_image_file('Still image ', './imgs')
        if len(opt.source_image) == 0:
            exit(0)

        opt.result_image = get_outpath(opt.source_image, opt.result_image)

    # Check for a GPU (CPU mode is optional only when a GPU is available)
    gpu_d = torch.cuda.is_available()
    cpu_d = not gpu_d
    opt.cpu = opt.cpu or cpu_d                                                  # force CPU mode when no GPU is available

    display_info(opt, title)

    parsing = evaluate(opt.source_image, cp = opt.checkpoint, save_path = opt.result_image, cpu = opt.cpu, maps_f = True)

    # parsing Image
    images = []
    image = cv2.imread(opt.source_image)
    image = cv2.resize(image, dsize = (512, 512))
    images.append(image)
    save_paths = get_outpath2(opt.result_image)
    for path in save_paths:
        image = cv2.imread(path)
        images.append(image)

    ds_image = my_imagetool.make_tileimage(images, 1024, 1024)
    winname = 'Make-up process'
    my_imagetool.image_disp(ds_image, winname, maxsize = my_imagetool.WINDOW_WIDTH)

Facial part IDs produced by the segmentation

From face-parsing.PyTorch/FaceParsing.py

id	label	note
0	background
1	skin	face
2	l_brow	left eyebrow
3	r_brow	right eyebrow
4	l_eye	left eye
5	r_eye	right eye
6	eye_g	eye glasses
7	l_ear	left ear
8	r_ear	right ear
9	ear_r	ear ring
10	nose
11	mouth	area between lips
12	u_lip	upper lip
13	l_lip	lower lip
14	neck
15	neck_l	necklace
16	cloth	clothing
17	hair
18	hat
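
To show how these IDs are typically used, the sketch below builds a boolean mask for one make-up category from an index map. The grouping of labels into the categories 'hair', 'lips', and 'face' is an assumption made for illustration; it is not taken from the "makeup_test" module, and `category_mask` is a hypothetical helper.

```python
import numpy as np

# Label IDs copied from the table above.
PART_IDS = {
    "background": 0, "skin": 1, "l_brow": 2, "r_brow": 3, "l_eye": 4,
    "r_eye": 5, "eye_g": 6, "l_ear": 7, "r_ear": 8, "ear_r": 9,
    "nose": 10, "mouth": 11, "u_lip": 12, "l_lip": 13, "neck": 14,
    "neck_l": 15, "cloth": 16, "hair": 17, "hat": 18,
}

# Assumed grouping of labels per make-up category (illustrative only).
CATEGORY_LABELS = {
    "hair": ["hair"],
    "lips": ["u_lip", "l_lip"],
    "face": ["skin"],
}


def category_mask(parsing: np.ndarray, category: str) -> np.ndarray:
    """Return a boolean mask of the pixels that belong to the category."""
    ids = [PART_IDS[label] for label in CATEGORY_LABELS[category]]
    return np.isin(parsing, ids)


# Tiny 2x3 index map standing in for the 512x512 output of the parser.
parsing = np.array([[0, 17, 17],
                    [1, 12, 13]])
print(category_mask(parsing, "lips"))
print("hair pixels:", int(category_mask(parsing, "hair").sum()))
```

A mask of this kind can then be used to restrict a color change to the selected part.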

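Building "makeup_gui.py", a make-up simulator

Main features
・Changes the color of the "face", "hair", and "lips" in real time
・Works with face images of any size
・Colors are specified with the intuitive "H (hue)", "S (saturation)", and "V (value / brightness)" components
・The result shown in real time can be saved side by side with the original image at any moment

Output location and file name (with --result_image './results/make.jpg', the default)
・The file is saved in the "./results" folder (the folder must already exist)
・Saved image → 'make_' + <category> + '_' + <R>-<G>-<B> + '_' + <source image name> + '.jpg'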
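
As a quick check of this naming rule, the following sketch assembles the save path from its parts. `makeup_output_path` is a hypothetical helper written for illustration; it only mirrors the pattern described above.

```python
import os


def makeup_output_path(source_image, result_image, category, rgb):
    """Compose <stem>_<category>_<R>-<G>-<B>_<source name><ext>."""
    stem, ext = os.path.splitext(result_image)
    src_name = os.path.splitext(os.path.basename(source_image))[0]
    r, g, b = rgb
    return f"{stem}_{category}_{r}-{g}-{b}_{src_name}{ext}"


print(makeup_output_path("./imgs/116.jpg", "./results/make.jpg", "face", (210, 120, 110)))
# -> ./results/make_face_210-120-110_116.jpg
```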

Command-line options (startup settings)

Option	Type	Default	Description
--source_image	str	'./imgs/116.jpg' ('' opens a file dialog)	Path to the source still image
--result_image	str	'./results/make.jpg'	Path of the output file
--checkpoint	str	'./cp/79999_iter.pth'	Trained model file
--cpu	bool	False	Run in CPU mode
--category	str	'face'	Category: 'hair' / 'lips' / 'face'
--swap_color	list	[210,120,110]	Replacement color (R,G,B)
--log	int	3	Log level (-1/0/1/2/3/4/5)
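
The "--swap_color" option takes a comma-separated string and turns it into a list of integers. The sketch below shows one way to do this with argparse. The range check in `rgb_triplet` is an addition made for illustration and is not part of the original parser. Note that argparse also applies the `type` function to a string default, so the default arrives as a list as well.

```python
import argparse


def rgb_triplet(text: str) -> list:
    """Convert 'R,G,B' into [R, G, B], rejecting malformed input."""
    values = [int(v) for v in text.split(",")]
    if len(values) != 3 or not all(0 <= v <= 255 for v in values):
        raise argparse.ArgumentTypeError("expected R,G,B with values in 0-255")
    return values


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("-c", "--category", default="face",
                        choices=["hair", "lips", "face"], help="make-up category")
    parser.add_argument("--swap_color", default="210,120,110",
                        type=rgb_triplet, help="replacement color (R,G,B)")
    parser.add_argument("--cpu", action="store_true", help="cpu mode")
    return parser


# Explicit arguments.
opt = build_parser().parse_args(["--category", "lips", "--swap_color", "180,40,60"])
print(opt.category, opt.swap_color, opt.cpu)

# No arguments: the string default is converted by rgb_triplet too.
defaults = build_parser().parse_args([])
print(defaults.category, defaults.swap_color)
```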

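How to operate

① Select a face image
② The original image of the currently selected face and its file path
③ The image processed with the current settings and the file path used when saving
④ The currently selected color
⑤ Sliders for changing the color by H (hue), S (saturation), and V (value) ※

⑥ Category selection (hair / lips / face)
⑦ CPU / GPU switch (when a GPU is available)
⑧ Save the current processed image
⑨ Exit the application

  ※ Reference → HSV color space (Wikipedia)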
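
The sliders work in the 8-bit HSV convention used by OpenCV, where hue is stored as degrees divided by two (0 to 179) and saturation and value run up to 255. The sketch below reproduces that mapping with the standard-library `colorsys` module so it can be tried without OpenCV. The helper names are hypothetical, and the rounding may differ slightly from what `cv2.cvtColor` returns.

```python
import colorsys


def rgb_to_hsv_8bit(r: int, g: int, b: int) -> tuple:
    """RGB (0-255 each) -> HSV with H in 0-179 and S, V in 0-255."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # Hue is a fraction of a full turn; 180 steps correspond to 360 degrees.
    return int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255))


def hsv_8bit_to_rgb(h: int, s: int, v: int) -> tuple:
    """HSV with H in 0-179 and S, V in 0-255 -> RGB (0-255 each)."""
    r, g, b = colorsys.hsv_to_rgb(h / 180.0, s / 255.0, v / 255.0)
    return int(round(r * 255)), int(round(g * 255)), int(round(b * 255))


# Initial slider positions for the default color (210, 120, 110).
print(rgb_to_hsv_8bit(210, 120, 110))

# Moving only the hue slider to 60 (120 degrees) at full S and V gives pure green.
r, g, b = hsv_8bit_to_rgb(60, 255, 255)
print(f"#{r:02x}{g:02x}{b:02x}")
```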

Command example

(py38_learn) python makeup_gui.py

Face-makeup GUI Ver. 0.01: Starting application...

   - source_image            :  ./imgs/116.jpg
   - result_image            :  ./results/make_face_210-120-110_116.jpg
   - checkpoint              :  ./cp/79999_iter.pth
   - cpu                     :  False

   - category                :  face
   - swap_color              :  [210, 120, 110]

   - log                     :  3

Finished.

Example output ('hair' / 'lips' / 'face')

Module source code

▼ "makeup_gui.py"

# -*- coding: utf-8 -*-
##------------------------------------------
##   Face-makeup GUI program     Ver 0.01
##
##               2024.07.21 Masahiro Izutsu
##------------------------------------------
## makeup_gui.py

# Color Escape Code
GREEN = '\033[1;32m'
RED = '\033[1;31m'
NOCOLOR = '\033[0m'
YELLOW = '\033[1;33m'

KEY_CPU = '-CPU-'
KEY_GPU = '-GPU-'
KEY_DEVICE = '-Device-'
KEY_HUE = '-Hue-'
KEY_SAT = '-Sat-'
KEY_VAL = '-Val-'
KEY_COLOR = '-Color-'
KEY_EXIT = '-Exit-'
KEY_ORGIMAGE = '-Org-'
KEY_PRSIMAGE = '-Prs-'
KEY_HAIR = '-Hair-'
KEY_LIPS = '-Lips-'
KEY_FACE = '-Face-'
KEY_IMAGESEL = '-Image-'
KEY_IMAGESAVE = '-Save-'
KEY_TXTORG = '-Source-'
KEY_TXTPRS = '-Process-'

COL_CANVAS_SIZE = 60
IMG_CANCAS_SIZE = 512

# import
import cv2
import os
import numpy as np
import argparse

import torch
from PIL import Image, ImageTk
import PySimpleGUI as sg
import parsing
import makeup_test
import my_imagetool
import my_dialog
import my_logging

# Title
title = 'Face-makeup GUI Ver. 0.01'


# Parses arguments for the application
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--source_image", default='./imgs/116.jpg', help="path to source image")
    parser.add_argument("--result_image", default='./results/make.jpg', help="path to output")
    parser.add_argument("--checkpoint", default='./cp/79999_iter.pth', help="path to checkpoint to restore")
    parser.add_argument("--cpu", action="store_true", help="cpu mode")
    parser.add_argument("-c", "--category", default='face', type = str, choices=['hair', 'lips', 'face'], help="make-up category")
    parser.add_argument("--swap_color", default='210,120,110', type=lambda x: list(map(int, x.split(','))), help='swapped color (R,G,B)')
    parser.add_argument('--log', metavar = 'LOG', default = '3', help = 'Log level(-1/0/1/2/3/4/5) Default value is \'3\'')

    return parser

# Display basic information
def display_info(opt, title):
    makeup_test.display_info(opt, title)
    print('   - ' + YELLOW + 'log                     : ' + NOCOLOR, opt.log)
    print(' ')

def hsv_to_rgb(h, s, v):
    bgr = cv2.cvtColor(np.array([[[h, s, v]]], dtype=np.uint8), cv2.COLOR_HSV2BGR)[0][0]
    return (bgr[2], bgr[1], bgr[0])


def rgb_to_hsv(r, g, b):
    hsv = cv2.cvtColor(np.array([[[b, g, r]]], dtype=np.uint8), cv2.COLOR_BGR2HSV)[0][0]
    return (hsv[0], hsv[1], hsv[2])

# Recolor the original image and return it as a Canvas image
def makup_color(cv2_image, category, parsing, color_bgr):
    prs_image = makeup_test.makeup_image(cv2_image, category, parsing, color_bgr)

    prs_image = cv2.cvtColor(prs_image, cv2.COLOR_BGR2RGB)
    p_image = Image.fromarray(prs_image)                                        # OpenCV image -> PIL image
    p_image = p_image.resize((IMG_CANCAS_SIZE, IMG_CANCAS_SIZE))
    tkcv_image = ImageTk.PhotoImage(image = p_image)
    return tkcv_image

# Build the output file name
def get_outpath(source_image, result_image, color):
    name, ext = os.path.splitext(result_image)
    outpath = name + '_' + opt.category + f'_{color[0]}-{color[1]}-{color[2]}' + ext
    outpath = parsing.get_outpath(source_image, outpath)
    return outpath

# ** main function **
def main(opt):
    color = opt.swap_color
    color_bgr = [color[2], color[1], color[0]]                                  # RGB > BGR
    image_path = opt.source_image
    save_path = opt.result_image
    category = opt.category
    checkpoint = opt.checkpoint
    cpu = opt.cpu

    radio0 = category == 'hair'
    radio1 = category == 'lips'
    radio2 = category == 'face'

    # Window theme
    sg.theme('BlueMono')

    # Initial color values
    hex_rgb = f'#{color[0]:02x}{color[1]:02x}{color[2]:02x}'
    h, s, v = rgb_to_hsv(color[0], color[1], color[2])

    # Window layout
    col_image0 = [
            [sg.Canvas(size=(IMG_CANCAS_SIZE, IMG_CANCAS_SIZE), key=KEY_ORGIMAGE)], 
            [sg.Text(image_path, background_color='LightSteelBlue1', size=(63, 1), key = KEY_TXTORG)]
    ]
    col_image1 = [
            [sg.Canvas(size=(IMG_CANCAS_SIZE, IMG_CANCAS_SIZE), key=KEY_PRSIMAGE)], 
            [sg.Text(save_path, background_color='LightSteelBlue1', size=(63, 1), key = KEY_TXTPRS)]
    ]
    col_color = [
            [sg.Text("H (Hue)", size=(22, 1)), sg.Slider((0, 179), h, 1, orientation='h', size=(35, 5), key=KEY_HUE, enable_events=True)],
            [sg.Text("S (Saturation / Chroma)", size=(22, 1)), sg.Slider((1, 255), s, 1, orientation='h', size=(35, 5), key=KEY_SAT, enable_events=True)],
            [sg.Text("V (Value / Brightness)", size=(22, 1)), sg.Slider((1, 255), v, 1, orientation='h', size=(35, 5), key=KEY_VAL, enable_events=True)],
    ]
    col_radio = [
            [sg.Text("Category", size=(10, 1)), sg.Radio('hair', group_id='category',enable_events = True, default=radio0, key=KEY_HAIR), sg.Radio('lips', group_id='category',enable_events = True, default=radio1, key=KEY_LIPS), sg.Radio('face', group_id='category',enable_events = True, default=radio2, key=KEY_FACE)],
            [sg.Text("Device", size=(10, 1)), sg.Radio("CPU", group_id='device', default=cpu, key=KEY_CPU), sg.Radio("GPU", group_id='device', default=not cpu, disabled = cpu_d, key=KEY_GPU)],
            [sg.Text("", size=(15, 1)), sg.Button('Image', size=(10, 1), key=KEY_IMAGESEL), sg.Button('Save', size=(10, 1), key=KEY_IMAGESAVE), sg.Text("", size=(1, 1)), sg.Button('Exit', size=(10, 1), key=KEY_EXIT)],
    ]

    layout = [[sg.Text(title, size=(30, 1), justification='center', font='Helvetica 20')],
            [sg.Column(col_image0), sg.Column(col_image1)],
            [sg.Canvas(size=(COL_CANVAS_SIZE, COL_CANVAS_SIZE), key=KEY_COLOR), sg.Column(col_color), sg.Column(col_radio)], 
            [sg.Text("", size=(15, 1))]
    ]

    # Create the window object
    window = sg.Window(title, layout, finalize=True, return_keyboard_events=True)

    col_canvas = window[KEY_COLOR]
    tcv_color = col_canvas.TKCanvas
    tcv_color.create_rectangle(0, 0, COL_CANVAS_SIZE, COL_CANVAS_SIZE, fill = hex_rgb)

    im0_canvas = window[KEY_ORGIMAGE]
    tcv_img0 = im0_canvas.TKCanvas
    tcv_img0.create_rectangle(0, 0, IMG_CANCAS_SIZE, IMG_CANCAS_SIZE, fill = '#cccccc')

    im1_canvas = window[KEY_PRSIMAGE]
    tcv_img1 = im1_canvas.TKCanvas
    tcv_img1.create_rectangle(0, 0, IMG_CANCAS_SIZE, IMG_CANCAS_SIZE, fill = '#cccccc')

    org_image = Image.open(image_path)                                                  # load as a PIL image (original image)
    p_image = org_image.resize((IMG_CANCAS_SIZE, IMG_CANCAS_SIZE))
    pht0_image = ImageTk.PhotoImage(image = p_image)
    tcv_img0.create_image(IMG_CANCAS_SIZE / 2, IMG_CANCAS_SIZE / 2, image = pht0_image) # draw at the center of the Canvas

    cv2_image = np.array(org_image, dtype = np.uint8)                                   # PIL image -> OpenCV image
    cv2_image = cv2.cvtColor(cv2_image, cv2.COLOR_RGB2BGR)

    pars = parsing.evaluate(org_image, checkpoint, cpu = cpu)
    pars = cv2.resize(pars, (cv2_image.shape[1], cv2_image.shape[0]), interpolation = cv2.INTER_NEAREST)   # dsize is (width, height)

    # Recolor and get the Canvas image
    pht1_image = makup_color(cv2_image, category, pars, color_bgr)
    tcv_img1.create_image(IMG_CANCAS_SIZE / 2, IMG_CANCAS_SIZE / 2, image = pht1_image) # draw at the center of the Canvas

    # Event loop
    while True:
        event, values = window.read()
        if event == KEY_EXIT or event == sg.WIN_CLOSED or event == 'Escape:27':
            logger.info('Finished.\n')
            break

        if event == KEY_HUE or event == KEY_SAT or event == KEY_VAL:
            hue = int(values[KEY_HUE])
            sat = int(values[KEY_SAT])
            val = int(values[KEY_VAL])
            logger.debug(f'Hue = {hue}, Sat = {sat}, Val = {val}')


            r, g, b = hsv_to_rgb(hue, sat, val)
            hex_rgb = f'#{r:02x}{g:02x}{b:02x}'
            color_bgr[0], color_bgr[1], color_bgr[2] = b, g, r
            color[0], color[1], color[2] = r, g, b

            logger.debug(f'R = {r}, G = {g}, B = {b}  {hex_rgb}')
            tcv_color.create_rectangle(0, 0, COL_CANVAS_SIZE, COL_CANVAS_SIZE, fill=hex_rgb)

            save_path = get_outpath(image_path, result_image, color)
            window[KEY_TXTPRS].update(save_path)

            # Recolor and get the Canvas image
            pht1_image = makup_color(cv2_image, category, pars, color_bgr)
            tcv_img1.create_image(IMG_CANCAS_SIZE / 2, IMG_CANCAS_SIZE / 2, image = pht1_image) # draw at the center of the Canvas

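        if event == KEY_HAIR or event == KEY_LIPS or event == KEY_FACE:
            if values[KEY_HAIR]:
                category = 'hair'
            elif values[KEY_LIPS]:
                category = 'lips'
            elif values[KEY_FACE]:
                category = 'face'

            # get_outpath() reads opt.category, so keep it in sync with the selected radio button
            opt.category = category
            save_path = get_outpath(image_path, result_image, color)
            window[KEY_TXTPRS].update(save_path)

            pars = parsing.evaluate(org_image, checkpoint, cpu = cpu)
            pars = cv2.resize(pars, (cv2_image.shape[1], cv2_image.shape[0]), interpolation = cv2.INTER_NEAREST)   # dsize is (width, height)

            # Recolor and get the Canvas image
            pht1_image = makup_color(cv2_image, category, pars, color_bgr)
            tcv_img1.create_image(IMG_CANCAS_SIZE / 2, IMG_CANCAS_SIZE / 2, image = pht1_image) # draw at the center of the Canvas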

        if event == KEY_IMAGESEL:
            fpath = my_dialog.select_image_file('Still image ', './imgs')
            if len(fpath) > 0:
                # Switch to the newly selected original image
                image_path = fpath
                save_path = get_outpath(image_path, result_image, color)
                window[KEY_TXTORG].update(image_path)
                window[KEY_TXTPRS].update(save_path)
                org_image = Image.open(image_path)                                      # load as a PIL image (original image)
                p_image = org_image.resize((IMG_CANCAS_SIZE, IMG_CANCAS_SIZE))
                pht0_image = ImageTk.PhotoImage(image = p_image)
                tcv_img0.create_image(IMG_CANCAS_SIZE / 2, IMG_CANCAS_SIZE / 2, image = pht0_image) # draw at the center of the Canvas

                # Compute the part regions
                cv2_image = np.array(org_image, dtype = np.uint8)                       # PIL image -> OpenCV image
                cv2_image = cv2.cvtColor(cv2_image, cv2.COLOR_RGB2BGR)
                pars = parsing.evaluate(org_image, checkpoint, cpu = cpu)
                pars = cv2.resize(pars, (cv2_image.shape[1], cv2_image.shape[0]), interpolation = cv2.INTER_NEAREST)   # dsize is (width, height)

                # Recolor and get the Canvas image
                pht1_image = makup_color(cv2_image, category, pars, color_bgr)
                tcv_img1.create_image(IMG_CANCAS_SIZE / 2, IMG_CANCAS_SIZE / 2, image = pht1_image) # draw at the center of the Canvas

        if event == KEY_IMAGESAVE:
            save_path = get_outpath(image_path, result_image, color)
            ds_images = []
            ds_images.append(cv2_image)
            prs_image = makeup_test.makeup_image(cv2_image, category, pars, color_bgr)
            ds_images.append(prs_image)

            img_h, img_w = cv2_image.shape[:2]
            ds_image = my_imagetool.make_tileimage(ds_images, xmax = img_w * 2, ymax = img_h)   # two images side by side
            my_imagetool.image_disp(ds_image, dispf = True, save_path = save_path, maxsize = 1280, wait_s = 2)  # save

    # Close the window
    window.close()


# Entry point (execution starts here)
if __name__ == '__main__':
    parser = parse_args()
    opt = parser.parse_args()

    # Application logging setup
    module = os.path.basename(__file__)
    module_name = os.path.splitext(module)[0]
    logger = my_logging.get_module_logger_sel(module_name, int(opt.log))
    logger.debug('Starting..')

    if len(opt.source_image) == 0:
        opt.source_image = my_dialog.select_image_file('Still image ', './imgs')
        if len(opt.source_image) == 0:
            exit(0)

    if len(opt.swap_color) != 3:
        col = my_dialog.color_dialog('Choose color')
        if col == (None, None):
            exit(0)
        opt.swap_color = list(col[0])                                           # use the color picked in the dialog

    # Check for a GPU (CPU mode is optional only when a GPU is available)
    gpu_d = torch.cuda.is_available()
    cpu_d = not gpu_d
    opt.cpu = opt.cpu or cpu_d                                                  # force CPU mode when no GPU is available

    # Output file name
    result_image = opt.result_image
    opt.result_image = get_outpath(opt.source_image, result_image, opt.swap_color)

    # Display the parameters
    display_info(opt, title)

    main(opt)

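Problems addressed and error details

Changes to "logger.py"

Fix for the "from skimage.draw import circle" import error → line 7
```
from skimage.draw import ellipse as circle
```
This alias only clears the import error. ellipse(r, c, r_radius, c_radius) takes two radii, whereas the removed circle(r, c, radius) took one, so any code path that actually calls circle must be adapted (recent scikit-image versions provide disk((r, c), radius) as the replacement).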

Changes to "part_swap.py" (saved as "part_swap2.py" in the local environment)

Fix for the "import imageio" error → line 19
```
import imageio.v2 as imageio
```

Suppress warning messages → from line 22

import imageio.v2 as imageio
import warnings
warnings.simplefilter('ignore', UserWarning)
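
The filter above silences every UserWarning for the rest of the process. Where that is too broad, the same filter can be limited to a single call with `warnings.catch_warnings()`. The sketch below is an illustration under that assumption: `call_quietly` and `noisy_load` are hypothetical names, and `noisy_load` merely stands in for a library call that emits a UserWarning.

```python
import warnings


def call_quietly(func, *args, **kwargs):
    """Run func with UserWarning suppressed, restoring the filters afterwards."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", UserWarning)
        return func(*args, **kwargs)


def noisy_load():
    """Stand-in for a library call that emits a UserWarning."""
    warnings.warn("this API is deprecated", UserWarning)
    return 42


# The warning is hidden only inside call_quietly; the return value is unchanged.
print(call_quietly(noisy_load))
```

Outside of `call_quietly`, warnings are reported as before, which keeps unrelated problems visible.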

Fix for "TypeError: load() missing 1 required positional argument: 'Loader'" → line 119
```
#        config = yaml.load(f)
        config = yaml.load(f,Loader=yaml.Loader)
```

Changes to "face_parsing/model.py"

Line 10

try:
    from face_parsing.resnet import Resnet18
except ImportError:
    from resnet import Resnet18

Suppress warning messages → from line 15

import warnings
warnings.simplefilter('ignore')

Changes to "face_parsing/test.py"

Added CPU operation → from line 50

def evaluate(image_path='./imgs/116.jpg', cp='cp/79999_iter.pth', cpu=False):   # 2024.07.16

    # if not os.path.exists(respth):
    #     os.makedirs(respth)

    n_classes = 19
    net = BiSeNet(n_classes=n_classes)

    if cpu:
        net.cpu()
        net.load_state_dict(torch.load(cp, torch.device('cpu')))
    else:
        net.cuda()
        net.load_state_dict(torch.load(cp))

    net.eval()

Revision history

2024/07/14 First edition
2024/07/23 Added "makeup_gui.py"
