Editing Images with StyleGAN3: StyleGAN3 (Part 2)

  Edit images with "StyleGAN3" on a local machine

* Last updated: 2024/10/12

Editing images with StyleGAN3

  Port the image-editing program that runs on Google Colab to a local machine.
  The "video editing program" is put on hold this time because of hardware constraints.

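Overview

StyleGAN is a generative adversarial network (GAN) that generates high-resolution images.
With StyleGAN you can generate realistic images, such as human faces, that are hard to tell apart from real photographs.
StyleGAN output is realistic enough that some say "photographs will no longer serve as evidence".

Model overview diagram (from the paper below)
Paper: "Alias-Free Generative Adversarial Networks (StyleGAN3)"
<paper>
・https://nvlabs.github.io/stylegan3/
・https://arxiv.org/pdf/1812.04948 (the original StyleGAN paper; the StyleGAN3 paper itself is arXiv:2106.12423)
<framework>
・https://github.com/yuval-alaluf/stylegan3-editing

Previous verification
・Editing images and video with StyleGAN3: StyleGAN3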

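Setting up the execution environment

Run in the virtual environment "py38_learn".
If it has not been created yet, create it by following the steps in "Virtual environment (py38_learn)".

Download the project from GitHub

cd /anaconda_win/workspace_2      ← on Windows
cd ~/workspace_2                  ← on Linux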

git clone https://github.com/cedro3/stylegan3-editing.git

Download the project package project_stylegan3.zip (1.47GB) <StyleGAN>
・Folders created by extracting it

project_StyleGAN
└─workspace_2
    └─stylegan3-editing                                    ← overwrites the project cloned from GitHub
        ├─edit
        │  ├─align
        │  ├─crop
        │  ├─invert_psp
        │  └─latents_psp
        ├─editing
        │  ├─interfacegan
        │  │  └─boundaries
        │  │      └─ffhq
        │  │              age_boundary.npy                ①
        │  │              Male_boundary.npy               ②
        │  │              pose_boundary.npy               ③
        │  │              Smiling_boundary.npy            ④
        │  │
        │  └─styleclip_global_directions
        │      └─sg3-r-ffhq-1024
        │              delta_i_c.npy                       ⑤
        │              s_stats                             ⑥
        │
        └─pretrained_models
                restyle_e4e_ffhq.pt                         ⑦
                restyle_pSp_ffhq.pt                         ⑧
                shape_predictor_68_face_landmarks.dat       ⑨

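  * ①-⑨: the encoders (e4e/pSp) that compute latent variables from an image, and the image-editing parameters, all downloaded in advance

・Copy the contents of the extracted "project_StyleGAN/" folder over the following folder
  Windows → "anaconda_win/"   Linux → "~/"

Installing the missing packages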
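
After copying, it can help to confirm that files ①-⑨ ended up where the folder tree shows them. The following is a minimal sketch, not part of the project: the helper name `find_missing` is hypothetical, and the relative paths are copied from the tree above.

```python
from pathlib import Path

# Files (1)-(9) from the folder tree, relative to the stylegan3-editing project root.
REQUIRED_FILES = [
    "editing/interfacegan/boundaries/ffhq/age_boundary.npy",
    "editing/interfacegan/boundaries/ffhq/Male_boundary.npy",
    "editing/interfacegan/boundaries/ffhq/pose_boundary.npy",
    "editing/interfacegan/boundaries/ffhq/Smiling_boundary.npy",
    "editing/styleclip_global_directions/sg3-r-ffhq-1024/delta_i_c.npy",
    "editing/styleclip_global_directions/sg3-r-ffhq-1024/s_stats",
    "pretrained_models/restyle_e4e_ffhq.pt",
    "pretrained_models/restyle_pSp_ffhq.pt",
    "pretrained_models/shape_predictor_68_face_landmarks.dat",
]


def find_missing(project_root, required=REQUIRED_FILES):
    """Return the required relative paths that are not files under project_root."""
    root = Path(project_root)
    return [rel for rel in required if not (root / rel).is_file()]
```

Calling `find_missing(".")` from inside the `stylegan3-editing` folder should return an empty list when the overwrite copy succeeded.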

pip install pyrallis
pip install git+https://github.com/openai/CLIP.git


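InterFaceGAN editing program "stylegan3.py"

Main specifications
・Pre-processing for editing
  (1) Apply align processing (face fixed) to the face images in the image folder "./edit/pic" and save them under the same file names in "./edit/align"
  (2) Apply crop processing (background fixed) to the face images in "./edit/pic" and save them under the same file names in "./edit/crop"
  (3) Compute latent variables from the face-fixed images and save them in "./edit/latents_psp"; then, with the latent variables unedited, generate background-fixed images and save them in "./edit/invert_psp"

・Edit images using the generated latent variables
  <Setting parameters for editing>
  1. edit_direction  editing parameter (age, smile, pose, Male) (default: age)
  2. min_value       application coefficient (default: -5)
  3. max_value       application coefficient (default: 5)

・The generation parameters (min/max) can be changed with sliders
・On a machine without a usable GPU, the program runs in "view mode", which plays back results previously produced on a GPU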
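
The role of min_value and max_value can be illustrated with the general InterFaceGAN idea: a latent vector is shifted along a learned boundary direction by a range of factors, and each shifted latent yields one frame. The sketch below uses plain Python lists in place of real latent tensors. The function name `edit_latents`, the integer step of 1, and the inclusive endpoints are assumptions for illustration; the project's own `FaceEditor` may step differently.

```python
def edit_latents(latent, boundary, min_value=-5, max_value=5):
    """Return one edited latent per integer factor in [min_value, max_value].

    Each edited latent is latent + factor * boundary (element-wise).
    Step size and inclusive endpoints are assumptions of this sketch.
    """
    if min_value > max_value:
        raise ValueError("min_value must not exceed max_value")
    if len(latent) != len(boundary):
        raise ValueError("latent and boundary must have the same length")

    edited = []
    for factor in range(min_value, max_value + 1):
        edited.append([w + factor * n for w, n in zip(latent, boundary)])
    return edited
```

With the defaults (-5 to 5) this yields 11 latents; factor 0 reproduces the unedited latent, negative factors move one way along the direction (for example, younger) and positive factors the other.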

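Command option list

Command option	Argument	Default	Meaning
--model	str	'psp'	Encoder selection (psp/e4e)
--source_image	str	'./edit/pic/001.jpg'	Still image file
--result_path	str	'./results'	Output folder
--align	bool	True	Reuse existing aligned/cropped images; specifying the flag sets it to False and forces re-creation
--cpu	bool	False	Force CPU (auto-selected when not specified)
--log	int	3	Log level (-1/0/1/2/3/4/5)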
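
The `--align` and `--cpu` switches take no value; their defaults come from the argparse actions used in the source listing (`store_false` and `store_true`). The short sketch below reproduces only those two options to show the resulting values; `build_parser` is a hypothetical helper name.

```python
import argparse


def build_parser():
    """Parser with only the two boolean switches from the option table."""
    parser = argparse.ArgumentParser()
    # store_false: the value is True unless --align is given on the command line
    parser.add_argument("--align", action="store_false", help="make align image flag")
    # store_true: the value is False unless --cpu is given on the command line
    parser.add_argument("--cpu", dest="cpu", action="store_true", help="cpu mode.")
    return parser


default_opts = build_parser().parse_args([])
forced_opts = build_parser().parse_args(["--align", "--cpu"])
print(default_opts.align, default_opts.cpu)
print(forced_opts.align, forced_opts.cpu)
```

So running the program with `--align` makes `opt.align` False, which causes the align and invert images to be rebuilt even when they already exist.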

Example command run

(py38_learn_test) python stylegan3.py

StyleGAN3 edit    Ver 0.01: Starting application...

   - model                   :  psp
   - source_image            :  ./edit/pic/001.jpg
   - result_path             :  ./results
   - align                   :  True
   - log                     :  3
 
 check_target_image = True './edit/align'
 check_target_image = True './edit/invert_psp'

Finished.
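
The `check_target_image = True` lines report that the align and invert folders already hold output for the source pictures. The real `check_target_image` implementation is not shown in this section, so the following is only a guess at that kind of check, based on the specification that outputs are saved under the same file names. The function name `all_targets_exist` and the extension list are assumptions.

```python
import os

# Assumption: which extensions count as source pictures.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def all_targets_exist(img_dir, target_dir):
    """Return True if every image in img_dir has a same-named file in target_dir."""
    if not os.path.isdir(img_dir):
        return False
    names = [
        name for name in os.listdir(img_dir)
        if os.path.splitext(name)[1].lower() in IMAGE_EXTS
    ]
    # An empty source folder is treated as "nothing prepared yet".
    return bool(names) and all(
        os.path.isfile(os.path.join(target_dir, name)) for name in names
    )
```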

▼ "stylegan3.py" source code

# -*- coding: utf-8 -*-
##------------------------------------------
##  StyleGAN3 edit    Ver 0.01
##
##               2024.09.17 Masahiro Izutsu
##------------------------------------------
## stylegan3.py

import warnings
warnings.simplefilter('ignore')

# Color Escape Code ---------------------------
GREEN = '\033[1;32m'
RED = '\033[1;31m'
NOCOLOR = '\033[0m'
YELLOW = '\033[1;33m'
CYAN = '\033[1;36m'
BLUE = '\033[1;34m'

# Imports & initial settings
import os
import argparse

import numpy as np
import cv2
from PIL import Image, ImageTk
import PySimpleGUI as sg
import my_dialog

import _stylegan3

# Constant definitions
DEF_IMAGE = './edit/pic/001.jpg'
IMAGE_DIR = './images'
RESULT_PATH = './results'

KEY_ORGIMAGE = '-Org-'
KEY_PRSIMAGE = '-Prs-'
KEY_MIN = '-Min-'
KEY_MAX = '-Max-'
KEY_AGE = '-Age-'
KEY_POSE = '-Pose-'
KEY_SMILE = '-Smile-'
KEY_MALE = '-Male-'
KEY_IMAGESEL = '-Image-'
KEY_IMAGEPTOS = '-New-'
KEY_EXIT = '-Exit-'
KEY_TXTORG = '-Source-'
KEY_TXTPRS = '-Process-'

IMG_CANCAS_SIZE = 512


# Title
title = 'StyleGAN3 edit    Ver 0.01'
sub_title = ''

# Parses arguments for the application
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--model', type = str, default='psp', choices=['psp', 'e4e'], help = 'encoder type \'psp / e4e\'')
    parser.add_argument("--source_image", default=DEF_IMAGE, help="path to source image")
    parser.add_argument("--result_path", default=RESULT_PATH, help="path to output")
    parser.add_argument("--align", action="store_false", help="make align image flag")
    parser.add_argument("--cpu", dest="cpu", action="store_true", help="cpu mode.")
    parser.add_argument('--log', type = int, metavar = 'LOG', default = '3', help = 'Log level(-1/0/1/2/3/4/5) Default value is \'3\'')

    return parser

# Display basic information
def display_info(args, title):
    print('\n' + GREEN + title + ': Starting application...' + NOCOLOR)
    print('\n   - ' + YELLOW + 'model                   : ' + NOCOLOR, args.model)
    print('   - ' + YELLOW + 'source_image            : ' + NOCOLOR, args.source_image)
    print('   - ' + YELLOW + 'result_path             : ' + NOCOLOR, args.result_path)
    print('   - ' + YELLOW + 'align                   : ' + NOCOLOR, args.align)
    print('   - ' + YELLOW + 'cpu                     : ' + NOCOLOR, args.cpu)
    print('   - ' + YELLOW + 'log                     : ' + NOCOLOR, args.log)
    print(' ')

# Select a file from a directory under the current directory
def get_image_file(tgt_dir, msg = ''):
    # Only paths under the current directory are valid
    import re
    image_file = ''
    cpath = os.getcwd()
    cpath = re.sub(r'\\', '/', cpath)
    base_dir_pair = os.path.split(tgt_dir)
    if base_dir_pair[0] != '.':             # subdirectory (one level down)
        cpath = cpath + base_dir_pair[0][1:]

    while True:
        image_file = my_dialog.select_image_file(msg, tgt_dir)
        if len(image_file) == 0:
            break

        parir = os.path.split(image_file)
        parir = os.path.split(parir[0])
        if cpath == parir[0]:
            break

    return image_file


def main_process(opt, gan):
    gpu_d = gan.gpu_d

    # Create align & crop images
    flg = gan.check_target_image(gan.pic_dir, gan.align_dir) and opt.align
    if not flg:
        gan.align_images(gan.pic_dir)
        gan.make_align(disp_f = True)

    # Create invert images
    flg = gan.check_target_image(gan.pic_dir, gan.invert_dir) and opt.align
    if not flg:
        gan.invert_images()
        gan.make_invert(disp_f = True)

    s_title = f'   model:{gan.model_sel}  {sub_title}'
    direction = 'age'
    min = -5
    max = 5
    image_path = gan.crop_dir + '/' + gan.source_image
    _, save_path, _ = gan.get_infer_path(direction, min, max)

    radio0 = True if direction == 'age' else False
    radio1 = True if direction == 'pose' else False
    radio2 = True if direction == 'smile' else False
    radio3 = True if direction == 'Male' else False

    # Window theme
    sg.theme('BlueMono')

    # Window layout
    col_image0 = [
            [sg.Canvas(size=(IMG_CANCAS_SIZE, IMG_CANCAS_SIZE), key=KEY_ORGIMAGE)], 
            [sg.Text(image_path, background_color='LightSteelBlue1', size=(63, 1), key = KEY_TXTORG)],
            [sg.Text("min", size=(5, 1)), sg.Slider((-10, 10), min, resolution = 1, orientation='h', size=(50, 10), key=KEY_MIN, enable_events=True)],
            [sg.Text("max", size=(5, 1)), sg.Slider((-10, 10), max, resolution = 1, orientation='h', size=(50, 10), key=KEY_MAX, enable_events=True)]
    ]
    col_image1 = [
            [sg.Image(size=(IMG_CANCAS_SIZE, IMG_CANCAS_SIZE), key=KEY_PRSIMAGE)], 
            [sg.Text(save_path, background_color='LightSteelBlue1', size=(63, 1), key = KEY_TXTPRS)],
            [sg.Text("direction", size=(10, 1)), sg.Radio('age', group_id='direction',enable_events = True, default=radio0, key=KEY_AGE), sg.Radio('pose', group_id='direction',enable_events = True, default=radio1, key=KEY_POSE), sg.Radio('smile', group_id='direction',enable_events = True, default=radio2, key=KEY_SMILE), sg.Radio('Male', group_id='direction',enable_events = True, default=radio3, key=KEY_MALE)],
            [sg.Text("", size=(20, 1)), sg.Button('Image', size=(10, 1), key=KEY_IMAGESEL), sg.Button('Process', size=(10, 1), disabled = not gpu_d, key=KEY_IMAGEPTOS), sg.Text("", size=(1, 1)), sg.Button('Exit', size=(10, 1), key=KEY_EXIT)],
    ]

    layout = [[sg.Text("", size=(4, 1)), sg.Text(title + s_title, size=(60, 1), justification='left', font='Helvetica 20')],
            [sg.Column(col_image0, vertical_alignment='top'), sg.Column(col_image1, vertical_alignment='top')],
            [sg.Text("", size=(15, 1))]
    ]

    # Create the window object
    window = sg.Window(title, layout, finalize=True, return_keyboard_events=True)

    im0_canvas = window[KEY_ORGIMAGE]
    tcv_img0 = im0_canvas.TKCanvas
    tcv_img0.create_rectangle(0, 0, IMG_CANCAS_SIZE, IMG_CANCAS_SIZE, fill = '#cccccc')

    org_image = Image.open(image_path)                                                  # load as a PIL image (original image)
    p_image = org_image.resize((IMG_CANCAS_SIZE, IMG_CANCAS_SIZE))
    pht0_image = ImageTk.PhotoImage(image = p_image)
    tcv_img0.create_image(IMG_CANCAS_SIZE / 2, IMG_CANCAS_SIZE / 2, image = pht0_image) # show at the center of the Canvas

    # Processed video
    cap = cv2.VideoCapture(save_path)
    cap_open = cap.isOpened()

    # Clear the processed image
    def clear_primage():
        frame = np.zeros((IMG_CANCAS_SIZE, IMG_CANCAS_SIZE, 3), np.uint8)
        frame[:,:,] = 0xcc
        img = cv2.imencode('.png', frame)[1].tobytes()
        window[KEY_PRSIMAGE].update(img)

    new_make_f = False

    # Event loop
    while True:
        event, values = window.read(timeout=200)

        if new_make_f:
            gan.set_param(image_path, gan.model_sel)
            gan.edit_interface_gan(direction, min, max, disp_f = False)
            _, save_path, _ = gan.get_infer_path(direction, min, max)
            window[KEY_TXTPRS].update(save_path)

            cap = cv2.VideoCapture(save_path)
            cap_open = cap.isOpened()
            new_make_f = False

        if event == KEY_EXIT or event == sg.WIN_CLOSED or event == 'Escape:27':
            break

        if event == KEY_MIN or event == KEY_MAX:
            min = int(values[KEY_MIN])
            max = int(values[KEY_MAX])

            if cap_open:
                cap.release()
            gan.set_param(image_path, gan.model_sel)
            _, save_path, _ = gan.get_infer_path(direction, min, max)
            clear_primage()
            cap = cv2.VideoCapture(save_path)
            cap_open = cap.isOpened()

            window[KEY_TXTPRS].update(save_path)
            logger.debug(f'New Process → {image_path}, direction = {direction}, min = {min}, max = {max}{NOCOLOR}')

        if event == KEY_AGE or event == KEY_POSE or event == KEY_SMILE or event == KEY_MALE:
            if values[KEY_AGE]:
                direction = 'age'
            elif values[KEY_POSE]:
                direction = 'pose'
            elif values[KEY_SMILE]:
                direction = 'smile'
            elif values[KEY_MALE]:
                direction = 'Male'

            if cap_open:
                cap.release()
            gan.set_param(image_path, gan.model_sel)
            _, save_path, _ = gan.get_infer_path(direction, min, max)
            clear_primage()
            cap = cv2.VideoCapture(save_path)
            cap_open = cap.isOpened()

            window[KEY_TXTPRS].update(save_path)
            logger.debug(f'New Process → {image_path}, direction = {direction}, min = {min}, max = {max}{NOCOLOR}')

        if event == KEY_IMAGESEL:
            fpath = get_image_file(gan.crop_dir, msg = gan.crop_dir + '　')
            if len(fpath) > 0:
                # Change the original image
                image_path = fpath
                window[KEY_TXTORG].update(image_path)
                org_image = Image.open(image_path)                                      # load as a PIL image (original image)
                p_image = org_image.resize((IMG_CANCAS_SIZE, IMG_CANCAS_SIZE))
                pht0_image = ImageTk.PhotoImage(image = p_image)
                tcv_img0.create_image(IMG_CANCAS_SIZE / 2, IMG_CANCAS_SIZE / 2, image = pht0_image) # show at the center of the Canvas

                base_dir_pair = os.path.split(image_path)
                img_name, ext = os.path.splitext(base_dir_pair[1])
                if cap_open:
                    cap.release()
                    clear_primage()                     # reset the preview to the gray placeholder

                gan.set_param(image_path, gan.model_sel)
                _, save_path, _ = gan.get_infer_path(direction, min, max)
                cap = cv2.VideoCapture(save_path)
                cap_open = cap.isOpened()
                if gpu_d and not cap_open:
                    new_make_f = True

                window[KEY_TXTORG].update(image_path)
                window[KEY_TXTPRS].update(save_path)
                logger.debug(f'New image → {image_path}, direction = {direction}, min = {min}, max = {max}{NOCOLOR}')

        if event == KEY_IMAGEPTOS:
            if gpu_d:
                if cap_open:
                    cap.release()
                    cap_open = False

                new_make_f = True
            logger.debug(f'New Process → {image_path}, direction = {direction}, min = {min}, max = {max}{NOCOLOR}')

        if cap_open:
            ret, frame = cap.read()
            if frame is None:
                # Rewind to the first frame
                cap.set(cv2.CAP_PROP_POS_FRAMES, 0)
                ret, frame = cap.read()

            if frame is not None:                       # skip the update if the video still yields no frame
                img = cv2.imencode('.png', frame)[1].tobytes()
                window[KEY_PRSIMAGE].update(img)

    # Window shutdown
    if cap_open:
        cap.release()

    window.close()


# main entry point (execution starts here)
if __name__ == "__main__":
    parser = parse_args()
    opt = parser.parse_args()

    if len(opt.source_image) == 0:
        opt.source_image = get_image_file(_stylegan3.StyleGAN3.CROP_DIR, msg = _stylegan3.StyleGAN3.CROP_DIR + '　')
        if len(opt.source_image) == 0:
            exit(0)

    gan = _stylegan3.StyleGAN3(opt.source_image, opt.result_path, opt.model, logsel = opt.log)
    logger = gan.logger

    if opt.cpu:
        gan.gpu_d = False

    sub_title = '' if gan.gpu_d else '  <view mode>'
    display_info(opt, title)

    main_process(opt, gan)

    logger.info('\nFinished.\n')


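StyleCLIP editing program "stylegan3_clip.py"

Main specifications
・Pre-processing for editing
  (1) Apply align processing (face fixed) to the face images in the image folder "./edit/pic" and save them under the same file names in "./edit/align"
  (2) Apply crop processing (background fixed) to the face images in "./edit/pic" and save them under the same file names in "./edit/crop"
  (3) Compute latent variables from the face-fixed images and save them in "./edit/latents_psp"; then, with the latent variables unedited, generate background-fixed images and save them in "./edit/invert_psp"

・Edit images using the generated latent variables
  <Setting parameters for editing>
  1. neutral_text : reference text (English) (default: "a face")
  2. target_text  : target text (English) (default: "a smiling face")
  3. alpha        : application coefficient (default: 40)
  4. beta         : application coefficient (default: 13)

・The generation parameters (alpha/beta) can be changed with sliders
・On a machine without a usable GPU, the program runs in "view mode", which displays results previously produced on a GPU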
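
In the StyleCLIP global-direction method, alpha controls the strength of the edit and beta is a threshold that drops style channels with little relevance to the text change. The sketch below shows only that thresholding step on plain Python lists. The function names are hypothetical, the per-channel relevance values are assumed to be precomputed (for example from `delta_i_c.npy` and the two text prompts), and how this program scales its slider values 40 and 13 onto these quantities is not shown in this section.

```python
def style_delta(relevance, beta):
    """Keep a channel's relevance only if its magnitude reaches the threshold beta."""
    return [r if abs(r) >= beta else 0.0 for r in relevance]


def apply_edit(style, relevance, alpha, beta):
    """Shift each style channel by alpha times its thresholded relevance."""
    delta = style_delta(relevance, beta)
    return [s + alpha * d for s, d in zip(style, delta)]
```

A larger beta leaves fewer channels active, which tends to give a more localized edit. A larger alpha pushes the remaining channels further, and a negative alpha reverses the direction.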

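Command option list

Command option	Argument	Default	Meaning
--model	str	'psp'	Encoder selection (psp/e4e)
--source_image	str	'./edit/pic/001.jpg'	Still image file
--result_path	str	'./results'	Output folder
--align	bool	True	Reuse existing aligned/cropped images; specifying the flag sets it to False and forces re-creation
--cpu	bool	False	Force CPU (auto-selected when not specified)
--log	int	3	Log level (-1/0/1/2/3/4/5)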

Example command run

(py38_learn) python stylegan3_clip.py

StyleGAN3-Clip edit    Ver 0.01: Starting application...

   - model                   :  psp
   - source_image            :  ./edit/pic/001.jpg
   - result_path             :  ./results
   - align                   :  True
   - log                     :  3
 
 check_target_image = True './edit/align'
 check_target_image = True './edit/invert_psp'

Finished.

▼ "stylegan3_clip.py" source code

# -*- coding: utf-8 -*-
##------------------------------------------
##  StyleGAN3-Clip edit    Ver 0.01
##
##               2024.09.17 Masahiro Izutsu
##------------------------------------------
## stylegan3_clip.py

import warnings
warnings.simplefilter('ignore')

# Color Escape Code ---------------------------
GREEN = '\033[1;32m'
RED = '\033[1;31m'
NOCOLOR = '\033[0m'
YELLOW = '\033[1;33m'
CYAN = '\033[1;36m'
BLUE = '\033[1;34m'

# Imports & initial settings
import os
import argparse

import numpy as np
import cv2
from PIL import Image, ImageTk
import PySimpleGUI as sg
import my_dialog

import _stylegan3

from torch.cuda import is_available
gpu_d = is_available()                                          # check for GPU

# Constant definitions
DEF_IMAGE = './edit/pic/001.jpg'
IMAGE_DIR = './images'
RESULT_PATH = './results'

KEY_ORGIMAGE = '-Org-'
KEY_PRSIMAGE = '-Prs-'
KEY_ALPHA = '-Alpha-'
KEY_BETA = '-Beta-'
KEY_AGE = '-Age-'
KEY_POSE = '-Pose-'
KEY_SMILE = '-Smile-'
KEY_MALE = '-Male-'
KEY_IMAGESEL = '-Image-'
KEY_IMAGEPTOS = '-New-'
KEY_EXIT = '-Exit-'
KEY_TXTORG = '-Source-'
KEY_TXTPRS = '-Process-'
KEY_NEUTBTN = '-NeutBtn-'
KEY_NEUTEXT = '-NeuText-'
KEY_TARGETBTN = '-TargetBtn-'
KEY_TARGETTEXT = '-TargetText-'

IMG_CANCAS_SIZE = 512


# Title
title = 'StyleGAN3-Clip edit    Ver 0.01'
sub_title = ''

# Parses arguments for the application
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--model', type = str, default='psp', choices=['psp', 'e4e'], help = 'encoder type \'psp / e4e\'')
    parser.add_argument("--source_image", default=DEF_IMAGE, help="path to source image")
    parser.add_argument("--result_path", default=RESULT_PATH, help="path to output")
    parser.add_argument("--align", action="store_false", help="make align image flag")
    parser.add_argument("--cpu", dest="cpu", action="store_true", help="cpu mode.")
    parser.add_argument('--log', type = int, metavar = 'LOG', default = '3', help = 'Log level(-1/0/1/2/3/4/5) Default value is \'3\'')

    return parser

# Display basic information
def display_info(args, title):
    print('\n' + GREEN + title + ': Starting application...' + NOCOLOR)
    print('\n   - ' + YELLOW + 'model                   : ' + NOCOLOR, args.model)
    print('   - ' + YELLOW + 'source_image            : ' + NOCOLOR, args.source_image)
    print('   - ' + YELLOW + 'result_path             : ' + NOCOLOR, args.result_path)
    print('   - ' + YELLOW + 'align                   : ' + NOCOLOR, args.align)
    print('   - ' + YELLOW + 'cpu                     : ' + NOCOLOR, args.cpu)
    print('   - ' + YELLOW + 'log                     : ' + NOCOLOR, args.log)
    print(' ')

# Select a file from a directory under the current directory
def get_image_file(tgt_dir, msg = ''):
    # Only paths under the current directory are valid
    import re
    image_file = ''
    cpath = os.getcwd()
    cpath = re.sub(r'\\', '/', cpath)
    base_dir_pair = os.path.split(tgt_dir)
    if base_dir_pair[0] != '.':             # subdirectory (one level down)
        cpath = cpath + base_dir_pair[0][1:]

    while True:
        image_file = my_dialog.select_image_file(msg, tgt_dir)
        if len(image_file) == 0:
            break

        parir = os.path.split(image_file)
        parir = os.path.split(parir[0])
        if cpath == parir[0]:
            break

    return image_file


def main_process(opt, gan):
    gpu_d = gan.gpu_d

    # Create align & crop images
    flg = gan.check_target_image(gan.pic_dir, gan.align_dir) and opt.align
    if not flg:
        gan.align_images(gan.pic_dir)
        gan.make_align(disp_f = True)

    # Create invert images
    flg = gan.check_target_image(gan.pic_dir, gan.invert_dir) and opt.align
    if not flg:
        gan.invert_images()
        gan.make_invert(disp_f = True)

    s_title = f'   model:{gan.model_sel}  {sub_title}'
    neutral_text = "a face"
    target_text = "a smiling face"
    alpha = 40
    beta = 13
    image_path = gan.crop_dir + '/' + gan.source_image
    save_path, _, _ = gan.get_infer_clip_path(neutral_text, target_text, alpha, beta)

    # Window theme
    sg.theme('BlueMono')

    # Window layout
    col_image0 = [
            [sg.Canvas(size=(IMG_CANCAS_SIZE, IMG_CANCAS_SIZE), key=KEY_ORGIMAGE)], 
            [sg.Text(image_path, background_color='LightSteelBlue1', size=(63, 1), key = KEY_TXTORG)],
            [sg.Text("alpha", size=(5, 1)), sg.Slider((-50, 50), alpha, resolution = 5, orientation='h', size=(50, 10), key=KEY_ALPHA, enable_events=True)],
            [sg.Text("beta", size=(5, 1)), sg.Slider((-100, 100), beta, resolution = 1, orientation='h', size=(50, 10), key=KEY_BETA, enable_events=True)],
            [sg.Text("", size=(15, 1))]
    ]
    col_image1 = [
            [sg.Image(size=(IMG_CANCAS_SIZE, IMG_CANCAS_SIZE), key=KEY_PRSIMAGE)], 
            [sg.Text(save_path, background_color='LightSteelBlue1', size=(63, 1), key = KEY_TXTPRS)],
            [sg.Button('neutral', size=(8, 1), key=KEY_NEUTBTN), sg.Text(neutral_text, background_color='White', size=(58, 1), key = KEY_NEUTEXT)],
            [sg.Button('target', size=(8, 1), key=KEY_TARGETBTN), sg.Text(target_text, background_color='White', size=(58, 1), key = KEY_TARGETTEXT)],
            [sg.Text("", size=(20, 1)), sg.Button('Image', size=(10, 1), key=KEY_IMAGESEL), sg.Button('Process', size=(10, 1), disabled = not gpu_d, key=KEY_IMAGEPTOS), sg.Text("", size=(1, 1)), sg.Button('Exit', size=(10, 1), key=KEY_EXIT)],
    ]

    layout = [[sg.Text("", size=(4, 1)), sg.Text(title + s_title, size=(60, 1), justification='left', font='Helvetica 20')],
            [sg.Column(col_image0, vertical_alignment='top'), sg.Column(col_image1, vertical_alignment='top')],
            [sg.Text("", size=(15, 1))]
    ]

    # Create the window object
    window = sg.Window(title, layout, finalize=True, return_keyboard_events=True)

    im0_canvas = window[KEY_ORGIMAGE]
    tcv_img0 = im0_canvas.TKCanvas
    tcv_img0.create_rectangle(0, 0, IMG_CANCAS_SIZE, IMG_CANCAS_SIZE, fill = '#cccccc')

    org_image = Image.open(image_path)                                                  # load as a PIL image (original image)
    p_image = org_image.resize((IMG_CANCAS_SIZE, IMG_CANCAS_SIZE))
    pht0_image = ImageTk.PhotoImage(image = p_image)
    tcv_img0.create_image(IMG_CANCAS_SIZE / 2, IMG_CANCAS_SIZE / 2, image = pht0_image) # show at the center of the Canvas

    # Clear the processed image
    def clear_primage():
        frame = np.zeros((IMG_CANCAS_SIZE, IMG_CANCAS_SIZE, 3), np.uint8)
        frame[:,:,] = 0xcc
        img = cv2.imencode('.png', frame)[1].tobytes()
        window[KEY_PRSIMAGE].update(img)

    # Show the processed image
    def disp_primage(filepath):
        fopen = os.path.isfile(filepath)
        if fopen:
            frame = cv2.imread(filepath)
            fopen = frame is not None

        if fopen:
            img = cv2.imencode('.png', frame)[1].tobytes()
            window[KEY_PRSIMAGE].update(img)
        return fopen

    fopen = False
    new_make_f = False

    # Event loop
    while True:
        event, values = window.read(timeout=200)

        if new_make_f:
            gan.set_param(image_path, gan.model_sel)
            gan.edit_styleclip(neutral_text, target_text, alpha, beta, disp_f = False)
            save_path, _, _ = gan.get_infer_clip_path(neutral_text, target_text, alpha, beta)
            window[KEY_TXTPRS].update(save_path)

            fopen = disp_primage(save_path)
            new_make_f = False

        if event == KEY_EXIT or event == sg.WIN_CLOSED or event == 'Escape:27':
            break

        if event == KEY_NEUTBTN:
            result = sg.popup_get_text('Please input text !!', title = 'neutral_text', default_text = neutral_text)
            if result is not None:
                neutral_text = result

            save_path, _, _ = gan.get_infer_clip_path(neutral_text, target_text, alpha, beta)
            fopen = disp_primage(save_path)
            if not fopen:
                clear_primage()
            window[KEY_NEUTEXT].update(neutral_text)
            logger.debug(f'event = {KEY_NEUTBTN} → neutral_text = {neutral_text}')

        if event == KEY_TARGETBTN:
            result = sg.popup_get_text('Please input text !!', title = 'target_text', default_text = target_text)
            if result is not None:
                target_text = result

            save_path, _, _ = gan.get_infer_clip_path(neutral_text, target_text, alpha, beta)
            fopen = disp_primage(save_path)
            if not fopen:
                clear_primage()
            window[KEY_TARGETTEXT].update(target_text)
            logger.debug(f'event = {KEY_TARGETBTN} → target_text = {target_text}')


        if event == KEY_ALPHA or event == KEY_BETA:
            alpha = float(values[KEY_ALPHA])
            beta = float(values[KEY_BETA])

            save_path, _, _ = gan.get_infer_clip_path(neutral_text, target_text, alpha, beta)
            fopen = disp_primage(save_path)
            if not fopen:
                clear_primage()

            logger.debug(f'event = {event}')

        if event == KEY_IMAGESEL:
            fpath = get_image_file(gan.crop_dir, msg = gan.crop_dir + '　')
            if len(fpath) > 0:
                # Change the original image
                image_path = fpath
                window[KEY_TXTORG].update(image_path)
                org_image = Image.open(image_path)                                      # load as a PIL image (original image)
                p_image = org_image.resize((IMG_CANCAS_SIZE, IMG_CANCAS_SIZE))
                pht0_image = ImageTk.PhotoImage(image = p_image)
                tcv_img0.create_image(IMG_CANCAS_SIZE / 2, IMG_CANCAS_SIZE / 2, image = pht0_image) # show at the center of the Canvas

                base_dir_pair = os.path.split(image_path)
                img_name, ext = os.path.splitext(base_dir_pair[1])

                gan.set_param(image_path, gan.model_sel)
                save_path, _, _ = gan.get_infer_clip_path(neutral_text, target_text, alpha, beta)
                fopen = disp_primage(save_path)

                if gpu_d and not fopen:
                    clear_primage()
                    new_make_f = True

                window[KEY_TXTORG].update(image_path)
                window[KEY_TXTPRS].update(save_path)
                logger.debug(f'New image → {image_path}{NOCOLOR}')

        if event == KEY_IMAGEPTOS:
            if gpu_d:
                new_make_f = True
            logger.debug(f'New Process → {save_path}{NOCOLOR}')

        if not fopen:
            fopen = disp_primage(save_path)
            if not fopen:
                clear_primage()
                fopen = True

    # Window shutdown
    window.close()


# main entry point (execution starts here)
if __name__ == "__main__":
    parser = parse_args()
    opt = parser.parse_args()

    if len(opt.source_image) == 0:
        opt.source_image = get_image_file(_stylegan3.StyleGAN3.CROP_DIR, msg = _stylegan3.StyleGAN3.CROP_DIR + '　')
        if len(opt.source_image) == 0:
            exit(0)

    gan = _stylegan3.StyleGAN3(opt.source_image, opt.result_path, opt.model, logsel = opt.log)
    logger = gan.logger

    if opt.cpu:
        gan.gpu_d = False

    sub_title = '' if gan.gpu_d else '  <view mode>'
    display_info(opt, title)

    main_process(opt, gan)

    logger.info('\nFinished.\n')


StyleGAN3 execution module "_stylegan3.py"

  Packaged as the "StyleGAN3 class"   * Works only in a Linux environment (as of 2024/10/10)

Methods (see the source code for parameter details)

Function	Method
Class initialization	StyleGAN3(src, result, model, logsel = 3)
Reset image / encoder	set_param(src, model)
Create a contact-sheet image of a folder	folder_image(folder, save_path='', pixel_size=(256,256), dpi=64, xn=10)
Check that target images exist	check_target_image(img_path, target_path)
Convert images to a video	make_movie(movie_path, rate, disp_f = True)
Get the align image-list file name	get_align_filename()
Get the invert image-list file name	get_invert_filename()
Create align & crop images	align_images(image_dir)
Create invert images	invert_images()
Load the encoder	load_encoder()
Get infer_path	get_infer_path(direction, min_value, max_value)
Get infer_clip_path	get_infer_clip_path(neutral_text, target_text, alpha, beta)
Create the align image list	make_align(disp_f=True)
Create the invert image list	make_invert(disp_f=True)
Editing with InterFaceGAN	edit_interface_gan(edit_direction, min_value, max_value, disp_f=True)
Editing with StyleCLIP	edit_styleclip(neutral_text, target_text, alpha, beta, disp_f=True)

Testing the "_stylegan3.py" class members in order

(py38_learn) python _stylegan3.py

1. Getting various variables (log)

 GPU mode: True
 インスタンス変数:   gan.pic_dir = ./edit/pic
 インスタンス変数:   gan.source_image = 001.jpg
 クラス変数:         gan.OUT_MOVIE = ./tmpimg/output.mp4
 インスタンス変数:   gan.invert_dir = ./edit/invert_psp

2. Creating align & crop images
・Column 1 is the input image, column 2 is the align result (face fixed), column 3 is the crop result (background fixed)

・Execution log

100%|██████████████████████████████████████████████████| 5/5 [01:09<00:00, 13.91s/it]
Loading ReStyle pSp from checkpoint: ./pretrained_models/restyle_pSp_ffhq.pt
Loading StyleGAN3 generator from path: None
Model successfully loaded!

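3. Creating invert images
・Column 1 is the background-fixed image cut out from the real photo, column 2 is the background-fixed image generated from the latent variables

・Execution log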

Setting up PyTorch plugin "filtered_lrelu_plugin"... Done.
100%|██████████████████████████████████████████████████| 5/5 [00:54<00:00, 10.85s/it]

4. Editing with InterFaceGAN

  ・Setting parameters
  invert:         001.jpg
  edit_direction: age
  min_value:      -5
  max_value:      5

・Execution log

<interFaceGAN> Performing edit for age...
<interFaceGAN> result image → ./results/psp-gan_age_-5_5_001.jpg
 ffmpeg -r 5 -i ./tmpimg/img/%3d.jpg -vcodec libx264 -pix_fmt yuv420p ./tmpimg/output.mp4 -loglevel quiet -y
 making movie... → ./results/psp-gan_age_-5_5_001.mp4
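
The ffmpeg line in this log turns the numbered frames into an MP4. As a hedged sketch (the function name `ffmpeg_args` is hypothetical; the flags and paths are copied from the log line, not from the module source), the same command can be built as an argument list suitable for `subprocess.run`:

```python
def ffmpeg_args(frame_pattern, out_path, rate=5):
    """Build the ffmpeg argument list seen in the execution log."""
    return [
        "ffmpeg",
        "-r", str(rate),            # input frame rate
        "-i", frame_pattern,        # numbered frame files, e.g. ./tmpimg/img/%3d.jpg
        "-vcodec", "libx264",       # H.264 encoder
        "-pix_fmt", "yuv420p",      # pixel format widely supported by players
        out_path,
        "-loglevel", "quiet",       # suppress ffmpeg console output
        "-y",                       # overwrite the output file without asking
    ]
```

Passing the list to `subprocess.run(args, check=True)` avoids shell quoting issues and raises an error if ffmpeg fails; ffmpeg must be installed and on the PATH.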

5. Editing with StyleCLIP

  ・Setting parameters
  neutral_text:   a face
  target_text:    a smiling face
  alpha:          40
  beta:           13

・Execution log

<StyleCLIP> Performing edit for: "a smiling face"...
<StyleCLIP> result image → ./results/psp-clip_a~face_a~smiling~face_40_13_001.jpg
<StyleCLIP> result image → ./results/psp-clip_a~face_a~smiling~face_40_13_001_a.jpg
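
The result file names in the logs encode the editing parameters, which is how the GUI programs can find an existing result again in view mode. The naming pattern below is inferred only from the log lines in sections 4 and 5; the helper names are hypothetical and the real `get_infer_path` / `get_infer_clip_path` may build their paths differently.

```python
import os


def gan_result_stem(model, direction, min_value, max_value, image_name):
    """File-name stem matching the InterFaceGAN log, e.g. psp-gan_age_-5_5_001."""
    stem = os.path.splitext(image_name)[0]
    return f"{model}-gan_{direction}_{min_value}_{max_value}_{stem}"


def clip_result_stem(model, neutral_text, target_text, alpha, beta, image_name):
    """File-name stem matching the StyleCLIP log; spaces in the texts become '~'."""
    stem = os.path.splitext(image_name)[0]
    neutral = neutral_text.replace(" ", "~")
    target = target_text.replace(" ", "~")
    return f"{model}-clip_{neutral}_{target}_{alpha}_{beta}_{stem}"
```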

Module source code

▼ "_stylegan3.py"

# -*- coding: utf-8 -*-
##------------------------------------------
##  StyleGAN3 class    Ver 0.01
##
##               2024.09.14 Masahiro Izutsu
##------------------------------------------
## _stylegan3.py

import warnings
warnings.simplefilter('ignore')

import os
import shutil
from tqdm import tqdm
import dlib
import matplotlib.pyplot as plt
from skimage.transform import resize
import torch
from PIL import Image
import numpy as np
import cv2
import time
import torchvision.transforms as transforms
from torch.cuda import is_available

from utils.common import tensor2im

from editing.interfacegan.face_editor import FaceEditor
from editing.styleclip_global_directions import edit as styleclip_edit
from utils.alignment_utils import align_face, crop_face, get_stylegan_transform
import my_logging
import my_imagetool

class StyleGAN3:
    gpu_d = is_available()                          # check for GPU
    SHAPE_PREDICTOR = './pretrained_models/shape_predictor_68_face_landmarks.dat'

    DEF_IMAGE = './edit/pic/001.jpg'
    RESULT_PATH = './results'
    PIC_DIR = './edit/pic'
    ALIGN_DIR = './edit/align'
    CROP_DIR = './edit/crop'
    INVERT_DIR = './edit/invert'
    LATENTS_DIR = './edit/latents'
    TMP_IMAGE_DIR = './tmpimg'
    OUT_MOVIE = './tmpimg/output.mp4'

    img_transforms = transforms.Compose([
                transforms.Resize((256, 256)),
                transforms.ToTensor(),
                transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])

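    # Initialization
    #   src:    source image path
    #   result: output folder path
    #   model:  encoder selection ('psp' / 'e4e')
    #   logsel: log output selection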
    def __init__(self, src, result, model, logsel = 3):
        self.logger = my_logging.get_module_logger_sel(__name__, logsel)
        self.align_dir = self.ALIGN_DIR
        self.crop_dir = self.CROP_DIR
        self.result_path = result
        self.pic_dir = ''
        self.source_image = ''
        self.model_sel = ''
        self.set_param(src, model)

        self.reset_folder(self.TMP_IMAGE_DIR)
        self.reset_folder(self.TMP_IMAGE_DIR + '/img')
        os.makedirs(self.result_path, exist_ok = True)

    def set_param(self, src, model):
        if self.model_sel != model:
            self.model_sel = model
            self.net = None
            self.opts = None
            self.invert_dir = self.INVERT_DIR + '_' + self.model_sel
            self.latens_dir = self.LATENTS_DIR + '_' + self.model_sel
            self.experiment_type = 'restyle_pSp_ffhq' if self.model_sel == 'psp' else 'restyle_e4e_ffhq'

        base_dir_pair = os.path.split(src)
        if self.source_image != base_dir_pair[1] or self.pic_dir != base_dir_pair[0]:
            self.pic_dir = base_dir_pair[0]
            self.source_image = base_dir_pair[1]



    def run_alignment(self, image_path):
        predictor = dlib.shape_predictor(self.SHAPE_PREDICTOR)
        detector = dlib.get_frontal_face_detector()
        # self.logger.debug("Aligning image...")
        aligned_image = align_face(filepath=str(image_path), detector=detector, predictor=predictor)
        # self.logger.debug(f"Finished aligning image: {image_path}")
        return aligned_image

    def crop_image(self, image_path):
        predictor = dlib.shape_predictor(self.SHAPE_PREDICTOR)
        detector = dlib.get_frontal_face_detector()
        # self.logger.debug("Cropping image...")
        cropped_image = crop_face(filepath=str(image_path), detector=detector, predictor=predictor)
        # self.logger.debug(f"Finished cropping image: {image_path}")
        return cropped_image

    def compute_transforms(self, aligned_path, cropped_path):
        predictor = dlib.shape_predictor(self.SHAPE_PREDICTOR)
        detector = dlib.get_frontal_face_detector()
        # self.logger.debug("Computing landmarks-based transforms...")
        res = get_stylegan_transform(str(cropped_path), str(aligned_path), detector, predictor)
        # self.logger.debug("transforms Done!")
        if res is None:
            self.logger.error(f"Failed computing transforms on: {cropped_path}")
            return
        else:
            rotation_angle, translation, transform, inverse_transform = res
            return inverse_transform

    def reset_folder(self, path):
        if os.path.isdir(path):
            shutil.rmtree(path)
        os.makedirs(path, exist_ok=True)

    # Create a contact sheet of the images in a folder
    def folder_image(self, folder, save_path='', pixel_size=(256,256), dpi=64, xn=10):
        files = os.listdir(folder)
        files.sort()
        n = len(files)
        yn = n // xn + 1
        if n < xn:
            xn = n

        # convert pixels to inches
        x_inch = pixel_size[0] / dpi
        y_inch = pixel_size[1] / dpi

        fig = plt.figure(figsize = (x_inch * xn, y_inch * yn + 0.2), dpi = dpi)
        fig.subplots_adjust(left=0, right=1, bottom=0, top=1)

        for i, file in enumerate(files):
            img = Image.open(folder+'/'+file)
            images = np.asarray(img)

            # make the image square
            img_h, img_w = images.shape[:2]
            if img_h != img_w:
                images = my_imagetool.frame_square(images)
            images = resize(images, pixel_size)[..., :3]

            ax = fig.add_subplot(yn, xn, i+1, xticks=[], yticks=[])
            image_plt = np.array(images)
            ax.imshow(image_plt)
            ax.set_xlabel(folder+'/'+file, fontsize=15)

        if len(save_path) > 0:
            plt.savefig(save_path)

        plt.close()

    # Check that every source image has a counterpart in the target folder
    def check_target_image(self, img_path, target_path):
        files_image = [
            f for f in os.listdir(img_path) if os.path.isfile(os.path.join(img_path, f))
        ]
        self.logger.debug(f'{img_path} = {files_image}')

        if not os.path.isdir(target_path):
            return False

        files_target = [
            f for f in os.listdir(target_path) if os.path.isfile(os.path.join(target_path, f))
        ]
        self.logger.debug(f'{target_path} = {files_target}')

        flag = True
        for img_path in files_image:
            name, ext= os.path.splitext(img_path)
            path = target_path + '/' + name + ext
            if not os.path.isfile(path):
                flag = False

        self.logger.info(f' check_target_image = {flag} \'{target_path}\'')
        return flag

    # Convert the frame images to a movie
    def make_movie(self, movie_path, rate, disp_f = True):
        s_path = self.TMP_IMAGE_DIR + '/img/%3d.jpg'

        command = f'ffmpeg -r {rate} -i {s_path} -vcodec libx264 -pix_fmt yuv420p {self.OUT_MOVIE} -loglevel quiet -y'
        self.logger.info(f' {command}')
        os.system(command)

        # copy into the output folder under its final name
        self.logger.info(f' making movie... → {movie_path}')
        shutil.copy(self.OUT_MOVIE, movie_path)

        if disp_f:
            my_imagetool.image2disp(movie_path)

    ## Get the file name of the align contact sheet
    def get_align_filename(self):
        base_dir_pair = os.path.split(self.pic_dir)
        path = self.result_path + '/' + base_dir_pair[1] + '_align_crop.jpg'
        msg = 'pic - align - crop'
        return path, msg

    ## Get the file name of the invert contact sheet
    def get_invert_filename(self):
        base_dir_pair = os.path.split(self.pic_dir)
        path = self.result_path + '/' + self.model_sel + '-' + base_dir_pair[1] + '_crop_invert.jpg'
        msg = 'crop - invert'
        return path, msg

    ## Create the align & crop images
    def align_images(self, image_dir):
        self.pic_dir = image_dir
        self.reset_folder(self.align_dir)
        self.reset_folder(self.crop_dir)

        files = sorted(os.listdir(self.pic_dir))
        for i, file in enumerate(tqdm(files)):
            input_image = self.run_alignment(self.pic_dir + '/' + file)
            cropped_image = self.crop_image(self.pic_dir + '/' + file)
            name = os.path.splitext(file)[0]
            input_image.save(self.align_dir + '/' + name + '.jpg')
            cropped_image.save(self.crop_dir + '/' + name + '.jpg')

    # Create the invert images
    def invert_images(self):
        from utils.inference_utils import run_on_batch
        from utils.inference_utils import get_average_image

        self.load_encoder()
        self.reset_folder(self.invert_dir)
        self.reset_folder(self.latens_dir)

        avg_image = get_average_image(self.net)
        files = sorted(os.listdir(self.align_dir))
        for file in tqdm(files):
            input_image = Image.open(self.align_dir+ '/' + file)
            aligned_path = self.align_dir + '/' + file
            cropped_path = self.crop_dir + '/' + file

            landmarks_transform = self.compute_transforms(aligned_path = aligned_path, cropped_path = cropped_path)

            self.opts.n_iters_per_batch = 3
            self.opts.resize_outputs = False                # generate outputs at full resolution
            transformed_image = self.img_transforms(input_image)
    
            with torch.no_grad():
                tic = time.time()
                result_batch, result_latents = run_on_batch(
                        inputs = transformed_image.unsqueeze(0).cuda().float(),
                        net = self.net,
                        opts = self.opts,
                        avg_image = avg_image,
                        landmarks_transform = torch.from_numpy(landmarks_transform).cuda().float())
                toc = time.time()
                #print('Inference took {:.4f} seconds.'.format(toc - tic))

            result_tensors = result_batch[0]
            final_rec = tensor2im(result_tensors[-1])       #.resize(resize_amount)
            final_rec.save(self.invert_dir + '/' + file)

            name = os.path.splitext(file)[0]
            np.save(self.latens_dir + '/' + name, result_latents[0][-1])

    ## Create the align contact sheet
    def make_align(self, disp_f=True):
        path, msg = self.get_align_filename()
        path0 = self.TMP_IMAGE_DIR + '/tmp_pic.jpg'
        path1 = self.TMP_IMAGE_DIR + '/tmp_align.jpg'
        path2 = self.TMP_IMAGE_DIR + '/tmp_crop.jpg'

        # build a contact sheet for each folder
        self.folder_image(self.pic_dir, save_path = path0)
        self.folder_image(self.align_dir, save_path = path1)
        self.folder_image(self.crop_dir, save_path = path2)
        images = []
        images.append(cv2.imread(path0))
        images.append(cv2.imread(path1))
        images.append(cv2.imread(path2))
        h, w = images[0].shape[:2]
        ds_image = my_imagetool.make_tileimage(images, xmax = w, ymax = h * 3)
        my_imagetool.image_disp(ds_image, winname = msg, dispf = disp_f, save_path = path, maxsize = 1024, wait_s = 2)

    ## Create the invert contact sheet
    def make_invert(self, disp_f=True):
        path, msg = self.get_invert_filename()
        path0 = self.TMP_IMAGE_DIR + '/tmp_crop.jpg'
        path1 = self.TMP_IMAGE_DIR + '/tmp_invert.jpg'

        # build a contact sheet for each folder
        self.folder_image(self.crop_dir, save_path = path0)
        self.folder_image(self.invert_dir, save_path = path1)
        images = []
        images.append(cv2.imread(path0))
        images.append(cv2.imread(path1))
        h, w = images[0].shape[:2]
        ds_image = my_imagetool.make_tileimage(images, xmax = w, ymax = h * 2)
        my_imagetool.image_disp(ds_image, winname = msg, dispf = disp_f, save_path = path, maxsize = 1024, wait_s = 2)

    ## Load the encoder
    def load_encoder(self):
        from utils.inference_utils import load_encoder

        if self.net is None:
            model_path = f'./pretrained_models/{self.experiment_type}.pt'
            self.net, self.opts = load_encoder(checkpoint_path=model_path)

    ## Get infer_path
    def get_infer_path(self,direction, min_value, max_value):
        infer_path = f'{self.result_path}/{self.model_sel}-gan_{direction}_{min_value}_{max_value}_{self.source_image}'
        msg = f'interFaceGAN> {infer_path}'
        movie_path = infer_path[:-3] + 'mp4'
        return infer_path, movie_path, msg

    ## Get infer_clip_path
    def get_infer_clip_path(self, neutral_text, target_text, alpha, beta):
        neutral = neutral_text.replace(' ', '~')
        target = target_text.replace(' ', '~')
        infer_path = f'{self.result_path}/{self.model_sel}-clip_{neutral}_{target}_{int(alpha)}_{int(beta)}_{self.source_image}'
        msg = f'StyleCLIP> {infer_path}'
        infer_path_a = infer_path[:-4] + '_a' + infer_path[-4:]
        return infer_path, infer_path_a, msg

    ## Editing with interFaceGAN
    ##  in  edit_direction: 'age', 'smile', 'pose', 'Male'
    ##      min_value:      min:-10, max:10, step:1
    ##      max_value:      min:-10, max:10, step:1

    def edit_interface_gan(self, edit_direction, min_value, max_value, disp_f=True):
        self.load_encoder()
        self.reset_folder(self.TMP_IMAGE_DIR)
        self.reset_folder(self.TMP_IMAGE_DIR + '/img')
        name = os.path.splitext(self.source_image)[0] + '.npy'
        result_latents_ = np.load(self.latens_dir + '/' + name)
        aligned_path = self.align_dir + '/' + self.source_image
        cropped_path = self.crop_dir + '/' + self.source_image
        landmarks_transform = self.compute_transforms(aligned_path = aligned_path, cropped_path = cropped_path)
        editor = FaceEditor(stylegan_generator = self.net.decoder, generator_type = "aligned")

        self.logger.info(f"<interFaceGAN> Performing edit for {edit_direction}...")
        input_latent = torch.from_numpy(result_latents_).unsqueeze(0).cuda()
        edit_images, edit_latents = editor.edit(latents = input_latent,
                                                direction = edit_direction,
                                                factor_range = (min_value, max_value),
                                                user_transforms = landmarks_transform,
                                                apply_user_transformations = True)

        # output the results
        def prepare_edited_result(edit_images):
            if isinstance(edit_images[0], list):
                edit_images = [image[0] for image in edit_images]

            i = 0
            for image in edit_images:
                o_path = self.TMP_IMAGE_DIR + '/img/' + str(i).zfill(3) + '.jpg'
                self.logger.debug(f' Image out ... {o_path}')
                image.resize((512, 512)).save(o_path)
                i = i + 1

            res = np.array(edit_images[0].resize((512, 512)))
            for image in edit_images[1:]:
                res = np.concatenate([res, image.resize((512, 512))], axis=1)
            res = Image.fromarray(res).convert("RGB")
            return res

        infer_path, movie_path, msg = self.get_infer_path(edit_direction, min_value, max_value)
        res = prepare_edited_result(edit_images)
        res.save(infer_path)
        self.logger.info(f'<interFaceGAN> result image → {infer_path}')
        if disp_f:
            my_imagetool.image2disp(infer_path, winname = msg, maxsize = 1000)

        self.make_movie(movie_path, rate = 5, disp_f = disp_f)

    ## Editing with StyleCLIP
    ##  in  neutral_text:   text string
    ##      target_text:    text string
    ##      alpha:          min:-5, max:5, step:0.5     (pass the value x10; divided by 10 below)
    ##      beta:           min:-1, max:1, step:0.1     (pass the value x100; divided by 100 below)

    def edit_styleclip(self, neutral_text, target_text, alpha, beta, disp_f=True):
        self.load_encoder()
        styleclip_args = styleclip_edit.EditConfig()
        global_direction_calculator = styleclip_edit.load_direction_calculator(stylegan_model = self.net.decoder, opts = styleclip_args)

        opts = styleclip_edit.EditConfig()
        opts.alpha_min = alpha / 10
        opts.alpha_max = alpha / 10
        opts.num_alphas = 1
        opts.beta_min = beta / 100
        opts.beta_max = beta / 100
        opts.num_betas = 1
        opts.neutral_text = neutral_text
        opts.target_text = target_text

        # run inference
        name = os.path.splitext(self.source_image)[0] + '.npy'
        result_latents_ = np.load(self.latens_dir + '/' + name)
        aligned_path = self.align_dir + '/' + self.source_image
        cropped_path = self.crop_dir + '/' + self.source_image
        landmarks_transform = self.compute_transforms(aligned_path = aligned_path, cropped_path = cropped_path)

        input_transforms = torch.from_numpy(landmarks_transform).cpu().numpy()
        self.logger.info(f'<StyleCLIP> Performing edit for: "{opts.target_text}"...')
        edit_res, edit_latent = styleclip_edit.edit_image(latent = result_latents_,
                                                          landmarks_transform = input_transforms,
                                                          stylegan_model = self.net.decoder,
                                                          global_direction_calculator = global_direction_calculator,
                                                          opts = opts,
                                                          image_name = None,
                                                          save = False)

        input_image = Image.open(self.invert_dir + '/' + self.source_image)
        transformed_image = self.img_transforms(input_image)

        # output the results
        infer_clip_path, infer_clip_path_a, msg = self.get_infer_clip_path(neutral_text, target_text, alpha, beta)
        input_im = tensor2im(transformed_image).resize((512, 512))
        edited_im = tensor2im(edit_res[0]).resize((512, 512))
        edit_coupled = np.concatenate([np.array(input_im), np.array(edited_im)], axis=1)
        edit_coupled = Image.fromarray(edit_coupled)
        edited_im.save(infer_clip_path)
        self.logger.info(f'<StyleCLIP> result image → {infer_clip_path}')
        edit_coupled.save(infer_clip_path_a)
        self.logger.info(f'<StyleCLIP> result image → {infer_clip_path_a}')
        if disp_f:
            my_imagetool.image2disp(infer_clip_path, winname = msg, maxsize = 1000)


#-----Test routine-----
# $ python _stylegan3.py
#
if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--model', type = str, default='psp', choices=['psp', 'e4e'], help = 'encoder type \'psp / e4e\'')
    parser.add_argument("--source_image", default=StyleGAN3.DEF_IMAGE, help="path to source image")
    parser.add_argument("--result_path", default=StyleGAN3.RESULT_PATH, help="path to output")
    parser.add_argument("--align", action="store_false", help="force re-creation of the align/crop/invert images")
    parser.add_argument("--cpu", dest="cpu", action="store_true", help="cpu mode.")
    parser.add_argument('--log', type = int, metavar = 'LOG', default = '3', help = 'Log level(-1/0/1/2/3/4/5) Default value is \'3\'')
    opt = parser.parse_args()

    gan = StyleGAN3(opt.source_image, opt.result_path, opt.model, logsel = opt.log)
    if opt.cpu:
        gan.gpu_d = False

    print(f' GPU mode: {gan.gpu_d}')
    print(f' instance variable:\t gan.pic_dir = {gan.pic_dir}')
    print(f' instance variable:\t gan.source_image = {gan.source_image}')
    print(f' class variable:\t\t gan.OUT_MOVIE = {gan.OUT_MOVIE}')
    print(f' instance variable:\t gan.invert_dir = {gan.invert_dir}')

    # create the align & crop images
    flg = gan.check_target_image(gan.pic_dir, gan.align_dir) and opt.align
    if not flg and gan.gpu_d:
        gan.align_images(gan.pic_dir)
        gan.make_align(disp_f = False)

    path, msg = gan.get_align_filename()
    my_imagetool.image2disp(path, winname = msg, maxsize = 1024)

    # create the invert images
    flg = gan.check_target_image(gan.pic_dir, gan.invert_dir) and opt.align
    if not flg and gan.gpu_d:
        gan.invert_images()
        gan.make_invert(disp_f = False)

    path, msg = gan.get_invert_filename()
    my_imagetool.image2disp(path, winname = msg, maxsize = 1000)

    ## editing with interFaceGAN
    edit_direction = 'age'
    min_value = -5
    max_value = 5
    if gan.gpu_d:
        gan.edit_interface_gan(edit_direction, min_value, max_value, disp_f = False)

    infer_path, movie_path, msg = gan.get_infer_path(edit_direction, min_value, max_value)
    my_imagetool.image2disp(infer_path, winname = msg, maxsize = 1000)
    my_imagetool.image2disp(movie_path)

    ## editing with StyleCLIP
    neutral_text = "a face"
    target_text = "a smiling face"
    alpha = 40
    beta = 13
    if gan.gpu_d:
        gan.edit_styleclip(neutral_text, target_text, alpha, beta, disp_f = False)

    infer_clip_path, infer_clip_path_a, msg = gan.get_infer_clip_path(neutral_text, target_text, alpha, beta)
    my_imagetool.image2disp(infer_clip_path, winname = msg, maxsize = 1000)
    my_imagetool.image2disp(infer_clip_path_a, winname = msg, maxsize = 1000)

Reference site
- cedro3/stylegan3-editing
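
The layout arithmetic of `folder_image` is easy to lose inside the plotting code, so here is a minimal standalone sketch of it. `contact_sheet_layout` is a made-up name for illustration; the formulas are copied from the listing above and nothing is plotted.

```python
def contact_sheet_layout(n, pixel_size=(256, 256), dpi=64, xn=10):
    """Mirror the grid/figure-size arithmetic of folder_image (hypothetical helper)."""
    yn = n // xn + 1                 # number of rows, from the requested column count
    if n < xn:
        xn = n                       # fewer images than columns: shrink the single row
    x_inch = pixel_size[0] / dpi     # tile width in inches
    y_inch = pixel_size[1] / dpi     # tile height in inches
    figsize = (x_inch * xn, y_inch * yn + 0.2)   # 0.2 inch of extra height, as in the listing
    return xn, yn, figsize


for n in (3, 10, 25):
    cols, rows, figsize = contact_sheet_layout(n)
    print(f'{n} images -> {cols} columns x {rows} rows, figsize={figsize}')
```

Note that with this formula a folder whose image count is an exact multiple of `xn` (for example 10 images) gets one empty trailing row.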

Editing the "Mona Lisa"

interFaceGAN

　・Parameters
　invert: 　　　　001.jpg
　edit_direction: age
　min_value: 　　-5
　max_value: 　　5
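
As a rough illustration, assuming the default `psp` encoder and the default `./results` output folder (neither is stated in the parameters above), these settings determine the output file names as follows. The helper name `interfacegan_result_paths` is hypothetical; the string formatting mirrors `get_infer_path` in the module listing.

```python
def interfacegan_result_paths(direction, min_value, max_value, source_image,
                              model_sel='psp', result_path='./results'):
    """Rebuild the interFaceGAN output names (hypothetical helper; defaults are assumptions)."""
    infer_path = f'{result_path}/{model_sel}-gan_{direction}_{min_value}_{max_value}_{source_image}'
    # The movie shares the image name; only the 3-character extension is swapped
    movie_path = infer_path[:-3] + 'mp4'
    return infer_path, movie_path


image_path, movie_path = interfacegan_result_paths('age', -5, 5, '001.jpg')
print(image_path)   # ./results/psp-gan_age_-5_5_001.jpg
print(movie_path)   # ./results/psp-gan_age_-5_5_001.mp4
```

The extension swap relies on a three-character extension such as `jpg`; a source image named `*.jpeg` would not produce a clean `.mp4` name.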

StyleCLIP

　・Parameters
　neutral_text:　　a face
　target_text: 　　a smiling face
　alpha: 　　　　　40
　beta: 　　　　　13
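
The integer `alpha` and `beta` above are not used as-is: `edit_styleclip` divides them by 10 and 100 before storing them in the edit configuration, while the file names keep the integers. A small sketch of both steps, assuming source image `001.jpg`, the `psp` encoder and the `./results` folder (all three are assumptions here); `styleclip_settings` is a made-up helper name.

```python
def styleclip_settings(alpha, beta, neutral_text, target_text,
                       source_image='001.jpg', model_sel='psp', result_path='./results'):
    """Show the effective strengths and output names (hypothetical helper; defaults are assumptions)."""
    # Scaling applied in edit_styleclip before the values reach EditConfig
    strengths = {'alpha': alpha / 10, 'beta': beta / 100}
    # File naming as in get_infer_clip_path: spaces in the prompts become '~'
    neutral = neutral_text.replace(' ', '~')
    target = target_text.replace(' ', '~')
    infer_path = (f'{result_path}/{model_sel}-clip_{neutral}_{target}'
                  f'_{int(alpha)}_{int(beta)}_{source_image}')
    # Side-by-side (input + edited) image: '_a' goes before the 4-character extension
    infer_path_a = infer_path[:-4] + '_a' + infer_path[-4:]
    return strengths, infer_path, infer_path_a


strengths, edited_path, coupled_path = styleclip_settings(40, 13, 'a face', 'a smiling face')
print(strengths)
print(edited_path)
print(coupled_path)
```

With these inputs the two paths come out the same as the `<StyleCLIP> result image` names in the log shown before the module listing.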

Reference site
- [令和版]モナリザに「表情」をつける（なんなら表情以外もつけちゃう） ([Reiwa edition] Giving the Mona Lisa an "expression", and more than just expressions)

Problems addressed and error details

AttributeError: module 'distutils' has no attribute '_msvccompiler'

The error started to occur in a newly created virtual environment (Windows only)

    :
  File "C:\Users\XXXX\anaconda3\envs\py38_learn_test2\lib\site-packages\torch\utils\cpp_extension.py", line 1834, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "C:\Users\XXXX\anaconda3\envs\py38_learn_test2\lib\site-packages\torch\utils\cpp_extension.py", line 2082, in _run_ninja_build
    vc_env = distutils._msvccompiler._get_vc_env(plat_spec)
AttributeError: module 'distutils' has no attribute '_msvccompiler'

Comparing the two environments
・Environment where the error occurs

(py38_learn_test2) python
Python 3.8.19 (default, Mar 20 2024, 19:55:45) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from setuptools import distutils
>>> vc_env = distutils._msvccompiler._get_vc_env('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'distutils' has no attribute '_msvccompiler'
>>> quit()

・Environment where the error does not occur

(py38_learn_test) python
Python 3.8.19 (default, Mar 20 2024, 19:55:45) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from setuptools import distutils
>>> vc_env = distutils._msvccompiler._get_vc_env('')
>>> quit()

Cause and fix
・The "setuptools" version installed by later updates appears to be the problem, so downgrade it to "72.1.0"
```
pip uninstall setuptools
pip install setuptools==72.1.0
```
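・The environment setup procedure "Virtual environment (py38_learn)" has been updated accordingly
・When creating a new environment, do not forget the fix to "cpp_extension.py" (the procedure in the previous section)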
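
A minimal sketch for checking an environment before running the project, using only the standard library plus whatever setuptools is installed. The helper names are made up for illustration; the check is the same attribute lookup as in the interactive sessions above, its result is only meaningful on Windows, and it does not claim which setuptools versions are affected.

```python
import types
from importlib import metadata  # available from Python 3.8


def has_msvccompiler(distutils_module):
    """True if the given distutils module exposes `_msvccompiler` (hypothetical helper)."""
    return hasattr(distutils_module, '_msvccompiler')


def report_setuptools():
    """Return (version, has_attr) for the installed setuptools, or (None, None) if unavailable."""
    try:
        version = metadata.version('setuptools')
        from setuptools import distutils   # same import as in the sessions above
    except (metadata.PackageNotFoundError, ImportError):
        return None, None
    return version, has_msvccompiler(distutils)


# Stand-in objects so the check itself can be tried without touching a real environment
fake_ok = types.SimpleNamespace(_msvccompiler=object())
fake_bad = types.SimpleNamespace()
print(has_msvccompiler(fake_ok), has_msvccompiler(fake_bad))
print(report_setuptools())
```

If the second value printed for the real environment is `False` on Windows, the downgrade described above is the fix this page used.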

Setting up PyTorch plugin "filtered_lrelu_plugin"... Failed!

On Windows, an error occurs partway through importing the packages

Setting up PyTorch plugin "filtered_lrelu_plugin"... Failed!
Traceback (most recent call last):
  File "stylegan3.py", line 113, in <module>
    main(opt)
        :
    hash_value = update_hash(hash_value, file.read())
UnicodeDecodeError: 'cp932' codec can't decode byte 0xef in position 0: illegal multibyte sequence

Cause and fix
・Cause unknown (as of 2024/10/9). There seems to be a problem with the external access to the C compiler.
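
One possible reading of the traceback, not verified in this environment: the last frame calls `file.read()` on a file opened in text mode, and the message says the `cp932` codec (the default text encoding on Japanese Windows) rejected byte `0xef` at position 0. A file that starts with a UTF-8 byte-order mark (bytes `EF BB BF`) fails in exactly this way. The sketch below only reproduces that decoding behaviour on a temporary file; the file content and the `.cu` suffix are made up.

```python
import os
import tempfile

# Write a tiny file that starts with a UTF-8 byte-order mark (bytes EF BB BF)
with tempfile.NamedTemporaryFile('wb', suffix='.cu', delete=False) as f:
    f.write(b'\xef\xbb\xbf// kernel source\n')
    path = f.name

try:
    # Reading it as cp932 fails on the very first byte
    try:
        with open(path, encoding='cp932') as fh:
            fh.read()
        cp932_error = None
    except UnicodeDecodeError as e:
        cp932_error = e

    # Reading it as UTF-8 with BOM handling succeeds
    with open(path, encoding='utf-8-sig') as fh:
        text = fh.read()
finally:
    os.remove(path)

print(cp932_error)
print(repr(text))
```

Python's UTF-8 mode (environment variable `PYTHONUTF8=1`) makes `open()` default to UTF-8 instead of the locale encoding; whether that clears this particular failure has not been checked here.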

Change history

2024/09/20 First edition
2024/10/10 Full revision

References

StyleGAN3
- Third Time's the Charm? Image and Video Editing with StyleGAN3 (AIM Workshop ECCV 2022)
- StyleGAN3 official NVIDIA site: Alias-Free Generative Adversarial Networks (StyleGAN3), official PyTorch implementation of the NeurIPS 2021 paper

Others
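- [令和版]モナリザに「表情」をつける（なんなら表情以外もつけちゃう） ([Reiwa edition] Giving the Mona Lisa an "expression", and more than just expressions)
- AIに自分好みのアイドルを生成させる (Having AI generate an idol to your own taste)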