Generative AI Programming 3

　Writing generative AI programs in Python, based on the results verified so far.

※ Last updated: 2025/07/16

Getting Started with Stable Diffusion Using diffusers (Advanced Part 2)

　Generating images from images with instruct-pix2pix and controlnet instruct-pix2pix

　Reference site: instruct-pix2pixで画像を指示した通り変更したり

Overview

Programs created in this chapter and approximate execution times

Step	Description	Program	RTX 4070Ti	RTX 4060	RTX 4060L	RTX 3050	GTX 1050	i7-1260P (CPU)
40	Convert an image with "instruct-pix2pix"	sd_040.py	00:03		00:08		00:50	05:32
40	Convert an image with "instruct-pix2pix" (SDXL)	sd_040a.py	00:08		00:31		18:19	24:11
41	Effect of the image_guidance_scale parameter	sd_041.py	00:12		00:24		04:52	14:23
41	Effect of the image_guidance_scale parameter (SDXL)	sd_041a.py	00:42		02:00		02:40:30	03:38:17
42	Convert an image with "controlnet instruct-pix2pix"	sd_042.py	00:02		00:14		00:54	06:30
43	Effect of the controlnet_conditioning_scale parameter	sd_043.py	00:06		00:24		04:56	17:01
44	Convert part of an image with "controlnet inpaint"	sd_044.py	00:01		00:10		00:45	05:17
45	Effect of the strength parameter	sd_045.py	00:05		00:15		03:53	12:12
46	"outpaint": extend an image beyond its borders	sd_046.py	00:01		00:12		00:45	05:15
47	"controlnet scribble": generate an image from a hand-drawn line sketch	sd_047.py	00:01		00:12		00:53	05:36
48	"controlnet openpose": generate an image with the same pose as the source image	sd_048.py	00:02		00:10		01:17	05:25

　・Unit: (h:)mm:ss
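The timings in the table mix `mm:ss` and `h:mm:ss` notation. As a minimal sketch for comparing them numerically, the helper below converts either form to seconds. The name `to_seconds` is hypothetical and is not part of the programs on this page; the two sample values are the sd_040.py timings for the RTX 4070Ti and the i7-1260P.

```python
def to_seconds(text: str) -> int:
    """Convert a 'mm:ss' or 'h:mm:ss' string to a number of seconds."""
    parts = [int(p) for p in text.strip().split(":")]
    if len(parts) not in (2, 3):
        raise ValueError(f"expected mm:ss or h:mm:ss, got {text!r}")
    seconds = 0
    for value in parts:
        # Each field is worth 60 times the field to its right
        seconds = seconds * 60 + value
    return seconds


# sd_040.py timings taken from the table above
gpu = to_seconds("00:03")   # RTX 4070Ti
cpu = to_seconds("05:32")   # i7-1260P
print(f"GPU {gpu}s, CPU {cpu}s, ratio {cpu / gpu:.1f}x")
```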

Command-line options

Option	Type	Default	Meaning
--result_image	str	'./sd_results/sd.png'	Output file path and header name
--cpu	bool	False	Set to run in CPU mode (takes no value)
--log	int	3	Log level (-1/0/1/2/3/4/5)
--model_dir	str	'/StabilityMatrix/Data/Models/StableDiffusion'	Path to the model folder
--model_path	str	'SD1.5/beautifulRealistic_brav5.safetensors'	Model file
--ctrl_model_dir	str	'/StabilityMatrix/Data/Models/ControlNet'	Path to the ControlNet model folder
--ctrl_model_path	str	'control_v11e_sd15_ip2p_fp16.safetensors'	ControlNet model file
--image_path	str	'images/StableDiffusion_247.png'	Path of the input image file
--max_size	int	0	Maximum size for resizing the input image (0 = keep the input image size)
--prompt	str	'黒髪で短い髪の女性'	Prompt for image generation (Japanese or English)
--seed	int	-1	Seed value (-1 generates a random seed)
--width	int	512	Image width
--height	int	512	Image height
--step	int	30	Number of inference steps
--scale	float	7.0	Guidance scale
--image_scale	float	1.5	Image guidance scale
--cc_scale	float	1.0	controlnet conditioning scale
--strength	float	0.5	Strength of the change

　・Option definitions and default values differ from program to program
　・Place models in the folder given by the --model_dir parameter
　・Give the model name with the --model_path parameter
　・The model name of an SD1.5 model must begin with "SD1.5/" (the file must be placed in "<model folder>/SD1.5")
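The programs on this page describe their options as a list of `[name, default, help]` entries and pass it to `sdt.parse_args()`. The `sd_tools` module is not shown here, so the following is only a sketch of how such a list could be turned into an `argparse` parser; `build_parser` and the reduced `OPT_LIST` are hypothetical, and the real helper may behave differently.

```python
import argparse

# Reduced, illustrative option list in the same [name, default, help] layout
OPT_LIST = [
    ['result_image', 'results/image_040.png', 'path to output image file'],
    ['cpu', 'store_true', 'cpu mode'],
    ['seed', 0, 'Seed parameter (-1 = random)'],
    ['scale', 7.0, 'guidance scale'],
]


def build_parser(opt_list):
    """Build an ArgumentParser; the option type is inferred from the default value."""
    parser = argparse.ArgumentParser()
    for name, default, help_text in opt_list:
        if default == 'store_true':
            # Flag without a value, e.g. --cpu
            parser.add_argument(f'--{name}', action='store_true', help=help_text)
        else:
            parser.add_argument(f'--{name}', type=type(default),
                                default=default, help=help_text)
    return parser


# Equivalent to: python prog.py --seed -1 --cpu
opt = build_parser(OPT_LIST).parse_args(['--seed', '-1', '--cpu'])
print(opt.result_image, opt.cpu, opt.seed, opt.scale)
```

Inferring the type from the default keeps the list compact, at the cost that a default such as `'3'` for `--log` stays a string and has to be converted by the caller.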

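Differences between instruct-pix2pix and controlnet instruct-pix2pix

Name	Function	Processing	How to write the prompt	Model
instruct-pix2pix	Creates a new image from the source image	Changes only the parts related to the instruction	Write the change you want ("change it to this")	[SD1.5] instruct-pix2pix
instruct-pix2pix	Creates a new image from the source image	Changes only the parts related to the instruction	Write the change you want ("change it to this")	[SDXL] sdxl-instructpix2pix-768
controlnet instruct-pix2pix	Modifies the source image	Can change the entire source image	Describe what the resulting image should look like	[SD1.5] control_v11e_sd15_ip2p

・"instruct-pix2pix" runs on a dedicated model for each of SD1.5 and SDXL
・"controlnet instruct-pix2pix" needs both a ControlNet model and a base model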

Environment

This project runs in the following Anaconda virtual environment and project folder
```
(base) PS > conda activate sd_test
(sd_test) PS > cd workspace_3/sd_test
```


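Step 40: Convert an image with "instruct-pix2pix"

This program works with both "SD1.5" and "SDXL" models

Run the program with the SD1.5 model (execution time: about 3 seconds on an RTX 4070 Ti 12GB)

 python sd_040.py

　Generated image (left): image_040.png　Source image (right): sd_040_test.png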

(sd_test) > python sd_040.py

Stable Diffusion with diffusers(040)  Ver 0.01: Starting application...

 --result_image             :   results/image_040.png
 --cpu                      :   False
 --log                      :   3
 --model_path               :   timbrooks/instruct-pix2pix
 --image_path               :   images/sd_040_test.png
 --max_size                 :   0
 --prompt                   :   雪の中の場面にする
 --seed                     :   0
 --width                    :   512
 --height                   :   512
 --step                     :   20
 --scale                    :   7.0
 --image_scale              :   1.5

prompt: Make it a scene in the snow
width: 512, height: 512
seed: 0
Loading pipeline components...: 100%|████████████| 7/7 [00:02<00:00,  2.97it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 16.75it/s]
result_file: results/image_040.png

Finished.

The image file "image_040.png" is generated

Run the program with the SDXL model (execution time: about 8 seconds on an RTX 4070 Ti 12GB)

 python sd_040.py --model_path 'diffusers/sdxl-instructpix2pix-768' --width 768 --height 768 --result_image 'results/image_040a.png'

　Generated image: image_040a.png　The source image is the same, sd_040_test.png

(sd_test) PS > python sd_040.py --model_path 'diffusers/sdxl-instructpix2pix-768' --width 768 --height 768 --result_image 'results/image_040a.png'

Stable Diffusion with diffusers(040)  Ver 0.01: Starting application...

 --result_image             :   results/image_040a.png
 --cpu                      :   False
 --log                      :   3
 --model_path               :   diffusers/sdxl-instructpix2pix-768
 --image_path               :   images/sd_040_test.png
 --max_size                 :   0
 --prompt                   :   雪の中の場面にする
 --seed                     :   0
 --width                    :   768
 --height                   :   768
 --step                     :   20
 --scale                    :   7.0
 --image_scale              :   1.5

prompt: Make it a scene in the snow
width: 768, height: 768
seed: 0
Loading pipeline components...: 100%|████████████| 7/7 [00:04<00:00,  1.67it/s]
100%|██████████████████████████████████████████| 20/20 [00:03<00:00,  5.27it/s]
result_file: results/image_040a.png

Finished.

The image file "image_040a.png" is generated

Comparison of images generated by the SD1.5 / SDXL models

　[Image table: rows SD1.5 and SDXL; columns are the prompts 雪の中の場面にする / 春の場面にする / 夏の場面にする / 秋の場面にする / 冬の場面にする]

Pipeline classes used for the SD1.5 / SDXL models

Model type	Base image size	Pipeline class
SD1.5	512x512	StableDiffusionInstructPix2PixPipeline
SDXL	768x768	StableDiffusionXLInstructPix2PixPipeline
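The table can be read as a lookup from model type to pipeline class and base size. The sketch below expresses that lookup without importing diffusers, so the classes are given by name only; `select_pipeline` is a hypothetical helper, and the rule "anything other than the SDXL model name is SD1.5" is the same one `is_sd15()` uses in sd_040.py.

```python
DEF_MODEL_SDXL = 'diffusers/sdxl-instructpix2pix-768'

# model type -> (pipeline class name, base image size)
PIPELINES = {
    'SD1.5': ('StableDiffusionInstructPix2PixPipeline', 512),
    'SDXL': ('StableDiffusionXLInstructPix2PixPipeline', 768),
}


def select_pipeline(model_path):
    """Return (model type, pipeline class name, base size) for a model name."""
    kind = 'SDXL' if model_path == DEF_MODEL_SDXL else 'SD1.5'
    class_name, base_size = PIPELINES[kind]
    return kind, class_name, base_size


for model in ('timbrooks/instruct-pix2pix', DEF_MODEL_SDXL):
    print(model, '->', select_pipeline(model))
```

With diffusers installed, the class itself can be obtained from the name with `getattr(diffusers, class_name)`.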

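Notes on the SDXL version
・The source image object appears to differ from a plain PIL image, so it is created with diffusers.utils.load_image() as in the sample code (load_image() returns a PIL image converted to RGB)
・The output size appears to be fixed at 768x768, so the program resizes the source image to that size internally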
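The resizing mentioned above, together with the `--max_size` option, comes down to choosing a target size before the image is handed to the pipeline. The following is a sketch under assumptions: `resize_target` is a hypothetical helper, the actual logic in `sd_tools` is not shown on this page, and flooring to a multiple of 8 reflects the usual Stable Diffusion size constraint rather than anything stated here.

```python
def resize_target(src_size, max_size=0, fixed=None, multiple=8):
    """Return the (width, height) the source image should be resized to.

    fixed    : force a square size (e.g. 768 for the SDXL model)
    max_size : longest side allowed; 0 keeps the source size
    """
    if fixed is not None:
        return (fixed, fixed)
    width, height = src_size
    longest = max(width, height)
    if max_size > 0 and longest > max_size:
        ratio = max_size / longest          # keep the aspect ratio
        width, height = round(width * ratio), round(height * ratio)
    # Floor both sides to a multiple of 8 (assumed constraint)
    return (width - width % multiple, height - height % multiple)


print(resize_target((1024, 768), fixed=768))      # SDXL: fixed square size
print(resize_target((1024, 768), max_size=512))   # shrink, keep aspect ratio
print(resize_target((640, 480)))                  # max_size=0: unchanged
```

With Pillow, the result can be passed directly to `Image.resize()`.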

Module source code

▼ "sd_040.py"

# -*- coding: utf-8 -*-
##--------------------------------------------------
##  Stable Diffusion with diffusers(040)   Ver 0.01
##
##               2025.07.14 Masahiro Izutsu
##--------------------------------------------------
## sd_040.py    Image-to-image generation (instruct-pix2pix)
##  Ver 0.00    2025.07.06  sd_040.py /sd_040a.py
##  Ver 0.01    2025.07.14  Unified version supporting SD1.5/SDXL

#       https://qiita.com/phyblas/items/28c342740c2ed00250b8
#       Model: https://huggingface.co/timbrooks/instruct-pix2pix

#   SD1.5:  python sd_040.py
#   SDXL:   python sd_040.py --model_path 'diffusers/sdxl-instructpix2pix-768' --width 768 --height 768 --result_image 'results/image_040a.png'

# Title
title = 'Stable Diffusion with diffusers(040)  Ver 0.01'

import warnings
warnings.simplefilter('ignore')

# Imports and initial setup
import os
import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline, StableDiffusionXLInstructPix2PixPipeline, logging
from translate import Translator

import my_logging
import sd_tools as sdt

logging.set_verbosity_error()

# Constants
DEF_MODEL_SD15 = 'timbrooks/instruct-pix2pix'
DEF_MODEL_SDXL = 'diffusers/sdxl-instructpix2pix-768'

# Command-line option definitions: [name, default, help]
opt_list = [
            ['result_image', 'results/image_040.png', 'path to output image file'],
            ['cpu', 'store_true', 'cpu mode'],
            ['log', '3', 'Log level(-1/0/1/2/3/4/5) Default value is \'3\''],
            ['model_dir', '', 'Model directory'],
            ['model_path', DEF_MODEL_SD15, 'Model Path'],
            ['image_path', 'images/sd_040_test.png', 'Source image file path'],
            ['max_size', 0, 'image max size (0=source)'],
            ['prompt', '雪の中の場面にする', 'Prompt text'],
            ['seed', 0, 'Seed parameter (-1 = random)'],
            ['width', 512, 'image size width'],
            ['height', 512, 'image size height'],
            ['step', 20, 'infer step'],
            ['scale', 7.0, 'guidance scale'],
            ['image_scale', 1.5, 'image guidance scale'],
           ]

# Check the model type
#   in:     model       model name
#   out:    bool        True = SD1.5, False = SDXL
def is_sd15(model):
    return (model != DEF_MODEL_SDXL)

# Image generation
def image_generation(model_path, src_image, prompt, seed, num_inference_steps=20, width=512, height=512, guidance_scale=7.0, image_guidance_scale=1.5, device='cpu'):
    # Create the pipeline (SD1.5 or SDXL class; float16 on GPU, default precision on CPU)
    pipeline_class = StableDiffusionInstructPix2PixPipeline if is_sd15(model_path) else StableDiffusionXLInstructPix2PixPipeline
    kwargs = {} if device == 'cpu' else {'torch_dtype': torch.float16}
    pipeline = pipeline_class.from_pretrained(model_path, **kwargs).to(device)

    # Create the Generator object
    generator = torch.Generator(device).manual_seed(seed)

    # Generate the image
    image = pipeline(
                    prompt = prompt,
                    image = src_image,
                    num_inference_steps = num_inference_steps,
                    guidance_scale = guidance_scale,                # text guidance (--scale)
                    image_guidance_scale = image_guidance_scale,    # image guidance (--image_scale)
                    width = width,
                    height = height,
                    generator = generator
                    ).images[0]

    return image


# ** main function **
def main(opt, logger = None):
    # Parameter setup
    device = sdt._get_device(opt, logger)
    result_image_path = sdt._get_result_image_path(opt, logger)
    result_path = sdt._get_result_path(opt, logger)
    prompt = sdt._get_prompt(opt, logger)
    image = sdt._get_source_image(opt, logger)
    model_path = sdt._get_model_path(opt, logger)
    height, width = sdt._get_image_size(opt, logger)
    seed = sdt._get_seed_value(opt, logger)
    num_inference_steps = sdt._get_inference_steps(opt, logger)
    guidance_scale = sdt._get_guidance_scale(opt, logger)
    image_guidance_scale = sdt._get_image_guidance_scale(opt, logger)

    # Output folder
    os.makedirs(result_path, exist_ok = True)

    # Image generation
    image = image_generation(model_path, image, prompt, seed, num_inference_steps, width, height, guidance_scale, image_guidance_scale, device)
    sdt.image_save2(image, result_image_path, result_image_path)
    logger.info(f'result_file: {result_image_path}')


# Entry point (execution starts here)
if __name__ == "__main__":
    parser = sdt.parse_args(None, opt_list)
    opt = parser.parse_args()
    sdt._get_device(opt)
    sdt.display_info(opt, title)

    # Application log setup
    module = os.path.basename(__file__)
    module_name = os.path.splitext(module)[0]
    logger = my_logging.get_module_logger_sel(module_name, int(opt.log))

    main(opt, logger)

    logger.info('\nFinished.\n')

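Step 41: Effect of the image_guidance_scale parameter in "instruct-pix2pix"

This program works with both "SD1.5" and "SDXL" models
image_guidance_scale
・Parameter that determines how much the image is changed (higher values keep the result closer to the source image)
・Set a value of 1 or more (default: 1.5)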
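sd_041.py tries six values of this parameter, computed as `1 + 0.1 * i`, and places the results in a grid of 3 rows by 2 columns. The sketch below only reproduces that arithmetic so the tested values and their grid positions are visible; `sweep` is a hypothetical helper, and the rounding is added here to hide floating-point noise such as 1.2000000000000002.

```python
def sweep(start, step, count, ndigits=2):
    """Return `count` evenly spaced values, rounded to `ndigits` decimals."""
    return [round(start + step * i, ndigits) for i in range(count)]


ig_scales = sweep(1.0, 0.1, 6)

for i, scale in enumerate(ig_scales):
    row, col = divmod(i, 2)     # 3 rows x 2 columns, filled row by row
    print(f'subplot {i + 1} (row {row}, col {col}): image_guidance_scale = {scale:.1f}')
```

Every value in the sweep satisfies the "1 or more" requirement, and the last one equals the default of 1.5.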

Run the program with the SD1.5 model (execution time: about 12 seconds on an RTX 4070 Ti 12GB)

 python sd_041.py

(sd_test) PS > python sd_041.py

Stable Diffusion with diffusers(041)  Ver 0.01: Starting application...

 --result_image             :   results/image_041.png
 --cpu                      :   False
 --log                      :   3
 --model_path               :   timbrooks/instruct-pix2pix
 --image_path               :   images/sd_040_test.png
 --max_size                 :   0
 --prompt                   :   雪の中の場面にする
 --seed                     :   0
 --width                    :   512
 --height                   :   512
 --step                     :   20
 --scale                    :   7.0
 --image_scale              :   1.5

prompt: Make it a scene in the snow
width: 512, height: 512
seed: 0
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00,  4.87it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 18.16it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00,  5.65it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 18.43it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00,  5.70it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 18.44it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00,  5.35it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 16.70it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00,  5.65it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 18.42it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00,  5.66it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 18.43it/s]
result_file: results/image_041.png

Finished.

The image file "image_041.png" is generated

Run the program with the SDXL model (execution time: about 42 seconds on an RTX 4070 Ti 12GB)

 python sd_041.py --model_path 'diffusers/sdxl-instructpix2pix-768' --width 768 --height 768 --result_image 'results/image_041a.png'

(sd_test) PS > python sd_041.py --model_path 'diffusers/sdxl-instructpix2pix-768' --width 768 --height 768 --result_image 'results/image_041a.png'

Stable Diffusion with diffusers(041)  Ver 0.01: Starting application...

 --result_image             :   results/image_041a.png
 --cpu                      :   False
 --log                      :   3
 --model_path               :   diffusers/sdxl-instructpix2pix-768
 --image_path               :   images/sd_040_test.png
 --max_size                 :   0
 --prompt                   :   雪の中の場面にする
 --seed                     :   0
 --width                    :   768
 --height                   :   768
 --step                     :   20
 --scale                    :   7.0
 --image_scale              :   1.5

prompt: Make it a scene in the snow
width: 768, height: 768
seed: 0
Loading pipeline components...: 100%|█████████████| 7/7 [00:03<00:00,  1.76it/s]
100%|███████████████████████████████████████████| 20/20 [00:03<00:00,  5.36it/s]
Loading pipeline components...: 100%|█████████████| 7/7 [00:04<00:00,  1.70it/s]
100%|███████████████████████████████████████████| 20/20 [00:03<00:00,  5.45it/s]
Loading pipeline components...: 100%|█████████████| 7/7 [00:04<00:00,  1.73it/s]
100%|███████████████████████████████████████████| 20/20 [00:03<00:00,  5.44it/s]
Loading pipeline components...: 100%|█████████████| 7/7 [00:04<00:00,  1.66it/s]
100%|███████████████████████████████████████████| 20/20 [00:03<00:00,  5.45it/s]
Loading pipeline components...: 100%|█████████████| 7/7 [00:03<00:00,  1.80it/s]
100%|███████████████████████████████████████████| 20/20 [00:03<00:00,  5.44it/s]
Loading pipeline components...: 100%|█████████████| 7/7 [00:03<00:00,  1.78it/s]
100%|███████████████████████████████████████████| 20/20 [00:03<00:00,  5.45it/s]
result_file: results/image_041a.png

Finished.

The image file "image_041a.png" is generated
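Both logs above show "Loading pipeline components" six times, because the pipeline is created again for every value of the sweep. One way to avoid this is to load the pipeline once and reuse it. The sketch below shows the pattern with a stand-in loader so that it runs without diffusers; `get_pipeline` and the fake pipeline function are hypothetical, and in a real program the loader body would call `from_pretrained(...).to(device)`.

```python
from functools import lru_cache

load_count = 0  # counts how often the (expensive) loader actually runs


@lru_cache(maxsize=1)
def get_pipeline(model_path, device):
    """Stand-in for the expensive pipeline creation; cached after the first call."""
    global load_count
    load_count += 1

    def pipeline(prompt, image_guidance_scale):
        # A real pipeline would return an image; a string is enough for the sketch
        return f'{model_path} | {prompt} | {image_guidance_scale:.1f}'

    return pipeline


results = []
for i in range(6):
    pipe = get_pipeline('timbrooks/instruct-pix2pix', 'cpu')  # loaded only once
    results.append(pipe('Make it a scene in the snow', 1 + 0.1 * i))

print(f'{len(results)} images, pipeline loaded {load_count} time(s)')
```

When a pipeline is reused with a fixed seed, the `torch.Generator` should still be created anew for each image, since its state advances with every generation.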

Module source code

▼ "sd_041.py"

# -*- coding: utf-8 -*-
##--------------------------------------------------
##  Stable Diffusion with diffusers(041)   Ver 0.01
##
##               2025.07.14 Masahiro Izutsu
##--------------------------------------------------
## sd_041.py    Image-to-image generation (instruct-pix2pix)
##              === Examine the image guidance scale ===
##  Ver 0.00    2025.07.06  sd_041.py /sd_041a.py
##  Ver 0.01    2025.07.14  Unified version supporting SD1.5/SDXL

#       https://qiita.com/phyblas/items/28c342740c2ed00250b8
#       Model: https://huggingface.co/timbrooks/instruct-pix2pix

#   SD1.5:  python sd_041.py
#   SDXL:   python sd_041.py --model_path 'diffusers/sdxl-instructpix2pix-768' --width 768 --height 768 --result_image 'results/image_041a.png'

# Title
title = 'Stable Diffusion with diffusers(041)  Ver 0.01'

import warnings
warnings.simplefilter('ignore')

# Imports and initial setup
import os
import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline, StableDiffusionXLInstructPix2PixPipeline, logging
from translate import Translator
import matplotlib.pyplot as plt

import my_logging
import sd_tools as sdt
import sd_040

logging.set_verbosity_error()

# Constants (reuse the sd_040 option list, changing only the default output file)
DEF_RESULT_IMAGE = 'results/image_041.png'
sd_040.opt_list[0][1] = DEF_RESULT_IMAGE

# ** main function **
def main(opt, logger = None):
    # Parameter setup
    device = sdt._get_device(opt, logger)
    result_image_path = sdt._get_result_image_path(opt, logger)
    result_path = sdt._get_result_path(opt, logger)
    prompt = sdt._get_prompt(opt, logger)
    image = sdt._get_source_image(opt, logger)
    model_path = sdt._get_model_path(opt, logger)
    height, width = sdt._get_image_size(opt, logger)
    seed = sdt._get_seed_value(opt, logger)
    num_inference_steps = sdt._get_inference_steps(opt, logger)
    guidance_scale = sdt._get_guidance_scale(opt, logger)
    image_guidance_scale = sdt._get_image_guidance_scale(opt, logger)

    # Output folder
    os.makedirs(result_path, exist_ok = True)

    # Generate several images (image_guidance_scale = 1.0 to 1.5 in steps of 0.1)
    plt.figure(figsize=[6, 9.5], dpi = 100)
    for i in range(6):
        ig_scale = 1 + 0.1 * i
        img = sd_040.image_generation(model_path, image, prompt, seed, num_inference_steps, width, height, guidance_scale, ig_scale, device)
        plt.subplot(3, 2, i + 1, title = 'image_guidance_scale = %.1f'%ig_scale)
        plt.imshow(img)
        plt.axis('off')

        # Free cached device memory
        if device == 'cuda':
            torch.cuda.empty_cache()
        elif device == 'mps':
            torch.mps.empty_cache()

    plt.tight_layout()
    plt.savefig(result_image_path)
    plt.close()
    logger.info(f'result_file: {result_image_path}')

    sdt.image_disp(result_image_path, result_image_path)


# Entry point (execution starts here)
if __name__ == "__main__":
    parser = sdt.parse_args(None, sd_040.opt_list)
    opt = parser.parse_args()
    sdt._get_device(opt)
    sdt.display_info(opt, title)

    # Application log setup
    module = os.path.basename(__file__)
    module_name = os.path.splitext(module)[0]
    logger = my_logging.get_module_logger_sel(module_name, int(opt.log))

    main(opt, logger)

    logger.info('\nFinished.\n')

Step 42: Convert an image with "controlnet instruct-pix2pix"

Run the program (execution time: about 2 seconds on an RTX 4070 Ti 12GB)

 python sd_042.py

　Generated image (left): image_042.png　Source image (right): sd_040_test.png

(sd_test) PS D:\anaconda_win\workspace_3\sd_test> python sd_042.py

Stable Diffusion with diffusers(042)  Ver 0.01: Starting application...

 --result_image             :   results/image_042.png
 --cpu                      :   False
 --log                      :   3
 --model_dir                :   /StabilityMatrix/Data/Models/StableDiffusion
 --model_path               :   SD1.5/beautifulRealistic_brav5.safetensors
 --ctrl_model_dir           :   /StabilityMatrix/Data/Models/ControlNet
 --ctrl_model_path          :   control_v11e_sd15_ip2p_fp16.safetensors
 --image_path               :   images/sd_040_test.png
 --max_size                 :   0
 --prompt                   :   浜辺の場面にする
 --seed                     :   12345678
 --width                    :   512
 --height                   :   512
 --step                     :   20
 --scale                    :   7.0
 --cc_scale                 :   1.0

prompt: Set the scene on the beach
width: 512, height: 512
seed: 12345678
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 17.47it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 10.55it/s]
result_file: results/image_042.png

Finished.

The image file "image_042.png" is generated

Generating with a different prompt
・"python sd_042.py --prompt '<prompt>'"

 python sd_042.py --prompt '雪の中の場面にする'

・Base model "beautifulRealistic_brav5.safetensors" (realistic style)

　[Image grid: one generated image for each prompt below]
浜辺の場面にする	雪の中の場面にする	炎の中の場面にする	森の中の場面にする	山中の場面にする	砂漠の場面にする
着物姿に着替える	イラスト画像にする	アニメ画像にする	微笑んだ顔のアニメ画像にする	泣き顔のアニメ画像にする	嬉しそうな顔のアニメ画像にする

・Base model "animePastelDream_softBakedVae.safetensors" (illustration style)

　[Image grid: one generated image for each prompt below]
浜辺の場面にする	雪の中の場面にする	炎の中の場面にする	森の中の場面にする	山中の場面にする	砂漠の場面にする
着物姿に着替える	イラスト画像にする	アニメ画像にする	微笑んだ顔のアニメ画像にする	泣き顔のアニメ画像にする	嬉しそうな顔のアニメ画像にする

Module source code

▼ "sd_042.py"

# -*- coding: utf-8 -*-
##--------------------------------------------------
##  Stable Diffusion with diffusers(042)   Ver 0.01
##
##               2025.07.14 Masahiro Izutsu
##--------------------------------------------------
## sd_042.py    Image-to-image generation (controlnet instruct-pix2pix)
##  Ver 0.00    2025.07.07  sd_042.py
##  Ver 0.01    2025.07.14  Added command-line input

##      https://qiita.com/phyblas/items/28c342740c2ed00250b8
##      model:          control_v11e_sd15_ip2p_fp16.safetensors
##      base model:     beautifulRealistic_brav5.safetensors        (realistic style)
##                      animePastelDream_softBakedVae.safetensors   (illustration style)
##
##      prompts         '浜辺の場面にする'  (default)
##                      '雪の中の場面にする'
##                      '炎の中の場面にする'
##                      '森の中の場面にする'
##                      '山中の場面にする'
##                      '砂漠の場面にする'
##                      '着物姿に着替える'
##                      'イラスト画像にする'
##                      'アニメ画像にする'
##                      '微笑んだ顔のアニメ画像にする'
##                      '泣き顔のアニメ画像にする'
##                      '嬉しそうな顔のアニメ画像にする'

# Title
title = 'Stable Diffusion with diffusers(042)  Ver 0.01'

import warnings
warnings.simplefilter('ignore')

# Imports and initial setup
import os
import torch
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, EulerAncestralDiscreteScheduler, logging
from translate import Translator

import my_logging
import sd_tools as sdt

logging.set_verbosity_error()

# Constants
DEF_MODEL_CNTL = 'control_v11e_sd15_ip2p_fp16.safetensors'
DEF_MODEL_BASE = 'SD1.5/beautifulRealistic_brav5.safetensors'
DEF_IMAGE_PATH = 'images/sd_040_test.png'

# Command-line option definitions: [name, default, help]
opt_list = [
            ['result_image', 'results/image_042.png', 'path to output image file'],                         #  0
            ['cpu', 'store_true', 'cpu mode'],                                                              #  1
            ['log', '3', 'Log level(-1/0/1/2/3/4/5) Default value is \'3\''],                               #  2
            ['model_dir', '/StabilityMatrix/Data/Models/StableDiffusion', 'Model directory'],               #  3
            ['model_path', DEF_MODEL_BASE, 'Model Path'],                                                   #  4
            ['ctrl_model_dir', '/StabilityMatrix/Data/Models/ControlNet', 'ControlNet Model directory'],    #  5
            ['ctrl_model_path', DEF_MODEL_CNTL, 'ControlNet Model Path'],                                   #  6
            ['image_path', DEF_IMAGE_PATH, 'Source image file path'],                                       #  7
            ['ctrl_image_path', '', 'Control image file path'],                                             #  8
            ['max_size', 0, 'image max size (0=source)'],                                                   #  9
            ['prompt', '浜辺の場面にする', 'Prompt text'],                                                  # 10
            ['seed', 12345678, 'Seed parameter (-1 = random)'],                                             # 11
            ['width', 512, 'image size width'],                                                             # 12
            ['height', 512, 'image size height'],                                                           # 13
            ['step', 20, 'infer step'],                                                                     # 14
            ['scale', 7.0, 'guidance scale'],                                                               # 15
            ['cc_scale', 1.0, 'controlnet conditioning scale'],                                             # 16
           ]

# Image generation
def image_generation(model_path, ctrl_model_path, src_image, prompt, seed, num_inference_steps=20, width=512, height=512, guidance_scale=7.0, cc_scale=1.0, device='cpu'):
    # Create the pipeline (float16 on GPU, default precision on CPU)
    kwargs = {} if device == 'cpu' else {'torch_dtype': torch.float16}
    controlnet = ControlNetModel.from_single_file(ctrl_model_path, **kwargs).to(device)
    pipeline = StableDiffusionControlNetPipeline.from_single_file(model_path, controlnet=controlnet, **kwargs).to(device)

    # Scheduler
    pipeline.scheduler = EulerAncestralDiscreteScheduler.from_config(pipeline.scheduler.config)

    # Create the Generator object
    generator = torch.Generator(device).manual_seed(seed)

    # Generate the image
    image = pipeline(
                    prompt = prompt,
                    image = src_image,
                    num_inference_steps = num_inference_steps,
                    guidance_scale = guidance_scale,            # text guidance (--scale)
                    width = width,
                    height = height,
                    controlnet_conditioning_scale = cc_scale,
                    generator = generator
                    ).images[0]

    return image


# ** main function **
def main(opt, logger = None):
    # Parameter setup
    device = sdt._get_device(opt, logger)
    result_image_path = sdt._get_result_image_path(opt, logger)
    result_path = sdt._get_result_path(opt, logger)
    prompt = sdt._get_prompt(opt, logger)
    image = sdt._get_source_image(opt, logger)
    model_path = sdt._get_model_path(opt, logger)
    ctrl_model_path = sdt._get_controlnet_model_path(opt, logger)
    height, width = sdt._get_image_size(opt, logger)
    seed = sdt._get_seed_value(opt, logger)
    num_inference_steps = sdt._get_inference_steps(opt, logger)
    guidance_scale = sdt._get_guidance_scale(opt, logger)
    cc_scale = sdt._get_controlnet_conditioning_scale(opt, logger)

    # Output folder
    os.makedirs(result_path, exist_ok = True)

    # Image generation
    image = image_generation(model_path, ctrl_model_path, image, prompt, seed, num_inference_steps, width, height, guidance_scale, cc_scale, device)
    sdt.image_save2(image, result_image_path, result_image_path)
    logger.info(f'result_file: {result_image_path}')


# Entry point (execution starts here)
if __name__ == "__main__":
    parser = sdt.parse_args(None, opt_list)
    opt = parser.parse_args()
    sdt._get_device(opt)
    sdt.display_info(opt, title)

    # Application log setup
    module = os.path.basename(__file__)
    module_name = os.path.splitext(module)[0]
    logger = my_logging.get_module_logger_sel(module_name, int(opt.log))

    main(opt, logger)

    logger.info('\nFinished.\n')

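Step 43: Effect of the controlnet_conditioning_scale parameter in "controlnet instruct-pix2pix"

controlnet_conditioning_scale
・Parameter that sets the weight of the control image's influence
・The default is 1.0, the largest value used in this step (values below 1 weaken the influence of the input image)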
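As a simplified picture of what this weight does: in diffusers the ControlNet outputs are multiplied by the conditioning scale before they are added to the UNet features, so a scale of 0 would remove the control image's influence and 1.0 applies it in full. The sketch below mimics this with plain lists instead of tensors; `apply_control` and the sample numbers are purely illustrative. It also lists the six values sd_043.py tries (`0.6 + 0.08 * i`).

```python
def apply_control(unet_features, control_residuals, conditioning_scale=1.0):
    """Add ControlNet residuals, weighted by the conditioning scale, to the features."""
    return [f + conditioning_scale * r
            for f, r in zip(unet_features, control_residuals)]


features = [1.0, 2.0, 3.0]       # illustrative UNet features
residuals = [0.5, -1.0, 0.25]    # illustrative ControlNet residuals

# Values tried in sd_043.py, rounded to hide floating-point noise
cc_scales = [round(0.6 + 0.08 * i, 2) for i in range(6)]

for scale in cc_scales:
    mixed = [round(v, 3) for v in apply_control(features, residuals, scale)]
    print(f'controlnet_conditioning_scale = {scale:.2f} -> {mixed}')
```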

Run the program (execution time: about 6 seconds on an RTX 4070 Ti 12GB)

 python sd_043.py

(sd_test) PS D:\anaconda_win\workspace_3\sd_test> python sd_043.py

Stable Diffusion with diffusers(043)  Ver 0.01: Starting application...

 --result_image             :   results/image_043.png
 --cpu                      :   False
 --log                      :   3
 --model_dir                :   /StabilityMatrix/Data/Models/StableDiffusion
 --model_path               :   SD1.5/beautifulRealistic_brav5.safetensors
 --ctrl_model_dir           :   /StabilityMatrix/Data/Models/ControlNet
 --ctrl_model_path          :   control_v11e_sd15_ip2p_fp16.safetensors
 --image_path               :   images/sd_040_test.png
 --max_size                 :   0
 --prompt                   :   浜辺の場面にする
 --seed                     :   12345678
 --width                    :   512
 --height                   :   512
 --step                     :   20
 --scale                    :   7.0
 --cc_scale                 :   1.0

prompt: Set the scene on the beach
width: 512, height: 512
seed: 12345678
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 30.96it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 14.83it/s]
Fetching 11 files: 100%|█████████████████████| 11/11 [00:00<00:00, 7498.35it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 33.70it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 17.03it/s]
Fetching 11 files: 100%|████████████████████| 11/11 [00:00<00:00, 11016.56it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 33.90it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 16.16it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 21.80it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 16.25it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 33.68it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 16.56it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 34.80it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 16.98it/s]
result_file: results/image_043.png

Finished.

The image file "image_043.png" is generated

Module source code

▼ "sd_043.py"

# -*- coding: utf-8 -*-
##--------------------------------------------------
##  Stable Diffusion with diffusers(043)   Ver 0.01
##
##               2025.07.14 Masahiro Izutsu
##--------------------------------------------------
## sd_043.py    Image-to-image generation (controlnet instruct-pix2pix)
##              === Examine controlnet_conditioning_scale ===
##  Ver 0.00    2025.07.07  sd_043.py
##  Ver 0.01    2025.07.14  Added command-line input

##      https://qiita.com/phyblas/items/28c342740c2ed00250b8

# Title
title = 'Stable Diffusion with diffusers(043)  Ver 0.01'

import warnings
warnings.simplefilter('ignore')

# Imports and initial setup
import os
import torch
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, EulerAncestralDiscreteScheduler, logging
from translate import Translator
import matplotlib.pyplot as plt

import my_logging
import sd_tools as sdt
import sd_042

logging.set_verbosity_error()

# Constants (reuse the sd_042 option list, changing only the default output file)
DEF_RESULT_IMAGE = 'results/image_043.png'
sd_042.opt_list[0][1] = DEF_RESULT_IMAGE


# ** main関数 **
def main(opt, logger = None):
    # パラメータ設定
    device = sdt._get_device(opt, logger)
    result_image_path = sdt._get_result_image_path(opt, logger)
    result_path = sdt._get_result_path(opt, logger)
    prompt = sdt._get_prompt(opt, logger)
    image = sdt._get_source_image(opt, logger)
    model_path = sdt._get_model_path(opt, logger)
    ctrl_model_path = sdt._get_controlnet_model_path(opt, logger)
    height, width = sdt._get_image_size(opt, logger)
    seed = sdt._get_seed_value(opt, logger)
    num_inference_steps = sdt._get_inference_steps(opt, logger)
    guidance_scale = sdt._get_guidance_scale(opt, logger)
    cc_scale = sdt._get_controlnet_conditioning_scale(opt, logger)

    # 出力フォルダ
    os.makedirs(result_path, exist_ok = True)

    # 複数画像を生成
    plt.figure(figsize=[6, 9.5], dpi = 100)
    for i in range(6):
        cc_scale = 0.6 + 0.08 * i
        img = sd_042.image_generation(model_path, ctrl_model_path, image, prompt, seed, num_inference_steps, width, height, guidance_scale, cc_scale, device)
        plt.subplot(3, 2, i + 1, title = 'control_condition_scale = %.2f'%cc_scale)
        plt.imshow(img)
        plt.axis('off')

        # メモリー開放
        if device == 'cuda':
            torch.cuda.empty_cache()
        elif device == 'mps':
            torch.mps.empty_cache()

    plt.tight_layout()
    plt.savefig(result_image_path)
    plt.close()
    logger.info(f'result_file: {result_image_path｝')

    sdt.image_disp(result_image_path, result_image_path)


# main関数エントリーポイント(実行開始)
if __name__ == "__main__":
    parser = sdt.parse_args(None, sd_042.opt_list)
    opt = parser.parse_args()
    sdt._get_device(opt)
    sdt.display_info(opt, title)

    # アプリケーション・ログ設定
    module = os.path.basename(__file__)
    module_name = os.path.splitext(module)[0]
    logger = my_logging.get_module_logger_sel(module_name, int(opt.log))

    main(opt, logger)

    logger.info('\nFinished.\n')

　※ In the source code above, the half-width '}' is shown as the full-width '｝' for display reasons; replace it before running

Step 44: Converting part of an image with "controlnet inpaint"

"diffusers" provides the following two ways to "inpaint" (modify part of an image):
① Conventional inpaint → Step 38: modify only a specific region (inpaint)
② controlnet inpaint → Steps 44 and 45
A mask image is required: mask image (left) sd_038_test_mask.png, source image (right) sd_038_test.png

Differences in the pipeline object used

Type	Pipeline class
Conventional inpaint	StableDiffusionInpaintPipeline
controlnet inpaint	StableDiffusionControlNetInpaintPipeline
controlnet (for reference)	StableDiffusionControlNetPipeline

Run the program (run time: about 1 second on an RTX 4070 Ti 12GB)

 python sd_044.py


(sd_test) PS > python sd_044.py

Stable Diffusion with diffusers(044)  Ver 0.01: Starting application...

 --result_image             :   results/image_044.png
 --cpu                      :   False
 --log                      :   3
 --model_dir                :   /StabilityMatrix/Data/Models/StableDiffusion
 --model_path               :   SD1.5/beautifulRealistic_brav5.safetensors
 --ctrl_model_dir           :   /StabilityMatrix/Data/Models/ControlNet
 --ctrl_model_path          :   control_v11p_sd15_inpaint_fp16.safetensors
 --image_path               :   images/sd_038_test.png
 --ctrl_image_path          :   images/sd_038_test_mask.png
 --max_size                 :   0
 --prompt                   :   微笑んでいる女性
 --seed                     :   12345678
 --width                    :   512
 --height                   :   512
 --step                     :   20
 --scale                    :   7.0
 --cc_scale                 :   1.0
 --strength                 :   0.6

prompt: Woman smiling
width: 512, height: 512
seed: 12345678
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 16.91it/s]
100%|██████████████████████████████████████████| 12/12 [00:01<00:00, 10.44it/s]
result_file: results/image_044.png

Finished.

The image file "image_044.png" is generated.

Generating with a different prompt
・python sd_044.py --prompt 'prompt text'

 python sd_044.py --prompt '見つめている女性'
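
The scripts on this page declare their options as `[name, default, help]` entries in `opt_list` and hand them to `sdt.parse_args`, whose source is not shown here. As a rough sketch of how such a list could be mapped onto `argparse` (the function `build_parser` is hypothetical, and treating a default of `'store_true'` as a value-less flag is an assumption based on the `cpu` entry):

```python
import argparse

# A shortened copy of the option list used by sd_044.py
OPT_LIST = [
    ['result_image', 'results/image_044.png', 'path to output image file'],
    ['cpu', 'store_true', 'cpu mode'],
    ['prompt', '微笑んでいる女性', 'Prompt text'],
    ['seed', 12345678, 'Seed parameter (-1 = random)'],
    ['strength', 0.6, 'strength value'],
]

def build_parser(opt_list):
    """Hypothetical stand-in for sdt.parse_args: one option per entry."""
    parser = argparse.ArgumentParser()
    for name, default, help_text in opt_list:
        if default == 'store_true':
            # Flag without a value, e.g. --cpu
            parser.add_argument('--' + name, action='store_true', help=help_text)
        else:
            # Reuse the type of the default (str, int or float) for conversion
            parser.add_argument('--' + name, type=type(default),
                                default=default, help=help_text)
    return parser

# Override the prompt and the seed; everything else keeps its default
opt = build_parser(OPT_LIST).parse_args(['--prompt', '見つめている女性', '--seed', '42'])
print(opt.seed, opt.strength, opt.cpu)
```

Options that are not given on the command line keep the defaults from the list, which is why the run logs always print the full set of values.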

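・Source image

・Base model "beautifulRealistic_brav5.safetensors" (photorealistic)

微笑んでいる女性 (smiling)	泣いている女性 (crying)	怒っている女性 (angry)	照れている女性 (shy)	見つめている女性 (staring)	笑っている女性 (laughing)

目を瞑っている女性 (eyes closed)	ウィンクしている女性 (winking)	苛立っている女性 (irritated)	怖がっている女性 (frightened)	驚いている女性 (surprised)	疲れている女性 (tired)

・Base model "animePastelDream_softBakedVae.safetensors" (illustration style)

微笑んでいる女性 (smiling)	泣いている女性 (crying)	怒っている女性 (angry)	照れている女性 (shy)	見つめている女性 (staring)	笑っている女性 (laughing)

目を瞑っている女性 (eyes closed)	ウィンクしている女性 (winking)	苛立っている女性 (irritated)	怖がっている女性 (frightened)	驚いている女性 (surprised)	疲れている女性 (tired)

Module source code

▼ "sd_044.py"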

# -*- coding: utf-8 -*-
##--------------------------------------------------
##  Stable Diffusion with diffusers(044)   Ver 0.01
##
##               2025.07.14 Masahiro Izutsu
##--------------------------------------------------
## sd_044.py    画像から画像生成（controlnet inpaint）
##              === 画像の一部を変換する ===
##  Ver 0.00    2025.07.08  sd_044.py
##  Ver 0.01    2025.07.14  コマンドライン入力対応

##      https://qiita.com/phyblas/items/7cacb9297650afd63d34
##      https://zako-lab929.hatenablog.com/entry/20240212/1707743575

##      プロンプト     '微笑んでいる女性'（デフォールト）
##                     '泣いている女性'
##                     '怒っている女性'
##                     '照れている女性'
##                     '見つめている女性'
##                     '笑っている女性'
##                     '目を瞑っている女性'
##                     'ウィンクしている女性'
##                     '苛立っている女性'
##                     '怖がっている女性'
##                     '驚いている女性'
##                     '疲れている女性'
##
##      model:          control_v11p_sd15_inpaint_fp16.safetensors
##      base model:     SD1.5/beautifulRealistic_brav5.safetensors        （リアル系）
##                      SD1.5/animePastelDream_softBakedVae.safetensors   （イラスト系）
##
##      元画像:         images/sd_038_test.png（デフォールト）
##                      images/sd_044_test1.png
##                      images/sd_044_test2.png
##                      images/sd_044_test3.png
##      マスク画像:     images/sd_038_test_mask.png（デフォールト）
##                      images/sd_044_test1_mask.png
##                      images/sd_044_test2_mask.png
##                      images/sd_044_test3_mask.png

# タイトル
title = 'Stable Diffusion with diffusers(044)  Ver 0.01'

import warnings
warnings.simplefilter('ignore')

# インポート＆初期設定
import os
import torch
from diffusers import StableDiffusionControlNetInpaintPipeline, ControlNetModel, EulerAncestralDiscreteScheduler, logging
from diffusers.utils import load_image
from translate import Translator
import numpy as np

import my_logging
import sd_tools as sdt

logging.set_verbosity_error()

# 定数定義
DEF_MODEL_CNTL = 'control_v11p_sd15_inpaint_fp16.safetensors'
DEF_MODEL_BASE = 'SD1.5/beautifulRealistic_brav5.safetensors'
DEF_IMAGE_PATH = 'images/sd_038_test.png'
DEF_CTRL_IMAGE = 'images/sd_038_test_mask.png'

# コマンドライン定義
opt_list = [
            ['result_image', 'results/image_044.png', 'path to output image file'],                         #  0
            ['cpu', 'store_true', 'cpu mode'],                                                              #  1
            ['log', '3', 'Log level(-1/0/1/2/3/4/5) Default value is \'3\''],                               #  2
            ['model_dir', '/StabilityMatrix/Data/Models/StableDiffusion', 'Model directory'],               #  3
            ['model_path', DEF_MODEL_BASE, 'Model Path'],                                                   #  4
            ['ctrl_model_dir', '/StabilityMatrix/Data/Models/ControlNet', 'ControlNet Model directory'],    #  5
            ['ctrl_model_path', DEF_MODEL_CNTL, 'ControlNet Model Path'],                                   #  6
            ['image_path', DEF_IMAGE_PATH, 'Source image file path'],                                       #  7
            ['ctrl_image_path', DEF_CTRL_IMAGE, 'Control image file path'],                                 #  8
            ['max_size', 0, 'image max size (0=source)'],                                                   #  9
            ['prompt', '微笑んでいる女性', 'Prompt text'],                                                  # 10
            ['seed', 12345678, 'Seed parameter (-1 = random)'],                                             # 11
            ['width', 512, 'image size width'],                                                             # 12
            ['height', 512, 'image size height'],                                                           # 13
            ['step', 20, 'infer step'],                                                                     # 14
            ['scale', 7.0, 'guidance scale'],                                                               # 15
            ['cc_scale', 1.0, 'controlnet conditioning scale'],                                             # 16
            ['strength', 0.6, 'strength value'],                                                            # 17
           ]

# Build the ControlNet control image from the source image and its mask
def make_inpaint_condition(image, image_mask):
    # Normalize both images to float32 in the range [0, 1]
    image = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    image_mask = np.array(image_mask.convert("L")).astype(np.float32) / 255.0

    # Compare height and width (shape[0:1] would check the height only)
    assert image.shape[0:2] == image_mask.shape[0:2], "image and image_mask must have the same image size"
    image[image_mask > 0.5] = -1.0  # set as masked pixel
    # HWC -> NCHW with a batch dimension of 1
    image = np.expand_dims(image, 0).transpose(0, 3, 1, 2)
    image = torch.from_numpy(image)
    return image

# 画像生成
def image_generation(model_path, ctrl_model_path, src_image, msk_image, img_ctrl, prompt, seed, num_inference_steps=20, width=512, height=512, guidance_scale=7.0, cc_scale=1.0, strength=0.6, device='cpu'):
    # パイプラインを作成
    if device == 'cpu':
        controlnet = ControlNetModel.from_single_file(ctrl_model_path).to(device)
        pipeline = StableDiffusionControlNetInpaintPipeline.from_single_file(model_path, controlnet=controlnet).to(device)
    else:
        controlnet = ControlNetModel.from_single_file(ctrl_model_path, torch_dtype=torch.float16).to(device)
        pipeline = StableDiffusionControlNetInpaintPipeline.from_single_file(
                    model_path,
                    controlnet=controlnet,
                    torch_dtype = torch.float16,
                    ).to(device)

    # スケジューラー
    pipeline.scheduler = EulerAncestralDiscreteScheduler.from_config(pipeline.scheduler.config)

    # Generatorオブジェクト作成
    generator = torch.Generator(device).manual_seed(seed)

    # Generate the image
    # Collect the arguments once so that both cases share a single call
    kwargs = dict(
                    prompt = prompt,
                    image = src_image,
                    mask_image = msk_image,
                    control_image = img_ctrl,
                    num_inference_steps = num_inference_steps,
                    width = width,
                    height = height,
                    guidance_scale = guidance_scale,        # was accepted but never passed on
                    controlnet_conditioning_scale = cc_scale,
                    generator = generator
                    )
    # Pass strength only when it is set (otherwise the pipeline default is used)
    if strength is not None:
        kwargs['strength'] = strength
    image = pipeline(**kwargs).images[0]

    return image


# ** main関数 **
def main(opt, logger = None):
    # パラメータ設定
    device = sdt._get_device(opt, logger)
    result_image_path = sdt._get_result_image_path(opt, logger)
    result_path = sdt._get_result_path(opt, logger)
    prompt = sdt._get_prompt(opt, logger)
    src_image = sdt._get_source_image(opt, logger)
    msk_image = sdt._get_control_image(opt, logger)
    img_ctrl = make_inpaint_condition(src_image, msk_image)                     # コントロール画像
    model_path = sdt._get_model_path(opt, logger)
    ctrl_model_path = sdt._get_controlnet_model_path(opt, logger)
    height, width = sdt._get_image_size(opt, logger)
    seed = sdt._get_seed_value(opt, logger)
    num_inference_steps = sdt._get_inference_steps(opt, logger)
    guidance_scale = sdt._get_guidance_scale(opt, logger)
    cc_scale = sdt._get_controlnet_conditioning_scale(opt, logger)
    strength = sdt._get_strength(opt, logger)

    # 出力フォルダ
    os.makedirs(result_path, exist_ok = True)

    # 画像生成
    image = image_generation(model_path, ctrl_model_path, src_image, msk_image, img_ctrl, prompt, seed, num_inference_steps, width, height, guidance_scale, cc_scale, strength, device)
    sdt.image_save2(image, result_image_path, result_image_path)
    logger.info(f'result_file: {result_image_path｝')


# main関数エントリーポイント(実行開始)
if __name__ == "__main__":
    parser = sdt.parse_args(None, opt_list)
    opt = parser.parse_args()
    sdt._get_device(opt)
    sdt.display_info(opt, title)

    # アプリケーション・ログ設定
    module = os.path.basename(__file__)
    module_name = os.path.splitext(module)[0]
    logger = my_logging.get_module_logger_sel(module_name, int(opt.log))

    main(opt, logger)

    logger.info('\nFinished.\n')

　※ In the source code above, the half-width '}' is shown as the full-width '｝' for display reasons; replace it before running

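Step 45: Observing the effect of the strength parameter in "controlnet inpaint"

strength
・A value that determines how much the masked region is changed
・The default is 1 (the region is replaced with entirely new content); sd_044.py uses 0.6 unless --strength is given
・A value between 0 and 1 controls how much of the original image's shape is preserved: the smaller the value, the more is preserved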
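
The progress bars in the run logs on this page hint at how strength interacts with --step: with 20 steps and strength 0.6, only 12 denoising steps are executed. The sketch below reproduces that relationship under the assumption that the pipeline runs `int(num_inference_steps * strength)` steps; `effective_steps` is a hypothetical helper for illustration, not a diffusers function.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps actually run for a given strength (assumed rule)."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)

# sd_044.py defaults: --step 20, --strength 0.6
print(effective_steps(20, 0.6))

# The sweep in sd_045.py covers strength 0.1 to 1.0 in increments of 0.1
for i in range(10):
    strength = (i + 1) / 10
    print(f"strength = {strength:.1f}: {effective_steps(20, strength)} steps")
```

A low strength therefore changes less and also finishes faster, because fewer steps are run.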

Run the program (run time: about 5 seconds on an RTX 4070 Ti 12GB)

 python sd_045.py

(sd_test) PS > python sd_045.py

Stable Diffusion with diffusers(045)  Ver 0.01: Starting application...

 --result_image             :   results/image_045.png
 --cpu                      :   False
 --log                      :   3
 --model_dir                :   /StabilityMatrix/Data/Models/StableDiffusion
 --model_path               :   SD1.5/beautifulRealistic_brav5.safetensors
 --ctrl_model_dir           :   /StabilityMatrix/Data/Models/ControlNet
 --ctrl_model_path          :   control_v11p_sd15_inpaint_fp16.safetensors
 --image_path               :   images/sd_038_test.png
 --ctrl_image_path          :   images/sd_038_test_mask.png
 --max_size                 :   0
 --prompt                   :   微笑んでいる女性
 --seed                     :   12345678
 --width                    :   512
 --height                   :   512
 --step                     :   20
 --scale                    :   7.0
 --cc_scale                 :   1.0
 --strength                 :   0.6

prompt: Woman smiling
width: 512, height: 512
seed: 12345678
Fetching 11 files: 100%|████████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|█████████████| 6/6 [00:00<00:00, 32.00it/s]
100%|█████████████████████████████████████████████| 2/2 [00:00<00:00,  9.69it/s]
Fetching 11 files: 100%|████████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|█████████████| 6/6 [00:00<00:00, 33.38it/s]
100%|█████████████████████████████████████████████| 4/4 [00:00<00:00, 15.51it/s]
Fetching 11 files: 100%|████████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|█████████████| 6/6 [00:00<00:00, 33.10it/s]
100%|█████████████████████████████████████████████| 6/6 [00:00<00:00, 16.51it/s]
Fetching 11 files: 100%|████████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|█████████████| 6/6 [00:00<00:00, 22.99it/s]
100%|█████████████████████████████████████████████| 8/8 [00:00<00:00, 15.07it/s]
Fetching 11 files: 100%|████████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|█████████████| 6/6 [00:00<00:00, 33.78it/s]
100%|███████████████████████████████████████████| 10/10 [00:00<00:00, 15.35it/s]
Fetching 11 files: 100%|████████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|█████████████| 6/6 [00:00<00:00, 33.54it/s]
100%|███████████████████████████████████████████| 12/12 [00:00<00:00, 16.26it/s]
Fetching 11 files: 100%|████████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|█████████████| 6/6 [00:00<00:00, 21.80it/s]
100%|███████████████████████████████████████████| 14/14 [00:00<00:00, 15.63it/s]
Fetching 11 files: 100%|████████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|█████████████| 6/6 [00:00<00:00, 34.68it/s]
100%|███████████████████████████████████████████| 16/16 [00:00<00:00, 16.55it/s]
Fetching 11 files: 100%|████████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|█████████████| 6/6 [00:00<00:00, 34.62it/s]
100%|███████████████████████████████████████████| 18/18 [00:01<00:00, 16.38it/s]
Fetching 11 files: 100%|████████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|█████████████| 6/6 [00:00<00:00, 33.47it/s]
100%|███████████████████████████████████████████| 20/20 [00:01<00:00, 15.80it/s]
result_file: results/image_045.png

Finished.

The image file "image_045.png" is generated.

Module source code

▼ "sd_045.py"

# -*- coding: utf-8 -*-
##--------------------------------------------------
##  Stable Diffusion with diffusers(045)   Ver 0.01
##
##               2025.07.14 Masahiro Izutsu
##--------------------------------------------------
## sd_045.py    画像から画像生成（controlnet inpaint）
##              === strengthを調べる ===
##  Ver 0.00    2025.07.09  sd_045.py
##  Ver 0.01    2025.07.14  コマンドライン入力対応

# タイトル
title = 'Stable Diffusion with diffusers(045)  Ver 0.01'

import warnings
warnings.simplefilter('ignore')

# インポート＆初期設定
import os
import torch
from diffusers import StableDiffusionControlNetInpaintPipeline, ControlNetModel, EulerAncestralDiscreteScheduler, logging
from diffusers.utils import load_image
from translate import Translator
import numpy as np
import matplotlib.pyplot as plt

import my_logging
import sd_tools as sdt
import sd_044

logging.set_verbosity_error()

# 定数定義
DEF_RESULT_IMAGE = 'results/image_045.png'
sd_044.opt_list[0][1] = DEF_RESULT_IMAGE


# ** main関数 **
def main(opt, logger = None):
    # パラメータ設定
    device = sdt._get_device(opt, logger)
    result_image_path = sdt._get_result_image_path(opt, logger)
    result_path = sdt._get_result_path(opt, logger)
    prompt = sdt._get_prompt(opt, logger)
    src_image = sdt._get_source_image(opt, logger)
    msk_image = sdt._get_control_image(opt, logger)
    img_ctrl = sd_044.make_inpaint_condition(src_image, msk_image)                     # コントロール画像
    model_path = sdt._get_model_path(opt, logger)
    ctrl_model_path = sdt._get_controlnet_model_path(opt, logger)
    height, width = sdt._get_image_size(opt, logger)
    seed = sdt._get_seed_value(opt, logger)
    num_inference_steps = sdt._get_inference_steps(opt, logger)
    guidance_scale = sdt._get_guidance_scale(opt, logger)
    cc_scale = sdt._get_controlnet_conditioning_scale(opt, logger)
    strength = sdt._get_strength(opt, logger)

    # 出力フォルダ
    os.makedirs(result_path, exist_ok = True)

    # 複数画像を生成
    plt.figure(figsize=[6, 15.5], dpi = 100)
    for i in range(10):
        strength = 0.1 + i * 0.1
        img = sd_044.image_generation(model_path, ctrl_model_path, src_image, msk_image, img_ctrl, prompt, seed, num_inference_steps, width, height, guidance_scale, cc_scale, strength, device)
        plt.subplot(5, 2, i + 1, title = 'strength = %.1f'%strength)
        plt.imshow(img)
        plt.axis('off')

        # メモリー開放
        if device == 'cuda':
            torch.cuda.empty_cache()
        elif device == 'mps':
            torch.mps.empty_cache()

    plt.tight_layout()
    plt.savefig(result_image_path)
    plt.close()
    logger.info(f'result_file: {result_image_path｝')

    sdt.image_disp(result_image_path, result_image_path)


# main関数エントリーポイント(実行開始)
if __name__ == "__main__":
    parser = sdt.parse_args(None, sd_044.opt_list)
    opt = parser.parse_args()
    sdt._get_device(opt)
    sdt.display_info(opt, title)

    # アプリケーション・ログ設定
    module = os.path.basename(__file__)
    module_name = os.path.splitext(module)[0]
    logger = my_logging.get_module_logger_sel(module_name, int(opt.log))

    main(opt, logger)

    logger.info('\nFinished.\n')

　※ In the source code above, the half-width '}' is shown as the full-width '｝' for display reasons; replace it before running

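Step 46: "outpaint", adding content outside the image

The "controlnet inpaint" function is used to fill in the area outside the image, which implements "outpaint".
≪Processing overview≫
① Prepare a portrait (taller than wide) source image
② Make the image square by filling the left and right with black
③ Create a mask image in which the image area is black (slightly narrower on the left and right than the source image) and the rest is white
　Steps ② and ③ are done inside the program, which prepares a 512x512 source image and mask image
④ Generate the left and right areas with the "controlnet inpaint" function from Step 44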
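
Steps ② and ③ amount to simple coordinate arithmetic. The following sketch computes, for a portrait image, where the image sits on the square canvas and which part of the mask stays black (kept as is); everything else is white and gets generated. `outpaint_layout` and its `margin` parameter are hypothetical names for illustration, not functions of sd_046.py.

```python
def outpaint_layout(img_w: int, img_h: int, margin: int = 16):
    """Return (canvas_size, image_box, keep_box) for a portrait image.

    Boxes are (x0, y0, x1, y1) with exclusive right and bottom edges.
    image_box: where the source image is pasted on the black square canvas.
    keep_box:  black area of the mask (pixels to keep); it is `margin`
               pixels narrower on both sides so the seams are regenerated.
    """
    if img_h < img_w:
        raise ValueError("this sketch handles portrait (or square) images only")
    size = img_h                      # square canvas with the image's height
    x0 = (size - img_w) // 2          # center the image horizontally
    x1 = x0 + img_w
    image_box = (x0, 0, x1, size)
    keep_box = (x0 + margin, 0, x1 - margin, size)
    return size, image_box, keep_box

size, image_box, keep_box = outpaint_layout(img_w=384, img_h=512)
print(size, image_box, keep_box)
```

The script itself builds the mask at the source resolution and then resizes both the mask and the squared image to 512x512.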

Run the program (run time: about 1 second on an RTX 4070 Ti 12GB)

 python sd_046.py

(sd_test) PS > python sd_046.py

Stable Diffusion with diffusers(046)  Ver 0.01: Starting application...

 --result_image             :   results/image_046.png
 --cpu                      :   False
 --log                      :   3
 --model_dir                :   /StabilityMatrix/Data/Models/StableDiffusion
 --model_path               :   SD1.5/beautifulRealistic_brav5.safetensors
 --ctrl_model_dir           :   /StabilityMatrix/Data/Models/ControlNet
 --ctrl_model_path          :   control_v11p_sd15_inpaint_fp16.safetensors
 --image_path               :   images/sd_046_test.png
 --max_size                 :   0
 --prompt                   :   庭に立って微笑んでいる女性
 --seed                     :   12345678
 --width                    :   512
 --height                   :   512
 --step                     :   20
 --scale                    :   7.0
 --cc_scale                 :   1.0

prompt: Woman standing in a garden smiling
width: 512, height: 512
seed: 12345678
Fetching 11 files: 100%|████████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|█████████████| 6/6 [00:00<00:00, 22.32it/s]
100%|███████████████████████████████████████████| 20/20 [00:01<00:00, 11.69it/s]
result_file: results/image_046.png

The image file "image_046.png" is generated.

Module source code

▼ "sd_046.py"

# -*- coding: utf-8 -*-
##--------------------------------------------------
##  Stable Diffusion with diffusers(046)   Ver 0.01
##
##               2025.07.14 Masahiro Izutsu
##--------------------------------------------------
## sd_046.py    画像から画像生成（controlnet outpaint）
##              === 画像の外側を書き加える ===
##  Ver 0.00    2025.07.09  sd_046.py
##  Ver 0.01    2025.07.14  コマンドライン入力対応（strength を設定すると機能しない！）

# タイトル
title = 'Stable Diffusion with diffusers(046)  Ver 0.01'

import warnings
warnings.simplefilter('ignore')

# インポート＆初期設定
import os
import numpy as np
import cv2
import matplotlib.pyplot as plt

import my_imagetool
import my_logging
import sd_tools as sdt
import sd_044

# 定数定義
DEF_RESULT_IMAGE = 'results/image_046.png'
sd_044.opt_list[0][1] = DEF_RESULT_IMAGE
DEF_IMAGE_PATH = 'images/sd_046_test.png'
sd_044.opt_list[7][1] = DEF_IMAGE_PATH
sd_044.opt_list[8][1] = ''                                                      # ctrl_image_path 消去
sd_044.opt_list[10][1] = '庭に立って微笑んでいる女性'                           # プロンプト
sd_044.opt_list[17][1] = ''                                                     # strength 消去（重要）

# マスク作成
def mask_square(image, size):
    img_h, img_w = image.shape[:2]
    x0 = 0
    y0 = 0
    x1 = 0
    y1 = 0
    if img_h > img_w:
        size = img_h
        x0 = int((size - img_w) / 2)
        x1 = x0 + img_w
        y1 = size
    else:
        size = img_w
        y0 = int((size - img_h) / 2)
        y1 = y0 + img_h
        x1 = size

    # Create a white base image
    dist = np.array([size, size, 1])                                            # height x width, 1 channel
    img = np.full(dist, 255, dtype=np.uint8)
    img[y0:y1, x0 + 16:x1 - 16] = 0                                             # black center (16 pixels narrower on each of the left and right sides)
    return img


# ** main関数 **
def main(opt, logger = None):
    # 入力画像の前処理
    src_path = opt.image_path
    s = os.path.splitext(src_path)
    image_path = s[0] + '_src' + s[1]
    mask_path = s[0] + '_msk' + s[1]
    opt.image_path = image_path
    opt.ctrl_image_path = mask_path

    size = 512
    img = cv2.imread(src_path)
    msk = mask_square(img, size)
    msk = my_imagetool.frame_resize(msk, size)
    my_imagetool.image_disp(msk,  mask_path, False, mask_path)                  # マスク画像保存
    img = my_imagetool.frame_square(img, (0, 0, 0))
    img = my_imagetool.frame_resize(img, size)
    my_imagetool.image_disp(img, image_path, False, image_path)                 # ソース画像保存

    # パラメータ設定
    device = sdt._get_device(opt, logger)
    result_image_path = sdt._get_result_image_path(opt, logger)
    result_path = sdt._get_result_path(opt, logger)
    prompt = sdt._get_prompt(opt, logger)
    src_image = sdt._get_source_image(opt, logger)
    msk_image = sdt._get_control_image(opt, logger)
    img_ctrl = sd_044.make_inpaint_condition(src_image, msk_image)              # コントロール画像
    model_path = sdt._get_model_path(opt, logger)
    ctrl_model_path = sdt._get_controlnet_model_path(opt, logger)
    height, width = sdt._get_image_size(opt, logger)
    seed = sdt._get_seed_value(opt, logger)
    num_inference_steps = sdt._get_inference_steps(opt, logger)
    guidance_scale = sdt._get_guidance_scale(opt, logger)
    cc_scale = sdt._get_controlnet_conditioning_scale(opt, logger)
    strength = sdt._get_strength(opt, logger)

    # 出力フォルダ
    os.makedirs(result_path, exist_ok = True)

    # 画像生成
    image = sd_044.image_generation(model_path, ctrl_model_path, src_image, msk_image, img_ctrl, prompt, seed, num_inference_steps, width, height, guidance_scale, cc_scale, strength, device)
    sdt.image_save2(image, result_image_path, result_image_path)
    logger.info(f'result_file: {result_image_path｝')


# main関数エントリーポイント(実行開始)
if __name__ == "__main__":
    parser = sdt.parse_args(None, sd_044.opt_list)
    opt = parser.parse_args()
    sdt._get_device(opt)
    sdt.display_info(opt, title)

    # アプリケーション・ログ設定
    module = os.path.basename(__file__)
    module_name = os.path.splitext(module)[0]
    logger = my_logging.get_module_logger_sel(module_name, int(opt.log))

    main(opt, logger)

    logger.info('\nFinished.\n')

　※ In the source code above, the half-width '}' is shown as the full-width '｝' for display reasons; replace it before running

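Step 47: "controlnet scribble", generating an image from a hand-drawn line drawing

Generates an image from a hand-drawn illustration, guided by a text prompt

Run the program (run time: about 1 second on an RTX 4070 Ti 12GB)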

 python sd_047.py

　Line drawing (left) sd_047.png, generated image (right) image_047.png

(sd_test) PS > python sd_047.py

Stable Diffusion with diffusers(047)  Ver 0.01: Starting application...

 --result_image             :   results/image_047.png
 --cpu                      :   False
 --log                      :   3
 --model_dir                :   /StabilityMatrix/Data/Models/StableDiffusion
 --model_path               :   SD1.5/v1-5-pruned-emaonly.safetensors
 --ctrl_model_dir           :   /StabilityMatrix/Data/Models/ControlNet
 --ctrl_model_path          :   control_v11p_sd15_scribble_fp16.safetensors
 --image_path               :   images/sd_047.png
 --max_size                 :   0
 --prompt                   :   テーブル上の白いコーヒーカップ
 --seed                     :   12345678
 --step                     :   20

prompt: White coffee cup on the table
seed: 12345678
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:01<00:00,  5.70it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 10.11it/s]
result_file: results/image_047.png

Finished.

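The image file "image_047.png" is generated.

Generating with a different prompt
・python sd_047.py --prompt 'prompt text'

 python sd_047.py --prompt '木製のテーブルの上に置かれた白いコーヒーカップ'

・Base model "v1-5-pruned-emaonly.safetensors"

Line drawing	テーブル上の白いコーヒーカップ (white coffee cup on the table)	木製のテーブルの上に置かれた白いコーヒーカップ (white coffee cup placed on a wooden table)	ビーチに置かれたオレンジ色のコーヒーカップ (orange coffee cup placed on a beach)

Module source code

▼ "sd_047.py"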

# -*- coding: utf-8 -*-
##--------------------------------------------------
##  Stable Diffusion with diffusers(047)   Ver 0.01
##
##               2025.07.14 Masahiro Izutsu
##--------------------------------------------------
## sd_047.py    画像から画像生成（controlnet scribble）
##              === 手描きの線画から画像を生成 ===
##  Ver 0.00    2025.07.10  sd_047.py
##  Ver 0.01    2025.07.14  コマンドライン入力対応

##      https://blog.mindboardapps.com/posts/stable-diffusion-and-control-net-img2img/

##      線画画像:       images/sd_047.png
##                      images/sd_047_1.png
##                      images/sd_047_2.png
##
##      Prompts:        'テーブル上の白いコーヒーカップ' (default)
##                      '木製のテーブルの上に置かれた白いコーヒーカップ'
##                      'ビーチに置かれたオレンジ色のコーヒーカップ'

# タイトル
title = 'Stable Diffusion with diffusers(047)  Ver 0.01'

import warnings
warnings.simplefilter('ignore')

# インポート＆初期設定
import os
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, logging
from diffusers.utils import load_image
from translate import Translator
import numpy as np

import my_logging
import sd_tools as sdt

logging.set_verbosity_error()

# 定数定義
DEF_MODEL_CNTL = 'control_v11p_sd15_scribble_fp16.safetensors'
DEF_MODEL_BASE = 'SD1.5/v1-5-pruned-emaonly.safetensors'
DEF_IMAGE_PATH = 'images/sd_047.png'

# コマンドライン定義
opt_list = [
            ['result_image', 'results/image_047.png', 'path to output image file'],                         #  0
            ['cpu', 'store_true', 'cpu mode'],                                                              #  1
            ['log', '3', 'Log level(-1/0/1/2/3/4/5) Default value is \'3\''],                               #  2
            ['model_dir', '/StabilityMatrix/Data/Models/StableDiffusion', 'Model directory'],               #  3
            ['model_path', DEF_MODEL_BASE, 'Model Path'],                                                   #  4
            ['ctrl_model_dir', '/StabilityMatrix/Data/Models/ControlNet', 'ControlNet Model directory'],    #  5
            ['ctrl_model_path', DEF_MODEL_CNTL, 'ControlNet Model Path'],                                   #  6
            ['image_path', DEF_IMAGE_PATH, 'Source image file path'],                                       #  7
            ['max_size', 0, 'image max size (0=source)'],                                                   #  8
            ['prompt', 'テーブル上の白いコーヒーカップ', 'Prompt text'],                                    #  9
            ['seed', 12345678, 'Seed parameter (-1 = random)'],                                             # 10
            ['step', 20, 'infer step'],                                                                     # 11
           ]

# 画像生成
def image_generation(model_path, ctrl_model_path, src_image, prompt, seed, num_inference_steps=20, device='cpu'):
    # パイプラインを作成
    if device == 'cpu':
        controlnet = ControlNetModel.from_single_file(ctrl_model_path).to(device)
        pipeline = StableDiffusionControlNetPipeline.from_single_file(model_path, controlnet = controlnet).to(device)
    else:
        controlnet = ControlNetModel.from_single_file(ctrl_model_path, torch_dtype = torch.float16).to(device)
        pipeline = StableDiffusionControlNetPipeline.from_single_file(
                    model_path,
                    controlnet = controlnet,
                    torch_dtype = torch.float16,
                    ).to(device)

    # Generatorオブジェクト作成
    generator = torch.Generator(device).manual_seed(seed)

    # 画像を生成
    image = pipeline(
                    prompt = prompt,
                    image = src_image,
                    num_inference_steps = num_inference_steps,
                    generator = generator
                    ).images[0]

    return image


# ** main関数 **
def main(opt, logger = None):
    # パラメータ設定
    device = sdt._get_device(opt, logger)
    result_image_path = sdt._get_result_image_path(opt, logger)
    result_path = sdt._get_result_path(opt, logger)
    prompt = sdt._get_prompt(opt, logger)
    src_image = sdt._get_source_image(opt, logger)
    model_path = sdt._get_model_path(opt, logger)
    ctrl_model_path = sdt._get_controlnet_model_path(opt, logger)
    seed = sdt._get_seed_value(opt, logger)
    num_inference_steps = sdt._get_inference_steps(opt, logger)

    # 出力フォルダ
    os.makedirs(result_path, exist_ok = True)

    # 画像生成
    image = image_generation(model_path, ctrl_model_path, src_image, prompt, seed, num_inference_steps, device)
    sdt.image_save2(image, result_image_path, result_image_path)
    logger.info(f'result_file: {result_image_path｝')


# main関数エントリーポイント(実行開始)
if __name__ == "__main__":
    parser = sdt.parse_args(None, opt_list)
    opt = parser.parse_args()
    sdt._get_device(opt)
    sdt.display_info(opt, title)

    # アプリケーション・ログ設定
    module = os.path.basename(__file__)
    module_name = os.path.splitext(module)[0]
    logger = my_logging.get_module_logger_sel(module_name, int(opt.log))

    main(opt, logger)

    logger.info('\nFinished.\n')

　※ In the source code above, the half-width '}' is shown as the full-width '｝' for display reasons; replace it before running

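Step 48: "controlnet openpose", generating an image with the same pose as a source image

Estimates the pose from the source image and generates an image with the same pose, guided by a text prompt

Install the additional package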
```
(sd_test) PS > pip install controlnet_aux
```

Run the program (run time: about 2 seconds on an RTX 4070 Ti 12GB)

 python sd_048.py

　Estimated pose (left) sd_048_test1_pose.png, source image (right) sd_048_test1.png

(sd_test) PS > python sd_048.py

Stable Diffusion with diffusers(048)  Ver 0.01: Starting application...

 --result_image             :   results/image_048.png
 --cpu                      :   False
 --log                      :   3
 --model_dir                :   /StabilityMatrix/Data/Models/StableDiffusion
 --model_path               :   SD1.5/beautifulRealistic_brav5.safetensors
 --ctrl_model_dir           :   /StabilityMatrix/Data/Models/ControlNet
 --ctrl_model_path          :   control_v11p_sd15_openpose_fp16.safetensors
 --image_path               :   images/sd_048_test1.png
 --max_size                 :   0
 --prompt                   :   ダンスを踊る女性
 --seed                     :   -1
 --step                     :   20

prompt: Dancing Woman
seed: 2142389556
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 15.00it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 10.27it/s]
result_file: results/image_048_2142389556.png

Finished.

The image file "image_048_2142389556.png" is generated (the number at the end of the file name is the seed)
Generating with a specified seed
・python sd_048.py --seed 'seed value (-1 = random)'
```
(sd_test) PS > python sd_048.py --seed 1595966935
```
・Base models: "beautifulRealistic_brav5.safetensors" (realistic) / "animePastelDream_softBakedVae.safetensors" (illustration)

Source image	Estimated pose	Generated image ①	Generated image ②	Generated image ③
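
The seed handling and the seed-suffixed file name described above can be sketched as follows. The file-name rule mirrors what `main()` in sd_048.py does with `os.path.splitext`; `resolve_seed` and `seeded_path` are hypothetical names, and the random range is an assumption because the internals of `sdt._get_seed_value` are not shown on this page.

```python
import os
import random


def resolve_seed(seed: int) -> int:
    """Return the seed to use: a fresh random value when seed is -1, else the given value."""
    if seed == -1:
        # Assumed range; the real helper may draw from a different one.
        return random.randint(0, 2**32 - 1)
    return seed


def seeded_path(result_image_path: str, seed: int) -> str:
    """Insert '_<seed>' before the file extension."""
    root, ext = os.path.splitext(result_image_path)
    return root + '_' + str(seed) + ext


seed = resolve_seed(1595966935)
print(seeded_path('results/image_048.png', seed))  # results/image_048_1595966935.png
```

Because the seed is part of the file name, repeated runs with random seeds do not overwrite each other, and any result can be reproduced by passing its suffix back through `--seed`.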

Module source code

▼ "sd_048.py"

# -*- coding: utf-8 -*-
##--------------------------------------------------
##  Stable Diffusion with diffusers(048)   Ver 0.01
##
##               2025.07.14 Masahiro Izutsu
##--------------------------------------------------
## sd_048.py    Image-to-image generation (controlnet openpose)
##              === Generate an image with the same pose from an image ===
##  Ver 0.00    2025.07.11  sd_048.py
##  Ver 0.01    2025.07.14  Added command-line input support

##      https://note.com/npaka/n/n06b9ca7994a4
##      https://huggingface.co/lllyasviel/control_v11p_sd15_openpose

##      prompt      'ダンスを踊る女性' (default)
##
##      model:      control_v11p_sd15_openpose_fp16.safetensors
##      base model: SD1.5/beautifulRealistic_brav5.safetensors        (realistic)
##                  SD1.5/animePastelDream_softBakedVae.safetensors   (illustration)
##
##      source:     images/sd_048_test1.png
##                  images/sd_048_test2.png
##                  images/sd_048_test3.png

# Title
title = 'Stable Diffusion with diffusers(048)  Ver 0.01'

import warnings
warnings.simplefilter('ignore')

# Imports and initial setup
import os
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, logging
from diffusers.utils import load_image
from translate import Translator
from controlnet_aux import OpenposeDetector

import numpy as np
import sys
import random

import my_logging
import sd_tools as sdt
import sd_047

logging.set_verbosity_error()

# Constant definitions
DEF_MODEL_CNTL = 'control_v11p_sd15_openpose_fp16.safetensors'
DEF_MODEL_BASE = 'SD1.5/beautifulRealistic_brav5.safetensors'
DEF_IMAGE_PATH = 'images/sd_048_test1.png'

# Command-line option definitions: [name, default, help]
opt_list = [
            ['result_image', 'results/image_048.png', 'path to output image file'],                         #  0
            ['cpu', 'store_true', 'cpu mode'],                                                              #  1
            ['log', '3', 'Log level(-1/0/1/2/3/4/5) Default value is \'3\''],                               #  2
            ['model_dir', '/StabilityMatrix/Data/Models/StableDiffusion', 'Model directory'],               #  3
            ['model_path', DEF_MODEL_BASE, 'Model Path'],                                                   #  4
            ['ctrl_model_dir', '/StabilityMatrix/Data/Models/ControlNet', 'ControlNet Model directory'],    #  5
            ['ctrl_model_path', DEF_MODEL_CNTL, 'ControlNet Model Path'],                                   #  6
            ['image_path', DEF_IMAGE_PATH, 'Source image file path'],                                       #  7
            ['max_size', 0, 'image max size (0=source)'],                                                   #  8
            ['prompt', 'ダンスを踊る女性', 'Prompt text'],                                                  #  9
            ['seed', -1, 'Seed parameter (-1 = random)'],                                                   # 10
            ['step', 20, 'infer step'],                                                                     # 11
           ]


# ** main function **
def main(opt, logger = None):
    # Preprocess the input image: the pose image is saved as '<name>_pose<ext>'
    src_path = opt.image_path
    s = os.path.splitext(src_path)
    pose_path = s[0] + '_pose' + s[1]

    # Estimate the pose from the source image, then save and display it
    src_image = sdt._get_source_image(opt, logger)
    openpose_detector = OpenposeDetector.from_pretrained('lllyasviel/ControlNet')
    openpose_image = openpose_detector(src_image)
    openpose_image.save(pose_path)
    sdt.image_disp(pose_path, pose_path)
    # From here on, the pose image is used as the input image
    opt.image_path = pose_path

    # Parameter setup
    device = sdt._get_device(opt, logger)
    result_image_path = sdt._get_result_image_path(opt, logger)
    result_path = sdt._get_result_path(opt, logger)
    prompt = sdt._get_prompt(opt, logger)
    src_image = sdt._get_source_image(opt, logger)
    model_path = sdt._get_model_path(opt, logger)
    ctrl_model_path = sdt._get_controlnet_model_path(opt, logger)
    seed = sdt._get_seed_value(opt, logger)
    num_inference_steps = sdt._get_inference_steps(opt, logger)

    # Output folder
    os.makedirs(result_path, exist_ok = True)

    # Image generation (reuses the generation function of sd_047.py)
    image = sd_047.image_generation(model_path, ctrl_model_path, src_image, prompt, seed, num_inference_steps, device)

    s = os.path.splitext(result_image_path)
    save_path = s[0] + '_' + str(seed) + s[1]
    sdt.image_save2(image, save_path, save_path)
    logger.info(f'result_file: {save_path}')


# Entry point of the main function (execution starts here)
if __name__ == "__main__":
    parser = sdt.parse_args(None, opt_list)
    opt = parser.parse_args()
    sdt._get_device(opt)
    sdt.display_info(opt, title)

    # Application logging setup
    module = os.path.basename(__file__)
    module_name = os.path.splitext(module)[0]
    logger = my_logging.get_module_logger_sel(module_name, int(opt.log))

    main(opt, logger)

    logger.info('\nFinished.\n')

Note: if a full-width '｝' appears in the source code above because of how the page renders it, replace it with the half-width '}' before running.
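
The `opt_list` rows in the listing above follow a `[name, default, help]` pattern, with `'store_true'` marking a flag that takes no value. How `sdt.parse_args` consumes these rows is not shown on this page, so the following is only a guess at one way such a table could be turned into an `argparse` parser; `build_parser` is a hypothetical name and the option rows are a shortened copy of the ones above.

```python
import argparse


def build_parser(opt_list):
    """Build an ArgumentParser from [name, default, help] rows (hypothetical stand-in)."""
    parser = argparse.ArgumentParser()
    for name, default, help_text in opt_list:
        if default == 'store_true':
            # Flag option: takes no value, False unless given on the command line.
            parser.add_argument('--' + name, action='store_true', help=help_text)
        else:
            # Value option: convert the command-line string to the type of the default.
            parser.add_argument('--' + name, type=type(default),
                                default=default, help=help_text)
    return parser


sample_opts = [
    ['cpu', 'store_true', 'cpu mode'],
    ['log', '3', 'Log level'],
    ['seed', -1, 'Seed parameter (-1 = random)'],
    ['step', 20, 'infer step'],
]

opt = build_parser(sample_opts).parse_args(['--seed', '1595966935'])
print(opt.cpu, opt.log, opt.seed, opt.step)  # False 3 1595966935 20
```

Keeping the options in a plain list lets every script in this series share one parser builder and differ only in its table of defaults.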


Memorandum †


error: (-2:Unspecified error) The function is not implemented. †

Error details: cv2.error: OpenCV(4.12.0)

    :
cv2.error: OpenCV(4.12.0) D:\a\opencv-python\opencv-python\opencv\modules\highgui\src\window.cpp:1284: error: (-2:Unspecified error) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Cocoa support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvNamedWindow'

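Reference site
・Python: How to resolve OpenCV errors

Fix
・Uninstall the lightweight OpenCV build opencv-python-headless together with opencv-python, then reinstall opencv-python
```
 pip uninstall opencv-python-headless
 pip uninstall opencv-python
 pip install opencv-python
```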


AttributeError: module 'cv2' has no attribute 'INTER_AREA' †

Error details: AttributeError with OpenCV(4.12.0)

    :
  File "C:\Users\XXXX\anaconda3\envs\sd_test\Lib\site-packages\controlnet_aux\midas\midas\transforms.py", line 6, in <module>
    def apply_min_size(sample, size, image_interpolation_method=cv2.INTER_AREA):
                                                                ^^^^^^^^^^^^^^
AttributeError: module 'cv2' has no attribute 'INTER_AREA'

Reference sites
・Bug: AttributeError: module 'cv2' has no attribute 'INTER_AREA' #14023
・Installing OpenCV with pip

Fix
・Install the extended OpenCV build opencv-contrib-python
```
 pip install opencv-contrib-python
```
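
Both fixes above change which OpenCV distribution is installed, so it can help to see what is present before and after running pip. The following is a small sketch using only the standard library; `installed_opencv_dists` is a hypothetical helper name, and the list covers the four OpenCV wheels published on PyPI.

```python
from importlib import metadata

# The four OpenCV distributions on PyPI; all of them provide the same 'cv2' module.
OPENCV_DISTS = [
    'opencv-python',
    'opencv-python-headless',
    'opencv-contrib-python',
    'opencv-contrib-python-headless',
]


def installed_opencv_dists():
    """Return {distribution name: version} for every OpenCV wheel found in this environment."""
    found = {}
    for name in OPENCV_DISTS:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This wheel is not installed; skip it.
            pass
    return found


dists = installed_opencv_dists()
print(dists)
if len(dists) > 1:
    print('More than one OpenCV wheel is installed; they share the cv2 module and may conflict.')
```

Run it inside the same environment as the scripts (here `sd_test`); a headless entry in the result is a likely cause of the `cvNamedWindow` error above.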


Revision history †

2025/07/05 First edition


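References †

Diffusers

Stable Diffusion

Books and other publications
- Nikkei Software, July 2025 issue: "ローカル生成AIプログラミング" (Local Generative AI Programming)
- Interface, March 2025 issue: "画像による異常検出＆ローカルLLM作り - 仕事のための生成AI" (Image-Based Anomaly Detection & Building a Local LLM - Generative AI for Work)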

