
Generative AI Programming 2 (work in progress)


  Based on the results verified so far, this page writes generative AI programs in Python.

Contents

Generative AI Programming 2 (work in progress)
References

Last updated: 2025/06/15

Getting Started with Stable Diffusion Using diffusers (Advanced)

  Generating an image from an image: img2img

Step 30: The simplest image-to-image generation program

img2img: image-to-image generation

| Model type | Base image size | Pipeline class |
| SD1.5 | 512x512 | StableDiffusionImg2ImgPipeline |
| SDXL | 1024x1024 | StableDiffusionXLImg2ImgPipeline |
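
As a rough illustration of how the base sizes in the table might be applied, the sketch below computes a resize target for a source image so that its longer side matches the model's base size. The helper name `target_size`, the `BASE_SIZE` dictionary, and the rounding of each side to a multiple of 8 are assumptions made for this example; the scripts on this page do not perform this step.

```python
# Base image size per model type, taken from the table above.
BASE_SIZE = {"SD1.5": 512, "SDXL": 1024}


def target_size(width, height, model_type, multiple=8):
    """Return (w, h) scaled so the longer side equals the base size."""
    base = BASE_SIZE[model_type]
    scale = base / max(width, height)
    # Round each side to the nearest multiple (assumed to be 8 here).
    w = max(multiple, round(width * scale / multiple) * multiple)
    h = max(multiple, round(height * scale / multiple) * multiple)
    return w, h


if __name__ == "__main__":
    print(target_size(1000, 700, "SD1.5"))
    print(target_size(1000, 700, "SDXL"))
```

The returned tuple could be passed to `Image.resize()` before the image is handed to the pipeline.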

Convert an anime-style image into a realistic-style one
Example: model used: beautifulRealistic_brav5

"sd_030.py" (source image: StableDiffusion_247.png)

## sd_030.py  Image-to-image generation (img2img)
## model:   beautifulRealistic_brav5.safetensors

from pathlib import Path

import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline, DPMSolverMultistepScheduler, logging
from translate import Translator

logging.set_verbosity_error()

# Path to the model file and to the source image
model_path = "/StabilityMatrix/Data/Models/StableDiffusion/SD1.5/beautifulRealistic_brav5.safetensors"
image_path = "images/StableDiffusion_247.png"

# Use "cuda" to run on the GPU, "cpu" otherwise
device = 'cuda'

# Seed value
seed = 12345678

# Create the pipeline
pipeline = StableDiffusionImg2ImgPipeline.from_single_file(
                    model_path,
                    torch_dtype = torch.float16,
                    ).to(device)

# Set the scheduler
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)

# Prompt (written in Japanese, then translated to English)
trans = Translator(to_lang = 'en', from_lang = 'ja').translate
prompt_jp = '黒髪で短い髪の女性'
prompt = trans(prompt_jp)

# Source image (converted to RGB in case the PNG has an alpha channel)
src_image = Image.open(image_path).convert('RGB')

# Create the Generator object
generator = torch.Generator(device).manual_seed(seed)

print(f'Seed: {seed}, Model: {model_path}')
print(f'prompt : {prompt_jp} → {prompt}')

# Generate the image
image = pipeline(
                    prompt = prompt,
                    image = src_image,
                    num_inference_steps = 30,
                    guidance_scale = 7,
                    strength = 0.6,
                    generator = generator
                    ).images[0]

# Make sure the output folder exists before saving
Path("results").mkdir(exist_ok = True)
image.save("results/image_030.png")

Run the program

(sd_test) PS > python sd_030.py

Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 10.30it/s]
Seed: 12345678, Model: /StabilityMatrix/Data/Models/StableDiffusion/SD1.5/beautifulRealistic_brav5.safetensors
prompt : 黒髪で短い髪の女性 → a woman with short black hair
100%|██████████████████████████████████████████| 18/18 [00:01<00:00, 15.78it/s]

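The image file "image_030.png" is generated.

The result varies greatly with the parameter settings, the prompt, and the model used.
The strength value, a parameter used only in img2img, is particularly important.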

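Step 31: Adjusting the strength of the change (strength)

strength is the parameter that sets how strongly the source image is changed.
Its value ranges from 0 to 1 (0 = the source image is kept as is, 1 = the source image is ignored completely).
The number of steps actually run during generation is num_inference_steps × strength (for example, 30 × 0.6 = 18).
A smaller strength therefore runs faster: less change means fewer steps and a shorter generation time.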
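
The relationship between `num_inference_steps` and `strength` can be sketched as a small helper. The function name `effective_steps` is made up for this example, and truncating the product to an integer is an assumption. It does agree with the step counts in the logs on this page (30 steps at strength 0.6 shows 18 denoising steps).

```python
def effective_steps(num_inference_steps, strength):
    """Estimate how many denoising steps img2img actually runs."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    # Assumption: the product is truncated to an integer and capped.
    return min(int(num_inference_steps * strength), num_inference_steps)


if __name__ == "__main__":
    for tenths in range(1, 11):
        strength = tenths / 10
        steps = effective_steps(30, strength)
        print(f"strength = {strength:.1f} -> {steps} steps")
```

A helper like this could be used to estimate generation time before running a sweep over several strength values.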

"sd_031.py"

## sd_031.py  Image-to-image generation: the strength parameter
## model:   beautifulRealistic_brav5.safetensors

from pathlib import Path

import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline, DPMSolverMultistepScheduler, logging
from translate import Translator
import matplotlib.pyplot as plt

logging.set_verbosity_error()

# Generate one image for the given strength
def image_generation(strength):
    # Create the pipeline (rebuilt on every call)
    pipeline = StableDiffusionImg2ImgPipeline.from_single_file(
                    model_path,
                    torch_dtype = torch.float16,
                    ).to(device)

    # Set the scheduler
    pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)

    # Create the Generator object (same seed for every strength)
    generator = torch.Generator(device).manual_seed(seed)

    # Generate the image
    img = pipeline(
                    prompt = prompt,
                    image = src_image,
                    num_inference_steps = 30,
                    guidance_scale = 7,
                    strength = strength,
                    generator = generator
                    ).images[0]
    return img

# Path to the model file and to the source image
model_path = "/StabilityMatrix/Data/Models/StableDiffusion/SD1.5/beautifulRealistic_brav5.safetensors"
image_path = "images/StableDiffusion_247.png"

# Use "cuda" to run on the GPU, "cpu" otherwise
device = 'cuda'

# Seed value
seed = 12345678

# Prompt (written in Japanese, then translated to English)
trans = Translator(to_lang = 'en', from_lang = 'ja').translate
prompt_jp = '黒髪で短い髪の女性'
#prompt_jp = 'テラスでコーヒーを飲む金髪の女性'
prompt = trans(prompt_jp)

# Source image (converted to RGB in case the PNG has an alpha channel)
src_image = Image.open(image_path).convert('RGB')

print(f'Seed: {seed}, Model: {model_path}')
print(f'prompt : {prompt_jp} → {prompt}')

# Generate one image per strength value (0.1 to 1.0) and plot them in a grid
plt.figure(figsize = [6, 15.5], dpi = 100)
for i in range(10):
    strength = 0.1 + i * 0.1
    img = image_generation(strength)
    plt.subplot(5, 2, i + 1, title = "strength = %.1f" % strength)
    plt.imshow(img)
    plt.axis('off')

    # Free cached GPU memory
    if device == 'cuda':
        torch.cuda.empty_cache()
    elif device == 'mps':
        torch.mps.empty_cache()

# Make sure the output folder exists before saving
Path("results").mkdir(exist_ok = True)
plt.tight_layout()
plt.savefig('results/image_031.png')
plt.close()

Run the program

(sd_test) PS > python sd_031.py

Seed: 12345678, Model: /StabilityMatrix/Data/Models/StableDiffusion/SD1.5/beautifulRealistic_brav5.safetensors
prompt : 黒髪で短い髪の女性 → a woman with short black hair
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 15.31it/s]
100%|████████████████████████████████████████████| 3/3 [00:00<00:00, 16.70it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 26.95it/s]
100%|████████████████████████████████████████████| 6/6 [00:00<00:00, 26.25it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 34.83it/s]
100%|████████████████████████████████████████████| 9/9 [00:00<00:00, 25.62it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 34.48it/s]
100%|██████████████████████████████████████████| 12/12 [00:00<00:00, 25.21it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 22.46it/s]
100%|██████████████████████████████████████████| 15/15 [00:00<00:00, 24.61it/s]
Fetching 11 files: 100%|████████████████████| 11/11 [00:00<00:00, 11032.36it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 34.50it/s]
100%|██████████████████████████████████████████| 18/18 [00:00<00:00, 24.45it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 29.71it/s]
100%|██████████████████████████████████████████| 21/21 [00:00<00:00, 24.55it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 34.32it/s]
100%|██████████████████████████████████████████| 24/24 [00:00<00:00, 24.17it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 34.67it/s]
100%|██████████████████████████████████████████| 27/27 [00:01<00:00, 24.19it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 23.52it/s]
100%|██████████████████████████████████████████| 30/30 [00:01<00:00, 24.25it/s]

The image file "image_031.png" is generated.
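
In the log above, "Fetching 11 files" and "Loading pipeline components" appear once per strength value, because `image_generation()` rebuilds the pipeline on every call. One possible variation, sketched below, builds the pipeline once and reuses it for every strength. The function name `sweep_strength` and its `make_generator` argument are hypothetical; `pipeline` is assumed to be any callable that accepts keyword arguments and returns an object with an `images` list, as the img2img pipeline in sd_031.py does.

```python
def sweep_strength(pipeline, make_generator, strengths, **common):
    """Run one already-built pipeline once per strength value."""
    images = []
    for strength in strengths:
        result = pipeline(
            strength=strength,
            # A fresh generator per run keeps the seed identical each time.
            generator=make_generator(),
            **common,
        )
        images.append(result.images[0])
    return images


# 0.1, 0.2, ..., 1.0 (dividing integers avoids accumulating float error)
STRENGTHS = [(i + 1) / 10 for i in range(10)]
```

With the variables from sd_031.py, the call might look like `sweep_strength(pipeline, lambda: torch.Generator(device).manual_seed(seed), STRENGTHS, prompt=prompt, image=src_image, num_inference_steps=30, guidance_scale=7)`, where `pipeline` is created once before the call.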

| Prompt | Japanese input | Automatic English translation |
| ① | 黒髪で短い髪の女性 | a woman with short black hair |
| ② | テラスでコーヒーを飲む金髪の女性 | Blonde drinking coffee on the terrace |
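
The translation step could be wrapped so that prompts already written in English are passed through unchanged. This is only a sketch: `needs_translation` and `to_english` are hypothetical helpers, the ASCII check is a crude assumption about what counts as English, and `translate` stands for any callable such as the `trans` function used in the scripts above.

```python
def needs_translation(text):
    """Return True if the text contains any non-ASCII character."""
    return not text.isascii()


def to_english(prompt, translate):
    """Translate the prompt only when it does not look like English."""
    if needs_translation(prompt):
        return translate(prompt)
    return prompt


if __name__ == "__main__":
    # Stand-in translator for demonstration; it makes no network call.
    fake = {"黒髪で短い髪の女性": "a woman with short black hair"}
    print(to_english("黒髪で短い髪の女性", fake.get))
    print(to_english("a woman with short black hair", fake.get))
```

In the scripts on this page, `prompt = to_english(prompt_jp, trans)` would take the place of `prompt = trans(prompt_jp)`.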

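Notes

Update history

2025/06/15 First edition

References

Stable Diffusion

Books and magazines
- Nikkei Software, July 2025 issue, "Local Generative AI Programming"
- Interface, March 2025 issue, "Anomaly Detection with Images & Building a Local LLM - Generative AI for Work"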