Generative AI Programming == Work in progress ==

Building on the results verified so far, this page writes generative AI programs in Python.

※ Last updated: 2025/06/13

Getting Started with Stable Diffusion Using diffusers

Environment Setup

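Frameworks and libraries to prepare first

Library	Overview
PyTorch	Machine learning framework for deep learning
Transformers	Library for training and inference with Transformer-based natural language processing models
Diffusers	Library of diffusion models, used for image generation and similar tasks
Accelerate	Library that simplifies distributed training and speed-ups in PyTorch
SciPy	Library for numerical computation

Set up an environment in which "Anaconda" runs
→ Anaconda environment setup
Build the virtual environment "sd_test" with a specified "Python" version

・Create it with Python 3.11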
```
(base) PS > conda create -n sd_test python=3.11 -y
```

Activate the virtual environment

(base) PS > conda activate sd_test
(sd_test) PS >

Install the "PyTorch" build that matches your environment
・Open the official site https://pytorch.org/ and obtain the install command for your environment
```
(sd_test) PS > pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
```

Install the other packages

(sd_test) PS > pip install transformers diffusers accelerate scipy

From here on, install any additional packages as they become necessary
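To confirm that the packages above actually landed in the "sd_test" environment, a minimal sketch like the following can list their installed versions. It relies only on the standard library's importlib.metadata; the helper name installed_versions is hypothetical and is not part of any of these libraries.

```python
from importlib import metadata

# Distribution names as passed to "pip install" in the steps above
PACKAGES = ["torch", "transformers", "diffusers", "accelerate", "scipy"]


def installed_versions(names):
    """Return {name: version string, or None when the package is missing}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:15s} {version or 'NOT INSTALLED'}")
```

Run it inside the activated environment; any line reporting NOT INSTALLED points to a package that still needs a pip install.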

Prerequisites

The project is run with the following folder layout

:\ (drive root)
├─anaconda_win/
│  ├─workspace_3/
│  │  ├─sd_test/                      ← project working folder
   :
├─StabilityMatrix/
│  └─Data/
│      ├─Models/
│      │   ├─StableDiffusion/
│      │   │   ├─SD1.5/                ← location of the SD1.5 models
│      │   │   └─・・・・・・          ← location of the SDXL models

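・The "StabilityMatrix" folder must exist directly under the root of the drive that holds the "workspace_3/sd_test/" folder
・The programs use the already downloaded models stored at the designated location inside "StabilityMatrix"
・If your models are stored elsewhere, change the model path in the program sources below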
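The layout rule above (the models live under "StabilityMatrix" at the root of the drive that holds the working folder) can be sketched with pathlib. This is only an illustration under that assumption: the helper name resolve_model_path and the constant MODELS_SUBDIR are hypothetical, and the programs on this page simply hard-code a path that starts with "/" instead.

```python
from pathlib import Path
from typing import Optional

# Folder chain taken from the layout shown above
MODELS_SUBDIR = Path("StabilityMatrix") / "Data" / "Models" / "StableDiffusion"


def resolve_model_path(relative: str, start: Optional[Path] = None) -> Path:
    """Build the model path on the same drive as `start` (default: current folder)."""
    start = (start or Path.cwd()).resolve()
    drive_root = Path(start.anchor)  # e.g. "C:\\" on Windows, "/" on Linux
    return drive_root / MODELS_SUBDIR / relative


if __name__ == "__main__":
    path = resolve_model_path("SD1.5/v1-5-pruned-emaonly.safetensors")
    # Checking existence up front gives a clearer error than a failed model load
    print(path, "exists:", path.exists())
```

If the models are stored somewhere else, only MODELS_SUBDIR (or the returned path) needs to change.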

Check whether the GPU can be used

(sd_test) PS > python -c 'import torch;print(torch.cuda.is_available())'
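The same check can also be done inside a program so that it falls back to the CPU when no GPU is visible. The following is a minimal sketch; pick_device is a hypothetical helper name, while torch.cuda.is_available() is the same call used in the one-liner above.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch is installed and can see a GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("device:", pick_device())
```

The returned string can be assigned to the device variable in the programs below in place of the hard-coded 'cuda'.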

Step 1: The simplest image generation program

"sd_001.py"

## sd_001.py "Nature and waterfall photo"

import torch
from diffusers import StableDiffusionPipeline

# Path to the model file
model = "/StabilityMatrix/Data/Models/StableDiffusion/SD1.5/v1-5-pruned-emaonly.safetensors"

# "cuda" to use the GPU, "cpu" otherwise
device = 'cuda'

# Create the pipeline
pipeline = StableDiffusionPipeline.from_single_file(model).to(device)

# Prompt
prompt = "nature and waterfall photography"

# Generate the image
response = pipeline(prompt=prompt)
image = response.images[0]
image.save("image_001.png")

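・Create the pipeline by specifying the model path and the device

Model type	Pipeline class
SD1.5	StableDiffusionPipeline
SDXL	StableDiffusionXLPipeline

・The generated images are returned as a list; since there is only one here, it is retrieved with .images[0]
・The image is a PIL object and can be saved with its save method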
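The model-type table above can be turned into a small lookup so that one script handles both SD1.5 and SDXL files. This is a sketch, not the method used by the programs on this page: the names PIPELINE_CLASS_NAMES, pipeline_class_name and load_pipeline are hypothetical, and diffusers is imported lazily so the lookup itself works without it.

```python
# Model type -> diffusers pipeline class name, as listed in the table above
PIPELINE_CLASS_NAMES = {
    "SD1.5": "StableDiffusionPipeline",
    "SDXL": "StableDiffusionXLPipeline",
}


def pipeline_class_name(model_type: str) -> str:
    """Look up the pipeline class name for a model type."""
    try:
        return PIPELINE_CLASS_NAMES[model_type]
    except KeyError:
        raise ValueError(f"unsupported model type: {model_type!r}") from None


def load_pipeline(model_type: str, model_path: str, device: str = "cuda"):
    """Create the matching pipeline from a single .safetensors file."""
    import diffusers  # imported here so the lookup above has no heavy dependency

    pipeline_cls = getattr(diffusers, pipeline_class_name(model_type))
    return pipeline_cls.from_single_file(model_path).to(device)
```

For example, load_pipeline("SD1.5", model) would build the same pipeline as sd_001.py, and passing "SDXL" with an SDXL model file would select StableDiffusionXLPipeline instead.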

Run the program

(sd_test) PS > python sd_001.py

Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00,  8.43it/s]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
100%|██████████████████████████████████████████| 50/50 [00:05<00:00,  8.50it/s]

The image file "image_001.png" is generated

Step 2: Suppressing unnecessary output and specifying the image size

"sd_002.py"

## sd_002.py "Nature and waterfall photo" (suppress log messages and specify the image size)

import torch
from diffusers import StableDiffusionPipeline, logging      ## suppress unnecessary log output
logging.set_verbosity_error()                               ##

# Path to the model file
model = "/StabilityMatrix/Data/Models/StableDiffusion/SD1.5/v1-5-pruned-emaonly.safetensors"

# "cuda" to use the GPU, "cpu" otherwise
device = 'cuda'

# Create the pipeline
pipeline = StableDiffusionPipeline.from_single_file(model).to(device)

# Prompt
prompt = "nature and waterfall photography"

# Generate the image
response = pipeline(prompt=prompt, width=768, height=512)   ## output size 768x512
image = response.images[0]
image.save("image_002.png")

Run the program

(sd_test) PS > python sd_002.py

Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00,  8.99it/s]
100%|██████████████████████████████████████████| 50/50 [00:09<00:00,  5.54it/s]

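The image file "image_002.png" is generated

The repeatedly displayed warning message is suppressed, so that only truly serious errors are output
Each model has recommended image sizes, so specify a size that suits the model in use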
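Besides the recommended sizes (SD1.5 models are generally trained around 512x512 and SDXL models around 1024x1024), the Stable Diffusion pipelines in diffusers reject a width or height that is not divisible by 8. A minimal sketch for snapping a requested size to a valid one follows; the helper names snap_to_multiple_of_8 and snap_size are hypothetical.

```python
def snap_to_multiple_of_8(value: int) -> int:
    """Round a pixel dimension down to the nearest multiple of 8."""
    if value < 8:
        raise ValueError("dimension must be at least 8 pixels")
    return (value // 8) * 8


def snap_size(width: int, height: int) -> tuple:
    """Return a (width, height) pair that is safe to pass to the pipeline."""
    return snap_to_multiple_of_8(width), snap_to_multiple_of_8(height)


if __name__ == "__main__":
    print(snap_size(770, 515))
```

The result can be passed on as pipeline(prompt=prompt, width=w, height=h); a size such as 768x512 is already valid and is returned unchanged.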

Step 3: Half precision for faster generation and lower memory use (GPU only)

"sd_003.py"

## sd_003.py "Nature and waterfall photo" (half precision for speed and memory savings)

import torch
from diffusers import StableDiffusionPipeline, logging      ## suppress unnecessary log output
logging.set_verbosity_error()                               ##

# Path to the model file
model = "/StabilityMatrix/Data/Models/StableDiffusion/SD1.5/v1-5-pruned-emaonly.safetensors"

# "cuda" to use the GPU, "cpu" otherwise
device = 'cuda'

# Create the pipeline (weights loaded as float16)
pipeline = StableDiffusionPipeline.from_single_file(
                model,
                torch_dtype=torch.float16
                ).to(device)

# Prompt
prompt = "nature and waterfall photography"

# Generate the image
response = pipeline(prompt=prompt, width=768, height=512)   ## output size 768x512
image = response.images[0]
image.save("image_003.png")

Run the program

(sd_test) PS > python sd_003.py
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:01<00:00,  4.30it/s]
100%|██████████████████████████████████████████| 50/50 [00:03<00:00, 14.21it/s]

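The image file "image_003.png" is generated

Generation time dropped from 9 s to 3 s (example: RTX 4070 Ti)
・Setting torch_dtype to float16 (half precision) roughly halves the memory used by the model weights and also shortens generation time (default: float32)
・Half precision cannot be used when running on the CPU, so keep float32 there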
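The two points above can be sketched as a small rule: float16 on the GPU, float32 on the CPU, with 2 bytes per value instead of 4. The helper names choose_dtype_name and weight_memory_bytes are hypothetical, and the parameter count in the example is a round illustrative number, not a measured model size.

```python
# Storage size of one value in each precision
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def choose_dtype_name(device: str) -> str:
    """Use half precision only on a CUDA device; stay at float32 on the CPU."""
    return "float16" if device.startswith("cuda") else "float32"


def weight_memory_bytes(num_params: int, dtype_name: str) -> int:
    """Memory needed just to hold `num_params` weights in the given precision."""
    return num_params * BYTES_PER_PARAM[dtype_name]


if __name__ == "__main__":
    params = 1_000_000_000  # illustrative round number (assumption)
    for device in ("cuda", "cpu"):
        name = choose_dtype_name(device)
        gib = weight_memory_bytes(params, name) / 2**30
        print(f"{device}: {name}, about {gib:.2f} GiB for weights")
```

In a program like sd_003.py the chosen name could be applied as torch_dtype=getattr(torch, choose_dtype_name(device)), which keeps the same script usable on machines without a GPU.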

Step 4: Specifying the number of steps for faster generation

"sd_004.py"

## sd_004.py "Nature and waterfall photo" (specify the number of steps)

import torch
from diffusers import StableDiffusionPipeline, logging      ## suppress unnecessary log output
logging.set_verbosity_error()                               ##

# Path to the model file
model = "/StabilityMatrix/Data/Models/StableDiffusion/SD1.5/v1-5-pruned-emaonly.safetensors"

# "cuda" to use the GPU, "cpu" otherwise
device = 'cuda'

# Create the pipeline
pipeline = StableDiffusionPipeline.from_single_file(model).to(device)

# Prompt
prompt = "nature and waterfall photography"

# Generate the image
response = pipeline(prompt=prompt, num_inference_steps=20, width=768, height=512)   ## 20 steps, output size 768x512
image = response.images[0]
image.save("image_004.png")

Run the program

(sd_test) PS > python sd_004.py
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 10.10it/s]
100%|██████████████████████████████████████████| 20/20 [00:03<00:00,  5.28it/s]

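The image file "image_004.png" is generated

The diffusers default is 50 steps (20 to 30 may be a reasonable range)
Generation time at the default single precision dropped from 9 s to 3 s (example: RTX 4070 Ti)
・Switching to float16 (half precision) makes it faster still (9 s → 1 s)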
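The timings above follow from simple arithmetic: the denoising loop takes roughly steps divided by iterations per second. A minimal sketch using the it/s figures printed in the progress bars of Step 2 and Step 4 is shown here; estimated_seconds is a hypothetical helper, and the estimate ignores model loading and the final image decoding.

```python
def estimated_seconds(num_steps: int, iterations_per_second: float) -> float:
    """Approximate duration of the denoising loop."""
    if iterations_per_second <= 0:
        raise ValueError("iterations_per_second must be positive")
    return num_steps / iterations_per_second


if __name__ == "__main__":
    # Figures taken from the progress bars shown on this page (float32, 768x512)
    print(f"50 steps: {estimated_seconds(50, 5.54):.1f} s")
    print(f"20 steps: {estimated_seconds(20, 5.28):.1f} s")
```

Because the speed per step stays about the same, cutting num_inference_steps from 50 to 20 shortens the loop by about the same ratio.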

Update History

2025/06/12 First edition

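References

Stable Diffusion
- 猫耳とdiffusersで始めるStable Diffusion入門 (An introduction to Stable Diffusion with cat ears and diffusers)
- 実は衝撃的に簡単だった Stable Diffusion の使い方 (How to use Stable Diffusion, which turned out to be surprisingly easy)

Books and magazines
- Nikkei Software, July 2025 issue: 「ローカル生成AIプログラミング」 (Local generative AI programming)
- Interface, March 2025 issue: 「画像による異常検出＆ローカルLLM作り - 仕事のための生成AI」 (Image-based anomaly detection and building a local LLM: generative AI for work)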