Based on the results verified so far, we now write generative AI programs in Python.

Generating images from images (img2img)

Reference site: "Image modification / compositing / conversion / correction / upscaling with diffusers (Stable Diffusion)"
```
(base) PS > conda activate sd_test
(sd_test) PS > cd workspace_3/sd_test
```
Measured execution times (mm:ss, or hh:mm:ss for longer runs):

| Step | Program | Script | RTX 4070Ti (GPU) | RTX 4060 (GPU) | RTX 4060L (GPU) | RTX 3050 (GPU) | GTX 1050 (GPU) | i7-1260P (CPU) |
|------|---------|--------|------------------|----------------|-----------------|----------------|----------------|----------------|
| 30 | Simplest image-to-image generation | sd_030.py | 00:01 | 00:01 | 00:05 | 00:03 | 00:19 | × |
| 31 | Adjusting the strength of the change (strength) | sd_031.py | 00:10 | 00:12 | 00:15 | 00:28 | 02:49 | × |
| 32 | Prompt weight (guidance_scale) | sd_032.py | 00:18 | 00:53 | 01:06 | 02:45 | 14:01 | × |
| 33 | [SDXL] Combining models (refiner) | sd_033.py | 06:00 | 07:03 | 08:21 | 12:20 | 26:46 | × |
| 34 | [SDXL] Combining models (refiner), parameter comparison | sd_034.py | 50:00 | 01:06:59 | 56:26 | 01:25:33 | 02:54:54 | × |
| 35 | Converting the latent space (latent) | sd_035.py | 00:02 | 00:02 | 00:27 | 00:29 | 00:41 | × |
| 36 | Upscaling the source image 4x (x4 upscaler) | sd_036.py | 00:05 | 02:07 | 03:53 | 02:07 | 02:51 | × |
| 37 | 2x upscaling in latent space (x2 latent upscaler) | sd_037.py | 00:04 | 00:07 | 02:42 | 01:35 | 13:17 | × |
| 38 | Modifying only a specific region (inpaint) | sd_038.py | 00:01 | 00:55 | 03:05 | 02:42 | 02:45 | × |
| Model type | Base image size | Pipeline class |
|------------|-----------------|----------------|
| SD1.5 | 512x512 | StableDiffusionImg2ImgPipeline |
| SDXL | 1024x1024 | StableDiffusionXLImg2ImgPipeline |
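The img2img pipeline class has to match the model family in the table above. A minimal sketch of selecting it by model type (the helper function and its mapping are illustrative, not part of the scripts below):

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline, StableDiffusionXLImg2ImgPipeline

# Illustrative mapping from model family to img2img pipeline class.
IMG2IMG_PIPELINES = {
    'SD1.5': StableDiffusionImg2ImgPipeline,    # base size 512x512
    'SDXL':  StableDiffusionXLImg2ImgPipeline,  # base size 1024x1024
}

def make_img2img_pipeline(model_type, model_path, device):
    cls = IMG2IMG_PIPELINES[model_type]
    return cls.from_single_file(model_path, torch_dtype=torch.float16).to(device)
```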
```python
## sd_030.py  Image-to-image generation (img2img)
## model: beautifulRealistic_brav5.safetensors
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline, DPMSolverMultistepScheduler, logging
from translate import Translator
import sd_tools as sdt

logging.set_verbosity_error()

# Path to the model file
model_path = "/StabilityMatrix/Data/Models/StableDiffusion/SD1.5/beautifulRealistic_brav5.safetensors"
image_path = "images/StableDiffusion_247.png"
# "cuda" to use the GPU, otherwise "cpu"
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# seed value
seed = 12345678

# Create the pipeline
pipeline = StableDiffusionImg2ImgPipeline.from_single_file(
    model_path,
    torch_dtype = torch.float16,
).to(device)
# Scheduler setup
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)

# Prompt (translated from Japanese to English at run time)
trans = Translator('en', 'ja').translate
prompt_jp = '黒髪で短い髪の女性'
prompt = trans(prompt_jp)

src_image = Image.open(image_path)

# Create the Generator object
generator = torch.Generator(device).manual_seed(seed)

print(f'Seed: {seed}, Model: {model_path}')
print(f'prompt : {prompt_jp} → {prompt}')

# Generate the image
image = pipeline(
    prompt = prompt,
    image = src_image,
    num_inference_steps = 30,
    guidance_scale = 7,
    strength = 0.6,
    generator = generator
).images[0]

#image.save("results/image_030.png")
save_path = 'results/image_030.png'
sdt.image_save2(image, save_path, save_path)
```
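All the scripts in this section rely on the local helper module sd_tools (sdt.image_save2 / sdt.image_disp), whose source is not shown here. If you only need the scripts to run, a minimal stand-in could look like the sketch below; its behavior is an assumption inferred from how the helpers are called, not the real module:

```python
# sd_tools.py -- hypothetical minimal stand-in for the local helper module.
# Assumption: image_save2(img, save_path, disp_path) saves the image and then
# displays it; image_disp(path, disp_path) displays an image file. The real
# module presumably uses disp_path for its display window.
from PIL import Image

def image_save2(img, save_path, disp_path):
    img.save(save_path)          # write the generated image to disk
    image_disp(save_path, disp_path)

def image_disp(path, disp_path):
    Image.open(path).show()      # open in the default viewer
```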
```
(sd_test) PS > python sd_030.py
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 10.30it/s]
Seed: 12345678, Model: /StabilityMatrix/Data/Models/StableDiffusion/SD1.5/beautifulRealistic_brav5.safetensors
prompt : 黒髪で短い髪の女性 → a woman with short black hair
100%|██████████████████████████████████████████| 18/18 [00:01<00:00, 15.78it/s]
```
```python
## sd_031.py  Image-to-image generation: the strength parameter
## model: beautifulRealistic_brav5.safetensors
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline, DPMSolverMultistepScheduler, logging
from translate import Translator
import matplotlib.pyplot as plt
import sd_tools as sdt

logging.set_verbosity_error()

# Image generation
def image_generation(strength):
    # Create the pipeline
    pipeline = StableDiffusionImg2ImgPipeline.from_single_file(
        model_path,
        torch_dtype = torch.float16,
    ).to(device)
    # Scheduler setup
    pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
    # Create the Generator object
    generator = torch.Generator(device).manual_seed(seed)
    # Generate the image
    img = pipeline(
        prompt = prompt,
        image = src_image,
        num_inference_steps = 30,
        guidance_scale = 7,
        strength = strength,
        generator = generator
    ).images[0]
    return img

# Path to the model file
model_path = "/StabilityMatrix/Data/Models/StableDiffusion/SD1.5/beautifulRealistic_brav5.safetensors"
image_path = "images/StableDiffusion_247.png"
# "cuda" to use the GPU, otherwise "cpu"
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# seed value
seed = 12345678

# Prompt (translated from Japanese to English at run time)
trans = Translator('en', 'ja').translate
prompt_jp = '黒髪で短い髪の女性'
#prompt_jp = 'テラスでコーヒーを飲む金髪の女性'
prompt = trans(prompt_jp)

src_image = Image.open(image_path)

print(f'Seed: {seed}, Model: {model_path}')
print(f'prompt : {prompt_jp} → {prompt}')

# Generate a series of images
plt.figure(figsize = [6, 15.5], dpi = 100)
for i in range(10):
    strength = 0.1 + i * 0.1
    img = image_generation(strength)
    plt.subplot(5, 2, i + 1, title = "strength = %.1f" % strength)
    plt.imshow(img)
    plt.axis('off')
    # Release GPU memory
    if device == 'cuda':
        torch.cuda.empty_cache()
    elif device == 'mps':
        torch.mps.empty_cache()
plt.tight_layout()

save_path = 'results/image_031.png'
plt.savefig(save_path)
plt.close()
sdt.image_disp(save_path, save_path)
```
```
(sd_test) PS > python sd_031.py
Seed: 12345678, Model: /StabilityMatrix/Data/Models/StableDiffusion/SD1.5/beautifulRealistic_brav5.safetensors
prompt : 黒髪で短い髪の女性 → a woman with short black hair
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 15.31it/s]
100%|████████████████████████████████████████████| 3/3 [00:00<00:00, 16.70it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 26.95it/s]
100%|████████████████████████████████████████████| 6/6 [00:00<00:00, 26.25it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 34.83it/s]
100%|████████████████████████████████████████████| 9/9 [00:00<00:00, 25.62it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 34.48it/s]
100%|██████████████████████████████████████████| 12/12 [00:00<00:00, 25.21it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 22.46it/s]
100%|██████████████████████████████████████████| 15/15 [00:00<00:00, 24.61it/s]
Fetching 11 files: 100%|████████████████████| 11/11 [00:00<00:00, 11032.36it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 34.50it/s]
100%|██████████████████████████████████████████| 18/18 [00:00<00:00, 24.45it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 29.71it/s]
100%|██████████████████████████████████████████| 21/21 [00:00<00:00, 24.55it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 34.32it/s]
100%|██████████████████████████████████████████| 24/24 [00:00<00:00, 24.17it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 34.67it/s]
100%|██████████████████████████████████████████| 27/27 [00:01<00:00, 24.19it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 23.52it/s]
100%|██████████████████████████████████████████| 30/30 [00:01<00:00, 24.25it/s]
```
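The progress bars make the meaning of strength visible: img2img adds strength-proportional noise to the source image and then denoises, so it only runs about strength × num_inference_steps of the 30 requested steps (3, 6, ..., 30 above, and 18 in the sd_030.py run with strength = 0.6). A quick check of the relation diffusers uses internally:

```python
# Effective img2img step count (mirrors diffusers' get_timesteps logic).
def effective_steps(num_inference_steps, strength):
    return min(int(num_inference_steps * strength), num_inference_steps)

for i in range(10):
    s = 0.1 + i * 0.1
    print(f'strength = {s:.1f} -> {effective_steps(30, s)} steps')
```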
| Prompt | Japanese input | Automatic English translation |
|--------|----------------|-------------------------------|
| ① | 黒髪で短い髪の女性 | a woman with short black hair |
| ② | テラスでコーヒーを飲む金髪の女性 | Blonde drinking coffee on the terrace |
```python
## sd_032.py  Image-to-image generation: prompt weight (guidance_scale)
## model: beautifulRealistic_brav5.safetensors
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline, DPMSolverMultistepScheduler, logging
from translate import Translator
import matplotlib.pyplot as plt
import sd_tools as sdt

logging.set_verbosity_error()

# Image generation
def image_generation(g_scale):
    # Create the pipeline
    pipeline = StableDiffusionImg2ImgPipeline.from_single_file(
        model_path,
        torch_dtype = torch.float16,
    ).to(device)
    # Scheduler setup
    pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
    # Create the Generator object
    generator = torch.Generator(device).manual_seed(seed)
    # Generate the image
    img = pipeline(
        prompt = prompt,
        image = src_image,
        num_inference_steps = 30,
        guidance_scale = g_scale,
        strength = 0.5,
        generator = generator
    ).images[0]
    return img

# Path to the model file
model_path = "/StabilityMatrix/Data/Models/StableDiffusion/SD1.5/beautifulRealistic_brav5.safetensors"
image_path = "images/kaisendon.jpg"
# "cuda" to use the GPU, otherwise "cpu"
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# seed value
seed = 12345678

# Prompt (translated from Japanese to English at run time)
trans = Translator('en', 'ja').translate
prompt_jp = 'ラーメン'
#prompt_jp = '鰻丼'
prompt = trans(prompt_jp)

src_image = Image.open(image_path)

print(f'Seed: {seed}, Model: {model_path}')
print(f'prompt : {prompt_jp} → {prompt}')

# Generate a series of images
plt.figure(figsize = [6, 9.5], dpi = 100)
for i in range(6):
    img = image_generation(i * 2)
    plt.subplot(3, 2, i + 1, title = 'guidance_scale = %d' % (i * 2))
    plt.imshow(img)
    plt.axis('off')
    # Release GPU memory
    if device == 'cuda':
        torch.cuda.empty_cache()
    elif device == 'mps':
        torch.mps.empty_cache()
plt.tight_layout()

save_path = 'results/image_032.png'
plt.savefig(save_path)
plt.close()
sdt.image_disp(save_path, save_path)
```
```
(sd_test) PS > python sd_032.py
Seed: 12345678, Model: /StabilityMatrix/Data/Models/StableDiffusion/SD1.5/beautifulRealistic_brav5.safetensors
prompt : ラーメン → Ramen
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 14.86it/s]
100%|██████████████████████████████████████████| 15/15 [00:02<00:00, 7.26it/s]
Fetching 11 files: 100%|█████████████████████| 11/11 [00:00<00:00, 8801.48it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 33.02it/s]
100%|██████████████████████████████████████████| 15/15 [00:03<00:00, 3.87it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 33.53it/s]
100%|██████████████████████████████████████████| 15/15 [00:03<00:00, 3.87it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 33.18it/s]
100%|██████████████████████████████████████████| 15/15 [00:03<00:00, 3.86it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 22.71it/s]
100%|██████████████████████████████████████████| 15/15 [00:03<00:00, 3.86it/s]
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 33.00it/s]
100%|██████████████████████████████████████████| 15/15 [00:03<00:00, 3.86it/s]
```
| Prompt | Japanese input | Automatic English translation |
|--------|----------------|-------------------------------|
| ① | ラーメン | Ramen |
| ② | 鰻丼 | Eel Rice Bowl |
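guidance_scale controls classifier-free guidance: at each step the model predicts the noise both with and without the prompt and extrapolates between the two, so larger values push the result harder toward the prompt. Schematically, this is the combination diffusers applies internally (for guidance_scale ≤ 1 the pipelines skip the unconditional pass and use the text-conditional prediction as-is):

```python
# Classifier-free guidance, schematically:
# the final noise estimate is the unconditional prediction pushed
# toward the text-conditional one by a factor of guidance_scale.
def apply_cfg(noise_uncond, noise_text, guidance_scale):
    return noise_uncond + guidance_scale * (noise_text - noise_uncond)
```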
```python
## sd_033.py  [SDXL] Combining models (refiner)
## model: animexlXuebimix_v60LCM.safetensors
##        fudukiMix_v20.safetensors
import torch
from PIL import Image
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline, EulerAncestralDiscreteScheduler, logging
from translate import Translator
import matplotlib.pyplot as plt
import sd_tools as sdt

logging.set_verbosity_error()

# Paths to the model files
model_base_path = "/StabilityMatrix/Data/Models/StableDiffusion/animexlXuebimix_v60LCM.safetensors"
model_ref_path = "/StabilityMatrix/Data/Models/StableDiffusion/fudukiMix_v20.safetensors"
# "cuda" to use the GPU, otherwise "cpu"
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# seed value
seed = 12345678

# Pipeline for the base model
pipe_base = StableDiffusionXLPipeline.from_single_file(
    model_base_path,
    torch_dtype = torch.float16
).to(device)
# Scheduler setup
pipe_base.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe_base.scheduler.config)

# Pipeline for the refiner model
pipe_ref = StableDiffusionXLImg2ImgPipeline.from_single_file(
    model_ref_path,
    torch_dtype = torch.float16,
    scheduler = pipe_base.scheduler   # share the same scheduler
).to(device)

# Create the Generator object
generator = torch.Generator(device).manual_seed(seed)

# Prompt (translated from Japanese to English at run time)
trans = Translator('en', 'ja').translate
prompt_jp = '猫を抱いている短い髪のの女性'
prompt = trans(prompt_jp)

print(f'Seed: {seed}')
print(f'Model1: {model_base_path}')
print(f'Model2: {model_ref_path}')
print(f'prompt : {prompt_jp} → {prompt}')

# Generate with the base model
img0 = pipe_base(
    prompt,
    num_inference_steps = 20,
    generator = generator,
    denoising_end = 0.4,      # stop the generation partway through
    output_type = 'latent'    # return the latent representation
).images

# Continue the generation with the refiner model
image = pipe_ref(
    prompt,
    image = img0,
    num_inference_steps = 20,
    generator = generator,
    denoising_start = 0.4,    # resume the generation from that point
).images[0]

#image.save('results/image_033.png')
save_path = 'results/image_033.png'
sdt.image_save2(image, save_path, save_path)
```
```
(sd_test) PS > python sd_033.py
Fetching 17 files: 100%|███████████████████████████████| 17/17 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00, 4.16it/s]
Fetching 17 files: 100%|████████████████████| 17/17 [00:00<00:00, 17009.34it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00, 6.38it/s]
Seed: 12345678
Model1: /StabilityMatrix/Data/Models/StableDiffusion/animexlXuebimix_v60LCM.safetensors
Model2: /StabilityMatrix/Data/Models/StableDiffusion/fudukiMix_v20.safetensors
prompt : 猫を抱いている短い髪のの女性 → a short-haired woman holding a cat
100%|████████████████████████████████████████████| 8/8 [02:04<00:00, 15.57s/it]
100%|██████████████████████████████████████████| 12/12 [03:47<00:00, 18.99s/it]
```
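denoising_end / denoising_start split one 20-step schedule between the two models: with the boundary at 0.4 the base model runs the first 8 denoising steps and hands over the latent, and the refiner runs the remaining 12, which is exactly what the two progress bars above show. A sanity check of the split (approximate; the exact boundary handling lives inside the pipelines):

```python
# Step split between base and refiner at denoising_end = denoising_start = 0.4.
num_inference_steps = 20
sep = 0.4
base_steps = round(num_inference_steps * sep)       # 8 steps in pipe_base
refiner_steps = num_inference_steps - base_steps    # 12 steps in pipe_ref
print(base_steps, refiner_steps)                    # 8 12
```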
```python
## sd_034.py  [SDXL] Combining models (refiner) 2: parameter comparison
## model: animexlXuebimix_v60LCM.safetensors
##        fudukiMix_v20.safetensors
import torch
from PIL import Image
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline, EulerAncestralDiscreteScheduler, logging
from translate import Translator
import matplotlib.pyplot as plt
import sd_tools as sdt

logging.set_verbosity_error()

# Image generation
def image_generation(sep):
    # Pipeline for the base model
    pipe_base = StableDiffusionXLPipeline.from_single_file(
        model_base_path,
        torch_dtype = torch.float16
    ).to(device)
    # Scheduler setup
    pipe_base.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe_base.scheduler.config)
    # Pipeline for the refiner model
    pipe_ref = StableDiffusionXLImg2ImgPipeline.from_single_file(
        model_ref_path,
        torch_dtype = torch.float16,
        scheduler = pipe_base.scheduler   # share the same scheduler
    ).to(device)
    # Create the Generator object
    generator = torch.Generator(device).manual_seed(seed)
    # Generate with the base model
    img0 = pipe_base(
        prompt,
        num_inference_steps = 20,
        generator = generator,
        denoising_end = sep,      # stop the generation partway through
        output_type = 'latent'    # return the latent representation
    ).images
    # Continue the generation with the refiner model
    img = pipe_ref(
        prompt,
        image = img0,
        num_inference_steps = 20,
        generator = generator,
        denoising_start = sep,    # resume the generation from that point
    ).images[0]
    return img

# Paths to the model files
model_base_path = "/StabilityMatrix/Data/Models/StableDiffusion/animexlXuebimix_v60LCM.safetensors"
model_ref_path = "/StabilityMatrix/Data/Models/StableDiffusion/fudukiMix_v20.safetensors"
# "cuda" to use the GPU, otherwise "cpu"
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# seed value
seed = 12345678

# Prompt (translated from Japanese to English at run time)
trans = Translator('en', 'ja').translate
prompt_jp = '庭で兎と遊んでいる女性'
prompt = trans(prompt_jp)

print(f'Seed: {seed}')
print(f'Model1: {model_base_path}')
print(f'Model2: {model_ref_path}')
print(f'prompt : {prompt_jp} → {prompt}')

# Generate a series of images
plt.figure(figsize = [6, 12.5], dpi = 100)
for i in range(8):
    sep = 0.1 + 0.1 * i
    img = image_generation(sep)
    plt.subplot(4, 2, i + 1, title = '%.1f' % sep)
    plt.imshow(img)
    plt.axis('off')
    # Release GPU memory
    if device == 'cuda':
        torch.cuda.empty_cache()
    elif device == 'mps':
        torch.mps.empty_cache()
plt.tight_layout(pad = 0.5)

save_path = 'results/image_034.png'
plt.savefig(save_path)
plt.close()
sdt.image_disp(save_path, save_path)
```
```
(sd_test) PS > python sd_034.py
Seed: 12345678
Model1: /StabilityMatrix/Data/Models/StableDiffusion/animexlXuebimix_v60LCM.safetensors
Model2: /StabilityMatrix/Data/Models/StableDiffusion/fudukiMix_v20.safetensors
prompt : 庭で兎と遊んでいる女性 → Woman playing with a rabbit in the garden
Fetching 17 files: 100%|███████████████████████████████| 17/17 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:00<00:00, 8.33it/s]
Fetching 17 files: 100%|████████████████████| 17/17 [00:00<00:00, 16989.08it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:00<00:00, 14.65it/s]
100%|████████████████████████████████████████████| 2/2 [00:22<00:00, 11.01s/it]
100%|██████████████████████████████████████████| 18/18 [05:08<00:00, 17.14s/it]
Fetching 17 files: 100%|███████████████████████████████| 17/17 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:00<00:00, 14.60it/s]
Fetching 17 files: 100%|████████████████████| 17/17 [00:00<00:00, 17021.52it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:00<00:00, 11.85it/s]
100%|████████████████████████████████████████████| 4/4 [00:52<00:00, 13.06s/it]
100%|██████████████████████████████████████████| 16/16 [04:44<00:00, 17.77s/it]
Fetching 17 files: 100%|███████████████████████████████| 17/17 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00, 6.44it/s]
Fetching 17 files: 100%|████████████████████| 17/17 [00:00<00:00, 15972.93it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00, 5.16it/s]
100%|████████████████████████████████████████████| 6/6 [01:31<00:00, 15.20s/it]
100%|██████████████████████████████████████████| 14/14 [04:14<00:00, 18.19s/it]
Fetching 17 files: 100%|█████████████████████| 17/17 [00:00<00:00, 5665.73it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00, 6.50it/s]
Fetching 17 files: 100%|████████████████████| 17/17 [00:00<00:00, 16876.49it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00, 6.68it/s]
100%|████████████████████████████████████████████| 8/8 [01:52<00:00, 14.07s/it]
100%|██████████████████████████████████████████| 12/12 [03:31<00:00, 17.66s/it]
Fetching 17 files: 100%|███████████████████████████████| 17/17 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00, 5.73it/s]
Fetching 17 files: 100%|█████████████████████| 17/17 [00:00<00:00, 8408.39it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00, 6.37it/s]
100%|██████████████████████████████████████████| 10/10 [02:25<00:00, 14.54s/it]
100%|██████████████████████████████████████████| 10/10 [03:42<00:00, 22.20s/it]
Fetching 17 files: 100%|███████████████████████████████| 17/17 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:00<00:00, 7.43it/s]
Fetching 17 files: 100%|███████████████████████████████| 17/17 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00, 5.48it/s]
100%|██████████████████████████████████████████| 12/12 [02:55<00:00, 14.66s/it]
100%|████████████████████████████████████████████| 8/8 [02:14<00:00, 16.78s/it]
Fetching 17 files: 100%|████████████████████| 17/17 [00:00<00:00, 17058.17it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:00<00:00, 7.51it/s]
Fetching 17 files: 100%|█████████████████████| 17/17 [00:00<00:00, 7249.20it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00, 6.62it/s]
100%|██████████████████████████████████████████| 14/14 [03:20<00:00, 14.31s/it]
100%|████████████████████████████████████████████| 6/6 [01:38<00:00, 16.38s/it]
Fetching 17 files: 100%|█████████████████████| 17/17 [00:00<00:00, 5674.74it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00, 6.62it/s]
Fetching 17 files: 100%|███████████████████████████████| 17/17 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 7/7 [00:01<00:00, 6.62it/s]
100%|██████████████████████████████████████████| 16/16 [03:50<00:00, 14.40s/it]
100%|████████████████████████████████████████████| 4/4 [01:02<00:00, 15.63s/it]
```
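Each pair of progress bars in this run follows the same arithmetic for its boundary value: about sep × 20 base-model steps plus the remaining refiner steps (2+18, 4+16, ..., 16+4). A small loop reproducing the whole series seen in the log:

```python
# Expected base/refiner step split for each boundary value tried by sd_034.py.
for i in range(8):
    sep = 0.1 + 0.1 * i
    base = round(20 * sep)
    print(f'sep = {sep:.1f}: base {base} steps, refiner {20 - base} steps')
```

Note that sd_034.py rebuilds both pipelines on every iteration and calls empty_cache() after each image; this keeps peak VRAM low at the cost of repeated model loading, which is visible in the long total run times in the table at the top.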
For the opposite direction, an image can be encoded back into latent space with the pipeline's VAE (note: diffusers' VAE conventionally expects input scaled to [-1, 1]; this snippet scales to [0, 1]):

```python
# Encode a PIL image (img1) back into a latent with the pipeline's VAE.
imgl2 = torch.HalfTensor(np.array(img1).transpose(2, 0, 1)[None, :] / 255).to(device)
imgl2 = pipe.vae.encode(imgl2).latent_dist.sample() * pipe.vae.config.scaling_factor
```
```python
## sd_035.py  Converting the latent space (latent)
## model: animePastelDream_softBakedVae.safetensors
import torch
from PIL import Image
from diffusers import StableDiffusionPipeline, EulerAncestralDiscreteScheduler, logging
from translate import Translator
import numpy as np
import matplotlib.pyplot as plt
import sd_tools as sdt

logging.set_verbosity_error()

# Path to the model file
model_path = "/StabilityMatrix/Data/Models/StableDiffusion/SD1.5/animePastelDream_softBakedVae.safetensors"
# "cuda" to use the GPU, otherwise "cpu"
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# seed value
seed = 12345678

# Create the pipeline
pipeline = StableDiffusionPipeline.from_single_file(
    model_path,
    torch_dtype = torch.float16
).to(device)
# Scheduler setup
pipeline.scheduler = EulerAncestralDiscreteScheduler.from_config(pipeline.scheduler.config)

# Create the Generator object
generator = torch.Generator(device).manual_seed(seed)

# Prompt (translated from Japanese to English at run time)
trans = Translator('en', 'ja').translate
prompt_jp = '庭で兎と遊んでいる女性'
prompt = trans(prompt_jp)

print(f'Seed: {seed} Model: {model_path}')
print(f'prompt : {prompt_jp} → {prompt}')

# Generate the image (as a latent)
img_latent = pipeline(
    prompt = prompt,
    num_inference_steps = 20,
    generator = generator,
    output_type = 'latent'
).images
print(f'latent.shape = {img_latent.shape}')    # torch.Size([1, 4, 64, 64])

# Visualize the latent itself as an image
imgl = np.float32(img_latent[0].cpu()).transpose(1, 2, 0)
plt.figure(figsize = [6, 6], dpi = 100)
plt.imshow((imgl - imgl.min()) / (imgl.max() - imgl.min()))
plt.tight_layout()
save_path = 'results/image_035a.png'
plt.savefig(save_path)
plt.close()
sdt.image_disp(save_path, save_path)

# Decode the latent into pixel space and save
img1 = pipeline.vae.decode(img_latent / pipeline.vae.config.scaling_factor)
img1 = img1.sample[0].detach().cpu().numpy().transpose(1, 2, 0)
img1 = Image.fromarray(np.uint8(np.clip(img1 * 0.5 + 0.5, 0, 1) * 255))
#img1.save("results/image_035.png")
save_path = 'results/image_035.png'
sdt.image_save2(img1, save_path, save_path)
```
```
(sd_test) PS > python sd_035.py
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:01<00:00, 3.12it/s]
Seed: 12345678 Model: /StabilityMatrix/Data/Models/StableDiffusion/SD1.5/animePastelDream_softBakedVae.safetensors
prompt : 庭で兎と遊んでいる女性 → Woman playing with a rabbit in the garden
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 15.95it/s]
latent.shape = torch.Size([1, 4, 64, 64])
```
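The printed torch.Size([1, 4, 64, 64]) follows from the SD1.5 VAE geometry: the encoder downsamples each spatial dimension by a factor of 8 and uses 4 latent channels, so a 512×512 image corresponds to a 4×64×64 latent. As a quick check:

```python
# SD1.5 latent geometry: 8x spatial downsampling, 4 latent channels.
height = width = 512
latent_shape = (1, 4, height // 8, width // 8)
print(latent_shape)   # (1, 4, 64, 64)
```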
```python
## sd_036.py  Upscaling the source image 4x (x4 upscaler)
## model: stabilityai/stable-diffusion-x4-upscaler
import torch
from PIL import Image
from diffusers import StableDiffusionUpscalePipeline, logging
import sd_tools as sdt

logging.set_verbosity_error()

# Model ID (downloaded from the Hugging Face Hub)
model_path = "stabilityai/stable-diffusion-x4-upscaler"
#image_path = "images/uptest_128x128.png"
image_path = "images/uptest_256x256.png"
# "cuda" to use the GPU, otherwise "cpu"
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# seed value
seed = 12345678

# Create the pipeline
pipeline = StableDiffusionUpscalePipeline.from_pretrained(
    model_path,
    torch_dtype = torch.float16,
).to(device)

# Prompt (empty: upscale without text guidance)
prompt = ''

# Load the source image
src_image = Image.open(image_path)

# Create the Generator object
generator = torch.Generator(device).manual_seed(seed)

print(f'Seed: {seed}, Model: {model_path}')
print(f'prompt : {prompt}')

# Generate the image
image = pipeline(
    prompt = prompt,
    image = src_image,
    num_inference_steps = 20,
    generator = generator
).images[0]

#image.save("results/image_036.png")
save_path = 'results/image_036.png'
sdt.image_save2(image, save_path, save_path)
```
```
(sd_test) PS > python sd_036.py
Loading pipeline components...: 100%|████████████| 6/6 [00:01<00:00, 3.95it/s]
Seed: 12345678, Model: stabilityai/stable-diffusion-x4-upscaler
prompt :
100%|██████████████████████████████████████████| 20/20 [00:04<00:00, 4.71it/s]
```
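The x4 upscaler multiplies each side by 4, so the 256×256 test image comes back as 1024×1024. A quick size check (to be run after sd_036.py has produced its output):

```python
# Verify the 4x size increase after running sd_036.py.
from PIL import Image

src = Image.open('images/uptest_256x256.png')
out = Image.open('results/image_036.png')
print(src.size, '->', out.size)   # expected: (256, 256) -> (1024, 1024)
```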
```python
## sd_037.py  2x upscaling in latent space (x2 latent upscaler)
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionLatentUpscalePipeline, logging
from translate import Translator
import sd_tools as sdt

logging.set_verbosity_error()

# Path to the model file
model_path = "/StabilityMatrix/Data/Models/StableDiffusion/SD1.5/v1-5-pruned-emaonly.safetensors"
# "cuda" to use the GPU, otherwise "cpu"
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# seed value
seed = 12345678

# Create the pipeline
pipeline = StableDiffusionPipeline.from_single_file(model_path).to(device)

# Second pipeline: the latent upscaler
pipeline_x2 = StableDiffusionLatentUpscalePipeline.from_pretrained(
    'stabilityai/sd-x2-latent-upscaler',
    torch_dtype = torch.float16,
).to(device)

# Prompt (translated from Japanese to English at run time)
trans = Translator('en', 'ja').translate
prompt_jp = '満開の蘭'
prompt = trans(prompt_jp)

# Create the Generator object
generator = torch.Generator(device).manual_seed(seed)

print(f'Seed: {seed}, Model: {model_path}')
print(f'prompt : {prompt_jp} → {prompt}')

# Generate the image (as a latent)
img0 = pipeline(
    prompt = prompt,
    num_inference_steps = 20,
    generator = generator,
    output_type = 'latent'
).images

# Upscale the latent 2x and decode
image = pipeline_x2(
    '',
    img0,
    num_inference_steps = 20,
).images[0]

#image.save("results/image_037.png")
save_path = 'results/image_037.png'
sdt.image_save2(image, save_path, save_path)

# Also save the intermediate (pre-upscale) image
from PIL import Image
import numpy as np

img1 = pipeline.vae.decode(img0 / pipeline.vae.config.scaling_factor)
img1 = img1.sample[0].detach().cpu().numpy().transpose(1, 2, 0)
img1 = np.uint8(np.clip(img1 * 0.5 + 0.5, 0, 1) * 255)
#Image.fromarray(img1).save('results/image_037_512.png')
img = Image.fromarray(img1)
save_path = 'results/image_037_512.png'
sdt.image_save2(img, save_path, save_path)
```
```
(sd_test) PS > python sd_037.py
Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 8.84it/s]
Loading pipeline components...: 100%|████████████| 5/5 [00:01<00:00, 4.24it/s]
Seed: 12345678, Model: /StabilityMatrix/Data/Models/StableDiffusion/SD1.5/v1-5-pruned-emaonly.safetensors
prompt : 満開の蘭 → Orchid in full bloom
100%|██████████████████████████████████████████| 20/20 [00:02<00:00, 7.19it/s]
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 12.08it/s]
```
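Unlike sd_036.py, which upscales in pixel space, sd-x2-latent-upscaler doubles the latent's spatial resolution before decoding: the 1×4×64×64 latent from the first pipeline becomes 1×4×128×128, so the decoded result is 1024×1024 instead of 512×512. The script also decodes the pre-upscale latent to results/image_037_512.png so the two stages can be compared side by side. Size bookkeeping:

```python
# Size bookkeeping for the two-stage x2 latent upscale.
latent = (1, 4, 64, 64)       # SD1.5 output latent (decodes to 512x512)
upscaled = (1, 4, 128, 128)   # after sd-x2-latent-upscaler
print(latent[3] * 8, '->', upscaled[3] * 8)   # 512 -> 1024 pixels per side
```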
```python
## sd_038.py  Modifying only a specific region (inpaint)
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline, logging
from translate import Translator
import sd_tools as sdt

logging.set_verbosity_error()

# Model ID and image paths
model_path = 'runwayml/stable-diffusion-inpainting'   # model
image_path = 'images/sd_038_test.png'                 # source image
mask_path = 'images/sd_038_test_mask.png'             # mask image
# "cuda" to use the GPU, otherwise "cpu"
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# seed value
seed = 12345678

# Create the pipeline
pipeline = StableDiffusionInpaintPipeline.from_pretrained(
    model_path,
    torch_dtype = torch.float16,
    variant = 'fp16'
).to(device)

# Prompt (translated from Japanese to English at run time)
trans = Translator('en', 'ja').translate
prompt_jp = 'こっちを見て微笑んでいる女の子'
prompt = trans(prompt_jp)

# Create the Generator object
generator = torch.Generator(device).manual_seed(seed)

img0 = Image.open(image_path)
img_mask = Image.open(mask_path)

print(f'Seed: {seed}')
print(f'prompt : {prompt_jp} → {prompt}')
print(f'Model : {model_path}')
print(f'source : {image_path}')
print(f'mask : {mask_path}')

# Generate the image
image = pipeline(
    prompt = prompt,
    image = img0,
    mask_image = img_mask,
    num_inference_steps = 20,
    generator = generator,
).images[0]

#image.save("results/image_038.png")
save_path = 'results/image_038.png'
sdt.image_save2(image, save_path, save_path)
```
```
(sd_test) PS > python sd_038.py
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00, 19.11it/s]
Seed: 12345678
prompt : こっちを見て微笑んでいる女の子 → A girl smiling at me
Model : runwayml/stable-diffusion-inpainting
source : images/sd_038_test.png
mask : images/sd_038_test_mask.png
100%|██████████████████████████████████████████| 20/20 [00:01<00:00, 15.12it/s]
```
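The mask follows the usual diffusers convention: white pixels mark the region to regenerate and black pixels are kept from the source image. If you want to build a mask in code rather than in an image editor, a minimal sketch (the rectangle coordinates and output file name are placeholders, not the mask actually used above):

```python
# Build a rectangular inpainting mask: white = repaint, black = keep.
from PIL import Image, ImageDraw

mask = Image.new('L', (512, 512), 0)            # all black: keep everything
draw = ImageDraw.Draw(mask)
draw.rectangle([128, 64, 384, 320], fill=255)   # white box: region to repaint
mask.save('images/my_mask.png')                 # hypothetical file name
```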