StyleGAN

Image and Video Editing with StyleGAN3: StyleGAN3

　Edit images and videos with "StyleGAN3".

Image and Video Editing with StyleGAN3: StyleGAN3
- Verifying the site "Image and Video Editing with StyleGAN3"
- Update history
References

* Last updated: 2023/11/17

Verifying the site "Image and Video Editing with StyleGAN3"

Overview

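This page follows the steps on the site above to verify "image and video editing" that combines StyleGAN3 with earlier techniques.
StyleGAN3 runs in the current Google Colaboratory environment.

(For StarGAN/StarGAN2, the environment could not be built on the current Google Colaboratory because of problems such as the TensorFlow 1.x series and the CUDA version.)

Official site → Third Time's the Charm? Image and Video Editing with StyleGAN3 (AIM Workshop ECCV 2022)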

Creating the runtime environment on Google Colaboratory

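Open the demo site by the author of the site above and press the "Open in Colab" ① button
The "stylegan_edit" Google Colab notebook opens; select "Save a copy in Drive" from the "File" menu
Do all subsequent operations in the Google Colab page that opens with the title "Copy of stylegan_edit"
Download the data file and extract it (use the extracted "update/work/gan3/")
　update_20231117.zip (18.3MB) <update data>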
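
The extraction step can also be done from Python. The following is a minimal sketch using only the standard library; the helper name `extract_update` and the destination folder are assumptions for illustration, and only the archive name `update_20231117.zip` comes from this page.

```python
import zipfile
from pathlib import Path


def extract_update(zip_path, dest_dir="."):
    """Extract every member of the archive into dest_dir.

    Returns the list of member names so the caller can confirm that
    the expected "update/work/gan3/" entries are present.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
        return zf.namelist()


# Example call (assumes the archive was downloaded to the current folder):
# names = extract_update("update_20231117.zip")
# print([n for n in names if n.startswith("update/work/gan3/")])
```

Returning the member names makes it easy to check that the archive really contains the folder this page tells you to use.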

Environment setup

Run the following cell ① (takes about 2 minutes)

#@title Setup
import os
from pathlib import Path

os.chdir('/content')
CODE_DIR = 'stylegan3-editing'

# Clone the code from GitHub
!git clone https://github.com/cedro3/stylegan3-editing.git $CODE_DIR

# Install ninja
!wget https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip
!sudo unzip ninja-linux.zip -d /usr/local/bin/
!sudo update-alternatives --install /usr/bin/ninja ninja /usr/local/bin/ninja 1 --force

# Install pyrallis and CLIP
!pip install pyrallis
!pip install git+https://github.com/openai/CLIP.git
os.chdir(f'./{CODE_DIR}')


# Import libraries
import time
import sys
import pprint
import numpy as np
from PIL import Image
import dataclasses
import torch
import torchvision.transforms as transforms

sys.path.append(".")
sys.path.append("..")

from editing.interfacegan.face_editor import FaceEditor
from editing.styleclip_global_directions import edit as styleclip_edit
from models.stylegan3.model import GeneratorType
from notebooks.notebook_utils import Downloader, ENCODER_PATHS, INTERFACEGAN_PATHS, STYLECLIP_PATHS
from notebooks.notebook_utils import run_alignment, crop_image, compute_transforms
from utils.common import tensor2im
from utils.inference_utils import run_on_batch, load_encoder, get_average_image
from function import *

%load_ext autoreload
%autoreload 2


# Downloader for the pretrained parameters
downloader = Downloader(code_dir=CODE_DIR,
                        use_pydrive=False,
                        subdir="pretrained_models")

Log (Google Colab, Tesla T4)

Cloning into 'stylegan3-editing'...
remote: Enumerating objects: 326, done.
remote: Counting objects: 100% (55/55), done.
remote: Compressing objects: 100% (41/41), done.
remote: Total 326 (delta 16), reused 14 (delta 14), pack-reused 271
Receiving objects: 100% (326/326), 78.16 MiB | 40.75 MiB/s, done.
Resolving deltas: 100% (59/59), done.
--2023-11-10 02:01:57--  https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip
Resolving github.com (github.com)... 20.29.134.23
Connecting to github.com (github.com)|20.29.134.23|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/1335132/d2f252e2-9801-11e7-9fbf-bc7b4e4b5c83?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20231110%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20231110T020157Z&X-Amz-Expires=300&X-Amz-Signature=84361608ae26e7851992a50022789a396d8d1d4ec31f82af264716c24cc91bbd&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=1335132&response-content-disposition=attachment%3B%20filename%3Dninja-linux.zip&response-content-type=application%2Foctet-stream [following]
--2023-11-10 02:01:57--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/1335132/d2f252e2-9801-11e7-9fbf-bc7b4e4b5c83?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20231110%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20231110T020157Z&X-Amz-Expires=300&X-Amz-Signature=84361608ae26e7851992a50022789a396d8d1d4ec31f82af264716c24cc91bbd&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=1335132&response-content-disposition=attachment%3B%20filename%3Dninja-linux.zip&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 77854 (76K) [application/octet-stream]
Saving to: ‘ninja-linux.zip’

ninja-linux.zip     100%[===================>]  76.03K  --.-KB/s    in 0.02s   

2023-11-10 02:01:57 (4.23 MB/s) - ‘ninja-linux.zip’ saved [77854/77854]

Archive:  ninja-linux.zip
  inflating: /usr/local/bin/ninja    
update-alternatives: using /usr/local/bin/ninja to provide /usr/bin/ninja (ninja) in auto mode
Collecting pyrallis
  Downloading pyrallis-0.3.1-py3-none-any.whl (33 kB)
Collecting typing-inspect (from pyrallis)
  Downloading typing_inspect-0.9.0-py3-none-any.whl (8.8 kB)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from pyrallis) (6.0.1)
Collecting mypy-extensions>=0.3.0 (from typing-inspect->pyrallis)
  Downloading mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)
Requirement already satisfied: typing-extensions>=3.7.4 in /usr/local/lib/python3.10/dist-packages (from typing-inspect->pyrallis) (4.5.0)
Installing collected packages: mypy-extensions, typing-inspect, pyrallis
Successfully installed mypy-extensions-1.0.0 pyrallis-0.3.1 typing-inspect-0.9.0
Collecting git+https://github.com/openai/CLIP.git
  Cloning https://github.com/openai/CLIP.git to /tmp/pip-req-build-_25dcuj5
  Running command git clone --filter=blob:none --quiet https://github.com/openai/CLIP.git /tmp/pip-req-build-_25dcuj5
  Resolved https://github.com/openai/CLIP.git to commit a1d071733d7111c9c014f024669f959182114e33
  Preparing metadata (setup.py) ... done
Collecting ftfy (from clip==1.0)
  Downloading ftfy-6.1.1-py3-none-any.whl (53 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.1/53.1 kB 1.2 MB/s eta 0:00:00
Requirement already satisfied: regex in /usr/local/lib/python3.10/dist-packages (from clip==1.0) (2023.6.3)
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from clip==1.0) (4.66.1)
Requirement already satisfied: torch in /usr/local/lib/python3.10/dist-packages (from clip==1.0) (2.1.0+cu118)
Requirement already satisfied: torchvision in /usr/local/lib/python3.10/dist-packages (from clip==1.0) (0.16.0+cu118)
Requirement already satisfied: wcwidth>=0.2.5 in /usr/local/lib/python3.10/dist-packages (from ftfy->clip==1.0) (0.2.9)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch->clip==1.0) (3.13.1)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch->clip==1.0) (4.5.0)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch->clip==1.0) (1.12)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch->clip==1.0) (3.2.1)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch->clip==1.0) (3.1.2)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch->clip==1.0) (2023.6.0)
Requirement already satisfied: triton==2.1.0 in /usr/local/lib/python3.10/dist-packages (from torch->clip==1.0) (2.1.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from torchvision->clip==1.0) (1.23.5)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from torchvision->clip==1.0) (2.31.0)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /usr/local/lib/python3.10/dist-packages (from torchvision->clip==1.0) (9.4.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch->clip==1.0) (2.1.3)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->torchvision->clip==1.0) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->torchvision->clip==1.0) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->torchvision->clip==1.0) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->torchvision->clip==1.0) (2023.7.22)
Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch->clip==1.0) (1.3.0)
Building wheels for collected packages: clip
  Building wheel for clip (setup.py) ... done
  Created wheel for clip: filename=clip-1.0-py3-none-any.whl size=1369500 sha256=48f0b822a8bcadf88a160552b5b10e82641ffd535d90cf0d28c334a076dc5a54
  Stored in directory: /tmp/pip-ephem-wheel-cache-yj1p2lro/wheels/da/2b/4c/d6691fa9597aac8bb85d2ac13b112deb897d5b50f5ad9a37e4
Successfully built clip
Installing collected packages: ftfy, clip
Successfully installed clip-1.0 ftfy-6.1.1

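After the cell finishes ②, press the "Files" button in the left sidebar
Confirm that there is a "pic" ④ folder under "edit" under "stylegan3-editing" ③
Add the face images you prepared to the "pic" ④ folder (drag and drop from your local machine works)
Image files must be in ".jpg" format only
Likewise, add videos to the "video" ⑤ folder ("01.mp4" and "02.mp4" are already in place)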
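
Because only ".jpg" files are accepted, it can help to check the folder before running the later cells. The sketch below is an assumption-laden helper (the name `split_by_extension` is hypothetical, and the strict lowercase ".jpg" match is a conservative reading of the rule above); it only lists files and changes nothing.

```python
from pathlib import Path


def split_by_extension(folder, allowed=(".jpg",)):
    """Return (accepted, rejected) file names in `folder`, sorted by name.

    A file is accepted only if its suffix is exactly one of `allowed`.
    Sub-folders are ignored.
    """
    accepted, rejected = [], []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        if path.suffix in allowed:
            accepted.append(path.name)
        else:
            rejected.append(path.name)
    return accepted, rejected


# Example call inside the notebook (path taken from this page):
# ok, bad = split_by_extension("edit/pic")
# print("usable:", ok)
# print("convert or remove:", bad)
```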

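Preparation

Specify the encoder that obtains latent variables from an image (e4e or pSp can be selected)

Initial setup
・Run the following cell (takes about 3 minutes)

#@title Initial setup

# Select the encoder type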
experiment_type = 'restyle_pSp_ffhq' #@param ['restyle_e4e_ffhq', 'restyle_pSp_ffhq']

EXPERIMENT_DATA_ARGS = {
    "restyle_pSp_ffhq": {
        "model_path": "./pretrained_models/restyle_pSp_ffhq.pt",
        "image_path": "./notebooks/images/face_image.jpg",
        "transform": transforms.Compose([
            transforms.Resize((256, 256)),
            transforms.ToTensor(),
            transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])
    },
    "restyle_e4e_ffhq": {
        "model_path": "./pretrained_models/restyle_e4e_ffhq.pt",
        "image_path": "./notebooks/images/face_image.jpg",
        "transform": transforms.Compose([
            transforms.Resize((256, 256)),
            transforms.ToTensor(),
            transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])
    }
}

EXPERIMENT_ARGS = EXPERIMENT_DATA_ARGS[experiment_type]


# Download the encoder
if not os.path.exists(EXPERIMENT_ARGS['model_path']) or os.path.getsize(EXPERIMENT_ARGS['model_path']) < 1000000:
    print(f'Downloading ReStyle encoder model: {experiment_type}...')
    try:
      downloader.download_file(file_id=ENCODER_PATHS[experiment_type]['id'],
                              file_name=ENCODER_PATHS[experiment_type]['name'])
    except Exception as e:
      raise ValueError(f"Unable to download model correctly! {e}")
    # if google drive receives too many requests, we'll reach the quota limit and be unable to download the model
    if os.path.getsize(EXPERIMENT_ARGS['model_path']) < 1000000:
        raise ValueError("Pretrained model was unable to be downloaded correctly!")
    else:
        print('Done.')
else:
    print(f'Model for {experiment_type} already exists!')


# Load the encoder
model_path = EXPERIMENT_ARGS['model_path']
net, opts = load_encoder(checkpoint_path=model_path)
avg_image = get_average_image(net)


# --- Download the editing parameters ---
download_with_pydrive = False

# download files for interfacegan
downloader = Downloader(code_dir=CODE_DIR,
                        use_pydrive=download_with_pydrive,
                        subdir="editing/interfacegan/boundaries/ffhq")
print("Downloading InterFaceGAN boundaries...")
for editing_file, params in INTERFACEGAN_PATHS.items():
    print(f"Downloading {editing_file} boundary...")
    downloader.download_file(file_id=params['id'],
                             file_name=params['name'])

# download files for styleclip
downloader = Downloader(code_dir=CODE_DIR,
                        use_pydrive=download_with_pydrive,
                        subdir="editing/styleclip_global_directions/sg3-r-ffhq-1024")
print("Downloading StyleCLIP auxiliary files...")
for editing_file, params in STYLECLIP_PATHS.items():
    print(f"Downloading {editing_file}...")
    downloader.download_file(file_id=params['id'],
                             file_name=params['name'])

editor = FaceEditor(stylegan_generator=net.decoder, generator_type=GeneratorType.ALIGNED)

Log (Google Colab, Tesla T4)

Downloading ReStyle encoder model: restyle_pSp_ffhq...
Done.
Loading ReStyle pSp from checkpoint: ./pretrained_models/restyle_pSp_ffhq.pt
Loading StyleGAN3 generator from path: None
Done!
Model successfully loaded!
Setting up PyTorch plugin "filtered_lrelu_plugin"... Done.
Downloading InterFaceGAN boundaries...
Downloading age boundary...
Downloading smile boundary...
Downloading pose boundary...
Downloading Male boundary...
Downloading StyleCLIP auxiliary files...
Downloading delta_i_c...
Downloading s_stats...

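Creating align & crop images (1024x1024 pixels)
・Apply align processing (face fixed) and crop processing (background fixed) to the sample images in the edit/pic folder
・Run the following cell (takes about 1 minute)

#@title Create align & crop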

import os
import glob
from tqdm import tqdm

reset_folder('edit/align')
reset_folder('edit/crop')

files = sorted(os.listdir('edit/pic'))
for i, file in enumerate(tqdm(files)):
    input_image = run_alignment('edit/pic/'+file)
    cropped_image = crop_image('edit/pic/'+file)
    name = os.path.splitext(file)[0]
    input_image.save('edit/align/'+name+'.jpg')
    cropped_image.save('edit/crop/'+name+'.jpg')

print('=== pic ===')
display_pic('edit/pic')
print('=== align ===')
display_pic('edit/align')
print('=== crop ===')
display_pic('edit/crop')

Log (Google Colab, Tesla T4)

100%|██████████| 6/6 [00:42<00:00,  7.04s/it]

・Column 1 is the input image, column 2 the align-processed image (face fixed), and column 3 the crop-processed image (background fixed)
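
The cell above calls `reset_folder`, which is imported from the repository's `function.py` and is not shown on this page. Judging only from how it is used (it is called before files are written into a folder), it presumably empties and recreates that folder. The following is an assumed equivalent for reference, deliberately given a different name so that it does not shadow the real helper.

```python
import os
import shutil


def reset_folder_sketch(path):
    """Delete `path` if it exists, then create it again as an empty folder.

    Assumed behaviour only: the real reset_folder in function.py may differ.
    """
    if os.path.isdir(path):
        shutil.rmtree(path)
    os.makedirs(path, exist_ok=True)
```

If the real helper behaves this way, re-running a cell discards earlier outputs in that folder, so copy anything you want to keep before executing it again.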

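Creating invert images
・Obtain latent variables from the face-fixed (aligned) images and, leaving the latent variables unedited, generate background-fixed images
・Add the following 2 lines at line 7 of the cell code (creates the additional output folders)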

reset_folder('edit/infer_gan')  #
reset_folder('edit/infer_clip') #

・Run the modified cell below (takes about 1 second)

#@title Create invert

from tqdm import tqdm

reset_folder('edit/invert')
reset_folder('edit/latents')
reset_folder('edit/infer_gan')  #
reset_folder('edit/infer_clip') #

files = sorted(os.listdir('edit/align'))
for file in tqdm(files):
  input_image = Image.open('edit/align/'+file)
  aligned_path = 'edit/align/'+file
  cropped_path = 'edit/crop/'+file

  landmarks_transform = compute_transforms(aligned_path=aligned_path, cropped_path=cropped_path)

  opts.n_iters_per_batch = 3
  opts.resize_outputs = False  # generate outputs at full resolution

  img_transforms = EXPERIMENT_ARGS['transform']
  transformed_image = img_transforms(input_image)

  with torch.no_grad():
      tic = time.time()
      result_batch, result_latents = run_on_batch(inputs=transformed_image.unsqueeze(0).cuda().float(),
                                                net=net,
                                                opts=opts,
                                                avg_image=avg_image,
                                                landmarks_transform=torch.from_numpy(landmarks_transform).cuda().float())
      toc = time.time()
      #print('Inference took {:.4f} seconds.'.format(toc - tic))

  result_tensors = result_batch[0]
  final_rec = tensor2im(result_tensors[-1])#.resize(resize_amount)
  final_rec.save('edit/invert/'+file)

  name = os.path.splitext(file)[0]
  np.save('edit/latents/'+name, result_latents[0][-1])

print('=== crop ===')
display_pic('edit/crop')
print('=== invert ===')
display_pic('edit/invert')

Log (Google Colab, Tesla T4)

100%|██████████| 6/6 [00:35<00:00,  5.97s/it]

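・Column 1 is the background-fixed image cropped from the real photo; column 2 is the background-fixed image generated from the latent variables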

Image editing

Editing with InterFaceGAN
・Parameters for editing
　○ invert: target image (.jpg format)
　○ edit_direction: editing direction (age, smile, pose, Male)
　○ min_value: lower bound of the edit factor
　○ max_value: upper bound of the edit factor

・Example settings, "001.jpg"
　○ invert: 001.jpg
　○ edit_direction: age
　○ min_value: -5
　○ max_value: 5
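
As background for these parameters: InterFaceGAN-style editing moves a latent code along a learned boundary direction, scaled by a factor. The sketch below shows that arithmetic on plain Python lists. It is an illustration of the general idea only, not the code of `FaceEditor`; the function names and the toy vectors are assumptions.

```python
def edit_latent(latent, boundary, factor):
    """Return latent + factor * boundary, element by element."""
    return [w + factor * n for w, n in zip(latent, boundary)]


def edit_over_factors(latent, boundary, factors):
    """Apply the same direction with several factors, one result per factor."""
    return [edit_latent(latent, boundary, f) for f in factors]


# Toy example: a 3-dimensional "latent" and an "age" direction.
latent = [0.5, -1.0, 2.0]
age_direction = [1.0, 0.0, -0.5]
for factor, edited in zip((-5, 0, 5), edit_over_factors(latent, age_direction, (-5, 0, 5))):
    print(factor, edited)
```

A factor of 0 leaves the latent unchanged, and negative and positive factors push the attribute in opposite directions, which is why the example settings span -5 to 5.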

・Add the following 2 lines at line 9 and at the second-to-last line of the cell code (saves the result image)

infer_path = 'edit/infer_gan/'+invert   #
res.save(infer_path)  #

・Run the modified cell below (takes about 8 seconds)

invert = '01.jpg'#@param {type:"string"}
name = os.path.splitext(invert)[0]+'.npy'
result_latents_ = np.load('edit/latents/'+name)

aligned_path = 'edit/align/'+invert
cropped_path = 'edit/crop/'+invert
infer_path = 'edit/infer_gan/'+invert   #
landmarks_transform = compute_transforms(aligned_path=aligned_path, cropped_path=cropped_path)

edit_direction = 'age' #@param ['age', 'smile', 'pose', 'Male']
min_value = -5 #@param {type:"slider", min:-10, max:10, step:1}
max_value = 5 #@param {type:"slider", min:-10, max:10, step:1}


#@title Perform Edit! { display-mode: "form" }
print(f"Performing edit for {edit_direction}...")
#input_latent = torch.from_numpy(result_latents[0][-1]).unsqueeze(0).cuda()
input_latent = torch.from_numpy(result_latents_).unsqueeze(0).cuda()
edit_images, edit_latents = editor.edit(latents=input_latent,
                                        direction=edit_direction,
                                        factor_range=(min_value, max_value),
                                        user_transforms=landmarks_transform,
                                        apply_user_transformations=True)
print("Done!")


#@title Show Result { display-mode: "form" }
def prepare_edited_result(edit_images):
  if isinstance(edit_images[0], list):
      edit_images = [image[0] for image in edit_images]
  res = np.array(edit_images[0].resize((512, 512)))
  for image in edit_images[1:]:
      res = np.concatenate([res, image.resize((512, 512))], axis=1)
  res = Image.fromarray(res).convert("RGB")
  return res

res = prepare_edited_result(edit_images)
res.save(infer_path)  #
res

Log (Google Colab, Tesla T4)

Performing edit for age...
Done!
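
The cells on this page derive every path from the single `invert` file name: the aligned and cropped images share that name, and the latent file uses the same stem with ".npy". The helper below gathers that convention in one place; the function name `edit_paths` is hypothetical, and the folder names are the ones used in the cells.

```python
import os


def edit_paths(invert, root="edit"):
    """Map an image file name such as '01.jpg' to the paths used by the cells."""
    stem = os.path.splitext(invert)[0]
    return {
        "aligned": f"{root}/align/{invert}",
        "cropped": f"{root}/crop/{invert}",
        "inverted": f"{root}/invert/{invert}",
        "latent": f"{root}/latents/{stem}.npy",
        "infer_gan": f"{root}/infer_gan/{invert}",
        "infer_clip": f"{root}/infer_clip/{invert}",
    }


paths = edit_paths("01.jpg")
print(paths["latent"])
```

Keeping the mapping in one function makes it harder to end up with a latent file that does not match the aligned image when you switch to another picture.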

Editing with StyleCLIP
・Parameters for editing
　○ neutral_text: neutral (reference) text, in English
　○ target_text: target text, in English
　○ alpha: edit factor
　○ beta: edit factor

・Example settings, "001.jpg"
　○ neutral_text: a face
　○ target_text: a smiling face
　○ alpha: 4
　○ beta: 0.13
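
This page describes alpha and beta only as edit factors. In the usual description of StyleCLIP global directions, beta acts as a threshold that zeroes out style channels with low relevance to the text change, and alpha scales how far the remaining channels are moved. The sketch below illustrates that reading with plain lists; it is an assumption about the method in general, not code from `styleclip_edit`, and all names and numbers in it are made up.

```python
def global_direction(relevance, beta):
    """Keep channels whose |relevance| >= beta, then scale so the peak is 1."""
    kept = [r if abs(r) >= beta else 0.0 for r in relevance]
    peak = max(abs(r) for r in kept)
    if peak == 0:
        return kept  # nothing passed the threshold
    return [r / peak for r in kept]


def apply_edit(style, direction, alpha):
    """Move the style code along the direction with strength alpha."""
    return [s + alpha * d for s, d in zip(style, direction)]


# Toy example with three style channels.
direction = global_direction([0.2, -0.05, 0.1], beta=0.13)
print(direction)
print(apply_edit([1.0, 1.0, 1.0], direction, alpha=4))
```

Under this reading, a larger beta restricts the edit to fewer channels (a more disentangled but weaker change), while a larger alpha makes the change stronger.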

・Add the following 2 lines at the 9th-from-last line and the 2nd-from-last line of the cell code (saves the result image)

infer_clip_path = 'edit/infer_clip/'+invert  #
edit_coupled.save(infer_clip_path)  #

・Run the modified cell below (takes about 14 seconds)

#@title Editing with StyleCLIP

styleclip_args = styleclip_edit.EditConfig()
global_direction_calculator = styleclip_edit.load_direction_calculator(stylegan_model=net.decoder, opts=styleclip_args)

neutral_text = "a face" #@param {type:"raw"}
target_text = "a smiling face" #@param {type:"raw"}
alpha = 4 #@param {type:"slider", min:-5, max:5, step:0.5}
beta = 0.13 #@param {type:"slider", min:-1, max:1, step:0.1}


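# Settings (note: this rebinds `opts`, which until now held the encoder options returned by load_encoder)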
opts = styleclip_edit.EditConfig()
opts.alpha_min = alpha
opts.alpha_max = alpha
opts.num_alphas = 1
opts.beta_min = beta
opts.beta_max = beta
opts.num_betas = 1
opts.neutral_text = neutral_text
opts.target_text = target_text

# Inference
input_latent = result_latents_
input_transforms = torch.from_numpy(landmarks_transform).cpu().numpy()
print(f'Performing edit for: "{opts.target_text}"...')
edit_res, edit_latent = styleclip_edit.edit_image(latent=input_latent,
                                                  landmarks_transform=input_transforms,
                                                  stylegan_model=net.decoder,
                                                  global_direction_calculator=global_direction_calculator,
                                                  opts=opts,
                                                  image_name=None,
                                                  save=False)
print("Done!")

input_image = Image.open('edit/invert/'+invert) ###
transformed_image = img_transforms(input_image) ###
infer_clip_path = 'edit/infer_clip/'+invert  #

# Display
input_im = tensor2im(transformed_image).resize((512, 512))
edited_im = tensor2im(edit_res[0]).resize((512, 512))
edit_coupled = np.concatenate([np.array(input_im), np.array(edited_im)], axis=1)
edit_coupled = Image.fromarray(edit_coupled)
edit_coupled.save(infer_clip_path)  #
edit_coupled.resize((1024, 512))

Log (Google Colab, Tesla T4)

100%|███████████████████████████████████████| 338M/338M [00:08<00:00, 42.6MiB/s]
Performing edit for: "a smiling face"...
Done!

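Video editing
・Command arguments
　○ --video_path: the input video
　○ --checkpoint_path: the encoder checkpoint (parameters)
　○ --output_path: the output folder
・On the free tier of Google Colab, long videos abort partway through for lack of memory
　○ 01.mp4 (2 seconds) → completed normally
　○ 02.mp4 (7 seconds) → aborted, out of memory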

・Fix the folder paths in the cell code

--video_path edit/video/01.mp4 \
--output_path out_01
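
When several videos are processed, the three arguments can be built from the video file name instead of being edited by hand. The following sketch assembles the argument list for `inference_on_video.py`; the helper name `build_video_command` and the "out_<name>" naming rule are assumptions modelled on the "01.mp4" and "out_01" pair used on this page.

```python
def build_video_command(video_name,
                        checkpoint="pretrained_models/restyle_pSp_ffhq.pt"):
    """Return the command as a list of arguments for one video in edit/video/."""
    stem = video_name.rsplit(".", 1)[0]
    return [
        "python", "inversion/video/inference_on_video.py",
        "--video_path", f"edit/video/{video_name}",
        "--checkpoint_path", checkpoint,
        "--output_path", f"out_{stem}",
    ]


cmd = build_video_command("01.mp4")
print(" ".join(cmd))
# To actually run it from Python instead of a "!" line:
# import subprocess
# subprocess.run(cmd, check=True)
```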

・Run the modified cell below (01.mp4 takes about 20 minutes)

# Video editing (requires Colab Pro with high RAM)

# shape_predictor copy
import shutil
shutil.copy('shape_predictor_68_face_landmarks.dat', 'pretrained_models/shape_predictor_68_face_landmarks.dat')

! python inversion/video/inference_on_video.py \
--video_path edit/video/01.mp4 \
--checkpoint_path pretrained_models/restyle_pSp_ffhq.pt \
--output_path out_01

Log (Google Colab, Tesla T4)

Parsing video!
100% 89/89 [00:02<00:00, 33.57it/s]
Saving aligned video frames...
100% 88/88 [01:03<00:00,  1.39it/s]
Saving cropped video frames...
100% 88/88 [00:05<00:00, 17.14it/s]
Loading ReStyle pSp from checkpoint: pretrained_models/restyle_pSp_ffhq.pt
Loading StyleGAN3 generator from path: None
Done!
Model successfully loaded!
Computing landmarks transforms...
100% 88/88 [03:40<00:00,  2.50s/it]
Setting up PyTorch plugin "filtered_lrelu_plugin"... Done.
88it [02:09,  1.47s/it]
/content/stylegan3-editing/inversion/video/inference_on_video.py:69: FutureWarning: The input object of type 'Tensor' is an array-like implementing one of the corresponding protocols (`__array__`, `__array_interface__` or `__array_struct__`); but not a sequence (or 0-D). In the future, this object will be coerced as if it was first converted using `np.array(obj)`. To retain the old behaviour, you have to either modify the type 'Tensor', or assign to an empty array created with `np.empty(correct_shape, dtype=object)`.
  landmarks_transforms = np.array(list(results["landmarks_transforms"]))
/content/stylegan3-editing/inversion/video/inference_on_video.py:69: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  landmarks_transforms = np.array(list(results["landmarks_transforms"]))
Generating smoothed frames...
84it [00:16,  5.21it/s]
Generating video for age edit...
88it [00:34,  2.53it/s]
Generating smoothed edited frames...
84it [00:17,  4.92it/s]
Generating smoothed edited frames...
84it [00:15,  5.34it/s]
Generating video for a  smiling face edit...
88it [00:27,  3.17it/s]
Generating smoothed edited frames...
84it [00:16,  4.96it/s]
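
In the log above, 88 frames go in and 84 "smoothed" frames come out. That count is what a sliding window of 5 frames without padding would produce (88 - 5 + 1 = 84). Whether the script really smooths this way is not confirmed by this page, so the following is only a sketch of that assumed scheme, applied to plain numbers rather than latent vectors.

```python
def smooth(values, window=5):
    """Moving average without padding: returns len(values) - window + 1 items."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


frames = list(range(88))   # stand-in for 88 per-frame values
smoothed = smooth(frames, window=5)
print(len(smoothed))
```

Averaging neighbouring frames reduces frame-to-frame jitter in the generated video, at the cost of losing a few frames at the ends.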

・Play the video
・Fix the folder path in the cell code

video_path = 'out_01/edited_video_age_start_coupled.mp4'

・Run the modified cell below

# Play the video
video_path = 'out_01/edited_video_age_start_coupled.mp4'
display_mp4(video_path)

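・The resulting video is saved in the "out_01/" folder as "edited_video_age_start_coupled.mp4"
・From left: the original video, the video generated from unedited latent variables, and the video generated after editing the latent variables to look younger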

Editing other images

Example
・InterFaceGAN (invert: okegawa_2.jpg, edit_direction: age, min_value: -5, max_value: 5)

・StyleCLIP (neutral_text: a face, target_text: a smiling face, alpha: 4, beta: 0.13)
Example
・InterFaceGAN (invert: yaoi_3m.jpg, edit_direction: age, min_value: -5, max_value: 5)

・StyleCLIP (neutral_text: a face, target_text: a smiling face, alpha: 4, beta: 0.13)
Example
・InterFaceGAN (invert: nitta_1m.jpg, edit_direction: age, min_value: -5, max_value: 5)

・StyleCLIP (neutral_text: a face, target_text: a smiling face, alpha: 4, beta: 0.13)
Example
・InterFaceGAN (invert: izutsu_1m.jpg, edit_direction: age, min_value: -5, max_value: 5)

・StyleCLIP (neutral_text: a face, target_text: a smiling face, alpha: 4, beta: 0.13)
Example
・InterFaceGAN (invert: kenta_2m.jpg, edit_direction: age, min_value: -5, max_value: 5)

・StyleCLIP (neutral_text: a face, target_text: a smiling face, alpha: 4, beta: 0.13)
Other results
・InterFaceGAN: change of the face with age

・StyleCLIP: generating a smile

・Video editing

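Ending the session and running again after reconnecting

When you finish editing, select "Disconnect and delete runtime" from the Colab "Runtime" menu
・To keep GPU occupancy time low, it is advisable to disconnect once all work is finished
・Even after disconnecting and deleting the runtime, the execution results in the notebook remain
After reconnecting, repeat the same steps starting from "Environment setup" above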
