Problems running kohya #741

organfreeman36 · 2023-05-07T17:45:01Z

Hi, I'm trying to run Kohya to train a LoRA model, with no success. I'm running Windows 10 and have a GTX 1070. I've tried reinstalling several times and checked that all the dependencies are properly installed, but I get the following error when running:

E:\Koyah\kohya_ss>gui.bat --listen 127.0.0.1 --server_port 7860 --inbrowser
'.\venv\Scripts\activate.bat' is not recognized as an internal or external command,
operable program or batch file.
System Information:
System: Windows, Release: 10, Version: 10.0.19044, Machine: AMD64, Processor: Intel64 Family 6 Model 158 Stepping 10, GenuineIntel

Python Information:
Version: 3.10.9, Implementation: CPython, Compiler: MSC v.1934 64 bit (AMD64)

Virtual Environment Information:
Not running inside a virtual environment.

GPU Information:
Name: NVIDIA GeForce GTX 1070, VRAM: 8192 MiB

Validating that requirements are satisfied.
All requirements satisfied.
headless: False
Load CSS...
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Folder 100_Archie : 5000 steps
max_train_steps = 5000
stop_text_encoder_training = 0
lr_warmup_steps = 500
accelerate launch --num_cpu_threads_per_process=2 "train_db.py" --enable_bucket --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" --train_data_dir="E:/stable-diffusion/Lora Training Data/Archie/image" --resolution=512,512 --output_dir="E:/stable-diffusion/Lora Training Data/Archie/model" --logging_dir="E:/stable-diffusion/Lora Training Data/Archie/log" --save_model_as=safetensors --output_name="last" --max_data_loader_n_workers="0" --learning_rate="1e-05" --lr_scheduler="cosine" --lr_warmup_steps="500" --train_batch_size="1" --max_train_steps="5000" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --cache_latents --optimizer_type="AdamW8bit" --max_data_loader_n_workers="0" --bucket_reso_steps=64 --xformers --bucket_no_upscale
prepare tokenizer
prepare images.
found directory E:\stable-diffusion\Lora Training Data\Archie\image\100_Archie contains 50 image files
5000 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
batch_size: 1
resolution: (512, 512)
enable_bucket: True
min_bucket_reso: 256
max_bucket_reso: 1024
bucket_reso_steps: 64
bucket_no_upscale: True

[Subset 0 of Dataset 0]
image_dir: "E:\stable-diffusion\Lora Training Data\Archie\image\100_Archie"
image_count: 50
num_repeats: 100
shuffle_caption: False
keep_tokens: 0
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: False
class_tokens: Archie
caption_extension: .caption

[Dataset 0]
loading image sizes.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 206.52it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数（繰り返し回数を含む）
bucket 0: resolution (320, 576), count: 400
bucket 1: resolution (320, 640), count: 100
bucket 2: resolution (320, 704), count: 200
bucket 3: resolution (320, 768), count: 200
bucket 4: resolution (384, 448), count: 100
bucket 5: resolution (384, 512), count: 300
bucket 6: resolution (384, 576), count: 800
bucket 7: resolution (448, 448), count: 700
bucket 8: resolution (448, 512), count: 1300
bucket 9: resolution (448, 576), count: 500
bucket 10: resolution (512, 384), count: 100
bucket 11: resolution (512, 448), count: 100
bucket 12: resolution (512, 512), count: 200
mean ar error (without repeats): 0.027661735600387
prepare accelerator
C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\accelerator.py:249: FutureWarning: logging_dir is deprecated and will be removed in version 0.18.0 of 🤗 Accelerate. Use project_dir instead.
warnings.warn(
Using accelerator 0.15.0 or above.
loading model for process 0/1
load Diffusers pretrained models
safety_checker\model.safetensors not found
Fetching 19 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<?, ?it/s]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at huggingface/diffusers#254 .
Replace CrossAttention.forward to use xformers
[Dataset 0]
caching latents.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:13<00:00, 3.59it/s]
prepare optimizer, data loader etc.

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
For effortless bug reporting copy-paste your error into this form: https://docs.google.com/forms/d/e/1FAIpQLScPB8emS3Thkp66nvqwmjTEgxp8Y9ufuWTzFyr9kJ5AoI47dQ/viewform?usp=sf_link

CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes\cuda_setup\paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {WindowsPath('/usr/local/cuda/lib64')}
warn(
WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
CUDA SETUP: Loading binary C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
Traceback (most recent call last):
File "E:\Koyah\kohya_ss\train_db.py", line 469, in <module>
train(args)
File "E:\Koyah\kohya_ss\train_db.py", line 152, in train
_, _, optimizer = train_util.get_optimizer(args, trainable_params)
File "E:\Koyah\kohya_ss\library\train_util.py", line 2517, in get_optimizer
import bitsandbytes as bnb
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes\__init__.py", line 6, in <module>
from .autograd._functions import (
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes\autograd\_functions.py", line 5, in <module>
import bitsandbytes.functional as F
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes\functional.py", line 13, in <module>
from .cextension import COMPILED_WITH_CUDA, lib
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes\cextension.py", line 41, in <module>
lib = CUDALibrary_Singleton.get_instance().lib
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes\cextension.py", line 37, in get_instance
cls._instance.initialize()
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes\cextension.py", line 31, in initialize
self.lib = ct.cdll.LoadLibrary(binary_path)
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 452, in LoadLibrary
return self._dlltype(name)
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 364, in __init__
if '/' in name or '\\' in name:
TypeError: argument of type 'WindowsPath' is not iterable
Traceback (most recent call last):
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\Scripts\accelerate.exe\__main__.py", line 7, in <module>
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 923, in launch_command
simple_launcher(args)
File "C:\Users\Archie\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 579, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\Users\Archie\AppData\Local\Programs\Python\Python310\python.exe', 'train_db.py', '--enable_bucket', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=E:/stable-diffusion/Lora Training Data/Archie/image', '--resolution=512,512', '--output_dir=E:/stable-diffusion/Lora Training Data/Archie/model', '--logging_dir=E:/stable-diffusion/Lora Training Data/Archie/log', '--save_model_as=safetensors', '--output_name=last', '--max_data_loader_n_workers=0', '--learning_rate=1e-05', '--lr_scheduler=cosine', '--lr_warmup_steps=500', '--train_batch_size=1', '--max_train_steps=5000', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--cache_latents', '--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=0', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale']' returned non-zero exit status 1.
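
For anyone reading the first traceback: the final TypeError comes from a substring check (`'/' in name`) being applied to a path object rather than a string. The sketch below reproduces that in isolation. It is only an illustration, not the bitsandbytes or ctypes code itself; `has_separator` and the path are hypothetical, and `PureWindowsPath` is used so the snippet behaves the same on any OS.

```python
from pathlib import PureWindowsPath


def has_separator(name):
    """Same kind of check as the last frame of the traceback."""
    return '/' in name or '\\' in name


# Hypothetical path, standing in for the binary path in the log.
binary_path = PureWindowsPath(r"C:\example\bitsandbytes\libbitsandbytes_cpu.so")

try:
    has_separator(binary_path)      # path objects do not support "in"
    path_object_failed = False
except TypeError as exc:
    path_object_failed = True
    print("path object:", exc)

# Converting to str first makes the same check work.
string_ok = has_separator(str(binary_path))
print("str(path):", string_ok)
```

So the crash happens while loading the bitsandbytes binary, before any training step runs.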

StreakThunderstorm · 2023-05-08T16:53:18Z

Run setup.bat again.
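
The log above says `activate.bat` was not recognized and "Not running inside a virtual environment", so after re-running setup it may be worth confirming that the venv exists and is actually in use. A minimal sketch, assuming the venv lives in a `venv` folder inside the repository as the error message suggests; the function names are made up for illustration.

```python
import sys
from pathlib import Path


def in_virtualenv(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """True when the interpreter runs inside a venv (the prefixes differ)."""
    return prefix != base_prefix


def activate_script(repo_dir):
    """Path where gui.bat looked for the activation script, per the log."""
    return Path(repo_dir) / "venv" / "Scripts" / "activate.bat"


if __name__ == "__main__":
    # Run from the kohya_ss folder.
    print("inside venv:", in_virtualenv())
    print("activate.bat exists:", activate_script(".").is_file())
```

If both print False, the packages in the traceback are being loaded from the global Python install rather than from the repository's venv.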

1 reply

organfreeman36 May 11, 2023
Author

I've run it again and it's producing the same error.

dilectiogames · 2023-05-13T08:27:55Z

I don't see CUDA listed among the requirements on the main page, but I think it is needed. I had installed it at some point before installing this repository, so maybe the setup found it and didn't throw an error for me.

I don't have the link anymore, but you need to go to NVIDIA's site and find this file (or a more recent one):

cuda_12.1.1_531.14_windows.exe

It is a roughly 3 GB file with the CUDA components inside.
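
To see whether a CUDA toolkit is already visible on the machine, something like the following could help. It is a rough sketch under two assumptions: that the NVIDIA installer sets the `CUDA_PATH` environment variable and puts `nvcc` on the PATH, which it normally does on Windows. The function name is hypothetical.

```python
import os
import shutil


def find_cuda_toolkit(env=None, which=shutil.which):
    """Report where (if anywhere) a CUDA toolkit install is visible."""
    env = os.environ if env is None else env
    return {
        "CUDA_PATH": env.get("CUDA_PATH"),  # assumed to be set by the installer
        "nvcc": which("nvcc"),              # CUDA compiler, if on PATH
    }


if __name__ == "__main__":
    for key, value in find_cuda_toolkit().items():
        print(f"{key}: {value or 'not found'}")
```

If both entries come back as not found, the toolkit is probably not installed or not on the PATH.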
