[WinError 1455] 页面文件太小,无法完成操作 (The paging file is too small for this operation to complete) #449
This traceback comes from loading the "cudnn_cnn_infer64_8.dll" library required by PyTorch. The Windows error "页面文件太小,无法完成操作" means "The paging file is too small for this operation to complete", which occurs when the virtual memory (page file) is too small for the system to operate correctly. You can try increasing the page file size to resolve this issue:
1. Right-click "This PC" or "My Computer", select "Properties", and open "Advanced system settings".
2. On the "Advanced" tab, click "Settings" under Performance, then open that dialog's "Advanced" tab and click "Change" under Virtual memory.
3. Uncheck "Automatically manage paging file size for all drives", set a larger custom size, apply, and restart.
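If you want to confirm the page file really is the bottleneck before (or after) resizing it, a quick check like the sketch below prints the current Windows commit limit (physical RAM plus page file) and how much of it is still free; WinError 1455 shows up when that limit is exhausted. This is a plain ctypes sketch, not part of kohya_ss:

```python
# Minimal sketch (not kohya_ss code): query the Windows commit limit via
# GlobalMemoryStatusEx. WinError 1455 appears when committed memory would
# exceed ullTotalPageFile (physical RAM + page file).
import ctypes
from ctypes import wintypes

class MEMORYSTATUSEX(ctypes.Structure):
    _fields_ = [
        ("dwLength", wintypes.DWORD),
        ("dwMemoryLoad", wintypes.DWORD),
        ("ullTotalPhys", ctypes.c_ulonglong),
        ("ullAvailPhys", ctypes.c_ulonglong),
        ("ullTotalPageFile", ctypes.c_ulonglong),   # total commit limit
        ("ullAvailPageFile", ctypes.c_ulonglong),   # commit space still free
        ("ullTotalVirtual", ctypes.c_ulonglong),
        ("ullAvailVirtual", ctypes.c_ulonglong),
        ("ullAvailExtendedVirtual", ctypes.c_ulonglong),
    ]

status = MEMORYSTATUSEX()
status.dwLength = ctypes.sizeof(MEMORYSTATUSEX)
ctypes.windll.kernel32.GlobalMemoryStatusEx(ctypes.byref(status))

gib = 1024 ** 3
print(f"total commit limit : {status.ullTotalPageFile / gib:.1f} GiB")
print(f"available commit   : {status.ullAvailPageFile / gib:.1f} GiB")
```

If the available commit is only a few GiB, loading several copies of the CUDA/cuDNN DLLs can push past the limit and trigger this exact error.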
I was also getting this 1455 error recently. When it wasn't this error, I got a CUDA error or a TensorFlow error; sorry, I should have saved them for debugging. Increasing the page file as much as I could didn't work for me. The solution was to change "Max num workers for DataLoader" in "Advanced configuration" to a low value like 0, 1, or 2. Just entering a low number there made the project work normally again. v21.3.4 suggests the default value for this box has changed to 0, but I have to manually type 0 in the box for it to work; leaving it empty gives me the paging file error. References that helped me find the solution:
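As background for why a low worker count helps: on Windows, every DataLoader worker is a separate process started with the multiprocessing "spawn" method, so each worker re-imports torch and reloads the cuDNN DLLs. That is why the traceback below fails inside multiprocessing\spawn.py, and why each extra worker adds to the committed memory. A minimal sketch of the difference, using plain PyTorch with a made-up stand-in dataset (this is not kohya_ss code):

```python
# Illustration only: how num_workers affects process spawning on Windows.
import torch
from torch.utils.data import DataLoader, TensorDataset

def make_loader(num_workers: int) -> DataLoader:
    # Dummy dataset roughly shaped like cached latents; purely illustrative.
    dataset = TensorDataset(torch.randn(340, 4, 64, 80))
    return DataLoader(dataset, batch_size=1, num_workers=num_workers)

if __name__ == "__main__":
    # num_workers=0: batches are prepared in the main process, so no extra
    # processes are spawned and no extra copies of the torch DLLs are loaded.
    for _ in make_loader(num_workers=0):
        break

    # num_workers=8 (reportedly the fallback used when the GUI field is left
    # empty): iterating spawns 8 child processes, each re-importing torch and
    # reloading cudnn_*.dll; with a small page file this spawn step is where
    # WinError 1455 tends to appear.
    for _ in make_loader(num_workers=8):
        break
```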
I will implement a change in the code to set that value to 0 if a user has not configured it, as the default kohya's script uses when it is not set is causing the issue. A workaround until kohya fixes this on his side, I guess...
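A rough sketch of what that default could look like when the GUI assembles the training command; the function names here are hypothetical and this is not the actual kohya_ss GUI code, only the --max_data_loader_n_workers flag comes from sd-scripts:

```python
# Hypothetical sketch (not the actual kohya_ss implementation): coerce an
# empty "Max num workers for DataLoader" field to 0 before building the
# command, so the training script never falls back to its own higher default.
def resolve_num_workers(gui_value: str) -> int:
    value = gui_value.strip()
    return int(value) if value else 0

def dataloader_workers_arg(gui_value: str) -> str:
    # Always emit the flag explicitly instead of omitting it when unset.
    return f'--max_data_loader_n_workers="{resolve_num_workers(gui_value)}"'

print(dataloader_workers_arg(""))   # --max_data_loader_n_workers="0"
print(dataloader_workers_arg("2"))  # --max_data_loader_n_workers="2"
```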
Folder 10_yurisa: 17 images found
Folder 10_yurisa: 170 steps
max_train_steps = 1700
stop_text_encoder_training = 0
lr_warmup_steps = 170
accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --pretrained_model_name_or_path="F:/Stable Diffusion/models/Stable-diffusion/chilloutmix_NiPrunedFp32Fix.safetensors" --train_data_dir="F:/Stable Diffusion/loratrain/train/images" --resolution=512,640 --output_dir="F:/Stable Diffusion/loratrain/train/models" --logging_dir="F:/Stable Diffusion/loratrain/train/log" --network_alpha="1" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=1e-4 --unet_lr=0.0001 --network_dim=8 --output_name="yurisa" --lr_scheduler_num_cycles="10" --learning_rate="0.0001" --lr_scheduler="cosine" --lr_warmup_steps="170" --train_batch_size="1" --max_train_steps="1700" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --cache_latents --optimizer_type="AdamW8bit" --bucket_reso_steps=64 --xformers --bucket_no_upscale
Save as...
prepare tokenizer
F:/Stable Diffusion/loratrain/train/1.json
Use DreamBooth method.
prepare images.
found directory F:\Stable Diffusion\loratrain\train\images\10_yurisa contains 34 image files
340 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
batch_size: 1
resolution: (512, 640)
enable_bucket: False
[Subset 0 of Dataset 0]
image_dir: "F:\Stable Diffusion\loratrain\train\images\10_yurisa"
image_count: 34
num_repeats: 10
shuffle_caption: False
keep_tokens: 0
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
is_reg: False
class_tokens: yurisa
caption_extension: .caption
[Dataset 0]
loading image sizes.
100%|████████████████████████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 4264.29it/s]
prepare dataset
prepare accelerator
Using accelerator 0.15.0 or above.
load StableDiffusion checkpoint
loading u-net:
loading vae:
loading text encoder:
Replace CrossAttention.forward to use xformers
[Dataset 0]
caching latents.
100%|██████████████████████████████████████████████████████████████████████████████████| 17/17 [00:05<00:00, 3.29it/s]
import network module: networks.lora
create LoRA network. base dim (rank): 8, alpha: 1.0
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
enable LoRA for text encoder
enable LoRA for U-Net
prepare optimizer, data loader etc.
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
For effortless bug reporting copy-paste your error into this form: https://docs.google.com/forms/d/e/1FAIpQLScPB8emS3Thkp66nvqwmjTEgxp8Y9ufuWTzFyr9kJ5AoI47dQ/viewform?usp=sf_link
CUDA SETUP: Loading binary F:\Stable Diffusion\loratrain\kohya_ss\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll...
use 8-bit AdamW optimizer | {}
running training / 学習開始
num train images * repeats / 学習画像の数×繰り返し回数: 340
num reg images / 正則化画像の数: 0
num batches per epoch / 1epochのバッチ数: 170
num epochs / epoch数: 10
batch size per device / バッチサイズ: 1
gradient accumulation steps / 勾配を合計するステップ数 = 1
total optimization steps / 学習ステップ数: 1700
steps: 0%| | 0/1700 [00:00<?, ?it/s]epoch 1/10
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\SJSM\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\SJSM\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 125, in _main
prepare(preparation_data)
File "C:\Users\SJSM\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\SJSM\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "C:\Users\SJSM\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "C:\Users\SJSM\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:\Users\SJSM\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
File "F:\Stable Diffusion\loratrain\kohya_ss\train_network.py", line 1, in
from torch.nn.parallel import DistributedDataParallel as DDP
File "F:\Stable Diffusion\loratrain\kohya_ss\venv\lib\site-packages\torch_init.py", line 129, in
raise err
OSError: [WinError 1455] 页面文件太小,无法完成操作。 Error loading "F:\Stable Diffusion\loratrain\kohya_ss\venv\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll" or one of its dependencies.
I tried the following methods and none of them solved the problem:
1. Setting the virtual memory on D: to a fixed size of 200 GB (both initial and maximum) in advanced system settings
2. Installing mwxKTEtelILoIbMbruuM.zip (RTX 4080)
3. Reinstalling multiple times