Dear author,
I greatly admire your research. I am a graduate student at Beijing Forestry University in China, and I would like to apply your method to terrain generation in images. However, I ran into some problems while training the paired model from this paper, and I hope you can advise me.
Here are my configuration and training command. The loss does not seem to converge, and the generated results are unsatisfactory.
accelerate_config:
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: 'NO'
downcast_bf16: 'no'
enable_cpu_affinity: true
gpu_ids: all
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 1
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
I think your training results look worse for two reasons: bf16 mixed precision (mixed_precision: bf16) and a batch size of 1 (--train_batch_size=1).
Could you try the following changes to your settings:
- Make sure you are using the latest version of the training code.
- Increase the batch size to 2.
- Do not use bf16 mixed-precision training.
Thank you for your reply. I will reconfigure and retrain as you suggest, but with my current setup I always run out of memory when batch_size is set to 2. Could you share your specific configuration?
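One common workaround when a batch size of 2 does not fit in memory is gradient accumulation: process micro-batches of size 1 and sum their scaled gradients before each optimizer step. The sketch below illustrates the idea on a toy one-parameter model. It is not the pix2pix-turbo training code, all function names are hypothetical, and whether the training script exposes an accumulation option is an assumption that should be checked against its argument list.

```python
# Toy illustration of gradient accumulation (hypothetical example, not the
# pix2pix-turbo model). The model is y = w * x with a squared-error loss.

def grad_mse(w, batch):
    """Gradient of the mean squared error over `batch` with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def full_batch_step(w, batch, lr):
    """One SGD step computed on the whole batch at once."""
    return w - lr * grad_mse(w, batch)

def accumulated_step(w, batch, lr, micro_batch_size=1):
    """One SGD step that walks through the batch in smaller micro-batches.

    Assumes len(batch) is divisible by micro_batch_size, so every
    micro-batch carries the same weight in the average.
    """
    micro_batches = [batch[i:i + micro_batch_size]
                     for i in range(0, len(batch), micro_batch_size)]
    accum_steps = len(micro_batches)
    grad = 0.0
    for micro in micro_batches:
        # Scale each micro-batch gradient so the sum equals the full-batch mean.
        grad += grad_mse(w, micro) / accum_steps
    return w - lr * grad

batch = [(1.0, 2.0), (3.0, 5.0)]           # two (x, y) samples
w_full = full_batch_step(0.5, batch, lr=0.01)
w_accum = accumulated_step(0.5, batch, lr=0.01)
print(abs(w_full - w_accum) < 1e-12)       # the two updates should agree
```

Only one sample is held in memory at a time, yet the parameter update matches the batch-of-2 update for this toy loss. In a real network the match is only approximate if any layer computes statistics across the batch.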
Training command from the original post:
accelerate launch src/train_pix2pix_turbo.py
--pretrained_model_name_or_path="stabilityai/sd-turbo"
--output_dir="output/pix2pix_turbo/fill50k"
--dataset_folder="data/my_fill50k"
--resolution=512
--train_batch_size=1
--enable_xformers_memory_efficient_attention --viz_freq 25
--track_val_fid
--report_to "w
--tracker_project_name "pix2pix_turbo_fill50k"