Setting up PyTorch plugin "upfirdn2d_plugin"... Failed! #18

Open
Kuang-Hiu opened this issue Oct 8, 2022 · 11 comments

Comments

@Kuang-Hiu

Can you help me train the model?
When I try to train the model on the car dataset, I get this error:
image

@ghost

ghost commented Oct 8, 2022 via email

@Kuang-Hiu
Author

Thanks for your comment. I installed Visual Studio, since MSVC 14.x is required, but I still get this error:

image

@SteveJunGao
Collaborator

Hi @Kuang-Hiu

From this error message, it seems that VS doesn't have the corresponding C++ compiler. Note that the code also requires nvcc to compile the custom CUDA kernels; I'm not sure whether that is installed on your system.

Unfortunately, we haven't tested the code on Windows, and I'm not very familiar with the Windows build toolchain. I recommend running the code on Linux, where it should be much easier.
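
A quick way to check the two requirements mentioned above is to see whether the compilers are visible on PATH from the same shell that launches training. The sketch below is only a diagnostic, not part of the repository; the function name `find_build_tools` is made up for illustration, and `cl` is assumed to be the MSVC compiler executable (on Linux you would pass `gcc` or `c++` instead).

```python
import shutil


def find_build_tools(tools=("cl", "nvcc")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    # Run this from the same terminal/environment you use for training.
    for tool, path in find_build_tools().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

If `cl` is reported as missing on Windows, launching Python from a Visual Studio developer command prompt is one common way to get it on PATH.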

@SteveJunGao
Collaborator

Hi @Kuang-Hiu, I checked the GitHub issues for StyleGAN3.

Maybe this issue is helpful for you?

@Kuang-Hiu
Author

Thanks @SteveJunGao. I do have nvcc installed on my system. I just updated VS from 2019 to 2022, but it still doesn't work.
This is my error log:
image

@lalalune

lalalune commented Oct 9, 2022

@Kuang-Hiu I had these issues, and they occurred for several different reasons.

I recommend downloading and running the run.sh installer from NVIDIA for the latest CUDA version, notably 11.8. CUDA is backwards compatible, and I got it working successfully.

I think these problems generally arise because the CUDA package built into the Linux distro won't work with PyTorch, although it does set the PATH variable properly, whereas the run.sh installer from the website installs the proper version but doesn't seem to set the paths for you.

nvidia-smi should work, if things are installed properly, and show you your GPUs.
nvcc should also be installed. nvcc --version will tell you if it is. If it isn't found, you might have installed it, but it's not on your PATH.
You can check this by running /usr/local/cuda-<version>/bin/nvcc --version
If that works, then you've got a symlink issue.

If you open ~/.bashrc (for example with nano ~/.bashrc) and add the following, it should fix the issue where nvcc can't be found:

export PATH="/usr/local/cuda-11.8/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH"
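
The manual check and the two export lines can be sketched as a small script. This is a minimal sketch, assuming CUDA toolkits live under /usr/local/cuda* as in the paths above; the helper names `find_cuda_homes` and `bashrc_lines` are hypothetical, and the script only prints suggestions, it does not edit ~/.bashrc.

```python
import glob
import os


def find_cuda_homes(pattern="/usr/local/cuda*"):
    """Return CUDA install directories that actually contain bin/nvcc."""
    homes = []
    for home in sorted(glob.glob(pattern)):
        if os.path.isfile(os.path.join(home, "bin", "nvcc")):
            homes.append(home)
    return homes


def bashrc_lines(cuda_home):
    """Build the two export lines to add to ~/.bashrc for one CUDA install."""
    return [
        f'export PATH="{cuda_home}/bin:$PATH"',
        f'export LD_LIBRARY_PATH="{cuda_home}/lib64:$LD_LIBRARY_PATH"',
    ]


if __name__ == "__main__":
    homes = find_cuda_homes()
    if not homes:
        print("No nvcc found under /usr/local/cuda*")
    for home in homes:
        print("\n".join(bashrc_lines(home)))
```

After adding the lines, open a new shell (or run source ~/.bashrc) so that nvcc --version picks up the change.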

For some reason I had a nightmare on A100s: everything ran fine up to training and then crashed right at the start. I switched to V100s (using CoreWeave, so I could keep the storage volume and just restart) and everything worked perfectly. I think this was a PyTorch version / driver version issue somewhere in my system, and it might work if I switched back now that everything is set up properly. I don't know whether this is correct, but it may be helpful folklore to start digging into if you're out of ideas.
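
To dig into a possible PyTorch / CUDA / GPU mismatch like the one described above, it can help to print what the installed PyTorch build was compiled for and what it sees at runtime. The sketch below is an assumption about a useful first diagnostic, not a confirmed explanation of the A100 crash; `cuda_report` is a made-up helper name, and it returns a minimal result if PyTorch is not importable.

```python
def cuda_report():
    """Collect version facts that help spot a PyTorch/CUDA/GPU mismatch."""
    try:
        import torch
    except ImportError:
        return {"torch": None}

    report = {
        "torch": torch.__version__,
        # CUDA version the wheel was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
        # (major, minor) compute capability of GPU 0.
        report["capability"] = torch.cuda.get_device_capability(0)
        # GPU architectures this PyTorch build ships kernels for.
        report["arch_list"] = torch.cuda.get_arch_list()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If the GPU's compute capability is not covered by the architecture list, the installed PyTorch build is a likely suspect; comparing this output with nvidia-smi and nvcc --version narrows things down further.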

@CaffeyChen

I got the same error on CentOS, so it seems to be a configuration problem.

@r530044129

NVlabs/stylegan2-ada-pytorch#11
You can try this one; it works for me.

@Kuang-Hiu
Author

Thanks for your support. At the moment, though, I'm facing this error:
C:\Users\Admin\AppData\Local\Programs\Python\Python39\python.exe C:\Users\Admin\Documents\GitHub\Actif3D\train_3d.py
==> start
==> use shapenet dataset
==> ERROR!!!! THIS SHOULD ONLY HAPPEN WHEN USING INFERENCE
==> use image path: ./tmp, num images: 1234
==> launch training

Training options:
{
"G_kwargs": {
"class_name": "training.networks_get3d.GeneratorDMTETMesh",
"z_dim": 512,
"w_dim": 512,
"mapping_kwargs": {
"num_layers": 8
},
"one_3d_generator": true,
"n_implicit_layer": 1,
"deformation_multiplier": 1.0,
"use_style_mixing": true,
"dmtet_scale": 1.0,
"feat_channel": 16,
"mlp_latent_channel": 32,
"tri_plane_resolution": 256,
"n_views": 1,
"render_type": "neural_render",
"use_tri_plane": true,
"tet_res": 90,
"geometry_type": "conv3d",
"data_camera_mode": "shapenet_car",
"channel_base": 32768,
"channel_max": 512,
"fused_modconv_default": "inference_only",
"num_fp16_res": 0,
"conv_clamp": null
},
"D_kwargs": {
"class_name": "training.networks_get3d.Discriminator",
"block_kwargs": {
"freeze_layers": 0
},
"mapping_kwargs": {},
"epilogue_kwargs": {
"mbstd_group_size": 4
},
"data_camera_mode": "shapenet_car",
"add_camera_cond": true,
"channel_base": 32768,
"channel_max": 512,
"architecture": "skip",
"num_fp16_res": 0,
"conv_clamp": null
},
"G_opt_kwargs": {
"class_name": "torch.optim.Adam",
"betas": [
0,
0.99
],
"eps": 1e-08,
"lr": 0.002
},
"D_opt_kwargs": {
"class_name": "torch.optim.Adam",
"betas": [
0,
0.99
],
"eps": 1e-08,
"lr": 0.002
},
"loss_kwargs": {
"class_name": "training.loss.StyleGAN2Loss",
"gamma_mask": 40.0,
"r1_gamma": 40.0,
"style_mixing_prob": 0.9,
"pl_weight": 0.0
},
"data_loader_kwargs": {
"pin_memory": true,
"prefetch_factor": 2,
"num_workers": 3
},
"inference_vis": true,
"inference_to_generate_textured_mesh": true,
"inference_save_interpolation": false,
"inference_compute_fid": false,
"inference_generate_geo": false,
"training_set_kwargs": {
"class_name": "training.dataset.ImageFolderDataset",
"path": "./tmp",
"use_labels": false,
"max_size": 1234,
"xflip": false,
"resolution": 1024,
"data_camera_mode": "shapenet_car",
"add_camera_cond": true,
"camera_path": "./tmp",
"split": "all"
},
"resume_pretrain": "./pretrained_model/shapenet_car.pt",
"D_reg_interval": 16,
"num_gpus": 1,
"batch_size": 8,
"batch_gpu": 4,
"metrics": [
"fid50k"
],
"total_kimg": 20000,
"kimg_per_tick": 1,
"image_snapshot_ticks": 50,
"network_snapshot_ticks": 200,
"model_name": "car",
"promts": "A pink car",
"random_seed": 891,
"ema_kimg": 2.5,
"G_reg_interval": 4,
"run_dir": "/content/drive/MyDrive/Get3D/GET3D/save_inference_results\inference"
}

Output directory: /content/drive/MyDrive/Get3D/GET3D/save_inference_results\inference
Number of GPUs: 1
Batch size: 8 images
Training duration: 20000 kimg
Dataset path: ./tmp
Dataset size: 1234 images
Dataset resolution: 1024
Dataset labels: False
Dataset x-flips: False

Creating output directory...
Launching processes...
Setting up PyTorch plugin "upfirdn2d_plugin"... Failed!
Traceback (most recent call last):
File "C:\Users\Admin\Documents\GitHub\Actif3D\train_3d.py", line 391, in <module>
main() # pylint: disable=no-value-for-parameter
File "C:\Users\Admin\AppData\Local\Programs\Python\Python39\lib\site-packages\click\core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "C:\Users\Admin\AppData\Local\Programs\Python\Python39\lib\site-packages\click\core.py", line 1055, in main
rv = self.invoke(ctx)
File "C:\Users\Admin\AppData\Local\Programs\Python\Python39\lib\site-packages\click\core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "C:\Users\Admin\AppData\Local\Programs\Python\Python39\lib\site-packages\click\core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "C:\Users\Admin\Documents\GitHub\Actif3D\train_3d.py", line 385, in main
launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
File "C:\Users\Admin\Documents\GitHub\Actif3D\train_3d.py", line 149, in launch_training
subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
File "C:\Users\Admin\Documents\GitHub\Actif3D\train_3d.py", line 92, in subprocess_fn
inference_3d.inference(rank=rank, **c)
File "C:\Users\Admin\Documents\GitHub\Actif3D\training\inference_3d.py", line 50, in inference
upfirdn2d._init()
File "C:\Users\Admin\Documents\GitHub\Actif3D\torch_utils\ops\upfirdn2d.py", line 28, in _init
_plugin = custom_ops.get_plugin(
File "C:\Users\Admin\Documents\GitHub\Actif3D\torch_utils\custom_ops.py", line 140, in get_plugin
torch.utils.cpp_extension.load(
File "C:\Users\Admin\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py", line 1080, in load
return _jit_compile(
File "C:\Users\Admin\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py", line 1318, in _jit_compile
return _import_module_from_library(name, build_directory, is_python_module)
File "C:\Users\Admin\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py", line 1701, in _import_module_from_library
module = importlib.util.module_from_spec(spec)
File "<frozen importlib._bootstrap>", line 565, in module_from_spec
File "<frozen importlib._bootstrap_external>", line 1173, in create_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
ImportError: DLL load failed while importing upfirdn2d_plugin: The specified module could not be found.

Process finished with exit code 1

Has anyone solved this?
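
One thing worth checking for a "DLL load failed" error like the one above: since Python 3.8 on Windows, the dependencies of an extension module are no longer looked up through PATH, so a compiled plugin can fail to import when the CUDA or PyTorch DLLs it links against are not in a registered directory. Whether that is the cause here is an assumption, not something confirmed in this thread. The sketch below registers candidate directories with `os.add_dll_directory` before the plugin is loaded; `add_dll_dirs` is a made-up helper name, and the candidate paths (the `CUDA_PATH` environment variable and PyTorch's `lib` folder) are guesses about a typical install.

```python
import os
import sys


def add_dll_dirs(candidates):
    """Register existing directories for DLL lookup on Windows (Python 3.8+).

    Returns the directories that were registered; does nothing elsewhere.
    """
    added = []
    if sys.platform != "win32" or not hasattr(os, "add_dll_directory"):
        return added
    for path in candidates:
        if path and os.path.isdir(path):
            full = os.path.abspath(path)  # add_dll_directory needs an absolute path
            os.add_dll_directory(full)
            added.append(full)
    return added


if __name__ == "__main__":
    candidates = []
    cuda_path = os.environ.get("CUDA_PATH")  # usually set by the CUDA installer
    if cuda_path:
        candidates.append(os.path.join(cuda_path, "bin"))
    try:
        import torch
        candidates.append(os.path.join(os.path.dirname(torch.__file__), "lib"))
    except ImportError:
        pass
    print("Registered DLL directories:", add_dll_dirs(candidates))
```

For this to have any effect, the call would need to run in the training process before upfirdn2d._init() is reached, for example near the top of train_3d.py.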

@Sravanthgithub

This worked for me:
!apt-get update && apt-get upgrade -y && apt-get install -y nvidia-driver-460

@mahmud30tibn

mahmud30tibn commented Feb 6, 2024

Surprisingly, this worked for me (I ran the commands for setting up the environment):
HawkAaron/warp-transducer#15
Screenshot 2024-02-06 at 3 06 21 PM
