Lammps with CPU MD #225

Closed
Amitcuhp opened this issue Nov 14, 2023 · 4 comments
Comments

@Amitcuhp

After training the model on the develop branch, the LAMMPS MD run is not working and the following error comes up:

LAMMPS (28 Mar 2023 - Development)
OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (src/comm.cpp:98)
using 1 OpenMP thread(s) per MPI task

The 'box' command has been removed and will be ignored

Reading data file ...
triclinic box = (0 0 0) to (30 30 30) with tilt (0 0 0)
1 by 1 by 2 MPI processor grid
reading atoms ...
55 atoms
Finding 1-2 1-3 1-4 neighbors ...
special bond factors lj: 0 0 0
special bond factors coul: 0 0 0
0 = max # of 1-2 neighbors
0 = max # of 1-3 neighbors
0 = max # of 1-4 neighbors
1 = max # of special neighbors
special bonds CPU = 0.001 seconds
read_data CPU = 0.002 seconds
CUDA unavailable, setting device type to torch::kCPU.
CUDA unavailable, setting device type to torch::kCPU.
Loading MACE model from "MACE_model_run-123_swa.model-lammps.pt" ...
Loading MACE model from "MACE_model_run-123_swa.model-lammps.pt" ...
terminate called after throwing an instance of 'c10::Error'
what(): open file failed because of errno 2 on fopen: , file path: MACE_model_run-123_swa.model-lammps.pt
Exception raised from RAIIFile at ../caffe2/serialize/file_adapter.cc:21 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x3e (0x7fdaa5ed556e in /home/Raman/lammps-mace-cpu/libtorch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5c (0x7fdaa5e9ff18 in /home/Raman/lammps-mace-cpu/libtorch/lib/libc10.so)
frame #2: caffe2::serialize::FileAdapter::RAIIFile::RAIIFile(std::string const&) + 0x124 (0x7fda8db24634 in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #3: caffe2::serialize::FileAdapter::FileAdapter(std::string const&) + 0x2e (0x7fda8db2468e in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #4: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::string const&) + 0x5a (0x7fda8db22ada in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #5: torch::jit::import_ir_module(std::shared_ptr&lt;torch::jit::CompilationUnit&gt;, std::string const&, c10::optional&lt;c10::Device&gt;, std::unordered_map&lt;std::string, std::string, std::hash&lt;std::string&gt;, std::equal_to&lt;std::string&gt;, std::allocator&lt;std::pair&lt;std::string const, std::string&gt; &gt; &gt;&) + 0x2a5 (0x7fda8ebd0b85 in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #6: torch::jit::import_ir_module(std::shared_ptr&lt;torch::jit::CompilationUnit&gt;, std::string const&, c10::optional&lt;c10::Device&gt;) + 0x7b (0x7fda8ebd139b in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #7: torch::jit::load(std::string const&, c10::optional&lt;c10::Device&gt;) + 0xa5 (0x7fda8ebd1475 in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #8: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x96e54f]
frame #9: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x4ab633]
frame #10: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x4b0ace]
frame #11: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x4b1395]
frame #12: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x491bf8]
frame #13: __libc_start_main + 0xf5 (0x7fda89260555 in /lib64/libc.so.6)
frame #14: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x493047]

terminate called after throwing an instance of 'c10::Error'
what(): open file failed because of errno 2 on fopen: , file path: MACE_model_run-123_swa.model-lammps.pt
Exception raised from RAIIFile at ../caffe2/serialize/file_adapter.cc:21 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x3e (0x7fa3c2e0656e in /home/Raman/lammps-mace-cpu/libtorch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5c (0x7fa3c2dd0f18 in /home/Raman/lammps-mace-cpu/libtorch/lib/libc10.so)
frame #2: caffe2::serialize::FileAdapter::RAIIFile::RAIIFile(std::string const&) + 0x124 (0x7fa3aaa55634 in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #3: caffe2::serialize::FileAdapter::FileAdapter(std::string const&) + 0x2e (0x7fa3aaa5568e in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #4: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::string const&) + 0x5a (0x7fa3aaa53ada in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #5: torch::jit::import_ir_module(std::shared_ptr&lt;torch::jit::CompilationUnit&gt;, std::string const&, c10::optional&lt;c10::Device&gt;, std::unordered_map&lt;std::string, std::string, std::hash&lt;std::string&gt;, std::equal_to&lt;std::string&gt;, std::allocator&lt;std::pair&lt;std::string const, std::string&gt; &gt; &gt;&) + 0x2a5 (0x7fa3abb01b85 in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #6: torch::jit::import_ir_module(std::shared_ptr&lt;torch::jit::CompilationUnit&gt;, std::string const&, c10::optional&lt;c10::Device&gt;) + 0x7b (0x7fa3abb0239b in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #7: torch::jit::load(std::string const&, c10::optional&lt;c10::Device&gt;) + 0xa5 (0x7fa3abb02475 in /home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so)
frame #8: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x96e54f]
frame #9: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x4ab633]
frame #10: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x4b0ace]
frame #11: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x4b1395]
frame #12: /home/Raman/lammps-mace-cpu/lammps/build/lmp() [0x491bf8]
frame #13: __libc_start_main + 0xf5 (0x7fa3a6191555 in /lib64/libc.so.6)

@wcwitt
Collaborator

wcwitt commented Nov 14, 2023

It's hard to say much without seeing your LAMMPS input. But my first guess would be to try atom_style atomic.
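For context, here is a minimal sketch of what a CPU-only MACE input might look like, with `atom_style atomic` as suggested. The data file name, element symbols, and thermostat settings are placeholders; the model file name is taken from the log above. Note that the `errno 2` (file not found) in the trace means LAMMPS could not open the model file, so the path in `pair_coeff` must be valid relative to the directory where `lmp` is launched.

```
# Hypothetical minimal LAMMPS input for the MACE pair style (CPU).
units         metal
atom_style    atomic          # suggested above; the MACE pair style does not need bonds
boundary      p p p

read_data     system.data     # placeholder data file name

# Model path must resolve from the working directory, or be absolute.
pair_style    mace
pair_coeff    * * MACE_model_run-123_swa.model-lammps.pt O H   # element symbols are placeholders

timestep      0.0005
fix           1 all nvt temp 300.0 300.0 0.1
run           1000
```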

@Amitcuhp
Author

Sorry sir, it is working fine now. I have retrained using the develop branch and it works.

For now I am using the CPU version.

The GPU version is not installing. I don't know how to solve it yet, but I will keep trying.
Could you comment, sir, on which versions of gcc, CUDA, and cuDNN should be used for the GPU installation?
I have tried with CUDA 11.0.2, gcc 8.5, and cuDNN 8.0 for CUDA 11.0, but the build fails with some undefined references to GLIBC_2.27.
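(Editor's note: an undefined reference to `GLIBC_2.27` usually means a prebuilt binary, often the downloaded libtorch, was built against a newer glibc than the host provides; older enterprise distributions such as CentOS/RHEL 7 ship glibc 2.17. A quick diagnostic sketch, assuming a glibc-based Linux host; the libtorch path is copied from the trace above and may need adjusting:)

```shell
# Report the host glibc and compiler versions.
ldd --version | head -n1                 # host glibc version
gcc --version | head -n1                 # compiler version
command -v nvcc >/dev/null && nvcc --version | tail -n1 || true   # CUDA toolkit, if present

# Which glibc symbol versions does libtorch require? (path from the trace above)
lib=/home/Raman/lammps-mace-cpu/libtorch/lib/libtorch_cpu.so
[ -f "$lib" ] && strings "$lib" | grep -o 'GLIBC_2\.[0-9]*' | sort -Vu | tail -n3 || true
```

If libtorch demands a symbol version newer than the host glibc (e.g. `GLIBC_2.27` on a glibc 2.17 system), the fix is to download a libtorch build matching the host, or to build on a newer distribution, rather than to change gcc/CUDA versions.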

@wcwitt
Collaborator

wcwitt commented Nov 20, 2023

I'm not certain, but if I had to guess I would say those versions are sufficient. Do you have a system administrator who can help with building code?

@Amitcuhp
Author

They have tried, but they were not able to compile the GPU version.
