ORNL Summit : execution error test w150.inp with GPUs #53

Open
5 tasks done
hrmoncada opened this issue Jun 23, 2021 · 0 comments

hrmoncada commented Jun 23, 2021

  • Summary of the issue and expected results.
    ORNL Summit error: I was able to install GAMESS on Summit and build gamess.00.x.
    To test my installation, I ran a small job. It reported errors, even though I got results. Could anyone explain what these errors mean?

  • Description of the run environment:

  • Module list
login2:gamess_dev_test_1$ module list
Currently Loaded Modules:
  1) hsi/5.0.2.p5    7) python/3.6.6-anaconda3-5.3.0    13) hdf5/1.10.4
  2) xalt/1.2.1      8) spectrum-mpi/10.3.1.2-20200121  14) cmake/3.18.2
  3) lsf-tools/2.0   9) essl/6.2.0-20190419             15) git/2.20.1
  4) DefApps        10) fftw/3.3.8                      16) papi/5.7.0
  5) gcc/6.4.0      11) boost/1.66.0                    17) tau/2.29.1
  6) cuda/10.1.243  12) nsight-compute/2021.1.0         18) hpctoolkit/2021.03.01
  • Request an allocation
login2:gamess_dev_test_1$ bsub -nnodes 1 -W 1:00 -P CHM135 -Is /bin/bash
  • Submit a job using one node and 2 GPUs
bash-4.2$ ./rungms-dev w150.inp 00 2
source /autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1/install.info
setenv GMS_PATH /autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1
setenv GMS_BUILD_DIR /autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1
setenv GMS_TARGET ibm64
setenv GMS_HPC_SYSTEM_TARGET summit
setenv GMS_FORTRAN gfortran
setenv GMS_GFORTRAN_VERNO 6.4
setenv GMS_XL_PATH /sw/summit/xl/16.1.1-5
setenv GMS_MATHLIB essl
setenv GMS_MATHLIB_PATH /sw/summit/essl/6.2.0-20190419/essl/6.2
setenv GMS_DDI_COMM mpi
setenv GMS_MPI_LIB spectrum
setenv GMS_MPI_PATH /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb
setenv GMS_MSUCC false
setenv GMS_LIBCCHEM true
setenv GMS_LIBCCHEM_GPU_SUPPORT true
setenv GMS_CCHEM_HF false
setenv GMS_CCHEM_MP2 false
setenv GMS_CCHEM_RI false
setenv GMS_CCHEM_CC false
setenv GMS_HDF5_PATH /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/hdf5-1.10.4-qj7qoumocmjj2xkwbwotazo22sugeq5o
setenv GMS_CUDA_BOARD volta
setenv GMS_CUDA_PATH /sw/summit/cuda/10.1.243
setenv GMS_LIBCCHEM_LIBINT false
setenv GMS_LIBCCHEM_LIBS -ldl
setenv GMS_EIGEN_PATH /gpfs/alpine/chm135/proj-shared/hmoncada/eigen/build/include/eigen3
setenv GMS_BOOST_PATH /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/boost-1.66.0-l6qu5nrynyuksm4ex7gswb6rsvvhsj42
setenv GMS_GA_PATH /gpfs/alpine/chm135/proj-shared/hmoncada/ga/build
setenv GMS_PHI none
setenv GMS_SHMTYPE sysv
setenv GMS_OPENMP false
setenv GMS_LIBXC false
setenv GMS_FPE_FLAGS 
unset echo
LSF has assigned the following compute nodes to this run:
g33n12
Input file supplied : w150.inp
----- GAMESS execution script 'rungms.interactive' -----
This job is running on host batch3
under operating system Linux at Wed Jun 23 14:25:40 EDT 2021 with ga communication mode
Available scratch disk space (Kbyte units) at beginning of the job is
Filesystem           1K-blocks           Used       Available Use% Mounted on
alpine         242309426315264 99195279228928 143114147086336  41% /gpfs/alpine
GAMESS temporary binary files will be written to /gpfs/alpine/scratch/hmoncada/chm135/1105314
GAMESS supplementary output files will be written to /gpfs/alpine/scratch/hmoncada/chm135
removed '/gpfs/alpine/scratch/hmoncada/chm135/1105314/w150.F05'
removed '/gpfs/alpine/scratch/hmoncada/chm135/1105314/w150.nodes.mpd'
removed '/gpfs/alpine/scratch/hmoncada/chm135/1105314/w150.processes.mpd'
Copying input file w150.inp to your run's scratch directory...
cp w150.inp /gpfs/alpine/scratch/hmoncada/chm135/1105314/w150.F05
unset echo
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.casino': No such file or directory
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.dmn': No such file or directory
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.cim': No such file or directory
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.cosmo': No such file or directory
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.pot': No such file or directory
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.gamma': No such file or directory
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.efp': No such file or directory
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.dip': No such file or directory
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.hs1': No such file or directory
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.hs2': No such file or directory
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.dat': No such file or directory
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.qmw': No such file or directory
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.rst': No such file or directory
rm: cannot remove '/gpfs/alpine/scratch/hmoncada/chm135/w150.trj': No such file or directory
-----debug----
HOSTFILE /gpfs/alpine/scratch/hmoncada/chm135/1105314/w150.nodes.mpd contains
batch3
--------------
-----debug----
PROCFILE /gpfs/alpine/scratch/hmoncada/chm135/1105314/w150.processes.mpd contains
--------------
-----debug libcchem----
the execution path is
/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/bin /opt/ibm/spectrumcomputing/lsf/10.1.0.9/linux3.10-glibc2.17-ppc64le-csm/bin /sw/sources/lsf-tools/2.0/summit/bin /sw/summit/xalt/1.2.1/bin /sw/summit/tau/2.29.1/ibm64linux/bin /autofs/nccs-svm1_sw/summit/.swci/0-core/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/papi-5.7.0-7ycgvcx5n3zk2h4dz3koxjzl72exp2le/bin /autofs/nccs-svm1_sw/summit/.swci/0-core/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/git-2.20.1-6zhngdgjqjq4qhp5lxfz6czu3qc2b5lh/bin /autofs/nccs-svm1_sw/summit/.swci/0-core/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/cmake-3.18.2-cirtl5oah4d6bequfcoji6jbetertrna/bin /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/hdf5-1.10.4-qj7qoumocmjj2xkwbwotazo22sugeq5o/bin /sw/summit/nsight-compute/2021.1.0 /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/fftw-3.3.8-7p6nny234obzuczdtyyea4t4uxej777s/bin /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/bin /sw/summit/python/3.6/anaconda3/5.3.0/bin /sw/summit/cuda/10.1.243/bin /sw/summit/gcc/6.4.0/bin /sw/sources/hpss/bin /opt/ibm/spectrumcomputing/lsf/10.1.0.9/linux3.10-glibc2.17-ppc64le-csm/etc /opt/ibm/csm/bin /usr/local/bin /usr/bin /usr/local/sbin /usr/sbin /opt/ibm/flightlog/bin /opt/ibutils/bin /opt/ibm/spectrum_mpi/jsm_pmix/bin /opt/puppetlabs/bin /usr/lpp/mmfs/bin /opt/puppetlabs/bin /usr/lpp/mmfs/bin
the library path is
/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/hdf5-1.10.4-qj7qoumocmjj2xkwbwotazo22sugeq5o/lib:/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/boost-1.66.0-l6qu5nrynyuksm4ex7gswb6rsvvhsj42/lib:/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/lib:/sw/summit/cuda/10.1.243/lib64:/sw/summit/cuda/10.1.243/lib:/autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1/libcchem/lib:/opt/ibm/spectrumcomputing/lsf/10.1.0.9/linux3.10-glibc2.17-ppc64le-csm/lib:/autofs/nccs-svm1_sw/summit/.swci/0-core/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/papi-5.7.0-7ycgvcx5n3zk2h4dz3koxjzl72exp2le/lib:/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/hdf5-1.10.4-qj7qoumocmjj2xkwbwotazo22sugeq5o/lib:/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/boost-1.66.0-l6qu5nrynyuksm4ex7gswb6rsvvhsj42/lib:/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/fftw-3.3.8-7p6nny234obzuczdtyyea4t4uxej777s/lib:/sw/summit/essl/6.2.0-20190419/essl/6.2/lib64:/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/lib:/sw/summit/cuda/10.1.243/lib64:/sw/summit/gcc/6.4.0/lib64:/sw/summit/xl/16.1.1-3/lib:/opt/ibm/spectrum_mpi/jsm_pmix/lib
The dynamically linked libraries for this binary are
/autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1/gamess.cchem.00.x :

ldd: /autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1/gamess.cchem.00.x: No such file or directory
--------------
MPI kickoff will start GAMESS on 2 cores in 1 nodes.
LIBCCHEM will generate threads on all other cores in each node.
LIBCCHEM will run threads on 1 GPUs per node.
LIBCCHEM's control setting for CCHEM is devices=0;memory=50g
The binary to be executed is /autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1/gamess.cchem.00.x
The scratch disk space on each node is /gpfs/alpine/scratch/hmoncada/chm135/1105314, with free space
Filesystem           1K-blocks           Used       Available Use% Mounted on
alpine         242309426315264 99195279228928 143114147086336  41% /gpfs/alpine
jsrun -n 1 -c 2 -a 2 -g 1 /autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1/gamess.cchem.00.x
Error (No such file or directory) executing process: /autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1/gamess.cchem.00.x
Error (No such file or directory) executing process: /autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1/gamess.cchem.00.x
  • Output: the test results are written to scratch. Here is the output folder:
 login5:1105314$ ll

total 35K
drwxrwxr-x 2 hmoncada hmoncada 4.0K Jun 23 14:25 .
drwx------ 8 hmoncada hmoncada 4.0K Jun 23 14:21 ..
-rw-rw-r-- 1 hmoncada hmoncada  25K Jun 23 14:25 w150.F05
-rw-rw-r-- 1 hmoncada hmoncada    7 Jun 23 14:25 w150.nodes.mpd
-rw-rw-r-- 1 hmoncada hmoncada    0 Jun 23 14:25 w150.processes.mpd
 login5:1105314$ vi w150.F05
$contrl scftyp=rhf  runtyp=energy icharg=0 ispher=1 maxit=30 $end
 $contrl nprint=-5 $end
 $system mwords=300 memddi=300 $end
 $basis  gbasis=pcSeg-0 $end
 $scf    dirscf=.true. $end
 $data
cyclic AMP...PM3 geometry...PM3 E= -148.9399327109, R M S= 0.0000063
C1
OXYGEN  8.0   -4.3996600    1.0763700    7.7009100
  HYDROGEN  1.0   -3.9596600    0.2663700    7.3109100
  HYDROGEN  1.0   -4.4296600    0.9963700    8.7009100
  OXYGEN  8.0   -6.2496600    3.4563700    1.8409100
  HYDROGEN  1.0   -6.8696600    3.8863700    1.1809100
  HYDROGEN  1.0   -5.3896600    3.9663700    1.8709100
  OXYGEN  8.0   -1.7196600    3.2263700    6.2709100
  HYDROGEN  1.0   -1.3296600    2.5663700    5.6309100
  HYDROGEN  1.0   -2.0796600    4.0063700    5.7709100
  OXYGEN  8.0  -17.8096600    2.7463700    6.2009100
  HYDROGEN  1.0  -18.6296600    2.3363700    5.7909100
  HYDROGEN  1.0  -18.0696600    3.2963700    6.9909100
  OXYGEN  8.0    1.2103400    5.6563700   -1.3290900
  HYDROGEN  1.0    1.9103400    6.1963700   -1.7890900
  HYDROGEN  1.0    1.6303400    4.9263700   -0.7990900
  OXYGEN  8.0  -14.9396600   -1.9436300   -5.4990900
  HYDROGEN  1.0  -14.7196600   -2.8136300   -5.0490900
  HYDROGEN  1.0  -14.4896600   -1.1936300   -5.0090900
  OXYGEN  8.0   -5.3996600    9.8463700   -2.7090900
  HYDROGEN  1.0   -5.6996600   10.6463700   -3.2190900
  HYDROGEN  1.0   -5.8396600    9.8363700   -1.8090900
  OXYGEN  8.0   -6.3496600    5.5263700   -6.1090900
.
.
.
  OXYGEN  8.0    7.5603400   11.8163700    3.9509100
  HYDROGEN  1.0    7.6703400   12.5663700    4.6009100
  HYDROGEN  1.0    7.0603400   12.1463700    3.1509100
  OXYGEN  8.0    2.9703400   12.1063700  -10.1190900
  HYDROGEN  1.0    2.4503400   12.8963700   -9.7790900
  HYDROGEN  1.0    3.3303400   11.5863700   -9.3390900
  OXYGEN  8.0    8.0203400  -12.1036300   -6.5290900
  HYDROGEN  1.0    7.6303400  -12.0136300   -5.6090900
  HYDROGEN  1.0    8.1703400  -13.0636300   -6.7290900
  OXYGEN  8.0    3.1903400    9.0563700   -3.4090900
  HYDROGEN  1.0    4.1003400    8.9163700   -3.7990900
  HYDROGEN  1.0    2.5003400    8.7263700   -4.0490900
  OXYGEN  8.0    7.3103400  -12.4236300   -3.5890900
  HYDROGEN  1.0    6.5903400  -11.7636300   -3.4090900
  HYDROGEN  1.0    6.9603400  -13.3536300   -3.4290900
  OXYGEN  8.0  -11.0696600    3.1363700    9.8309100
  HYDROGEN  1.0  -11.7696600    2.6863700    9.2809100
  HYDROGEN  1.0  -10.7296600    3.9463700    9.3509100
  OXYGEN  8.0    4.0703400   -0.0936300   12.4509100
  HYDROGEN  1.0    3.9103400    0.2063700   13.3909100
  HYDROGEN  1.0    3.6903400   -1.0036300   12.3109100
$end
  • Error (copied from the run output above):
Filesystem           1K-blocks           Used       Available Use% Mounted on
alpine         242309426315264 99195279228928 143114147086336  41% /gpfs/alpine
jsrun -n 1 -c 2 -a 2 -g 1 /autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1/gamess.cchem.00.x
Error (No such file or directory) executing process: /autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1/gamess.cchem.00.x
Error (No such file or directory) executing process: /autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1/gamess.cchem.00.x 
  • It looks like I need to build the file gamess.cchem.00.x. How do I build it, or where can I find information about it?
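
A quick way to confirm which binaries actually exist in the build directory is sketched below. This is only an illustration: `check_gamess_binary` is a made-up helper name, not part of GAMESS or rungms-dev, and the directory and binary name passed to it are assumed to be the ones shown in the log above.

```shell
#!/bin/bash
# Hypothetical helper (not part of GAMESS): report whether the binary that
# the run script tries to launch exists, and list the binaries that do.
check_gamess_binary() {
    local gms_path="$1"   # build directory (GMS_PATH in install.info)
    local binary="$2"     # expected binary name, e.g. gamess.cchem.00.x
    if [ -x "$gms_path/$binary" ]; then
        echo "found: $gms_path/$binary"
        return 0
    fi
    echo "missing: $gms_path/$binary"
    # Show whichever gamess.*.x binaries were actually built, if any.
    ls "$gms_path"/gamess.*.x 2>/dev/null
    return 1
}
```

It could be called as `check_gamess_binary /autofs/nccs-svm1_home1/hmoncada/repos_GAMESS/GAMESS/gamess_dev_test_1 gamess.cchem.00.x`. If only gamess.00.x is listed, the libcchem-enabled binary was never linked, which would match the "No such file or directory" messages from ldd and jsrun.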