
XSEDE intro (by Y.K. Liu)


Login

XSEDE requires users to log in with Multi-Factor Authentication (MFA) via the Duo app (see the XSEDE documentation for details). In practice, you need a phone to log in.

  • If you are using Xshell, do not configure the account and password in the new connection; just set the host to login.xsede.org.
  • You can also use Xshell's local terminal by typing the command below and following the instructions at https://portal.xsede.org/web/xup/single-sign-on-hub

ssh -l XUPusername login.xsede.org

File transfer

GUI

Use Globus, which has a GUI. Instructions: https://portal.xsede.org/software/globus

Modules

Find module

  spider -r           '^p'          Finds all the modules that start with `p' or `P'
  spider -r           mpi           Finds all modules that have "mpi" in their name.
  spider -r           'mpi$'        Finds all modules that end with "mpi" in their name.
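
For example, to look for MPI- or GROMACS-related modules (each command above is a subcommand of module; the search patterns here are just examples):

module spider -r '^gromacs'   # modules whose names start with "gromacs"
module spider -r mpi          # modules with "mpi" anywhere in the name
module spider -r 'mpi$'       # modules whose names end with "mpi"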

Load module

  load | add          module [...]  load module(s)
  try-load | try-add  module [...]  Add module(s), do not complain if not found
  del | unload        module [...]  Remove module(s), do not complain if not found

  list                              List loaded modules
  list                s1 s2 ...     List loaded modules that match the pattern

Save and load

  save | s                          Save the current list of modules to a user defined "default" collection.
  save | s            name          Save the current list of modules to "name" collection.
  reset                             The same as "restore system"
  restore | r                       Restore modules from the user's "default" or system default.
  restore | r         name          Restore modules from "name" collection.
  restore             system        Restore module state to system defaults.
  savelist                          List of saved collections.
  describe | mcc      name          Describe the contents of a module collection.
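
A typical session combining these commands; the collection name gromacs-build is just an example, and the module names come from the build recipe later on this page:

module load GSL/2.1 CUDA/7.5.18 Boost/1.59.0-Python-2.7.10   # load what the build needs
module list                                                  # confirm what is loaded
module save gromacs-build                                    # save this set under a name
module restore gromacs-build                                 # bring it back in a later session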

File system on XStream

Home file system

Each user on XStream has a home directory referenced by $HOME with a quota limit of 5GB (not purged). It is a small and low performance NFS storage space used to keep scripts, binaries, source files, small log files, etc.

The $HOME filesystem is accessible from any node in the system.

The $HOME directory is not intended to be used for computation. The Lustre parallel file system $WORK is much larger and faster, thus much more suited for computation.

Each project has a shared home directory referenced by $GROUP_HOME. Like $HOME, it is an NFS storage space used to store small files shared by all members of your primary POSIX group (usually your primary project).

Note: in $GROUP_HOME, only the owner of the files can delete them.

Important note on home backups: XStream itself does not provide a backup service; however, user and group home directories are backed up every night by Stanford Research Computing. Contact the XSEDE helpdesk to recover any lost files. We also recommend that you periodically back up your files outside of XStream.
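
A quick way to check how close you are to the 5 GB limit (plain du; the exact quota-reporting tool on XStream may differ):

du -sh $HOME          # current usage of your home directory
du -sh $GROUP_HOME    # usage of the shared project home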

Work file system

Work is a Lustre file system mounted on /cstor on any node in the system. This parallel file system has multiple purposes:

  • perform fast large I/Os
  • store large computational data files
  • allow multi-node jobs to write coherent files

Each user has a work directory referenced by $WORK with a quota limit of 1 TB, which is not purged. Each project has a shared work directory referenced by $GROUP_WORK on the same file system, with a group quota limit of 50 TB. This space is shared by all members of the project.

Note: in $GROUP_WORK, only the owner of the files can delete them. User and group quota values are not cumulative, i.e. the first limit reached takes precedence.

Important note on work backups: the parallel work file system is neither replicated nor backed up.
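
To see how much of the 1 TB user quota or 50 TB group quota you are using, the standard Lustre quota command should work (a hedged sketch; /cstor is the mount point given above):

lfs quota -h -u $USER /cstor       # per-user usage and limits on the work file system
lfs quota -h -g $(id -gn) /cstor   # usage for your primary POSIX group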

Local scratch

A local SSD-based scratch space is available on each compute node (NOT on login nodes). It is made of 3 x Intel SSD (MLC) aggregated using Linux dm-raid for a total of 480 GB per node (447 GB usable) and intended for high IOPS local workload.

To access this local scratch space, please use the $LSTOR or $TMPDIR environment variables. This space will be purged when the compute node reboots or when this space becomes full.
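
A common pattern is to stage inputs onto the local scratch inside a job, compute there, and copy results back to $WORK before the job ends, since the scratch is purged. A sketch (directory and file names are placeholders):

cp $WORK/myjob/input.tpr $TMPDIR/             # stage input to the node-local SSD
cd $TMPDIR
gmx_mpi mdrun -s input.tpr                    # run with high-IOPS local I/O
cp $TMPDIR/*.log $TMPDIR/*.edr $WORK/myjob/   # copy results back before the node is reclaimed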

Running jobs

See the running-jobs section of the XStream user guide: https://portal.xsede.org/stanford-xstream#running
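
A minimal sketch of a batch script, assuming XStream uses SLURM as described in the guide linked above; the job name, time limit, and GPU request syntax are placeholders to adapt from that page:

#!/bin/bash
#SBATCH --job-name=gmx-meta      # placeholder job name
#SBATCH --time=02:00:00          # wall-clock limit
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1      # one MPI rank
#SBATCH --gres=gpu:1             # request one GPU (check the guide for XStream's exact syntax)

cd $WORK/myrun                   # run from the parallel file system, not $HOME
srun gmx_mpi mdrun -plumed       # launch through the scheduler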

Install PLUMED and GROMACS

Upload via Globus

Download the source archives from the PLUMED and GROMACS websites, and beware of version compatibility: GROMACS has 2017.x releases, but PLUMED only supports patching some of the 5.x versions (type plumed patch -p in a terminal to see which). Upload the files to XStream using Globus.

Load modules

Currently Loaded Modules:

  1. GNU/4.9.2-2.25
  2. icc/2015.5.223-GNU-4.9.2-2.25
  3. ifort/2015.5.223-GNU-4.9.2-2.25
  4. impi/5.0.3.049
  5. imkl/11.2.4.223
  6. GSL/2.1
  7. libtool/2.4.2
  8. libunistring/0.9.3
  9. pkg-config/0.27.1
  10. libffi/3.0.13
  11. Guile/1.8.8
  12. byacc/20150711
  13. libmatheval/1.1.11
  14. Tcl/8.6.4
  15. SQLite/3.8.10.2
  16. Tk/8.6.4-no-X11
  17. Python/2.7.10
  18. Boost/1.59.0-Python-2.7.10
  19. CUDA/7.5.18
  20. GCC/4.9.2-binutils-2.25
  21. zlib/1.2.8
  22. flex/2.5.39
  23. Bison/3.0.4
  24. GMP/6.0.0a
  25. ncurses/5.9
  26. libreadline/6.3
  27. bzip2/1.0.6
  28. binutils/2.25
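
A sketch of reproducing roughly this environment; Lmod resolves most dependencies automatically, so loading the top-level modules from the list above is usually enough:

module load icc/2015.5.223-GNU-4.9.2-2.25 ifort/2015.5.223-GNU-4.9.2-2.25
module load impi/5.0.3.049 imkl/11.2.4.223
module load GSL/2.1 libmatheval/1.1.11 Boost/1.59.0-Python-2.7.10 CUDA/7.5.18
module list   # compare against the list above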

Build PLUMED, then patch and build GROMACS

tar zxvf gromacs-5.1.4.tar.gz
tar zxvf plumed-2.3.2.tgz
# below build plumed
cd plumed-2.3.2
./configure
make -j 4
make install prefix=$HOME/opt
# (still inside plumed-2.3.2)
# below set ENV
source sourceme.sh
# below patch GROMACS
cd
cd gromacs-5.1.4
plumed patch -p
# below build patched GROMACS
mkdir build
cd build
cmake .. -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_GPU=on -DGMX_MPI=on -DCMAKE_INSTALL_PREFIX=$HOME/opt
make && make check
make install
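
After make install, both packages live under $HOME/opt. A sketch of making them visible in future shells (GMXRC is the environment script that GROMACS installs into its bin directory; the library path line is there so the PLUMED kernel library can be found):

# add to ~/.bashrc (paths match the prefixes used above)
export PATH=$HOME/opt/bin:$PATH                         # plumed, gmx_mpi
export LD_LIBRARY_PATH=$HOME/opt/lib:$LD_LIBRARY_PATH   # libplumed / libplumedKernel
source $HOME/opt/bin/GMXRC                              # GROMACS environment variables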

Result of make check

      Start  1: TestUtilsUnitTests
 1/26 Test  #1: TestUtilsUnitTests ...............   Passed    0.12 sec
      Start  2: GmxlibTests
 2/26 Test  #2: GmxlibTests ......................   Passed    0.03 sec
      Start  3: MdlibUnitTest
 3/26 Test  #3: MdlibUnitTest ....................   Passed    0.03 sec
      Start  4: CommandLineUnitTests
 4/26 Test  #4: CommandLineUnitTests .............   Passed    0.27 sec
      Start  5: FFTUnitTests
 5/26 Test  #5: FFTUnitTests .....................   Passed    0.13 sec
      Start  6: MathUnitTests
 6/26 Test  #6: MathUnitTests ....................   Passed    0.03 sec
      Start  7: RandomUnitTests
 7/26 Test  #7: RandomUnitTests ..................   Passed    0.05 sec
      Start  8: OnlineHelpUnitTests
 8/26 Test  #8: OnlineHelpUnitTests ..............   Passed    1.87 sec
      Start  9: OptionsUnitTests
 9/26 Test  #9: OptionsUnitTests .................   Passed    0.04 sec
      Start 10: UtilityUnitTests
10/26 Test #10: UtilityUnitTests .................   Passed    0.06 sec
      Start 11: FileIOTests
11/26 Test #11: FileIOTests ......................   Passed    0.04 sec
      Start 12: SimdUnitTests
12/26 Test #12: SimdUnitTests ....................   Passed    0.10 sec
      Start 13: LegacyToolsTests
13/26 Test #13: LegacyToolsTests .................   Passed    0.24 sec
      Start 14: GmxPreprocessTests
14/26 Test #14: GmxPreprocessTests ...............   Passed    0.30 sec
      Start 15: CorrelationsTest
15/26 Test #15: CorrelationsTest .................   Passed    0.46 sec
      Start 16: AnalysisDataUnitTests
16/26 Test #16: AnalysisDataUnitTests ............   Passed    0.11 sec
      Start 17: SelectionUnitTests
17/26 Test #17: SelectionUnitTests ...............   Passed    0.33 sec
      Start 18: TrajectoryAnalysisUnitTests
18/26 Test #18: TrajectoryAnalysisUnitTests ......   Passed    0.73 sec
      Start 19: MdrunTests
19/26 Test #19: MdrunTests .......................   Passed   26.55 sec
      Start 20: MdrunMpiTests
20/26 Test #20: MdrunMpiTests ....................***Failed    0.31 sec
mpiexec_xstream-ln01.stanford.edu: cannot connect to local mpd (/tmp/xs-ykliu/mpd2.console_xstream-ln01.stanford.edu_xs-ykliu); possible causes:
  1. no mpd is running on this host
  2. an mpd is running but was started without a "console" (-n option)

      Start 21: regressiontests/simple
21/26 Test #21: regressiontests/simple ...........   Passed   14.09 sec
      Start 22: regressiontests/complex
22/26 Test #22: regressiontests/complex ..........   Passed   37.69 sec
      Start 23: regressiontests/kernel
23/26 Test #23: regressiontests/kernel ...........   Passed   65.32 sec
      Start 24: regressiontests/freeenergy
24/26 Test #24: regressiontests/freeenergy .......   Passed   10.03 sec
      Start 25: regressiontests/pdb2gmx
25/26 Test #25: regressiontests/pdb2gmx ..........   Passed   62.60 sec
      Start 26: regressiontests/rotation
26/26 Test #26: regressiontests/rotation .........   Passed   35.81 sec

96% tests passed, 1 tests failed out of 26

Label Time Summary:
GTest                 =   4.69 sec
IntegrationTest       =  26.79 sec
MpiIntegrationTest    =   0.31 sec
UnitTest              =   4.69 sec

Total Test time (real) = 257.40 sec

Although MdrunMpiTests failed, the error above points to the MPI launcher on the login node (no mpd running) rather than to the build itself, and XStream is not a CPU cluster, so this probably does not matter very much.

Test run

A metadynamics run from the PLUMED tutorial. Remember to copy the files from the TOPO folder into the Excercise_01 folder.

gmx_mpi mdrun -plumed
                   :-) GROMACS - gmx mdrun, VERSION 5.1.4 (-:

                            GROMACS is written by:
     Emile Apol      Rossen Apostolov  Herman J.C. Berendsen    Par Bjelkmar
 Aldert van Buuren   Rudi van Drunen     Anton Feenstra   Sebastian Fritsch
  Gerrit Groenhof   Christoph Junghans   Anca Hamuraru    Vincent Hindriksen
 Dimitrios Karkoulis    Peter Kasson        Jiri Kraus      Carsten Kutzner
    Per Larsson      Justin A. Lemkul   Magnus Lundborg   Pieter Meulenhoff
   Erik Marklund      Teemu Murtola       Szilard Pall       Sander Pronk
   Roland Schulz     Alexey Shvetsov     Michael Shirts     Alfons Sijbers
   Peter Tieleman    Teemu Virolainen  Christian Wennberg    Maarten Wolf
                           and the project leaders:
        Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel

Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2015, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.

GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.

GROMACS:      gmx mdrun, VERSION 5.1.4
Executable:   /home/xsede/users/xs-ykliu/opt/bin/gmx_mpi
Data prefix:  /home/xsede/users/xs-ykliu/opt
Command line:
  gmx_mpi mdrun -plumed


Back Off! I just backed up md.log to ./#md.log.1#

Running on 1 node with total 12 cores, 12 logical cores, 1 compatible GPU
Hardware detected on host xstream-ln02.stanford.edu (the node of MPI rank 0):
  CPU info:
    Vendor: GenuineIntel
    Brand:  Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz
    SIMD instructions most likely to fit this hardware: AVX_256
    SIMD instructions selected at GROMACS compile time: AVX_256
  GPU info:
    Number of GPUs detected: 1
    #0: NVIDIA Tesla K80, compute cap.: 3.7, ECC: yes, stat: compatible

Reading file topol.tpr, VERSION 4.6.5 (single precision)
Note: file tpx version 83, software tpx version 103

NOTE: GPU(s) found, but the current simulation can not use GPUs
      To use a GPU, set the mdp option: cutoff-scheme = Verlet

Using 1 MPI process

1 compatible GPU detected in the system, but none will be used.
Consider trying GPU acceleration with the Verlet scheme!


NOTE: This file uses the deprecated 'group' cutoff_scheme. This will be
removed in a future release when 'verlet' supports all interaction forms.


Back Off! I just backed up traj.trr to ./#traj.trr.1#

Back Off! I just backed up ener.edr to ./#ener.edr.1#
starting mdrun 'Generated by trjconv : Alanine in vacuum in water t=   0.00000'
2500000 steps,   5000.0 ps.

Writing final coordinates.

Back Off! I just backed up confout.gro to ./#confout.gro.1#

               Core t (s)   Wall t (s)        (%)
       Time:      760.734      762.083       99.8
                 (ns/day)    (hour/ns)
Performance:      566.868        0.042

gcq#218: "Wild Pointers Couldn't Drag Me Away" (K.A. Feenstra)