Much of the information below has been extracted from information available on the web at MHPCC or from the IBM SP man pages by Jeff Nichols (ja_nichols@pnl.gov).
For NWChem support mail nwchem-support@emsl.pnl.gov or visit the NWChem homepage.
MHPCC uses LoadLeveler for scheduling batch use of the machine. You log on to one of the SP2 interactive nodes (tsunami.sp2.mhpcc.edu) and from there launch interactive (POE) or batch (LoadLeveler) parallel jobs.
You need to know about just a few facts and commands to get going.
Each node of the SP is a Power2 CPU with varying amounts of physical memory and local scratch disk (named /localscratch); see the table below. The O/S and I/O buffers consume about 17 MB (estimate), and the NWChem executable is about 7 MB. MHPCC provides temporary disk space for all users in two locations:
You should note that MHPCC will regularly remove "old" files from the temporary directories. File removal is based upon the last time the file was used/accessed. The schedule for when files are removed is subject to change. Currently, the schedule is:
There are a couple of commands that will give you per-node activity information: "jmstat" and "jm_status -Pv". These commands report activity on the SP (by job and by node). Examples of the information returned by these commands are given below.
fr2n07% jmstat
Job started      Nodes  PID    Title        User
Mar_25_09:49:52      1  17407  LoadLeveler  vnatoli
Mar_25_09:49:54      1  18609  LoadLeveler  tang
Mar_25_09:50:16      1  17805  LoadLeveler  apsteffe
Mar_25_09:50:20      1  21931  LoadLeveler  jjyoh
Mar_25_09:50:27      1  22109  LoadLeveler  apsteffe
Mar_25_09:50:27      1  22407  LoadLeveler  swilke
Mar_25_09:50:54      3  23759  LoadLeveler  kairys
Mar_25_09:51:25      1  20436  LoadLeveler  vnatoli
Mar_25_09:53:35      1  22047  LoadLeveler  apsteffe
Mar_25_09:55:53      1  19344  LoadLeveler  petrisor
Mar_25_09:55:56      1  22998  LoadLeveler  petrisor
Mar_25_10:15:47     16  22545  LoadLeveler  daypn
Mar_25_10:15:47      8  14361  LoadLeveler  ansaria
Mar_25_10:45:32      1  17596  LoadLeveler  mgomez
Mar_25_10:45:32      1  19436  LoadLeveler  sinkovit
Mar_25_10:45:39      1  21947  LoadLeveler  keesh
Mar_25_10:46:04      1  15970  LoadLeveler  mgomez
Mar_25_13:05:44     32  17034  LoadLeveler  rlee
Mar_25_13:25:35      5  18803  LoadLeveler  hyun
Mar_25_14:24:19      8  16199  LoadLeveler  zhong
Mar_25_14:53:32     64  17314  LoadLeveler  calhoun
Mar_25_15:03:04      1  19428  LoadLeveler  gardnerk
Mar_25_15:14:55      8  15820  LoadLeveler  mws
Mar_25_15:15:59      8  16138  LoadLeveler  ansaria
Mar_25_15:25:13     16  21248  LoadLeveler  bogusz
Mar_25_15:33:43      1  20652  LoadLeveler  gardnerk
Mar_25_15:35:45      3  19149  LoadLeveler  kairys
fr2n07% jm_status -Pv
Pool 0: Free_for_all_pool
  Subpool: GENERAL
    Node: fr1n05.mhpcc.edu
    Node: fr1n06.mhpcc.edu
    Node: fr1n07.mhpcc.edu
    Node: fr1n08.mhpcc.edu
    Node: fr1n09.mhpcc.edu
    Node: fr1n10.mhpcc.edu
    Node: fr1n11.mhpcc.edu
    Node: fr1n12.mhpcc.edu
    Node: fr1n13.mhpcc.edu
    Node: fr1n14.mhpcc.edu
    Node: fr1n15.mhpcc.edu
    Node: fr1n16.mhpcc.edu
    Node: fr2install1.mhpcc.edu
    Node: fr2n04.mhpcc.edu
    Node: fr2n05.mhpcc.edu
    Node: fr2n06.mhpcc.edu
    Node: fr2n07.mhpcc.edu
    Node: fr2n08.mhpcc.edu
    Node: fr2n09.mhpcc.edu
    Node: fr2n10.mhpcc.edu
    Node: fr2n11.mhpcc.edu
    Node: fr2n12.mhpcc.edu
    Node: fr2n13.mhpcc.edu
    Node: fr2n14.mhpcc.edu
    Node: fr2n15.mhpcc.edu
    Node: fr2n16.mhpcc.edu
Pool 1: LoadLeveler
  Subpool: BATCH
    Node: fr3install1.mhpcc.edu
      Job 108:
        time_allocated=Mon_Mar_25_10:45:32_1996
        description=LoadLeveler
        requestor=sinkovit
        requestor_pid=19436
        requestor_node=fr3n01.mhpcc.edu
        Adapter type=ETHERNET
        Usage: cpu=SHARED adapter=SHARED
        virtual task ids: 0
    Node: fr3n02.mhpcc.edu
    ...
The configuration of the MHPCC SP2 as of 22 March 1996
----------------------------------------------------------------------------
                      NODE   MEM  LOCAL-    MIN   MAX   TIME
CLASS/USE    #NODES   TYPE    MB  SCRATCH  PROC  PROC  LIMIT  FRAMES
                                       GB
----------------------------------------------------------------------------
bigmem            5   wide  1024     2.0      1     5   8 hr  28 (n07-n15)
----------------------------------------------------------------------------
large           128   thin   128     1.0     64   128   8 hr  5,6,19,20,23,24,25,26
                  1   thin   128     1.0     64   128   8 hr  18
----------------------------------------------------------------------------
medium           32   wide   256     2.0      8    64   4 hr  16,21,22,27
                 48   thin    64     .25      8    64   4 hr  10,11,12
----------------------------------------------------------------------------
long              8   wide   256     2.0      1    32  24 hr  15
                 15   thin   128     1.0      1    32  24 hr  18
                 16   thin    64     .25      1    32  24 hr  9
                  8   thin    64     .25      1    32  24 hr  3 (n01-n08)
----------------------------------------------------------------------------
small_long       16   thin   128     1.0      1     8   8 hr  17
----------------------------------------------------------------------------
small_short      16   thin    64     .25      1     8   2 hr  4
                  8   wide   256     1.0      1     8   2 hr  7
                  8   thin    64     .25      1     8   2 hr  3 (n09-n16)
----------------------------------------------------------------------------
Interactive      27   thin    64     .25    n/a   n/a    n/a  1,2
  Only
----------------------------------------------------------------------------
Staff            16   thin    64     .25    n/a   n/a    n/a  29
  (reserved)
----------------------------------------------------------------------------
Training         16   thin    64     .25    n/a   n/a    n/a  30
  (reserved)
----------------------------------------------------------------------------
Reserved Nodes:
  fr1n01,n02,n03,n04
  fr2n02
  fr8n01,n03,n05,n07,n09,n11,n13,n15
  fr28n01, fr28n03, fr28n05
  fr29n01 - n16
  fr30n01 - n16
  fr13n01,n03,n05,n07,n09,n11,n13,n15
  fr14n01,n03,n05,n07,n09,n11,n13,n15

Node Sharing Among Classes:
  fr17n01-fr17n16 are shared between small_long (primary) and small_short
  fr28n07-fr28n15 are shared between bigmem (primary) and medium
Interactive parallel jobs are executed using IBM's Parallel Operating Environment (POE). This is IBM's environment for developing and running distributed-memory parallel Fortran, C, or C++ programs.
In order to execute a parallel program, you need to:
The POE executables are usually located in the directory /usr/lpp/poe/bin. There may be symbolic links pointing to them from /usr/bin or some other location. This may vary from system to system.
To determine whether the POE executables are in your path, use a command such as "which poe" (csh) or "whence poe" (ksh). If the poe executable cannot be found, you will need to include the directory /usr/lpp/poe/bin in your path. For example:
set path = ($path /usr/lpp/poe/bin)
This can be done by typing the command at the Unix prompt or by adding it to one of your startup files (.cshrc, .profile, or .login).
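For Bourne-style shells (sh or ksh, e.g., in .profile), the equivalent would be:

PATH=$PATH:/usr/lpp/poe/bin; export PATH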
If the POE executables are found with the which or whence command, you do not need to modify your path.
Copy the .rhosts file supplied in this directory to your home directory.
POE includes three compile scripts that automatically link in the necessary POE libraries and then call the native IBM Fortran, C, or C++ compiler (xlf, cc, CC): mpxlf for Fortran, mpcc for C, and mpCC for C++. All three compile scripts accept -ip or -us as options. The -ip flag causes the IP CSS library to be statically bound with the executable; communication during execution will then use the Internet Protocol. The -us flag causes the US CSS library to be statically bound with the executable; this library uses the User Space protocol for dedicated use of the high-performance switch adapter. If neither flag is given, no CSS library is linked at compile time; instead, one is linked dynamically at run time, and which library is linked is determined by the MP_EUILIB environment variable.
Other options include all valid options of the native compiler (xlf, cc, CC). There are numerous compile options available with the IBM Fortran, C, and C++ compilers, many of which can dramatically improve performance; users are advised to consult the IBM documentation (e.g., the man pages) for details.
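For example, a Fortran source file could be compiled against the User Space library, or a C source file against the IP library, as follows (the source and executable names here are only placeholders):

   mpxlf -us -O3 -o myprog myprog.f
   mpcc -ip -O2 -o myprog myprog.c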
There are many environment variables and command line flags that you can set to influence the operation of the PE tools and the execution of parallel programs. A complete discussion and list of the PE environment variables can be found in the IBM AIX Parallel Environment Operation and Use manual. They are also described (in less detail) in the POE man page.
Environment variables may be set on the shell command line or placed within your shell's "dot" files (.cshrc, .profile). Alternatively, they may be put into a file which is "sourced" prior to execution. The relevant variables are found in the accompanying .cshrc (which can be included in your own).
PE environment variables can be overridden by supplying the appropriate flag when the program executable is invoked. See the POE man page for details.
The host list file is not required if you let the Resource Manager allocate the nodes your job uses (MP_HOSTFILE set to NULL or ""); this is the preferred approach. Sophisticated users who wish to do otherwise should consult the appropriate documentation.
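For illustration, the kinds of settings involved look like the following in csh syntax (the values shown are examples only, not requirements):

   setenv MP_EUILIB us           # communication library: us or ip
   setenv MP_HOSTFILE NULL       # let the Resource Manager allocate the nodes
   setenv MP_RMPOOL 0            # Resource Manager pool to draw nodes from
   setenv MP_PROCS 4             # number of parallel tasks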
Once the environment is set up and the executables are created, invoking the executables is relatively easy.
For single program multiple data (SPMD) programs, simply issue the name of the executable, specifying any command line flags that may be required. Command line flags may be used to temporarily override any MP environment variables that have been set. See the POE man page for a complete listing of flags.
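For example, an executable built with one of the mp* compile scripts might be started on four nodes drawn from pool 0 with something like the following (the program name is a placeholder; the -procs and -rmpool flags override the corresponding MP_* environment variables):

   myprog -procs 4 -rmpool 0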
Interactive jobs are straightforward if you use NFS input, output, and data files. If you wish to use the local file systems (which is more efficient), things get a bit more complicated. Perl scripts written by MHPCC staff (Lon Waters) make the interactive job scripts resemble LoadLeveler (batch queueing) scripts; these are discussed in the context of running NWChem after the LoadLeveler discussion below.
POE jobs typically require both CPU and communications adapter resources. The manner in which a job uses these two resources affects both job performance and whether or not other users can run jobs on the same node.
CPU Usage: may be either "unique" or "multiple"
Communications Adapter Usage: may be either "shared" or "dedicated"
Best performance is usually obtained running with US communications when CPU use is "unique" and the adapter is "dedicated".
The best "good neighbor" policy is for all POE jobs to run with IP communications using the defaults of CPU use "multiple" and adapter "shared".
For the most part, users do not need to change the default settings for CPU usage and communications adapter usage. One instance where the default might be changed concerns the use of US communications in an interactive pool of nodes shared by many users. In this case, it might be considered "good neighbor" policy to set MP_CPU_USE to multiple so that other IP jobs can at least run as well.
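In csh syntax, such "good neighbor" settings would look like the following (a sketch only):

   setenv MP_EUILIB ip              # IP communications
   setenv MP_CPU_USE multiple       # share the CPUs with other jobs
   setenv MP_ADAPTER_USE shared     # share the communications adapter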
LoadLeveler is a batch job scheduling application and program product of IBM. It provides the facility for building, submitting and processing batch jobs within a network of machines. It attempts to match job requirements with the best available machine resources. It can schedule serial or parallel (PVMe, PVM, MPL, MPI) jobs. It provides a graphical user interface called xloadl for job submission and monitoring.
The entire collection of machines available for LoadLeveler scheduling is called a "pool". Every machine in the pool has one or more LoadLeveler daemons running on it. There is one Central Manager machine for the LoadLeveler pool whose principal function is to coordinate LoadLeveler-related activities on all machines in the pool: maintaining status information on all machines and jobs, deciding where jobs should be run, and so on. Other machines in the pool may be used to submit jobs, execute jobs, or schedule submitted jobs (in cooperation with the Central Manager).
Every LoadLeveler job must be submitted through a job command file; LoadLeveler will not directly accept executable (a.out) files. Only after defining a job command file may a user submit the job for scheduling and execution. A job command file resembles a shell script in appearance: it contains LoadLeveler statement lines with LoadLeveler keywords that describe the job to run, comment lines (not executed), and, if desired, csh, ksh, or sh command lines. LoadLeveler keywords specify job information such as the executable name, class (queue), resource requirements, input/output files, number of processors required, job type (serial, parallel, pvm3), etc.
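As an illustration, a job command file for a small parallel job might look like the following (a sketch only; the class, processor counts, file names, and the command being run are placeholders to be adapted to your job):

   # @ job_type       = parallel
   # @ class          = small_short
   # @ min_processors = 4
   # @ max_processors = 4
   # @ output         = myjob.out
   # @ error          = myjob.err
   # @ notification   = complete
   # @ queue
   poe myprog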
With the LoadLeveler GUI (xloadl), jobs can be submitted by using the "File" menu on any of the 3 xloadl windows.
In addition to submitting the job, there are LoadLeveler commands available to monitor and change characteristics of the job. For example,
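the following standard LoadLeveler commands are typically used (see their man pages for the full option lists):

   llq                   list the jobs in the queue and their status
   llq -l <job_id>       detailed information about a single job
   llcancel <job_id>     cancel a submitted job
   llstatus              show the status of machines in the pool
   llhold <job_id>       place a job on hold (llhold -r releases it)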
The directory /u/nichols/nwchem/contrib/ibm_sp@mhpcc contains several files.
Append or copy the supplied dot files (.rhosts and .cshrc) to the corresponding dot files in your login directory. Copy the scripts for running NWChem (LLnwchem and intnwchem) into your login directory.
Copy one of the example input files into your login directory (e.g., /u/nichols/nwchem/contrib/ibm_sp@mhpcc/examples/scf_h2o.nw - input for a conventional SCF calculation on water).
"llsubmit LLnwchem"
/u/nichols/nwchem/bin/SP1/nwchem scf_h2o.nw >& scf_h2o.out -rmpool 0 -procs 4
This is under development and currently not supported (4/10/96). Modify intnwchem as appropriate (e.g., change "nichols" to your user id, etc.). Launch the job using the perl script poesubmit:
"poesubmit intnwchem"
Report NWChem problems, suggestions, feedback, etc., to nwchem-support@emsl.pnl.gov or using the WWW support form.
There is a mailing list for NWChem users that can be used for announcements and discussion among users. To subscribe, send email to majordomo@emsl.pnl.gov with the body "subscribe nwchem-users". You can do this with the following command:
echo subscribe nwchem-users | mail majordomo@emsl.pnl.gov
To post a message to the mailing list send it to nwchem-users@emsl.pnl.gov.