I’m trying to run the CMAQv5.3.2 Benchmark case and running into the following error:
*** ERROR in INIT3/INITLOG3 ***
Error opening log file on unit 99
I/O STATUS = 9
DESCRIPTION: permission to access file denied, unit 99, file /nasa/intel/impi/2018.3.222/compilers_and_libraries_2018.3.222/linux/mpi/intel64/lib/CTM_LOG_000.v532_intel_Bench_2016_12SE1_20160701
MPT ERROR: could not run executable.
I’m using an Intel compiler, so the I/O STATUS means that INIT3/INITLOG3 is trying to write to a location where I don’t have write permission; in this case, the shared directory of the computer’s MPI module. WORKDIR and LOGDIR are both defined correctly in the run script. It seems like WORKDIR may be getting reset somewhere during execution of CCTM_v532.exe.
I’m re-installing CMAQ because the computer I’m using recently changed its OS and updated its available modules, so I had to rebuild netcdf, netcdff, ioapi, and CMAQ. I built this new CMAQ_HOME from the same CMAQ_REPO that worked on the old OS. I thought it might be a bug with the new MPI module (e.g. the MPI tries to use its own library directory as the working directory), but a system administrator I spoke to said they had not heard of anyone else having this issue.
Is there a way to change the default location where the CTM_LOGs are initially written? I’ve looked in RUNTIME_VARS.F, where it seems that the path of the log files is defined (line 337). But I’m having some trouble interpreting how to define the variables used by GET_ENV (and get_env_mod.f90).
I suspect I might also have to change the default write location of the FLOOR* files, since those by default are also written to WORKDIR.
A quick check suggests to me that WORKDIR does not have much meaning. The script does not cd to it, and WORKDIR is not an environment variable that is referenced anywhere in the code, as far as I can tell.
The version of the benchmark run script that is distributed has commands to cd to the CCTM/scripts directory, and that is where the CTM_LOG files are written (initially); they are moved to LOGDIR after the end of the run. Please ensure that your run script contains a line to cd to some directory where you have write access. You can add a `pwd` command prior to the line that calls the executable to be sure.
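The check described above can go slightly further than `pwd` by also testing write access. This is a minimal sketch in POSIX sh, not part of the distributed run script; the function name `check_run_dir` is hypothetical, and since the CMAQ run scripts are csh, it would need to be ported or kept as a separate helper.

```shell
#!/bin/sh
# Hypothetical helper: print a directory and report whether it is writable.
# With no argument it checks the current working directory.
check_run_dir() {
    dir=${1:-$PWD}
    echo "Run directory: $dir"
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "writable"
        return 0
    fi
    echo "NOT writable"
    return 1
}

# Place this just before the line that launches the executable.
check_run_dir "$PWD" || echo "WARNING: log files cannot be created here" >&2
```

If the directory printed here is writable but the log file is still opened somewhere else, the working directory is being changed after this point, for example by the launcher.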
Have you made any modifications to the run script? Are you using SLURM?
If you are starting in a writable directory, then maybe your mpirun wrapper is doing something strange.
Use the command `which mpirun` within the script to determine the location of the script that is actually being invoked.
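As a sketch of that check, the snippet below reports how a launcher name resolves on the current PATH. It is written in POSIX sh and uses `command -v` rather than `which`; the function name `report_launcher` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: show what a command name resolves to on PATH.
report_launcher() {
    name=${1:-mpirun}
    if path=$(command -v "$name" 2>/dev/null); then
        echo "$name resolves to: $path"
    else
        echo "$name not found on PATH"
    fi
}

# Place this just before the line that launches the executable.
report_launcher mpirun
```

Comparing this output from inside the batch job with the output from an interactive shell shows whether the two environments pick up the same launcher.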
You might also try compiling the model in serial and running on a single processor.
I looked into all of these questions and the run script sets everything correctly, including the directory I should be in and which mpirun I’m using. The serial build had the same issue, strangely.
However, I found a solution. One of the sysadmins helped me set up the MPI module in my own working directory, so now I have write access to my MPI_LIB_DIR. The CTM_LOG* files are still being written there, but at least the simulation now completes. I just need to manually move all CTM_LOG* files to the output directory after each run.
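The manual move could be scripted as a post-run step. This is a minimal sketch in POSIX sh; the function name `collect_logs` is hypothetical, and it assumes the run script exposes the source and destination directories as variables such as MPI_LIB_DIR and LOGDIR.

```shell
#!/bin/sh
# Hypothetical helper: move every CTM_LOG_* file from one directory to another.
collect_logs() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    moved=0
    for f in "$src"/CTM_LOG_*; do
        [ -e "$f" ] || continue   # the pattern matched nothing
        mv "$f" "$dest"/ && moved=$((moved + 1))
    done
    echo "moved $moved log file(s) to $dest"
}

# Example call after the model run (variable names are assumptions):
# collect_logs "$MPI_LIB_DIR" "$LOGDIR"
```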
Thank you for your reply!
While I’m glad you got the model run to complete, it makes no sense that a serial build, invoked directly and without using mpirun, should attempt to write to MPI_LIB_DIR.
I suspect some kind of library mismatch, possibly involving different versions of the IOAPI library and compiled module files. At runtime on your compute nodes, are you picking up the same versions of the libraries? If you’re using modules, you might add a `module list` command prior to the line that calls the executable. Alternatively, add lines that echo $LD_LIBRARY_PATH.
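A sketch of those runtime checks in POSIX sh follows. The function names are hypothetical, and the `module` command is only called if the shell actually provides it.

```shell
#!/bin/sh
# Hypothetical helper: print a colon-separated search path one entry per line.
list_path_entries() {
    printf '%s\n' "$1" | tr ':' '\n'
}

# Hypothetical helper: report loaded modules (if any) and the library path.
show_library_env() {
    if command -v module >/dev/null 2>&1; then
        module list 2>&1 || true
    else
        echo "module command not available in this shell"
    fi
    echo "LD_LIBRARY_PATH entries:"
    list_path_entries "${LD_LIBRARY_PATH:-}"
}

# Place this just before the line that launches the executable.
show_library_env
```

On Linux, running `ldd` on the executable from the same job shows which shared libraries are actually resolved, which can be compared against the versions used at build time.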