CCTM build error

Hi all,
I’m trying to build CCTM in CMAQv5.2.1 using gcc version 4.8.5 and getting the following error:

x_ppm.F:261.132:

_W0, DRCN_E)
1
Error: There is no specific subroutine for the generic ‘noop_comm’ at (1)
x_ppm.F:262.132:

_W1, DRCN_W)
1
Error: There is no specific subroutine for the generic ‘noop_comm’ at (1)
make: *** [x_ppm.o] Error 1
ERROR while running make command

Online it looks like this error occurs when the arguments passed to a routine are not of the expected type, e.g. an array where there should be a scalar. I found the problematic SUBST_COMM call in x_ppm.F in $CMAQ_REPO/CCTM/src/hadv/yamo, but am not sure how, or whether, I should edit it.

Or, do you think this could be an issue with my compiler?

Thank you,
Elyse

The serial build is broken in v5.2.1 (both gcc and intel compilers). It has been fixed in v5.3 beta, which is available on GitHub. Alternatively, compile in parallel by uncommenting the line “set ParOpt” in the build script.
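
For reference, this is roughly what that change looks like in the CCTM build script (a sketch assuming the bldit_cctm.csh layout under CCTM/scripts shipped with v5.2.1; the comment wording may differ slightly in your copy):

#> uncomment the next line to build a parallel (MPI) executable;
#> leaving it commented out selects the serial build, which is broken in v5.2.1
set ParOpt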


Dear @cjcoats, @lizadams, @cgnolte, @wong.david-c, @dazhong.yin,

I got this error while building CCTM:

-DSUBST_GLOBAL_LOGICAL=SE_GLOBAL_LOGICAL -DSUBST_LOOP_INDEX=SE_LOOP_INDEX -DSUBST_SUBGRID_INDEX=SE_SUBGRID_INDEX -DSUBST_PE_COMM="./PE_COMM.EXT" -DSUBST_CONST="./CONST.EXT" -DSUBST_FILES_ID="./FILES_CTM.EXT" -DSUBST_EMISPRM="./EMISPRM.EXT" -DSUBST_RXCMMN="/home/catalyst/Desktop/Build_WRF/CMAQ-5.2/CCTM/src/MECHS/saprc99_ae6_aq/RXCM.EXT" -DSUBST_RXDATA="/home/catalyst/Desktop/Build_WRF/CMAQ-5.2/CCTM/src/MECHS/saprc99_ae6_aq/RXDT.EXT" -DSUBST_PACTL_ID="./PA_CTL.EXT" -DSUBST_PACMN_ID="./PA_CMN.EXT" -DSUBST_PADAT_ID="/PA_DAT.EXT" -DSUBST_MPI="mpif.h" DEPV_DEFN.F
DEPV_DEFN.F:147:12:

      USE RXNS_DATA           ! chemical mechanism data
        1

Fatal Error: Can’t open module file ‘rxns_data.mod’ for reading at (1): No such file or directory
compilation terminated.
Makefile:402: recipe for target ‘DEPV_DEFN.o’ failed
make: *** [DEPV_DEFN.o] Error 1
ERROR while running make command

else if ( 0 ) then
endif
mv Makefile Makefile.gcc7.3.0
if ( -e Makefile.gcc7.3.0 && -e Makefile ) rm Makefile
ln -s Makefile.gcc7.3.0 Makefile
if ( 0 != 0 ) then
if ( -e /home/catalyst/Desktop/Build_WRF/CMAQ-5.2/CCTM/scripts/BLD_CCTM_36km_gcc7.3.0/CCTM_36km.cfg ) then
mv CCTM_36km.cfg.bld /home/catalyst/Desktop/Build_WRF/CMAQ-5.2/CCTM/scripts/BLD_CCTM_36km_gcc7.3.0/CCTM_36km.cfg
exit

Kindly help. Thanks

Catalyst

Hi Catalyst,

It looks like (correct me if I am wrong) you are trying to build a coupled model but are compiling it with the bldit script; with the two-way model you can’t do that.

Is there a rxns_data.mod in your build directory?

If not, run make RXNS_DATA_MODULE.o first and then try make again.
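
Something like this (a sketch only; the build directory path below is the one echoed in your log and may differ on your system):

cd /home/catalyst/Desktop/Build_WRF/CMAQ-5.2/CCTM/scripts/BLD_CCTM_36km_gcc7.3.0
make RXNS_DATA_MODULE.o    # compile the mechanism data module first so rxns_data.mod exists
make                       # then rebuild the rest of CCTM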

Thanks for your response. However, I’m not building a coupled or two-way model.

Catalyst

Thanks for the response. There’s no rxns_data.mod in my build directory. When I tried what you suggested, make reports “No rule to make target…”.

Hi Catalyst,

Could you remove all *.o and *.mod files, try again, and send me the log file that captures the entire compile along with the Makefile?
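
For example (a sketch, run from inside the build directory; the |& redirection assumes csh/tcsh, which the CMAQ scripts use):

cd /home/catalyst/Desktop/Build_WRF/CMAQ-5.2/CCTM/scripts/BLD_CCTM_36km_gcc7.3.0
rm -f *.o *.mod
make |& tee make.gcc7.3.0.log    # capture the entire compile output to a log file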

Dear @wong.david-c, @lizadams, @bbaek, @cjcoats, @dazhong.yin, @tlspero,
I have solved the above error. However, I got the error below while running the CCTM model:

ls -l /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/BLD_saprc99_ae6_aq/CCTM_saprc99_ae6_aq_Linux4_x86_64gcc
-rwxr-xr-x 1 catalyst catalyst 10970696 Oct 21 00:04 /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/BLD_saprc99_ae6_aq/CCTM_saprc99_ae6_aq_Linux4_x86_64gcc
size /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/BLD_saprc99_ae6_aq/CCTM_saprc99_ae6_aq_Linux4_x86_64gcc
text data bss dec hex filename
5106669 4544560 7101288 16752517 ff9f85 /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/BLD_saprc99_ae6_aq/CCTM_saprc99_ae6_aq_Linux4_x86_64gcc
unlimit
limit
cputime unlimited
filesize unlimited
datasize unlimited
stacksize unlimited
coredumpsize unlimited
memoryuse unlimited
vmemoryuse unlimited
descriptors 4096
memorylocked 16384 kbytes
maxproc 63329
maxlocks unlimited
maxsignal 63329
maxmessage 819200
maxnice 0
maxrtprio 0
maxrttime unlimited
set MPI = /home/catalyst/Desktop/Build_WRF/LIBRARIES/openmpi/bin
set TASKMAP = /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/machines
cat /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/machines
n001:8
set MPIRUN = /home/catalyst/Desktop/Build_WRF/LIBRARIES/openmpi/bin/mpirun
/home/catalyst/Desktop/Build_WRF/LIBRARIES/openmpi/bin/mpirun -n 8 /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/BLD_saprc99_ae6_aq/CCTM_saprc99_ae6_aq_Linux4_x86_64gcc
/home/catalyst/Desktop/Build_WRF/LIBRARIES/openmpi/bin/mpirun: Error: unknown option “-n”
0.001u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w
mkdir -p /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/cctm_36km/logs/2016031
mv: No match.
@ i = 30 + 1
end
while ( 31 < 31 )
date
Mon Oct 21 14:35:11 CST 2019
exit

The last section of the log file is posted above. I use Open MPI.

Thanks in anticipation.

Catalyst

@wong.david-c, @lizadams, @dazhong.yin, @bbaek,

Below is the last part of my CCTM script:

#> Executable call for multiple PE, set location of MPIRUN script
#set MPIRUN = /usr/bin/mpiexec
set MPI = /home/catalyst/Desktop/Build_WRF/LIBRARIES/openmpi/bin
#set TASKMAP = $BASE/machines
#cat $TASKMAP
set MPIRUN = $MPI/mpirun
#time $MPIRUN -np $NPROCS $BLD/$EXEC
#nohup time $MPIRUN -n $NPROCS WORKDIR/{EXEC}.sh < /dev/null
nohup time $MPIRUN --np $NPROCS BLD/{EXEC} < /dev/null

time mpirun -r ssh -np $NPROCS $BLD/$EXEC

#set MPIRUN = /public/software/mpi/mpich/3.2/intel/bin/mpiexec
#set TASKMAP = $BASE/machines
#set TASKMAP = /home/catalyst/Desktop/Build_WRF/CMAQ-5.2/CCTM/scripts/machines
#cat $TASKMAP

time $MPIRUN -machinefile $TASKMAP -np $NPROCS $BASE/$EXEC.sh < /dev/null

time mpirun -np $NPROCS $BASE/$EXEC.sh

time $MPIRUN -launcher rsh -launcher-exec /usr/bin/rsh -f $TASKMAP -n $NPROCS BASE/{EXEC}.sh < /dev/null

#nohup time $MPIRUN -n $NPROCS WORKDIR/{EXEC}.sh < /dev/null
#nohup time $MPIRUN -n $NPROCS BASE/{EXEC}.sh < /dev/null

ibrun BASE/{EXEC} -envall

mkdir -p {OUTDIR}/logs/{STDATE}
mv ./CTM_LOG_???.{CTM_APPL} {OUTDIR}/logs/${STDATE}

#setenv NEW_START true

gzip ${EMIS_1}

@ i = $i + 1
end

date
exit

Thanks

Catalyst

@lizadams, @bbaek, @cjcoats, @tlspero, @wong.david-c, @hogrefe.christian, @cgnolte, @eyortizd

This is the most recent error I got:

ls -l /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/BLD_saprc99_ae6_aq/CCTM_saprc99_ae6_aq_Linux4_x86_64gcc
-rwxr-xr-x 1 catalyst catalyst 11193024 Oct 22 14:24 /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/BLD_saprc99_ae6_aq/CCTM_saprc99_ae6_aq_Linux4_x86_64gcc
size /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/BLD_saprc99_ae6_aq/CCTM_saprc99_ae6_aq_Linux4_x86_64gcc
text data bss dec hex filename
4913497 4758192 7102728 16774417 fff511 /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/BLD_saprc99_ae6_aq/CCTM_saprc99_ae6_aq_Linux4_x86_64gcc
unlimit
limit
cputime unlimited
filesize unlimited
datasize unlimited
stacksize unlimited
coredumpsize unlimited
memoryuse unlimited
vmemoryuse unlimited
descriptors 4096
memorylocked 16384 kbytes
maxproc 63329
maxlocks unlimited
maxsignal 63329
maxmessage 819200
maxnice 0
maxrtprio 0
maxrttime unlimited
set MPIRUN = mpiexec
set TASKMAP = /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/machines
cat /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/machines
n001:8
mpirun -np 8 /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/BLD_saprc99_ae6_aq/CCTM_saprc99_ae6_aq_Linux4_x86_64gcc.sh

mpirun was unable to launch the specified application as it could not access
or execute an executable:

Executable: /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/BLD_saprc99_ae6_aq/CCTM_saprc99_ae6_aq_Linux4_x86_64gcc.sh
Node: catalyst-Precision-Tower-3620

while attempting to start process rank 0.

8 total processes failed to start
0.014u 0.011s 0:00.02 100.0% 0+0k 0+16408io 4pf+0w
mkdir -p /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/cctm_36km/logs/2016031
mv: No match.
@ i = 30 + 1
end
while ( 31 < 31 )
date
Tue Oct 22 17:14:35 CST 2019
exit

It seems the problem is due to Open MPI. Your help will be greatly appreciated.

Thanks and regards.

Below is the updated error I got:

It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here’s some additional information (which may only be relevant to an
Open MPI developer):

opal_init failed
--> Returned value Error (-1) instead of ORTE_SUCCESS

*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[catalyst-Precision-Tower-3620:00824] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!

It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here’s some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS


It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here’s some
additional information (which may only be relevant to an Open MPI
developer):

ompi_mpi_init: ompi_rte_init failed
--> Returned "Error" (-1) instead of "Success" (0)


[The same orte_init / opal_init / MPI_INIT error block is repeated for each of the remaining MPI processes.]

Primary job terminated normally, but 1 process returned
a non-zero exit code… Per user-direction, the job has been aborted.


mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[25693,1],0]
Exit code: 1

mkdir -p /home/catalyst/Desktop/Build_WRF/CMAQv5.0.2/scripts/cctm/cctm_36km/logs/2016031
mv: No match.
@ i = 30 + 1
end
while ( 31 < 31 )
date
Tue Oct 22 09:51:15 CST 2019
exit

Kindly help, please. It seems it’s an Open MPI problem.

Thanks

Catalyst

The error is here: Error: unknown option “-n”
Based on this, it looks like your mpirun command is expecting a different option to specify the number of processors.
Please change the command to use -np rather than -n
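
With the variables already set in your run script, the executable call would then look something like this (a sketch; note that $BLD/$EXEC should point at the compiled CCTM binary itself, not at a .sh wrapper, which is what the later “could not access or execute an executable” message is complaining about):

set MPIRUN = $MPI/mpirun
time $MPIRUN -np $NPROCS $BLD/$EXEC < /dev/null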


@lizadams,

I have resolved it and it’s working perfectly now.

Thanks and regards,

Catalyst
