I have successfully run CCTM in serial mode, but I run into problems when debugging multi-node parallel mode. The error messages from the single-node test are as follows:
Value for IOAPI_CHECK_HEADERS: N returning FALSE
Value for IOAPI_OFFSET_64: YES returning TRUE
Value for IOAPI_CFMETA not defined;returning default: FALSE
Value for IOAPI_CMAQMETA not defined; returning defaultval ': 'NONE'
Value for IOAPI_CMAQMETA not defined; returning defaultval ': 'NONE'
Value for IOAPI_SMOKEMETA not defined; returning defaultval ': 'NONE'
Value for IOAPI_SMOKEMETA not defined; returning defaultval ': 'NONE'
PN_CRTFIL3: Error creating PnetCDF file attribute FTYPE
netCDF error number -36 processing file "CTM_CONC_1"
NetCDF: Invalid argument
NetCDF: Invalid argument
WARNING in subroutine OPNLOG3 <<<
Warning netCDF file header attribute EXEC_ID.
Not available for file: CTM_CONC_1
netCDF error number -33
"CTM_CONC_1" opened as NEW(READ-WRITE )
File name "/disk/Build_WRF/CMAQ_DATA/output/CCTM_CONC_v531_gcc9.1.0_Bench_2020_SX_20200506.nc"
File type UNKNOWN
Grid name "
Dimensions: 0 rows, 0 cols, 0 lays, 0 vbles
NetCDF ID: 0 opened as VOLATILE READWRITE
Time-independent data.
Error flushing PnetCDF file "CTM_CONC_1"
PnetCDF error number -33
WRTFLAG: MPI_SEND(SFLAG) error
*** ERROR ABORT in subroutine OPCONC on PE 000
Could not sync to disk CTM_CONC_1
PM3EXIT: DTBUF 3:00:00 May 6, 2020
Date and time 3:00:00 May 6, 2020 (2020127:030000)
WARNING in subroutine SHUT3 <<<
Error closing PnetCDF file
File name: CTM_CONC_1
PnetCDF error number -33
*** ERROR ABORT in subroutine PM3EXIT ***
Could not shut down I/O API files correctly
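To make the numeric codes in the log above easier to read, here is a minimal sketch that scans a log for "error number" lines and attaches the symbolic name. The two-entry table covers only the codes that appear in this log, using the values defined in netcdf.h (PnetCDF uses the same values for these two). The function name `decode_errors` and the table name are my own, chosen for illustration only.

```python
import re

# Subset of netCDF error codes (values as defined in netcdf.h).
# Only the two codes seen in the log above are listed here.
NC_ERRORS = {
    -33: "NC_EBADID: not a valid netCDF ID",
    -36: "NC_EINVAL: invalid argument",
}

# Matches both "netCDF error number -36" and "PnetCDF error number -33".
ERR_RE = re.compile(r"netCDF error number\s+(-?\d+)", re.IGNORECASE)


def decode_errors(log_text):
    """Return a list of (line_number, code, meaning) for each error line."""
    found = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        match = ERR_RE.search(line)
        if match:
            code = int(match.group(1))
            meaning = NC_ERRORS.get(code, "unknown (not in this table)")
            found.append((lineno, code, meaning))
    return found


if __name__ == "__main__":
    sample = (
        'netCDF error number -36 processing file "CTM_CONC_1"\n'
        "NetCDF: Invalid argument\n"
        "PnetCDF error number -33\n"
    )
    for lineno, code, meaning in decode_errors(sample):
        print(f"line {lineno}: {code} -> {meaning}")
```

Read this way, the first failure is the -36 (invalid argument) while creating the FTYPE attribute. The later -33 (invalid ID) messages would be consistent with the file never having been created properly, although I have not confirmed that this is the actual cause.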
The error messages from the multi-node test are as follows:
Fatal error in PMPI_Comm_create: Unknown error class, error stack:
PMPI_Comm_create(565)…: MPI_Comm_create(MPI_COMM_WORLD, group=0x88000001, new_comm=0x11a391f8) failed
PMPI_Comm_create(542)…:
MPIR_Comm_create_intra(207)…:
MPIR_Get_contextid_sparse_group(495):
MPIR_Allreduce_impl(293)…:
MPIR_Allreduce_intra_auto(178)…:
MPIR_Allreduce_intra_auto(84)…:
MPIR_Bcast_impl(310)…:
MPIR_Bcast_intra_auto(223)…:
MPIR_Bcast_intra_binomial(182)…: Failure during collective
Fatal error in PMPI_Comm_create: Unknown error class, error stack:
My environment is CMAQ 5.3.1 + PnetCDF 1.12.1 + ioapi-3.2 + mpich-3.3.2. I don't know how to solve this, and I would appreciate any help getting CMAQ to run in multi-node parallel mode on a cluster.
Thanks very much.