Could not open CTM_CONC_1 error

I am running a two-way coupled WRF-CMAQ (WRFv4.5.1, CMAQv5.5) simulation. The model completes a 10-day spin-up in December successfully but fails at the beginning of January 1. I suspect an input or restart alignment issue but have not identified the cause. The simulation fails after ~60 seconds.

I have verified that the following input files exist, align with the simulation dates and timing, and do not contain erroneous NaNs: wrfbdy, wrffdda, wrfsfdda, wrfinput, wrflowinp, ndown. I’ve also checked all emission inputs (inln, stackgroups, emis_mole, merged_dates) for the new year, verifying that the files are correctly linked, reflect the new year, and do not include erroneous values.

Due to permission restrictions, I have uploaded the rsl.out file, rsl.error file, module lists, and my submission scripts to my GitHub, linked here. I have included my grep -i error rsl.out.0000 output below; however, these errors also occur during successful runs, so I am not convinced they are the root of the problem.

If anyone has suggestions on additional troubleshooting steps, I would appreciate any guidance you can offer.

-VL

rsl.out.0000 tail

 CONCENTRATION FILE   ---<|
=================================================

CTM_AELMO_1 :
/projects/b1045/rcs_support/wrf-cmaq/WRF_DIR/work_dir/
WRFv451_CMAQv55/CCTM/scripts/WRFCMAQ-BASE_202301_D02-sw_feedback/
CCTM_AELMO_4.5.155_BASE_202301_D02_20230101.nc

>>--->> WARNING in subroutine OPEN3
File not available.

Value for IOAPI_CHECK_HEADERS:  N returning FALSE
Value for IOAPI_OFFSET_64:  YES returning TRUE

"CTM_AELMO_1" opened as NEW(READ-WRITE )

Starting date and time  2023001:000000
Timestep 010000

CTM_CONC_1 :
/projects/b1045/rcs_support/wrf-cmaq/WRF_DIR/work_dir/
WRFv451_CMAQv55/CCTM/scripts/WRFCMAQ-BASE_202301_D02-sw_feedback/
CCTM_CONC_4.5.155_BASE_202301_D02_20230101.nc

>>--->> WARNING in subroutine OPEN3
File not available.

Could not open CTM_CONC_1 for update - try to open new

rsl.error.0000 tail

d01 2023-01-01_00:01:40 --> TOP OF DIAGNOSTICS PACKAGE
d01 2023-01-01_00:01:40 --> CALL DIAGNOSTICS PACKAGE: ACCUMULATED AND BUCKET DIAGNOSTICS
d01 2023-01-01_00:01:40  call HALO_RK_E
d01 2023-01-01_00:01:40 calling inc/HALO_EM_E_5_inline.inc
d01 2023-01-01_00:01:40  call HALO_RK_MOIST
d01 2023-01-01_00:01:40 calling inc/HALO_EM_MOIST_E_5_inline.inc
d01 2023-01-01_00:01:40  call HALO_RK_SCALAR
d01 2023-01-01_00:01:40 calling inc/HALO_EM_SCALAR_E_5_inline.inc
d01 2023-01-01_00:01:40  call end of solve_em

grep -i error rsl.out.0000

d01 2023-01-01_00:00:00  NetCDF error: NetCDF: Attribute not found
d01 2023-01-01_00:00:00  NetCDF error in ext_ncd_get_dom_ti.code REAL, line          83  Element GMT
d01 2023-01-01_00:00:00  NetCDF error: NetCDF: Attribute not found
d01 2023-01-01_00:00:00  NetCDF error in ext_ncd_get_dom_ti.code INTEGER, line          83  Element JULYR
d01 2023-01-01_00:00:00  NetCDF error: NetCDF: Attribute not found
d01 2023-01-01_00:00:00  NetCDF error in ext_ncd_get_dom_ti.code INTEGER, line          83  Element JULDAY
Binary file rsl.out.0000 matches
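
A note on the last line of that output: "Binary file rsl.out.0000 matches" is grep reporting that it hit non-text bytes in the log, after which it suppresses individual matching lines, so the list above may be incomplete. A minimal sketch of a search that treats the log as text is shown below; the function name search_rsl_errors is made up for illustration.

```shell
# Search one rsl log for "error", case-insensitively, treating the file as text.
# -a (--text) keeps grep from suppressing matches after stray binary bytes;
# -n prefixes each match with its line number.
search_rsl_errors() {
    grep -a -i -n 'error' "$1"
}
```

It would be called as search_rsl_errors rsl.out.0000, and the same call works for any of the other rsl files.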

I am not sure this is the cause, but I noticed warnings that the times in the FDDA files do not match the time on the domain.

**WARNING** Time in input file not equal to time on domain **WARNING**
 **WARNING** Trying next time in file wrfsfdda_d02 ...

I wonder whether the cont_from_spinup_run variable in your run script should be set to true (T) instead of false (F).

Change:

set cont_from_spinup_run =        F   # indicates whether a wrf spinup run prior to the twoway model run

to:

set cont_from_spinup_run =        T   # indicates whether a wrf spinup run prior to the twoway model run
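
To confirm what each run script currently sets, a quick check along these lines may help. This is only a sketch: the function name show_spinup_flag is hypothetical, and it assumes the flag is assigned with a csh-style set line like the one quoted above.

```shell
# Print the line that assigns cont_from_spinup_run in each run script given.
# Assumes a csh-style "set cont_from_spinup_run = ..." assignment.
show_spinup_flag() {
    for script in "$@"; do
        grep -H -n '^[[:space:]]*set[[:space:]][[:space:]]*cont_from_spinup_run' "$script" \
            || echo "$script: cont_from_spinup_run not set"
    done
}
```

Passing both the December and the January run scripts as arguments prints the two settings one after the other, which makes a mismatch easy to see.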

Hi Vlang,

I couldn’t find the mistake in your files. In my own work, I run simulations from December into January without issue by keeping them continuous. Try running one continuous simulation from December through January without stopping. Since part1.csh runs successfully, you can simply extend it to cover the full period.

Xu

Hi Ether!

Thank you for your reply. I originally ran the model through December and restarted on January 1st because my point-source stack group emission files have different dates, and a continuous run fails. However, I did try modifying my emissions and running continuously, but unfortunately I get the same error.

Other things that I have tried since my post:
(1) Following @lizadams's suggestion, I modified my FDDA file dates to align with the run time of the CMAQ run, but the result was the same, despite fewer inline warnings.
(2) Because the model runs successfully through the spin-up in December, I made some dummy input files by taking December data and shifting them with m3tshift to represent January 1st (e.g. emissions: 2D, inline, and stack group files; MCIP: METCRO, GRIDCRO, etc.; and BC). My goal was to isolate whether the issue stemmed from specific input files. However, the model still hits some kind of error in its workflow, even with input files that have previously run successfully.
(3) I also generated a new CGRID file, in case that was causing the issue. The Slurm output included the error below when the run used the CGRID file from 12/31, so I again made a dummy file from the 12/30 CGRID file using m3tshift. The Slurm error is as follows:

** Runscript Detected an Error: CGRID file was not written. **
** Runscript Detected an Error: CGRID file was not written. **
echo **   This indicates that CMAQ was interrupted or an issue   **
**   This indicates that CMAQ was interrupted or an issue   **
echo **   exists with writing output. The runscript will now     **
**   exists with writing output. The runscript will now     **
echo **   abort rather than proceeding to subsequent days.       **
**   abort rather than proceeding to subsequent days.       **
echo **************************************************************

I am currently trying to isolate other possible causes of the crash at this step. I would welcome ideas for other debugging approaches, or for isolating issues with the remaining met input files, which are largely the only ones I have not yet examined.

Thank you again!
-VL

Hi Vlang,

Please send me (wong.david-c@epa.gov) all the rsl.* files as well as your run scripts (for both the December and January cases).

Cheers, David
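
When collecting all the rsl.* files, it can also be worth looking beyond rank 0000: WRF writes one rsl.out and one rsl.error file per MPI task, and the message that explains a crash may appear only in the file of the task that failed. A minimal sketch for printing the last few lines of every rsl.error file follows; the function name tail_all_rsl and its arguments are illustrative and not part of WRF or CMAQ.

```shell
# Print the last N lines (default 5) of every rsl.error.* file in a directory
# (default: current directory), so the failing MPI task is easier to spot.
tail_all_rsl() {
    dir=${1:-.}
    n=${2:-5}
    for f in "$dir"/rsl.error.*; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern if nothing matches
        printf '==> %s <==\n' "$f"
        tail -n "$n" "$f"
    done
}
```

Run from the run directory as tail_all_rsl . 10, this prints the final ten lines of each task's error log under a header naming the file.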