MCIP and CCTM issue for CMAQ version 5.2.1

Hi,

  1. I have point-source emission files (ptegu) with NCOLS = 1, NROWS = 8531, XORIG = -2508000, YORIG = -1716000. On the other hand, my MCIP files have NCOLS = 200, NROWS = 120, XORIG = -528000, YORIG = -1128000. The emission files and the modeling domain have the same 12000 m resolution. In order to run CMAQ, do I need to window my point-source emission files so that NCOLS, NROWS, XORIG, and YORIG exactly match the MCIP domain’s NCOLS, NROWS, XORIG, and YORIG?

  2. If yes, then in order to do the windowing I need the LOCOL, LOROW, HICOL, and HIROW values. Also, I know that
    LOCOL = 1 + NINT( DX )
    LOROW = 1 + NINT( DY )
    HICOL = LOCOL + NCOLS2 - 1
    HIROW = LOROW + NROWS2 - 1

In the above equations, what is NINT?

Thanks
Rasel

If you look, you’ll also see that the point-source emissions files are of type “custom”: they’re column vectors that have to be multiplied by a “gridding matrix” in order to get gridded data. You will, of course, need to be using a gridding matrix that inputs an 8531-entry vector and outputs onto your 200x120 grid…

NINT is the Fortran “round to nearest integer” intrinsic function.
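
For example, for the domains above (a sketch, assuming DX and DY are the offsets between the two grid origins expressed in grid-cell units, DX = (XORIG_mcip - XORIG_emis)/XCELL and likewise for DY):

    program window_bounds
        implicit none
        real    :: dx, dy
        integer :: locol, lorow, hicol, hirow
        ! origin offsets in grid cells (12000 m cell size):
        dx = ( -528000.0 - (-2508000.0) ) / 12000.0    ! = 165.0
        dy = ( -1128000.0 - (-1716000.0) ) / 12000.0   ! =  49.0
        ! NINT rounds to the nearest integer:  nint(165.0) = 165
        locol = 1 + nint( dx )      ! = 166
        lorow = 1 + nint( dy )      ! =  50
        hicol = locol + 200 - 1     ! = 365  (NCOLS of the MCIP grid)
        hirow = lorow + 120 - 1     ! = 169  (NROWS of the MCIP grid)
        print *, locol, lorow, hicol, hirow
    end program window_bounds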

Hi,

Can you please explain a bit more how I can get the gridding matrix, and how I multiply it with the column vectors? Also, can you give an example? I followed the usual process of using m3wndw (input file, output file, LOCOL, LOROW, HICOL, and HIROW), and it gave me an error:

 >>--->> WARNING in subroutine XTRACT3
 Error in col-bounds specification for file INFILE           variable ALL
 M3WARN:  DTBUF 12:00:00  June 21, 2020 (2020173:120000)
 ERROR:  Read-failure
 Lower bound:                  166
 Upper bound:                  365
 Actual number of cols:        1


 *** ERROR ABORT in subroutine M3WNDW
 Failure in program

Thanks
Rasel

That’s what the SMOKE program GRDMAT does. (The m3wndw error is expected: as noted above, this is a “custom”-type file whose data is a column vector with only 1 column, so there is no 200x120 grid there for m3wndw to window.)
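
Schematically, the multiplication looks like this (a sketch only, not SMOKE’s actual code; NX, IX, and CX are hypothetical names for the per-cell coefficient counts, source indices, and weights stored in a gridding matrix):

    subroutine apply_grdmat( ngrid, nsrc, nmatx, nx, ix, cx, emis, grid )
        ! Apply a sparse gridding matrix to a column vector of
        ! point-source emissions, producing gridded emissions.
        implicit none
        integer, intent(in)  :: ngrid         ! NCOLS*NROWS of the output grid
        integer, intent(in)  :: nsrc          ! number of point sources (8531 here)
        integer, intent(in)  :: nmatx         ! number of matrix coefficients
        integer, intent(in)  :: nx( ngrid )   ! number of sources per grid cell
        integer, intent(in)  :: ix( nmatx )   ! source index for each coefficient
        real,    intent(in)  :: cx( nmatx )   ! fraction assigned to that cell
        real,    intent(in)  :: emis( nsrc )  ! the emissions column vector
        real,    intent(out) :: grid( ngrid ) ! gridded result
        integer :: c, j, k
        k = 0
        do c = 1, ngrid
            grid( c ) = 0.0
            do j = 1, nx( c )
                k = k + 1
                grid( c ) = grid( c ) + cx( k ) * emis( ix( k ) )
            end do
        end do
    end subroutine apply_grdmat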

You do not need to window the inline point-source emission files STK_EMIS_### and STK_GRPS_### prepared for a larger CMAQ domain when running CMAQ on a smaller domain, as long as both domains share the same map projection. The assignment of the point sources to CMAQ grid cells is computed from the XLOCA and YLOCA coordinates in the STK_GRPS_### file together with the grid origin and cell-size values of the domain you are running.
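
Schematically, that assignment works like the following (a sketch, assuming XLOCA/YLOCA are given in the projection coordinates of the grid; the stack location used here is hypothetical):

    program stack_cell
        implicit none
        ! grid parameters of the (smaller) domain being run:
        real, parameter :: xorig = -528000.0, yorig = -1128000.0
        real, parameter :: xcell = 12000.0,   ycell = 12000.0
        real    :: xloca, yloca
        integer :: col, row
        xloca = -400000.0     ! hypothetical stack coordinates (m)
        yloca = -1000000.0
        col = 1 + int( (xloca - xorig) / xcell )   ! = 11 here
        row = 1 + int( (yloca - yorig) / ycell )   ! = 11 here
        ! stacks whose col/row fall outside 1..NCOLS or 1..NROWS are
        ! off the domain and simply not used, so no windowing of the
        ! STK_EMIS/STK_GRPS files is needed
        print *, col, row
    end program stack_cell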

Hi,

I windowed the gridded emission files (emis_mole_all…*) and did not window the point-source emission files. I’m not seeing any grid inconsistency in the log files. Still, my model crashes near the end of June 04, 2020 (22:55:00 UTC). If I run the model for June 01, 02, or 03, 2020 (for example), it works fine. Please see the attached log files and scripts. Is there anything I’m doing wrong?

run_cmaq_rasel.sh.txt (797 Bytes)
run_cctm_rasel.csh_sbatch.txt (29.5 KB)
CTM_LOG_003.v52_gcc_AQF5X_20200604.txt (533.0 KB)
CTM_LOG_002.v52_gcc_AQF5X_20200604.txt (533.0 KB)
CTM_LOG_001.v52_gcc_AQF5X_20200604.txt (532.7 KB)
LOG.error.txt (1.0 MB)

Thanks
Rasel

Your LOG.error.txt file indicates the following cause of the model crash:

CTM_LOG.txt-zzz2 28 8565.70 8707.92 -142.23 8683.65 1.390E-04 1.478E-04
CTM_LOG.txt-zzz2 29 8565.70 8562.15 3.55 8569.66 1.478E-04 1.483E-04
CTM_LOG.txt-zzz2 30 8565.70 8412.67 153.03 8452.82 1.483E-04 1.416E-04
CTM_LOG.txt-zzz2 31 8565.70 8212.34 353.36 8296.25 1.416E-04 1.256E-04
CTM_LOG.txt-zzz2 32 8565.70 8059.32 506.38 8176.99 1.256E-04 1.043E-04
CTM_LOG.txt-zzz2 33 8565.70 7482.49 1083.20 7726.51 1.043E-04 -1.943E-05
CTM_LOG.txt-zzz2 34 8565.70 8861.09 -295.39 8802.93 -1.943E-05 -4.629E-06
CTM_LOG.txt-zzz2 35 8565.70 8655.12 -89.42 8642.21 -4.629E-06 -1.722E-12
CTM_LOG.txt-
CTM_LOG.txt-
CTM_LOG.txt: *** ERROR ABORT in subroutine ZADVYPPM on PE 049
CTM_LOG.txt- vert adv soln failed at 225500 with adv step: 000500 HHMMSS Max Iterations = 30
CTM_LOG.txt- PM3EXIT: DTBUF 22:55:00 June 4, 2020
CTM_LOG.txt- Date and time 22:55:00 June 4, 2020 (2020156:225500)

Given that the error occurred on PE 049, the end of the log file for that processor (CTM_LOG_049.v52_gcc_AQF5X_20200604.txt) may contain additional information.

While I don’t have much experience with this, the error message indicates a problem with your meteorological fields. You could try reducing your current advection time step from 5 minutes to maybe 2 minutes or 1 minute to see if this helps avoid the crash. This can be done by reducing the value of the environment variable CTM_MAXSYNC from 300 to 120 or 60. You might also want to investigate the simulated wind speeds around 23:00:00 to see whether there are any strong gradients potentially causing this problem in zadvppmwrf.F.
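
In the CCTM run script, that change would look like (using one of the values suggested above):

setenv CTM_MAXSYNC 120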

Hi,

I’m running my model domain for 1 month and 11 days (May 21 - June 30, 2020). After running the job as a new start, it runs fine on our cluster for up to 5 days and gives me output through June 25, 2020; after 5 days, the job gets terminated due to the wall-clock timeout. I want to restart the run from June 26 using the CGRID file of June 25. However, my job gets terminated right away, and I can’t figure out why. The error says:

Error opening file at path-name:
netCDF error number -51 processing file “BNDY_GASC_S”
NetCDF: Unknown file format
NetCDF: Unknown file format

In my script, I’m not sure which file location I should specify there. Or is there something else causing the error?
log.txt (3.2 MB)
run_cctm_rasel.csh_sbatch.txt (29.5 KB)

Thanks
Rasel

Here’s the problem: netCDF maintains various internal in-memory data structures for the files. If you close them properly using NF_CLOSE, or if you sync them to disk using NF_SYNC, then these data structures are correctly flushed to disk, and the file is properly readable. Otherwise, when that “job gets terminated”, it corrupts the file, as you see here.

By the way, that’s the reason for the coding standard that says “Always use M3EXIT to terminate program execution”: M3EXIT calls NF_CLOSE for all the files you have open, thus making them readable.
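
For example (a minimal sketch; the program name, date, time, and message are placeholders):

    program demo_m3exit
        use m3utilio      ! I/O API Fortran interfaces (M3EXIT, etc.)
        implicit none
        integer :: jdate, jtime
        jdate = 2020156   ! Julian date, YYYYDDD
        jtime = 225500    ! time, HHMMSS
        ! M3EXIT closes every open I/O API file (calling NF_CLOSE under
        ! the hood) before terminating, so the outputs stay readable;
        ! a zero exit status indicates normal completion:
        call m3exit( 'DEMO', jdate, jtime, 'Normal completion', 0 )
    end program demo_m3exit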

On the other hand, see https://cjcoats.github.io/ioapi/BUFFERED.html#vol: if you declare a file “volatile” by adding a trailing "-v" to the setenv for that file:

setenv QUX "/tmp/mydir/volatiledata.mymodel -v"

then the I/O API will flush the file after every operation (at the cost of some performance).

BTW, this should have been a new topic, not a continuation of the previous one…

I’m sorry; I’m a beginner and didn’t understand it properly. Can you please let me know how to correct my script? I had this same error in my other LOG files, and those runs completed properly. If I do a new start, there is no problem; it is only when I do a restart that the error occurs.

It looks to me like this issue has to do with the Direct Decoupled Method (DDM) for obtaining model sensitivities. Are you intending to use DDM? It is an advanced tool, and not something I would recommend for a beginner.
If you do want to use DDM and continue to have difficulties, please start a new thread.

So for all output files, your script should use the “volatile” form:

setenv <filename> "<path> -v"

and then these files will always be readable, even if the program crashes or if the queue-manager kills your run.

Hello,
I’m trying to run my script with a new start. However, for some reason my script skips the day 1 run, goes to day 2, and looks for the CGRID file. This gives me an error. Can you please take a look at my log files and script and let me know what’s going on?

Regards,
Rasel
jCMAQ-5.2-NODE052-56584.out.txt (293.8 KB)
jCMAQ-5.2-NODE052-56584.err.txt (50.2 KB)
CTM_LOG_001.v52_gcc_AQF5X_20200102.txt (4.5 KB)
CTM_LOG_001.v52_gcc_AQF5X_20200101.txt (71.1 KB)
run_cctm_rasel_all.csh_sbatch.txt (27.9 KB)

Look at the end of your CTM_LOG file for 20200101.

 B3GRD           :/scratch/mrasel/data/cmaq/tempcmaqdata/land/b3grd_AQF5X_2016fh_16j.ncf
 
 >>--->> WARNING in subroutine OPEN3
 File not available.
 
 Could not open file "B3GRD".
 
 *** ERROR ABORT in subroutine TMPBEIS312
 Could not open file "B3GRD".

Hello,

I’ve set setenv CTM_BIOGEMIS N, since I don’t have that file. However, the job still gets terminated. Can you please take a look?

Regards
Rasel

CTM_LOG_001.v52_gcc_AQF5X_20200101.txt (61.4 KB)
CTM_LOG_001.v52_gcc_AQF5X_20200102.txt (4.1 KB)

There is no error shown here. Especially for new users, I recommend directing the standard output and error streams to the same file.
As I said previously, please create a new thread rather than posting a completely unrelated error in an old thread. Be as specific as you can as to what you are trying to do.
You appear to be running with the Direct Decoupled Method. Please first ensure that your base simulation can proceed; then, if your DDM simulation fails, you can be more specific when you ask for help.

Hi ykaore,
I get the same error message. Did you solve the problem?