CGRID file was not written

Hi,

I am learning to run CMAQv5.3 for a single day (01/01/2015) on the 12US1 domain. I got two errors in the log: one is a syntax error and the other is “CGRID file was not written”. I have uploaded the run script and log file here: run_cctm_2015_12US1_459X299.csh (31.9 KB) cctm.log.20150101.txt (5.0 KB). I would really appreciate it if someone could take a look and help me solve it.

Thanks,
Qian

Unfortunately, the log does not include enough information to help. Evidently something in the model is failing, but that part of the model is not following the standard practice of logging the nature of the failure and then calling M3EXIT.

As a first cut, check whether this is a permissions problem: do you have permission to write to the directory containing the CGRID file? Is there already a CGRID file there for this run (if so, you need to delete it)?
Beyond that, I don’t know ;-(
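
The two checks above can be sketched as a small POSIX shell function. This is only an illustration: the function name check_cgrid_target is made up here, and the path you pass in should be whatever CGRID output path your own run script uses.

```shell
#!/bin/sh
# check_cgrid_target FILE   (hypothetical helper, not part of CMAQ)
# Reports whether the directory that would hold FILE is writable and
# whether a file left over from an earlier run is already there.
check_cgrid_target() {
    cgrid_file=$1
    cgrid_dir=$(dirname "$cgrid_file")
    cgrid_status=0
    if [ ! -d "$cgrid_dir" ] || [ ! -w "$cgrid_dir" ]; then
        echo "not writable: $cgrid_dir"
        cgrid_status=1
    fi
    if [ -e "$cgrid_file" ]; then
        echo "already exists (delete it before rerunning): $cgrid_file"
        cgrid_status=1
    fi
    return $cgrid_status
}
```

The function prints nothing and returns 0 when the directory is writable and no old file is in the way.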

First, the failures are related.

The syntax error occurs between lines 96 and 253 of the run script. Because it references standard input, my guess is that a command that includes a | is not working correctly. That would include lines 117 through 119 and line 238. Lines 117 through 119 should grab NCOLS and NROWS from the GRIDDESC file. They are used to calculate the number of cells, which CMAQ reports as blank on line 81 of your log.

This means that either the GRIDDESC is not at the location specified or that the GRID_NAME is not in the GRIDDESC. I’ve reduced the problem to an easily repeatable subset of commands that can reproduce the syntax error. Paste the following commands in a script and update the GRIDDESC path. Then run to re

#!/bin/csh

set GRIDDESC = ../inputs/GRIDDESC
set GRID_NAME = 12US1
set NZ = 35
set NX = `grep -A 1 ${GRID_NAME} ${GRIDDESC} | tail -1 | sed 's/  */ /g' | cut -d' ' -f6`
set NY = `grep -A 1 ${GRID_NAME} ${GRIDDESC} | tail -1 | sed 's/  */ /g' | cut -d' ' -f7`
set NCELLS = `echo "${NX} * ${NY} * ${NZ}" | bc -l`
echo ${NZ} ${NY} ${NX} ${NCELLS}

When the GRIDDESC is correctly pointing to a valid GRIDDESC file and GRID_NAME is a grid in the GRIDDESC, it works fine. If I use a GRID_NAME that is not in the GRIDDESC, I get the syntax error. If I point to a missing GRIDDESC file, I get the syntax error and a grep error.

I notice in your script that your GRID_NAME is set to “2015_12US1_459_299”. Does your GRIDDESC have that domain?
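
A quick way to answer that question before launching the model is a check like the sketch below. It is written in POSIX sh rather than csh, the function name grid_in_griddesc is hypothetical, and it assumes grid names appear in single quotes in the GRIDDESC file (as in the GRIDDESC files I have seen); adjust the pattern if yours differs.

```shell
#!/bin/sh
# grid_in_griddesc NAME FILE   (hypothetical helper)
# Returns 0 if NAME occurs as a single-quoted name in FILE,
# 1 if it does not, and 2 if FILE cannot be read at all.
grid_in_griddesc() {
    if [ ! -r "$2" ]; then
        echo "cannot read GRIDDESC: $2"
        return 2
    fi
    # -F: match the quoted name literally, not as a regular expression
    if grep -qF -- "'$1'" "$2"; then
        return 0
    fi
    echo "grid $1 not found in $2"
    return 1
}
```

For example, grid_in_griddesc 12US1 ../inputs/GRIDDESC should succeed silently when both the path and the name are right, which are the two failure cases described above.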

application called MPI_Abort(MPI_COMM_WORLD, 16) - process 13

In general, one should always look for the first error message.
The above line says that the model aborted in process 13. Check the CTM_LOG_013 file, and it will likely point you to why the model aborted.
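
If there are many per-processor logs, a small loop can show the first suspicious line in each one. This is a rough sketch: the function name is made up, and searching for the words "error" and "abort" is my assumption about what a failure message will contain, so widen the pattern if nothing turns up.

```shell
#!/bin/sh
# first_error_in_logs FILE...   (hypothetical helper)
# For each log file, print "file:line-number:text" for the first line
# that mentions "error" or "abort" (case-insensitive). Files with no
# such line are skipped.
first_error_in_logs() {
    for log_file in "$@"; do
        first_hit=$(grep -n -i -e 'error' -e 'abort' "$log_file" | head -n 1)
        if [ -n "$first_hit" ]; then
            printf '%s:%s\n' "$log_file" "$first_hit"
        fi
    done
    return 0
}
```

A typical call would be first_error_in_logs CTM_LOG_*, run in the directory that holds the per-processor logs.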

Thanks for your suggestion, @barronh! I updated the path and this syntax error disappeared.

Thanks, Chris! I was able to track down my error based on your suggestions. However, after I solved some issues, I ran into another problem, which is shown in the screenshot. I did not find any “error” in those log files.

Then I used 2 processors to run it to check the error in Log files, but still did not find any error. The log files are attached here for review. CTM_LOG_000.v53_intel_2015_12US1_459X299_20150101.txt (8.9 KB) CTM_LOG_001.v53_intel_2015_12US1_459X299_20150101.txt (9.9 KB)

You say you are using two processors, but your screenshot indicates the model thinks 32 processors are available. If you are in fact using only two processors, you should modify NPCOL_NPROW accordingly so that NPCOL*NPROW = 2.
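
The consistency rule can be checked with a few lines of shell before submitting the job. This is a minimal sketch in POSIX sh; the function name check_decomposition is hypothetical, and it takes the NPCOL_NPROW value as a single quoted string such as "1 2".

```shell
#!/bin/sh
# check_decomposition "NPCOL NPROW" NPROCS   (hypothetical helper)
# Returns 0 when NPCOL * NPROW equals the number of MPI processes.
check_decomposition() {
    want_procs=$2
    # Split the first argument on whitespace into $1 (NPCOL) and $2 (NPROW).
    set -- $1
    have_procs=$(( $1 * $2 ))
    if [ "$have_procs" -eq "$want_procs" ]; then
        return 0
    fi
    echo "NPCOL*NPROW = $have_procs, but $want_procs processes requested"
    return 1
}
```

With two processors, check_decomposition "1 2" 2 passes, while check_decomposition "8 4" 2 reports the mismatch.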

That screenshot was from a run with 32 processors. I then reduced it to 2 and also modified NPCOL_NPROW in the *csh file, but I got the same error. The screenshot below shows the result using 2 processors.

It’s hard to tell what is wrong, but it appears to me that you have a general problem with your computing environment and mpi. Were you able to follow the benchmark tutorial?

Yes, I ran the benchmark case successfully. Should I try to run it again to check the environment?

I actually ran the benchmark again just now and it went smoothly. Is it possible that something is wrong with the input files? I am using the met data from the CMAS data warehouse, emissions data that I generated myself using SMOKE, and an icbc file generated by MCIP using the profile. However, based on the current information in the logs, I have no idea which file I should check first.

Thanks,
Qian

The problem is probably the emissions – it usually is – but there certainly should be a more helpful error message. If possible, can you run on the same domain using a previously-tested emissions data set?

I do not have any other emissions data available for 2015. Can I find data for 2015 somewhere? If not, maybe I can run 2016 using the data from the CMAS data warehouse and let you know whether anything is wrong with the input files.

Hi Chris,

I was able to run on the same domain using 2016 emissions data. It seems that there might be something wrong with my emissions input. Is there any way to check my emissions data directly, rather than running SMOKE again?

Thanks,
Qian

“Might” be something wrong with the emissions? I think you’ve shown that pretty conclusively! The weird thing is that there is something so catastrophically wrong that it is crashing your model run rather than aborting with an error message. That is unusual.

What is the output of ncdump -h <filename>? Are you able to visualize the file in ncview or VERDI? What if you use m3stat on the file?
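
When reading the header, the grid dimensions are a good first thing to compare against your GRIDDESC. The sketch below pulls a single integer global attribute out of ncdump -h output; the function name ioapi_attr is made up, and it assumes the file carries the usual I/O API global attributes such as NCOLS, NROWS and NLAYS, printed by ncdump in the form ":NCOLS = 459 ;".

```shell
#!/bin/sh
# ioapi_attr NAME   (hypothetical helper)
# Reads "ncdump -h" output on standard input and prints the integer
# value of the global attribute NAME. Prints nothing if NAME is absent.
ioapi_attr() {
    sed -n "s/^[[:space:]]*:$1 = \([0-9][0-9]*\) ;.*/\1/p"
}
```

Usage would look like ncdump -h <filename> | ioapi_attr NCOLS, repeated for NROWS, with the results compared against the values for your grid in GRIDDESC.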

Thanks for your suggestions, Chris! I will check the emission files carefully once again.

Thanks,
Qian

I found where the “==d== NO ncols nrows” message is coming from. It is in centralized_io_module.F. In the version of the code in our repo, that’s in a call to m3exit, so should be properly logged, but in the version of the code on our public GitHub, it is followed by a stop command – which explains why the model stops without further logging.

That still does not explain why the model is not properly setting ncols and nrows in one case versus another. Was there anything else different in your run script?

Regardless, let’s start from the beginning. I recommend you try the latest release, CMAQv5.3.1. Verify you can run the benchmark tutorial case with that version of the code.
Once you have that working, try your 2015 12US1 case again. If it still crashes, post a new thread. Include your main log file. If the log references a particular processor, then include the log file from that processor as well.