CCTM stops running on long simulation


#1

Hi all,

OK, so in a different topic I asked how to set up a long simulation. I then updated my run_cctm.csh script and launched the simulation. However, it halts after a few hours of simulation. Before this run, I was able to run the model for 24 hours just fine.

I can’t find any obvious errors in the run_cctm.csh output or in the CTM_LOG files. This is my CCTM configuration:
http://demo-staging.com/cmaq_long_run/run_cctm.csh

and these are the output/logs:
http://demo-staging.com/cmaq_long_run/run.benchmark.log
http://demo-staging.com/cmaq_long_run/CTM_LOG_001.v521_gcc_CMAQ-ASU1K_20150601
http://demo-staging.com/cmaq_long_run/CTM_LOG_002.v521_gcc_CMAQ-ASU1K_20150601
http://demo-staging.com/cmaq_long_run/CTM_LOG_003.v521_gcc_CMAQ-ASU1K_20150601
http://demo-staging.com/cmaq_long_run/CTM_LOG_004.v521_gcc_CMAQ-ASU1K_20150601
http://demo-staging.com/cmaq_long_run/CTM_LOG_005.v521_gcc_CMAQ-ASU1K_20150601

Thanks in advance for the help!


#2

For the CMAQv5.2.1 version, it is necessary to change the END_DATE if you want to do a longer duration run.

Keep NSTEPS set to one day (24 hours), because the run script runs the model one day at a time.

set NSTEPS = 240000 #> time duration (HHMMSS) for this run

The run script has a while loop that is set up to run the model from the START_DATE to the END_DATE.
As long as the current day (TODAYJ) is less than or equal to STOP_DAY, the model will continue to run.

I’ve pulled out the lines of code that control the WHILE loop for you to review:

#> Set Start and End Days for looping
setenv NEW_START TRUE #> Set to FALSE for model restart
set START_DATE = "2011-07-01" #> beginning date (July 1, 2011)
set END_DATE = "2011-07-01" #> ending date (July 14, 2011)

set STOP_DAY = `date -ud "${END_DATE}" +%Y%j` #> Convert YYYY-MM-DD to YYYYJJJ

This is the start of the loop:

while ($TODAYJ <= $STOP_DAY ) #>Compare dates in terms of YYYYJJJ

Before the end of the loop, the script increments both the Gregorian and Julian days.

#> The next simulation day will, by definition, be a restart
setenv NEW_START false

#> Increment both Gregorian and Julian Days
set TODAYG = `date -ud "${TODAYG}+1days" +%Y-%m-%d` #> Add a day for tomorrow
set TODAYJ = `date -ud "${TODAYG}" +%Y%j` #> Convert YYYY-MM-DD to YYYYJJJ
end #Loop to the next Simulation Day
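
For anyone who wants to check which days the loop will visit without launching the model, here is a minimal Python sketch of the same date logic. It is only an illustration of the loop above, not part of the CMAQ scripts; the function name simulation_days is made up for this example.

```python
from datetime import date, timedelta

def simulation_days(start_date, end_date):
    """Yield (YYYY-MM-DD, YYYYJJJ) for each day the run-script loop would visit.

    Hypothetical helper that mirrors the csh loop; it does not run CMAQ.
    """
    today = date.fromisoformat(start_date)
    # Same conversion as: date -ud "${END_DATE}" +%Y%j
    stop_day = int(date.fromisoformat(end_date).strftime("%Y%j"))
    # Same test as: while ($TODAYJ <= $STOP_DAY)
    while int(today.strftime("%Y%j")) <= stop_day:
        yield today.isoformat(), today.strftime("%Y%j")
        today += timedelta(days=1)  # same as: date -ud "${TODAYG}+1days"

days = list(simulation_days("2015-06-01", "2015-06-20"))
print(len(days), days[0], days[-1])
```

If START_DATE and END_DATE are the same day, the loop runs exactly once, which is why a longer run needs END_DATE changed rather than NSTEPS.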


#3

Hi @lizadams

I made the changes to my run_cctm.csh script as advised. NSTEPS is set to 24 hours:

set NSTEPS     = 240000

and START_DATE / END_DATE are set to the simulation period, in my case June 1, 2015 to June 20, 2015:

 set START_DATE = "2015-06-01"    #> beginning date (July 1, 2011)
 set END_DATE   = "2015-06-20"    #> ending date    (July 14, 2011)

However, the model still doesn’t go past the first 24 hours. I actually think it’s crashing, but my biggest issue is that the model isn’t outputting any sort of error, so I’m having a hard time troubleshooting. Is there anything you can think of that I could turn on to gather more info?

This is my current run_cctm.csh script:

In case it helps, here are the output logs:

From run_cctm.csh
http://www.demo-staging.com/cctm_run/run.benchmark.txt

From CCTM
http://www.demo-staging.com/cctm_run/CTM_LOG_001.v521_gcc_CMAQ-ASU1K_20150601.txt
http://www.demo-staging.com/cctm_run/CTM_LOG_002.v521_gcc_CMAQ-ASU1K_20150601.txt
http://www.demo-staging.com/cctm_run/CTM_LOG_003.v521_gcc_CMAQ-ASU1K_20150601.txt
http://www.demo-staging.com/cctm_run/CTM_LOG_004.v521_gcc_CMAQ-ASU1K_20150601.txt
http://www.demo-staging.com/cctm_run/CTM_LOG_005.v521_gcc_CMAQ-ASU1K_20150601.txt

Thanks in advance for your help!


#4

@bbaek and @ever.barreto I am seeing this WARNING at the end of this CMAQ run in the log files:

>>--->> WARNING in subroutine EMIS_SPC_CHECK on PE 005
For optimal predictions, species with the missing surrogates should have a surrogate found in at least one source.
M3WARN: DTBUF 0:00:00 June 1, 2015 (2015152:000000)

It appears that the model doesn’t progress beyond that point, so perhaps it is crashing right away.

I also see the following warning earlier in the log file.

Checking header data for file: EMIS_1
Inconsistent value for vertical level


#5

Hi @lizadams,

Yes, I did see those warnings. However, they are the same warnings that I got when I ran the simulation for 1 day, and that simulation completed successfully.

That’s why I think it’s something related to the simulation being more than 1 day.

Is there a debug option that I can turn on to get more info on what the model is doing before it crashes or finishes?

Thanks for your time and patience!


#6

Hello Ever:
Have you tried running for 3 days, for example by changing “set NSTEPS = 240000” to 7200000?

Maybe one of your inputs does not have data for the period you want.
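
One quick way to test that idea for inputs that come as one file per day is to list which daily files are missing for the simulation period. The sketch below is only an illustration: the directory, the file-name pattern and the function name missing_daily_files are placeholders, so substitute whatever your run script actually builds from YYYYMMDD. It checks only that each file exists, not which time steps it contains.

```python
from datetime import date, timedelta
from pathlib import Path

def missing_daily_files(directory, pattern, start_date, end_date):
    """Return the names of expected per-day files that are not on disk.

    `pattern` is a hypothetical template containing {yyyymmdd}; this helper
    is not part of CMAQ and does not open the files.
    """
    day = date.fromisoformat(start_date)
    last = date.fromisoformat(end_date)
    missing = []
    while day <= last:
        path = Path(directory) / pattern.format(yyyymmdd=day.strftime("%Y%m%d"))
        if not path.is_file():
            missing.append(path.name)
        day += timedelta(days=1)
    return missing

# Placeholder directory and pattern, for illustration only.
print(missing_daily_files("cctm_input/emis", "emis_{yyyymmdd}.nc",
                          "2015-06-01", "2015-06-20"))
```

An empty list means every expected daily file is present; it says nothing about whether the data inside cover the full day.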


#7

Sorry, set NSTEPS = 720000


#8

Hi @ernesto.pinoc,

I didn’t try running it for 3 days. I did try to run it for 20 days by increasing NSTEPS to 4800000, as you advised here:

but I will try to run it for 3 days and report back.

Thanks!


#9

Hi @ernesto.pinoc,

I did another test run with NSTEPS set to 720000 but got the same result: the model seems to halt after day one.

I saw you did a run for 15 days here:

Any chance you can share your run_cctm.csh script with me? I just want to compare my settings against yours in case something pops up. Do you think that’s possible?

Thank you!


#10

Looking at this, and looking at "initscen.F" in the source code,
I have a response, and additionally a problem-report.

The relevant environment variable is CTM_RUNLEN, which should
give the duration of the run in format H*MMSS, i.e.,

100*(100*HOURS + MINUTES) + SECONDS

The script needs to deal with this correctly; moreover, this
value should NOT be over-written elsewhere. Doing so is metadata
perjury.
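
As a worked example of that encoding, here is a small Python sketch (the helper names to_hmmss and from_hmmss are invented for illustration). It shows why 24, 72 and 480 hours become 240000, 720000 and 4800000, and why a value of 7200000 means 720 hours rather than 72.

```python
def to_hmmss(hours, minutes=0, seconds=0):
    """Encode a duration as H*MMSS: 100*(100*HOURS + MINUTES) + SECONDS."""
    return 100 * (100 * hours + minutes) + seconds

def from_hmmss(value):
    """Decode an H*MMSS integer back into (hours, minutes, seconds)."""
    hours, rest = divmod(value, 10000)
    minutes, seconds = divmod(rest, 100)
    return hours, minutes, seconds

for hours in (24, 72, 480):
    print(hours, "hours ->", to_hmmss(hours))

# One extra zero changes the run length by a factor of ten.
print(7200000, "->", from_hmmss(7200000))
```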

And then when I start looking at the code and at your log, I find
that this routine is improperly using variables STDATE, STTIME
before they are defined: e.g., its very first “clause”, before
they have been retrieved from the environment, uses them:

  VARDESC = 'Main Program Name'
  CALL M3MESG( VARDESC )
  CALL ENVSTR( CTM_PROGNAME, VARDESC, 'DRIVER', PROGNAME, STATUS )
  IF ( STATUS .EQ. 1 ) THEN
     MSG = 'Environment variable improperly formatted'
     CALL M3EXIT( PNAME, STDATE, STTIME, MSG, XSTAT2 )
  ELSE IF ( STATUS .EQ. -1 ) THEN
     MSG = 'Environment variable set, but empty ... Using default:'
     WRITE( LOGDEV, '(5X, A, I9)' ) MSG, STTIME
  ELSE IF ( STATUS .EQ. -2 ) THEN
     MSG = 'Environment variable not set ... Using default:'
     WRITE( LOGDEV, '(5X, A, I9)' ) MSG, STTIME
  END IF

That’s the reason for the “********” in your log. By convention,
errors before the start of the time-stepped execution of the model
should be reported as having date&time 0:0.

Moreover, the 'Using default:' ... STTIME messages do NOT report
the correct default – for ANY of the environment-variable calls.

For what it’s worth.

Carlie J. Coats, Jr.
I/O API Author/Maintainer


#11

Hi @cjcoats,

really appreciate your time looking into this!

From your comment, however, I am not sure how to proceed. Is there something I can do to get it fixed, or should I go one version down for the model (CMAQ 5.1)?

Thank you!


#12

This block in the logfile:
“EMIS_1” opened as OLD:READ-ONLY
File name “/root/CMAQ-5.2.1/data/CMAQ-ASU1K/cctm_input/emis/HERMESv3_par_20150601.nc”
File type GRDDED3
Execution ID “0.1alpha”
Grid name “”
Dimensions: 53 rows, 53 cols, 18 lays, 31 vbles
NetCDF ID: 524288 opened as READONLY
Starting date and time 2015152:000000 (0:00:00 June 1, 2015)
Timestep 010000 (1:00:00 hh:mm:ss)
Maximum current record number 25
Checking header data for file: EMIS_1
Inconsistent value for vertical level

shows that your emissions file has 18 layers, which differs from your meteorology, which has 29 layers.
In your run script, below where you set EMISfile, also set CTM_EMLAYS:
set EMISfile = HERMESv3_par_${YYYYMMDD}.nc
setenv CTM_EMLAYS 18

This block in your log file:
“OCEAN_1” opened as OLD:READ-ONLY
File name “/root/CMAQ-5.2.1/data/CMAQ-ASU1K/cctm_input/land/ocean_file.dummy.CMAQ-ASU1K.nc”
File type GRDDED3
Execution ID “???”
Grid name “CMAQ-ASU1K”
Dimensions: 53 rows, 53 cols, 29 lays, 2 vbles
NetCDF ID: 589824 opened as READONLY
Starting date and time 2015152:000000 (0:00:00 June 1, 2015)
Timestep 010000 (1:00:00 hh:mm:ss)
Maximum current record number 504
Checking header data for file: OCEAN_1
Inconsistent values for VGTYP: 1 versus 7
Inconsistent values for VGTOP: 1.0000E+02 versus 5.0000E+03
Inconsistent value for vertical level

shows that your ocean file is rather strange. It should be a single layer, not 29, and it should be time-independent, rather than having 504 data records. I don’t know whether either of those differences is affecting you. I agree it is strange that you are not getting a clearer error in your log file.
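
To find every file with this kind of header mismatch without reading each log by hand, something like the following Python sketch could be used. It assumes the log layout matches the excerpts quoted in this post (a "Checking header data for file:" line followed by "Inconsistent ..." lines); the function name header_mismatches is made up, and real logs may need a looser pattern.

```python
import re

def header_mismatches(log_text):
    """Map each logical file name to the 'Inconsistent ...' lines that
    directly follow its 'Checking header data' line.

    Assumes the layout seen in the log excerpts in this thread.
    """
    report = {}
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        match = re.match(r"Checking header data for file:\s*(\S+)", line)
        if match:
            current = match.group(1)
            report.setdefault(current, [])
        elif current is not None and line.startswith("Inconsistent"):
            report[current].append(line)
        elif line:
            current = None  # any other non-blank line ends the block
    # Keep only the files that actually reported a mismatch.
    return {name: msgs for name, msgs in report.items() if msgs}

sample = """
 Dimensions: 53 rows, 53 cols, 18 lays, 31 vbles
 Checking header data for file: EMIS_1
 Inconsistent value for vertical level
 Maximum current record number 504
 Checking header data for file: OCEAN_1
 Inconsistent values for VGTYP: 1 versus 7
 Inconsistent values for VGTOP: 1.0000E+02 versus 5.0000E+03
 Inconsistent value for vertical level
"""

for name, messages in header_mismatches(sample).items():
    print(name, messages)
```

To use it on a real run, read one of the CTM_LOG files into a string and pass that in place of sample.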

Have you tried a multiday run of the benchmark case?


#13

Hi @cgnolte

Re: Emissions file, OK cool, I’m going to make that change to my run script and set it to 18 layers.

Re: Ocean file, yes, I had to create a ‘zeroed’ ocean file because my domain doesn’t have ocean areas in it, as discussed in a separate thread on the former email list. I’m not 100% sure I did the process correctly, because the script I was referred to wasn’t working, so I had to do it by hand.

I didn’t try the multiday benchmark case; I only ran the single-day benchmark case. I’ll look into that and see if I get the same issue.