Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation

Hello all,

I am running CMAQ on a 4 km FL grid to simulate concentrations from Feb 1, 2023 through May 1, 2023. I am getting the error below:
Processing Day/Time [YYYYDDD:HHMMSS]: 2023032:010000
Which is Equivalent to (UTC): 1:00:00 Wednesday, Feb. 1, 2023
Time-Step Length (HHMMSS): 000500

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0 0x14ffee06e171 in ???
#1 0x14ffee06d313 in ???
#2 0x14ffed4fcb7f in ???
#3 0x8b2e39 in m3dry_
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/m3dry.F:652
#4 0x685858 in depv_defn_MOD_get_depv
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/DEPV_DEFN.F:544
#5 0x8988ec in vdiff
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/vdiffproc.F:411
#6 0x7f7b0a in sciproc
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/sciproc.F:233
#7 0x7e6f6c in cmaq_driver_
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/driver.F:717
#8 0x7e1ee5 in cmaq
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/cmaq_main.F:97
#9 0x7e21f4 in main
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/cmaq_main.F:32
#0 0x145df6f0a171 in ???
#1 0x145df6f09313 in ???
#2 0x145df6398b7f in ???
#3 0x8b2e39 in m3dry_
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/m3dry.F:652
#4 0x685858 in depv_defn_MOD_get_depv
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/DEPV_DEFN.F:544
#5 0x8988ec in vdiff
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/vdiffproc.F:411
#6 0x7f7b0a in sciproc
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/sciproc.F:233
#7 0x7e6f6c in cmaq_driver_
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/driver.F:717
#8 0x7e1ee5 in cmaq
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/cmaq_main.F:97
#9 0x7e21f4 in main
at /lustre/fs1/home/mhasan/Parallel/Build_CMAQ/CMAQ_Project/CCTM/scripts/BLD_CCTM_v532_gcc/cmaq_main.F:32

I am attaching my run file along with some log files. Thanks for your help.
run_cctm_202302_4FL_log.txt (57.5 KB)
CTM_LOG_000.v532_gcc_4MIDFLA1_87X81_20230201.txt (72.1 KB)
run_cctm_202302_4FL.txt (35.7 KB)

Hasibul

This looks like a variant of https://cjcoats.github.io/ioapi/ERRORS.html#inst: some model component was compiled for a newer or different processor than the one you are actually running the model on. Saying “I’m running on Intel XEON” is not enough, because instruction sets differ between Xeon generations.

What does cat /proc/cpuinfo say? …and where did the executable come from?
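
If reading the raw /proc/cpuinfo output by eye is tedious, a small helper along these lines might make the comparison with the compiler flags easier. This is only a sketch: the function name `summarize_cpuinfo` is made up for illustration, and it assumes a Linux x86 node where /proc/cpuinfo contains `model name` and `flags` lines.

```python
def summarize_cpuinfo(text):
    """Return (model name, set of CPU flags) for the first processor entry.

    `text` is the full contents of /proc/cpuinfo (assumed Linux x86 layout).
    """
    model = None
    flags = set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor entries
        key = key.strip()
        if key == "model name" and model is None:
            model = value.strip()
        elif key == "flags" and not flags:
            flags = set(value.split())
        if model is not None and flags:
            break  # first processor entry is enough
    return model, flags
```

On the compute node (not the login node) one could call `summarize_cpuinfo(open("/proc/cpuinfo").read())` and check whether instruction sets such as `avx2` or `avx512f` appear in the returned flags.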

Hello Hasibul,

Line 652 of m3dry.F, which the backtrace points to, is:

           rstom = MET_DATA%RS( c,r ) * dwat / dif0( l )
 &               + 1.0 / ( heff_ap / 3000.0 + 100.0 * meso( l ) ) / laicr

My suspicion is that there is some inconsistency in your WRF fields between the LWMASK or VEGF fields and the LAI fields. Specifically, I suspect your WRF fields may have grid cells where LAI is zero (causing a divide-by-zero error) despite having a non-zero VEGF value and/or a LWMASK indicating land.
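
The expression on line 652 actually contains three divisors, so it may be worth confirming which one is zero before changing any inputs. The following is a minimal sketch, not CMAQ code: the function name `zero_divisors` is hypothetical, the arguments stand in for the per-cell values of `dif0( l )`, `heff_ap`, `meso( l )`, and `laicr`, and the numbers in the usage lines are made up.

```python
def zero_divisors(dif0, heff_ap, meso, laicr):
    """Name every divisor in the line-652 rstom expression that is exactly zero."""
    candidates = {
        "dif0(l)": dif0,
        "heff_ap/3000 + 100*meso(l)": heff_ap / 3000.0 + 100.0 * meso,
        "laicr": laicr,
    }
    return [name for name, value in candidates.items() if value == 0.0]


# Made-up values for illustration only.
print(zero_divisors(0.2, 1.0e5, 0.1, 0.0))   # laicr is the zero divisor here
print(zero_divisors(0.2, 1.0e5, 0.1, 2.5))   # no zero divisors
```

If `laicr` turns out to be the only zero divisor in the failing cell, that would support the LAI explanation above.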

Stomatal resistance (rstom) in line 652 is only calculated over land cells.

Line 502 determines whether a grid cell should be treated as all water (unvegetated):

        IF ( ( NINT(GRID_DATA%LWMASK( c,r )) .EQ. 0 ) .OR. ( vegcr .EQ. 0.0 ) ) THEN  ! water

Line 563 then starts the “else” branch of that “if” condition with the land-based calculations, including the calculation of rstom that assumes LAI to be non-zero.

The issue you are encountering seems similar to the one in this thread:

As discussed in that thread, you will want to carefully analyze your LWMASK, VEGF, and LAI fields to identify and resolve any inconsistencies between them.
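
As a starting point for that analysis, the check could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested tool: it assumes `vegcr` equals the VEGF value of the cell, it takes the three fields as plain nested lists (reading them from your MCIP output is left out), the helper names are hypothetical, and the 2x3 fields are made-up numbers.

```python
def nint(x):
    """Round half away from zero, like Fortran NINT (Python's round() rounds half to even)."""
    return int(x + 0.5) if x >= 0 else int(x - 0.5)


def is_water(lwmask, vegf):
    """Mirror of the line-502 test: mask says water, or vegetation fraction is zero."""
    return nint(lwmask) == 0 or vegf == 0.0


def find_bad_cells(lwmask, vegf, lai):
    """Return 0-based (row, col) indices of cells treated as land but with LAI == 0."""
    bad = []
    for r, (mask_row, veg_row, lai_row) in enumerate(zip(lwmask, vegf, lai)):
        for c, (m, v, leaf) in enumerate(zip(mask_row, veg_row, lai_row)):
            if not is_water(m, v) and leaf == 0.0:
                bad.append((r, c))
    return bad


# Made-up 2x3 example fields, not real data.
lwmask = [[1.0, 1.0, 0.0],
          [1.0, 0.0, 1.0]]
vegf = [[0.4, 0.0, 0.3],
        [0.2, 0.0, 0.5]]
lai = [[2.0, 0.0, 0.0],
       [0.0, 0.0, 1.5]]

print(find_bad_cells(lwmask, vegf, lai))  # [(1, 0)]
```

Any cell this reports would take the land branch and then divide by a zero LAI. Note that the indices are 0-based (row, col), whereas the Fortran code indexes cells as 1-based (c, r).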