Issue with Running CMAQ on Linux – Segmentation Fault Error!

Hello everyone,

I am trying to run CMAQv5.4 on a Linux system (Ubuntu 22.04), but I keep getting a segmentation fault when executing CCTM.exe. Here is what I have done so far:

  1. Successfully compiled CMAQ with Intel compilers (ifort, icc).
  2. Checked my bldit_cctm.csh script for any issues.
  3. Ensured all libraries (NetCDF, I/O API) are correctly installed.
  4. Ran ulimit -s unlimited to avoid stack size issues.

However, the model crashes with this error:

Segmentation fault (core dumped)

I suspect it might be related to memory allocation or missing environment variables. Has anyone encountered a similar issue? Any guidance on debugging this would be highly appreciated. I also searched the forum for a solution and found this thread https://forum.cmascenter.org/t/segmentation-fault-running-benchmark-in-cmaqv5-aws-devops but it did not resolve my problem.

Thanks in advance!

With Regards,
Derek Theler

It is most likely a memory issue, but how much memory you need will depend on the domain size and the number of variables. If you are using the Slurm scheduler, you can increase the amount of memory available to the job with #SBATCH directives:

Example:

#SBATCH --nodes=3
#SBATCH --ntasks-per-node=90
#SBATCH --mem-per-cpu=3G

After a job has completed, you can check how much memory it used by running the following command with the job ID:

seff

Example for a 12US1 domain:

seff 81865
Job ID: 81865
Cluster: sycamore
User/Group: lizadams/rc_cep-emc_psx
State: COMPLETED (exit code 0)
Nodes: 3
Cores per node: 90
CPU Utilized: 10-12:44:31
CPU Efficiency: 68.16% of 15-10:48:00 core-walltime
Job Wall-clock time: 01:22:24
Memory Utilized: 809.94 GB (estimated maximum)
Memory Efficiency: 99.99% of 810.00 GB (3.00 GB/core)
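
The 810.00 GB in the report above follows directly from the three #SBATCH lines in the example request. A minimal sketch of that arithmetic, assuming one CPU per task (the Slurm default when --cpus-per-task is not set); the variable names are only illustrative:

```shell
#!/bin/sh
# Values taken from the example #SBATCH lines above.
NODES=3
TASKS_PER_NODE=90
MEM_PER_CPU_GB=3

# Assumption: one CPU per task, so cores = nodes * tasks per node.
TOTAL_CORES=$((NODES * TASKS_PER_NODE))
TOTAL_MEM_GB=$((TOTAL_CORES * MEM_PER_CPU_GB))

echo "cores: $TOTAL_CORES, memory requested: ${TOTAL_MEM_GB} GB"
```

A Memory Efficiency close to 100%, as in this report, suggests the job used nearly everything it requested, so a larger --mem-per-cpu or more cores may be worth trying if a similar job crashes.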

If you do not have access to a job scheduler, then you may need to increase the number of cores that you are using.

Increase the values of NPCOL and NPROW in the run script (the number of cores used is NPCOL x NPROW) to use more cores, and therefore have access to more memory.

For this case (output_CCTM_v55_gcc_Bench_2018_12NE3_cb6r5_ae7_aq), run with NPCOL_NPROW set to 8 4:

According to the seff report, about 16 GB of memory was used:

Job ID: 5887283
Cluster: dogwood
User/Group: lizadams/rc_cep-emc_psx
State: COMPLETED (exit code 0)
Nodes: 2
Cores per node: 16
CPU Utilized: 22:21:22
CPU Efficiency: 94.37% of 23:41:20 core-walltime
Job Wall-clock time: 00:44:25
Memory Utilized: 15.65 GB
Memory Efficiency: 1.56% of 1005.84 GB
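
The 8 x 4 decomposition gives 32 processes, which matches the 2 nodes x 16 cores in this report. A small sketch of a check that could go near the top of a job script to catch a mismatch before the model starts; SLURM_NTASKS is set by Slurm inside a job, and the fallback of 32 is only an assumption so the snippet also runs outside a job:

```shell
#!/bin/sh
# Domain decomposition from the example above.
NPCOL=8
NPROW=4
NPROCS=$((NPCOL * NPROW))

# SLURM_NTASKS is provided by Slurm inside a job; fall back to 32 (assumed) elsewhere.
NTASKS=${SLURM_NTASKS:-32}

if [ "$NPROCS" -ne "$NTASKS" ]; then
    echo "Mismatch: NPCOL x NPROW = $NPROCS but $NTASKS tasks were allocated" >&2
else
    echo "OK: $NPROCS processes for $NTASKS tasks"
fi
```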

You can also try to measure memory usage while a job is running, using top or htop.
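
If you prefer a single number to watching top, the following sketch sums the resident memory of all processes with a given name on the current node. It assumes a Linux ps from procps; the function name rss_total_kib is made up for this example, and the default process name CCTM.exe is taken from the question, so replace it with the name of your executable. For a multi-node MPI run it only sees the processes on the node where you run it.

```shell
#!/bin/sh
# Sum the resident set size (RSS, in KiB) of every process whose
# command name equals $1. Prints 0 if no such process is running.
rss_total_kib() {
    ps -eo rss=,comm= | awk -v name="$1" '$2 == name { sum += $1 } END { print sum + 0 }'
}

# Assumed executable name; pass a different one as the first argument.
PROC_NAME=${1:-CCTM.exe}
echo "$PROC_NAME total RSS: $(rss_total_kib "$PROC_NAME") KiB"
```

Running it repeatedly, for example under watch, while the model runs shows how memory use grows before a crash.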

Additional tips:
