Can CMAQ only run with fewer than a thousand processors?

Hello

I am running CMAQv5.5, built with gcc, on a Cray machine. I’m testing for the optimal configuration (e.g., the domain decomposition for my HPC system) for the CONUS 12US1 domain using 2022v1 EMP data. In the process, I set up CMAQ to run with more than 1000 processors (46 x 30 = 1380):

@ NPCOL = 46; @ NPROW = 30

This fails almost immediately, and I see that no log files are created for processor IDs exceeding 999 (the last two log files are shown below):
CTM_LOG_999.v55_gcc_IE_12US1_cb6r5_ae7_aq_m3dry_20220101
'CTM_LOG_***.v55_gcc_IE_12US1_cb6r5_ae7_aq_m3dry_20220101'

Is this expected behavior, i.e., is CMAQ limited to fewer than 1000 processors in total (which is my guess based on the log file name structure), or am I doing something wrong here? I have another test run in progress that is running fine with 900 processors.

Thank you for your insights.

PS: I recall a similar issue with MCIP, where specifying a domain with more than 999 rows (or columns) caused MCIP to fail.
PS2: In my testing, a 7x increase in the number of processors resulted in a 50% reduction in run time.
PS3: My simulation run time essentially flattens out beyond 750 processors.

There should be a workaround for this. The names for the CTM_LOG files are created in setup_logdev.F. To correctly generate CTM_LOG file names when running with more than 1000 processors, you can edit this line from:
WRITE ( CMYPE, '(I3.3)' ) MYPE
to:
WRITE ( CMYPE, '(I4.4)' ) MYPE

Then also update a line in RUNTIME_VARS.F from:
CHARACTER( 3 ) :: CMYPE = "" ! Processor Number
to:
CHARACTER( 4 ) :: CMYPE = "" ! Processor Number
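
For anyone wondering where the *** in the log file name comes from: when an integer does not fit in the field width of a Fortran I edit descriptor, the whole field is filled with asterisks. The standalone sketch below is not CMAQ code; the program and variable names (cmype_width_demo, cmype3, cmype4) are made up for illustration, and only the two format strings are taken from the lines above.

```fortran
program cmype_width_demo
  implicit none
  character(3) :: cmype3   ! hypothetical stand-in for the current 3-character CMYPE
  character(4) :: cmype4   ! hypothetical stand-in for the widened 4-character CMYPE
  integer :: mype          ! zero-based processor ID

  ! 999 still fits in three digits
  mype = 999
  write ( cmype3, '(I3.3)' ) mype
  print '(A)', 'CTM_LOG_' // cmype3    ! CTM_LOG_999

  ! 1000 overflows the 3-character field, so it is filled with asterisks
  mype = 1000
  write ( cmype3, '(I3.3)' ) mype
  print '(A)', 'CTM_LOG_' // cmype3    ! CTM_LOG_***

  ! the widened format and variable hold the full processor ID
  write ( cmype4, '(I4.4)' ) mype
  print '(A)', 'CTM_LOG_' // cmype4    ! CTM_LOG_1000

  ! smaller IDs are zero-padded to four digits
  mype = 7
  write ( cmype4, '(I4.4)' ) mype
  print '(A)', 'CTM_LOG_' // cmype4    ! CTM_LOG_0007
end program cmype_width_demo
```

The character variable has to be widened together with the format: an internal write of four characters into a three-character variable does not fit, and would normally be flagged as a runtime error rather than silently truncated.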

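Optionally, you can also update wrsubdmap.f so that the processor numbers printed to the log file in the processor-to-subdomain map are formatted properly. Only two edit descriptors change: I3 becomes I4 in the "Number of Processors" line, and the first i3 in format 1003 becomes i4. The change would be from: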

      write( *,* )
      write( *,* ) '         -=-  MPP Processor-to-Subdomain Map  -=-'
      write( *,'(A,I3)' ) '                 Number of Processors = ',nprocs
      write( *,* ) '   ____________________________________________________'
      write( *,* ) '   |                                                  |'
      write( *,* ) '   |' // title // ' |'
      write( *,* ) '   |__________________________________________________|'
      write( *,* ) '   |                                                  |'
      do i = 1, nprocs
         write( *,1003 ) i-1, ncols_pe(i), colsx_pe(1,i), colsx_pe(2,i),
     &                        nrows_pe(i), rowsx_pe(1,i), rowsx_pe(2,i)
      end do
      write( *,* ) '   |__________________________________________________|'
      write( *,* )

1003  format('    |', i3, 5x, i4, 3x, i4, ':', i4,
     &                    7x, i4, 3x, i4, ':', i4, '   |')

to:

      write( *,* )
      write( *,* ) '         -=-  MPP Processor-to-Subdomain Map  -=-'
      write( *,'(A,I4)' ) '                 Number of Processors = ',nprocs
      write( *,* ) '   ____________________________________________________'
      write( *,* ) '   |                                                  |'
      write( *,* ) '   |' // title // ' |'
      write( *,* ) '   |__________________________________________________|'
      write( *,* ) '   |                                                  |'
      do i = 1, nprocs
         write( *,1003 ) i-1, ncols_pe(i), colsx_pe(1,i), colsx_pe(2,i),
     &                        nrows_pe(i), rowsx_pe(1,i), rowsx_pe(2,i)
      end do
      write( *,* ) '   |__________________________________________________|'
      write( *,* )

1003  format('    |', i4, 5x, i4, 3x, i4, ':', i4,
     &                    7x, i4, 3x, i4, ':', i4, '   |')

Besides these formatting issues, I am not aware of anything else in the code that prevents CMAQ from running with more than 1000 processors.

One final thing: our sample run scripts usually use CTM_LOG_??? to find the log files, so this pattern would need to be updated to CTM_LOG_???? for the run scripts to behave as expected with the above changes to the source code.

Thanks, Nash! I’ll give this a try.