CMAQ configuration for latest AMD EPYC 7713 with 1TB RAM machines

kmonk · May 12, 2022, 5:18am

We are trying to compile CMAQ on new AMD machines (AMD EPYC 7713 with 1TB RAM). This processor is unusual in that it is built as a set of distributed chips (dies) connected by a central interconnect chip. The benchmark case runs fine, but when we run our larger forecast runs (which completed fine on our old Intel machines) we get SIGBUS errors and memory usage more than doubles.

As there are no default configuration options for AMD, we are trying to figure them out ourselves. Does anyone have experience with this, and suggestions on flags to use or avoid?

lizadams · June 7, 2022, 1:59pm

Please try recompiling I/O API to use the medium memory model.

Documentation regarding this flag is here: Availability/Download of the BAMS/Models-3 I/O API

cjcoats · June 7, 2022, 3:24pm

What compiler are you using? It might be beneficial to use your compiler’s processor-specific flags for this processor. For example, with a recent gcc/gfortran, the “lazy way” to do this is to use -march=native -mtune=native; with Intel compilers you might well need to use an explicit -xCORE-AVX2 -march=core-avx2 instead of -xHost (some versions of Intel compilers are suspected of not doing the latter correctly for non-Intel processors).

And of course use compatible flags for compiling the entire system – not just CMAQ but also I/O API, netCDF, …
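The consistency check suggested above can be sketched as a small script. This is only an illustration: the flag strings in `BUILD_FLAGS` are hypothetical placeholders, not the flags of any real build, and the helper names are made up for this example. The idea is to pull the architecture-related flags out of each component's compile line and report whether the components disagree.

```python
import shlex

# Hypothetical flag strings: replace them with the flags from your own
# netCDF, I/O API and CMAQ build logs or Makefiles.
BUILD_FLAGS = {
    "netCDF": "-O2 -march=native -mtune=native",
    "I/O API": "-O2 -march=native -mtune=native -fopenmp",
    "CMAQ": "-O3 -march=core-avx2",
}

# Prefixes treated as architecture-related (gcc -march/-mtune, Intel -x...).
ARCH_PREFIXES = ("-march=", "-mtune=", "-x")


def arch_flags(flag_string):
    """Return the set of architecture-related flags in a compiler flag string."""
    return {f for f in shlex.split(flag_string) if f.startswith(ARCH_PREFIXES)}


def inconsistent_components(build_flags):
    """Return {component: arch flags} if the components disagree, else {}."""
    per_component = {name: arch_flags(flags) for name, flags in build_flags.items()}
    distinct = {frozenset(flags) for flags in per_component.values()}
    return per_component if len(distinct) > 1 else {}


if __name__ == "__main__":
    mismatch = inconsistent_components(BUILD_FLAGS)
    if not mismatch:
        print("Architecture flags agree across all components.")
    for name, flags in mismatch.items():
        print(f"{name}: {sorted(flags)}")
```

With the placeholder values above, CMAQ would be flagged as built differently from netCDF and I/O API.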

kmonk · June 24, 2022, 5:30am

Thank you for your suggestions. We have narrowed down the issue to the I/O API XTRACT3 function. On the Intel machine it worked fine and the overall run would require at most 200 GB of RAM, whereas our latest runs max out our 1 TB of RAM and eventually we get an out-of-memory error. The XTRACT3 call in centralized_io_module.F appears to be the culprit when loading our emission files (which are ~1.1 GB).

kmonk · June 24, 2022, 5:42am

Further testing, I believe, has just shown that even without loading the emission file it still maxes out the memory; it just gets there quicker (within 1 minute instead of 5).

lizadams · June 24, 2022, 11:11am

Please specify the version of CMAQ that you are using.

Have you tried the most recent version, CMAQv5.3.3?

cjcoats · June 24, 2022, 11:16am

What compiler? And what compile flags? And are they consistent across netCDF, I/O API, and CMAQ?

wong.david-c · June 24, 2022, 12:45pm

Hi kmonk,

If your system has a sufficient number of nodes, please try doubling the number of nodes and halving the number of cores per node in your batch job script.

By the way, how large is your simulation (ncols x nrows x nlays)? Did you change any parameters in the IOAPI library when you recompiled it on the AMD system?

Cheers,
David
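The arithmetic behind the "double the nodes, halve the cores per node" suggestion can be sketched as follows. This assumes one MPI rank per core and a roughly constant memory footprint per rank, neither of which is stated in the thread; the 128-core node and the 4 GB per rank are illustrative numbers only. The total rank count stays the same while the memory demand on each node drops by half.

```python
def rebalance(nodes, cores_per_node):
    """Double the node count and halve the cores used per node."""
    if cores_per_node % 2:
        raise ValueError("cores_per_node must be even to halve it")
    return nodes * 2, cores_per_node // 2


def memory_per_node_gb(cores_per_node, gb_per_rank):
    """Per-node memory demand, assuming one rank per core (an assumption)."""
    return cores_per_node * gb_per_rank


if __name__ == "__main__":
    nodes, cores = 1, 128   # illustrative starting layout
    gb_per_rank = 4.0       # hypothetical per-rank footprint
    new_nodes, new_cores = rebalance(nodes, cores)
    print(f"ranks before: {nodes * cores}, after: {new_nodes * new_cores}")
    print(f"GB per node before: {memory_per_node_gb(cores, gb_per_rank)}")
    print(f"GB per node after:  {memory_per_node_gb(new_cores, gb_per_rank)}")
```

If memory use per node drops accordingly, the problem is likely per-rank memory pressure; if it does not, something outside the per-rank working set is growing.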

wong.david-c · June 27, 2022, 11:42am

Hi kmonk,

Since this is a brand-new system, sometimes certain system parameters, such as stacksize, are not set properly. Even though the CMAQ run script contains the following line, most of the time it can't override the system setting.

limit stacksize unlimited

So please check with your sys admin to make sure stacksize is not set to a lower value somewhere.

Cheers,
David
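One way to see which stack limit a job actually gets is to query it from inside the job. The sketch below uses Python's standard `resource` module (Unix only); the helper names are made up for this example. It reports the limits of the Python process itself, which are inherited from the shell that launched it, so it is only meaningful when run from the same batch script and environment as CMAQ.

```python
import resource


def describe(limit):
    """Format a limit given in bytes, or 'unlimited' for RLIM_INFINITY."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit // 1024} KiB"


def stack_limits():
    """Return (soft, hard) stack limits of this process as readable strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return describe(soft), describe(hard)


if __name__ == "__main__":
    soft, hard = stack_limits()
    # An unprivileged process can raise its soft limit only up to the hard
    # limit, so a finite hard limit caps what a run script can request.
    print(f"stack soft limit: {soft}")
    print(f"stack hard limit: {hard}")
```

If the hard limit printed here is finite, a request for an unlimited stack in the run script cannot take effect, and the setting has to be changed at the system or scheduler level.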

kmonk · July 27, 2022, 4:37am

Update on our progress. We are using CMAQv5.3.2, compiled with gcc, and flags have been kept as consistent as possible. We have also been testing with an Intel compiler.

We were sticking with running on a single node and utilizing the 128 CPUs available. We have tested with multithreading, using up to 240 CPUs, but got no improvement in performance due to the HADV step.

We have been using limit stacksize unlimited.

We managed to get the runs to work by converting our input files from NETCDF4 to either CDF5 (NETCDF3_64BIT_DATA), 64BIT_OFFSET or NETCDF4_CLASSIC. The system was not handling the compressed NETCDF4 files. We believe this issue has arisen because our new system does not cache the uncompressed file (and our previous system did).
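To check which on-disk format each input file uses before a run, the file's leading magic bytes are enough to tell the classic formats from the HDF5-based ones. The sketch below uses only the Python standard library; the function name is made up for this example. It cannot tell NETCDF4 from NETCDF4_CLASSIC (both are HDF5 files) and does not report whether variables are compressed. The netCDF utility `ncdump -k` reports the format kind more precisely.

```python
import sys

# Leading bytes of each netCDF on-disk format.
MAGIC = [
    (b"CDF\x01", "NETCDF3_CLASSIC"),
    (b"CDF\x02", "NETCDF3_64BIT_OFFSET"),
    (b"CDF\x05", "NETCDF3_64BIT_DATA (CDF5)"),
    (b"\x89HDF\r\n\x1a\n", "HDF5-based (NETCDF4 or NETCDF4_CLASSIC)"),
]


def netcdf_kind(path):
    """Guess the netCDF format of a file from its first bytes.

    Only offset 0 is checked, so an HDF5 file with a user block in front
    of the signature would be reported as 'unknown'.
    """
    with open(path, "rb") as fh:
        head = fh.read(8)
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return "unknown"


if __name__ == "__main__":
    # Usage: python check_format.py emis_file1.nc emis_file2.nc ...
    for path in sys.argv[1:]:
        print(f"{path}: {netcdf_kind(path)}")
```

Files reported as HDF5-based are the candidates for conversion if compressed NETCDF4 input turns out to be the problem on a given system.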
