I am running SMOKE (2019 platform) over our own 36x36 km domain and ran into a segmentation fault in MOVESMRG for RPHO and RPS. I can successfully run RPH, RPV and RPP, though.
MOVESMRG stopped in the middle of reading emission factors, and a segmentation fault was reported in the terminal output.
I took some other steps before posting here:
I tried processing RPS for the 2019 platform domain (12US1), and that was successful.
I compiled SMOKE v4.8.1 and v5.0 successfully, but got the same segmentation fault.
I tried the pre-compiled SMOKE releases (v4.8.1, v4.9 and v5.0) from GitHub, but they also hit the segmentation fault.
To isolate where the seg-fault is happening, it might be useful to re-build and re-run with BIN = Linux2_x86_64ifortdbg; that build would compile with traceback enabled, and the run would then give you the file and line number at which the seg-fault occurred.
That would be a much better start than just knowing the problem is somewhere in MOVESMRG.
My first thought is out of memory, so how much memory/RAM are you allocating to RPHO and RPS? Are you able to run RPD? (RPD is the most memory intensive job.)
There have been significant changes to movesmrg since SMOKE v4.8. When you tested with SMOKE v5.0 did the seg fault occur in the same location?
You also said that you were able to successfully process the 12US1 domain. Is your 36 km domain fully nested into the 36US3 or is it offset in some way?
For testing purposes, I added setenv APPLY_NOX_HUMIDITY_ADJ "N" to the run script, and this did not change anything.
My observation is that MOVESMRG failed before reading in the eftables of all ref counties (there should be a few more files to read, judging by the RPP or RPV runs, for example).
SMOKE v5.0 stopped at the same spot in the log file.
For the 12US1 domain, I can process RPS but not RPHO, because RPHO requires MET3D files which I don't have.
================================================
Multi-day run note: Run starts with 2019001, is 250000 long
Running part 4, for 20190101...
This program uses the EPA-AREAL/MCNC-EnvPgms/BAMS/ UNC IE
Models-3 I/O Applications Programming Interface, [I/O API]
which is built on top of the netCDF I/O library (Copyright
1993, 1996 University Corporation for Atmospheric Research
Unidata Program) and the PVM parallel-programming library
(from Oak Ridge National Laboratory).
Copyright (C) 1992-2002 MCNC,
(C) 1992-2018 Carlie J. Coats, Jr.,
(C) 2003-2012 Baron Advanced Meteorological Systems, LLC, and
(C) 2014-2023 UNC Institute for the Environment.
Released under the GNU LGPL License, version 2.1. See URL
https://www.gnu.org/licenses/old-licenses/lgpl-2.1.html
for conditions of use.
ioapi-3.2: $Id: init3.F90 247 2023-03-22 15:59:19Z coats $
netCDF version 4.9.2 of Jul 26 2024 09:55:29 $
SMOKE ---------------
Copyright (c)2004 Center for Environmental Modeling for Policy Development
All rights reserved
Program MOVESMRG, Version SMOKEv5.0_Jun2023
Online documentation
https://cmascenter.org/smoke/
No program description is available for MOVESMRG
You will need to enter the logical names for the input and
output files (and to have set them prior to program start,
using "setenv <logicalname> <pathname>").
You may use END_OF-FILE (control-D) to quit the program
during logical-name entry. Default responses are given in
brackets [LIKE THIS] and can be accepted by hitting the
<RETURN> key.
1982.842u 23.959s 33:37.80 99.4% 0+0k 20261712+5687688io 35pf+0w
now checking log file /Ext05/emissions/2019/2019ge_cb6_19k/intermed/onroad/RPS/logs/movesmrg_RPS_onroad_jan_2019ge_cb6_19k_20190101_12US1_cmaq_cb6ae7.log
Now running M3STAT
Since you are able to run RPS for 12US1, but not for your 36km domain, it's possible there is a problem with your 36km met data, specifically in WV, since that's where Movesmrg crashed. Note that QV (water vapor mixing ratio) and atmospheric pressure are in the METCRO3D file, but we only use data from layer 1.
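A rough way to screen the layer-1 met fields is to count non-finite and implausible values. The sketch below is an illustration, not part of SMOKE. It assumes the METCRO3D variables are named QV and PRES with dimensions (TSTEP, LAY, ROW, COL), that the third-party netCDF4 and numpy packages are installed, and the plausibility bounds are guesses chosen for this example only.

```python
import math

# Illustrative plausibility bounds (assumptions, not limits used by SMOKE):
# QV in kg/kg, PRES in Pa.
BOUNDS = {"QV": (0.0, 0.05), "PRES": (40000.0, 110000.0)}


def summarize_layer1(values, lo, hi):
    """Count non-finite and out-of-range entries in a flat sequence of numbers."""
    stats = {"n": 0, "nonfinite": 0, "out_of_range": 0, "min": None, "max": None}
    for raw in values:
        v = float(raw)
        stats["n"] += 1
        if not math.isfinite(v):
            stats["nonfinite"] += 1
            continue
        if v < lo or v > hi:
            stats["out_of_range"] += 1
        stats["min"] = v if stats["min"] is None else min(stats["min"], v)
        stats["max"] = v if stats["max"] is None else max(stats["max"], v)
    return stats


def check_metcro3d(path):
    """Summarize layer 1 of QV and PRES in a METCRO3D file (needs netCDF4, numpy)."""
    import numpy as np
    from netCDF4 import Dataset

    results = {}
    with Dataset(path) as nc:
        for name, (lo, hi) in BOUNDS.items():
            layer1 = nc.variables[name][:, 0, :, :]  # first layer, all time steps
            flat = np.ma.filled(layer1, float("nan")).ravel()
            results[name] = summarize_layer1(flat, lo, hi)
    return results
```

Calling `check_metcro3d` on each day's file and looking for non-zero `nonfinite` or `out_of_range` counts would at least show whether the 36km met data contain obviously bad values; it cannot prove the data are fine.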
I have two Linux boxes. The first one has dual Intel CPUs, Ubuntu 18.04 LTS, and Intel oneAPI 2024.1. For my own 36km domain (focused on Ontario, CA), on this box, I can successfully process RPH, RPV and RPP with my compiled SMOKE v4.8.1, but processing RPD, RPS and RPHO fails.
The second box has an AMD CPU, Ubuntu 22.04, and Intel oneAPI 2024.1. For my own 36km domain, on this box, I can only process RPD, using the pre-compiled SMOKE v5.0 from EPA.
Since I can process the whole year for RPH, RPV and RPP, does that mean my MET data should be okay?
Thanks for your suggestion. I just recompiled I/O API and SMOKE v4.8.1 with debug options enabled.
When I ran the processing scripts for RPHO, the run failed early on, at the SMKINVEN stage, with this error message at the terminal:
Processing environment variables EMISINV_A
SMKINVEN_MONTH set to 1
Running part 1...
SMOKE ---------------
Copyright (c)2004 Environmental Modeling for Policy Development
All rights reserved
Program SMKINVEN, Version SMOKEv4.8.1_Jan2021
Online documentation
http://www.cep.unc.edu/empd/products/smoke
Program SMKINVEN to take ASCII area or point source files
in IDA, EMS-95, or SMOKE list format, or mobile files
in IDA format, and produce the I/O API and ASCII SMOKE
inventory files and list of unique SCCs in the inventory.
When I ran SMOKE v4.8.1 with only movesmrg built with the ifortdbg options, I got the following error message at the terminal:
Multi-day run note: Run starts with 2019001, is 250000 long
Running part 4, for 20190101...
SMOKE ---------------
Copyright (c)2004 Environmental Modeling for Policy Development
All rights reserved
Program MOVESMRG, Version SMOKEv4.8.1_Jan2021
Online documentation
http://www.cep.unc.edu/empd/products/smoke
No program description is available for MOVESMRG
In both cases, the failure is some sort of logical-name failure that manifests itself in a call to netCDF routine NF_OPEN, in places that have been in frequent use for the last 32 years, albeit in a situation related to SMOKE routine OPENSET. The code does something like the following:
My conclusion is that something is probably wrong with the EQNAME (e.g., it is longer than 511 characters, or has an embedded ASCII-zero, or ...) created by OPENSET (which has been in use for 24 years). Unfortunately, the standard scripts do not give access to the program-environment, so we still don't have a good way to know exactly what the reason for the failure is. It is most probably a problem with one of the setenv commands in the script (or in the scripting that creates the right-hand side of this command).
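Once the environment is available, the kinds of problems described above (over-long values, stray control characters) can be screened mechanically. The following is a minimal sketch, not an I/O API or SMOKE tool; the 511-character limit is simply the figure quoted above, and `suspicious_env` is a hypothetical helper name.

```python
import os

MAX_LEN = 511  # limit quoted in the thread; treated here as an assumption


def suspicious_env(environ=None, max_len=MAX_LEN):
    """Return {name: [reasons]} for environment values that look problematic."""
    environ = os.environ if environ is None else environ
    problems = {}
    for name, value in environ.items():
        reasons = []
        if len(value) > max_len:
            reasons.append(f"length {len(value)} > {max_len}")
        if any(not (ch.isascii() and ch.isprintable()) for ch in value):
            reasons.append("non-printable or non-ASCII character")
        if value != value.strip():
            reasons.append("leading/trailing whitespace")
        if reasons:
            problems[name] = reasons
    return problems


if __name__ == "__main__":
    for name, reasons in sorted(suspicious_env().items()):
        print(f"{name}: {'; '.join(reasons)}")
```

Variables such as PATH or LS_COLORS are often legitimately long, so a hit is only a hint; the ones worth a closer look are the logical names that point at SMOKE input and output files.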
It could potentially be useful to modify the script that runs movesmrg so that it documents the program-environment where this error is probably coming from, so that the script does something like the following (and then look at this output to see if you can find what is the problem):
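<*run program movesmrg*>
# save the exit status before any other command overwrites $status
set foo = $status
if ( $foo ) then
    echo "Error $foo in program movesmrg" >> $LOGFILE
    echo "Environment:" >> $LOGFILE
    # dump the full, sorted program-environment into the log
    env | sort >> $LOGFILE
    exit ( $foo )
endif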
FYI, we don't normally run SMOKE-MOVES on the 36km domain, although it should be possible. The range of temperatures would differ between 12km and 36km, and that could cause issues, as could any difference in the FIPS codes present on the 36km domain. Instead, we typically aggregate the 12km emissions to 36km for the 36km cells that overlap the 12km domain. I haven't had a chance to pore over the log file to identify something more specific.
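The aggregation idea can be illustrated with a toy example. This is only a sketch of the arithmetic, not the tool EPA uses: it assumes the 12km grid nests exactly inside the 36km grid (each 36km cell covers a 3x3 block of 12km cells) and that the values are per-cell emission totals, so they are summed rather than averaged.

```python
def aggregate_blocks(fine, ratio=3):
    """Sum ratio x ratio blocks of a fine grid (list of rows) into a coarse grid."""
    rows, cols = len(fine), len(fine[0])
    if rows % ratio or cols % ratio:
        raise ValueError("fine grid dimensions must be multiples of the ratio")
    coarse = [[0.0] * (cols // ratio) for _ in range(rows // ratio)]
    for r, row in enumerate(fine):
        for c, value in enumerate(row):
            # integer division maps each fine cell to its parent coarse cell
            coarse[r // ratio][c // ratio] += value
    return coarse


if __name__ == "__main__":
    fine = [[1.0] * 6 for _ in range(3)]  # 3 rows x 6 columns of 12km cells
    print(aggregate_blocks(fine))
```

Because every fine cell is added to exactly one coarse cell, the domain total is conserved, which is a useful sanity check on any real aggregation.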
Your talking about not normally running on the 36KM grid reminded me of something I ran into several years ago.
See https://cjcoats.github.io/ioapi/AVAIL.html#medium: if there are very large arrays, that could be the problem for default Intel or GNU compiles.
There are three different binary-incompatible "memory models" available for 64-bit x86 Linux:
small: At most 2 GB each for static data (COMMONs, etc.), ALLOCATEd arrays, program-stack, and program machine-code.
medium: Program machine-code at most 2 GB; no restrictions on data (large arrays, large stack, large data are OK).
large: No restrictions on code or data.
GNU and Intel default to small. If something doesn't fit under this model's restrictions, then you may very well silently get into trouble that would then lead to the segfault. The fix is to re-compile everything (netCDF, I/O API, SMOKE) for MOVESMRG for the medium memory model, using BIN = Linux2_x86_64gfort_medium, Linux2_x86_64ifort_medium, or Linux2_x86_64pg_medium, and see if that works.
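To get a feel for when an array crosses the 2 GB line mentioned above, the size arithmetic can be done by hand or with a few lines of code. The sketch below is illustrative only: the dimensions are hypothetical, not taken from MOVESMRG, and 4 bytes per element assumes default single-precision REAL.

```python
TWO_GIB = 2 * 1024**3  # the 2 GB limit quoted for the small memory model


def array_bytes(dims, bytes_per_element=4):
    """Size in bytes of a dense array with the given dimensions."""
    count = 1
    for d in dims:
        count *= d
    return count * bytes_per_element


def exceeds_small_model(dims, bytes_per_element=4):
    """True if a single array of this shape is larger than 2 GiB."""
    return array_bytes(dims, bytes_per_element) > TWO_GIB


if __name__ == "__main__":
    # Hypothetical shapes, chosen only to show both sides of the limit.
    for dims in [(25, 400, 300, 100), (25, 800, 600, 100)]:
        gib = array_bytes(dims) / 1024**3
        print(dims, f"{gib:.2f} GiB", "exceeds" if exceeds_small_model(dims) else "fits")
```

Plugging in the actual array dimensions from the debug traceback, once they are known, would show whether the memory model is a plausible cause.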