Model fails when running MOVESMRG in onroad_RPP

Hi all,
I am using the NEI 2016_v1 platform to run onroad_RPP emissions, but I got the following error in the standard log file:

  • ERROR detected in logfile:
  • /scratch/general/lustre/copycky/models/NEI/nei_2016_v1//2016fh_16j/intermed/onroad/RPP/logs/movesmrg_RPP_onroad_aug_2016fh_16j_20160801_US_12km_cmaq_cb6.log

ERROR: detected in movesmrg
ERROR: Running smk_run for part 4 for 20160801
testing SRGPRO_PATH set by SRGPRO input: /scratch/general/lustre/copycky/models/NEI/nei_2016_v1//ge_dat/gridding/surrogates/CONUS12_2014_30apr2019/

The movesmrg log file is also attached. It seems the program stopped without reporting any errors. I used met4moves from the same platform to generate the MOVES files for this run. Could you please help me see whether there is anything wrong with my run?
movesmrg_RPP_onroad_aug_2016fh_16j_20160801_US_12km_cmaq_cb6.log.txt (21.2 KB)

Best,
K

Hi @cgnolte! I have been stuck on this problem for a month and have no idea how to deal with it. Could you help me with this issue?

I agree that there does not appear to be any error shown in that log file.
Tagging a few emissions experts who may be able to help.
@bbaek @eyth.alison

I do not see any particular error message in your log file either. This means the program probably ended with a segmentation fault. @copycky28 Were you able to process the RPD, RPV and RPH sectors successfully?
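One quick way to tell a log that was cut short from one that finished or aborted cleanly is to look at how it ends. Below is a minimal Python sketch, not part of SMOKE. It assumes a clean run writes a "Normal Completion of program" line and a handled failure writes "ERROR ABORT" (as in the message quoted later in this thread); the function name and the three labels are made up for illustration, so adjust the patterns if your logs look different.

```python
import re

# Assumptions: a clean run writes a "Normal Completion of program ..." line,
# and a handled failure writes "ERROR ABORT". Adjust if your logs differ.
NORMAL = re.compile(r"Normal Completion of program", re.IGNORECASE)
ABORT = re.compile(r"ERROR ABORT", re.IGNORECASE)


def classify_log(text: str) -> str:
    """Return 'error_abort', 'normal', or 'truncated' for a log's text."""
    if ABORT.search(text):
        return "error_abort"
    if NORMAL.search(text):
        return "normal"
    # Neither marker found: the program likely died before it could report,
    # which is consistent with a crash or the job being killed.
    return "truncated"
```

To use it on a real log, read the file with `open(path, errors="replace").read()` and pass the text in. A result of "truncated" would match the behaviour described here, where the log simply stops.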

Yes, all other sectors completed successfully. The program seems to end without any error, but I guessed it could be due to the number of warning messages exceeding MAXWARN in the script. I tried setting MAXWARN to a higher value but got the same result. Could it be due to the memory on my node or some other setting rather than the program itself?

Thanks for your reply and for tagging emission experts for me. Appreciate it.

I do not think it is related to the number of warning messages; more likely it is a memory issue or a bug, since there was no specific error message. But if you were able to process the RPD sector successfully, your server has enough memory, since the RPD sector requires the most memory to process. That suggests the problem is not memory but something else.

Can you share the terminal screen output when the run fails? If this is a segmentation fault, there should be some information for me to understand the issue better.
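If the terminal output is hard to capture, the exit status of the process can also show whether it was killed by a signal. Here is a small Python sketch of that idea. The helper names are hypothetical, and it relies on the POSIX behaviour of Python's `subprocess` module, which reports death by signal as a negative return code.

```python
import signal
import subprocess


def describe_exit(returncode: int) -> str:
    """Translate a subprocess return code into a short description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, a negative code -N means the child was killed by signal N
        # (for example SIGSEGV for a segmentation fault).
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def run_and_describe(cmd: list[str]) -> str:
    """Run a command (hypothetical: your Movesmrg invocation) and describe how it ended."""
    result = subprocess.run(cmd)
    return describe_exit(result.returncode)
```

A SIGSEGV result would point to a crash inside the program, while SIGKILL is more typical of a scheduler or the kernel stopping the job, for instance when it runs out of memory.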

Also, do you mind sharing whether you made any changes to the original NEI modeling platform package from EPA? For example, did you update the inventories, change the domain, and so on?

Based on the info at the top of the log, it seems that you are using your own SMOKE compilation rather than the pre-compiled executables. If that is correct, you could give the pre-compiled executables a try; sometimes that fixes issues like these.

https://gaftp.epa.gov/Air/emismod/2016/v2/smoke_2016v2_platform_core_01oct2021.zip

Thanks for your suggestion. I have attached the standard log file to this reply. It shows errors in MOVESMRG; the MOVESMRG log file is the one I uploaded previously. I have no idea why the program ends without an error message, so I am wondering what the possible reason could be. Thanks again for your time.
RPP.o.txt (68.8 KB)

Are you running on AWS VMs? How much memory are you using?

If this is not a memory issue (i.e. provisioning differences on your VM), I suggest you check out the latest SMOKE from GitHub. We have recently fixed some issues with movesmrg crashing.

I use an HPC system at my institute, not AWS. I used 56 cores with 4 GB/core for this program. I also tried 108, but it still failed.

We typically run onroad sectors using one core with a lot of memory. I’m not exactly sure what the minimum is for a CONUS domain but I’m sure 4GB is not enough based on my tests.
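To find out how much memory a run actually needs on your system, you could measure the peak memory of the process. The sketch below is only an illustration in Python: the function name is hypothetical, the command is whatever launches your run, and it assumes Linux, where `ru_maxrss` is reported in kilobytes.

```python
import resource
import subprocess


def peak_child_memory_gb(cmd: list[str]) -> float:
    """Run a command and return the peak resident memory of child processes in GB.

    Assumption: Linux, where ru_maxrss is in kilobytes. RUSAGE_CHILDREN reports
    the largest child waited for so far, so call this from a fresh interpreter
    if you want a number for one specific run.
    """
    subprocess.run(cmd, check=False)
    peak_kb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return peak_kb / (1024 ** 2)
```

Comparing that number against the memory your scheduler grants to a single core should show whether the 4 GB allocation is the limiting factor.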

Typically RPD is NOT the most memory intensive sector. Did you see my comment about the compiled version of SMOKE you are using?

Thanks for your suggestion. Now I can run the program without the unexplained interruption. But I got another error from MOVESMRG:
Processing MOVES lookup tables for reference county 000000006059 of fuel month: 7
Reading emission factors file rateperprofile_smoke_aq_cb6_saprc_1Aug2019_2016v1platform-2016-20190718_6059_07.csv
WARNING: Emission factors file does not contain requested model species ACET
WARNING: Emission factors file does not contain requested model species ALD2
WARNING: Emission factors file does not contain requested model species ALDX
WARNING: Emission factors file does not contain requested model species CH4
WARNING: Emission factors file does not contain requested model species ETH
WARNING: Emission factors file does not contain requested model species ETHA
WARNING: Emission factors file does not contain requested model species ETHY
WARNING: Emission factors file does not contain requested model species FORM
WARNING: Emission factors file does not contain requested model species KET
WARNING: Emission factors file does not contain requested model species MEOH
WARNING: Emission factors file does not contain requested model species NAPH
WARNING: Emission factors file does not contain requested model species TERP
WARNING: Emission factors file does not contain requested model species UNK

 *** ERROR ABORT in subroutine MOVESMRG
 Could not find minimum or maximum temperatures for county 000000006075 and episode month         8
 Date and time  0:00:00   Aug. 25, 2016   (2016238:000000)

Do you know how I could solve this issue?

I tried recompiling SMOKE and confirmed that the model works well for all other sectors except RPP. The program still ends without any error message when I use the V1 platform.

We think this might be related to the met file created by met4moves for RPP. Can you provide one of the daily Movesmrg logs?

Thanks for your reply. I have attached a daily Movesmrg log file. It seems the program ends for an unknown reason every time.

movesmrg_RPP_onroad_aug_2016fh_16j_20160804_US_12km_cmaq_cb6.log.txt (21.2 KB)

We thought there might be a message similar to the one he pasted into one of his replies:

 *** ERROR ABORT in subroutine MOVESMRG
 Could not find minimum or maximum temperatures for county 000000006075 and episode month 8
 Date and time 0:00:00 Aug. 25, 2016 (2016238:000000)

Since the error is for county 06075, that must have occurred with the onroad_ca_adj sector, not the onroad sector. The log you just now provided is onroad, and that still has the original error in which Movesmrg just stops without giving an error message. So there might be one thing going on for onroad and another thing going on with onroad_ca_adj.
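The county reasoning above can be checked mechanically: the last five digits of the 12-character code are the FIPS code, and the first two of those are the state (06 is California). Here is a minimal Python sketch that pulls this out of the error text. The function name and returned keys are made up for illustration, and the pattern only matches the wording of the message quoted in this thread.

```python
import re

# Matches the wording of the Movesmrg message quoted in this thread.
PATTERN = re.compile(r"for county\s+(\d+)\s+and episode month\s+(\d+)")


def parse_temperature_error(message: str):
    """Extract state FIPS, county FIPS, and month from the error text, or None."""
    match = PATTERN.search(message)
    if match is None:
        return None
    fips = match.group(1)[-5:]  # last five digits: 2-digit state + 3-digit county
    return {
        "state_fips": fips[:2],
        "county_fips": fips[2:],
        "month": int(match.group(2)),
    }
```

For the message in this thread, the state part is "06", which is why the error points to a California-specific sector.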

Consider the following:

  • Do new, fresh runs for onroad RPP and onroad_ca_adj RPP using our executables from the EPA FTP site, not your compiled ones

  • Let it loop through all days, whether they complete or not

Review the logs from onroad/RPP/logs and onroad_ca_adj/RPP/logs

Also, what is the grid you are running on?

Review the METMOVES file (SMOKE_DAILY_2016fc_v2_16j_US_12km_2016184-2016243.k1.ncf) to see if the size is correct compared to the other days or if there are any other anomalies.

If nothing becomes apparent following this, we may ask you to send some additional files.
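For the file size check, something like the following Python sketch can flag files whose size differs from the typical one in a directory. The function name, the `*.ncf` pattern, and the 5% tolerance are assumptions for illustration, so point it at wherever your met4moves output is stored.

```python
from pathlib import Path
from statistics import median


def flag_odd_sizes(directory, pattern="*.ncf", tolerance=0.05):
    """Return sorted names of files whose size differs from the median by more than tolerance."""
    sizes = {p.name: p.stat().st_size for p in Path(directory).glob(pattern)}
    if not sizes:
        return []
    typical = median(sizes.values())
    # Flag anything more than `tolerance` (as a fraction) away from the median size.
    return sorted(
        name for name, size in sizes.items()
        if abs(size - typical) > tolerance * typical
    )
```

A file that is much smaller than its neighbours may have been truncated while it was being written, which would be worth ruling out before looking further.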

Thanks for your suggestion. I ran the onroad and onroad_ca_adj sectors separately and both have the same issue (they stop without an error message). I tested with both the pre-compiled executables from the EPA FTP site and my own compiled executables, with the same outcome.
I am running this test with my own domain, which is not covered by the pre-generated files, so I ran met4moves for my own domain. Is there anything else I need to generate for my own domain for RPP?

Can you share your grid description with us including the map projection?