Segmentation fault when running ptfire3d

Hi all,

I got a segmentation fault (invalid memory reference) error, as shown below:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x2B89E37716D7
#1 0x2B89E3771D1E
#2 0x2B89E3FCE3FF
#3 0x2B89E4018F3B
Segmentation fault
I got this error when I tried to run the ptfire3d-daily script from the NEI 2016 v1 platform for a domain of around 400 by 500 grid cells. The ptfire3d-onetime run completed successfully with the same node settings. I tried increasing the number of nodes and cores but still got the same error. Does anyone know how to solve this problem? I hope to receive your help. Thanks very much.

Best,
K

A first thing to try: build the debug/traceback/check-everything version of the executable, and run that: it will tell you at what line of what source-file the invalid memory reference is happening.
And, since this is the largest problem you’ve run, it is conceivable that you’re running into the Linux memory-model issue (see https://cjcoats.github.io/ioapi/AVAIL.html#medium), so for another check run a medium-model executable…
And of course, you are already running with the following?

limit stacksize unlimited
limit memoryuse unlimited
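
Limits set in a login shell may not be the ones a batch job ends up with, so it can help to print what the job actually sees. A minimal sketch in Python, assuming a Linux node with Python 3 available. It uses the standard `resource` module; the mapping of the csh names `stacksize` and `memoryuse` to `RLIMIT_STACK` and `RLIMIT_RSS` is an assumption about the csh `limit` command, not something taken from SMOKE.

```python
import resource


def format_limit(value):
    """Render a limit value, using 'unlimited' when there is no cap."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def describe_limits():
    """Return one line per limit with its soft and hard values in bytes."""
    limits = {
        "stacksize": resource.RLIMIT_STACK,  # csh: limit stacksize
        "memoryuse": resource.RLIMIT_RSS,    # csh: limit memoryuse (assumed mapping)
    }
    lines = []
    for name, which in limits.items():
        soft, hard = resource.getrlimit(which)
        lines.append(f"{name}: soft={format_limit(soft)} hard={format_limit(hard)}")
    return lines


if __name__ == "__main__":
    for line in describe_limits():
        print(line)
```

Run it from the same job script, after the `limit` lines, so that it inherits the same limits as the SMOKE executables. Note that csh reports these values in kbytes, while this sketch prints bytes.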

Thanks for the reply. I have added the unlimited stacksize lines to my script and got the same error. What flags should I add to the makefile when I compile the SMOKE executables if I want to build the debug/traceback/check-everything version?

Starting from the I/O API, build both the I/O API library and the SMOKE executables using one of the debug BIN-types; see the file extensions of the Makeinclude.*dbg files in the ioapi directory:

Makeinclude.Linux2_x86_64af95_dbg
Makeinclude.Linux2_x86_64af95dbg
Makeinclude.Linux2_x86_64dbg
Makeinclude.Linux2_x86_64g95dbg
Makeinclude.Linux2_x86_64gfort10_mediumdbg
Makeinclude.Linux2_x86_64gfort10dbg
Makeinclude.Linux2_x86_64gfort_mediumdbg
Makeinclude.Linux2_x86_64gfortdbg
Makeinclude.Linux2_x86_64ifort_mediumdbg
Makeinclude.Linux2_x86_64ifortdbg
Makeinclude.Linux2_x86_64ifortmpidbg
Makeinclude.Linux2_x86_64pathdbg
Makeinclude.Linux2_x86_64pgdbg
Makeinclude.Linux2_x86_64sundbg
Makeinclude.Linux2_x86af95_dbg
Makeinclude.Linux2_x86af95dbg
Makeinclude.Linux2_x86dbg
Makeinclude.Linux2_x86g95dbg
Makeinclude.Linux2_x86sundbg

It may be related to the compilation, but can you tell which program is getting the segmentation fault? If so, could you attach the log for that program?

I recompiled IOAPI using the gfortran_mediumdbg option and now get another error:

“SIGFPE: Floating-point exception - erroneous arithmetic operation.”

Is it also related to the IOAPI compilation or the SMOKE source code?

The following message is from the log file:
Backtrace for this error:
#0 0x2AB67E88A6D7
#1 0x2AB67E88AD1E
#2 0x2AB67F0E73FF
#3 0x5459DE in rdstcy_ at rdstcy.f:518
#4 0x4B4934 in MAIN__ at smkinven.f:214
Floating exception

The relevant code-chunk is in file rdstcy.f at lines 516-518:

`IF( LSTCYPOP ) POP = STR2REAL( LINE( 114:128 ) )`
`POP = MAX( 0., POP )           ! Remove missing values`

where POP was never initialized (the compiler may have initialized it to either zero or not-a-number).
There are two possibilities: either the input line is bad, or the compiler did a not-a-number initialization. In any case there is a code bug, and POP should have been initialized: put a POP=0.0 at line 502 and try again. If it still errors here, then the input data is bad.
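
To test the bad-input possibility without rerunning Smkinven, the population field can be checked directly in the file that rdstcy.f reads. A minimal Python sketch, assuming that every line reaching column 114 carries a population value in columns 114-128 (the same substring the Fortran reads). It uses Python's `float()` as a stand-in for `STR2REAL`, which may accept slightly different number formats. The function name `find_bad_population_fields` and the sample lines are hypothetical.

```python
import math


def find_bad_population_fields(lines, first_col=114, last_col=128):
    """Return (line_number, raw_field) pairs whose field is not a finite number.

    Columns are 1-based and inclusive, matching the Fortran LINE( 114:128 ).
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        field = line.rstrip("\r\n")[first_col - 1:last_col]
        if not field.strip():
            continue  # short line or blank field: nothing to parse
        try:
            value = float(field)
        except ValueError:
            bad.append((number, field))
            continue
        if not math.isfinite(value):
            bad.append((number, field))  # parsed, but NaN or infinite
    return bad


if __name__ == "__main__":
    # Made-up sample lines for illustration, not real input records.
    sample = [
        " " * 113 + "12345".rjust(15),
        " " * 113 + "12x45".rjust(15),
        "a short line",
    ]
    for number, field in find_bad_population_fields(sample):
        print(f"line {number}: population field {field!r} is not a finite number")
```

With a real file, pass the open file object in place of `sample`. Blank fields are skipped here rather than reported, which is a design choice of this sketch and not necessarily how `STR2REAL` treats them.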

Seeing more of the log and std out files would be helpful. Have you run SMOKE for other sectors successfully?

Hi cjcoats,

I initialized POP and recompiled SMOKE, and the model no longer gives the SIGFPE error, but now I get another error in smkreport.log when processing the COSTCY file:

*** ERROR ABORT in subroutine RDREPIN
Number of variables in TSUP file is not supported.

Does this mean something is wrong with the input file?

Yes, I successfully ran several sectors when using the default IOAPI build, but I got memory errors on other sectors. When I used the IOAPI medium_dbg build, I got the SIGFPE error for all runs.

This error message is coming from Smkreport, right? As @eyth.alison asked, please check and share your log files so that we can understand the issue you are dealing with.

Thanks for the reply. I have attached the smkreport log file here:
smkreport_np_oilgas_2016fh_16j_temporal.log.txt (21.7 KB)

I used np_oilgas for testing.

Other sectors produce the same error in their logs as well.

I think that you said the error was happening in Smkinven, not Smkreport, right?

The error message comes from smkreport so I assume it is due to smkreport.

Yes, it looks like you are having issues with the Smkreport run. Based on your error message, the format of TSUP (the temporal supplemental output file from the Temporal program) is not compatible. The TSUP format was updated a few years ago to accommodate a new feature in SMOKE. Can you check which version of SMOKE is listed in your Temporal log file? Look for the words “Version SMOKEv”. If you cannot locate that information, please attach your Temporal log file here for me to review.
The SMOKE versions used for the Temporal and Smkreport runs have to be compatible.
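
The version check can also be scripted. A minimal Python sketch that pulls the version marker out of two logs and compares the results. The marker text comes from the post above; the helper names, the sample log lines, and the assumption that the version appears as a single word starting with `SMOKEv` are all hypothetical.

```python
def version_lines(log_lines, marker="Version SMOKEv"):
    """Return the stripped lines of a log that contain the version marker."""
    return [line.strip() for line in log_lines if marker in line]


def version_tokens(log_lines, marker="Version SMOKEv"):
    """Return the set of words starting with 'SMOKEv' found on marker lines."""
    tokens = set()
    for line in version_lines(log_lines, marker):
        for word in line.split():
            if word.startswith("SMOKEv"):
                tokens.add(word.rstrip(".,;"))  # drop trailing punctuation
    return tokens


def same_version(temporal_log, smkreport_log):
    """True only if both logs report a version and the versions match."""
    temporal = version_tokens(temporal_log)
    smkreport = version_tokens(smkreport_log)
    return bool(temporal) and temporal == smkreport


if __name__ == "__main__":
    # Placeholder log lines; real logs will carry an actual version string.
    temporal_log = ["Program TEMPORAL\n", "   Version SMOKEvA\n"]
    smkreport_log = ["Program SMKREPORT\n", "   Version SMOKEvB\n"]
    print(same_version(temporal_log, smkreport_log))
```

With real logs, read each file with `open(path).readlines()` and pass the two lists to `same_version`. Matching strings only show that both programs report the same version; they do not by themselves prove the TSUP format is accepted.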

Thanks for your answer. I am using SMOKE v4.8 for this simulation, and all programs in this simulation are from SMOKE v4.8. I compiled SMOKE with GCC instead of using the executables provided with the 2016 v1 platform, since my HPC system only supports programs compiled with GCC. I have also attached the Temporal log output here in case you need information from it.
temporal_np_oilgas_2016fh_16j_20160712.log.txt (58.8 KB)

There is no versioning issue between Temporal and Smkreport. Can you show me the first 10 lines of your ATSUP output file to confirm whether there is an issue of format or not?
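
For anyone repeating this check, a minimal Python sketch that takes the first lines of the file and counts the names in its header. It assumes the first line is a comma-separated list of quoted variable names, and it strips both straight and typographic quotes because forum software often converts one to the other. `first_lines` and `header_names` are hypothetical helpers, and the sample data is made up.

```python
from itertools import islice

# Straight quotes plus the curly quotes that forum software may substitute.
QUOTES = "\"'\u201c\u201d"


def first_lines(lines, count=10):
    """Return the first `count` lines without trailing newlines."""
    return [line.rstrip("\r\n") for line in islice(lines, count)]


def header_names(header_line):
    """Split a comma-separated header into names, dropping quotes and blanks."""
    names = [part.strip().strip(QUOTES).strip() for part in header_line.split(",")]
    return [name for name in names if name]


if __name__ == "__main__":
    # Made-up sample lines standing in for the real file.
    sample = ['"AAA", "BBB", "CCC"\n', '"MTH", 1, "key",P1\n']
    head = first_lines(sample)
    names = header_names(head[0])
    print(len(names), names)
```

With a real file, pass the open file object to `first_lines`. The count on its own does not show whether Smkreport will accept the file, since what RDREPIN supports is not established here.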

I have pasted the first several lines of the ATSUP output below:

"FORMALD", "METHANOL", "BENZENE", "BENZENE_NOI", "ACETALD", "NAPHTH", "BUTADIE", "ACROLEI", "CHLORINE", "CO", "NH3", "NONHAPVOC", "NOX", "PM10", "PM2_5", "SO2", "VOC", "VOC_INV", "PMC"
"MTH", 1, "000000001003/00000000002310000330",OG4293
"WEK", 1, "000000001003/00000000002310000330",7
"MON", 1, "000000001003/00000000002310000330",24
"TUE", 1, "000000001003/00000000002310000330",24
"WED", 1, "000000001003/00000000002310000330",24
"THU", 1, "000000001003/00000000002310000330",24
"FRI", 1, "000000001003/00000000002310000330",24
"SAT", 1, "000000001003/00000000002310000330",24
"SUN", 1, "000000001003/00000000002310000330",24
"MTH", 1, "000000001003/00000000002310000550",OG4252
"WEK", 1, "000000001003/00000000002310000550",7
"MON", 1, "000000001003/00000000002310000550",24
"TUE", 1, "000000001003/00000000002310000550",24
"WED", 1, "000000001003/00000000002310000550",24
"THU", 1, "000000001003/00000000002310000550",24
"FRI", 1, "000000001003/00000000002310000550",24
"SAT", 1, "000000001003/00000000002310000550",24
"SUN", 1, "000000001003/00000000002310000550",24
"MTH", 1, "000000001003/00000000002310010100",OG2314
"WEK", 1, "000000001003/00000000002310010100",7