Error: Program received signal SIGSEGV: Segmentation fault - invalid memory reference

Dear,

I get the following error when I try to run CAMx v7.00:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x1478b5299d01 in ???
#1 0x1478b5298ed5 in ???
#2 0x1478b4f6320f in ???
#3 0x564d4f34b40e in ???
#4 0x564d4f34ce56 in ???
#5 0x564d4f31aaeb in ???
#6 0x564d4f2f7e52 in ???
#7 0x1478b4f440b2 in ???
#8 0x564d4f2f7e8d in ???
#9 0xffffffffffffffff in ???
Segmentation fault (core dumped)

Does the model run with other (possibly smaller-grid) cases? If so, there’s possibly an ALLOCATE statement that is failing without its status being checked, so that the program goes ahead and tries to use what should have been the result of that ALLOCATE anyway.

Generally, you can get more information in the backtrace if you compile with both debug and backtrace enabled (for GNU gfortran, compile-flags -g -O0 -fbacktrace; for Intel ifort or PGI pgf90, -g -O0 -traceback) instead of the -O2 or -O3 optimization-flags you probably have.
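As a rough sketch of the flag choice described above: the helper below just prints the debug/traceback flags quoted in this post for a given compiler name. The function name `debug_flags` is made up for illustration, and how the flags get into the model's makefile is not shown because that depends on your build setup.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): print the debug/traceback
# compile flags quoted in this thread for a given Fortran compiler.
debug_flags() {
    case "$1" in
        gfortran)    echo "-g -O0 -fbacktrace" ;;
        ifort|pgf90) echo "-g -O0 -traceback" ;;
        *)           echo "unknown compiler: $1" >&2; return 1 ;;
    esac
}

# Show the flags for the two compilers mentioned later in the thread.
debug_flags gfortran
debug_flags pgf90
```

The printed flags would replace the -O2 or -O3 setting wherever your makefile defines its Fortran flags; after rebuilding, the backtrace should show file names and line numbers instead of `???`.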

BTW, here’s an example of a properly-checked ALLOCATE statement, if you can find out where the seg-fault is happening, and it turns out to be related to a previously-ALLOCATEd array:

      INTEGER ISTAT   ! allocation-status

      ALLOCATE( BUF1( NCOLS1*NROWS1, NLAYS1 ),
     &          BUF2( NCOLS2*NROWS2, NLAYS1 ), STAT = ISTAT )
      IF ( ISTAT .NE. 0 ) THEN
          WRITE( *, '( A, I10 )' )
     &        'Buffer allocation failed: STAT=', ISTAT
          STOP
      END IF

Error status numbers ISTAT are compiler-vendor dependent; it’s relatively easy to find them for Intel and PGI compilers. However, gfortran error-numbers are not documented properly (according to extensive searches I have performed). The most-frequent allocation-error is “not enough memory” (either in terms of actual system RAM or in terms of your limit memoryuse… quota); however, I have seen allocation failures for other (programmer-bug) reasons.
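To see which per-process quotas apply to your session, something like the following may help. It is only a sketch: it assumes an sh/bash-type shell, where the `ulimit` builtin plays the role that `limit` plays in csh, and the function name `show_limits` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): report the per-process
# memory-related limits of the current shell. Values are in kilobytes,
# or "unlimited" when no quota is set.
show_limits() {
    echo "stack size:        $(ulimit -s)"
    echo "resident set size: $(ulimit -m)"
    echo "virtual memory:    $(ulimit -v)"
}

show_limits
```

If one of these is small compared with what the model needs, that is a candidate explanation for a failed allocation; whether you are allowed to raise it depends on how your system is administered.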

[Forgive the indentation problems: this forum’s edit-windows refuse to Do the Right Thing® for talking about software.]

Dear cjcoats,

Thank you so much for your kind help. But I still cannot figure out what is causing this error. I am just a beginner with Linux and CAMx. Could you give me more details and a clearer solution?

Best,

Does the problem happen for smaller (and therefore less memory-intense) problems? (Note, for example, that a 200x200x40 grid will use about 8 times as much memory as a 100x100x20 grid.)
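The factor of 8 comes straight from the cell counts, assuming memory use scales roughly with the number of grid cells (columns x rows x layers). A quick check, with `grid_cells` being a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): number of cells in a grid,
# computed as columns * rows * layers.
grid_cells() {
    echo $(( $1 * $2 * $3 ))
}

big=$(grid_cells 200 200 40)     # 1600000 cells
small=$(grid_cells 100 100 20)   # 200000 cells
echo "ratio: $(( big / small ))" # prints "ratio: 8"
```

Doubling each of the three dimensions doubles the cell count three times over, so a case that fits in memory at one resolution can easily fail to allocate at the next.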

Can you compile the model for traceback, using the flags I gave you, and then find out at what line of code in what file the seg-fault happens?

And if that line is an array-access, can you track down where the array was ALLOCATEd and put in proper error-reporting code, as I suggested?

And what compiler-system are you using? (I hope it’s not gfortran, which has truly lousy error-reporting and error-documentation).

Dear cjcoats,

When I compile, it is like this:
export CPPFLAGS=-I$DIR/netcdf-c-4.6.1.$FC/include
export LDFLAGS=-L$DIR/netcdf-c-4.6.1.$FC/lib
export LD_LIBRARY_PATH=$DIR/netcdf-c-4.6.1.$FC/lib
export FC=pgf90

sudo ./configure --prefix=$DIR/netcdf-c-4.6.1.$FC CPPFLAGS=-I$DIR/netcdf-c-4.6.1.$FC/include LDFLAGS=-L$DIR/netcdf-c-4.6.1.$FC/lib LD_LIBRARY_PATH=$DIR/netcdf-c-4.6.1.$FC/lib:$LD_LIBRARY_PAT F90="pgf90" CXX="pgc++" FC=gfortran --disable-shared

And what about my first three questions?

Does the problem happen for smaller (and therefore less memory-intense) problems? (Note, for example, that a 200x200x40 grid will use about 8 times as much memory as a 100x100x20 grid.)

Can you compile the model for traceback, using the flags I gave you, and then find out at what line of code in what file the seg-fault happens?

And if that line is an array-access, can you track down where the array was ALLOCATEd and put in proper error-reporting code, as I suggested?

Dear Dr. Carlie J. Coats,

Thank you so much for your kind response. I can now run CAMx. The problem was related to ALLOCATE, as you mentioned.

Best regards,
Lai Nguyen Huy

Hello,

I ran into the same error when running ICON for CMAQ, and fixed it by increasing the number of nodes and processors used for the ICON run. For a 1386x1078 (row x column) grid, I used nodes=2:ppn=96 in my PBS script.

Please start a new issue.