CCTM-APT: issue with boundary layer levels

Hi!

I am running a single-day (20111101) CCTM-APT simulation with CMAQv5.0.2.

As the IC file, I am using the CGRID from the previous day (20111031 or 2011304), taken from a separate CCTM run.
As the BC file, I am using the static BCON profile for the same day (20111101 or 2011305) as the intended CCTM-APT simulation.

The model runs and initializes SCICHEM, but then it crashes with the following error message:

particle wash-out timescale (s):
BL grid too big; reducing vertical resolution
nzbl (old,new) = 11 2
ERROR
Routine=set_bl_from_ua
Met grid size too big for boundary layer arrays
Only have 2levels in boundary layer

In the “set_bl_from_ua.f90” routine, it appears that the above error message is written when nzbl is <= 2. I guess that nzbl refers to the number of layers in the vertical direction, but I do not understand where the old nzbl (11) comes from. My met grid file has 35 vertical layers.

Could somebody help me understand this error and suggest some solutions?

Thanks!


Hi LGS

Please look at your SCICHEM control file. The variable NZBL is defined there:

&OPTIONS
T_AVG = .000000,
CMIN = .000000,
LSPLITZ = F,
DELMIN = 1.000000E+36,
WWTROP = 1.000000E-02,
EPSTROP = 3.999999E-04,
SLTROP = 10.0000,
UU_CALM = .250000,
SL_CALM = 1000.00,
NZBL = 11,
MGRD = 3,
GRDMIN = .000000,
Z_DOSAGE = .000000,
VRES = 500.,
DYNAMIC = F,

Hi Prakash,

I found the same information as above in the cmaqapt.inp file. Is 11 the recommended value for NZBL, and if so, why?

To avoid the error above, it seems I would have to set NZBL = 1. Is that a reasonable thing to do, or does it make no sense physically? My met grid has 459 rows, 299 columns and 35 vertical layers. My understanding is that the set_bl_from_ua.f90 routine reads the boundary layer grid size from the met grid and calculates nzbl_max, which is 2 in my case. NZBL was set to 11 in the APTTEST case and PinG ran just fine, but that met grid was smaller (119 rows, 158 columns and 22 vertical layers). What is the best approach to avoid the error I am reporting?

Thank you so much!

Hi LGS

Even though you are providing gridded met, SCICHEM does its own internal boundary layer gradient calculations. The NZBL specifies the number of layers for these calculations. This has nothing to do with the number of vertical layers in your gridded met field. We generally use a value of 11 for NZBL to get adequate resolution within the boundary layer. Please try your run with NZBL = 11.

I used NZBL = 11 in the first place and got the “size” error above.

It is actually nzbl_max that must exceed 11 in order to avoid the model crash at this step. How do I control this?

nzbl_max = MAX3DB / nxyb (in set_bl_from_ua.f90)

MAX3DB = 300000 (in common_met.f90)

Based on these, nxyb is at most 150000 when the calculated nzbl_max = 2 (which becomes the new nzbl, as reported in the error above).
If I reduce nxyb by a factor of 6, nzbl_max becomes 12 and the model would not crash here. Is this what I should do, and how?

nxyb = nxb * nyb

nxb and nyb appear to be dimensions taken from the met grid, but I am not sure what they represent.
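The arithmetic above can be checked with a short sketch. It is not the SCICHEM code: the function name is illustrative, and it assumes that nxb and nyb are the met grid columns and rows and that the division is an integer division, as it would be for Fortran integers.

```python
# Sketch of the size check described above; not the actual SCICHEM routine.
MAX3DB = 300000  # hard-wired limit quoted in this thread


def nzbl_max(nxb, nyb, max3db=MAX3DB):
    """Largest number of boundary layer levels that fits (integer division)."""
    return max3db // (nxb * nyb)


# Assumption: nxb and nyb are the met grid columns and rows.
big_grid = nzbl_max(299, 459)   # the domain that triggers the error
test_grid = nzbl_max(158, 119)  # the smaller APTTEST domain
print(big_grid, test_grid)
```

Under these assumptions the large domain leaves room for only 2 levels, while the APTTEST domain leaves room for more than the 11 requested by NZBL.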

Please advise!

Thank you so much for your time!

Hi LGS

You can’t change nxb and nyb since they are the dimensions of your met grid. In newer versions of SCICHEM, the arrays are dynamically allocated. Unfortunately, the version of SCICHEM in CMAQ-APT is the older one, with some array dimensions hard-wired. The best thing you can do is to change MAX3DB in common_met.f90 and rebuild the code. To get nzbl_max = 11 or higher, set MAX3DB to at least nzbl_max * nxb * nyb.
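As a rough sketch of that sizing rule, the minimum MAX3DB can be worked out ahead of the rebuild. The helper name is hypothetical, and the grid dimensions are the ones quoted earlier in this thread (299 columns, 459 rows). Because MAX3DB is declared as a Fortran parameter, the result would have to be entered in the source as a literal constant.

```python
def required_max3db(nzbl, nxb, nyb):
    """Smallest MAX3DB for which MAX3DB // (nxb * nyb) >= nzbl."""
    return nzbl * nxb * nyb


# Assumption: nxb = 299 and nyb = 459 (met grid columns and rows), NZBL = 11.
needed = required_max3db(11, 299, 459)
print(needed)

# Sanity check: with this value the integer division gives back NZBL.
levels = needed // (299 * 459)
```

Any value at or above this number should satisfy the size check; a somewhat larger round number would work as well.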

Hi Prakash,

Thank you for your reply.

The MAX3DB parameter is set in SCICHEM/pig/inc/met_param_pig_inc.f90 (not in common_met.f90, as I mistakenly said before, sorry for the confusion).

To change MAX3DB as you suggested, I modified this line in met_param_pig_inc.f90:

integer, parameter :: MAX3DB = nzbl * nxb * nyb ! max 3d boundary layer field size

Then I rebuilt cctm_apt. However, in the new BLD directory that was generated, the copy of met_param_pig_inc.f90 placed there does not reflect the change above; it still shows:

integer, parameter :: MAX3DB = 300000 ! max 3d boundary layer field size

The gfortran module file generated from met_param_pig_inc.f90 (met_param_inc.mod) also shows that MAX3DB still has the old value ‘300000’:

‘max3db’ ‘met_param_inc’ ‘’ 1 ((PARAMETER UNKNOWN-INTENT UNKNOWN-PROC
UNKNOWN IMPLICIT-SAVE 0 0) (INTEGER 4 0 0 0 INTEGER ()) 0 0 () (
CONSTANT (INTEGER 4 0 0 0 INTEGER ()) 0 ‘300000’) () 0 () () () 0 0)

Also, the timestamp of the source file defining MAX3DB that was copied into the new BLD directory does not match the timestamp of the file I modified. I used gedit to make the change and it shows that the file was saved. Am I missing something?

Hi Prakash,

I figured out why the change to MAX3DB was not picked up when I rebuilt cctm_apt: the file that defines MAX3DB exists in two places. The copy actually used during the build is the one stored in “symlinked”:

$M3MODEL/CCTM_APT/SCICHEM/pig/inc/met_param_pig_inc.f90
$M3MODEL/CCTM_APT/SCICHEM/symlinked/met_param_pig_inc.f90

Also, to avoid declaring the variables in the new expression for MAX3DB, I chose to use an integer constant larger than 300000. The value is based on my met grid dimensions: (299 x 459) x 11, where ‘11’ is the nzbl specified in the control file (cmaqapt50.inp). This solved the “size” error that I reported in the first place.
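For anyone hitting the same problem, a small sketch like the following can list every copy of the include file under a source tree together with the MAX3DB value each one defines, so a stale copy is easy to spot before rebuilding. The function name and the regular expression are illustrative assumptions, not part of CMAQ or SCICHEM.

```python
import os
import re


def find_max3db_definitions(root, filename="met_param_pig_inc.f90"):
    """Return sorted (path, value_text) pairs for each copy of `filename`."""
    # Capture whatever follows "MAX3DB =" up to a Fortran comment or line end.
    pattern = re.compile(r"MAX3DB\s*=\s*([^!\n]+)", re.IGNORECASE)
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        if filename in files:
            path = os.path.join(dirpath, filename)
            with open(path, "r", errors="replace") as fh:
                for line in fh:
                    match = pattern.search(line)
                    if match:
                        hits.append((path, match.group(1).strip()))
    return sorted(hits)
```

Calling `find_max3db_definitions` on the CCTM_APT source directory and printing the pairs would show whether all copies agree. Note that `os.walk` does not descend into symlinked directories by default, so a tree that links whole directories may need `followlinks=True`.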

Back to the cctm_apt simulation: now that the “size” error has been solved, the simulation runs a bit longer (ca. 10-15 minutes of CPU time) but ends with the following message (repeated 32 times):

Program received signal 11 (SIGSEGV): Segmentation fault
Backtrace for this error:
Thread 32 (Thread 0x2ba56ce22700 (LWP 40584)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6

Is there any reason why this could happen? I’d appreciate your feedback on this.

Thank you!

Correction: the thread backtrace in the segmentation fault above repeats 31 times, not 32. Thread 1 is different and has 22 lines in it. Is this due to a memory size problem or something else? Is my domain too big, or is the change I made to MAX3DB causing it? I would appreciate any thoughts on this issue.

Program received signal 11 (SIGSEGV): Segmentation fault.
Backtrace for this error:
Thread 32 (Thread 0x2ba56ce22700 (LWP 40584)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 31 (Thread 0x2ba56d023700 (LWP 40585)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 30 (Thread 0x2ba56d224700 (LWP 40586)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 29 (Thread 0x2ba56d425700 (LWP 40587)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 28 (Thread 0x2ba56d626700 (LWP 40588)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 27 (Thread 0x2ba56d827700 (LWP 40589)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 26 (Thread 0x2ba56da28700 (LWP 40590)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 25 (Thread 0x2ba56dc29700 (LWP 40591)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 24 (Thread 0x2ba56de2a700 (LWP 40592)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 23 (Thread 0x2ba56e02b700 (LWP 40593)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 22 (Thread 0x2ba56e22c700 (LWP 40594)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 21 (Thread 0x2ba56e42d700 (LWP 40595)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 20 (Thread 0x2ba56e62e700 (LWP 40596)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 19 (Thread 0x2ba56e82f700 (LWP 40597)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 18 (Thread 0x2ba56ea30700 (LWP 40598)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 17 (Thread 0x2ba56ec31700 (LWP 40599)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 16 (Thread 0x2ba56ee32700 (LWP 40600)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 15 (Thread 0x2ba56f033700 (LWP 40601)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 14 (Thread 0x2ba56f234700 (LWP 40602)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 13 (Thread 0x2ba56f435700 (LWP 40603)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 12 (Thread 0x2ba56f636700 (LWP 40604)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 11 (Thread 0x2ba56f837700 (LWP 40605)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 10 (Thread 0x2ba56fa38700 (LWP 40606)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 9 (Thread 0x2ba56fc39700 (LWP 40607)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 8 (Thread 0x2ba56fe3a700 (LWP 40608)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x2ba57003b700 (LWP 40609)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x2ba57023c700 (LWP 40610)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x2ba57043d700 (LWP 40611)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x2ba57063e700 (LWP 40612)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x2ba57083f700 (LWP 40613)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x2ba570a40700 (LWP 40614)):
#0 0x0000003c2ca0f076 in ?? () from /usr/lib64/libgomp.so.1
#1 0x0000003c2ca0e0a0 in ?? () from /usr/lib64/libgomp.so.1
#2 0x0000003c21a07a51 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003c212e893d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x2ba517f16fc0 (LWP 40482)):
#0 0x0000003c21a0f283 in wait () from /lib64/libpthread.so.0
#1 0x0000003c2721400d in ?? () from /usr/lib64/libgfortran.so.3
#2 0x0000003c2721582e in ?? () from /usr/lib64/libgfortran.so.3
#3 0x0000003c272146ca in ?? () from /usr/lib64/libgfortran.so.3
#4 signal handler called
#5 0x0000003c2128995f in memcpy () from /lib64/libc.so.6
#6 0x00002ba517750383 in MPID_Segment_contig_m2m () from /home/lgs4/lib/mpich-3.2/lib/libmpi.so.12
#7 0x00002ba5177472db in MPID_Segment_manipulate () from /home/lgs4/lib/mpich-3.2/lib/libmpi.so.12
#8 0x00002ba517750521 in MPID_Segment_unpack () from /home/lgs4/lib/mpich-3.2/lib/libmpi.so.12
#9 0x00002ba517736aef in lmt_shm_recv_progress () from /home/lgs4/lib/mpich-3.2/lib/libmpi.so.12
#10 0x00002ba5177377f8 in MPID_nem_lmt_shm_progress () from /home/lgs4/lib/mpich-3.2/lib/libmpi.so.12
#11 0x00002ba51772bf77 in MPIDI_CH3I_Progress () from /home/lgs4/lib/mpich-3.2/lib/libmpi.so.12
#12 0x00002ba5176c3147 in MPIC_Wait () from /home/lgs4/lib/mpich-3.2/lib/libmpi.so.12
#13 0x00002ba5176c37c3 in MPIC_Sendrecv () from /home/lgs4/lib/mpich-3.2/lib/libmpi.so.12
#14 0x00002ba5176112ed in MPIR_Bcast_scatter_ring_allgather () from /home/lgs4/lib/mpich-3.2/lib/libmpi.so.12
#15 0x00002ba517611b45 in MPIR_Bcast_intra () from /home/lgs4/lib/mpich-3.2/lib/libmpi.so.12
#16 0x00002ba51761268d in MPIR_Bcast_impl () from /home/lgs4/lib/mpich-3.2/lib/libmpi.so.12
#17 0x00002ba517612cb8 in PMPI_Bcast () from /home/lgs4/lib/mpich-3.2/lib/libmpi.so.12
#18 0x00002ba517cfa5d5 in pmpi_bcast__ () from /home/lgs4/lib/mpich-3.2/lib/libmpifort.so.12
#19 0x0000000000575048 in ping_ ()
#20 0x00000000004a8650 in sciproc_ ()
#21 0x00000000004a2407 in MAIN__ ()
#22 0x00000000004093cd in main ()
3162.121u 460.321s 3:33.96 1693.0% 0+0k 27824+7827104io 67pf+0w

Hi Prakash,

One conclusion I can draw from all of the above is that PinG (at least the version in CMAQv5.0.2) cannot handle a domain as large as the one I am trying to use (please correct me if I am wrong), and that to make the model work I have to change the PinG source code, with the negative consequences listed below:

(1) If I replace the default MAX3DB value (300000) with a number based on my domain size and the default nzbl value (11) from the cmaqapt50.inp file, the run gets past the size error that I reported in the first place, but it crashes later with the segfault shown above.
(2) I also noticed that MAXPUF has the same hard-coded value as MAX3DB (300000). Are these two related? If I change MAXPUF to the same value as MAX3DB in (1), I cannot build the CCTM_APT executable.

You mentioned above that more recent SCICHEM versions exist. Have they been implemented in CMAQ and, if so, in which CMAQ version?

Thank you!