MCIP v5.1 floating point error in resistcalc.f90

@tlspero

Dear all,

I get following error after running MCIPv5.1.

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0 0x7f019ce912da in ???
#1 0x7f019ce90503 in ???
#2 0x7f019c2d7f1f in ???
#3 0x55e5854f669b in resistcalc_
at /home/a13/NAS/a13/mcip/src/resistcalc.f90:281
#4 0x55e5854cc2bc in pblsup_
at /home/a13/NAS/a13/mcip/src/pblsup.f90:331
#5 0x55e585477edd in dynflds_
at /home/a13/NAS/a13/mcip/src/dynflds.f90:67
#6 0x55e58543952c in mcip
at /home/a13/NAS/a13/mcip/src/mcip.f90:169
#7 0x55e585439833 in main
at /home/a13/NAS/a13/mcip/src/mcip.f90:79
Floating point exception (core dumped)
Error running mcip

1 Like

@a.kashfi73 –

The traceback shows that MCIP is crashing on line 281 in resistcalc.f90 where there is a division by leaf area index, which I suspect is set to 0.0 when MCIP is crashing. LAI could be 0.0 over water, but the block where LAI is calculated in MCIP should only be activated on land points. Can you check the values of XLAI and XLWMASK for the point where MCIP is crashing?

–Tanya

@tlspero
Dear Tanya,

I thank you for your quick response. I am sorry, but can you please tell me how I can check that?
I just saw LAI and LANDMASK in my WRF output, the values were fine to me(zero at sea, and not zero on land). Can you please tell me what you mean by the point where MCIP is crashing?

Thank you again,

Hi,
I think she’s referring the grid cell where mcip is dividing by zero and then crashing. One way to do it would be to add some write statements to resistcalc.f90 outputting i,j and the lai and landmask values just above line 281 (where it crashes) and then recompile and rerun

@mmallard @tlspero

Thank you for your suggestion Megan. Code is now as follows.

print*,xlai(c,r)
print*,xlwmask(c,r)
radf = 1.1 * xrgrnd(c,r) / ( radl * xlai(c,r) ) ! NP89 - EQN34
f1 = (rstmin / rsmax + radf) / (1.0 + radf)

With results as follows.

0.283726513
1.00000000
0.310195982
1.00000000
0.337951839
1.00000000
0.366095901
1.00000000
0.394635946
1.00000000

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0 0x7f2bcbed22da in ???
#1 0x7f2bcbed1503 in ???
#2 0x7f2bcb318f1f in ???
#3 0x559d436207c3 in resistcalc_
at /home/a13/NAS/a13/mcip/src/resistcalc.f90:282
#4 0x559d435f62bc in pblsup_
at /home/a13/NAS/a13/mcip/src/pblsup.f90:331
#5 0x559d435a1edd in dynflds_
at /home/a13/NAS/a13/mcip/src/dynflds.f90:67
#6 0x559d4356352c in mcip
at /home/a13/NAS/a13/mcip/src/mcip.f90:169
#7 0x559d43563833 in main
at /home/a13/NAS/a13/mcip/src/mcip.f90:79
Floating point exception (core dumped)
Error running mcip

Thank you for your time.

That’s puzzling… I expected that LAI would have been set to zero somehow and the error was caused by dividing by zero. But LAI was set to a reasonable value just before the crash.
This type of error could also be caused by trying to do arithmetic operations with really, really small or big numbers. I’d recommend outputting the other variables in the equation (xrgrnd and radl) for completeness to make sure that all terms have reasonable values.

@tlspero @mmallard

Thank you for your time and for your help. I just noticed I had made a mistake in my log file. Sorry for that. There is a LAI zero as follows:

The code is:
IF ( rstmin > 130.0 ) THEN
radl = 30.0 ! [W/m2]
ELSE
radl = 100.0 ! [W/m
2]
ENDIF
print*,xlai(c,r)
print*,xlwmask(c,r)
print*,xrgrnd(c,r)
print*,c
print*,r
radf = 1.1 * xrgrnd(c,r) / ( radl * xlai(c,r) ) ! NP89 - EQN34
f1 = (rstmin / rsmax + radf) / (1.0 + radf)

The result is:
45
111
0.599254608
1.00000000
0.00000000
45
112
0.625336707
1.00000000
0.00000000
45
113
0.652132213
1.00000000
0.00000000
45
114
0.00000000
1.00000000
0.00000000
46
40

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0 0x7f83c0e1f2da in ???
#1 0x7f83c0e1e503 in ???
#2 0x7f83c0265f1f in ???
#3 0x5632d6bed925 in resistcalc_
at /home/a13/NAS/a13/mcip/src/resistcalc.f90:286
#4 0x5632d6bc32bc in pblsup_
at /home/a13/NAS/a13/mcip/src/pblsup.f90:331
#5 0x5632d6b6eedd in dynflds_
at /home/a13/NAS/a13/mcip/src/dynflds.f90:67
#6 0x5632d6b3052c in mcip
at /home/a13/NAS/a13/mcip/src/mcip.f90:169
#7 0x5632d6b30833 in main
at /home/a13/NAS/a13/mcip/src/mcip.f90:79
Floating point exception (core dumped)
Error running mcip

I thank you again.

@a.kashfi73 –

What LSM are you using in WRF? Are you using an urban model in WRF?

–Tanya

Dear Tanya,

I thank you for your quick response. My WRF sf_surface_physics option is layer thermal diffusion scheme. And I am not using urban model in WRF.

Thank you,

Is there any ice at this point? That would one way of having a zero LAI value over a land point.

I am not so sure. How can I check that? I only have SEAICE variable which is zero over my domain.
But what would be the solution for mcip anyways?

Thank you,

I was wondering if the land use category was a snow or ice category where LAI would be reduced. Just to clarify, even for cells in such categories, LAI should be set to a minimum value of 0.01. So this could be an odd bug on WRF’s end, or maybe there’s some confusion with the landsea mask. Either way, it’d be good to know the landuse category.

To find the LU category for this grid cell, you can check the field LU_INDEX in the wrfout file to get the category number or you could modify the resistcalc code to output it from mcip (“lu” is used just above around line 200). Then check the wrfout’s attributes to see which LU dataset was used. For example, for runs that used the USGS dataset, the attribute MMINLU = “USGS”. If you look at WRF’s vegetation parameter table under the “USGS” heading, you’ll find the category numbers on the first column and the category names in the last column. You’d need to check the corresponding stanzas of the table if you used a different dataset, such as NLCD.
A copy of the vegparm table can be found here:

I thank you for your guidance and for your time. There code now is as follows:

print*,xlai(c,r)
print*,xlwmask(c,r)
print*,xrgrnd(c,r)
print*,c
print*,r
print*,lu
radf = 1.1 * xrgrnd(c,r) / ( radl * xlai(c,r) ) ! NP89 - EQN34

And result is:

0.652132213
1.00000000
0.00000000
45
114
12
0.00000000
1.00000000
0.00000000
46
40
7
MMINLU of WRF output is “MODIFIED_IGBP_MODIS_NOAH”, and for this option 7 is ‘Open Shrublands’.

Thank you again,

No problem!
LAI should be constrained to a minimum value of 0.6 for that land use category. By the way, have you verified that there is a zero LAI at this grid cell within the wrfout file that’s being read? I doubt that mcip is overwriting it with a zero, but it’s worth eliminating the possibility.
Also, what WRF version is this? You can tell be looking at the TITLE attribute in your wrfout files.

By the way, if it turns out that this is a single odd value coming in from WRF and you’d like a quick, hard-coded solution to get by it, you could overwrite the zero LAI at that point with a quick if statement. metvars2ctm.f90 (see line 406) plugs in hard-coded LAI values from arrays just in case they are not available to read in from WRF, so you could use those as a reference of what to hard-code it to (see the 7th entry in the “laimod” and “laimnmod” arrays further down).

Of course, this would be a problematic solution if there are several zero’d LAI values to contend with and this is just the first one encountered by the code. So it’d be good to verify where the zero LAI value at that grid cell comes from so we can continue diagnosing the problem.

Hello again,

I do appreciate your help. Yes, there are also other points with zero LAI in WRF output. I am using WRFv4.1.3.
And for your suggestion, it also did not work, because the metvars2ctm.f90 does not execute to the line 406. It would execute line 377 first and skip the rest of the script.
ELSE IF ( iflai .AND. ( MAXVAL(lai) < smallnum ) ) THEN
xlai(:,: ) = lai(sc:ec,sr:er)
My fix is to just change the greater sign (>) to the less sign (<) and everything will be fine. Is that the right way do you think?
I just checked the metvars2ctm.f90 in the CMAQv5.2.1 package, the above IF condition is also different.
Please let me know how it goes.

Thank you,

@a.kashfi73 –

I would NOT advise you to follow your suggestion.

The MCIP variable “lai” is initialized to 0.0 in init_met.f90. If leaf-area index is available in the WRF output as netCDF variable “LAI” or “LAI_PX”, it will be read and loaded into MCIP variable “lai” or “lai_px” in rdwrfem.f90.

In metvars2ctm.f90, the MCIP internal variable “xlai” is filled on the transfer grid (hence, the “x” in the name of the variable) which is the MCIP output domain plus the boundaries. In the section of code that you identified in metvars2ctm.f90, MCIP is trying to determine how to fill “xlai”, whether it uses leaf-area index from a variable in the WRF output or whether it will use a look-up table based on land use classification.

In the solution that you propose, you are suggesting that leaf-area index might be available in the WRF output file, but you want to bypass it and use a lookup table instead. That is, “MAXVAL(lai)” will still be 0.0 if LAI was not read from the WRF output, but it will be larger than “smallnum” if LAI was in the WRF output and filled with realistic values. We only want to use the lookup table if LAI is not available. However, if LAI is available, we need to ensure that there are reasonable values in the field.

I need a few more pieces of information from you to determine whether MCIP has been properly coded to handle the 5-layer thermal diffusion scheme (which I cannot guarantee that it has).

  1. I need to know whether or not LAI is present in your WRF output file and what kind of values it contains. Therefore, I’d like for you to first type “ncdump -h wrfout | grep LAI $1”, and then paste the result of that in your reply here. That will show whether or not LAI is even in your WRF output file (so we know whether we need to use a lookup table in MCIP or if we are getting LAI directly from WRF).
  2. If LAI is in the file, I’d like for you to visualize the field and tell me the range of values.
  3. I’d also like to see the header to your WRF file. Please type “ncdump -h wrfout” and post the results in your reply. You can either cut-and-paste to the reply or add an attachment.

Thank you. Hopefully we will get this resolved for you soon. I want to determine if this is specific to you or if there is a bug in MCIP for handling the 5-layer thermal diffusion scheme.
–Tanya

Thank you Tanya,

Please see the attachments.

header.txt (38.6 KB) LAI.txt (223 Bytes)

@a.kashfi73 –

Thank you for providing the information. I’m still looking into your issue, and I don’t see anything obvious, so I’ll need to do a little more digging. I may need to ask for some additional files, as well.

In the meantime, I’m curious as to why (scientifically) you’ve selected the 5-layer thermal diffusion instead of a more advanced LSM like Noah or P-X. Would you be willing to answer that?

–Tanya

Hi Tanya,

I would just like to add that I’m having the exact same error message after running MCIP V5.1 over the weekend. So, this isn’t specific to a.kashfi73. Please let me know if there is any information I can provide with regard to my run/error that might help find the cause of this bug.

Best,

Nathan

@NRiceKYDAQ –

I’m sorry that you’re also having the same issue. Can you confirm the setting you used for sf_surface_physics in WRF, as well as the land use classification in your runs?

–Tanya