Is r_interpolate_var_2db optimizable?

Hello there~
I found a possible bottleneck in CMAQ5.4 on my machine.
The code is below:

In this code, the range store_beg_ind to store_end_ind covers a flattened 3D array. But if lvl is present, the output data only uses a flattened 2D slice of it.
With -Dparallel, I think most calls will have lvl present.
The reference call is below:

I have tested that cio_bndy_data changes at the first time step of every hour, but I'm wondering whether it changes at any other time.
If it doesn't, maybe we can cache cio_bndy_data(store_beg_ind:store_end_ind) to achieve some performance gain.
For example:
Suppose subroutine r_interpolate_var_2db (vname, date, time, data, type, lvl) has been called once and cio_bndy_data(store_beg_ind:store_end_ind) has been saved to a cache. When it is called again with the same vname, date, time, and type but a different lvl, the values have already been calculated, so we can use the cached data and avoid the repeated copies and multiply-adds.
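
To make the idea concrete, here is a rough sketch in Python rather than Fortran, only to show the caching logic. Everything in it is hypothetical: `BoundaryInterpCache`, `toy_interpolate`, the array shapes, and the HHMMSS handling are stand-ins I made up, not CMAQ code. It assumes the interpolated result depends only on (vname, date, time, type), which is exactly the part I am not sure about.

```python
import numpy as np


def toy_interpolate(vname, date, time, type_):
    """Hypothetical stand-in for the full 3D time interpolation.

    Blends two fake buffers (cells x levels) using the fraction of the
    hour taken from an HHMMSS time value.
    """
    n_cells, n_lvls = 4, 3
    head = np.arange(n_cells * n_lvls, dtype=float).reshape(n_cells, n_lvls)
    tail = head + 10.0
    minutes = (time // 100) % 100
    seconds = time % 100
    ratio = (minutes * 60 + seconds) / 3600.0
    return (1.0 - ratio) * head + ratio * tail


class BoundaryInterpCache:
    """Keeps the last full interpolated array and hands out single levels."""

    def __init__(self, compute_full):
        self._compute_full = compute_full
        self._key = None
        self._full = None
        self.misses = 0  # how many times the full interpolation actually ran

    def get_level(self, vname, date, time, type_, lvl):
        key = (vname, date, time, type_)
        if key != self._key:
            # Any change in variable, date, time, or type recomputes everything.
            self._full = self._compute_full(vname, date, time, type_)
            self._key = key
            self.misses += 1
        # lvl is 0-based here, unlike Fortran; copy so callers cannot
        # modify the cached array by accident.
        return self._full[:, lvl].copy()


cache = BoundaryInterpCache(toy_interpolate)
for lvl in range(3):
    level_data = cache.get_level("O3", 2018182, 123000, "b", lvl)
print(cache.misses)  # the full interpolation ran once for three levels
```

The cache holds only one key, so it is thrown away as soon as the time step changes. If the underlying buffer can be refreshed without date or time changing, this sketch would return stale data, so the key would need something extra to detect that.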
I'm basically new to CMAQ and time interpolation, so I am worried that this optimization might cause a bug or some other unanticipated behavior.
Thank you for any ideas~