CMAQ PM2.5 overestimation

Hello CMAS members,

I am using CMAQ v5.5 to model PM2.5 concentrations over the East Asia region.
For emissions, I am using the EDGARv8.1 inventory, which I processed into a CMAQ-compatible format using SMOKEv5.0.

However, my model results show a serious overestimation of PM2.5 concentrations. Does anyone have suggestions for diagnosing or addressing this issue?

Hi,
For East Asia CMAQ simulations, there are several emission inventories better suited than EDGAR, which is relatively coarse and outdated. If you have already verified that your meteorological fields are correct, I would suggest replacing the EDGAR inventory with the MIX emission inventory (developed by Dr. Meng Li).

Ryan,

Thank you for your response. As far as I know, the currently available MIX inventory only goes up to 2017. On the other hand, EDGAR v8.1 provides data up to 2022 at the same spatial resolution.

The issue is that previous studies generally point to large biases in EDGAR, which makes it less well suited here. It may even be better to use the newest MIX inventory directly, with emission projections applied (I suspect MIX 2017 would outperform EDGAR 2022 if you try it).
By the way, I recall that there is a newer version of the MIX inventory for around 2020; you could reach out to Dr. Li to ask whether she has released the update.

Hello,

I am currently working on configuring the SMOKE model to utilize the EDGARv8.1 emission inventory on a monthly basis. Based on your previous posts, I believe you have successfully processed the EDGAR monthly gridmaps.

I checked your arinv file in the following post and realized you have obtained monthly .nc files. Could you please let me know how this is done? How can I take the EDGAR monthly gridmaps, which come as single .nc files containing all 12 months of the year, and split them into separate monthly .nc files (for EDGARv8.1 or HTAPv3.1)? Do I need a specific tool or software?

I also have a general question: in your experience, how does EDGARv8.1 compare to HTAPv3.1 in terms of data quality and model performance?

Kind Regards,

Mason

Mason,

As for EDGARv8.1, since it is provided only as an annual inventory, I was not able to separate it into monthly files. On the other hand, HTAPv3.1 offers both monthly and annual formats. I attempted to process the monthly data, but when I filled in the “month” field in the arinv file with values other than 0, I encountered a negative value error. As a result, I am currently using the annual files for my simulations.

In my experience, HTAPv3.1 tends to perform better than EDGARv8.1, possibly because HTAP includes regional inventories in its emission estimates.

By the way, were you able to process a monthly inventory by setting the “month” in the arinv file? If so, I would appreciate it if you could share how you managed to do this.

Best regards,
Heomin

Dear Heomin,

Thank you very much for your information.

Based on my understanding, EDGARv8.1 and HTAPv3.1 offer their monthly gridmap data in a consistent format, as noted in their “Click to Expand” sections.

I am currently using Xarray in Python to extract a specific month from the 12-month EDGARv8.1 monthly gridmap .nc file. Following the information from @ernesto.pinoc, I also sliced out the “time” variable, resulting in a time-independent .nc file containing data for that month only.
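For reference, a minimal sketch of that step is shown below. The input file name, the name of the time dimension (“time”), and the month index are my assumptions and may need to be adjusted to match the actual EDGARv8.1 gridmap files:

```python
# Minimal sketch: extract a single month from a 12-month EDGAR v8.1 gridmap file.
# The file names and the "time" dimension name are placeholders/assumptions.
import xarray as xr

# Open the gridmap file that contains all 12 months along a "time" dimension
ds = xr.open_dataset("EDGARv8.1_2022_PM2.5_monthly.nc")

# Select January (index 0) and drop the time coordinate so the result is a
# time-independent dataset holding only that month's emissions
jan = ds.isel(time=0).drop_vars("time", errors="ignore")

# Write the single-month, time-independent NetCDF file
jan.to_netcdf("EDGARv8.1_2022_PM2.5_01.nc")
```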

I still have to update the arinv file with the corresponding month number (instead of 0) to test whether this works. However, I am creating my GRIDMASK file at the moment and will post an update once that is done.

Kind Regards,

Mason

Dear Heomin,

I was finally able to process the monthly EDGAR v8.1 inventory files (in smkinven) by setting the “month” field in the arinv file.

I’d like to thank @eyth.alison, @cjcoats, and @bbaek for their continued guidance and support on this forum. 🙏

I’m currently working with SMOKEv5.2 using the Intel Fortran Compiler (ifx) Version 2024.1 on Ubuntu 22.04 LTS.

I have attached the code I developed for processing the v8.1 database .nc files, as well as my arinv file.

arinv.area.lst.txt (5.8 KB)

Edgar v8.1 Monthly Processor - January.py (574 Bytes)
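Since the attached script handles January only, the same extraction extends naturally to a loop that writes one file per month; a rough sketch (again with placeholder file names and an assumed “time” dimension) would be:

```python
# Rough sketch: split a 12-month gridmap into 12 separate, time-independent files.
# File names and the "time" dimension name are placeholders, not EDGAR conventions.
import xarray as xr

ds = xr.open_dataset("EDGARv8.1_2022_PM2.5_monthly.nc")

for m in range(12):
    # Month m+1: select the slice and remove the time coordinate
    monthly = ds.isel(time=m).drop_vars("time", errors="ignore")
    monthly.to_netcdf(f"EDGARv8.1_2022_PM2.5_{m + 1:02d}.nc")
```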

Best Regards,

Mason

Dear Mason,

I am glad to hear that the issue has been resolved. I just wanted to ask whether the SECTOR names in the EDGAR v8.1 monthly data need to be mapped, since they do not seem to be identical to the SECTOR names generally used in the SMOKE model.

If possible, could you kindly share the ge_dat data and the Run scripts you are using for the EDGAR v8.1 data?
My email address is gjals1015@pusan.ac.kr, and I would greatly appreciate your response.

Sincerely,
Heomin

Dear Min,
Thank you for your message.

Regarding the sector names, I assume that the first column (SCC) in the ARINV list file can be configured according to the user’s preference. Please correct me if I am mistaken.

In contrast, the third column, “Variable_Name”, appears to map directly to the exact variable names in your NetCDF files.

You can find the actual (unmodified) variable names of the four datasets (HTAP v2, HTAP v3, HTAP v3.1, EDGAR v8.1) in the attached screenshot.

However, you could even modify the variable names in the .nc files themselves; see EXTRACT emissions from EDGAR_HTAPv3 with limited geographical domain (South America) - #16 by Jano.
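If you go that route, a small xarray example of the renaming step is below; both variable names here are invented for illustration and should be replaced with the actual name in your .nc file and the name listed in your arinv “Variable_Name” column:

```python
# Illustration: rename a variable inside a NetCDF file with xarray.
# Both variable names below are made up; substitute your actual names.
import xarray as xr

ds = xr.open_dataset("EDGARv8.1_2022_PM2.5_01.nc")

# Map the inventory's variable name to the name expected in the arinv file
ds = ds.rename({"emi_pm2_5": "PM2_5"})

ds.to_netcdf("EDGARv8.1_2022_PM2.5_01_renamed.nc")
```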

Kind Regards,

Mason

Dear Min,

I noticed that the .ncf files generated by my earlier (normally completed) smkinven run cause a ‘Negative Value’ error in the temporal program.

I applied your solution and set the month number to 0 in the ARINV file for the monthly files, and temporal finished with Normal Completion!

Dear Mason,

I’m glad to hear that the error has been resolved.
However, as far as I understand, setting the month to 0 means treating the emissions inventory as annual data. I am not quite sure if it is appropriate to split the monthly inventory into separate months and then set the month to 0.

What I would like to achieve is to use the monthly inventory directly, without applying the monthly factor from atpro, so that only the weekly and hourly factors are applied.

Best regards,
Min

Dear Min,

Thank you for your insights; I have a much clearer understanding now. I’m happy to share my thoughts on the topic, though I’m still relatively new to this.

Regarding the monthly factors: varying monthly factors are applied when processing yearly inventory files. If the inventory input consists of pre-gridded, separate monthly NetCDF files, then the full emissions of each month should be processed (i.e., a factor of 1 for each month), unless there is some other logic behind it.
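As a toy illustration of why the factor should be 1 (all numbers below are made up):

```python
# Toy numbers only: why monthly inventory files should get a monthly factor of 1.
annual_total = 1200.0  # hypothetical annual emissions, e.g. t/yr
atpro_monthly = [0.12, 0.10, 0.09, 0.08, 0.07, 0.06,
                 0.06, 0.07, 0.08, 0.09, 0.10, 0.08]  # hypothetical profile, sums to 1.0

# Annual inventory: the temporal step applies the monthly profile
jan_from_annual = annual_total * atpro_monthly[0]   # 144.0

# Monthly inventory: January's 144.0 is already in the file, so the only
# consistent factor is 1.0; applying the profile again (144.0 * 0.12 = 17.28)
# would apply the seasonality twice.
jan_from_monthly = 144.0 * 1.0
```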

I assume the issue with the month variable in arinv.edgar.lst for EDGARv8 is specific to this dataset. Also, setting the month variable to 0 (as in your current solution for the EDGARv8 files) does not by itself direct SMOKE to process the data on a yearly basis; it appears to do that regardless. Before applying your solution, I ran smkinven and temporal with month ≠ 0 and without providing ATPRO_MONTHLY; even so, temporal still prompted for ATPRO_MONTHLY (@eyth.alison had mentioned this). It seems the environment variable SMKINVEN_MONTH (set via setenv) may be the more critical control for monthly processing.

Considering EDGARv2 and its separate monthly files, besides their smooth compatibility with SMOKE, it seems to me that feeding separate monthly files is almost essential for targeted monthly processing, since it avoids pushing an entire year of data (yearly inventories) through temporal. With EDGARv8’s new structure, it appears to be mainly a matter of data presentation, which requires additional preprocessing steps from SMOKE users.

The only minor drawback of your solution might be the need for multiple arinv.edgar.lst files to distinguish among the files of different months. Given that G_RUNLEN has a small cap, SMKINVEN_MONTH targets a single month, and running the temporal program on a daily basis is common practice, I do not think this challenge makes a big difference in the end.

Kind regards,

Mason


Just a correction: G_RUNLEN is not limited to a small cap and accepts hour counts of more than two digits.