AMET for specific region and specific emissions level


I am reading the model performance part of the EPA 2016V2 AQ modeling TSD

EPA divided the CONUS into multiple regions, and considered the ozone pollutant for the following conditions:


I want to run the *stats_plots_leaflet.csh script from /scripts_analysis/aqExample for the same time period as EPA (May to September) and for a specific region (for example, SOUTH) when MDA8 >= 60 ppb.

I know how to run the analysis script for a specific time period using the AMET_ADD_QUERY variable (please see this screenshot).

I am not sure what to change in the sitex_output script to get only the 8HR MAX ozone (MDA8) data >= 60 ppb.

I can manually extract all the rows to create a new file for the IN_TABLE variable using the first two digits of site_id (for example, Arkansas site_ids start with 05) for all the states in the SOUTH region, then replace the file name in the sitex script (setenv IN_TABLE), and repeat for all other regions.

My questions are:

  1. What changes do I need to make to run the *stats_plots_leaflet.csh script for MDA8 ozone >= 60 ppb?

  2. Is there any easy way to run the script for a specific region (SOUTH, SOUTHEAST, SOUTHWEST, etc.) without creating a new IN_TABLE file for each of the regions?

Thanks in advance

Thanks for your questions. I’ll answer them separately below.

  1. What changes do I need to make to run the *stats_plots_leaflet.csh script for MDA8 ozone >= 60 ppb?

To limit your analysis to just MDA8 O3 values >= 60 ppb, you can add the following environment variable in the run_stats_plots_leaflet.csh script:

    setenv AMET_ADD_QUERY "and d.O3_8hrmax_ob >= 60"

This will limit the MDA8 O3 observations to only those values greater than or equal to 60 ppb.

    setenv AMET_ADD_QUERY "and (d.O3_8hrmax_ob >= 60 or d.O3_8hrmax_mod >= 60)"

will limit the analysis to only those records where EITHER the observed or the modeled MDA8 O3 value is greater than or equal to 60 ppb.

    setenv AMET_ADD_QUERY "and (d.O3_8hrmax_ob >= 60 and d.O3_8hrmax_mod >= 60)"

will limit the analysis to only those records where BOTH the observed and modeled MDA8 O3 values are greater than or equal to 60 ppb.

  2. Is there any easy way to run the script for a specific region (SOUTH, SOUTHEAST, SOUTHWEST, etc.) without creating a new IN_TABLE file for each of the regions?

    To limit the analysis to a specific region, you can use the predefined regions in the Network.input file,
    which can be found in $AMET_BASE/scripts_analysis/aqExample/input_files. At the bottom of the
    file are predefined combinations of states for the different RPO, PCA, and NOAA climate regions.

    So, for example, if you want to limit your analysis to just the South NOAA climate region, you can
    set the “clim_reg” variable in the all_scripts.input file to “South” (the default value for clim_reg in that
    file is “None”). This will limit the states in your analysis to only the states defined in the South
    clim_reg region.

Note that you can use AMET_ADD_QUERY and clim_reg together to limit the MDA8 O3 analysis for the South region to values of 60 ppb or greater.

Let me know if you have any questions about any of my responses, or if you run into any issues. Also, note that you must be using the AMET database for these solutions to work, as they are all database queries. Best of luck!



Hi @wyat.appel ,

Thank you so much for the response. I forgot to mention, I am using AMETv1.4.

I think your suggestion of setting the clim_reg variable for a specific region applies to AMETv1.5.

Can I use the Network.input and all_scripts.input files from v1.5 with the older AMETv1.4 without modifying anything else in v1.4?

That’s actually a good question. I think you should be able to use the updated Network.input and all_scripts.input from v1.5 with v1.4. It’s certainly something that’s easy to try (just keep your original versions around too).

If we run into any trouble with that, there’s another option that will work: we can simply add the clim_reg query string to the AMET_ADD_QUERY string you’re already using. So, no worries if replacing the files does not work for some reason.
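As a hypothetical sketch of what that combined query string could look like in AMETv1.4: the state list for the NOAA South climate region and the `s.state` column name below are assumptions, so verify them against the South definition in Network.input and the column names in your AMET site metadata table before using it.

```csh
# Hypothetical AMETv1.4 workaround: fold the South region's states directly
# into the query string. Check the state list against Network.input and the
# "s.state" column against your AMET database schema.
setenv AMET_ADD_QUERY "and s.state IN ('TX','OK','KS','AR','LA','MS') and d.O3_8hrmax_ob >= 60"
```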


Hello @wyat.appel , I am sorry for the late response. I have encountered a new issue while running AMETv1.4 before trying your suggestions (please see below):

Step 1: First I created a file to use in AMET with the COMBINE software. My spec_def file contains only one pollutant, as I need only MDA8 (daily maximum 8-hour average) ozone.
Screenshot from 2022-09-26 12-35-13

Step 2: Then I set the ‘T’ flag only on AQS_DAILY_O3 in the scripts_db file.

Step 3: After running the script, I got the following error:

Screenshot from 2022-09-26 12-33-27

The run created a sitex_AQS_Daily_O3 script and an empty AQS_Daily_O3.csv file.

It looks like the script is not generating MDA8 ozone data, and I couldn’t figure out the issue.

Does the script automatically calculate 8-hr max ozone (MDA8) data? If not, what changes do I need to make in the script?

Thanks in advance for all your suggestions and help.

The sitex_AQS_Daily_O3 script should only need O3 from the COMBINE file, but clearly something isn’t working out at the sitecmp_dailyo3 stage which is why you end up with an empty AQS_Daily_O3.csv file and subsequent AMET R analysis script error. The sitecmp_dailyo3 log file should contain information on what’s happening, so that’d be the place to start debugging this issue.

As Christian said, the log file is the place to start to diagnose this problem. Offhand, I’m not sure exactly what the problem is. You could check the sitex script that was created to make sure all the paths to the data files (both the model file and the obs file) are correct, and that the correct dates are specified for your combine file.

But the primary log file is the best place to see where things went wrong.


Hello @hogrefe.christian

I don’t see a sitecmp_dailyo3 log file in any of the AMET directories. I can see only two files in the output/ directory.

Where is it located?

Hi @wyat.appel , the data file paths are correct. One thing I noticed (not sure if this is an issue): in my scripts_db script, the date is formatted as YYYYMMDD.

But in the sitex script, the date format is

Thanks in advance

Can you provide the entire sitex .csh script? And the entire AMET run script? Those might be helpful in diagnosing what’s going on.

There should be a log file somewhere. Is there any log file in your run directory?

Hello @wyat.appel

Thank you so much for the response. No, there is no log file in the AMET directory; I have searched the directories manually and also using this command:
Screenshot from 2022-09-26 16-46-32

Please see the attached scripts_db and sites script
sitex_AQS_Daily_O3_MPE_2016v2_US.csh (2.5 KB)
scripts_db_MPE_2016v2.csh (7.1 KB)

Thanks in advance

Odd that there’s no log file. Can you pipe the output from the sitex_AQS_Daily_O3_MPE_2016v2_US.csh script to a log file and provide that?

Hello @wyat.appel

Please see the attached terminal output from running the sitex_AQS_Daily_O3_MPE_2016v2_US.csh script:
log.txt (33.1 KB)
sitex_AQS_Daily_O3_MPE_2016v2_US.csh (2.5 KB)

Thanks in advance

Thanks for posting this log file. The time stamps in your COMBINE file are incorrect (the file starts on Dec. 22, 0015), so you need to go back and double check how it was created. Because the time period in this COMBINE file doesn’t overlap with any observations, you end up without any matched observation/model pairs in the output file.

 "M3_FILE_1" opened as OLD:READ-ONLY   
 File name "/root/RAID2/AMET_v14/tools_src/combine/ARDEQ_outputs/"
 File type GRDDED3 
 Execution ID "????????????????"
 Grid name "CAMx v7.10"
 Dimensions: 246 rows, 396 cols, 1 lays, 3 vbles
 NetCDF ID:     65536  opened as READONLY            
 Starting date and time    15356:000000 (0:00:00   Dec. 22, 0015)
 Timestep                          010000 (1:00:00 hh:mm:ss)
 Maximum current record number      9024

Hi @hogrefe.christian

Thank you so much for the response. I went back and created a new COMBINE file using the time period 20160101 to 20161231, and then ran the AMET scripts again, which gave me the same error. Please see the attached new log file:
Newlog.txt (33.2 KB)

Thanks in advance

Your year is still incorrect in the new file, which now starts 10 days later than the previous file: it’s now year 16 (not 2016), while before it was year 15 (not 2015). You’re still off by 2000 years.

 Value for M3_FILE_1:  '/root/RAID2/AMET_v14/tools_src/combine/ARDEQ_outputs/'
 Value for M3_FILE_1:  '/root/RAID2/AMET_v14/tools_src/combine/ARDEQ_outputs/'
 Value for M3_FILE_2 not defined; returning defaultval ':  '                '
 Value for M3_FILE_2 not defined; returning defaultval ':  '                '
 Value for IOAPI_CHECK_HEADERS not defined;returning default:   FALSE
 "M3_FILE_1" opened as OLD:READ-ONLY   
 File name "/root/RAID2/AMET_v14/tools_src/combine/ARDEQ_outputs/"
 File type GRDDED3 
 Execution ID "????????????????"
 Grid name "CAMx v7.10"
 Dimensions: 246 rows, 396 cols, 1 lays, 1 vbles
 NetCDF ID:     65536  opened as READONLY            
 Starting date and time    16001:000000 (0:00:00   Jan. 1, 0016)
 Timestep                          010000 (1:00:00 hh:mm:ss)
 Maximum current record number      8784

Hello @hogrefe.christian
Thank you so much for the response. I understand the issue now, thanks. Please see below the steps that I tried:

My AQ model output data (.nc files) uses the YYDDD date format

So, my COMBINE file is in the same format, created using this script:
MPE_combine.Year2016_2016V2.csh (955 Bytes)


As my model data is in YYDDD format, I made a change in the scripts_db script:
MPE_2016v2_MOD.csh (7.1 KB)

Original command

Modified command

This basically gives START_DATE_J and END_DATE_J in YYJJJ format (changing the uppercase ‘Y’ to lowercase ‘y’).

Then I ran the scripts, where I can see that both the mod and obs files are in the same format.

But it’s still giving me the same error:
log_f.txt (33.2 KB)

Would you please suggest what other changes I need to make in the AMET script, or any changes in the COMBINE script?

Thanks in advance

The problem is with the date format of your AQ model output. For CMAQ-based post-processing tools (COMBINE, sitecmp, AMET, etc.) to work properly, all dates need to adhere to the I/O API Date/Time conventions. Specifically, dates need to be in YYYYDDD format.
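As a quick illustration of the convention (assuming GNU coreutils `date` is available), the uppercase `%Y%j` format produces the compliant 7-digit Julian dates, while the lowercase `%y%j` substitution made in the modified script produces exactly the kind of non-compliant 5-digit dates seen in the log:

```shell
# I/O API dates must be YYYYDDD: 4-digit year plus 3-digit day of year.
start=$(date -ud '2016-01-01' +%Y%j)   # compliant: 2016001
end=$(date -ud '2016-12-31' +%Y%j)     # compliant: 2016366 (2016 is a leap year)
bad=$(date -ud '2016-01-01' +%y%j)     # non-compliant: 16001 (year 16 A.D.)
echo "$start $end $bad"
```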

Are the “AQ model output data (.nc files)” you mentioned above the original CAMx output files, or were other tools used to post-process CAMx output files before running the COMBINE script? If these are the original CAMx output files, you would probably want to check with the CAMx developers why the model generates “I/O API - style” .nc files without actually conforming to I/O API conventions, rather than trying to hack CMAQ post-processing tools to ingest such non-compliant files. If some other intermediate tool was used to generate these files, the problem may originate there.

Just changing START_DATE settings in the sitecmp_dailyo3 run script is not a viable approach. In addition to reading model data, the program also reads dates from the observation file and uses I/O API functions to find matching dates. In your current setup, the dates in the model file and the dates in the observation file are fundamentally mismatched, and you need to fix the date problem on the model side of things to conform with I/O API conventions for things to work.

While I still think you should try to find out where things went wrong with your existing setup (original CAMx output or some other potential intermediate processing step used to generate $indir/ARDEQ_${gday}), you could conceivably use the I/O API tool m3tshift to process these $indir/ARDEQ_${gday} files to shift the start date of each file from the year 16 A.D. to the year 2016 A.D., and then run COMBINE and the AMET scripts with the proper 2016 time settings.


Hello @hogrefe.christian

Thank you so much for the suggestions. While I am still trying to figure out the issue in the existing AQ modeling setup, I tried the m3tshift process.

Here is my setup for the m3tshift

I selected all options as DEFAULT except “Enter Target date”:

The m3tshift run was successful. Then I checked whether the new file format is the same as the original.

In the attached diff_1.txt file (only the first 1000 lines, because the full file is too big), I can see some differences; for example, a space appears before the closing double quote (e.g., "PTI "):
diff_1.txt (48.4 KB)

Also, the time period is shifted (16366, 0 in the original file, but 2016366, 10000, in the new file).

Would you please help me understand what I need to fix?

Also, do you have an example bash script to process all 366 modeling output files in one run?

Thanks in advance

This seems reasonable, and the differences you’re highlighting confirm that whatever tool was used to generate the original file (CAMx or some downstream tool) created .nc files that “look like” I/O API but deviate from I/O API conventions in a number of ways: an additional “coordinates” attribute, variable names not padded to 16 characters, non-I/O API attributes such as ISTAG and IUTM, etc. All of that is removed when the m3tshift tool (which obviously uses the I/O API library) creates a file. Most of those deviations from the conventions didn’t cause problems, but the deviation from the date convention obviously did.

Given that you could run m3tshift successfully and the time stamps in the new file are now correct, I think you can proceed with this band-aid approach until/unless you find a way to create I/O API-compliant files further upstream in your workflow, which would be the preferred approach when using CMAQ tools for your work.

I don’t have bash scripts for running m3tshift on a series of files (I probably used some csh scripts before), but you’d basically just want to capture in a script what you did interactively while looping over the files, which should be fairly straightforward. You could also apply the m3tshift approach to your existing COMBINE file.
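If it helps, below is a rough bash sketch of such a driver loop. Everything specific in it is an assumption: the directories, the ARDEQ_16DDD file-name pattern, and especially the here-doc answers fed to m3tshift (the blank lines accept the prompt defaults; the one non-default line supplies the target date, mirroring the interactive session described above). Run m3tshift interactively once and adjust the here-doc to match exactly what you answered.

```shell
#!/bin/bash
# Hypothetical driver: run m3tshift over 366 daily files, shifting each
# from year 16 A.D. to 2016. Adapt paths, file names, and prompt answers.
indir=${indir:-/tmp/ARDEQ_outputs}      # assumed location of the year-16 files
outdir=${outdir:-/tmp/ARDEQ_shifted}    # shifted copies go here
mkdir -p "$outdir"

for doy in $(seq -w 1 366); do          # 001 .. 366 (2016 is a leap year)
  in="$indir/ARDEQ_16${doy}"            # assumed input naming pattern
  out="$outdir/ARDEQ_2016${doy}"
  [ -e "$in" ] || continue              # skip missing days
  echo "shifting $in -> $out"
  if command -v m3tshift >/dev/null 2>&1; then
    # Blank lines take the prompt defaults; the 2016${doy} line answers the
    # "Enter target date" prompt. Adjust to match your m3tshift version.
    INFILE=$in OUTFILE=$out m3tshift <<EOF

2016${doy}



EOF
  fi
done
```

The same loop structure carries over to csh if you prefer to stay consistent with your other AMET scripts.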

And to get back to the root cause of these issues, I should probably quote the I/O API author @cjcoats in this context as well, because this is another example of things going wrong when some tools treat I/O API as a format:

Note: the I/O API is not a format – it’s a programming interface (that’s what API means); so far, every attempt I know about to treat it as a format has failed to some degree or another (and I’m the I/O API author, so I’m the authority on that).