Inconsistency between tz.csv and AMET site offset?

I was comparing the GMT_offset of AQS sites based on the AMET_site_metadata_files.tar.gz downloaded from the CMAS Data Clearinghouse to the tz.csv file on the CMAQ/POST/hr2day/inputs GitHub page.

I found some potential inconsistencies where the AMET site metadata placed a site in one timezone, while the tz.csv would indicate it belonged in a neighboring timezone.

For example, in the map below, there were mismatched sites in the following states:

  • Florida
  • Tennessee
  • Texas
  • Kansas
  • Nebraska
  • South Dakota
  • Idaho
  • Oregon

Has anyone come across this issue before? I could see this as a potential issue for the hr2day program for 24-hour averages if they are misaligned from observations by one hour.
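For anyone who wants to reproduce this kind of check, here is a minimal sketch of the comparison. It assumes each site has already been assigned an offset derived from tz.csv by some lookup step that is not shown here. The column names `site_id` and `tz_offset` and the sample rows are hypothetical; only `GMT_offset` is a name taken from the AMET site metadata.

```python
import pandas as pd

def find_offset_mismatches(sites: pd.DataFrame) -> pd.DataFrame:
    """Return rows where the AMET offset disagrees with the tz.csv-derived one."""
    # Difference in hours between the two sources (0 means they agree)
    diff = sites["GMT_offset"] - sites["tz_offset"]
    return sites.assign(offset_diff=diff).loc[diff != 0]

# Made-up sample rows for illustration only (not real AQS sites)
sample = pd.DataFrame({
    "site_id": ["A", "B", "C"],
    "GMT_offset": [-5, -6, -7],   # hypothetical values from the AMET file
    "tz_offset": [-5, -5, -7],    # hypothetical values derived from tz.csv
})
mismatches = find_offset_mismatches(sample)
print(mismatches)
```

A nonzero `offset_diff` of plus or minus one hour would correspond to a site placed in a neighboring timezone.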

Hello Mike,

At first glance, this seems to be an issue with the GMT_offset entries for these AQS sites. To my knowledge, this information is obtained straight from the AQS pre-packaged station metadata files, but to help us track things down more precisely, it would help if you could provide a list of the AQS IDs for which you see these inconsistencies. Also, to clarify: in your map, did you plot all the stations contained in the file AQS_full_site_list.csv downloaded from the link you included in your post, or did you subset the stations to only those that had observations for certain pollutants in a specific year or set of years?

To confirm Christian’s understanding, the GMT_offset does come directly from the site file provided on the AQS website (https://aqs.epa.gov/aqsweb/airdata/download_files.html#Meta). I pass the GMT_offset value in that file straight through to the AMET site metadata file. Now, it is certainly conceivable that the GMT offset may be incorrect for the lat/lon of a particular site.

As Christian suggested, the quickest and easiest way to figure out what’s going on is to provide a list of the sites that are mismatched between the site metadata file and the tz file. BTW, this is a very cool map and it’s good to see that most of the sites do fall into the correct timezone. I don’t know if this is something we’ve ever looked at visually.

Wyat

Thanks Christian and Wyat,

See the attached csv for a list of the sites where I found the inconsistencies.

aqs_tzcompare.20251006.csv (16.9 KB)

This is the subset of sites from the CMAS Data Clearinghouse file which are active (i.e., no end date). I did not perform any filtering by pollutants.

This is the python snippet I used to filter the sites:

import pandas as pd

# Table of AQS sites from CMAS Data Clearinghouse
monitor_list = pd.read_csv("./AQS_full_site_list.csv")

# Keep only active sites, i.e., those with no end date (NaT after parsing)
monitor_list_active = monitor_list.loc[
    pd.to_datetime(monitor_list["end_date"]).isna()
]

Full List of Inconsistent Sites

inconsistent_sites = [
    660101703,
    471630102,
    470111002,
    471050102,
    470110102,
    471450105,
    180855502,
    470930025,
    471450103,
    471050101,
    470890101,
    471510101,
    470650032,
    470930020,
    380530108,
    470110103,
    120179000,
    470730101,
    471730104,
    471210103,
    471630105,
    471551101,
    471550102,
    471630106,
    471631006,
    470890001,
    471070101,
    470259991,
    470258001,
    471050103,
    471050104,
    471050105,
    471050106,
    471050107,
    471630101,
    471450104,
    181230009,
    181230007,
    181230006,
    181759000,
    470190101,
    470018001,
    800060005,
    800060011,
    481419018,
    481410045,
    481410051,
    481410056,
    471410004,
    470855503,
    470430010,
    470215501,
    470370024,
    460330003,
    460330132,
    460710001,
    461030013,
    461030016,
    461030020,
    410610010,
    410630002,
    410390070,
    410050010,
    380590003,
    311570002,
    310699000,
    260710001,
    260430002,
    260430903,
    212210013,
    212218001,
    201810001,
    201810003,
    181250005,
    160499991,
    160550006,
    160550010,
    160550013,
    160550014,
    160570005,
    160690006,
    160690010,
    160690012,
    160690013,
    160690014,
    160170003,
    160170004,
    160170005,
    160210001,
    160490002,
    160090010,
    160090011,
    120330022,
    120051004,
    "CC0110001",  # non-numeric IDs are quoted so the list is valid Python
    "CC0110002",
]

Old timezones?

Additionally, I was looking around Wikipedia and found this map of US timezones from 1921. In the map above, Tennessee is cut in half between ET/CT, with some of the inconsistent sites on the western side of the state. But in this old map, Tennessee is entirely in Central time. So maybe this is related to changes in timezones over the years?

Thanks, Mike. This is super helpful. I spot-checked the time zone offset for several sites between the list you provided and the offset in the AQS-provided site file, and it’s clear that the offset values are passing correctly between the AQS file and my site metadata file. So, the issue is that the AQS-provided offset is incorrect. Why exactly that is, is anyone’s guess.

The question now is how best to deal with the issue. I’ll need to coordinate with Christian to come up with a plan. If we are confident that the CMAQTZ is correct, maybe I need to use that to calculate the offset and not just pass the value through from the AQS file. Whatever solution we come up with, it will take a little time to implement. In the meantime, you could simply replace the offset in the site metadata file for those sites that clearly have an incorrect value with the correct value from the CMAQTZ.
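As a rough illustration of that interim workaround, here is a minimal sketch of overriding the offset for a handful of sites. The column name `site_id`, the site labels, and the corrected values are all hypothetical placeholders; the real site metadata file may name its ID column differently, and the corrected offsets would have to come from the CMAQTZ.

```python
import pandas as pd

def apply_offset_corrections(site_meta: pd.DataFrame, corrections: dict) -> pd.DataFrame:
    """Return a copy of site_meta with GMT_offset replaced for the listed sites."""
    patched = site_meta.copy()
    # Sites missing from `corrections` map to NaN and keep their original offset
    new_vals = patched["site_id"].map(corrections)
    patched["GMT_offset"] = new_vals.fillna(patched["GMT_offset"]).astype(int)
    return patched

# Made-up example: site "X1" is assumed to carry a wrong offset
site_meta = pd.DataFrame({
    "site_id": ["X1", "X2"],
    "GMT_offset": [-6, -6],
})
corrections = {"X1": -5}  # hypothetical corrected value
patched = apply_offset_corrections(site_meta, corrections)
print(patched)
```

Working on a copy leaves the original table untouched, so the patched and unpatched offsets can still be compared afterwards.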

Christian, does the offset value in the site metadata file affect HR2DAY? I know it is used with site compare but I can’t recall if it’s also used with HR2DAY.

Wyat

Hi Wyat, no, hr2day does not deal with any observational data at all, so any errors in the site metadata file don’t have any impacts on the gridded files generated by hr2day. The place this clearly does play a role is the obs-model data matching performed by sitecmp and sitecmp_dailyo3, since AQS data are assumed to be in LST and the GMT_Offset information from the site metadata file is used to shift model output from GMT to LST for pairing.
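To make the effect of a wrong offset concrete, here is a small sketch of the GMT-to-LST shift. It is not the actual sitecmp code; it simply assumes the offset is a signed number of hours such that LST = GMT + offset (negative west of Greenwich). The function name and the example date are made up for illustration.

```python
from datetime import datetime, timedelta

def gmt_to_lst(gmt_time: datetime, gmt_offset_hours: int) -> datetime:
    """Shift a GMT timestamp to local standard time using a signed hour offset."""
    return gmt_time + timedelta(hours=gmt_offset_hours)

# One model output hour, stamped in GMT (arbitrary example date)
model_hour_gmt = datetime(2020, 7, 1, 18)

lst_central = gmt_to_lst(model_hour_gmt, -6)  # offset for Central standard time
lst_eastern = gmt_to_lst(model_hour_gmt, -5)  # offset for Eastern standard time

# Difference between the two local hours assigned to the same model hour
shift = lst_eastern - lst_central
print(lst_central, lst_eastern, shift)
```

Under this assumption, a site whose offset is off by one hour would have every model value paired with the observation from the neighboring hour.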