Create CEM file for 2022?


Is it possible to convert the first quarter of CEM data from the EPA air data mart into the proper FF10 format? (It’s 2022, and I would like to run some January-February 2022 PTEGU files.) If so, is there documentation on how to create the HOUR_UNIT* files from the air data mart data? I emailed the EPA contact line, and they said they were not sure how to convert the air data mart data into the SMOKE input format.

Thank you!

OK, so I figured it out. The EPA uploads CEM files quarterly (according to the help desk), and you can access them via FTP:

I made a script to batch download the zip files, unzip them, and save the file names.
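For anyone who wants a starting point, here is a rough sketch of the unzip-and-record step only. It is not my actual script: the function names and directory arguments are made up for illustration, and the download part is left out, so it assumes the zip files are already sitting in a local folder.

```python
import zipfile
from pathlib import Path


def unzip_all(zip_dir, out_dir):
    """Extract every .zip in zip_dir into out_dir and return the member names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    # sorted() keeps the order reproducible from run to run
    for zip_path in sorted(Path(zip_dir).glob("*.zip")):
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(out_dir)
            extracted.extend(zf.namelist())
    return extracted


def save_manifest(names, manifest_path):
    """Write one extracted file name per line so later steps can loop over them."""
    Path(manifest_path).write_text("\n".join(names) + "\n")
```

Usage would be something like `names = unzip_all("zips", "csv")` followed by `save_manifest(names, "cem_files.txt")`, where the folder and file names are placeholders.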

Then I formatted the HOUR_UNIT files to match the layout in this section of the user guide: 8.2.7. PTHOUR: Point source hour-specific emissions
That just means rearranging the columns of the files from the FTP, formatting the dates, and concatenating all the files within a given month. I used the PTINV file in the SMOKE/input/ptegu directory as a guide to make sure the IDs matched up. It’s kind of tricky to work this out from the user guide, but it ends up being straightforward, I think.
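The three steps (reorder columns, reformat dates, group by month) can be sketched roughly like this. Treat it as an illustration under assumptions, not the real layout: only `ORISPL_CODE` and `UNITID` are names I actually mention here, while `OP_DATE`, `OP_HOUR`, `NOX_MASS`, `SO2_MASS`, the output column order, and both date formats are placeholders that need to be checked against the user guide section and an existing HOUR_UNIT* file.

```python
import csv
from collections import defaultdict
from datetime import datetime

# Hypothetical source column names; verify against the real CSV header.
SRC_DATE = "OP_DATE"
# Placeholder output order; the true order comes from the PTHOUR documentation.
OUT_COLUMNS = ["ORISPL_CODE", "UNITID", "DATE", "OP_HOUR", "NOX_MASS", "SO2_MASS"]


def reformat_row(row, src_fmt="%m-%d-%Y", out_fmt="%y%m%d"):
    """Return ((year, month), values) with the date rewritten and columns reordered."""
    day = datetime.strptime(row[SRC_DATE], src_fmt)
    out = dict(row)
    out["DATE"] = day.strftime(out_fmt)
    # Missing columns become empty strings rather than raising
    return (day.year, day.month), [out.get(col, "") for col in OUT_COLUMNS]


def group_by_month(rows):
    """Concatenate rows from any number of source files into per-month lists."""
    months = defaultdict(list)
    for row in rows:
        key, values = reformat_row(row)
        months[key].append(values)
    return months


def write_month(path, values):
    """Write one month's rows as comma-separated text."""
    with open(path, "w", newline="") as fh:
        csv.writer(fh).writerows(values)
```

The rows can come from `csv.DictReader` over each unzipped file; chaining the readers together before calling `group_by_month` is what does the concatenation.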

One downside is that I could not find any metadata documenting how the FTP files map onto the SMOKE platform files, so I can’t fully confirm how they are connected. For example, the EPA files use ‘ORISPL_CODE’ to match ‘ORISID’, ‘UNITID’ to match ‘BLRID’, and so on (at least, that is what I reverse-engineered by comparing unique values across the PTINV, an HOUR_UNIT* file, and these FTP files). So I’m working from my own matching and assumptions, but it seems to work. I am also not sure whether I need to clean the data further; my next step is to compare my 2021 CEM files to my 2022 CEM files.
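A simple way to sanity-check that kind of ID matching is to compare the sets of (plant, unit) pairs on each side. This is only a sketch: `id_pairs` and `compare_ids` are made-up helper names, and the inventory column headers are assumed to be literally `ORISID` and `BLRID`, which may not be how they appear in a given PTINV file.

```python
def id_pairs(rows, plant_col, unit_col):
    """Collect the unique (plant, unit) ID pairs from a list of dict rows."""
    return {(row[plant_col].strip(), row[unit_col].strip()) for row in rows}


def compare_ids(inventory_rows, cem_rows):
    """Report which ID pairs appear in both sources and which appear in only one."""
    inv = id_pairs(inventory_rows, "ORISID", "BLRID")      # assumed inventory headers
    cem = id_pairs(cem_rows, "ORISPL_CODE", "UNITID")      # headers in the EPA files
    return {
        "matched": inv & cem,
        "cem_only": cem - inv,
        "inventory_only": inv - cem,
    }
```

A large `cem_only` set would suggest the assumed mapping is wrong or that IDs need cleaning (padding, whitespace, leading zeros) before they line up.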
