I used the shapefiles from the 2020 platform (https://gaftp.epa.gov/air/emismod/2020/spatial_surrogates/) to generate the 2020 spatial surrogates for the 12 km domain (us12k_516x444). The results differ from the benchmark (CONUS12_2020NEI_surrogates_25may2023.zip).
The surrogates of census (i.e. 100, 110), ports (801), mine (860), and OSM (901, 902) show large differences in non-CONUS areas (AK, HI, and Puerto Rico). The surrogates of NTAD and NLCD show differences and have more rows than the benchmark. The other surrogates (HPMS, FAO, airports, and Oilgas) passed the comparison.
The script configurations are the same as the benchmark's, and I am using Spatial Allocator 4.4 with older library versions (proj-4.8.0, gdal-1.11.0, and geos-3.4.2).
First, we wanted to confirm the version of the surrogates you are comparing to. The surrogates posted on the FTP site were updated in early June, so if you used an older version than that, there would be differences for NLCD-related surrogates. Have you gapfilled the surrogates you generated before comparing them to the EPA surrogates? Have you run the QA tool that is available as a Java program to compare the surrogates? If not, how are you comparing them?
We are unclear as to how you are comparing the surrogates for AK, HI, and Puerto Rico. Only the southernmost tip of AK is in the surrogate grid, and HI and Puerto Rico are not in that grid.
Finally, the EPA-generated surrogates were generated with the SurrogateToolsDB system released by the CMAS Center, so they could differ somewhat from Spatial Allocator-generated surrogates, although the two have been shown to be quite close in most cases.
Thank you for your reply!
I updated the surrogates. The NLCD surrogates are the same now, while the NTAD surrogates still have more rows than those in CONUS12_2020NEI_25may2023. I compared both the NOFILL and FILL files, and both show differences in the surrogates I mentioned above. I used the tool “bin/64bits/diffsurr.exe” for the comparison.
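For readers who want to see which rows are extra, a row-level comparison can be sketched in a few lines of Python. This is only an illustration and does not reproduce what diffsurr does. It assumes each data row holds surrogate code, county FIPS, column, row, and fraction separated by whitespace, with `#` header lines and `!` trailing comments ignored. The function names and the sample rows are made up for the example.

```python
def read_surrogates(lines):
    """Parse rows into {(code, fips, col, row): fraction}.

    Assumed layout: code, FIPS, column, row, fraction (whitespace separated).
    Lines starting with '#' and anything after '!' are skipped.
    """
    table = {}
    for line in lines:
        body = line.split("!", 1)[0].strip()
        if not body or body.startswith("#"):
            continue
        code, fips, col, row, frac = body.split()[:5]
        table[(code, fips, int(col), int(row))] = float(frac)
    return table


def compare(mine, bench, tol=1e-6):
    """Return rows only in `mine`, rows only in `bench`, and rows that differ."""
    extra = sorted(set(mine) - set(bench))
    missing = sorted(set(bench) - set(mine))
    changed = sorted(
        k for k in set(mine) & set(bench) if abs(mine[k] - bench[k]) > tol
    )
    return extra, missing, changed


# Made-up rows for illustration only; not taken from any real surrogate file.
generated = """#GRID (header line, ignored)
261 28163 300 120 0.5000
261 28163 301 120 0.4982
261 28163 341 259 0.0018
""".splitlines()
benchmark = """#GRID (header line, ignored)
261 28163 300 120 0.5000
261 28163 301 120 0.5000
""".splitlines()

extra, missing, changed = compare(
    read_surrogates(generated), read_surrogates(benchmark)
)
print("rows only in generated file:", extra)
print("rows only in benchmark:", missing)
print("rows with different fractions:", changed)
```

Keying on (code, FIPS, column, row) separates "more rows" from "same rows, different fractions", which are the two kinds of difference reported in this thread.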
The surrogates in CONUS12_2020NEI_25may2023 include some grid cells from AK and Puerto Rico, such as 02100, 02105, 02110, and 72097. (Sorry, I included HI by mistake.) But there are only a few such grid cells, so they won’t have much influence.
It is good to hear that there are some improvements. We note that NTAD is not gap-filled with the NLCD, so there could be some other processing issue, perhaps related to the use of the older Spatial Allocator rather than the Postgres version, or possibly a configuration issue. If you could upload one of the NTAD surrogates you produced and the output of diffsurr, that would help with identifying the problem.
We note that the differences in 26163 (Wayne Co. MI) are on the order of 0.001% so are very minor and not worth worrying about.
The differences in 28163 (Yazoo Co. MS) seem to be related to an issue with the Spatial Allocator, which is assigning some of the county emissions to a grid cell (341, 259) nowhere near the county.
The rail activity hasn’t changed since the 2014 platform. We looked and noticed that the 28163 issue is also in the 2014 and 2017 platform surrogates, but since it is a small fraction (0.18%) and the county still sums to 1, nobody noticed it.
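A county-sum check passes in a case like this, which is why it can go unnoticed. One way to catch it is a simple spatial-outlier check: flag any cell that lies far from the median cell of its county. The Python sketch below is a hypothetical QA step, not part of the Spatial Allocator or the Surrogate Tools. The 20-cell threshold is arbitrary, and every row except the (341, 259) cell and its 0.18% share is invented for illustration.

```python
from collections import defaultdict
from statistics import median


def find_stray_cells(rows, max_offset=20):
    """Flag cells more than `max_offset` cells from their county's median cell.

    `rows` is an iterable of (code, fips, col, row, fraction) tuples.
    The threshold is an arbitrary assumption; very large counties need a
    larger value to avoid false positives.
    """
    by_county = defaultdict(list)
    for code, fips, col, row, frac in rows:
        by_county[(code, fips)].append((col, row, frac))

    stray = []
    for key, cells in by_county.items():
        mid_col = median(c for c, _, _ in cells)
        mid_row = median(r for _, r, _ in cells)
        for col, row, frac in cells:
            if max(abs(col - mid_col), abs(row - mid_row)) > max_offset:
                stray.append((*key, col, row, frac))
    return stray


# Illustrative rows: only cell (341, 259) and its 0.0018 share come from the
# discussion above; the other cells and fractions are made up.
rows = [
    ("261", "28163", 300, 120, 0.5000),
    ("261", "28163", 301, 120, 0.3000),
    ("261", "28163", 300, 121, 0.1982),
    ("261", "28163", 341, 259, 0.0018),
]

total = sum(frac for *_, frac in rows)
print("county sum:", round(total, 6))  # the sum check alone looks fine
print("stray cells:", find_stray_cells(rows))
```

The median is used rather than the mean so that one distant cell does not pull the reference point toward itself.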
We note that the differences are very small and seem to be related to the Postgres surrogate tool vs. the Spatial Allocator.
It seems that the Postgres tool does a better job for surrogate 261, although perhaps there is a flaw in the shapefile being used to generate the surrogate, with a small segment mislabeled as being in the wrong county. It doesn’t seem concerning, but you can delete the misfiled grid cell entry and then use the tools to renormalize the surrogates so they still sum to 1.
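The arithmetic behind that fix is simple: drop the stray row, then divide the county's remaining fractions by their new sum. The Python sketch below only illustrates that calculation and is not a replacement for the normalization tool. The function name `drop_and_renormalize` and the sample table are hypothetical, and only the (341, 259) cell with its 0.0018 share comes from this thread.

```python
def drop_and_renormalize(table, code, fips, cell):
    """Remove one (col, row) cell for a surrogate/county and rescale the rest.

    `table` maps (code, fips, col, row) -> fraction. Rows belonging to other
    surrogates or counties are returned unchanged.
    """
    col, row = cell
    cleaned = {k: v for k, v in table.items() if k != (code, fips, col, row)}
    total = sum(v for k, v in cleaned.items() if k[:2] == (code, fips))
    if total <= 0:
        raise ValueError("no remaining fraction to renormalize for this county")
    return {
        k: (v / total if k[:2] == (code, fips) else v)
        for k, v in cleaned.items()
    }


# Illustrative table; all values except the stray 0.0018 cell are made up.
table = {
    ("261", "28163", 300, 120): 0.5000,
    ("261", "28163", 301, 120): 0.3000,
    ("261", "28163", 300, 121): 0.1982,
    ("261", "28163", 341, 259): 0.0018,
    ("261", "26163", 400, 200): 1.0000,
}

fixed = drop_and_renormalize(table, "261", "28163", (341, 259))
county_sum = sum(v for k, v in fixed.items() if k[:2] == ("261", "28163"))
print("rows kept:", len(fixed))
print("county 28163 sums to 1:", abs(county_sum - 1.0) < 1e-12)
```

Because the removed share is only 0.18%, each remaining fraction in the county changes by a factor of about 1/0.9982, which is consistent with the note above that the issue is not concerning.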