How to maximize the speed of my CMAQ runs on a suboptimal computer?

Hello again.

I’ve successfully run the test case in my CMAQ installation. However, the run took 4 hours, and it only worked if I was there to make sure the computer didn’t lock: whenever it did, the run basically stopped and the virtual machine froze. This happened even after I turned off automatic screen locking on both my PC and the virtual machine. I don’t really have access to another computer, so I have to work with what I have. The test case was only one day; my study will be ~5 days long (plus all the necessary preprocessing), and I would really like to not spend 20 hours in front of my computer moving the cursor from time to time so it doesn’t lock and freeze again.

Therefore, I have a few questions about how to optimize my runs, so the time I spend nursing my computer is minimized.

The first thing (and probably biggest contributor to the slowness) is that my $CMAQ_DATA folder, where both inputs and outputs are stored, is on an external hard drive. This is because the tutorial for benchmarking suggested 400 GB of free storage space for runs. I have enough storage for the input and output files of the test case, but nowhere near close to 400 GB, hence the hard drive. My first question is: how important is all that extra space? Is it really worth it to run from an external hard drive? What would run better/faster, just storing everything on my virtual machine (my laptop has an SSD), or is it better to use the external drive?

Next thing is, if the extra space really is necessary, would an external SSD work better? Even though getting another PC is outside my budget, I could buy an external SSD with enough storage, or at least borrow one. Would an SSD run faster (enough to justify buying it)?

I know either way my run times won’t be ideal considering I can only use a virtual machine, and not a very powerful one at that, but any tips that help bring down the run time would be greatly appreciated. Thanks!

While running CMAQ on a virtual machine is not a normal case for most CMAQ users, I guess I do have some suggestions that may work for you:

  1. Firstly, as you mention that you are concerned with the storage, it is true that CMAQ runs will create large files (particularly the CCTM_CONC file if you include all vertical layers). In this case, you can write a shell script to automatically compress these files as tar.gz files which could significantly reduce the storage cost.
  2. Reading data from an external hard drive could be an option if the I/O works on your machine. However, unexpected issues can arise in such an unusual setup.
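The compression idea in point 1 could be sketched roughly as follows. This is only a minimal sketch in Python rather than a shell script; the function name `archive_outputs` and the file pattern `CCTM_CONC*` are assumptions for illustration, so adjust them to match the file names your run actually produces. It should only be run on files from days that have finished, not on a file CMAQ is still writing.

```python
import tarfile
from pathlib import Path


def archive_outputs(output_dir, pattern="CCTM_CONC*", remove_original=False):
    """Pack each matching file into its own .tar.gz archive next to it.

    `pattern` is a hypothetical glob; change it to match your output names.
    Returns the list of archive paths that were written.
    """
    archives = []
    # sorted() materializes the file list before any archive is created
    for src in sorted(Path(output_dir).glob(pattern)):
        if src.name.endswith(".tar.gz"):
            continue  # skip archives left over from an earlier pass
        dst = src.with_name(src.name + ".tar.gz")
        with tarfile.open(dst, "w:gz") as tar:
            tar.add(src, arcname=src.name)
        if remove_original:
            src.unlink()  # free the space only after the archive is closed
        archives.append(dst)
    return archives
```

Calling `archive_outputs("/path/to/output", remove_original=True)` after each simulated day would keep only the compressed copies. How much space this saves depends on the data, so it is worth comparing sizes on the one-day test case first.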

To solve terminal time-outs, I recommend using tmux. Here is an article about the advantages of using it: "A Quick and Easy Guide to tmux".

Hi HectorVG,

To prevent your PC and your virtual machine from locking up due to inactivity, you can try this trick (hopefully it works for you too): play a song or a piece of music in Windows Media Player on an infinite loop.

It sounds like you are running CMAQ on your workstation (PC). I wonder how much memory (RAM) and how many cores that system has. Let's say there are 8 cores on the system: running with 8 cores will definitely be faster than running with 4. However, the run will slow down if you use more processes than the 8 available cores. The amount of memory on the system also plays an important role. If you don't have enough memory and rely on your hard drive as virtual memory, heavy swapping of data between memory and disk will slow you down. On the storage side, you need enough space to store your five-day simulation. An external SSD could be an option. Yes, an SSD is faster than a traditional spinning hard drive, but it is a little more expensive. Based on your budget, you need to do some homework to see whether an SSD is justifiable. In summary, I would upgrade memory as the first priority.
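One way to do the storage homework is to measure how much output the one-day test case wrote and compare that with the free space on each candidate drive. The Python sketch below is only an illustration under that assumption; the function names and the `reserve_gb` safety margin are made up for this example and are not part of CMAQ.

```python
import shutil


def free_space_gb(path):
    """Free space, in GB, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1e9


def simulation_days_that_fit(free_gb, gb_per_day, reserve_gb=5.0):
    """Whole simulation days whose output fits in `free_gb`.

    `gb_per_day` is what you measured for one simulated day;
    `reserve_gb` is an arbitrary margin left unused on the drive.
    """
    if gb_per_day <= 0:
        raise ValueError("gb_per_day must be positive")
    usable = free_gb - reserve_gb
    return max(0, int(usable // gb_per_day))
```

For example, `simulation_days_that_fit(free_space_gb("/path/to/CMAQ_DATA"), gb_per_day)` with your own measured `gb_per_day` tells you whether five days fit on the internal SSD or whether the external drive is really needed.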

These days pretty much any university or research institution has access to a moderately sized high-performance computing system. If you have additional questions, please let me know (wong.david-c@epa.gov).

Cheers,
David

Thanks for the tips! I might have to implement them if I can’t get the run time down while running from an external drive.

Thank you for the help again! I was running CMAQ with tmux, but the virtual machine would still lock up. I’m guessing it has more to do with the fact that I’m running a virtual machine than with the Ubuntu OS. Can’t be sure though, as I am a little out of my element here haha.

Thanks for all the helpful information, this will definitely be useful! My next run I’ll try playing something on media player to see if it stops my PC from locking itself.

My PC has 4 cores, and I give them all to the virtual machine when running, to which VM VirtualBox raises no complaints, so I guess it’s okay (I adjusted the run script to npcols = 2, nprows = 2).
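For reference, the 2 x 2 choice for 4 cores can be generalized with a small helper that splits a core count into a column-by-row grid that is as close to square as possible. This is just a sketch in Python; `processor_grid` is a hypothetical helper, not something shipped with CMAQ, and whether a near-square split is the fastest layout for a given domain is an assumption worth testing.

```python
import math


def processor_grid(n_cores):
    """Return (npcol, nprow) with npcol * nprow == n_cores, as square as possible."""
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    # start at the integer square root and walk down to the nearest divisor
    nprow = math.isqrt(n_cores)
    while n_cores % nprow != 0:
        nprow -= 1
    return n_cores // nprow, nprow
```

With 4 cores this gives `(2, 2)`, matching the setting above. Note that Python's `os.cpu_count()` reports logical CPUs (and may return `None`), so the number of cores actually assigned to the virtual machine is the safer input.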

I thought about asking my university for a more powerful (and actually Linux) PC, but considering how long it took me to build CMAQ on my PC, and the fact that my semester is close to over, I don’t think I’m going to have enough time to go through that process again.

I’ll do a bit more research on how helpful an SSD would be. Thanks again for all your answers.