I am not certain, but I believe 68 GB should be enough memory.
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 7 with PID 0 on node ip-10-21-154-88 exited on
signal 9 (Killed).
Look at the “ancillary” log file from processor 7, which should have a name beginning with CTM_LOG_007. Is there an error message?
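If it helps, here is a rough Python sketch for scanning that log for likely error lines. It assumes the logs are in the current run directory and are named CTM_LOG_ followed by the zero-padded, three-digit rank. The function name `find_error_lines` and the keyword list are only illustrative, not part of the model or its tools, so adjust them to your setup.

```python
import glob
import os
import re


def find_error_lines(log_dir=".", rank=7, pattern=r"error|abort|killed"):
    """Return (file name, line number, text) for each matching line in the rank's logs."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    # Assumption: log names start with CTM_LOG_ plus the three-digit rank, e.g. CTM_LOG_007
    for path in sorted(glob.glob(os.path.join(log_dir, f"CTM_LOG_{rank:03d}*"))):
        with open(path, errors="replace") as fh:
            for number, line in enumerate(fh, start=1):
                if regex.search(line):
                    hits.append((os.path.basename(path), number, line.rstrip()))
    return hits


if __name__ == "__main__":
    for name, number, text in find_error_lines():
        print(f"{name}:{number}: {text}")
```

If nothing turns up, check whether the log simply stops partway through the run. A process killed with signal 9 cannot write a final message, so a log that ends abruptly is itself a clue.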
Please see this post (and the Debug Tutorial) for ideas on how to debug this issue.