-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HIP and MPI+HIP updates #361
base: master
Are you sure you want to change the base?
Conversation
…MPI+HIP codes. Unify CUDA and HIP code paths (CUDA / HIP => GPU, CUDA_MPIV / HIP_MPIV => MPIV_GPU, etc.).
As noted above, all tests (full test suite) are passing for the CUDA and MPI+CUDA (1 GPU) versions. However, some tests are failing for the HIP and MPI+HIP versions. See the logs below from tests on the MI210s on the AMD AAC. Interestingly, the test failures are slightly different between the HIP and MPI+HIP versions. This is a bit difficult to debug at least when comparing against the working HIP / MPI+HIP versions from the 23.08b release as there are also a number of test failures there. It may be better to pick a commit before the f-function optimizations and run tests there for comparison. Test configuration on the AMD AAC:
HIP test summary and diffs: MPI+HIP test summary and diffs: |
…preprocessor definitions for performance and storage considerations. Refactor preprocessor defintions to avoid unnecessary arithmetic.
…aths for older HIP builds.
…regarding STORE_OPERATOR). Fix segfault in debug builds of GPU code without ERI f function supported enabled but basis contains f functions. Remove unneeded DGEMM operation in CUDA codes in SCF/USCF methods. Other code clean-up.
…ggled on in CMake build.
…power functions (inlined device functions calling pow to preprocessor definitions using multiplication operations). Other code clean-up.
fbd9602
to
f937da6
Compare
…s. Add CMake option to enable LLVM-based address sanitizer (ASAN) for debugging with HIP builds.
NOTE: do not merge this until the issues below are resolved.
This MR ports the updated CUDA and MPI+CUDA codes (including recently merged f-function optimizations) to HIP and MPI+HIP versions, respectively. Also, this MR begins to unify the CUDA and HIP versions to simpily future GPU code maintainence.
Closes #344.