Use the ROCm/HIP device to accelerate certain DPLASMA kernels #57

Open
wants to merge 48 commits into
base: master

abouteiller
Contributor

This PR adds ROCm-enabled kernels to the GEMM, POTRF, and memory-aware GEMM operations.

@bosilca
Contributor

bosilca commented Mar 31, 2023

As discussed on 03/31/23, we need to rebase and check the result. This will be tested next week on Frontier, so it needs to be ready.

Signed-off-by: Aurelien Bouteiller <[email protected]>

Conflicts:
	src/CMakeLists.txt
Use proper error checks instead of asserts
@abouteiller abouteiller marked this pull request as ready for review October 19, 2023 18:21
@abouteiller abouteiller requested a review from a team as a code owner October 19, 2023 18:21
@abouteiller abouteiller self-assigned this Oct 19, 2023
@bosilca
Contributor

bosilca commented Oct 19, 2023

Please squash to fewer commits.

@abouteiller
Contributor Author

This is in a ready-to-merge state, apart from the request to squash to fewer commits.

@abouteiller abouteiller enabled auto-merge (squash) May 21, 2024 03:57
@abouteiller
Contributor Author

I decided to go for a squash merge; this is ready for final review @bosilca @devreal.

The ctest error is #115, which is preexisting and unrelated to HIP.

Contributor

@bosilca bosilca left a comment

Why are only certain kernels HIP-enabled, instead of having all CUDA-enabled kernels also be HIP-enabled?

@@ -109,8 +109,7 @@ extern void *dplasma_pcomm;
#define dplasma_error(__func, __msg) do { fprintf(stderr, "%s: %s\n", (__func), (__msg)); } while(0)
#endif /* defined(DPLASMA_DEBUG) */

#if defined(DPLASMA_HAVE_CUDA)
#include "dplasmaaux_cuda.h"
Contributor

Why are these headers not protected?

Contributor Author

The protection is self-contained in the header itself.

Contributor

We had proper protection in place to avoid including unnecessary files. What justifies the need for this change?
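
For readers following this thread, the two guarding styles under discussion can be sketched in plain C. This is a minimal illustration, not DPLASMA code: the `EXAMPLE_*` macros, the header names mentioned in comments, and the helper functions are all hypothetical, and the header bodies are inlined so the snippet compiles as a single file.

```c
/* Sketch only: every EXAMPLE_* name below is hypothetical. */

/* Pretend the build system detected HIP but not CUDA. */
#define EXAMPLE_HAVE_HIP 1

/* Style A: the including file guards the #include, so the header is
 * never even opened when the backend is absent. */
#if defined(EXAMPLE_HAVE_CUDA)
/* #include "example_aux_cuda.h" would go here */
#endif

/* Style B: the header is included unconditionally and guards its own
 * contents. The two blocks below stand in for the bodies of
 * hypothetical "example_aux_cuda.h" and "example_aux_hip.h". */
#ifndef EXAMPLE_AUX_CUDA_H
#define EXAMPLE_AUX_CUDA_H
#if defined(EXAMPLE_HAVE_CUDA)
#define EXAMPLE_CUDA_API_VISIBLE 1
#endif
#endif /* EXAMPLE_AUX_CUDA_H */

#ifndef EXAMPLE_AUX_HIP_H
#define EXAMPLE_AUX_HIP_H
#if defined(EXAMPLE_HAVE_HIP)
#define EXAMPLE_HIP_API_VISIBLE 1
#endif
#endif /* EXAMPLE_AUX_HIP_H */

/* Report whether each backend's declarations ended up visible. */
static int example_cuda_api_visible(void)
{
#if defined(EXAMPLE_CUDA_API_VISIBLE)
    return 1;
#else
    return 0;
#endif
}

static int example_hip_api_visible(void)
{
#if defined(EXAMPLE_HIP_API_VISIBLE)
    return 1;
#else
    return 0;
#endif
}
```

With only `EXAMPLE_HAVE_HIP` defined, `example_cuda_api_visible()` returns 0 and `example_hip_api_visible()` returns 1 under either style. The difference is that style B still opens and preprocesses the header in every build, whereas style A skips the file entirely.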

stage_in=stage_in_lapack
stage_out=stage_out_lapack]
stage_in=dplasma_cuda_lapack_stage_in
stage_out=dplasma_cuda_lapack_stage_out]
Contributor

Why is there no HIP chore for the normal GEMM (i.e., not the SUMMA version)?

Contributor Author

@abouteiller abouteiller Jun 14, 2024

Not all kernels are implemented (tracking issue #98); those that are implemented have all of their particular variants implemented.
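
One way to picture the situation described above is a per-task table of device-specific bodies ("chores"), where a task class without a HIP entry simply runs on another device. The sketch below is a simplified model under stated assumptions: the `ex_*` types and functions are hypothetical, and the fall-back-to-CPU policy is an assumption of the example, not a description of how PaRSEC actually selects a chore.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device kinds; not the runtime's real enumeration. */
typedef enum { EX_DEV_CPU, EX_DEV_CUDA, EX_DEV_HIP } ex_device_t;

typedef int (*ex_body_fn)(void);

/* One chore = one device-specific implementation of a task body. */
typedef struct {
    ex_device_t device;
    ex_body_fn  body;
} ex_chore_t;

/* Stand-in bodies that just report which device they target. */
static int ex_gemm_cpu(void)  { return EX_DEV_CPU; }
static int ex_gemm_cuda(void) { return EX_DEV_CUDA; }
static int ex_gemm_hip(void)  { return EX_DEV_HIP; }

/* Return the first chore for `wanted`; if there is none, fall back to
 * the first CPU chore (assumed policy); return NULL if neither exists. */
static const ex_chore_t *ex_select_chore(const ex_chore_t *chores,
                                         size_t n, ex_device_t wanted)
{
    const ex_chore_t *fallback = NULL;
    assert(chores != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (chores[i].device == wanted)
            return &chores[i];
        if (chores[i].device == EX_DEV_CPU && fallback == NULL)
            fallback = &chores[i];
    }
    return fallback;
}

/* A kernel that has been ported to HIP ... */
static const ex_chore_t ex_ported[] = {
    { EX_DEV_HIP,  ex_gemm_hip  },
    { EX_DEV_CUDA, ex_gemm_cuda },
    { EX_DEV_CPU,  ex_gemm_cpu  },
};

/* ... and one that only has CUDA and CPU bodies so far. */
static const ex_chore_t ex_unported[] = {
    { EX_DEV_CUDA, ex_gemm_cuda },
    { EX_DEV_CPU,  ex_gemm_cpu  },
};
```

In this model, `ex_select_chore(ex_ported, 3, EX_DEV_HIP)` yields the HIP body, while the same request against `ex_unported` yields the CPU body, so an unported kernel still runs on a HIP machine but without acceleration.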

2 participants