GPU deskew #146

Merged
merged 15 commits into from
Jul 11, 2024

@talonchandler talonchandler commented Jul 2, 2024

This PR changes the existing scipy.ndimage deskew to a GPU-accelerated monai deskew.

--
Benchmarks:
Deskewing a 280x600x1372 argolight target takes:

(Deskewing function calls, including CPU->GPU->CPU transfers)
(Most relevant to @ieivanov's live applications)
Before: 61 s
After: 5.1 s

(Deskewing CLI calls, including IO, imports, and other overhead)
Before: 82 s
After: 49 s
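
For a quick read of the numbers above, the implied speedups and voxel throughput can be computed directly. This is only arithmetic on the timings quoted in this comment; the variable names are illustrative and not part of the codebase.

```python
# Timings quoted above for the 280x600x1372 argolight volume (seconds).
shape = (280, 600, 1372)
function_before, function_after = 61.0, 5.1  # deskew function calls
cli_before, cli_after = 82.0, 49.0           # full CLI calls

n_voxels = shape[0] * shape[1] * shape[2]

function_speedup = function_before / function_after
cli_speedup = cli_before / cli_after

# Throughput of the function call alone, in millions of voxels per second.
mvox_per_s_before = n_voxels / function_before / 1e6
mvox_per_s_after = n_voxels / function_after / 1e6

print(f"function-call speedup: {function_speedup:.1f}x")
print(f"CLI speedup: {cli_speedup:.1f}x")
print(f"throughput: {mvox_per_s_before:.1f} -> {mvox_per_s_after:.1f} Mvoxel/s")
```

That works out to roughly a 12x speedup for the function call and about 1.7x for the full CLI call.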

--
The new and old deskews do not match exactly for two reasons:

(1) MONAI's GPU deskew only supports nearest-neighbor and bilinear interpolation modes, not the linear spline interpolation that we used previously. I chose to use bilinear interpolation, and on close inspection (screenshots don't show any difference) the only differences I observe are:

  • a <1 pixel shift
  • <1% interpolation differences
  • minor changes in auto-contrast: interpolation differences can change the min and max, so the auto-contrast limits can change. With the contrast limits locked, the two results are visually indistinguishable.

@ieivanov I would appreciate your help in scrutinizing a few initial deskews as we onboard this change.

(2) MONAI's GPU deskew only supports filling empty values with zeros, not the background-estimated fill that we used previously. This difference is not important because we almost always clip off the overhang, leaving no values to fill.
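
To make the two effects above concrete, here is a toy 1D sketch in plain Python. It is not the MONAI or scipy code path; `resample_linear` is a hypothetical helper written for illustration. It shows that a sub-pixel linear resampling can lower the peak value (which is why auto-contrast limits move) and that the fill value only touches samples that fall outside the input.

```python
import math


def resample_linear(signal, shift, fill=0.0):
    """Sample `signal` at positions i + shift using linear interpolation.

    Positions that fall outside the input are set to `fill`.
    """
    n = len(signal)
    out = []
    for i in range(n):
        x = i + shift
        if x < 0 or x > n - 1:
            out.append(float(fill))  # nothing to interpolate from
            continue
        x0 = math.floor(x)
        t = x - x0
        # Clamp the right neighbor; this only matters when x lands
        # exactly on the last sample, where t == 0.
        x1 = min(x0 + 1, n - 1)
        out.append((1 - t) * signal[x0] + t * signal[x1])
    return out


signal = [0.0, 0.0, 10.0, 0.0, 0.0]

zero_filled = resample_linear(signal, 0.5)          # zero fill
bg_filled = resample_linear(signal, 0.5, fill=2.0)  # background-style fill

print(max(signal), max(zero_filled))
print(zero_filled)
print(bg_filled)
```

The half-pixel shift spreads the single bright sample over two neighbors, so the maximum drops from 10 to 5, and the two fill choices differ only in the last, out-of-range sample.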

You can take a closer look at my argolight tests here:
/hpc/projects/comp.micro/mantis/2024_04_23_mantis_alignment/2-deskew/test-monai

--

Other notes:

  • if a GPU is unavailable, the new deskew will still succeed (slowly)
  • I expect that @ieivanov's scatter-gather strategy for splitting a deskew into GPU-memory-sized chunks will still succeed.
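
As a rough sketch of what splitting into GPU-memory-sized chunks can look like, the snippet below computes contiguous slices along one axis from a memory budget. This is not @ieivanov's implementation: `chunk_slices`, the 100 MB budget, the float32 assumption, and the choice of the 600-long axis are all hypothetical, and it assumes chunks are cut along an axis that the deskew does not mix, so each chunk can be processed independently.

```python
def chunk_slices(axis_length, bytes_per_slab, budget_bytes):
    """Split range(axis_length) into contiguous slices that fit a memory budget.

    bytes_per_slab is the caller's estimate of the memory needed to process
    one index along the chunked axis.
    """
    if bytes_per_slab > budget_bytes:
        raise ValueError("a single slab does not fit in the budget")
    slabs_per_chunk = budget_bytes // bytes_per_slab
    return [
        slice(start, min(start + slabs_per_chunk, axis_length))
        for start in range(0, axis_length, slabs_per_chunk)
    ]


# Example with the 280x600x1372 volume, chunked along the 600-long axis.
# Assumes float32 data and a hypothetical 100 MB per-chunk budget.
shape = (280, 600, 1372)
bytes_per_slab = shape[0] * shape[2] * 4
chunks = chunk_slices(shape[1], bytes_per_slab, budget_bytes=100 * 10**6)

print(len(chunks), chunks[0], chunks[-1])
```

Each slice can then be deskewed on its own and the results concatenated along the same axis. A real budget would also need to account for the output volume and any intermediate buffers.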

@talonchandler talonchandler changed the title from "Non-working attempt at MONAI deskew" to "GPU deskew" Jul 5, 2024
@talonchandler talonchandler marked this pull request as ready for review July 5, 2024 18:13

@ieivanov ieivanov left a comment

This looks good to me, thanks @talonchandler. Let's see if we can get the tests to pass now.

@ieivanov

I think this is ready, @talonchandler @edyoshikun could you please take another look?

@ieivanov ieivanov requested a review from edyoshikun July 11, 2024 18:07
@talonchandler

LGTM! Thanks for wrangling tests, @ieivanov.

@edyoshikun edyoshikun left a comment

I tested this with the opencell data and the argolight target.
For the opencell dataset, applying the deskew to 1 timepoint and 1 channel took about 3 min. Somehow I recall this being faster. Was that the same experience you had?

Otherwise this works. The parallelization using the -j flag also worked. I was using the gpu-small-nodes.

@talonchandler

> For the opencell dataset, applying the deskew to 1 timepoint and 1 channel took about 3 min. Somehow I recall this being faster. Was that the same experience you had?

Thanks for testing @edyoshikun. I only ran before-and-after benchmarks on the Argolight target.

How large was the opencell volume you tested? Is ~3 minutes consistent with the 49 seconds I clocked for my 280x600x1372 volume?

@ieivanov

ieivanov commented Jul 11, 2024

We just discussed that the CLI call executes on CPU, as before. deskew_data has an option to use GPUs but that's not implemented in the CLI and that's OK because we speed things up with slurm and CPU multiprocessing. The option to deskew data on GPU is most useful for live data reconstruction / visualization, which currently happens through scripts.
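
For scripts that want to opt in to the GPU path, a device choice with a CPU fallback could be sketched as below. `pick_device` and its `prefer_gpu` argument are hypothetical and do not reflect the actual `deskew_data` signature; the only library call used is PyTorch's `torch.cuda.is_available()`, and the import is guarded so the sketch also runs where PyTorch is not installed.

```python
def pick_device(prefer_gpu=True):
    """Return "cuda" when requested and available, otherwise "cpu"."""
    if not prefer_gpu:
        return "cpu"
    try:
        import torch  # optional dependency in this sketch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device(prefer_gpu=False))
print(pick_device())
```

A CLI run under slurm with CPU multiprocessing would correspond to `prefer_gpu=False` here, while a live reconstruction script would leave the default and fall back to CPU when no GPU is present.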

@ieivanov ieivanov merged commit 162da51 into main Jul 11, 2024
2 checks passed