
Testing suite is too slow #285

Open
cako opened this issue Dec 29, 2021 · 14 comments
Labels: good first issue (Good for newcomers), help wanted (Extra attention is needed)

Comments

@cako
Collaborator

cako commented Dec 29, 2021

On an Intel Core i7-8565U CPU @ 1.80GHz × 8, the full test suite takes 250 s on average. I have noticed that tests routinely fail in the GitHub CI, and I believe this may be the reason.

To profile the matter, I generated a list of all tests that take more than 2 seconds to run (see below). WavefieldDecomposition and SeismicInterpolation seem to be the biggest culprits, so one suggestion would be to shrink the dimensions used in those tests.

============================== slowest durations ===============================
26.72s call     pytests/test_wavedecomposition.py::test_WavefieldDecomposition3D[par1]
25.61s call     pytests/test_seismicinterpolation.py::test_SeismicInterpolation2d[par3]
17.35s call     pytests/test_seismicinterpolation.py::test_SeismicInterpolation3d[par1]
8.76s call     pytests/test_wavedecomposition.py::test_WavefieldDecomposition3D[par0]
8.37s call     pytests/test_marchenko.py::test_Marchenko_timemulti_ana[par0]
6.01s call     pytests/test_seismicinterpolation.py::test_SeismicInterpolation2d[par2]
5.30s call     pytests/test_marchenko.py::test_Marchenko_freq[par0]
5.06s call     pytests/test_marchenko.py::test_Marchenko_time_ana[par0]
5.02s call     pytests/test_marchenko.py::test_Marchenko_time[par0]
5.01s call     pytests/test_marchenko.py::test_Marchenko_freq[par1]
4.88s setup    pytests/test_wavedecomposition.py::test_WavefieldDecomposition3D[par0]
4.88s call     pytests/test_sparsity.py::test_ISTA_FISTA_multiplerhs[par3]
4.87s call     pytests/test_sparsity.py::test_ISTA_FISTA_multiplerhs[par1]
4.74s call     pytests/test_sparsity.py::test_ISTA_FISTA_multiplerhs[par5]
4.71s call     pytests/test_sparsity.py::test_ISTA_FISTA_multiplerhs[par2]
4.64s call     pytests/test_sparsity.py::test_ISTA_FISTA_multiplerhs[par0]
4.62s call     pytests/test_sparsity.py::test_ISTA_FISTA_multiplerhs[par4]
4.16s call     pytests/test_radon.py::test_Radon3D[par2]
3.41s call     pytests/test_radon.py::test_Radon3D[par5]
3.36s call     pytests/test_radon.py::test_Radon2D[par2]
3.34s call     pytests/test_radon.py::test_Radon3D[par7]
3.17s call     pytests/test_sparsity.py::test_ISTA_FISTA[par2]
3.11s call     pytests/test_sparsity.py::test_ISTA_FISTA[par0]
3.10s call     pytests/test_sparsity.py::test_ISTA_FISTA[par1]
3.07s call     pytests/test_sparsity.py::test_ISTA_FISTA[par5]
2.99s call     pytests/test_sparsity.py::test_ISTA_FISTA[par4]
2.96s call     pytests/test_sparsity.py::test_ISTA_FISTA[par3]
2.69s call     pytests/test_radon.py::test_Radon2D[par5]
2.30s call     pytests/test_radon.py::test_Radon3D[par3]
2.21s call     pytests/test_radon.py::test_Radon2D[par7]
cako added the help wanted (Extra attention is needed) label on Dec 29, 2021
@mrava87
Collaborator

mrava87 commented Dec 31, 2021

Regarding GitHub CI, this never fails (or at least not to my knowledge). The red stars you see on the latest commits are due to my impatience in merging without waiting for the tests to finish (since I only modified documentation files). The behavior of GitHub CI seems to be that the CI stops (quite understandably)...

As for the tests: indeed, anything that still does the job of checking an operator but takes less time is welcome.

I went ahead and made some small changes to test_WavefieldDecomposition3D, mostly reducing the size of the input 3D tensor and adapting the FK parameters accordingly; it now runs in about 1 s on my laptop.

Feel free to take a look at test_SeismicInterpolation2d. Here I worry it may be harder, because L1 solvers need a large number (100s) of iterations to converge, so we may need to either relax the acceptance thresholds or, again, try to make the example smaller (if possible).
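To illustrate the point about L1 solvers, here is a minimal NumPy-only ISTA sketch (not PyLops code; the problem sizes and the regularization weight are made up for illustration): even on a tiny sparse-recovery problem, the soft-thresholded gradient iteration needs hundreds of steps to converge, which is why shrinking the problem or relaxing tolerances is the main lever for these tests.

```python
# Minimal ISTA sketch (NumPy only, not PyLops) showing the slow
# convergence of L1 solvers; all sizes and lambda are illustrative.
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 80, 200, 5                        # observations, unknowns, nonzeros
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
y = A @ x_true

lam = 0.01                                  # L1 regularization weight
L = np.linalg.norm(A, 2) ** 2               # Lipschitz constant of the gradient
x = np.zeros(n)
for _ in range(1000):                       # hundreds of iterations to converge
    g = x + (A.T @ (y - A @ x)) / L         # gradient step on the data term
    x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft-threshold

relerr = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
print(f"relative error after 1000 ISTA iterations: {relerr:.3f}")
```

Halving the iteration count or the problem size cuts the runtime proportionally, but at the cost of a looser reconstruction, so any acceptance threshold in the test has to be relaxed in step.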

mrava87 added a commit to mrava87/pylops that referenced this issue Dec 31, 2021
In order to speed up our CI pipeline,
PyLops#285 suggested reducing
the size of a number of tests, test_wavedecomposition
being one of them. The tests now run on smaller
input data.
@cako
Collaborator Author

cako commented Dec 31, 2021

Awesome, thank you! Runtime has now dropped to 180s. If you don't mind, I think we can keep the issue open so we don't forget to improve the runtime of other tests.

@mrava87
Collaborator

mrava87 commented Dec 31, 2021

Yep, agreed. As 1 and 4 are now very fast, I think that if we can improve 2 and 3, the overall timing of the test suite should become more reasonable. I am a bit unsure how much effort we should put into any test taking <5 s, as the effort may outweigh the gain. What do you think?

@cako
Collaborator Author

cako commented Dec 31, 2021

I think it is not very high on the list of priorities, but it would definitely be nice to at least deal with the tests that take a long time (>5 s). I can mark this as a good first issue, and we could maybe help people who want to get their feet wet with PyLops to tackle it? It is an excellent way of learning about the library.

cako added the good first issue (Good for newcomers) label on Dec 31, 2021
@dikwickley
Contributor

dikwickley commented Jan 23, 2023

I have worked with pytest before and it can get very slow at times. In my case I solved this by running the tests in parallel using the pytest-xdist plugin. I can work on this if you think it could help.
@cako @mrava87

@mrava87
Collaborator

mrava87 commented Jan 23, 2023

Hello @dikwickley,
this could be an interesting avenue. May I ask if you have tried this on CI before (e.g. GitHub Actions, Azure Pipelines)?

@dikwickley
Contributor

@mrava87 it was used with Jenkins. We just need to add -n <worker_count> at the end of the pytest command (of course, pytest-xdist should be installed via requirements.txt).

@mrava87
Collaborator

mrava87 commented Jan 23, 2023

Mmmh, we need to be able to run our CI with either GitHub Actions or Azure Pipelines (ideally both, as we do now). Please check whether this is allowed by those CI platforms before implementing it. If yes, we will be happy for you to go ahead and make a PR with the suggested changes :)
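For what it's worth, pytest-xdist only changes the pytest invocation itself, so it works on any CI platform that can run pytest, including GitHub Actions and Azure Pipelines. A hypothetical GitHub Actions snippet (step names below are illustrative, not taken from the PyLops workflows) might look like:

```yaml
# Illustrative GitHub Actions steps, not taken from the PyLops repo
- name: Install test dependencies
  run: pip install pytest pytest-xdist
- name: Run tests in parallel
  run: pytest pytests -n auto
  env:
    # pin BLAS/OpenMP to one thread per worker so that NumPy-heavy
    # tests do not oversubscribe the CPU when run in parallel
    OMP_NUM_THREADS: "1"
```

The same `pytest pytests -n auto` command would work unchanged in an Azure Pipelines `script` step.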

@dikwickley
Contributor

I tested out parallelization locally; here are some interesting results:

Without parallelization: [screenshot: n-none]

With parallelization (4 workers): [screenshot: n-auto(4)]

With parallelization (6 workers): [screenshot: n-6]

As you can see, there isn't much of a speed improvement. I found that the last 20% of the test cases took the most time; maybe we can try reducing test cases or removing trivial ones.

@mrava87
Collaborator

mrava87 commented Jan 24, 2023

Alright, good to see. I was a bit unsure this would bring much improvement, as we heavily use NumPy (most methods are multi-threaded), so we would end up trading the speed-up of multi-threaded functions for that of having multiple processes working in parallel.

I am not in favour of removing any tests. If you want, you can run the test suite locally like @cako did in the past (see above), identify the tests that take longer, and try to reduce the problem size for those tests… we already did this for the very heavy ones, but I remember writing this issue because we thought there were at least a couple more that could be sped up further.

@dikwickley
Contributor

We can use the --durations=N flag to find the N slowest tests. Let me try it to find which ones are the heaviest.
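A self-contained demo of the --durations flag (the test file and its contents below are made up for illustration): write a tiny test module to a temporary directory, run pytest on it, and the slowest-durations table like the one at the top of this issue is printed after the test results.

```python
# Hypothetical, self-contained demo of pytest's --durations report.
import pathlib
import subprocess
import sys
import tempfile
import textwrap

code = textwrap.dedent("""
    import time

    def test_fast():
        assert True

    def test_slow():
        time.sleep(0.3)
        assert True
""")

with tempfile.TemporaryDirectory() as d:
    path = pathlib.Path(d) / "test_timing.py"
    path.write_text(code)
    # --durations=0 reports every test's duration; --durations=N reports
    # only the N slowest (setup/call/teardown phases are listed separately)
    result = subprocess.run(
        [sys.executable, "-m", "pytest", str(path), "--durations=0"],
        capture_output=True,
        text=True,
    )
out = result.stdout
print(out)
```

Note that by default pytest hides durations below 5 ms in this report unless run with -vv.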

@mrava87
Collaborator

mrava87 commented Feb 16, 2023

Sure. Even if you just run the tests in an IDE like PyCharm/VSCode, you can get duration stats directly. I suggest you look at the top 5 and we can discuss which ones make sense to tackle first. There may be problems that we can't make smaller than they are, as the inversion test would simply no longer be representative.

@dikwickley
Contributor

dikwickley commented Feb 16, 2023

I think we can start with these:
test_SeismicInterpolation2d
test_SeismicInterpolation3d
test_Marchenko_freq
test_lsm2d

[screenshot: test durations]

How do we decide when a test case can't be reduced any further?

@mrava87
Collaborator

mrava87 commented Feb 16, 2023

Sounds good. Well, based on knowledge of the problem. You can start by taking the top one and reducing its size, but I would not just run it as a test blindly; I would also make some plots to see what the input is, what the expected output is, and what is actually obtained… and use the tutorials that use the same operators to understand what the problem entails, if you are not familiar :)
