ref: Restore explicit vectorization in the Vc AoS plugin #118
Based on #97, so that the storage vector type can be reused and
It seems Vc AoS gets slower with double in some cases. Is that normal?
It is expected for vectorized code to be about half as fast in double precision as in single precision (only half as many values fit into the same number of register bits). That the cmath plugin is also slower in double precision could be a hint that it is in fact partly autovectorized. I don't know why Vc AoS is so much slower than cmath in double precision, but I have seen this before as well.
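To illustrate the register-width argument, here is a minimal sketch that counts how many values fit into one SIMD register. The helper `lanes` and the 256-bit width are assumptions chosen for illustration (they are not part of this plugin or of Vc), and the lane counts assume the usual 4-byte `float` and 8-byte `double`.

```cpp
#include <cstddef>

// Hypothetical helper (not part of Vc or this PR): number of values of
// type T that fit into a single SIMD register of `register_bits` bits.
template <typename T>
constexpr std::size_t lanes(std::size_t register_bits) {
    return register_bits / (8 * sizeof(T));
}

// Assumption: a 256-bit register, as on AVX/AVX2 hardware.
constexpr std::size_t avx_bits = 256;

// Assumption: float is 4 bytes and double is 8 bytes (typical IEEE 754 platforms).
static_assert(lanes<float>(avx_bits) == 8, "eight floats per 256-bit register");
static_assert(lanes<double>(avx_bits) == 4, "four doubles per 256-bit register");

// Half as many lanes per instruction means roughly half the throughput
// for fully vectorized code, all else being equal.
static_assert(lanes<float>(avx_bits) == 2 * lanes<double>(avx_bits),
              "double precision halves the lane count");
```

This only models the best case for fully vectorized arithmetic; it does not explain the additional gap between Vc AoS and cmath in double precision.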
Yes, I was asking why Vc double is slower than cmath double (not w.r.t. float). Thanks for the answer.
These benchmarks are a bit outdated though (and were also run with CPU scaling enabled, so that e.g. addition and subtraction show different results). It works much better running:
with CPU scaling disabled
The CI failure does not seem to be related to this PR, see #123.
This refactors the Vc AoS plugin to use explicitly vectorizing Vc types again. It also adds the plugin to the benchmarks and tests. Since not all functionality is implemented yet, I split the test suite into three blocks, so that new plugins can be implemented incrementally with testing enabled. Finally, I harmonized the naming between this plugin and the new SoA plugin.
Also removes the warmup from the benchmarks, since Google Benchmark can be configured to do that for us.
Edit: I refactored the vc_aos::transform3 and matrix44 types, so that they are now shared between Vc AoS and SoA.