c - Flipping sign on packed SSE floats
I am looking for the most efficient way to flip the sign on all four floats packed in an SSE register.
I have not found an intrinsic for doing this in the Intel Architecture Software Developer's Manual. Below are the things that I have already tried.
For each case, I looped over the code 10 billion times and got the wall time indicated. I am trying to at least match the 4 seconds it takes my non-SIMD approach, which just uses the unary minus operator.
[48 seconds]
_mm_sub_ps(_mm_setzero_ps(), vec);
[32 seconds]
_mm_mul_ps(_mm_set1_ps(-1.0f), vec);
[9 seconds]
The compiler is GCC 4.2 with -O3. The CPU is an Intel Core 2 Duo.
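For reference, the two intrinsic-based attempts listed above can be written out as complete functions. This is only a minimal sketch: the function names (negate_sub, negate_mul, negate4_sub, negate4_mul) are hypothetical and do not come from the question, and the unaligned load and store are used so the helpers accept any float array.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_sub_ps, _mm_mul_ps, ... */

/* Attempt 1 from the question: subtract the vector from zero. */
static __m128 negate_sub(__m128 vec)
{
    return _mm_sub_ps(_mm_setzero_ps(), vec);
}

/* Attempt 2 from the question: multiply every lane by -1.0f. */
static __m128 negate_mul(__m128 vec)
{
    return _mm_mul_ps(_mm_set1_ps(-1.0f), vec);
}

/* Hypothetical helpers: negate four floats from `in` into `out`.
 * Unaligned load/store, so no 16-byte alignment is required. */
static void negate4_sub(const float in[4], float out[4])
{
    _mm_storeu_ps(out, negate_sub(_mm_loadu_ps(in)));
}

static void negate4_mul(const float in[4], float out[4])
{
    _mm_storeu_ps(out, negate_mul(_mm_loadu_ps(in)));
}
```

Both functions give the same result for ordinary nonzero values. How they compare in speed depends on how the compiler inlines them and hoists the constant out of the loop, so the timings above should not be assumed to carry over to this sketch.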
Just for completeness regarding the underlying vector types, here is the relevant passage from the GCC documentation:
The types defined in this manner can be used with a subset of normal C operations. Currently, GCC will allow using the following operators on these types: +, -, *, /, unary minus, ^, |, &, ~.
It is a good idea to use these operators whenever possible: in most cases GCC will generate the most efficient code for the SSE stuff.
For your compiler options, add something more specific to your architecture, such as -march=native.
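The vector-type operators quoted from the GCC documentation can be tried with a small sketch like the one below. It assumes a compiler that supports the vector_size attribute (GCC, and Clang as well); the type name v4sf and the functions negate_vec and negate4_ext are hypothetical names chosen for illustration.

```c
#include <string.h>  /* memcpy */

/* GCC vector extension: four packed floats in one 16-byte vector. */
typedef float v4sf __attribute__((vector_size(16)));

/* Unary minus is applied to each of the four elements. */
static v4sf negate_vec(v4sf v)
{
    return -v;
}

/* Hypothetical helper: negate four floats from `in` into `out`.
 * memcpy moves the data in and out of the vector type without
 * any alignment or aliasing assumptions. */
static void negate4_ext(const float in[4], float out[4])
{
    v4sf v;
    memcpy(&v, in, sizeof v);
    v = negate_vec(v);
    memcpy(out, &v, sizeof v);
}
```

In GCC's xmmintrin.h, __m128 is itself declared as a vector type of this kind, so the same unary minus should also work directly on a __m128 value. Which instructions the compiler actually emits depends on the GCC version and on flags such as -O3 and -march=native, so it is worth checking the generated assembly.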