c - Flipping sign on packed SSE floats


I am looking for the most efficient way to flip the sign on all four floats packed in an SSE register.

I have not found an intrinsic for doing this in the Intel Architecture Software Developer's Manual. Below are the things I have already tried.

For each case, I looped the code 10 billion times and got the wall time indicated. I am trying to at least match the 4 seconds it takes my non-SIMD approach, which just uses the unary minus operator.


[48 seconds]
_mm_sub_ps(_mm_setzero_ps(), vec);


[32 seconds]
_mm_mul_ps(_mm_set1_ps(-1.0f), vec);


[9 seconds]
union NegMask { int intRep; float fltRep; } negMask;
negMask.intRep = 0x80000000;
_mm_xor_ps(_mm_set1_ps(negMask.fltRep), vec);
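
As a minimal sketch of the XOR approach without the union: the float constant -0.0f has only the sign bit set (bit pattern 0x80000000), so it can serve as the mask directly. The function name negate_ps is hypothetical, and this assumes default floating-point settings in which the compiler keeps the sign of the zero constant.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_set1_ps, _mm_xor_ps */

/* Hypothetical helper: flip the sign of all four packed floats.
   -0.0f is the bit pattern 0x80000000, i.e. only the sign bit,
   so XORing with it toggles the sign and leaves the rest intact. */
static inline __m128 negate_ps(__m128 vec)
{
    const __m128 sign_mask = _mm_set1_ps(-0.0f);
    return _mm_xor_ps(vec, sign_mask);
}
```

A caller would load four floats with _mm_loadu_ps, pass the register to negate_ps, and store the result with _mm_storeu_ps. Hoisting the mask out of a hot loop is normally left to the compiler, but it can also be done by hand.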


The compiler is GCC 4.2 with -O3. The CPU is an Intel Core 2 Duo.

Just to complete the answer about the built-in vector types, from the GCC documentation:

  The types defined in this manner can be used with a subset of normal C operations. Currently, GCC will allow using the following operators on these types: `+, -, *, /, unary minus, ^, |, &, ~`.

It is a good idea to stick to these built-in operators whenever possible; GCC will generally generate the most efficient code for the SSE stuff.
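
A minimal sketch of what the quoted documentation describes, assuming GCC (or a compatible compiler such as Clang) and its vector_size attribute. The type name v4sf and the function name negate_v4sf are illustrative, not taken from the question.

```c
/* GCC vector extension: a 16-byte vector holding four floats.
   The name v4sf is an illustrative choice. */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical helper: unary minus is applied element-wise,
   and the compiler chooses the instruction sequence. */
static inline v4sf negate_v4sf(v4sf v)
{
    return -v;
}
```

Because the negation is written as an ordinary operator, the choice between XOR, subtraction, or something else is left to the compiler for the target architecture.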

For your compiler options, add something more specific to your architecture; in most cases -march=native is a good choice.

