Switching a process's FPU to RunFast or IEEE mode using a preloaded library.
Vector calculations on floats, the usual pattern in performance-critical code, run up to 58% faster in RunFast mode (vector_load<float>: 10.33 s down to 6.52 s in the runs below).
The test example is written in C++ because C++ will be the standard language for MeeGo apps, thanks to Qt.
See also https://projects.maemo.org/bugzilla/show_bug.cgi?id=198547
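
The preload approach described above can be sketched roughly as follows. This is not the original fpumode.so source, only a minimal illustration. The function names, the FPUMODE_TARGET macro and the simulated fallback for non-ARM builds are assumptions. The three control-word values are the ones printed in the logs below. On ARM with glibc the sketch uses _FPU_GETCW and _FPU_SETCW from fpu_control.h.

```cpp
// Hypothetical sketch of a preload library in the spirit of fpumode-fast.so.
// NOT the original source: all names and the overall structure are assumptions.
#include <cstdint>
#include <cstdio>
#include <string>

#if defined(__arm__) && defined(__GLIBC__)
#include <fpu_control.h>  // glibc: fpu_control_t, _FPU_GETCW, _FPU_SETCW
#define FPUMODE_HAVE_HW 1
#else
#define FPUMODE_HAVE_HW 0  // other targets: only simulate the register
#endif

// Control-word values as they appear in the test logs.
constexpr std::uint32_t kModeDefault = 0x00000000u;  // _FPU_DEFAULT
constexpr std::uint32_t kModeIeee    = 0x00001f00u;  // _FPU_IEEE: FPSCR bits 8..12 (trap enables)
constexpr std::uint32_t kModeRunFast = 0x03000000u;  // FPSCR bits 24 (FZ) and 25 (DN)

// Assumed build-time switch, one .so per mode.
#ifndef FPUMODE_TARGET
#define FPUMODE_TARGET kModeRunFast
#endif

// Label a control word the way the "changing mode to" log lines do.
std::string mode_name(std::uint32_t cw) {
    if (cw == kModeDefault) return "DEFAULT";
    if (cw == kModeIeee)    return "IEEE";
    if (cw == kModeRunFast) return "RUN FAST";
    return "CUSTOM";
}

#if !FPUMODE_HAVE_HW
static std::uint32_t g_simulated_cw = 0;  // stand-in for FPSCR off-target
#endif

std::uint32_t get_mode() {
#if FPUMODE_HAVE_HW
    fpu_control_t cw;
    _FPU_GETCW(cw);
    return static_cast<std::uint32_t>(cw);
#else
    return g_simulated_cw;
#endif
}

void set_mode(std::uint32_t mode) {
#if FPUMODE_HAVE_HW
    fpu_control_t cw = static_cast<fpu_control_t>(mode);
    _FPU_SETCW(cw);
#else
    g_simulated_cw = mode;
#endif
}

// Runs when the dynamic loader maps the library, before main() of the
// preloaded program (GCC/Clang constructor attribute).
__attribute__((constructor)) static void fpumode_init() {
    const std::uint32_t target = FPUMODE_TARGET;
    std::fprintf(stderr, "* current fpu mode is 0x%08x\n",
                 static_cast<unsigned>(get_mode()));
    std::fprintf(stderr, "* changing mode to 0x%08x [%s]\n",
                 static_cast<unsigned>(target), mode_name(target).c_str());
    set_mode(target);
}
```

Assuming the file is saved as fpumode.cpp (a hypothetical name), something like "g++ -shared -fPIC -DFPUMODE_TARGET=kModeIeee fpumode.cpp -o fpumode-ieee.so" would produce one library per mode, loaded with LD_PRELOAD as in the runs below.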

Testing as is
XXX-43-3:/home/tests/fpumode# ./fpumodetest
* run scalar_load<float> testing
=> 14.050568 seconds
* run vector_load<float> testing
=> 10.329529 seconds
* run scalar_load<double> testing
=> 14.039001 seconds
* run vector_load<double> testing
=> 11.4294496776 seconds
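
A value such as 11.4294496776 is not a plausible fraction of a second. One possible explanation, which is an assumption and is not confirmed by the test source, is that the seconds and microseconds of two timestamps were subtracted separately and a negative microsecond difference was printed as an unsigned 32-bit number. Under that assumption the real time would be 11 s minus 0.470520 s, about 10.53 s. The sketch below shows the wrap and a subtraction that borrows a second instead. The struct and function names are hypothetical.

```cpp
// Hypothetical illustration of the suspected timer wrap; not the test's code.
#include <cstdint>

struct Elapsed {
    long sec;
    long usec;
};

// Subtract two (sec, usec) timestamps, borrowing one second when the
// microsecond difference is negative.
Elapsed elapsed_between(long start_sec, long start_usec,
                        long end_sec, long end_usec) {
    Elapsed e{end_sec - start_sec, end_usec - start_usec};
    if (e.usec < 0) {
        e.sec -= 1;
        e.usec += 1000000;
    }
    return e;
}

// What the fractional field looks like if the raw microsecond difference
// is reinterpreted as an unsigned 32-bit value without borrowing.
std::uint32_t naive_usec_field(long start_usec, long end_usec) {
    return static_cast<std::uint32_t>(end_usec - start_usec);
}
```

For example, with a start of (100 s, 900000 us) and an end of (111 s, 429480 us), naive_usec_field gives 4294496776, which matches the digits after "11." above, while elapsed_between gives 10 s and 529480 us.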

Testing with the FPU switched to _FPU_DEFAULT as defined in fpu_control.h
XXX-43-3:/home/tests/fpumode# LD_PRELOAD=$PWD/fpumode.so ./fpumodetest
* fpu mode build Oct 29 2010 14:28:20
* current fpu mode is 0x00000000 [CUSTOM]
* changing mode to 0x00000000 [DEFAULT]
* run scalar_load<float> testing
=> 14.046722 seconds
* run vector_load<float> testing
=> 10.328735 seconds
* run scalar_load<double> testing
=> 14.039399 seconds
* run vector_load<double> testing
=> 11.4294496806 seconds

Forcing the FPU to _FPU_IEEE mode as defined in fpu_control.h
XXX-43-3:/home/tests/fpumode# LD_PRELOAD=$PWD/fpumode-ieee.so ./fpumodetest
* fpu mode build Oct 29 2010 14:28:20
* current fpu mode is 0x00000000 [CUSTOM]
* changing mode to 0x00001f00 [IEEE]
* run scalar_load<float> testing
=> 14.100861 seconds
* run vector_load<float> testing
=> 11.4294296032 seconds
* run scalar_load<double> testing
=> 14.039764 seconds
* run vector_load<double> testing
=> 10.529907 seconds

Forcing the FPU to RunFast mode
XXX-43-3:/home/tests/fpumode# LD_PRELOAD=$PWD/fpumode-fast.so ./fpumodetest
* fpu mode build Oct 29 2010 14:28:20
* current fpu mode is 0x00000000 [CUSTOM]
* changing mode to 0x03000000 [RUN FAST]
* run scalar_load<float> testing
=> 14.063935 seconds
* run vector_load<float> testing
=> 6.518768 seconds
* run scalar_load<double> testing
=> 14.039734 seconds
* run vector_load<double> testing
=> 11.4294496471 seconds
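
The 58% figure can be reproduced from the vector_load<float> timings above, taking the "as is" run (10.329529 s) as the baseline and the RunFast run (6.518768 s) as the tuned result. The helper names below are hypothetical. The sketch only shows which ratio the percentage corresponds to: it is a throughput gain, while the wall-clock time itself drops by roughly 37%.

```cpp
// Minimal arithmetic sketch; helper names are assumptions.

// Throughput gain: how much more work per second the tuned run does.
double speedup_percent(double baseline_s, double tuned_s) {
    return (baseline_s / tuned_s - 1.0) * 100.0;
}

// Reduction in wall-clock time relative to the baseline.
double time_saved_percent(double baseline_s, double tuned_s) {
    return (1.0 - tuned_s / baseline_s) * 100.0;
}
```

Calling speedup_percent(10.329529, 6.518768) gives a little over 58, and time_saved_percent with the same arguments gives a little under 37.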
