Re: code generation options for v8 compilers

2013-08-16 Thread Mans Rullgard
On 14 August 2013 18:33, Padgett Don-B43265 wrote: > Is there a way to generate A32 code with the gcc-linaro-aarch64- compilers? No, use gcc-foo-arm for that. -- Mans Rullgard / mru ___ linaro-toolchain mailing list linaro-toolchain@lists.linaro.

Re: [SPAM] $99 A9 desktop

2013-07-15 Thread Mans Rullgard
On 15 July 2013 21:19, Renato Golin wrote: > + Quad-core-A9 @ 1.2GHz > + 4GB RAM How do you attach 4GB RAM to an A9? Where does the MMIO go? -- Mans Rullgard / mru

Re: Overheating Pandas

2013-07-04 Thread Mans Rullgard
an convince them to lower the CPU frequency. Chips intended for compute clusters will no doubt be possible to cool sufficiently to run at full speed all the time. Designing chips for different markets involves different sets of tradeoffs, and you're seeing the result of that. -- Mans R

Re: [Linaro-validation] Overheating Pandas

2013-07-04 Thread Mans Rullgard
peak usage from another device. That is not how electricity works. -- Mans Rullgard / mru ___ linaro-toolchain mailing list linaro-toolchain@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-toolchain

Re: Overheating Pandas

2013-07-03 Thread Mans Rullgard
On 3 July 2013 18:33, Richard Earnshaw wrote: > On 03/07/13 17:41, Renato Golin wrote: >> >> On 3 July 2013 17:22, Mans Rullgard > <mailto:mans.rullg...@linaro.org>> wrote: >> >> I repeat, the 4460 will run at 1.2GHz indefinitely without thermal >>

Re: Overheating Pandas

2013-07-03 Thread Mans Rullgard
On 3 July 2013 17:41, Renato Golin wrote: > On 3 July 2013 17:22, Mans Rullgard wrote: >> >> I repeat, the 4460 will run at 1.2GHz indefinitely without thermal >> management. > > > My mistake, I said 1.3GHz when it was actually 1.2GHz. So, at 1.2GHz, it > freeze

Re: Overheating Pandas

2013-07-03 Thread Mans Rullgard
On 3 July 2013 16:48, Renato Golin wrote: > On 3 July 2013 15:59, Mans Rullgard wrote: >> >> An OMAP4460 will run at 1.2GHz indefinitely without overheating in >> reasonable ambient temperature. >> >> If you don't have thermal management in the kernel yo

Re: Overheating Pandas

2013-07-03 Thread Mans Rullgard
only meant to be used in conjunction with (software) thermal management to throttle back if temperature rises. If you don't have thermal management in the kernel you're running, you need to clamp the clock at a safe value. -- Mans Rullgard / mru

Re: gcc 4.8-2013.06 reports incorrect version number

2013-06-18 Thread Mans Rullgard
't set in the tarball. > > ttyl > bero -- Mans Rullgard / mru

Re: Failure to optimise (a/b) and (a%b) into single __aeabi_idivmod call

2013-06-06 Thread Mans Rullgard
Some targets, e.g. MIPS, have a combined div/mod instruction. Those could benefit from this as well, unless they already achieve that optimisation differently. -- Mans Rullgard / mru
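The pattern named in the thread subject can be sketched in C as follows. This is a minimal illustration, not code from the thread: the type `quot_rem` and the functions `split` and `split_div` are hypothetical names. Whether a given compiler merges the two operations in `split` into one library call or one instruction depends on the compiler version and the target.

```c
#include <stdlib.h>

/* Hypothetical result type for this sketch. */
struct quot_rem {
    int quot;
    int rem;
};

/* Quotient and remainder of the same operands, written separately.
 * A compiler may or may not combine these into a single divide. */
static struct quot_rem split(int a, int b)
{
    struct quot_rem r;
    r.quot = a / b;
    r.rem = a % b;
    return r;
}

/* The same result requested explicitly through the standard div(). */
static struct quot_rem split_div(int a, int b)
{
    div_t d = div(a, b);
    struct quot_rem r = { d.quot, d.rem };
    return r;
}
```

Both functions return the same values for any non-zero divisor, so comparing their generated assembly is one way to see what a particular toolchain does with the first form.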

Re: alignment trap on small inline memcpy

2013-05-30 Thread Mans Rullgard
and stores? If targeting ARMv6 or later, yes. This is the reset default and the recommended setting. > If the answer to the above is "no" we can isolate the code more and bring it > back to the list. If you, ill-advisedly, enable strict alignment checking, you must compile with -mno

Re: help fighting with optimizations

2013-05-23 Thread Mans Rullgard
On 23 May 2013 04:09, Michael Hudson-Doyle wrote: > Mans Rullgard writes: > >> If you need to disable CSE for part of the code, you might want to try >> your luck with __attribute__((optimize("no-gcse"))) on the relevant >> functions. >> >>>

Re: help fighting with optimizations

2013-05-22 Thread Mans Rullgard
se to do :-) I suggest running some benchmarks under perf and counting branch prediction misses. Maybe it's not as much of a problem as you think. -- Mans Rullgard / mru

Re: hot spot on the vsub.f32 instruction

2013-05-17 Thread Mans Rullgard
On 17 May 2013 14:26, Peter Maydell wrote: > On 17 May 2013 14:03, Mans Rullgard wrote: >> On 17 May 2013 13:30, Renato Golin wrote: >>> Why don't we print full information like Intel? >> >> The part you snipped: >> >>>> CPU implementer

Re: hot spot on the vsub.f32 instruction

2013-05-17 Thread Mans Rullgard
> CPU revision : 1 That says Cortex-A5 r0p1 loud and clear. What info do you think is missing? -- Mans Rullgard / mru

Re: Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks

2013-03-23 Thread Mans Rullgard
On 23 March 2013 20:18, Renato Golin wrote: > On 23 March 2013 18:58, Mans Rullgard wrote: >> >> The thing is, those of us who are careful when writing code actually >> want these optimisations. The more information the compiler can >> infer from the code, the bett

Re: Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks

2013-03-23 Thread Mans Rullgard
ed with tomorrow's gcc. The thing is, those of us who are careful when writing code actually want these optimisations. The more information the compiler can infer from the code, the better. -- Mans Rullgard / mru

Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks

2013-03-23 Thread Mans Rullgard
This post is making the rounds today: http://blog.regehr.org/archives/918 -- Mans Rullgard / mru

Re: LLVM ARM NEON VMUL.f32

2013-03-20 Thread Mans Rullgard
vour of speed further reinforce the expectation that the default will be standards compliance. I am strongly in favour of your option 1. -- Mans Rullgard / mru

Re: aarch64 does not run

2013-02-10 Thread Mans Rullgard
he executable is requesting a non-existent "interpreter" (dynamic loader). You need to install the 32-bit compat lib package. I don't remember what it's called on Ubuntu, probably ia32-libs or similar. -- Mans Rullgard / mru

Re: Newbie question

2013-01-29 Thread Mans Rullgard
lity to allow v7 assembly to be compiled for a v8 model? Use a normal arm toolchain for 32-bit code. The aarch64 toolchain is for 64-bit only. -- Mans Rullgard / mru

[ACTIVITY] week 49

2012-12-09 Thread Mans Rullgard
ed runtime cpu feature (neon, vfp, etc) detection in Libav * ongoing: tracking gcc trunk on Libav test systems, no ARM regressions this week * user support on irc -- Mans Rullgard / mru

Re: corruption while doing open close

2012-11-30 Thread Mans Rullgard
main(int argc, char *argv[]) > { > int i; > pthread_t tid; > > for (i = 0; i < MAX_THREAD; ++i) { > pthread_create(&tid, NULL, &pipe_thread, (void*)NULL); > } > sleep(60); > } On returning from main(), all open streams are closed,

Re: unexpected reloc type 0x03 error with gcc-4.6.4 (2012.10 version)

2012-11-13 Thread Mans Rullgard
ed lib? Make sure you compile all the code going into it with -fPIC. -- Mans Rullgard / mru

Re: Getting a broken program when the -funroll-loops flag is set (gcc linaro arm)

2012-11-12 Thread Mans Rullgard
-fsl-linux-gnueabi-g++ (Freescale MAD -- Linaro 2011.07 -- Built at > 2011/08/10 09:20) 4.6.2 20110630 (prerelease) It would be great if you could test this with a 4.7 release as well. If possible, a test with 4.8 trunk would also be useful. -- Mans Rullgard / mru

Re: Compilation speed of Linaro's gcc compared to e.g. Ubuntu's version

2012-11-11 Thread Mans Rullgard
mpiling pcre > with -O3 -mfpu=neon -march=armv7-a -mtune=cortex-a8 takes 18.8 s for > the Ubuntu Precise 4.6 compiler, 17.8 s for the Ubuntu Quantal 4.7 > compiler, and 41.2 s for the Linaro 4.7 2012.10 build. I've logged > LP: #1077739 to track. I'll spin a --enable-checking=rel

Re: Compilation speed of Linaro's gcc compared to e.g. Ubuntu's version

2012-10-30 Thread Mans Rullgard
On 29 October 2012 16:28, "Frank Müller" wrote: > Mans Rullgard wrote: >> On 28 October 2012 18:08, "Frank Müller" wrote: >> > For easier maintenance, we are now switching to Linaro. The image is set >> up and I can compile, however I notice a

Re: Compilation speed of Linaro's gcc compared to e.g. Ubuntu's version

2012-10-29 Thread Mans Rullgard
> shipped version), with no significant difference. > > Compiler flags for the system are -march=armv7-a -mtune=cortex-a8 -mfpu=neon > -mfloat-abi=hard Could you please show us the full output from compiling one of your source files adding -v to the flags with both

Re: alignment faults in 3.6

2012-10-09 Thread Mans Rullgard
r starts 4 bytes into the struct, so if the struct is properly aligned, there will be no fault here. The problem is that networking code does not always align these structs correctly. > attempts to do a type conversion using pointers, then dereference > it. I would have thought: > > i

Re: Test Case to verify the support of VFPV3 and VFPV4

2012-10-09 Thread Mans Rullgard
on-IEEE floating point results are OK. Otherwise it's not valid to > emit a fused multiply-add for a*b+c because IEEE specifies that you > should get a rounding step between the multiply and the add. Or > does gcc default to non-IEEE arithmetic? Maybe adding -ffast-math does

Re: Query: What happens in case of char array overflow?

2012-10-09 Thread Mans Rullgard
sive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array. In your case, the array will be initialised with the numbers 1-5, the null terminator of the string literal is ign
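The initialisation rule quoted above can be illustrated with a short C sketch. The array names `digits` and `digits_z` and the two helper functions are hypothetical, chosen only for this example; the original question's code is not shown in the snippet. Note that this applies to C; C++ rejects the first declaration.

```c
#include <stddef.h>

/* The literal has five characters plus a terminator, the array has room
 * for five: the characters '1'..'5' are stored and the null terminator
 * is dropped, so this array is NOT a valid C string. */
static const char digits[5] = "12345";

/* With unknown size the terminator is included, giving six elements. */
static const char digits_z[] = "12345";

/* Hypothetical helpers reporting the array sizes. */
static size_t digits_size(void)
{
    return sizeof digits;
}

static size_t digits_z_size(void)
{
    return sizeof digits_z;
}
```

Passing `digits` to a function such as `strlen()` would read past the end of the array, which is the overflow hazard the thread subject refers to.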

Re: alignment faults in 3.6

2012-10-06 Thread Mans Rullgard
On 5 October 2012 23:42, Russell King - ARM Linux wrote: > On Fri, Oct 05, 2012 at 11:37:40PM +0100, Mans Rullgard wrote: >> The problem is the (__be32 *) casts. This is a normal pointer to a 32-bit value, >> which is assumed to be aligned, and the cast overrides the packed attrib

Re: alignment faults in 3.6

2012-10-05 Thread Mans Rullgard
*p_id = id; > return flush; > } The problem is the (__be32 *) casts. This is a normal pointer to a 32-bit value, which is assumed to be aligned, and the cast overrides the packed attribute from the struct. Dereferencing these cast expressions must be done with the macros from asm/unaligned.h. -- Mans Rullgard / mru
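Outside the kernel, the same idea can be sketched with memcpy(). This is an assumption-laden userspace stand-in, not the kernel's implementation: `load_u32_unaligned` and `store_u32_unaligned` are hypothetical names, and the kernel code in question would use the macros from asm/unaligned.h instead.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may not be 4-byte aligned.
 * memcpy() makes no alignment assumption, so the compiler must emit
 * an access sequence that is safe on the target. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value to a possibly unaligned address. */
static void store_u32_unaligned(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

By contrast, `*(uint32_t *)p` tells the compiler the pointer is suitably aligned, which is exactly the assumption that fails when a packed header starts at an odd offset in a buffer.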

Re: alignment faults in 3.6

2012-10-05 Thread Mans Rullgard
On 5 October 2012 09:33, Russell King - ARM Linux wrote: > On Fri, Oct 05, 2012 at 09:33:04AM +0100, Mans Rullgard wrote: >> On 5 October 2012 09:24, Russell King - ARM Linux >> wrote: >> > On Fri, Oct 05, 2012 at 09:20:56AM +0100, Mans Rullgard wrote: >> >> O

Re: alignment faults in 3.6

2012-10-05 Thread Mans Rullgard
On 5 October 2012 09:24, Russell King - ARM Linux wrote: > On Fri, Oct 05, 2012 at 09:20:56AM +0100, Mans Rullgard wrote: >> On 5 October 2012 08:12, Russell King - ARM Linux >> wrote: >> > On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote: >> >>

Re: alignment faults in 3.6

2012-10-05 Thread Mans Rullgard
On 5 October 2012 08:12, Russell King - ARM Linux wrote: > On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote: >> On 5 October 2012 02:56, Rob Herring wrote: >> > This struct is the IP header, so a struct ptr is just set to the >> > beginning of the re

Re: alignment faults in 3.6

2012-10-04 Thread Mans Rullgard
On 5 October 2012 02:56, Rob Herring wrote: > On 10/04/2012 08:26 PM, Mans Rullgard wrote: >> On 5 October 2012 01:58, Michael Hope wrote: >>> On 5 October 2012 12:10, Rob Herring wrote: >>>> I've been scratching my head with a "scheduling while atomic"

Re: alignment faults in 3.6

2012-10-04 Thread Mans Rullgard
" #endif __u8 tos; __be16 tot_len; __be16 id; __be16 frag_off; __u8 ttl; __u8 protocol; __sum16 check; __be32 saddr; __be32 daddr; /* The options start here. */ }; In a normal build (there's

Re: PGO and LTO session preparation

2012-08-16 Thread Mans Rullgard
On 16 August 2012 13:04, Mans Rullgard wrote: > On 15 August 2012 22:38, Mans Rullgard wrote: >> On 15 August 2012 17:17, Matthew Gretton-Dann >> wrote: >>> The performance of PGO on 'a popular embedded benchmark' is 14% >>> improvement, LTO is 7%.

Re: PGO and LTO session preparation

2012-08-16 Thread Mans Rullgard
On 15 August 2012 22:38, Mans Rullgard wrote: > On 15 August 2012 17:17, Matthew Gretton-Dann > wrote: >> The performance of PGO on 'a popular embedded benchmark' is 14% >> improvement, LTO is 7%. Don't know both together, or SPEC. > > On 'a pop

Re: PGO and LTO session preparation

2012-08-15 Thread Mans Rullgard
as 11%. The relative gains are larger on average with hand-written assembly disabled, but obviously nowhere near the performance with it enabled. On the same library LTO is 2-3.5 _times_ slower than without on all tests, although it does pass the test

Re: Options for ARM A9 Cortex-MP without FPU

2012-08-13 Thread Mans Rullgard
> (toolchain and/or kernel) soft FPU emulation related options > for armv7-a without a FPU? To build a softfloat gcc, pass --with-float=soft to configure. Build the kernel as usual, nothing special is required there. -- Mans Rullgard / mru

Re: Vector-alignment patch performance regressions

2012-08-08 Thread Mans Rullgard
that everything should be aligned to 32-bytes for > A9 (as that is the A9 cache line size: Too aggressively aligning things to 32 bytes will only end up wasting cache space. What you ideally want is to align everything (at least) to the access size while minimising the number of cache l

Re: Vector-alignment patch performance regressions

2012-08-08 Thread Mans Rullgard
igned to 8 bytes. To check whether > this makes a difference, I've modified the compiler as a hack to always > force all global arrays to be 16 byte aligned. And interestingly enough, > this appears to fix this particular performance regression ... Are those

Re: AND vs UXTB

2012-08-03 Thread Mans Rullgard
On 3 August 2012 13:53, Richard Earnshaw wrote: > On 03/08/12 13:49, Mans Rullgard wrote: >> I have noticed gcc has a preference for generating UXTB instructions >> when an AND with #255 would do the same thing. This is bad, because >> on A9 UXTB has two cycles latency comp

AND vs UXTB

2012-08-03 Thread Mans Rullgard
I have noticed gcc has a preference for generating UXTB instructions when an AND with #255 would do the same thing. This is bad, because on A9 UXTB has two cycles latency compared to one cycle for AND. On A8 both instructions have one cycle latency. -- Mans Rullgard / mru
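As a rough illustration, these are two C spellings of the same zero-extension that a compiler could lower to either instruction. The function names `low_byte_and` and `low_byte_cast` are hypothetical, and which instruction is actually emitted for each depends on the compiler version and tuning options.

```c
#include <stdint.h>

/* Mask form: the natural source-level match for AND with #255. */
static uint32_t low_byte_and(uint32_t x)
{
    return x & 255u;
}

/* Narrowing form: the natural source-level match for UXTB. */
static uint32_t low_byte_cast(uint32_t x)
{
    return (uint8_t)x;
}
```

The two functions are equivalent for every input, so the choice between AND and UXTB is purely an instruction-selection matter for the compiler.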

Re: Distinguishing SF/HF ABI binaries, take two

2012-08-03 Thread Mans Rullgard
e discussed it in the past, and while you possibly *could* do it, > it's such an edge case that nobody cared. Likely to be fragile, too. If it really becomes necessary, an override flag to dlopen() should be simple enough to implement. -- Mans Rullgard / mru

Re: Distinguishing SF/HF ABI binaries, take two

2012-08-03 Thread Mans Rullgard
On 3 August 2012 12:00, Richard Earnshaw wrote: > On 02/08/12 18:39, Mans Rullgard wrote: >> Nevertheless, the tags in the .ARM.attributes section are the standard, >> published way to identify FP ABI as well as a number of other properties >> that might be relevant to

Re: Distinguishing SF/HF ABI binaries, take two

2012-08-02 Thread Mans Rullgard
res recompiling everything, so there's no difference there. > However, > it avoids the two drawbacks of your method Mans pointed out: > - there is no duplication of data (there is a bit of extra meta > data in the form of the new program header, but the actual data > cov

Re: Distinguishing SF/HF ABI binaries, take two

2012-08-02 Thread Mans Rullgard
On 2 August 2012 19:00, Steve McIntyre wrote: > On Thu, Aug 02, 2012 at 06:39:33PM +0100, Mans Rullgard wrote: >>On 2 August 2012 17:43, Steve McIntyre wrote: >>> [ Also posted to debian-arm; not cross-posted to avoid subscription >>> complaints... ] >>>

Re: Distinguishing SF/HF ABI binaries, take two

2012-08-02 Thread Mans Rullgard
ainers are amenable. I'm about to post a similar message there. I really think the only sane thing to do is fix glibc so it can fetch the attributes from their standard locations. -- Mans Rullgard / mru

Re: Libav benchmarks

2012-07-16 Thread Mans Rullgard
On 10 July 2012 14:57, Ramana Radhakrishnan wrote: > On 6 July 2012 16:52, Mans Rullgard wrote: >> I ran my usual set of benchmarks of libav compiled with the current gcc >> releases (hand-written assembly disabled). The results are in this >> spreadsheet: >> https://

Libav benchmarks

2012-07-06 Thread Mans Rullgard
=libav.git;a=blob;f=libavcodec/flacdsp_template.c;h=0affe22ddb76325682aef46731722b068dd3a791;hb=HEAD#l51 [4] http://git.libav.org/?p=libav.git;a=blob;f=libavcodec/simple_idct_template.c;h=3c855e3825dc7b87747b8217b9955eae83528246;hb=HEAD#l315 -- Mans Rullgard / mru

Re: String routines writeup with benchmarks

2012-06-15 Thread Mans Rullgard
On 15 June 2012 00:33, Michael Hope wrote: > On 11 June 2012 21:53, Mans Rullgard wrote: >> On 11 June 2012 02:14, Michael Hope wrote: >>> We talked at Connect about finishing up the cortex-strings work by >>> upstreaming them into Bionic, Newlib, and GLIBC. I'

Re: Vectoriser performance regression in 4.7

2012-06-11 Thread Mans Rullgard
On 11 June 2012 17:34, Ulrich Weigand wrote: > Mans Rullgard wrote: > >> static void ps_hybrid_analysis_ileave_c(float (*out)[32][2], >> float L[2][38][64], >> int i, int len) >> {

Vectoriser performance regression in 4.7

2012-06-11 Thread Mans Rullgard
his at all, 4.7 goes crazy with a massive slowdown, about 20x slower than non-vectorised with Linaro 4.7 and much worse with FSF 4.7. Let me know if you need more information. -- Mans Rullgard / mru

Re: String routines writeup with benchmarks

2012-06-11 Thread Mans Rullgard
ect functions doing some prefetching to perform better there. Some time ago, I compared a few memcpy() implementations on large blocks, and the Bionic NEON-optimised one was several times faster than glibc. It is of course possible that glibc has improved since then. -- Mans Rullgard / mru

Re: GCC trunk fails to build

2012-05-16 Thread Mans Rullgard
gcc2.c:397:1: internal compiler error: >> in df_uses_record, at df-scan.c:3179 >> > > > This looks like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53278 > which was fixed last week by rev 187299 > > Are we still seeing this failure ? I saw that failure a while ago, b

Re: Armhf dynamic linker path

2012-05-02 Thread Mans Rullgard
On 2 May 2012 13:42, Richard Earnshaw wrote: > On 02/05/12 13:25, Mans Rullgard wrote: >> On 2 May 2012 05:15, Michael Hope wrote: >>> On 27 April 2012 11:59, Michael Hope wrote: >>>> On 23 April 2012 14:23, Jon Masters wrote: >>>>> On 04/22/20

Re: Armhf dynamic linker path

2012-05-02 Thread Mans Rullgard
patch is now upstream as r186859 and r187012. I noticed that it now sets the dynamic loader to /lib/ld-linux-armhf.so.3 even when configured for soft-float ABI and linking against a soft-float rootfs. The resulting binaries then fail to run. Passing -mfloat-abi=softfp to the link command fixes it. Is this change in behaviour intentional? -- Mans Rullgard / mru

Re: Cross testing applications under QEMU

2012-04-25 Thread Mans Rullgard
d only tested the output in QEMU, never on hardware. Since then, many bugs in QEMU have been fixed, but I would still not trust it for validating a compiler. -- Mans Rullgard / mru

Re: getting armv7 linker to emit armv4t thumb interworking

2012-04-13 Thread Mans Rullgard
luding those that are normally built as part of the toolchain > (libgcc, etc). This is in the context of building a u-boot SPL for Tegra2/3. This is completely self-contained, not even relying on the compiler's libgcc. -- Mans Rullgard / mru

Re: Selectively disable vectorization

2012-04-11 Thread Mans Rullgard
On 11 April 2012 22:05, Ramana Radhakrishnan wrote: > On 11 April 2012 17:21, Mans Rullgard wrote: >> On 11 April 2012 16:16, Ulrich Weigand wrote: >>> "Singh, Ravi Kumar (Ravi)" wrote: >>> >>>> Are there any pragmas for selectively disabling (i

Re: Selectively disable vectorization

2012-04-11 Thread Mans Rullgard
ization settings). Are you saying __attribute__((optimize("foo"))) is a lie? -- Mans Rullgard / mru
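For reference, a per-function use of GCC's optimize attribute might look like the sketch below. It is an illustration under assumptions, not a recommendation from the thread: the macro `NO_VECTORIZE` and the function `scale` are hypothetical names, the attribute is GCC-specific (hence the guard), and how reliably it overrides command-line settings is the very point under dispute in this thread.

```c
#include <stddef.h>

/* GCC-only attribute; other compilers get an empty macro so the
 * code still builds, just without the per-function override. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_VECTORIZE __attribute__((optimize("no-tree-vectorize")))
#else
#define NO_VECTORIZE
#endif

/* A simple loop that the auto-vectoriser would normally consider. */
NO_VECTORIZE
static void scale(float *dst, const float *src, size_t n, float k)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}
```

Inspecting the generated assembly for `scale` with and without the attribute is the only dependable way to confirm what a given GCC release does with it.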

Re: libav inline assembly

2012-03-21 Thread Mans Rullgard
c older than 4.5. If you find any others where gcc has improved recently, let us know so we can make them conditional. -- Mans Rullgard / mru

Re: Plan for changing the binary toolchain to 4.7 and hardfloat

2012-03-18 Thread Mans Rullgard
configuration including > soft or hard float.  We talked about this at a cross distro session > and Steve McIntyre was going to push some of the first steps. FWIW, Gentoo has been using arm-hardfloat-linux-gnueabi for hardfloat configurations ever since gcc started supporting it.