Re: Spam, bounces and gcc list removal

2020-03-21 Thread Oleg Endo
On Sat, 2020-03-21 at 13:08 -0700, H.J. Lu via Gcc wrote:
> On Sat, Mar 21, 2020 at 12:40 PM Thomas Koenig via Gcc <
> gcc@gcc.gnu.org> wrote:
> > 
> > Hi,
> > 
> > > since the change to the new list management, there has been
> > > an uptick of spam getting through. Spam is bounced by my ISP,
> > > and this just resulted in a warning that there were too many
> > > bounces and that I would get removed from the list unless I
> > > confirmed it (which I then did).
> > 
> > This has now happened a second time, and this question
> 
> Same here.
> 

Same here.


Cheers,
Oleg



Re: size of exception handling (Was: performance of exception handling)

2020-05-12 Thread Oleg Endo
On Tue, 2020-05-12 at 09:20 +0200, Freddie Chopin wrote:
> 
> I actually have to build my own toolchain instead of the one provided
> by ARM, because to really NOT use C++ exceptions, you have to recompile
> the whole libstdc++ with `-fno-exceptions -fno-rtti` (yes, I know they
> provide the "nano" libraries, but the options they used for newlib
> don't suit my needs - this is "too minimized"). If you pass these two
> flags during compilation and linking of your own application, this
> disables these features only in your code. As libstdc++ is compiled
> with exceptions and RTTI enabled, ...

IMHO this is a fundamental flaw in the whole concept of using pre-
compiled, pre-installed libraries somewhere in the toolchain, in
particular for this kind of cross-compilation scenario.  As you say,
when we set "exceptions off" it usually means off for the whole
embedded app, and the whole embedded app usually means all the OS and
runtime libraries and everything, not just the user code.

One option is to not use the pre-compiled toolchain libstdc++, but to
build it from source (or use another C++ standard library of your
choice) as part of the whole project, with the desired settings.
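
A minimal sketch of that approach, using the standard GCC build variables and make targets (the target triplet, paths, and flag values below are placeholders, not from the original post):

```shell
# Build libstdc++ from source with the project's own flags, instead of
# using the toolchain's pre-compiled binaries.  Paths/versions are
# placeholders.
export CFLAGS_FOR_TARGET="-Os -fno-exceptions -fno-rtti"
export CXXFLAGS_FOR_TARGET="-Os -fno-exceptions -fno-rtti"

../gcc-src/configure --target=arm-none-eabi \
    --prefix=/opt/project-toolchain \
    --enable-languages=c,c++ \
    --with-newlib --disable-libstdcxx-verbose
make -j"$(nproc)" all-target-libstdc++-v3
make install-target-libstdc++-v3
```

This way the whole C++ runtime is compiled with the same exception/RTTI settings as the application code.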


BTW, just to throw in my 2-cents into the "I'm using MCU" pool of
pain/joy ... in one of my projects I'm using STM32F051K6U6, 32 KB
flash, 8 KB RAM, running all C++ code with shared C++ RPC libraries to
communicate with other (bigger) devices.  Exceptions, RTTI, threads
have to be turned off and only the header-only things from the stdlib
can be used and no heap allocations.  Otherwise the thing doesn't fit. 
Don't feel like rewriting the whole thing either.  There are some
annoyances when turning off exceptions and RTTI which result in
increased code maintenance.  It'd definitely be good and highly
appreciated if there were any improvements in the area of exception
handling.
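
One example of the kind of maintenance burden meant above (a common pattern when code must build both with and without exceptions; the macro name is hypothetical, not from the original post):

```c
#include <stdio.h>
#include <stdlib.h>

/* With -fno-exceptions, GCC does not define __EXCEPTIONS, so every
   error path needs a non-throwing fallback.  FAIL is a hypothetical
   macro illustrating the dual-mode maintenance cost. */
#if defined(__EXCEPTIONS)
#  define FAIL(msg) throw msg                       /* exceptions on */
#else
#  define FAIL(msg) do { fputs(msg, stderr); abort(); } while (0)
#endif

/* Each function with an error path has to be written against both modes. */
static int checked_div(int a, int b)
{
    if (b == 0)
        FAIL("division by zero\n");
    return a / b;
}
```

The non-throwing branch can only abort or return an error code, which is exactly where the extra maintenance comes from.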

Cheers,
Oleg



Re: rtx_cost of insns

2015-06-29 Thread Oleg Endo

On 29 Jun 2015, at 16:46, Alan Modra  wrote:

> On Thu, Jun 25, 2015 at 01:28:39PM +0100, Richard Earnshaw wrote:
>> Perhaps the best thing to do is to use the OUTER code to spot the
>> specific case where you've got a SET and return non-zero in that case.
> 
> That's exactly the path I've been following.  It's not as easy as it
> sounds..
> 
> First, some backends call rtx_cost from their targetm.rtx_costs.
> ix86_rtx_costs for instance has this
> 
>case PLUS:
> ...
> if (val == 2 || val == 4 || val == 8)
>   {
> *total = cost->lea;
> *total += rtx_cost (XEXP (XEXP (x, 0), 1),
> outer_code, opno, speed);
> *total += rtx_cost (XEXP (XEXP (XEXP (x, 0), 0), 0),
> outer_code, opno, speed);
> *total += rtx_cost (XEXP (x, 1), outer_code, opno, speed);
> return true;
>   }
> which, when using a non-zero register move cost, results in
> 
> Successfully matched this instruction:
> (set (reg:DI 198 [ D.74663 ])
>(plus:DI (plus:DI (reg/v/f:DI 172 [ use_entry ])
>(reg:DI 196 [ D.74662 ]))
>(const_int -32 [0xffe0])))
> rejecting combination of insns 179 and 180
> original costs 6 + 4 = 10
> replacement cost 15
> 
> So here the x86 backend is calculating the cost of an lea, plus the
> cost of (reg:DI 196), plus the cost of (reg/v/f:DI 172), plus the cost
> of (const_int -32).  outer_code is SET.  That means we add two
> register moves, increasing the overall cost from 7 to 15.
> 
> The second problem I've hit is that fwprop.c:should_replace_address
> has this:
> 
>  /* If the addresses have equivalent cost, prefer the new address
> if it has the highest `set_src_cost'.  That has the potential of
> eliminating the most insns without additional costs, and it
> is the same that cse.c used to do.  */
>  if (gain == 0)
>gain = (set_src_cost (new_rtx, VOIDmode, speed)
>   - set_src_cost (old_rtx, VOIDmode, speed));
> 
>  return (gain > 0);
> 
> If register moves have the same cost as adding a small constant to a
> register, then this code no longer replaces a pseudo with its value as
> an offset from a base.  I think this particular problem can be fixed
> quite simply by "return gain >= 0;", but really, this code, like the
> x86 code, is expecting the cost of a register move to be zero.
> 
> You'll notice that these example problems are not trying to cost a
> whole instruction.  In both cases they want the cost of just a piece
> of an instruction, but rtx_cost is called in a way that is
> indistinguishable from other code that calls rtx_cost on whole
> register move instructions.
> 
> The real difficulty is in separating out the whole insn cases from the
> partial insn cases.
> 
> Note that we already have insn_rtx_cost, and it returns a minimum cost
> for a SET, so register move insns get a cost of 1 insn.  However,
> despite insn_rtx_cost starting life in combine.c, even combine doesn't
> use it in all whole insn cases.  :-(

Quite often, more complex (combine) insns have to be matched manually using 
C/C++ code in order to implement the costs function.  To avoid that, maybe we 
could have target-independent insn attributes that carry the costs?  That would 
be much easier/faster (at least) for combine to look up and is also 
easier to maintain in the backend.
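
A sketch of what such an attribute could look like in a machine description; `define_attr` and `set_attr` are the existing .md constructs, but the "insn_cost" attribute name and the cost values here are hypothetical:

```lisp
;; Hypothetical: a generic numeric "insn_cost" attribute with a default,
;; which combine could read via the generated attribute accessors instead
;; of re-matching RTL patterns in C code.
(define_attr "insn_cost" "" (const_int 4))

;; A pattern can then override the default cost directly:
(define_insn "*movsi_reg"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (match_operand:SI 1 "register_operand" "r"))]
  ""
  "mov	%1,%0"
  [(set_attr "insn_cost" "2")])
```

The cost then lives next to the pattern it describes, instead of in a separate switch in the backend's costs hook.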

It's also possible to implement that in a target-specific way: in the 
costs function, construct a temporary fake insn, run recog on it, and 
look up the attribute.  However, this will pointlessly invoke recog 
twice, since by the time combine queries the insn costs it has already 
invoked recog.

Cheers,
Oleg

Re: Allocation of hotness of data structure with respect to the top of stack.

2015-07-07 Thread Oleg Endo

On 07 Jul 2015, at 04:49, Jeff Law  wrote:

> On 07/05/2015 05:11 AM, Ajit Kumar Agarwal wrote:
>> All:
>> 
>> I am wondering allocation of hot data structure closer to the top of
>> the stack increases the performance of the application. The data
>> structure are identified as hot and cold data structure and all the
>> data structures are sorted in decreasing order of The hotness and the
>> hot data structure will be allocated closer to the top of the stack.
>> 
>> The load and store on accessing with respect to allocation of data
>> structure on stack will be faster with allocation of hot Data
>> structure closer to the top of the stack.
>> 
>> Based on the above the code is generated with respect to load and
>> store with the correct offset of the stack allocated on the
>> decreasing order of hotness.
> You might want to look at this paper from an old gcc summit conference.  
> Basically they were trying to reorder stack slots to minimize offsets in 
> reg+d addressing for the SH port.  It should touch on a number of common 
> issues/goals.
> 
> 
> ftp://gcc.gnu.org/pub/gcc/summit/2003/Optimal%20Stack%20Slot%20Assignment.pdf
> 
> 
> I can't recall if they ever tried to submit that work for inclusion.

Ah, inverse-AMS so to say :)
It might be interesting to combine forward and inverse AMS.  In the current AMS 
GSoC work we're hitting some cases which need mem access reordering in order to 
pick cheaper address modes.  It's not there yet, but if it knows how to reorder 
mem accesses in the insn stream it could probably be extended to try reordering 
memory layout of variables.

Cheers,
Oleg 

Re: Does GCC generate LDRD/STRD (Register) forms?

2015-07-07 Thread Oleg Endo

On 07 Jul 2015, at 13:52, Bin.Cheng  wrote:

> On Tue, Jul 7, 2015 at 10:05 AM, Anmol Paralkar (anmparal)
>  wrote:
>> Hello,
>> 
>> Does GCC generate LDRD/STRD (Register) forms [A8.8.74/A8.8.211 per ARMv7-A
>> & ARMv7-R ARM]?
>> 
>> Based on various attempts to write code to get GCC to generate a sample
>> form, and subsequently inspecting the code I see in
>> config/arm/arm.c/output_move_double () & arm.md [GCC 4.9.2], I think that
>> these register based forms of LDRD/STRD are
>> not generated, but I thought it might be a good idea to ask on the list,
>> just in case.
> Register based LDRD is harder than immediate version.  ARM doesn't
> support [base + reg + offset] addressing mode, so address computation
> of the second memory reference is scattered both in and out of memory
> reference.  To identify such opportunities, one needs to trace
> the registers in the address expression of the memory access
> instruction and do some kind of value computation and re-association.

Basically, this is what we're trying to do with AMS.  For each mem access it 
tries to trace the reg values and figure out the effective address expression.  
For now we've limited it to the form 'base_reg + index_reg*scale + 
const_displacement'.  Then we try to see how to fit the address expressions to 
the available address modes.
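
Spelled out as plain C (a sketch with hypothetical names, just to make the canonical form concrete):

```c
#include <stdint.h>

/* The canonical form AMS reduces each memory access to:
     address = base_reg + index_reg * scale + const_displacement  */
struct eff_addr
{
    uintptr_t base;   /* value of base_reg */
    uintptr_t index;  /* value of index_reg, 0 if absent */
    int scale;        /* 1, 2, 4, ... */
    long disp;        /* constant displacement */
};

/* Effective address the access actually touches. */
static uintptr_t eff_addr_value(struct eff_addr a)
{
    return a.base + a.index * (uintptr_t)a.scale + (uintptr_t)a.disp;
}
```

Once every access is in this shape, fitting it to the target's address modes becomes a matching/cost problem over (base, index, scale, disp).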

It's still work in progress but already shows some improvements.
A classic SH4 example:

float fun (float* x)
{
  return x[0] + x[1] + x[2] + x[3];
}

no AMS:
	mov     r4,r1
	add     #4,r1
	fmov.s  @r4,fr0
	fmov.s  @r1,fr1
	mov     r4,r1
	add     #8,r1
	fadd    fr1,fr0
	fmov.s  @r1,fr1
	add     #12,r4
	fadd    fr1,fr0
	fmov.s  @r4,fr1
	rts
	fadd    fr1,fr0

AMS:
	fmov.s  @r4+,fr0
	fmov.s  @r4+,fr1
	fadd    fr1,fr0
	fmov.s  @r4+,fr1
	fadd    fr1,fr0
	fmov.s  @r4,fr1
	rts
	fadd    fr1,fr0

If I understand correctly, ARM's LDRD/STRD are similar to SH's FPU 2x32 pair 
loads/stores.  It needs the mem access insns of adjacent addresses to be 
adjacent in the insn stream.  We'll try to do some mem access reordering in 
AMS, mainly to improve post/pre inc/dec address mode utilization.  Afterwards, 
adjacent mem accesses can be fused together in a separate RTL pass or AMS 
sub-pass to avoid re-discovering mem access sequence information, which AMS 
already has.

Cheers,
Oleg

Deprecate SH5/SH64

2015-08-18 Thread Oleg Endo
Hi all,

Kaz and I have been discussing the SH5/SH64 status, which is part of the SH 
port, every now and then.  To our knowledge, there is no real hardware 
available as of today and we don't think there are any real users for a 
SH5/SH64 toolchain out there.  Moreover, the SH5/SH64 parts of the SH port 
haven't been touched by anybody for a long time.  The only exception is 
occasional ad-hoc fixes for bug reports from people who build GCC for every 
architecture that is listed in the Linux kernel.  However, we don't actually 
know whether code compiled for SH5/SH64 still runs at an acceptable level since 
nobody has been doing any testing for that architecture for a while now.

If there are no objections, we would like to deprecate SH5/SH64 support as of 
GCC 6.

Initially this would include an announcement on the changes page and the 
removal of any documentation related to SH5/SH64.  After GCC 6 we might start 
removing configure options and the respective code paths in the target.

Cheers,
Oleg

Re: Deprecate SH5/SH64

2015-09-20 Thread Oleg Endo
On Tue, 2015-08-18 at 12:41 -0600, Jeff Law wrote:
> On 08/18/2015 11:11 AM, David Edelsohn wrote:
> > On Tue, Aug 18, 2015 at 1:00 PM, Oleg Endo 
> > wrote:
> >> Hi all,
> >>
> >> Kaz and I have been discussing the SH5/SH64 status, which is part
> >> of the SH port, every now and then.  To our knowledge, there is no
> >> real hardware available as of today and we don't think there are
> >> any real users for a SH5/SH64 toolchain out there.  Moreover, the
> >> SH5/SH64 parts of the SH port haven't been touched by anybody for a
> >> long time.  The only exception is occasional ad-hoc fixes for bug
> >> reports from people who build GCC for every architecture that is
> >> listed in the Linux kernel.  However, we don't actually know
> >> whether code compiled for SH5/SH64 still runs at an acceptable
> >> level since nobody has been doing any testing for that architecture
> >> for a while now.
> >>
> >> If there are no objections, we would like to deprecate SH5/SH64
> >> support as of GCC 6.
> >>
> >> Initially this would include an announcement on the changes page
> >> and the removal of any documentation related to SH5/SH64.  After
> >> GCC 6 we might start removing configure options and the respective
> >> code paths in the target.
> >
> > +1
> Works for me based on what I've heard independently about the sh5
> hardware situation.
> 
> 
> Frankly, I think we should be more aggressive about this kind of 
> port/variant pruning across the board.

I have committed the announcement for the GCC 6 page
https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01516.html

Cheers,
Oleg



Multiprecision Arithmetic Builtins

2015-09-20 Thread Oleg Endo
Hi all,

I was thinking of adding some SH specific builtin functions for the
addc, subc and negc instructions.  

Are there any plans to add clang's target independent multiprecision
arithmetic builtins (http://clang.llvm.org/docs/LanguageExtensions.html)
to GCC?

A while ago clang's checked arithmetic builtins were added to GCC.  So
just in case, I wanted to check before introducing target specific
builtins.

Cheers,
Oleg



Re: Multiprecision Arithmetic Builtins

2015-09-21 Thread Oleg Endo
On Mon, 2015-09-21 at 14:42 +0200, Florian Weimer wrote:
> On 09/21/2015 08:09 AM, Oleg Endo wrote:
> > Hi all,
> > 
> > I was thinking of adding some SH specific builtin functions for the
> > addc, subc and negc instructions.  
> > 
> > Are there any plans to add clang's target independent multiprecision
> > arithmetic builtins (http://clang.llvm.org/docs/LanguageExtensions.html)
> > to GCC?
> 
> Do you mean these?
> 
> <https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html>
> 
> Is there something else that is missing?

No, I don't mean __builtin_sadd_overflow and friends, but rather
__builtin_addc and friends.

For a complete list search for "Multiprecision Arithmetic Builtins" on
the page http://clang.llvm.org/docs/LanguageExtensions.html
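
For reference, clang's `__builtin_addc` takes a carry-in and produces a carry-out.  The same operation can be composed today from GCC's checked-arithmetic builtins (available since GCC 5); this is a sketch of that composition, not GCC's actual API:

```c
/* addc(a, b, carry_in, &carry_out): one limb of a multiprecision add,
   built from __builtin_add_overflow.  Mirrors the shape of clang's
   __builtin_addc, but the function itself is just an illustration. */
static unsigned addc(unsigned a, unsigned b, unsigned cin, unsigned *cout)
{
    unsigned s;
    unsigned c1 = __builtin_add_overflow(a, b, &s);
    unsigned c2 = __builtin_add_overflow(s, cin, &s);
    *cout = c1 | c2;   /* at most one of the two adds can carry out */
    return s;
}
```

A target-independent builtin would let backends like SH lower this chain directly to addc/subc instructions instead of relying on combine to spot the pattern.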
 

Cheers,
Oleg



Re: SH runtime switchable atomics - proposed design

2016-01-20 Thread Oleg Endo
On Tue, 2016-01-19 at 15:28 -0500, Rich Felker wrote:
> I've been working on the new version of runtime-selected SH atomics
> for musl, and I think what I've got might be appropriate for GCC's
> generated atomics too. I know Oleg was not very excited about doing
> this on the gcc side from a cost/benefit perspective

I am just not keen on making this the default atomic model for SH.
If you have a system built around this atomic model and want to add it
to GCC, please send in patches.  Just a few comments below...

> Inputs:
> - R0: Memory address to operate on
> - R1: Address of implementation function, loaded from a global
> - R2: Comparison value
> - R3: Value to set on success
> 
> Outputs:
> - R3: Old value read, ==R2 iff cas succeeded.

> Preserved: R0, R2.
> 
> Clobbered: R1, PR, T.

The T bit is obviously the result of the cas operation.  So you could
use it as an output directly instead of the implicit R3 == R2
condition.
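
The proposed calling convention, expressed as plain C semantics (a sketch; the register assignments in the comments follow the ABI described above):

```c
/* Semantics of the runtime-selected cas call:
     R0 = addr (preserved), R2 = expected (preserved),
     R3 = new value on entry, old value on return.
   Success iff return value == expected; per the comment above, the T
   bit could alternatively carry the success flag directly. */
static unsigned sh_cas(volatile unsigned *addr, unsigned expected,
                       unsigned newval)
{
    unsigned old = *addr;
    if (old == expected)
        *addr = newval;
    return old;
}
```
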

> 
> This call (performed from __asm__ for musl, but gcc would do it as SH
> "SFUNC") is highly compact/convenient for inlining because it avoids
> clobbering any of the argument registers that are likely to already
> be
> in use by the caller, and it preserves the important values that are
> likely to be reused after the cas operation.
> 
> For J2 and future J4, the function pointer just points to:
> 
>   rts
>cas.l r2,r3,@r0
> 

> and the only costs vs an inline cas.l are loading the address of the
> function (done in the caller; involves GOT access) and clobbering R1
> and PR.
> 
> This is still a draft design and the version in musl is subject to
> change at any time since it's not a public API/ABI, but I think it
> could turn into something useful to have on the gcc side with a
> -matomic-model=libfunc option or similar. Other ABI considerations
> for
> gcc use would be where to store the function pointer and how to
> initialize it. To be reasonably efficient with FDPIC the caller needs
> to be responsible for loading the function pointer (and it needs to
> always point to code, not a function descriptor) so that the callee
> does not need a GOT pointer passed in.

Obviously the ABI has been constructed around the J-core's cas.l
instruction.  Do you have plans to add other atomic operations (like
arithmetic)?  If not, then I'd suggest to name the atomic model
"libfunc-musl-cas".

Cheers,
Oleg


Re: SH runtime switchable atomics - proposed design

2016-01-21 Thread Oleg Endo
On Wed, 2016-01-20 at 20:22 -0500, Rich Felker wrote:
> On Thu, Jan 21, 2016 at 08:08:18AM +0900, Oleg Endo wrote:
> > On Tue, 2016-01-19 at 15:28 -0500, Rich Felker wrote:
> > > I've been working on the new version of runtime-selected SH
> > > atomics
> > > for musl, and I think what I've got might be appropriate for
> > > GCC's
> > > generated atomics too. I know Oleg was not very excited about
> > > doing
> > > this on the gcc side from a cost/benefit perspective
> > 
> > I am just not keen on making this the default atomic model for SH.
> > If you have a system built around this atomic model and want to add
> > it
> > to GCC, please send in patches.  Just a few comments below...
> 
> OK, thanks for clarifying. I don't have a patch yet but I might do
> one
> later. Sato-san's work on adding direct cas.l support showed me how
> this part of the gcc code seems to work, so it shouldn't be too hard
> to hook it up, but there are ABI design considerations still if we
> decide to go this way.
> 
> > > Inputs:
> > > - R0: Memory address to operate on
> > > - R1: Address of implementation function, loaded from a global
> > > - R2: Comparison value
> > > - R3: Value to set on success
> > > 
> > > Outputs:
> > > - R3: Old value read, ==R2 iff cas succeeded.
> > 
> > > Preserved: R0, R2.
> > > 
> > > Clobbered: R1, PR, T.
> > 
> > The T bit is obviously the result of the cas operation.  So you
> > could
> > use it as an output directly instead of the implicit R3 == R2
> > condition.
> 
> I didn't want to impose a requirement that all backends leave the
> result in the T bit. At the C source level, I think most software
> uses
> old==expected as the test for success; this is the API
> __sync_val_compare_and_swap provides, and what people used to x86
> would naturally do anyway.
> 
> > > This call (performed from __asm__ for musl, but gcc would do it
> > > as SH
> > > "SFUNC") is highly compact/convenient for inlining because it
> > > avoids
> > > clobbering any of the argument registers that are likely to
> > > already
> > > be
> > > in use by the caller, and it preserves the important values that
> > > are
> > > likely to be reused after the cas operation.
> > > 
> > > For J2 and future J4, the function pointer just points to:
> > > 
> > >   rts
> > >cas.l r2,r3,@r0
> > > 
> > 
> > > and the only costs vs an inline cas.l are loading the address of
> > > the
> > > function (done in the caller; involves GOT access) and clobbering
> > > R1
> > > and PR.
> > > 
> > > This is still a draft design and the version in musl is subject
> > > to
> > > change at any time since it's not a public API/ABI, but I think
> > > it
> > > could turn into something useful to have on the gcc side with a
> > > -matomic-model=libfunc option or similar. Other ABI
> > > considerations
> > > for
> > > gcc use would be where to store the function pointer and how to
> > > initialize it. To be reasonably efficient with FDPIC the caller
> > > needs
> > > to be responsible for loading the function pointer (and it needs
> > > to
> > > always point to code, not a function descriptor) so that the
> > > callee
> > > does not need a GOT pointer passed in.
> > 
> > Obviously the ABI has been constructed around the J-core's cas.l
> > instruction.
> 
> Yes, but that was a choice I made after a first draft that was no
> more
> optimal for the other backends and less optimal for J-core. And the
> only real choices that were based on the instruction's properties
> were
> using r0 for the address input and swapping the old value into r3
> rather than producing it in a different register. Other than these
> minor details ABI was guided more by avoiding clobbers/reloads of
> potentially valuable data in the caller.
> 
> One possible change I just thought of: with one extra instruction in
> the J-core version we could have the result come out in r1 and
> preserve r3. Similar changes to the other versions are probably easy.
> 
> > Do you have plans to add other atomic operations (like
> > arithmetic)?
> 
> No, at least not in musl. From musl's perspective cas is the main one
> that's used anyway. But even in general I don't think there's a
> significant advantage to do

Re: Undefined C++ Atomic Symbol on sh-rtems

2016-04-16 Thread Oleg Endo
Hi,

On Sat, 2016-04-16 at 18:58 -0500, Joel Sherrill wrote:

> I am hoping the solution to this is obvious to someone
> more familiar with the C++ libraries. Recently the
> sh4 BSP for RTEMS began to have undefined symbols
> like this when linking a C++ test:
> 
> /data/home/joel/rtems-4.11-work/tools/4.12/bin/../lib/gcc/sh
> -rtems4.12/6.0.0/ml/m4/libstdc++.a(cxx11-shim_facets.o): In function
> `ZNKSt6locale5facet11_M_sso_shimEPKNS_2idE':
> /data/home/joel/rtems-4.11-work/rtems-source-builder/rtems/build/sh
> -rtems4.12-gcc-6-20160327-newlib-2.4.0-x86_64-linux-gnu-1/build/sh
> -rtems4.12/ml/m4/libstdc++
> -v3/include/bits/locale_facets_nonio.h:1065: undefined reference to
> `__gnu_cxx::__atomic_add(int volatile*, int)'
> 
> Is this present for sh-elf? Or is there some magic
> bit missing in the RTEMS configuration stanzas?

The reason for the above error is that _GLIBCXX_ATOMIC_BUILTINS is not
set, because the atomic model is not set at configure time.
Normally libstdc++ would use the atomic builtin functions, but if they
are not enabled during configure, _GLIBCXX_ATOMIC_BUILTINS will not be
set.

On SH there are different "atomic models" to choose from; see also the
-matomic-model= SH target option.  Unfortunately, we don't have a way
to set the default model during the GCC configure phase.  I'm planning
to add this facility to GCC 7, but it should be straightforward to
backport if needed.
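
Until such a configure option exists, the model has to be selected per compilation (an illustrative command line; the model name is one of the values documented for the SH -matomic-model= option):

```shell
# No configure-time default yet: pass the atomic model explicitly when
# compiling, e.g. the gUSA-based software model.
sh-elf-gcc -m4 -matomic-model=soft-gusa -O2 -c app.c
```
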

For sh4-linux and sh*-linux we currently have some hardcoded atomic
model default settings in gcc/config/sh/linux.h.  The same could be
done for rtems I guess, but I'd rather go with the configure option
above.

Cheers,
Oleg


Re: Undefined C++ Atomic Symbol on sh-rtems

2016-04-18 Thread Oleg Endo
On Sun, 2016-04-17 at 13:33 -0500, Joel Sherrill wrote:

> Thanks for the quick and thorough reply.
> 
> This doesn't happen with GCC 4.9 which we are using on our newest
> release branch. With any luck your work will be in gcc 7 before we
> make another release branch. 

That's probably because of this commit:
https://gcc.gnu.org/viewcvs?rev=220094&root=gcc&view=rev


> 
> Is there a ticket for your plan I should add myself to to track this?

No, for that particular issue there's no ticket.  I can put you in CC
when I send around/commit the patch, if that helps.

Cheers,
Oleg


Re: Undefined C++ Atomic Symbol on sh-rtems

2016-04-18 Thread Oleg Endo
On Mon, 2016-04-18 at 14:15 -0500, Joel Sherrill wrote:

> Since I stated that, we decided to use the 6.1 branch for a while.
> So I decided to look at config/sh/linux.h and see what it was doing.
> Copying it on the 6.1 branch seemed like an option. But it only
> appears to address SH3 and SH1 for atomics. What about an implicit
> atomic for SH2 or SH4?

TARGET_SH3 means SH3, SH4 and SH4A.
TARGET_SH1 means SH1, SH2, SH2A.


> Please do. I may just leave this as a breakage and let you fix it.
> AFAIK no one is really complaining that it is broken on our
> development master.
> 

Ah good, no hurry then :)
Anyway, this issue's been on the pile for a while now.  I'll do
something about it.

Cheers,
Oleg


Re: Bug maintenance

2016-05-08 Thread Oleg Endo
On Sun, 2016-05-08 at 15:03 -0700, David Wohlferd wrote:
> On 4/28/2016 9:41 AM, Martin Sebor wrote:
> > On 04/28/2016 01:35 AM, David Wohlferd wrote:
> > > As part of the work I've done on inline asm, I've been looking
> > > thru the
> > > bugs for it.  There appear to be a number that have been fixed or
> > > overtaken by events over the years, but the bug is still open.
> > > 
> > > Is closing some of these old bugs of any value?
> > > 
> > > If so, how do I pursue this?
> > 
> > There are nearly 10,000 still unresolved bugs in Bugzilla, almost
> > half of which are New, and a third Unconfirmed, so I'm sure any
> > effort to help reduce the number is of value and appreciated.
> 
> That's exactly what prompted me to ask.  There's such a vast number 
> of them, it's hard to believe that 9 year old bugs are still of
> interest.

Sometimes there is.  Before randomly closing any bugs because they are
too old, one should at least have a look at them and see if they're
still an issue etc.  Often things would've been fixed along the way,
but not all of them.

Cheers,
Oleg


Re: Machine constraints list

2016-05-08 Thread Oleg Endo
On Sun, 2016-05-08 at 15:27 -0700, David Wohlferd wrote:
> Looking at the v6 release criteria 
> (https://gcc.gnu.org/gcc-6/criteria.html) there are about a dozen 
> supported platforms.
> 
> Looking at the Machine Constraints docs 
> (https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html), there
> are 
> 34 architectures listed.  That's a lot of entries to scroll thru.  If
> these architectures aren't supported anymore, is it time to drop some 
> of these from this page?

It's not that these architectures are not supported anymore.  They're
just neither primary nor secondary but tertiary platforms instead.  On
the release criteria page it says:

   "There are no release criteria for tertiary platforms."

BTW, the list of supported architectures is not the "machine
constraints" page, but rather this one:

  https://gcc.gnu.org/backends.html

Cheers,
Oleg


Re: Two suggestions for gcc C compiler to extend C language (by WD Smith)

2016-07-26 Thread Oleg Endo
On Tue, 2016-07-26 at 10:37 -0400, Warren D Smith wrote:

> Also, I know on some machines to access a byte you have to get a word
> (larger than 8 bits)
> from memory, do shifts and masks.  So clearly you already do that
> inside gcc.
> It therefore is trivial for you to do uint4_t also, because it would
> be that exact same code you already have, just change some numbers.

You should try to do that yourself once.  Get the GCC source code and
just "change some numbers here and there" and see how far it goes...
Build instructions can be found here: https://gcc.gnu.org/install/

What you are suggesting looks like a generic way of "doing bitfields",
essentially making every integer a bitfield, where types like int8_t
and int16_t become the exceptions that happen to align with what the
hardware implements?

So instead of ..

struct bleh
{
  int a : 6;
  int b : 3;
};

.. we get ...

struct bleh
{
  int6_t a;
  int3_t b;
};

and then sizeof (bleh) = ???

Surely, the compiler can be taught that kind of stuff, it's just a
question of effort.

Alternatively, you can implement an N-bit drop-in integer type in C++11
yourself with container specializations of std::array and std::vector
to get tightly packed e.g. int7_t (still with some padding bits).  This
will allow you to evaluate the usefulness and effectiveness of your
proposed extensions in some real-world applications as a start.
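
As a baseline for such an evaluation, the bitfield spelling above already gives tight packing within one allocation unit (a small sketch; exact sizes and layout are ABI-dependent):

```c
/* The proposal's int6_t/int3_t members, spelled as C bitfields today.
   On typical ABIs both fields share a single int-sized allocation
   unit, with the remaining bits as padding. */
struct bleh_bits
{
    int a : 6;   /* value range -32..31 */
    int b : 3;   /* value range -4..3   */
};
```
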

Cheers,
Oleg


Re: Converting to LRA (calling all maintainers)

2016-09-25 Thread Oleg Endo
On Fri, 2016-09-16 at 16:25 -0500, Segher Boessenkool wrote:
> On Fri, Sep 16, 2016 at 02:53:16PM -0600, Jeff Law wrote:
> > 
> > Under traps for the unwary -- LRA requires the target to not use
> > the old 
> > cc0 condition code handling...
> I added this now, thanks Jeff.
> 
> > 
> > ANd yes, I see this as a way to deprecating those cc0 targets like
> > the 
> > m68k :-)
> Would be a shame to see m68k go.  There still is time...

Indeed.  68K is a perfect candidate for addressing mode optimization
(AMS).  It was actually one of the next targets on the list after SH.

What's with all the hurry to kill the old reload?  Unless LRA can be
brought "up to speed" for all the other targets and vice versa, ripping
out old reload and all the dependent targets would be a setback for
GCC, IMHO.  It would lose a bunch of targets.

Cheers,
Oleg


Re: sh-*-* Fails to Compile on FreeBSD

2014-05-01 Thread Oleg Endo

On 01 May 2014, at 22:08, Joel Sherrill  wrote:

> Hi
> 
> gcc-4.8.2 targeting sh-*-* fails to compile on
> FreeBSD 10 which is using clang. I am hoping someone
> has some ideas about these.

Yes, I've noticed and mentioned this already a while ago:
http://gcc.gnu.org/ml/gcc/2013-12/msg00036.html


> In file included from ../../gcc-4.8.2/gcc/config/sh/sh.c:63:
> In file included from /usr/include/c++/v1/sstream:174:
> In file included from /usr/include/c++/v1/ostream:131:
> In file included from /usr/include/c++/v1/ios:216:
> In file included from /usr/include/c++/v1/__locale:15:
> In file included from /usr/include/c++/v1/string:438:
> In file included from /usr/include/c++/v1/cwchar:107:
> In file included from /usr/include/c++/v1/cwctype:54:
> /usr/include/c++/v1/cctype:51:72: error: use of undeclared identifier
> 'do_not_use_isalnum_with_safe_ctype'
> inline _LIBCPP_INLINE_VISIBILITY int __libcpp_isalnum(int __c) {return
> isalnum(__c);}
> 
> sh.c line 63 is this:
> 
> #include 
> #include 
> #include 
> 
> It is the only file in gcc/config/* to include sstream.  Has some
> update sweep for C++ transition missed this file?

Could you please try moving the std includes above any other (gcc) includes and 
see if it fixes the issue?  AFAIR it did it for me.
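
For context, the clash comes from GCC's system.h pulling in safe-ctype.h, which poisons the libc ctype functions (hence the `do_not_use_isalnum_with_safe_ctype` identifier) that libc++'s <cctype> then tries to call.  Hoisting the standard headers sidesteps that; a sketch of the resulting include order (not the exact sh.c lines):

```cpp
/* Standard C++ headers first, so libc++'s <cctype> is processed before
   safe-ctype.h redefines isalnum() and friends. */
#include <sstream>
#include <vector>

#include "config.h"
#include "system.h"     /* pulls in safe-ctype.h */
#include "coretypes.h"
```
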

Cheers,
Oleg

Re: sh-*-* Fails to Compile on FreeBSD

2014-05-01 Thread Oleg Endo


On May 1, 2014, at 11:17 PM, Joel Sherrill  wrote:

> 
> On 5/1/2014 3:29 PM, Oleg Endo wrote:
>> On 01 May 2014, at 22:08, Joel Sherrill  wrote:
>> 
>>> Hi
>>> 
>>> gcc-4.8.2 targeting sh-*-* fails to compile on
>>> FreeBSD 10 which is using clang. I am hoping someone
>>> has some ideas about these.
>> Yes, I've noticed and mentioned this already a while ago:
>> http://gcc.gnu.org/ml/gcc/2013-12/msg00036.html
>> 
>> 
>>> In file included from ../../gcc-4.8.2/gcc/config/sh/sh.c:63:
>>> In file included from /usr/include/c++/v1/sstream:174:
>>> In file included from /usr/include/c++/v1/ostream:131:
>>> In file included from /usr/include/c++/v1/ios:216:
>>> In file included from /usr/include/c++/v1/__locale:15:
>>> In file included from /usr/include/c++/v1/string:438:
>>> In file included from /usr/include/c++/v1/cwchar:107:
>>> In file included from /usr/include/c++/v1/cwctype:54:
>>> /usr/include/c++/v1/cctype:51:72: error: use of undeclared identifier
>>> 'do_not_use_isalnum_with_safe_ctype'
>>> inline _LIBCPP_INLINE_VISIBILITY int __libcpp_isalnum(int __c) {return
>>> isalnum(__c);}
>>> 
>>> sh.c line 63 is this:
>>> 
>>> #include 
>>> #include 
>>> #include 
>>> 
>>> It is the only file in gcc/config/* to include sstream.  Has some
>>> update sweep for C++ transition missed this file?
>> Could you please try moving the std includes above any other (gcc) includes 
>> and see if it fixes the issue?  AFAIR it did it for me.
> This seems to fix it. I am not sure why sh.c is the only file in
> gcc/config which includes sstream though.

Because I added code to sh.c that uses stuff from sstream after the switch to 
C++.

> Is this a violation
> of some new rule?

Not that I'm aware of.

> Is there a PR for this?
> 
> If not, I probably should file one and get the patch pushed
> into 4.8 as well as 4.9 and the head if they need it.
> 

If you insist on having a PR for it, please feel free.  In any case, I'll 
commit the 'fix' to trunk and the branches by tomorrow.

Cheers,
Oleg

Re: iq2000-elf: wide-int fallout (was: we are starting the wide int merge)

2014-05-08 Thread Oleg Endo
On Fri, 2014-05-09 at 00:48 +0200, Jan-Benedict Glaw wrote:
> [...]
> 
> Just found this for iq2000:
> 
> g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
> -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing 
> -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual 
> -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings 
> -fno-common  -DHAVE_CONFIG_H -I. -I. -I/home/jbglaw/repos/gcc/gcc 
> -I/home/jbglaw/repos/gcc/gcc/. -I/home/jbglaw/repos/gcc/gcc/../include 
> -I/home/jbglaw/repos/gcc/gcc/../libcpp/include 
> -I/opt/cfarm/gmp-latest/include -I/opt/cfarm/mpfr-latest/include 
> -I/opt/cfarm/mpc-latest/include  -I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
> -I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
> -I/home/jbglaw/repos/gcc/gcc/../libbacktrace -o wide-int.o -MT wide-int.o 
> -MMD -MP -MF ./.deps/wide-int.TPo /home/jbglaw/repos/gcc/gcc/wide-int.cc
> /home/jbglaw/repos/gcc/gcc/wide-int.cc:37:56: error: unable to emulate 'TI'
>  typedef unsigned int UTItype __attribute__ ((mode (TI)));
> ^
> make[1]: *** [wide-int.o] Error 1

I also just ran into that.  Seems to be a host issue.  This one seems to
fix it: http://gcc.gnu.org/ml/gcc-patches/2014-05/msg00527.html

Another wide-int merge fallout I ran into:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61120

Cheers,
Oleg



Forward declaration style

2014-06-14 Thread Oleg Endo
Hi all,

I was always wondering why this is the way it is.  E.g. consider
gcc/output.h:

/* Assemble the integer constant X into an object of SIZE bytes.  ALIGN is
   the alignment of the integer in bits.  Return 1 if we were able to output
   the constant, otherwise 0.  If FORCE is nonzero the constant must
   be outputable. */
extern bool assemble_integer (rtx, unsigned, unsigned, int);

Here the function abstract mentions some argument names which are absent
in the declaration.  When reading/browsing/searching the GCC code base
this is really not helpful.  One is forced to go to the implementation
of the function and read it.  On top of that the actual function
implementations are often in totally different places (e.g. there is no
such thing as output.c).
In lots of other cases, the declarations in the headers don't have any
documentation at all.  In the assemble_integer case it's duplicated in
output.h and varasm.c.

How about adding the argument names to the declarations in header files
and move function abstracts from implementation files to the header
files?  I think this would make it easier to lookup stuff in the code.
What do you think?

Cheers,
Oleg




Re: [GSoC] Question about unit tests

2014-06-27 Thread Oleg Endo
On Wed, 2014-06-25 at 20:25 +0600, Roman Gareev wrote:
> Dear gcc contributors,
> 
> could you please answer a few questions about unit tests? Is it
> possible to use them in gcc? Or maybe there is some analogue? I would
> be very grateful for your comments.

In GCC we have a DejaGnu based test suite.  It's not a framework like
CppUnit but you can write unit tests in it, too.

For more info see
https://gcc.gnu.org/install/test.html
http://www.delorie.com/gnu/docs/dejagnu/dejagnu_6.html

and the various test case examples in the source tree under
gcc/testsuite.

Cheers,
Oleg



Re: gcc.gnu.org/simtest-howto.html (was: Question for ARM person re asm_fprintf)

2014-08-03 Thread Oleg Endo


On Aug 4, 2014, at 6:00 AM, Gerald Pfeifer  wrote:

> On Wed, 23 Jul 2014, Hans-Peter Nilsson wrote:
>> The page  is
>> unfortunately out of date (e.g. binutils+sim now lives in the
>> same git repo) but it gives you the idea.
> 
> Sooo, any volunteer to update this page?  Doesn't have to be
> perfect, even incremental improvements help.
> 
> Or is it bad enough that we should rather remove this unless/
> until someone steps up?

Since I'm basically doing all the testing in sh-sim, I could try to update that 
page.

Cheers,
Oleg

Re: ViewVC is broken on your web site

2014-08-06 Thread Oleg Endo
On Wed, 2014-08-06 at 21:34 +0200, Paolo Carlini wrote:
> Hi,
> 
> On 08/06/2014 09:19 PM, David Gero wrote:
> > Wow. What an amazingly unintuitive widget. I looked all over the page 
> > for a "Next 25 files" button. A "Go To" button that doesn't talk about 
> > next 25 files meant nothing. ViewVC used to display all the files. 
> > This is a giant leap backward in the User Interface.
> AFAIK the tool is neither part of GCC nor part of the GNU Project, 
> GCC is simply using it. Thus my guess would be that somebody installed 
> an updated version which, together with a number of improvements, has 
> also this questionable change. Personally, I don't have a strong 
> opinion, but since you guys have one, I would recommend getting in 
> contact with the authors of the tool, contribute ideas, maybe code too.

It seems this behavior can be configured with 'use_pagesize = 0'.
See also: http://oss.segetech.com/bugz-svn-wiki/viewvc.conf 

This page limit thing is really ... argh ... 

Cheers,
Oleg





Re: volatile access optimization (C++ / x86_64)

2014-12-27 Thread Oleg Endo
On Sat, 2014-12-27 at 09:51 -0800, H.J. Lu wrote:
> On Sat, Dec 27, 2014 at 9:45 AM, Andrew Haley  wrote:
> > On 27/12/14 16:02, paul_kon...@dell.com wrote:
> >>
> >> In the case of volatile variables, the external interface in
> >> question is the one at the point where that address is implemented —
> >> a memory cell, or memory mapped I/O device on a bus.  So the
> >> required behavior is that load and store operations (read and write
> >> transactions at that interface) occur as written.
> >
> > I believe this is incorrect.  For accesses to reach memory in program
> > order on most architectures would require volatile memory references
> > to emit memory barriers, and the C committee decided not to require
> > that.
> >
> >> If a processor has add instructions that support memory references
> >> (as in x86 and vax, but not mips), such an instruction will perform
> >> a read cycle followed by a write cycle.  So as seen at the critical
> >> interface, the behavior is the same as if you were to do an explicit
> >> load, register add, store sequence.  Therefore the use of a single
> >> add-to-memory is a valid implementation.
> >
> > I agree.
> >
> 
> Can we add a target hook so that combine will allow a single
> add-to-memory instruction for volatile memory reference on
> architectures like x86?

Just don't use 'general_operand' in the predicates.  On SH I had to do
that so that combine would merge sign extending memory loads (expanded
as QI/HImode loads) and explicit sign extensions.

Cheers,
Oleg



Re: Rename C files to .c in GCC source

2015-02-08 Thread Oleg Endo
On Sat, 2015-02-07 at 23:17 +, Jonny Grant wrote:
> On 03/02/15 23:20, Andreas Schwab wrote:
> > Jonny Grant  writes:
> >
> >> How many minutes labor is this task?
> >
> > What does it fix?
> 
> Consistency. Less important if these files are only compiled after GCC 
> is available, to use as a testsuite. Although I understood from other 
> replies that other files needed hacks to get them to compile.
> 
> However, if this suggestion isn't supported, there's no benefit of 
> discussing further.

This whole filename stuff has been discussed already a while ago:
https://gcc.gnu.org/ml/gcc/2012-08/msg00310.html

Cheers,
Oleg




Re: Unrolling factor heuristics for Loop Unrolling

2015-02-12 Thread Oleg Endo
On Thu, 2015-02-12 at 10:09 +, Ajit Kumar Agarwal wrote:
> Hello All:
> 
> Loop unrolling without good unrolling-factor heuristics becomes a 
> performance bottleneck.  Unrolling-factor heuristics based on the minimum 
> initiation interval (MII) are quite useful with respect to better ILP.  The 
> MII, based on recurrence and resource calculation on the data dependency 
> graph, along with the register pressure, can be used to derive the 
> unrolling-factor heuristics.  To achieve better ILP with a given schedule, 
> loop unrolling and scheduling are interdependent; this has been widely 
> used in the software pipelining literature, along with the more granular 
> list and trace scheduling.
> 
> The recurrence calculation based on loop-carried dependencies, and the 
> resource allocation based on simultaneous access of resources using a 
> reservation table, give good heuristics for calculating the unrolling 
> factor.  This is taken care of in the MII calculation.
> 
> Along with the MII, the register pressure should also be considered in 
> the calculation of the unrolling-factor heuristics.
> 
> This enables better unrolling-factor heuristics.  The main advantage of 
> the above heuristics is that they can be implemented at the code 
> generation level, whereas currently loop unrolling is done much earlier.  
> Keeping the current unrolling at the loop optimizer level, the above 
> heuristics can then be used to unroll again at the code generation level, 
> with the accurate register pressure calculation as done in the register 
> allocator.  This looks like a feasible solution, which I am going to 
> propose for the above unrolling heuristics.
> 
> This enables loop unrolling both at the optimizer level and at the code 
> generation level.  This two-level unrolling is quite useful and will 
> overcome the shortcomings of unrolling at the optimizer level only.
> 
> The SPEC benchmarks are the better candidates for the above heuristics 
> instead of Mibench and EEMBC.

Not taking register pressure into account when unrolling (and doing
other optimizations/choices) is an old problem.  See also:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20969

Cheers,
Oleg



Re: How to update reg_dead notes

2015-02-24 Thread Oleg Endo
On Tue, 2015-02-24 at 16:59 +0100, Georg-Johann Lay wrote:

It doesn't really answer your question, but just as a side note, the
following ...

> +  struct register_pass_info insert_before_bbro =
> +{
> +  notes_bbro_pass,  /* pass */
> +  "bbro",   /* reference_pass_name */
> +  1,/* ref_pass_instance_number */
> +  PASS_POS_INSERT_BEFORE/* position */
> +};
> +
> +  struct register_pass_info insert_before_compgotos =
> +{
> +  notes_compgotos_pass, /* pass */
> +  "compgotos",  /* reference_pass_name */
> +  1,/* ref_pass_instance_number */
> +  PASS_POS_INSERT_BEFORE/* position */
> +};
> +
> +  struct register_pass_info insert_before_shorten =
> +{
> +  notes_shorten_pass,   /* pass */
> +  "shorten",/* reference_pass_name */
> +  1,/* ref_pass_instance_number */
> +  PASS_POS_INSERT_BEFORE/* position */
> +};
> +
> +  struct register_pass_info insert_before_final =
> +{
> +  notes_final_pass, /* pass */
> +  "final",  /* reference_pass_name */
> +  1,/* ref_pass_instance_number */
> +  PASS_POS_INSERT_BEFORE/* position */
> +};
> +
> +  register_pass (&insert_before_bbro);
> +  register_pass (&insert_before_compgotos);
> +  register_pass (&insert_before_shorten);
> +  register_pass (&insert_before_final);

... can be done in 4 lines of code, without all the struct stuff to pass
function arguments.  See also "register_pass" calls in config/sh/sh.c.

Cheers,
Oleg



Re: How to update reg_dead notes

2015-02-24 Thread Oleg Endo
On Tue, 2015-02-24 at 19:23 +0100, Georg-Johann Lay wrote:

> 
> The latest pass which runs before the crash is .split5, i.e. 
> recog.c::pass_split_for_shorten_branches which executes 
> split_all_insns_noflow 
> which in turn reads:
> 
> /* Same as split_all_insns, but do not expect CFG to be available.
> Used by machine dependent reorg passes.  */
> 
> unsigned int
> split_all_insns_noflow (void)
> { ...
> 
> Does this mean CFG is (or might be) messed up after this pass and this is the 
> reason for why df crashes because df needs correct CFG?

Yes, the CFG is only valid up to some point.  E.g. on SH it stops being
valid after the machine reorg pass at the latest (PR 59189).  Doing
anything CFG related after split5 might not be a good idea.

Cheers,
Oleg



Re: GSoc-2015: Modular GCC (RFC on refactoring)

2015-03-18 Thread Oleg Endo
On Tue, 2015-03-17 at 22:31 -0600, Jeff Law wrote:

> I'm not a big fan of keeping the FOR_EACH_blah style iterator and would 
> prefer to use real C++ iterators.  But it ought to give you some ideas 
> about how to start breaking these things out.

BTW, I tried to propose starting to do that (using C++ 'standard'
iteration concepts) a while ago:
https://gcc.gnu.org/ml/gcc-patches/2013-12/msg01129.html

Unfortunately the discussion didn't go anywhere.  Maybe the patch could
serve as a starting point for something/somebody.

Cheers,
Oleg



Re: [gsoc] Generic addressing mode selection

2015-03-22 Thread Oleg Endo
On Sun, 2015-03-22 at 21:21 +0100, Erik Varga wrote:
> Hi all,
> 
> I'm Erik Krisztián Varga, a 2nd year Electrical Engineering student,
> and I'd be interested in contributing to gcc as part of GSoC 2015.
> I'd like to work on adding an addressing mode selection pass to the
> RTL based on the ideas described in Eckstein et. al.'s paper on the
> subject [1].
> The basic idea is to reduce the problem to a Partitioned Boolean
> Quadratic Programming (PBQP) problem and to find a close-to-optimal
> solution with a heuristic algorithm. It should be possible to model a
> lot of addressing modes with the cost matrix approach described in the
> paper, so this would be a generic way to do AMS for different
> architectures.
> 
> Quite a few things would have to be done to implement the algorithm,
> like finding the correct models for the various addressing
> instructions in the supported architectures, or implementing an
> efficient representation of the cost vectors and matrices, but I think
> it should be possible in the scope of a Summer to have at least a
> working prototype ready.
> Would someone be interested in mentoring this project for GSoC? Is
> there anything similar currently being worked on? I think Oleg Endo
> has started implementing such a pass a while ago (mentioned in
> PR56590).
> 
> Best regards,
> Erik
> 
> [1] http://sydney.edu.au/engineering/it/~scholz/publications/cgo03.pdf

Very nice!  I did start doing some research in that area and started
writing down some ideas.  Unfortunately, whenever I started writing
code, something else came along etc.  I have accumulated a pile of
use/test cases, some papers on other approaches than PBQP and a rough
plan how I think it should be done.  Although my point of view is a bit
SH biased, I believe that once it's working on SH, other platforms will
benefit from it.  The problem is quite difficult, especially the
"generic" part.  The PBQP approach is indeed very tempting, but there
are a lot more things to it than just the solver.  To get good
improvements of the generated code, the optimization also has to be able
to reorder memory accesses and perform other transformations such as
converting pre-inc into post-inc modes in loops etc.  The scope would
need to be narrowed down a bit for a GSoC project, but if you want, we
could give it a try and I would step forward as a mentor.

Cheers,
Oleg



Re: [gsoc] Generic addressing mode selection

2015-03-23 Thread Oleg Endo
On Mon, 2015-03-23 at 20:10 +0100, Erik Varga wrote:
> On Sun, Mar 22, 2015 at 10:10 PM, Oleg Endo  wrote:
> > The PBQP approach is indeed very tempting, but there
> > are a lot more things to it than just the solver.  To get good
> > improvements of the generated code, the optimization also has to be able
> > to reorder memory accesses and perform other transformations such as
> > converting pre-inc into post-inc modes in loops etc.
> 
> I confess there are some optimizations that the PBQP approach doesn't
> take into account, like reordering the instructions. Could you
> elaborate a bit on the prec-inc to post-inc conversion? I might be
> missing something, but I think the PBQP algorithm should have no
> problem transforming the addressing mode to post-inc if that makes the
> overall cost less.

Yes, the PBQP approach tries to minimize the total cost of address
sequences based on the available candidate addressing modes for a
particular access.  If a (cheaper) post-inc mode can be used instead of
e.g. a reg+disp mode, it will most likely catch it.  What I was
referring to is transforming a pre-inc mode access into a post-inc mode
access inside a loop (if the target can do post-inc only).  The
transformation itself is quite straight forward -- just place one inc
before the loop and convert the pre-inc inside the loop into a post-inc
access:

unsigned int test (const char* s0)
{
  const char* s1 = s0;
  while (*++s1);
  return s1 - s0 - 1;
}

In this particular case, the transformation is already done by something
else (yet the auto-inc-dec pass fails to see the post-inc opportunity).  I'd
have to dig through my notes to see if there was another use case...

> I'd like to give this a try, I'm sure a lot could be achieved in a
> summer. Could you share how you planned to approach the problem?

As far as I can see there are two different problems.  One is providing
the necessary infrastructure/framework to be able to do various
optimizations on access sequences and addressing modes inside of a GCC
RTL pass.  The other problem is trying to find a good algorithmic
solution for the optimization itself.

Before any optimization can be done access sequences need to be
extracted from the insn list.  A simple way would be to just locate
all memory accesses and use their access expressions as-is.  But that
won't go very far.  The most important information is some sort of
base-register/constant value of an access sequence, which needs to be
found.
Once the access sequences are there, there are multiple ways how to
optimize them.
For example, one option could be to first try to maximize the
utilization of post/pre-inc/dec modes (assuming/knowing that they are
generally a good idea on that target).  Then try to see how to handle
non-linear / pseudo-random accesses where displacements are out of
range.  At least on SH, there are two options.  Either adjust the base
address to make the displacements fit or load the constant displacement
into a reg and use a reg+reg mode.  Which one is better depends on the
surrounding code.
Another option could be to try to come up with a way to create good
costs and access mode candidates for the (PBQP) solver.  There you need
to answer the question "what are the costs for this access, in this
sequence, with this surrounding code, approximated register
pressure,...?".  Again, some modes might become cheaper after applying
particular transformations (access reordering to create linear access
patterns for post/pre-inc/dec).  If such special transformations are
already done to improve access sequences, running a solver afterwards
might not result in any further improvement.

>  I'd
> also be interested in some of the papers you found (I haven't yet been
> able to find much on the subject apart from the PBQP method).

I'll send you a private message with those.  If anyone else is
interested in that stuff, please let me know.

Cheers,
Oleg



Re: [gsoc] Generic addressing mode selection

2015-03-26 Thread Oleg Endo
On Thu, 2015-03-26 at 09:43 -0600, Jeff Law wrote:
> On 03/26/2015 08:32 AM, Erik Varga wrote:
> > Hi all,
> >
> > I've submitted my proposal to the GSoC website, it can be found here: [1]
> > After hearing some ideas from Oleg, I decided to go with working on
> > detecting and optimizing a few specific memory access patterns instead
> > of implementing a PBQP solver.
> > Any suggestions or comments are welcome.
> > I read that it's necessary to have a copyright assignment filed with
> > the Free Software Foundation to be able to contribute larger amounts
> > of code to GCC. When would it be best to start applying for a
> > copyright assignment (e.g. sometime during the community bonding
> > period, or the coding period, or around now)?

> If you're looking at exploiting auto-inc addressing, others and myself 
> have speculated that something built around 
> straight-line-strength-reduction at the RTL level would be ideal for 
> exploiting that capability.
> 
> That may be more suitable for a GSOC project than tackling the entire 
> space of address mode selections.

As far as I understand the proposal, the goal is not to solve all AMS
problems, but rather to lay the foundation for doing these kinds of
optimizations and deal with a few assorted ones (most likely auto-mod
will be a candidate).  Thus, I think Erik's proposal sounds feasible,
although I'd expect some of the allocations/priorities in the schedule
to change during the project. But that's not something unusual to
happen.

Cheers,
Oleg



Re: pre_modify/post_modify with scaled register

2015-05-17 Thread Oleg Endo
On Sun, 2015-05-17 at 11:09 -0600, Jeff Law wrote:
> On 05/17/2015 10:21 AM, Jon Beniston wrote:
> > Hi,
> >
> > The gccint docs for pre_modify/post_modify say that the address modifier
> > must be one of three forms:
> >
> > (plus:m x z), (minus:m x z), or (plus:m x i), where z is an index register
> > and i is a constant.
> >
> > Why isn’t (plus:m x (mult:m z i)) supported, for architectures that support
> > scaling of the index register (E.g. ARM?)
> >
> >
> >
> > Compiling:
> >
> > int *f(int *p, int x, int z)
> > {
> >p[z] = x;
> >return p + z;
> > }
> >
> > For ARM results in:
> >
> >  str r1, [r0, r2, asl #2]
> >  add r0, r0, r2, asl #2
> >
> > Rather than just:
> >
> >  str r1, [r0, r2, asl #2]!
> >
> > Should this be improved by expanding what pre/post_modify supports, as
> > above, or perhaps a peephole optimisation?
> I don't think it was really considered when we added pre/post_modify a 
> while back -- IIRC it was primarily driven by whatever port Michael 
> Hayes was working on at the time plus the capabilities of PPC and HPPA 
> at the time.
> 
> We'd certainly welcome patches to support scaling in the pre/post_modify 
> addressing modes.

One of this year's Google Summer of Code GCC Projects will try to
address some of the deficits of address mode selection/utilization in
GCC:
https://www.google-melange.com/gsoc/project/details/google/gsoc2015/erikvarga/5693417237512192

Ideas and use cases are highly appreciated.

Cheers,
Oleg



Re: [RFC] Combine related fail of gcc.target/powerpc/ti_math1.c

2015-05-21 Thread Oleg Endo
On Thu, 2015-05-21 at 11:59 -0700, Richard Henderson wrote:
> On 05/21/2015 11:44 AM, Segher Boessenkool wrote:
> > On Thu, May 21, 2015 at 11:34:14AM -0700, Richard Henderson wrote:
> >> Actually, I believe that the way CA is modeled at the moment is dangerous.
> >> It's not a 64-bit value, but a 1-bit value.
> > 
> > It's a fixed register and it is only ever set to 0 or 1.  There are
> > more targets that do such things, and it is safe.
> 
> Old Cygnus proverb: Lie to the compiler and it will always bite you in the 
> end.

Just for the record, the same is being done on SH with the T bit.  It's
a fixed 1 bit hardreg, but declared and treated as SImode, because all
the other integer arithmetic is done primarily in SImode, too.  No
significant problems with that.

Cheers,
Oleg



Re: Parallelize the compilation using Threads

2019-02-15 Thread Oleg Endo
On Tue, 2019-02-12 at 15:12 +0100, Richard Biener wrote:
> On Mon, Feb 11, 2019 at 10:46 PM Giuliano Belinassi
>  wrote:
> > 
> > Hi,
> > 
> > I was just wondering what API should I use to spawn threads and
> > control
> > its flow. Should I use OpenMP, pthreads, or something else?
> > 
> > My point what if we break compatibility with something. If we use
> > OpenMP, I'm afraid that we will break compatibility with compilers
> > not
> > supporting it. On the other hand, If we use pthread, we will break
> > compatibility with non-POSIX systems (Windows).
> 
> I'm not sure we have a thread abstraction for the host - we do have
> one for the target via libgcc gthr.h though.  For prototyping I'd
> resort
> to this same interface and fixup the host != target case as needed.

Or maybe, in the year 2019, we could assume that most C++ compilers
which are used to compile GCC support C++11 and come with an adequate
<thread> implementation...  yeah, I know, sounds jacked :)

Cheers,
Oleg



Re: Missed optimization with const member

2017-07-05 Thread Oleg Endo
Hi,

On Wed, 2017-07-05 at 02:02 +0200, Geza Herman wrote:
> 
> Here's what happens: in callInitA(), an Object is put onto the stack 
> (which has a const member variable, initialized to 0). Then somefunction 
> is called (which is intentionally not defined). Then ~Object() is called, 
> which has an "if" with a not-immediately-obvious, but always false, 
> condition. Compiling with -O3, everything gets inlined.
> 
> My question is about the inlined ~Object(). As m_initType is always 0, 
> why does GCC not optimize the destructor away? GCC inserts code that 
> checks the value of m_initType.
> 
> Is it because such construct is rare in practice? Or is it hard to do an 
> optimization like that?

It's not safe to optimize it away because the compiler does not know
what "somefunction" does.  Theoretically, it could cast the "Object&"
to some subclass and overwrite the const variable.  "const" does not
mean that the memory location is read-only in some way.

For some more explanations about const, see e.g. here 
https://stackoverflow.com/questions/4486326/does-const-just-mean-read-only-or-something-more

You can try showing "somefunction" to the compiler (i.e. put it into
the same translation unit in your example, or build the whole thing
with LTO) and see what happens.

Cheers,
Oleg


Re: Missed optimization with const member

2017-07-05 Thread Oleg Endo
On Wed, 2017-07-05 at 12:14 +0100, Jonathan Wakely wrote:
> 
> No, that would be undefined behaviour. The data member is defined as
> const, so it's not possible to write to that member without undefined
> behaviour. A variable defined with a const type is not the same as a
> variable accessed through a pointer/reference to a const type.
> 
> Furthermore, casting the Object to a derived class would also be
> undefined, because the dynamic type is Object, not some derived type.

Ugh, you're right.  Sorry for the misleading examples.

> 
> I think the reason it's not optimized away is for this case:
> 
> void somefunction(const Object& object);
> {
>   void* p = &object;
>   object.~Object();
>   new(p) Object();
> }
> 
> This means that after calling someFunction there could be a different
> object at the same location (with a possibly different value for that
> member).

This is basically what I was trying to say.  The function call acts as
an optimization barrier in this case because it could do anything with
that memory address it gets passed.  If the example is changed to:

void somefunction(const Object& object);

void callInitA(Object& x) {
Object o;
somefunction(x);
}

we can observe that everything of "Object o" gets optimized away as
expected.  And for that, the member doesn't even need to be const.

Is there actually any particular optimization mechanism in GCC that
takes advantage of "const"?  I'm not aware of any such thing.

Cheers,
Oleg


Re: Linux and Windows generate different binaries

2017-07-16 Thread Oleg Endo
On Sun, 2017-07-16 at 17:32 -0500, Segher Boessenkool wrote:
> On Sun, Jul 16, 2017 at 11:54:43PM +0300, Alexander Monakov wrote:
> > 
> > On Sun, 16 Jul 2017, Segher Boessenkool wrote:
> > > 
> > > I am well aware, and that is not what I asked.  If we would use
> > > stable sorts everywhere

> > How? There's no stable sort in libc and switching over to
> > std::stable_sort would be problematic.

> Why?

Actually GCC has been carrying around its very own implementation of
sort algorithms in libstdc++.  It's just not being used for the
compiler itself, because while GCC was still compiled as C, using them
was impossible.  But now that it's compiled as C++, why not just use
the algorithms from GCC's libstdc++?

Cheers,
Oleg


Re: Volatile Memory accesses in Branch Delay Slots

2017-07-25 Thread Oleg Endo
On Tue, 2017-07-25 at 10:47 +0200, Jakob Wenzel wrote:
> 
> jr's delay slot is not filled. However, if the declaration of a is 
> changed to `extern int a`, the delay slot is filled with the sw.
> 
> The function responsible for this behavior seems to be 
> resource_conflicts_p in reorg.c.  Sadly, I could not find any comments 
> explaining why volatile accesses cannot be put into delay slots.
> 
> What is the reason for this behavior?  I am unable to think of any 
> situation where allowing volatile memory accesses in branch delay slots 
> leads to problems.  Am I missing a case?  Or are negative effects limited 
> to other architectures?

Maybe because the code that does the delay slot stuffing does not do
sophisticated checks on whether such instruction reordering would
violate anything?  So it plays safe and bails out if it sees
"volatile mem".  Same thing happens also with insns that have multiple
sets.  Ideally it should do some more fine grained checks and give the
backend an option to opt-in or opt-out.

Cheers,
Oleg


Re: Overwhelmed by GCC frustration

2017-07-31 Thread Oleg Endo
On Mon, 2017-07-31 at 15:25 +0200, Georg-Johann Lay wrote:
> Around 2010, someone who used a code snipped that I published in
> a wiki, reported that the code didn't work and hang in an
> endless loop.  Soon I found out that it was due to some GCC
> problem, and I got interested in fixing the compiler so that
> it worked with my code.
> 
> 1 1/2 years later, in 2011, [...]

I could probably write a similar rant.  This is the life of a "minority
target programmer".  Most development efforts are being done with
primary targets in mind.  And as a result, most changes are being
tested only on such targets.

To improve the situation, we'd need a lot more target specific tests
which test for those regressions that you have mentioned.  Then of
course somebody has to run all those tests on all those various
targets.  I think that's the biggest problem.  But still, with a test
case at hand, it's much easier to talk to people who have silently
introduced a regression on some "other" targets.  Most of the time they
just don't know.

Cheers,
Oleg




Re: Overwhelmed by GCC frustration

2017-08-16 Thread Oleg Endo
On Wed, 2017-08-16 at 15:53 +0200, Georg-Johann Lay wrote:
> 
> This means it's actually waste of time to work on these
> backends.  The code will finally end up in the dustbin as cc0
> backends are considered undesired ballast that has to be
> "jettisoned".
> 
> "Deprecate all cc0" is just a nice formulation of "deprecate
> most of the cc0 backends".
> 
> Just the fact that the backends that get most attention and attract
> most developers don't use cc0 doesn't mean cc0 is a useless device.

The desire to get rid of old, crusty and unmaintained stuff is somewhat
understandable...


> First of all, LRA cannot cope with cc0 (yes, I know deprecating
> cc0 is just to deprecate all non-LRA BEs).  LRA asserts that
> accessing the frame doesn't change the condition code.  LRA doesn't
> provide a replacement for LEGITIMIZE_RELOAD_ADDRESS.  Hence LRA
> focuses on just comfortable, orthogonal targets.

It seems LRA is being praised so much, but all those niche BEs and
corner cases get zero support.  There are several known instances of SH
code regressions with LRA, and that's why I haven't switched it to
LRA. 

I think the problem is that it's very difficult to make a register
allocator that works well for everything.  The last attempt ended in
reload.  And eventually LRA will go down the same route.  So instead of
trying to fit a round peg in a square hole, maybe we should just have
the options for round and square pegs and holes.


Cheers,
Oleg


Re: Optimizing away deletion of null pointers with g++

2017-08-16 Thread Oleg Endo
On Wed, 2017-08-16 at 13:30 +0200, Paolo Carlini wrote:
> 
> I didn't understand why we don't already handle the easy case:
> 
> constexpr int* ptr = nullptr;
> delete ptr;
> 

What about overriding the global delete operator with some user defined
implementation?  Is there something in the C++ standard that says the
invocation can be completely omitted, i.e. on which side of the call
the nullptr check is being done?

One possible use case could be overriding the global delete operator to
count the number of invocations, incl. for nullptr.  Not sure how
useful that is though.
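
As a rough illustration (my own sketch, names made up), a counting replacement
for the global allocation functions could look like the following; whether a
"delete p" with p == nullptr ever reaches the counter is exactly the
unspecified part being asked about:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counting replacement for the global new/delete pair.
// The standard leaves it unspecified whether the deallocation function
// is called for a null pointer, so null deletes may or may not be counted.
static unsigned long delete_calls = 0;

void* operator new(std::size_t n)
{
    if (void* p = std::malloc(n ? n : 1))
        return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept
{
    ++delete_calls;   // counts every invocation that actually happens
    std::free(p);
}

void operator delete(void* p, std::size_t) noexcept
{
    ++delete_calls;   // sized variant, forwarded to the same counter
    std::free(p);
}
```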

Cheers,
Oleg


Re: Overwhelmed by GCC frustration

2017-08-17 Thread Oleg Endo
On Wed, 2017-08-16 at 19:04 -0500, Segher Boessenkool wrote:
> 
> LRA is easier to work with than old reload, and that makes it better
> maintainable.
> 
> Making LRA handle everything reload did is work, and someone needs to
> do it.
> 
> LRA probably needs a few more target hooks (a _few_) to guide its
> decisions.

Like Georg-Johann mentioned before, LRA has been targeted mainly for
mainstream ISAs.  And actually it's a pretty reasonable choice.  Again,
I don't think that "one RA to rule them all" is a scalable approach.
 But that's just my opinion.

Cheers,
Oleg


Re: Bit-field struct member sign extension pattern results in redundant

2017-08-18 Thread Oleg Endo
On Fri, 2017-08-18 at 10:29 +1200, Michael Clark wrote:
> 
> This one is quite interesting:
> 
> - https://cx.rv8.io/g/WXWMTG
> 
> It’s another target independent bug. x86 is using some LEA followed
> by SAR trick with a 3 bit shift. Surely SHL 27, SAR 27 would suffice.
> In any case RISC-V seems like a nice target to try to fix this
> codegen for, as its less risk than attempting a fix in x86 ;-)
> 
> - https://github.com/riscv/riscv-gcc/issues/89
> 
> code:
> 
>   template <typename T, unsigned B>
>   inline T signextend(const T x)
>   {
>   struct {T x:B;} s;
>   return s.x = x;
>   }
> 
>   int sx5(int x) {
>   return signextend<int, 5>(x);
>   }
> 
> riscv asm:
> 
>   sx5(int):
>     slliw a0,a0,3
>     slliw a0,a0,24
>     sraiw a0,a0,24
>     sraiw a0,a0,3
>     ret
> 
> hand coded riscv asm
> 
>   sx5(int):
>     slliw a0,a0,27
>     sraiw a0,a0,27
>     ret
> 

Maybe related ...

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67644
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50521
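
For reference, the bit-field trick and the single shift pair used in the
hand-coded asm are equivalent; a quick C++ check (my own sketch, not from the
thread):

```cpp
#include <cstdint>

// Sign-extend the low B bits of x, two equivalent ways.
template <unsigned B>
int32_t signext_bitfield(int32_t x)
{
    struct { int32_t v : B; } s;   // assignment truncates to B bits and sign-extends
    return s.v = x;
}

template <unsigned B>
int32_t signext_shifts(int32_t x)
{
    // One shift pair, as in the hand-coded asm (slliw/sraiw by 32 - B).
    // Shifting left as unsigned avoids undefined behavior on overflow.
    return (int32_t)((uint32_t)x << (32 - B)) >> (32 - B);
}
```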


Cheers,
Oleg


Combine Pass Behavior

2011-10-10 Thread Oleg Endo
Hi all,

I'm currently trying to find the best way to solve PR 49263 and I've run
into some questions regarding the combine pass.

Summary of the story:

The SH machine description has a pattern that is supposed to generate
the "tst #imm, r0" instruction as a combination of an and and a
comparison, where #imm is an unsigned byte (0...255):

(define_insn ""
  [(set (reg:SI T_REG)
(eq:SI (and:SI (match_operand:SI 0 "arith_reg_operand" "z,r")
   (match_operand:SI 1 "logical_operand" "K08,r"))
   (const_int 0)))]
  "TARGET_SH1"
  "tst  %1,%0"
  [(set_attr "type" "mt_group")])
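
At the source level, the kind of code that is expected to match this pattern
is a single-bit test, e.g. (my own minimal example):

```cpp
// Should end up as "tst #4,r0" on SH: an AND with an immediate,
// compared against zero, with the result going into the T bit.
static int bit2_is_clear(int x)
{
    return (x & 4) == 0;
}
```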

However, the combine pass does not find this pattern for certain bit
patterns, because it tries to combine the two insns..

(insn 10 7 11 2 (set (reg:SI 170)
(and:SI (reg/v:SI 164 [ x ])
(const_int 4 [0x4]))) {*andsi3_compact}
 (expr_list:REG_DEAD (reg/v:SI 164 [ x ])
(nil)))

(insn 11 10 17 2 (set (reg:SI 147 t)
(eq:SI (reg:SI 170)
(const_int 0 [0]))) {cmpeqsi_t}
 (expr_list:REG_DEAD (reg:SI 170)
(nil)))


into something like this

(set (reg:SI 147 t)
(eq:SI (zero_extract:SI (reg:SI 4 r4 [ x ])
(const_int 2 [0x2])
(const_int 0 [0]))
(const_int 0 [0])))

or .. 

(set (reg:SI 147 t)
(zero_extract:SI (xor:SI (reg:SI 4 r4 [ x ])
(const_int 4 [0x4]))
(const_int 1 [0x1])
(const_int 2 [0x2])))

or .. 

(set (reg:SI 168)
(and:SI (not:SI (reg:SI 4 r4 [ x ]))
(const_int 1 [0x1])))

.. and a couple of other variants.


The actual problem (or maybe my misunderstanding) is that it
combines the two original insns, then does some substitutions and tries
to match the combined and transformed insn against those defined in the
machine description.  If it can't find anything there it reverts
everything and proceeds with the next insn pair.  It never tries out the
straightforward option in the first place (which is not to transform
the combination). 

As a quick hack for myself I've added a second combine pass (ran after
the original combine pass) where the function 

  make_compound_operation (rtx x, enum rtx_code in_code) 

simply returns x instead of expanding it into a zero_extract.  This
fixed most of the issues at hand, but doesn't sound quite right.

Is the scenario above intended behavior of the combine pass or an
accident?  Or maybe even something else wrong in the machine description
that makes it behave like that?  At least it sounds very related to PR
30829.

Thanks,
Oleg




Re: C++11 no longer experimental

2011-10-30 Thread Oleg Endo
On Sun, 2011-10-30 at 14:14 +0100, Gerald Pfeifer wrote:

> +  C++0x was the working name of a new ISO C++ standard, which then
> +  was released in 2011 as C++11 and introduces a host of new features
> +  into the standard C++ language and library. This project seeks to
>implement new C++0x features in GCC and to make it one of the first
>compilers to bring C++0x to C++ programmers.

Since C++11 is now the official name, wouldn't it be better to use the
new name instead of the old one after the initial historical
introduction? :)
Like... 

C++0x was the working name of a new ISO C++ standard, which was then
released in 2011 as C++11 and introduces a host of new features
into the standard C++ language and library. This project seeks to
implement new C++11 features in GCC and to make it one of the first
compilers to bring C++11 to C++ programmers.

Cheers,
Oleg



Re: reverse conditionnal jump

2012-01-05 Thread Oleg Endo
On Thu, 2012-01-05 at 14:49 +0100, BELBACHIR Selim wrote:
> Hi,
> 
> I'm still developping a new private target backend (gcc4.5.2) and I
> noticed something strange in the assembler generated for conditionnal
> jump.
> 
> 

I'm not sure whether it can help you in this particular case, but you
might have a look at how this is done in the SH target (trunk version,
not 4.5.2 snapshot) and take PR 51244 into account.  Currently the SH
target suffers from a similar problem.  The patch from PR 51244 fixes
the issue.  In the SH case it helped looking at what the combine pass is
trying to do.

Hope it's useful.

Cheers,
Oleg



Re: wishlist: support for shorter pointers

2023-07-04 Thread Oleg Endo
> I think a C++ class (or rather, class template) with inline functions is 
> the way to go here.  gcc's optimiser will give good code, and the C++ 
> class will let you get nice syntax to hide the messy details.
> 
> There is no good way to do this in C.  Named address spaces would be a 
> possibility, but require quite a bit of effort and change to the 
> compiler to implement, and they don't give you anything that you would 
> not get from a C++ class.
> 
> (That's not quite true - named address spaces can, I believe, also 
> influence the section name used for allocation of data defined in these 
> spaces, which cannot be done by a C++ class.)
> 

Does the C++ template class shebang work for storing "short code pointers"
for things like compile-time/link-time generated function tables?  Haven't
tried it myself, but somehow I doubt it.

Cheers,
Oleg




Re: Building a GCC backend for the STM8

2024-01-30 Thread Oleg Endo
Hi,

On Sun, 2024-01-28 at 04:41 +0100, Sophie 'Tyalie' Friedrich via Gcc wrote:
> Hello dear people,
> 
> I want to try building a GCC compiler backend for the STM8 
> micro-controller target in order to make this wonderful architecture 
> more accessible.
> 
> But as I'm fairly new in this area of building compiler backends for 
> GCC, I would need a bit of guidance / read material to get started. Do 
> you have recommendations for anything? And is there interest in such work?
> 
> With best regards
> Tyalie

GCC might be a bit difficult for 8-bit targets.  For example, if you
look at RL78, it had to resort to a virtual register set workaround
because GCC's usual register allocation at that time couldn't deal with it. 
8-bit targets like AVR seem to be a bit easier.

Some other interesting options for 8-bit targets are SDCC (which already
supports the STM8, it seems) and the llvm-mos project (an LLVM port
originally for the 6502).

Cheers,
Oleg


Re: Comparing compile times and binary size for GCC compiling a few GCC releases

2013-03-16 Thread Oleg Endo
Hi,

On Sat, 2013-03-16 at 19:44 +0100, Steven Bosscher wrote:

> * cc1 for GCC 4.8.0 has a much larger .bss section than previous releases

But according to the table that followed ... 

> cc1 binary size:
> textdatabss dec hex filename
> 6460157 374800  535656  7370613 707775  4.2.4/cc1
> 7558042 666080  436744  8660866 842782  4.3.6/cc1
> 10917491775144  684544  12377179bcdc5b  4.5.4/cc1
> 11715323489512  984224  13189059c93fc3  4.6.3/cc1
> 12351879484968  1193672 14030519d616b7  4.7.2/cc1
> 13731087499080  781496  15011663e50f4f  4.8.0/cc1

... 4.7 .bss is 1193672 and 4.8 .bss is 781496.  Am I missing
something?

> 
> I started looking into this because I've noticed compilations of many
> small files being slower. I wonder if the larger cc1 and .bss sections
> could be in part responsible for that...?

Could probably be.  Compiling many small files puts pressure on compiler
executable startup, and larger executables result in slower startup times
(well, maybe not always immediately, but eventually -- at some point in
time code/data will have to be paged in or initialized), don't they?
I'm curious, do you happen to have some numbers for those 'small file'
cases?

Cheers,
Oleg



Re: Help for my Master thesis

2013-03-30 Thread Oleg Endo
Hello,

On Fri, 2013-03-29 at 20:35 +, Kiefmann Bernhard wrote:
> Dear Ladies and Gentlemen!
> 
> My name is Bernhard Kiefmann and I'm writing my Master's thesis with
> the topic "the suitability of the GNU C compiler used in safety-related
> areas". 

I can imagine that it could be important to differentiate whether the
compiler is just used to compile programs that are subject to safety
constraints, or whether the compiler is part of the runtime system and
is used during runtime (e.g. JIT compilation).

> The first problem with this is that I have to check if the
> compiler met the requirements of the international standard IEC
> 61508:2010. Here I would like to ask you my question as follows:
> 
>   1) What are the rules of the compiler development? 

Basically: each patch (i.e. modification of the compiler) is tested
using compiler bootstrapping and/or the testsuite and then submitted for
review. 

> Are there any diagrams of UML? Because they are a requirement of the standard.

Not in the official GCC repository (at least not that I know of).

>   2) Are there activities for the Functional Verification?

The testsuite contains test cases with some input (program code) and
expected output.  There are test cases that check whether a piece of
code just compiles, links, runs and produces some expected output or
whether certain instructions are generated on particular targets.

>   3) What procedures and measures for
>- The design and programming guidelines

There is a common coding convention.  The SW design of the compiler is
modified in order to fulfill the needs for new features or improvements.
Some data structures and algorithms are carefully designed or picked to
meet certain performance criteria (e.g. avoiding n^2 algorithms).

>- Dynamic analysis and testing

Occasionally developers use additional tools such as GDB, Valgrind etc
to identify problematic parts in the compiler.

>- Functional testing and black box testing

Testsuite (see above).

>- Ausfall-/Versagensanalyse

I guess you mean failure analysis here.
If somebody detects a problem (e.g. compiler crashes or produces wrong
machine code and the compiled program crashes) this problem is reported
and a reduced test case is derived.  After fixing the issue in the
compiler the test case is added to the testsuite.

>- Performance tests

People run various kinds of benchmarks and post the results to the
mailing lists etc.

>- Modular approach

GCC is split into different modules internally, such as language
front-ends, SSA tree optimizations/transformations, back-ends etc.

> 
> If you have information here for me I would rather help in assessing
> whether the compiler for use in safety-relevant area is suitable. The
> second point of my work is concerned with the treatment of releases.
> Are you putting any kind of evidences in your source-code and how they
> look like? 

What do you mean by putting evidence into the source code regarding
releases?  Like associating a release and the source code that was used
to make the release?  The official GCC is released as source code only.
Versions are tracked with SVN branches.  See also the bottom of this
page: http://gcc.gnu.org/develop.html 

Hope it helps,
Oleg



Re: Calculating cosinus/sinus

2013-05-11 Thread Oleg Endo
Hi,

This question is not appropriate for this mailing list.
Please take any further discussions to the gcc-help mailing list.

On Sat, 2013-05-11 at 11:15 +0200, jacob navia wrote:
> Hi
> 
> When caculating the cos/sinus, gcc generates a call to a complicated 
> routine that takes several thousand instructions to execute.
> 
> Suppose the value is stored in some XMM register, say xmm0 and the 
> result should be in another xmm register, say xmm1.
> 
> Why it doesn't generate:
> 
>  movsd%xmm0,(%rsp)
>  fldl (%rsp)
>  fsin
>  fstpl(%rsp)
>  movsd(%rsp),%xmm1
> 
> My compiler system (lcc-win) is generating that when optimizations are 
> ON. Maybe there are some flags in gcc that I am missing?

These optimizations are usually turned on with -ffast-math.
You also have to make sure to select the appropriate CPU or architecture
type to enable the usage of certain instructions.

For more information see:
http://gcc.gnu.org/onlinedocs/gcc/Option-Summary.html

Cheers,
Oleg



Bugzilla SVN commit messages

2013-05-12 Thread Oleg Endo
Hi,

I've noticed that for some reason SVN commit messages stopped showing up
in Bugzilla PRs a while ago (before the Bugzilla 4.4 update). 
What I usually put into the commit message goes like ...

PR target/57108
* gcc.target/sh/pr57108.c: Move this test case to ...
* gcc.c-torture/compile/pr57108.c: ... here.

... which used to work just fine.  I haven't changed my commit message
format, so I guess something else has changed somewhere?  Anything
special to be aware of?

Cheers,
Oleg



Re: Bugzilla SVN commit messages

2013-05-20 Thread Oleg Endo
On Sun, 2013-05-12 at 12:33 +0100, Jonathan Wakely wrote:
> On 12 May 2013 11:38, Oleg Endo wrote:
> > Hi,
> >
> > I've noticed that for some reason SVN commit messages stopped showing up
> > in Bugzilla PRs a while ago (before the Bugzilla 4.4 update).
> 
> It was the sourceware.org hardware upgrade.  The svn commit hook that
> used to email bugzilla wasn't migrated over.  I assumed it would be
> done a few days after the upgrade, but hasn't been.

Any chance that this gets fixed in the near future?
I find it quite useful to be able to see what has been done for a
particular PR.

Cheers,
Oleg



Re: Loop induction variable optimization question

2013-06-17 Thread Oleg Endo
On Mon, 2013-06-17 at 10:07 -0700, Steve Ellcey wrote:
> I have a loop induction variable question involving post increment.
> If I have this loop:
> 
> [...]

> My question is is: why (and where) did ivopts decide to move the
> post-increments above the usages in the first loop?  In my case
> (MIPS) the second loop generates better code for me then the first
> loop and I would like to avoid the '-4' offsets that are used.
> Ideally, one would think that GCC should generate the same code
> for both of these loops but it does not.
> 

Sorry for not having an answer.  I got curious, because just yesterday I
was looking at this one
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55190
and thought it might be related, although it doesn't seem to be.
I've tried the two functions of yours on SH and there it produces the
same machine code with -O2.  -O3 results in a call to memcpy, while
-O3 -fno-tree-loop-distribute-patterns again results in the same code.

Cheers,
Oleg



Re: List of typos.

2013-07-07 Thread Oleg Endo
On Sun, 2013-07-07 at 19:54 +0200, Georg-Johann Lay wrote:
> Ondrej Bilka schrieb:
> 
> > http://kam.mff.cuni.cz/~ondra/gcc_misspell.patch
> 

Below are some other hunks that look suspicious...
(trying not to duplicate the things already mentioned by others)

- * 1) It means that finalizers, and all methods calle by them,
+ * 1) It means that finalizers, and all methods callee by them,

-> 'called'


-  /*  SET_ACCESS, we want to set an explicte set of permissions, do not
+  /*  SET_ACCESS, we want to set an explicate set of permissions, do not

-> 'explicit'


In Objective-C, there are two additional variants:
 
foreach-statement:
- for ( expression in expresssion ) statement
+ for ( expression in expressions ) statement
  for ( declaration in expression ) statement

Really?  I'm not so sure.


-   configury */
+   configure */

-> 'configury'


-   calll#gettlsoff(ADDR)@(gr8, gr0)
+   call#gettlsoff(ADDR)@(gr8, gr0)


-   calll   #gettlsoff(ADDR)@(gr8, gr0)
+   call   #gettlsoff(ADDR)@(gr8, gr0)

The original 'calll' is correct (see frv.md).


-   have the same priority - candidate is best if its dependees were
+   have the same priority - candidate is best if its dependencies were

-> 'dependees'


-   does not look at other present displacement addressings around it.
+   does not look at other present displacement addressing around it.

-> 'addressings' (as in addressing modes)


-  CM_SMALL,/* Makes various assumpation about sizes of code and
+  CM_SMALL,/* Makes various assumption about sizes of code and

-  CM_SMALL_PIC,/* Makes various assumpation about sizes of code and
+  CM_SMALL_PIC,/* Makes various assumption about sizes of code and

-> 'assumptions'


   /* If we had fewer function args than explicit template args,
- just use the explicits.  */
+ just use the explicit.  */

-> 'explicit ones'


-array reference if the where and elswhere destinations
+array reference if the where and Elsinore destinations

-> 'elsewhere'


-   We provide accestor to the inline_summary datastructure and
+   We provide accessor to the inline_summary datastructure and

-> probably 'ancestor'


-/* The array used to find duplications in conflict vectors of
+/* The array used to find duplication in conflict vectors of

-/* Remove duplications in conflict vector of OBJ.  */
+/* Remove duplication in conflict vector of OBJ.  */

-/* Process operator duplications in insn with ID.  We do it after the
+/* Process operator duplication in insn with ID.  We do it after the

-> 'duplicates' maybe?


-   function we iterate decompressions until no data remains.  */
+   function we iterate decompression's until no data remains.  */

-> 'decompressions'


-   TODO: Make into some kind of configury-generated table.  */
+   TODO: Make into some kind of configure-generated table.  */

-> 'configury-generated'


-point of view as prefetch withouth dependecies will have a
+point of view as prefetch withouth dependencies will have a

-> missed 'without'


-   * Unique vinsn derivates from CALL, ASM, JUMP (for a while) and other
+   * Unique vinsn derivatives from CALL, ASM, JUMP (for a while) and other

-> maybe 'deviates' ?


-/* Find the set of registers that are unavailable for storing expres
+/* Find the set of registers that are unavailable for storing express

-   that are not available for storing expres while moving ORIG_OPS up on the
+   that are not available for storing express while moving ORIG_OPS up on the

-  /* Merge c_expres found or unify live register sets from different
+  /* Merge c_express found or unify live register sets from different


-> maybe 'expression' ?


-/* { dg-final { scan-assembler "calll.*#gettlsoff\\(0\\)" } } */
+/* { dg-final { scan-assembler "call.*#gettlsoff\\(0\\)" } } */

-> see above for the 'calll' frv case.  This breaks the test case.
Do not change things inside /* { } */ comments in test cases.


-/* PR target/50749: Verify that subsequent post-increment addressings
+/* PR target/50749: Verify that subsequent post-increment addressing

-/* PR target/50749: Verify that subsequent pre-decrement addressings
+/* PR target/50749: Verify that subsequent pre-decrement addressing

-/* PR target/50749: Verify that subsequent post-increment addressings
+/* PR target/50749: Verify that subsequent post-increment addressing

-/* PR target/50749: Verify that subsequent pre-decrement addressings
+/* PR target/50749: Verify that subsequent pre-decrement addressing

-> 'addressings' (as in addressing modes)


-   /* For buitins that are likely expanded to nothing or
+   /* For builtin's that are likely expanded to nothing or

-> 'builtins'


BASE must be either a declaration or a memory reference that has correct
-   alignment ifformation embeded in it (e.g. a pre-existing one in SRA).  */
+   alignment ifformation embedded in it (e.g. 

Re: List of typos.

2013-07-08 Thread Oleg Endo
On Mon, 2013-07-08 at 16:12 +0200, Ondřej Bílka wrote:
> On Sun, Jul 07, 2013 at 09:57:05PM +0200, Oleg Endo wrote:
> > On Sun, 2013-07-07 at 19:54 +0200, Georg-Johann Lay wrote:
> > > Ondrej Bilka schrieb:
> > > 
> > > > http://kam.mff.cuni.cz/~ondra/gcc_misspell.patch
> > >
> I fixed most comments, put it here so you can diff these two files.
> http://kam.mff.cuni.cz/~ondra/gcc_misspell_fixed.patch 
> 
> > 
> 
> 
> > BASE must be either a declaration or a memory reference that has correct
> > -   alignment ifformation embeded in it (e.g. a pre-existing one in SRA).  
> > */
> > +   alignment ifformation embedded in it (e.g. a pre-existing one in SRA).  
> > */
> > 
> > -> missed 'information' I guess...
> >
> 
> This fixes only a-e. These are probably incomplete as I needed to
> exclude lot of names that are variable names etc. 
> 
> I did selectin based on following file:
> http://kam.mff.cuni.cz/~ondra/gcc_misspells 
> >
> > 
> > -   http://www.ddj.com/articles/1997/9701/9701o/9701o.htm?topic=algoritms
> > +   http://www.ddj.com/articles/1997/9701/9701o/9701o.htm?topic=algorithms
> > 
> > both links do 404 anyway ;)
> > 
> could you find what this was? I need to add another filter not touch
> html which I will do probably tomorrow.

It seems the original article is gone.  At least I can't find it easily
(the patch that added the link is from 2001...).  It's about Fibonacci
Heaps so there's plenty of material out there on the net.  The paper
mentioned in the comment can be found rather easily:

http://www.cs.princeton.edu/courses/archive/fall03/cs528/handouts/fibonacci%20heaps.pdf

However, maybe it's better to replace the DDJ link with this
http://en.wikipedia.org/wiki/Fibonacci_heap 

Cheers,
Oleg



Re: AVR-gcc shift optimization

2013-08-02 Thread Oleg Endo
Hi,

On Thu, 2013-08-01 at 21:23 -0400, Asm Twiddler wrote:
> Hello all.
> 
> The current implementation produces non-optimal code for large shifts
> that aren't a multiple of eight when operating on long integers (4
> bytes).
> All such shifts are broken down into a slow loop shift.
> For example, a logical shift right by 17 will result in a loop that
> takes around 7 cycles per iteration resulting in ~119 cycles.
> This takes at best 7 instruction words.
> 
> A more efficient implementation could be:
> mov %B0,%D1
> mov %A0,%C1
> clr %C0
> clr %D0
> lsr %C0
> ror %D0
> This gives six cycles and six instruction words, but which can both be
> reduced to five if movw exists.
> 
> There are several other locations where a more efficient
> implementation may be done.
> 
> I'm just wondering why this functionality doesn't exist already.
> It seems like this would probably be fairly easy to implement,
> although a bit time consuming.
> I would also guess lack of interest or lack of use of long integers.
> 
> Lack of this functionality wouldn't be a problem as one could simply
> split the shift.
> Sadly my attempts to split the shift result in it being recombined.
> 
> unsigned long temp = val >> 16;
> return temp >> 1;
> 
> gives the same assembly as
> 
> return val >> 17;
> 
> 
> Thanks for any info.

GCC's AVR backend does have some special shift handling.  Maybe it used
to work in some GCC version and then stopped working in another version
without anybody noticing, or something else is wrong.  You can file a
bug report in Bugzilla at http://gcc.gnu.org/bugzilla/ but you'll have
to provide more details.  See also http://gcc.gnu.org/bugs/
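
Just as a sanity check of the poster's split (plain C++, nothing
AVR-specific): moving the high half down and then shifting once more must
agree with the direct shift by 17.

```cpp
#include <cstdint>

// Byte-wise formulation of a logical right shift by 17 on a 32-bit
// value: move the high half down (the two mov's), then shift once
// more (the lsr/ror pair).
static uint32_t lsr17_split(uint32_t v)
{
    uint32_t t = v >> 16;   // word move: high half becomes low half
    return t >> 1;          // one final single-bit shift
}
```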

Cheers,
Oleg




Re: why cross out cout make result different?

2013-08-03 Thread Oleg Endo
Hello,

This mailing list is for the development of GCC, not for using it.
gcc-help might be more appropriate for this kind of question, although
it doesn't seem to be GCC related.
Please do not send any follow ups to gcc@gcc.gnu.org

On Fri, 2013-08-02 at 18:25 -0700, eric lin wrote:
> 
> I have tried to copy QuickSort c++ programs:
> ---
> #include <iostream>
> using namespace std;
> 
> 
> class Element
> {
> public: 
>   int getKey() const { return key;};
>   void setKey(int k) { key=k;};
> private:
>   int key;
>   // other fields
> 
> };
> 
> #define InterChange(list, i, j)  t=list[j]; list[i]=list[j]; list[j]=t;

This is probably wrong.  In your code it expands to

if (i < j)
  t = list[j];

list[i] = list[j];
list[j] = t;

Make InterChange a function instead of a macro and try again.
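
A minimal demonstration of the pitfall (hypothetical names), showing why only
the first statement of a multi-statement macro is governed by an if without
braces:

```cpp
// Multi-statement macro: only "t = (a)" falls under an if without braces.
#define SWAP_MACRO(a, b) t = (a); (a) = (b); (b) = t;

// Function version: the whole swap is guarded as a unit.
static void swap_fn(int& a, int& b)
{
    int t = a;
    a = b;
    b = t;
}

// Returns true when the macro misbehaves under a never-taken if.
static bool macro_misbehaves()
{
    int t = 0, x = 1, y = 2;
    if (x > y)
        SWAP_MACRO(x, y);      // expands to three statements; two always run
    return x == 2 && y == 0;   // x and y were clobbered despite the false condition
}
```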

> /*-*/
> 
> 
> void QuickSort(Element list[], /* const */ int left, /*const */ int right)
> // Sort records list[left], ..., list[right] into nondescreasing order on 
> field key.
> // Key pivot = list[left].key is arbitrarily chosen as the pivot key.  
> Pointer i and j
> // are used to partition the sublist so that at any time list[m].key <= 
> pivot, m < i;
> // and list[m].key >= pivot, m>j.  It is assumed that list[left].key 
> <=list[right+1].key.
> {
> Element t;
> 
>   if (left < right) {
>  int i = left,
>  j=right+1,
>  pivot=list[left].getKey();
>  do {
> do i++;  while(list[i].getKey() < pivot);
> do j--;  while(list[j].getKey() > pivot);
> if (i < j) InterChange(list, i, j);
>  } while (i < j);
>  InterChange(list, left, j);
>  
>  cout << "---show bankaccount1[0]= " << list[0].getKey() << " 
> bankaccount1[1]= " << list[1].getKey() <<  " bankaccount1[7]= " << 
> list[7].getKey() << " its left= " << left << endl;
>  QuickSort(list, left, j-1);
>  QuickSort(list, j+1, right);
>   }
> }
> 
> /**/
> 
> int main() {
>   Element bankaccount1[10];
>   int l1, r1;
> 
>   bankaccount1[0].setKey(26);
>   bankaccount1[1].setKey(5);
>   bankaccount1[2].setKey(37);
>   bankaccount1[3].setKey(1);
>   bankaccount1[4].setKey(61);
>   bankaccount1[5].setKey(11);
>   bankaccount1[6].setKey(59);
>   bankaccount1[7].setKey(15);
>   bankaccount1[8].setKey(48);
>   bankaccount1[9].setKey(19);
>   l1=0;
>   r1=9;
> 
>   for (int i=0; i<10; i++)
> cout << bankaccount1[i].getKey() << "  " ;
>   cout << endl;
> 
>   QuickSort(bankaccount1, l1, r1);
>   for (int i=0; i<10; i++)
> cout << bankaccount1[i].getKey() << "  " ;
>   cout << endl;
> 
> return 0;
> }
> /*-*/
> if I (or you) commnet out cout show bankaccount1 that line, it will show 
> different results
> both result s are not what I expected(accroding to books)
> I am in 4.6.1
> 




Re: Pointer arithmetic

2013-08-07 Thread Oleg Endo
On Tue, 2013-07-09 at 09:37 -0700, Hendrik Greving wrote:
> On a machine with ABI ILP32LL64:
> 
> (insn 123 122 124 (nil) (set (reg:SI 392)
> (mem:SI (plus:SI (reg/v:SI 386)
> (reg/v:SI 349)) [0 sec 0 space 0, cmsmode 0 S4 A32])) -1 (nil)
> (nil))
> 
> If we support legitimate memory addresses like [r1+r2] (e.g. indexed
> addresses), can the above RTL match such a load? 

On 32 bit address machines it should match.
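
The typical C-level shape that produces such a (mem (plus reg reg)) address is
an indexed array access, roughly (my own example):

```cpp
// After strength reduction, the byte offset often ends up in its own
// register, giving the reg+reg address form from the RTL above.
static int load_indexed(const int* base, int idx)
{
    return base[idx];
}
```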

> I am asking because
> of overflows, I am not sure how that part is defined, and where the
> Spec is. What do I need to check in the backend for such a definition?
> Is this POINTER_SIZE? E.g. what if the machine supports > 32 bits, who
> is responsible to make sure that there is no overflow > 32 bits in
> this case? Compiler? Assembler? Or even the user?

AFAIK overflow is undefined and thus anything can happen.  Induction
variable and loop optimizations may produce interesting things if
addresses overflow.  For example, see
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55190

Cheers,
Oleg



Re: Inefficiencies in large integers

2013-08-18 Thread Oleg Endo
On Sun, 2013-08-18 at 00:55 -0400, Asm Twiddler wrote:
> Hello all,
> 
> I'm not sure whether this has been posted before, but gcc creates
> slightly inefficient code for large integers in several cases:
> 

I'm not sure what the actual question is.
Bug reports and enhancement suggestions of that kind usually go to
bugzilla and you should also specify which compiler version you're
referring to.

Anyway, I've tried your examples on SH (4.9) which also does 64 bit
operations with stitched 32 bit ops.

> unsigned long long val;
> 
> void example1() {
> val += 0x800000000000ULL;
> }
> 
> On x86 this results in the following assembly:
> addl $0, val
> adcl $32768, val+4
> ret

This is probably because if a target defines plus:DI / minus:DI
patterns (which is most likely to be the case, because of carry / borrow
bit handling peculiarities), these kinds of zero-bits special cases will
not be handled automatically.
Another example would be:

unsigned long long example11 (unsigned long long val, unsigned long x)
{
  val += (unsigned long long)x << 32;
  return val;
}
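
The property being exploited in both examples can be checked directly: an
addend whose low 32 bits are all zero can never change the low word of the
sum, which is why the "addl $0" is redundant (my own check):

```cpp
#include <cstdint>

// Adding a value with a zero low word can only change the upper half:
// the low words add as x + 0, so no carry is generated into the high word
// beyond what the addend itself contributes.
static uint64_t add_upper(uint64_t val, uint64_t hi)
{
    return val + (hi << 32);
}
```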

> The first add is unnecessary as it shouldn't modify val or set the carry.
> This isn't too bad, but compiling for a something like AVR, results in
> 8 byte loads, followed by three additions (of the high bytes),
> followed by another 8 byte saves.
> The compiler doesn't recognize that 5 of those loads and 5 of those
> saves are unnecessary.

This is probably because of the same or similar reason as mentioned
above.  I've tried the following:

void example4 (unsigned long long* x)
{
  *x |= 1;
}

and it results in:
mov.l   @(4,r4),r0
or  #1,r0   
rts
mov.l   r0,@(4,r4)

So I guess the fundamental subreg load/store handling seems to work.


> Here is another inefficiency for x86:
> 
> unsigned long long val = 0;
> unsigned long small = 0;
> 
> unsigned long long example1() {
> return val | small;
> }
> 
> unsigned long long example2() {
> return val & small;
> }
> 
> The RTL's generated for example1 and example2 are very similar until
> the fwprop1 stage.
> Since the largest word size on x86 is 4 bytes, each operation is
> actually split into two.
> The forward propagator correctly realizes that anding the upper 4
> bytes results in a zero.
> However, it doesn't seem to recognize that oring the upper 4 bytes
> should return val's high word.
> This problem also occurs in the xor operation, and also when
> subtracting (val - small).

In my case the double ior:SI and and:SI operations are eliminated in
the .cse1 pass and the resulting code is optimal.

My impression is that the stitched multiword add/sub thing could be
addressed in a target independent way so that it would work for all
affected targets automatically.
The other issues seem to be individual target problems.

Cheers,
Oleg



Re: [RFC] Detect most integer overflows.

2013-10-27 Thread Oleg Endo
On Sun, 2013-10-27 at 07:48 +0100, Ondřej Bílka wrote:
> On Sun, Oct 27, 2013 at 01:50:14AM +0200, Hannes Frederic Sowa wrote:
> > On Sat, Oct 26, 2013 at 09:29:12PM +0200, Ondřej Bílka wrote:
> > > Hi, as I brainstormed how prevent possible overflows in memory allocation 
> > > I
> > > came with heretic idea:
> > > 
> > > For gcc -D_FORTIFY_SOURCE=2 we expand all multiplication with size_t
> > > type by one that checks for integer overflow and aborts on it. This
> > > would prevent most overflow at cost of breaking some legitimate
> > > applications that use multiplication in clever way.
> > > 
> > > A less heretic way that is applicable for C++ would be write a class
> > > size_t overflow that would do arithmetic in saturating way and issue
> > > warnings when there is a size_t multiplication.
> > 
> > I am afraid of the false-positive aborts which could result in DoS against
> > applications. I like the checked arithmetic builtins LLVM introduced in
> 
> How likely is code that uses size_t for something other than size
> calculation?
> 
> I did not realized that this has opposite problem as lot of programs
> still use int for size calculations.
> 
> > 3.4 (not yet released) where one can test for overflow manually and handle
> > the overflows appropriately. They also generate better code (e.g. they
> > use the overflow flag and get inlined on x86 compared to the ftrapv insn).
> >
> As a workaround you can on x64 implement them by macros with inline assembly.
>  
> > So I would vote for fast checked arithmetic builtins first.

I think both, checked arithmetic builtins and ftrapv can be useful.
Builtins can be used to selectively do fine grained checked arithmetic,
ftrapv can be used to enable checked arithmetic on a per-function base,
for example:

int __attribute__ ((optimize ("trapv", "non-call-exceptions")))
mission_critical_stuff (int x, int y)
{
  try
  {
// do all checked arithmetic here...
return x + y;
  }
  catch (const std::overflow_error& e)
  {
// runtime has to turn the trap into a c++ exception for this
// to work.
return 0;
  }
}

Of course the function doesn't need to do the try-catch by itself.  The
exception could also be handled outside the function if fine grained
checking is not required.
Internally (although not documented as standard name patterns), GCC
already has support for the following trapping integer arithmetic insns:
addv, subv, smulv, negv, absv.  The individual target support for those
insns varies and the existing insns are for signed arithmetic only, but
it could be a starting point.

Cheers,
Oleg




C++ std headers and malloc, realloc poisoning

2013-12-04 Thread Oleg Endo
Hello,

Earlier this year the following was committed:

2013-06-20  Oleg Endo  
Jason Merrill  

* system.h: Include  as well as .

... so that things like  could be included after including
system.h.
A few days ago I tried building an SH cross-GCC on OS X 10.9 with the
latest Xcode (clang) tools and its libc++ standard library.  Some of the
libc++ headers use malloc, realloc etc., which are poisoned in system.h.
In this particular case the problem is triggered by the inclusion of
 in sh.c, but there are more headers which show the same
problem (e.g. ).

Is the malloc, realloc poisoning actually still useful/helpful?  After
all it can be easily circumvented by doing
  "new char[my_size]" ...

A simple fix is to include C++ std headers before including system.h,
which works for .c/.cc files, but might become problematic if things
like  are included in headers in the future.

Anyway, just wanted to report my findings regarding this issue.

Cheers,
Oleg





Re: C++ std headers and malloc, realloc poisoning

2013-12-05 Thread Oleg Endo
On Thu, 2013-12-05 at 10:45 -0500, Jason Merrill wrote:
> On 12/04/2013 04:03 PM, Jakub Jelinek wrote:
> > I think the most important reason is that we want to handle out of mem
> > cases consistently, so instead of malloc etc. we want users to use xmalloc
> > etc. that guarantee non-NULL returned value, or fatal error and never
> > returning.  For operator new that is solvable through std::set_new_handler
> > I guess, but for malloc we really don't want people to deal with checking
> > NULL return values from those everywhere.
> 
> A simple workaround would be to disable poisoning of malloc/realloc on 
> OS X (or when the build machine uses libc++, if that's easy to detect).

Whether libc++ uses malloc/realloc/free in some implementation in a
header file or not is an implementation detail.  It could use it today
and stop doing so tomorrow ;)
Maybe a configure option to disable the poisoning would be better in
this case?

Cheers,
Oleg



Re: C++ std headers and malloc, realloc poisoning

2013-12-05 Thread Oleg Endo
On Thu, 2013-12-05 at 18:11 +0100, Jakub Jelinek wrote:
> On Thu, Dec 05, 2013 at 12:05:23PM -0500, Jason Merrill wrote:
> > On 12/05/2013 10:59 AM, Oleg Endo wrote:
> > >On Thu, 2013-12-05 at 10:45 -0500, Jason Merrill wrote:
> > >>A simple workaround would be to disable poisoning of malloc/realloc on
> > >>OS X (or when the build machine uses libc++, if that's easy to detect).
> > >
> > >Whether libc++ uses malloc/realloc/free in some implementation in a
> > >header file or not is an implementation detail.  It could use it today
> > >and stop doing so tomorrow ;)
> > 
> > Yep, which is why I described my suggestion as a workaround.  :)
> > 
> > But having the poisoning disabled when building with clang doesn't
> > seem like a significant problem even if it becomes unnecessary,
> > since any misuse will still show up when building stage 2 and on
> > other platforms.
> 
> Guess the problem is that clang pretends to be (old) version of GCC.
> Otherwise all the poisioning, which is guarded by:
> #if (GCC_VERSION >= 3000)
> wouldn't be applied.  So perhaps we want a hack there && !defined __clang__
> or similar.

The problem is not clang but the exposed internals of libc++ (at least
the version Apple currently ships).  The problem would be the same if
GCC was used as the compiler but with libc++ instead of libstdc++ (it
seems some people have been trying to do that, see
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2010-August/010149.html)

BTW, the #include  in sh.c also triggered the
"do_not_use_isalpha_with_safe_ctype" stuff in include/safe-ctype.h,
which is a similar problem (isalpha being used in some implementation in
libc++).

Cheers,
Oleg



Re: Oleg Endo appointed co-maintainer of SH port

2013-12-06 Thread Oleg Endo
On Fri, 2013-12-06 at 09:05 -0500, David Edelsohn wrote:
>   I am pleased to announce that the GCC Steering Committee has
> appointed Oleg Endo as co-maintainer of the SH port.
> 
>   Please join me in congratulating Oleg on his new role.
> Oleg, please update your listing in the MAINTAINERS file.

Thank you.

I've just committed the following.

Index: MAINTAINERS
===
--- MAINTAINERS (revision 205756)
+++ MAINTAINERS (working copy)
@@ -102,6 +102,7 @@
 score port	Chen Liqin	liqin@gmail.com
 sh port	Alexandre Oliva	aol...@redhat.com
 sh port	Kaz Kojima	kkoj...@gcc.gnu.org
+sh port	Oleg Endo	olege...@gcc.gnu.org
 sparc port	Richard Henderson	r...@redhat.com
 sparc port	David S. Miller	da...@redhat.com
 sparc port	Eric Botcazou	ebotca...@libertysurf.fr
@@ -364,7 +365,6 @@
 Bernd Edlinger	bernd.edlin...@hotmail.de
 Phil Edwards	p...@gcc.gnu.org
 Mohan Embar	gnust...@thisiscool.com
-Oleg Endo	olege...@gcc.gnu.org
 Revital Eres	e...@il.ibm.com
 Marc Espie	es...@cvs.openbsd.org
 Rafael Ávila de Espíndola	espind...@google.com



Re: [RL78] Questions about code-generation

2014-03-10 Thread Oleg Endo
DJ,

On Mon, 2014-03-10 at 20:17 -0400, DJ Delorie wrote:
> > Ah, that certainly explains a lot.  How exactly would the fixing be 
> > done?  Is there an example I could look at for one of the other processors?
> 
> No, RL78 is the first that uses this scheme.

I'm curious.  Have you tried out other approaches before you decided to
go with the virtual registers?

Cheers,
Oleg



Re: [RL78] Questions about code-generation

2014-03-16 Thread Oleg Endo
On Sun, 2014-03-16 at 19:48 +0100, Richard Hulme wrote:
> On 10/03/14 22:37, DJ Delorie wrote:
> >
> > The use of "volatile" disables many of GCC's optimizations.  I
> > consider this a bug in GCC, but at the moment it needs to be "fixed"
> > in the backends on a case-by-case basis.
> >
> 
> Hi,
> 
> I've looked into the differences between the steps taken when using a 
> variable declared volatile, and when it isn't but I'm getting a bit stuck.
> 
> [...]
>
> Bearing in mind that I'm new to all this and may be missing something 
> blindingly obvious, what would cause 7->8 to fail when declared volatile 
> and not when not?  Does something need adding to rl78-virt.md to allow 
> it to match?
> 
> It doesn't seem like this is due to missing an optimization step that 
> combines insns (hmm, "combine?") but rather to not recognizing that a 
> single, existing insn is possible and so splitting the operation up into 
> multiple steps.

I haven't looked at the details of RL78, but it could be that the memory
constraints refuse to match.  Thus patterns that have mem operands won't
match if the mems are volatile.  This volatile mem check is buried in
the implementation of the 'general_operand' constraint function in
recog.c and is turned on by combine via 'init_recog_no_volatile'.

I ran into a similar issue on SH, where loads/stores from/to volatile
mems with redundant sign/zero extensions wouldn't combine away because
of that volatile mem check in 'general_operand'.  The 'solution' was not
to use 'general_operand' in the SH specific constraints implementation
(e.g. see "general_movsrc_operand" in config/sh/predicates.md).

Maybe this helps.

Cheers,
Oleg



Re: [RL78] Questions about code-generation

2014-03-16 Thread Oleg Endo
On Sun, 2014-03-16 at 17:22 -0400, DJ Delorie wrote:
> This is similar to what I had to do for msp430 - I made a new
> constraint that was what general_operand would have done if it allowed
> volatile MEMs, and used that for instructions where a volatile's
> volatileness wouldn't be broken.

Maybe we should add a target hook/macro to control this to avoid
duplicated code of 'general_operand' in various places?

Cheers,
Oleg



Re: About GCJ compiler

2012-03-17 Thread Oleg Endo
On Sat, 2012-03-17 at 13:36 +0900, Mao Ito wrote:
> To whom may it concern,
> 
> Nice to meet you. I am Mao Ito and a graduate student at the University
> of Wisconsin-Madison. I am working on a course project. My topic is
> something like "High Performance Architecture for Java Programming
> Language on mobile phones". On this project, I am planning to use GCJ
> to cross-compile Java for ARM. I am planning to add some extra
> ISAs(Instruction Set Architecture). But, if I add an extra ISAs, as you
> know, I have to generate binary codes corresponding to my new
> architecture. 

This sounds more like a target/backend job and has probably little to do
with the Java frontend.  The source for the ARM target can be found in
the source tree at gcc/config/arm.

> My question is if I can modify GCJ for my purpose. Is it
> possible to get source codes and customize it?
> 

Links to source code can be found on the main page http://gcc.gnu.org/
You can get it via SVN checkout http://gcc.gnu.org/svn.html
or by downloading a source snapshot from one of the mirrors listed on
http://gcc.gnu.org/mirrors.html
For example here http://mirrors-us.seosue.com/gcc/snapshots/

For further questions regarding building and using GCC please use the
gcc-help mailing list http://gcc.gnu.org/ml/gcc-help/


Cheers,
Oleg



Re: GSoC :Project Idea(Before final Submission) for review and feedback

2012-03-24 Thread Oleg Endo
On Sat, 2012-03-24 at 11:45 +0530, Subrata Biswas wrote:
> Dear All,
> I am a MTech student at Indain Institute of Technology, Roorkee. I
> want to do my GSoC12 project under your guidance. I am writing this
> mail for a basic review and feedback on my project idea before formal
> submission.
> 
> This project idea is mainly concentrated on improvement or addition a
> new feature in the gcc compiler.
> 
> --
> Problem Statement:
> --
> GCC compiler is the most popular c compiler. If there is a small bug
> in the program, the gcc shows the error and did not generate the
> executable (out) file. Even a small mistake done by a programmer made
> his program execution impossible which means an unfinished assignment
> !!! This case is not at all programmer friendly. This project idea is
> based on this problem faced by programmers. Here I shall try to make
> our favorite gcc compiler more programmer friendly.
> --
> Project Title:  Partial executable generation using GCC
> --
> Project Idea:
> --
> The GCC compiler can be added with an extra feature to generate an
> partial executable file after showing the current bugs in the
> programs. Here it can be done by analyzing the data flow of the c code
> and eliminating the part of the program which is dependent on the
> erroneous portion of the program. Then the compiler shows the error
> and warnings in the code and generate the partial executable version
> of the code.
> This feature may make gcc to a higher level of programmer friendly compiler.
> 
> --
> 

I might be misunderstanding the idea... 
Let's assume you've got a program that doesn't compile, and you leave
out those erroneous blocks to enforce successful compilation of the
broken program.  How are you going to figure out for which blocks it is
actually safe to be removed and for which it isn't?  Effectively, you'll
be changing the original semantics of a program, and those semantic
changes might be completely not what the programmer originally had in
mind.  In the worst case, something might end up with an (un)formatted
harddisk...

Cheers,
Oleg



Re: GSoC :Project Idea(Before final Submission) for review and feedback

2012-03-25 Thread Oleg Endo
Please reply in CC to the GCC mailing list, so others can follow the
discussion.

On Sun, 2012-03-25 at 09:21 +0530, Subrata Biswas wrote:
> On 25 March 2012 03:59, Oleg Endo  wrote:
> >
> > I might be misunderstanding the idea...
> > Let's assume you've got a program that doesn't compile, and you leave
> > out those erroneous blocks to enforce successful compilation of the
> > broken program.  How are you going to figure out for which blocks it is
> > actually safe to be removed and for which it isn't?
> 
> I can do it by tracing the code blocks which are dependent on the
> erroneous block. i.e if any block is data/control dependent(the output
> or written value of the erroneous part is read) on this erroneous
> block or line of code will be eliminated.
> 
> > Effectively, you'll
> > be changing the original semantics of a program, and those semantic
> > changes might be completely not what the programmer originally had in
> > mind.  In the worst case, something might end up with an (un)formatted
> > harddisk...*
> >
> > Cheers,
> > Oleg
> >
> Thank you sir for your great feedback. You have understood it
> correctly. Now the programmer will be informed about the change in
> code and the semantics.(Notice that this plug-in is not going to
> modify the original code!, it just copy the original code and perform
> all the operations on the temporary file!!!) Even from the partial
> execution of the code the programmer will get an overview of his
> actual progress.
> 
> suppose the program written by the programmer be:
> 
> 1 int main(void)
> 2 {
> 3int arr[]={3,4,-10,22,33,37,11};
> 4sort(arr);
> 5int a = arr[3] // Now suppose the programmer missed the semicolon
> here. Which generates a compilation error at line 5;
> 6printf("%d\n",a);
> 7for(i=0;i<7;i++)
> 8{
> 9printf("%d\n",arr[i]);
> 10}
> 11  }
> 
> 
> Now if we just analyze the data (i.e. variable), we can easily find
> that there is only data dependency exists between line 5 and line 6.
> The rest of the program is not being effected due to elimination or
> commenting line 5.
> 
> Hence the temporary source file after commenting out the erroneous
> part of the code and the code segment that is dependent on this
> erroneous  part would be:
> 
> 1 int main(void)
> 2 {
> 3int arr[]={3,4,-10,22,33,37,11};
> 4sort(arr);
> 5//int a = arr[3] // Now suppose the programmer missed the
> semicolon here. Which generates a compilation error at line 5;
> 6   // printf("%d\n",a);
> 7for(i=0;i<7;i++)
> 8{
> 9printf("%d\n",arr[i]);
> 10}
> 11  }
> 
> Now this part of the program(broken program) is error free. Now we can
> compile this part using GCC and get the partial executable.
> 
> Now the possible output after compilation using this plug in(if
> programmer use it) with GCC would be:
> 
> "You have syntax error at Line no. 5. and to generate the partial
> executable Line 5 and Line 6 have removed in the temporary executable
> execute the partial executable excute p.out"
> 
> Advantages to the Programmer:
> 1. If programmer can see the result of the partial executable he can
> actually quantify his/her progress in code.
> 2. The debug become easier as this plug-in would suggest about
> possible correction in the code etc.

I don't think it will make the actual debugging task easier.  It might
make writing code easier (that's what IDEs are doing these days while
you're typing code...).  In order to debug a program, the actual bugs
need to be _in_ the program, otherwise there is nothing to debug.
Removing arbitrary parts of the program could potentially introduce new
artificial bugs, just because of a missing semicolon.

> * I did not understand the  worst case that you have mentioned as
> (un)formatted hard disk. Can you kindly explain it?
> 

Let's say I'm writing a kind of disk utility that reads and writes
sectors...

-
source1.c:

bool
copy_sector (void* outbuf, const void* inbuf, int bytecount)
{
  if (bytecount < 4)
return false;
  
  if ((bytecount & 3) != 0)
return false;

  int* out_ptr = (int*)outbuf;
  const int* in_ptr = (const int*)inbuf;
  int count = bytecount / 4;

  do
  {
int i = *in_ptr++;
if (i & 1)
  i = do_something_special0 (i);
else if (i & (1 << 16))
  i = do_something_special1 (i);
*out_ptr++ = i;
  } while (--count);

  return true;
}

-
source0.c:

int main (void)
{
  ...
  int sector_size = get_sector_size (...);
  void* sector_read_buf = malloc (sector_size);
  void* secto

RE: Question about Tree_function_versioning

2012-03-26 Thread Oleg Endo
On Mon, 2012-03-26 at 22:51 +, Iyer, Balaji V wrote:
> I have another question along the same lines. Is it possible to tell
> gcc to never delete a certain function even if it is never called in
> the executable?
> 

"__attribute__ ((used))" maybe?

Cheers,
Oleg



Re: Switch statement case range

2012-04-08 Thread Oleg Endo
On Sun, 2012-04-08 at 09:07 -0700, Rick Hodgin wrote:
> Thank you!
> 
> I'd like to find out some day exactly how much I _don't_ know. :-)
> 

Knock yourself out ;)
http://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html

Cheers,
Oleg



Re: Updated GCC vs Clang diagnostics [Was: Switching to C++ by default in 4.8]

2012-04-13 Thread Oleg Endo
On Fri, 2012-04-13 at 10:29 -0500, Gabriel Dos Reis wrote:

> There is some repeat here.  Over 13 years ago, people were screaming
> to have line wrapping by default -- because the diagnostic
> messages related to templates were just too long and too awful.
> I implemented line wrapping for g++ and made it the default.  Then people
> screamed to take it off, and much of the effort has been essentially
> to replace it.
> Things evolve and we do expect them to evolve.  I would not be surprised
> if  in 5 years, would scream to have fonts, boxes and glides in diagnostics 
> :-)

In this case, printing '42' might be good ;)

Cheers,
Oleg



Re: Switching to C++ by default in 4.8

2012-04-16 Thread Oleg Endo
On Mon, 2012-04-16 at 04:11 +0800, Chiheng Xu wrote:
> On Sat, Apr 14, 2012 at 11:47 AM, Chiheng Xu  wrote:
> >
> > And I want to say that tree/gimple/rtl are compiler's data(or state),
> > not compiler's text(or logic), the most important thing about them is
> > how to access their fields.
> >
> 
> Given the above assumption, now I doubt the necessity of accessor
> macros or C++ getter/setter method.

According to my experience, it doesn't take more time/effort to write
"tree->code ()" instead of "tree->code" and such getter functions allow
for easier refactoring etc.  If you omit the getters/setters you can't
express things such as immutable objects (well you still could with
const ivars but...), and you'll always have to have the ivar...

> 
> Is "tree->code" more direct and efficient than "TREE_CODE(tree)" or
> "tree->get_code()" ?

What do you mean by efficient?  All of them will (most likely) end up as
the same machine code.  But still, there's a reason why there's a
TREE_CODE getter which is supposed to be used instead of writing
"tree->base.code" everywhere...

Cheers,
Oleg



Re: Switching to C++ by default in 4.8

2012-04-17 Thread Oleg Endo
On Wed, 2012-04-18 at 06:03 +0800, Chiheng Xu wrote:
> >
> Sorry,  I don't know what is the benefit of const ivars.

I didn't say there's a benefit of using const ivars in this hypothetical
case.  It's just another possible option of doing certain things.

> But if you use "tree->code" instead of "tree->code()", the compiler
> know very well whether you intend to read or write a piece of memory.
> The const-ness is clear. I doubt how the compiler optimizer can
> further optimize it.

I didn't say that either...

> By saying "efficient", I probably mean compile time is reduced( macro
> expansion + optimizing, or inlining + optimizing, are avoided).
> I also probably mean reduced .h file size( the definitions of accessor
> macros and C++ getter/setter inline methods, are avoided).
> 

...then probably I wasn't aware of the fact that this is about
optimizing for compile time.  If that's the case, maybe the topic of the
thread should be changed to avoid further confusion.

Cheers,
Oleg



Re: Is it possible to make gcc detect whether printf prints floating point numbers?

2012-06-10 Thread Oleg Endo
On Fri, 2012-06-08 at 16:54 +0800, Bin.Cheng wrote:
> Hi all,
> In micro-controller applications, code size is critical and the size
> problem is worse if library is linked.
> For example, most c programs call printf to format output data, that
> means floating point code get linked even the program only want to
> output non-floating point numbers. Currently, we rely on end-user to
> call iprintf if the program does not want floating point.
> 
> I noticed that GCC now can check format string of printf functions, so
> I am wondering if it is possible to take advantage of this utility, by
> making gcc detect whether printf prints floating point number and then
> generate assembly directive in backend to pull in floating point
> functions only if necessary.
> 
> The problem is:
> The check is done in front end, so how should I expose the check
> result to back-end. Is there any hook utility?
> 
> In the future, could this feature be supported by GCC in upstream? I
> assuming mcu backends may have interests in this.
> 

Wouldn't it be much simpler and easier to just provide a customized
printf implementation in the C runtime/std lib which does not use
floating point types or simply prints a hex number when doing a '%f' or
something like that?

Another idea could be to try using a variadic template printf, if you
can get the C code to compile as C++ :)

Cheers,
Oleg



Re: "self" keyword

2012-06-14 Thread Oleg Endo
On Thu, 2012-06-14 at 16:34 -0400, Rick C. Hodgin wrote:
> David,
> 
> Well, I probably don't have a NEED for it.  I've gotten along for 25+ 
> years without it. :-)
> 
> However, what prompted my inquiry is using it would've saved me tracking 
> down a few bugs in recent weeks.  Some prior code was re-used for a 
> similar function, but the name of the recursive calls weren't updated in 
> every case.  It didn't take long to debug, but I realized that had it 
> always been written as self() it never would've been an issue.
> 
> I can also see a use for generated code where there's a base source code 
> template in use with an embedded include file reference that changes as 
> it's generated per pass, such as:
> 
> int step1(int a, int b)
> {
>  #include "\current_task\step1.cpp"
> }
> 
> int step2(int a, int b)
> {
>  #include "\current_task\step2.cpp"
> }
> 
> Using the self() reference for recursion, one could modify stepN.cpp's 
> generator algorithms without having to know or care anything in the 
> wrapper code.  

Wouldn't this do?

#define __self__ step1
int __self__ (int a, int b)
{
  #include "something"
  __self__ (x, y);
}
#undef __self__



Cheers,
Oleg



Re: Add corollary extension

2012-06-28 Thread Oleg Endo
On Thu, 2012-06-28 at 18:08 -0400, Rick C. Hodgin wrote:
> How would you handle:
> 
> isSystemClosed = true;

By adding one line to inv_bool

struct inv_bool {
  bool& b;
  operator bool() const { return !b; }
  inv_bool& operator = (bool _b) { b = !_b; return *this; }
};


Cheers,
Oleg



Re: Double word left shift optimisation

2012-07-26 Thread Oleg Endo
On Thu, 2012-07-26 at 10:51 -0700, Ian Lance Taylor wrote:
> On Thu, Jul 26, 2012 at 8:57 AM, Jon Beniston  
> wrote:
> >
> > I'd like to try to optimise double word left shifts of sign/zero extended
> > operands if a widening multiply instruction is available. For the following
> > code:
> >
> > long long f(long a, long b)
> > {
> >   return (long long)a << b;
> > }
> >
> > ARM, MIPS etc expand to a fairly long sequence like:
> >
> > nor $3,$0,$5
> > sra $2,$4,31
> > srl $7,$4,1
> > srl $7,$7,$3
> > sll $2,$2,$5
> > andi$6,$5,0x20
> > sll $3,$4,$5
> > or  $2,$7,$2
> > movn$2,$3,$6
> > movn$3,$0,$6
> >
> > I'd like to optimise this to something like:
> >
> >  (long long) a * (1 << b)
> >
> > Which should just be 3 or so instructions. I don't think this can be
> > sensibly done in the target backend as the generated pattern is too
> > complicated to match and am not familiar with the middle end. Any
> > suggestions as to where and how this should be best implemented?
> 
> It seems to me that you could just add an ashldi3 pattern.
> 

This is interesting.  I've quickly tried it out on the SH port.  It can
be accomplished with the combine pass, although there are a few things
that should be taken care of:
- an "extendsidi2" pattern is required (so that the extension is not
  performed before expand)
- an "ashldi3" pattern that accepts "reg:DI << reg:DI"
- maybe some adjustments to the costs calculations
  (wasn't required in my case)

With those in place, combine will try to match the following pattern

(define_insn_and_split "*"
  [(set (match_operand:DI 0 "arith_reg_dest" "=r")
	(ashift:DI (sign_extend:DI (match_operand:SI 1 "arith_reg_operand" "r"))
		   (sign_extend:DI (match_operand:SI 2 "arith_reg_operand" "r"))))]
  "TARGET_SH2"
  "#"
  "&& can_create_pseudo_p ()"
  [(const_int 0)]
{
  rtx tmp = gen_reg_rtx (SImode);
  emit_move_insn (tmp, const1_rtx);
  emit_insn (gen_ashlsi3 (tmp, tmp, operands[2]));
  emit_insn (gen_mulsidi3 (operands[0], tmp, operands[1]));
  DONE;
})

which eventually results in the expected output

mov #1,r1   ! 24movsi_i/3   [length = 2]
shldr5,r1   ! 25ashlsi3_d   [length = 2]
dmuls.l r4,r1   ! 27mulsidi3_i  [length = 2]
sts macl,r0 ! 28movsi_i/5   [length = 2]
rts ! 35*return_i   [length = 2]
sts mach,r1 ! 29movsi_i/5   [length = 2]

One potential pitfall might be the handling of a real "reg:DI << reg:DI"
if there are no patterns already there that handle it (as it is the case
for the SH port).  If I observed correctly, the "ashldi3" expander must
not FAIL for a "reg:DI << reg:DI" (to do a lib call), or else combine
would not arrive at the pattern above.

Hope this helps.

Cheers,
Oleg




Re: Contributing and GCC GPL

2012-08-09 Thread Oleg Endo
On Thu, 2012-08-09 at 17:54 +0100, Aaron Gray wrote:
> Hi,
> 
> I have developed several patches for GCC and am wondering as a purely
> open source non commercial developer whether there are any issues
> regarding getting patches into GCC. Do I need to sign an agreement at
> all ?
> 

Depending on the sizes of the patches you might need to get the
copyright paper work done.  Patches should be sent to the patches list
gcc-patc...@gcc.gnu.org
For more info see http://gcc.gnu.org/contribute.html

Cheers,
Oleg



Using C++ - Problem with <cstdlib>

2012-08-25 Thread Oleg Endo
Hello,

I'm currently playing around with an RTL pass and started using C++.
When including <cstdlib> I get the following:

/usr/include/c++/4.6/cstdlib:76:8: error: attempt to use poisoned
"calloc"
/usr/include/c++/4.6/cstdlib:83:8: error: attempt to use poisoned
"malloc"
/usr/include/c++/4.6/cstdlib:89:8: error: attempt to use poisoned
"realloc"

It seems the story is old:
http://gcc.gnu.org/ml/gcc/2009-08/msg00553.html

Now that the switch to C++ has been made, how should this be handled?


Cheers,
Oleg



Re: Proposing switch -fsmart-pointers

2012-10-06 Thread Oleg Endo
On Sat, 2012-10-06 at 20:59 +0200, _ wrote:
> Now obviously you can't put stl everywhere.
> I don't see kernel and low level C or C++ libs using boost or stl. any
> time soon.
> Afterall. No reasonable library uses it either due to binary 
> incompatibilities.

It seems that your proposed fix to that is as binary incompatible as
using std::unique_ptr or std::shared_ptr.  What happens when you link a
library that was compiled with '*~' pointers enabled, with a library
that was compiled without those pointers, and both exchange
data/objects?

> C or C like templateless C++ code is still domain of most  os /
> drivers source code out there.
> Just go agead and try to ask Linus to wrap all pointers to stl
> templates ;D I think you will run for your life then.

Have you asked him for his opinion on your proposed solution?

> Not everybody agrees that wrapping every single variable to macro like
> templates and spreading simple code logic over zilion files is way to
> go. But the main problem holding stl back is it's binary and header
> incompatibilities.

Adding a non-standard compiler extension most likely will not be an
improvement for those problems.
But if you really want to try it out, you can do it by writing a custom
preprocessor that triggers on '*~' and replaces the preceding token with
std::unique_ptr< token > or std::shared_ptr< token >.

Cheers,
Oleg



Re: detecting integer overflows

2012-10-30 Thread Oleg Endo
On Tue, 2012-10-30 at 01:09 -0600, Michael Buro wrote:
> Recently I came across http://embed.cs.utah.edu/ioc/ which describes a
> sophisticated integer overflow checker for Clang. The reported results
> obtained by analyzing C/C++ open source projects make a convincing
> case for implementing such functionality in gcc/g++ as well. Is
> somebody looking into this?

Not sure, but we've still got a half-broken -ftrapv :)
E.g. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35412
Or like on SH, some of the arithmetic ops get expanded to libcalls and
some (afair SImode plus and minus) end up as normal ops.

Cheers,
Oleg





Re: Feature request: Attribute to delay evaluation of function argument

2012-11-03 Thread Oleg Endo
On Sun, 2012-11-04 at 11:02 +1100, Clinton Mead wrote:
> Hi All
> 
> This is a feature request. To explain, lets say I want to create a
> TriBool type, like so:
> 
> enum TriBool { False, True, Unknown };
> 
> I now want to implement operator&&. I want operator&& to short
> circuit, i.e. if the first argument is False the second argument
> shouldn't be evaluated.
> 
> So I'll make operator&& take a function object for it's second
> argument, like so:
> 
> TriBool operator&&
> (
>   TriBool x,
>   const std::function<TriBool()>& y
> )
> {
>   if (x == False) { return False; }
>   else
>   {
> TriBool y_ = y();
> if (x == True) { return y_; }
> else if (y_ == False) { return False; }
> else { return Unknown; }
>   }
> }
> 
> This way if I have:
> 
> #define DELAY(x) [&]{ return x; }
> 
> TriBool f();
> TriBool g();
> 
> I can do:
> 
> f() && DELAY(g())
> 
> and hence have short circuit evaluation.
> 
> However, what I'd like to have is just "f() && g()". It would be good
> to be able to give the second argument an attribute which basically
> wraps any argument passed to it with "DELAY()". Is this possible, or
> has it already been done?
> 

I think this can be done without any additional features or extensions.
Have you tried 'class TriBool' with an 'explicit operator bool', instead
of overloading operator && for this purpose?

Cheers,
Oleg



Re: Feature request: Attribute to delay evaluation of function argument

2012-11-04 Thread Oleg Endo
On Sun, 2012-11-04 at 18:08 +1100, Clinton Mead wrote:
> Hi Oleg
> 
> Could you explain how you get around the following:
> 
> (1) Doesn't the non-overloaded operator&& return 'bool', not
> 'TriBool'? 

Yes, by default it takes bool on both sides and returns bool.

> How can it be made to return 'TriBool'?

By overloading it.

> (2) How can one prevent losing information by converting to 'bool'?

This is difficult to answer without knowing more about the use cases
that you have in mind.  For example if you want to be able to write
something like ...

tribool x = ...; tribool y = ...;
tribool z = x && y;

you'll have to overload operator &&.
Then, if you want to be able to do stuff like ...

tribool x = ...; tribool y = ...;
tribool z = x && y;
if (z) { ... }

or inherently...
if (x && y) { ... }

you'll have to provide explicit operator bool in the class tribool.

> (3) How can technique be applied more generally to functions not named
> '&&' or '||' in such a way that the suggested feature would allow?

You mean, to allow short-circuit evaluation for normal functions?
If so, how would that look in practice (use case example)?
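To make the above concrete, here is a minimal sketch of the suggested
approach (the class and names are illustrative, not taken from the original
mail).  Note that an overloaded operator && always evaluates both operands,
so the short-circuit behavior asked for in the first mail is not preserved
by this approach:

```cpp
#include <cassert>

// A three-state boolean with Kleene logic.  Illustrative sketch only.
class tribool
{
public:
  enum value_type { false_, true_, unknown_ };

  tribool (value_type v) : m_value (v) { }

  value_type value (void) const { return m_value; }

  // Allows 'if (x)' and 'if (x && y)'; yields true only for a
  // definite true value.
  explicit operator bool (void) const { return m_value == true_; }

  // Kleene logic AND.  Both operands are always evaluated.
  friend tribool operator && (tribool a, tribool b)
  {
    if (a.m_value == false_ || b.m_value == false_)
      return tribool (false_);
    if (a.m_value == true_ && b.m_value == true_)
      return tribool (true_);
    return tribool (unknown_);
  }

private:
  value_type m_value;
};
```

With that, 'tribool z = x && y;' and 'if (x && y)' both compile; a
definite-false operand forces a false result even if the other operand is
unknown.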


Cheers,
Oleg



BSD licensed code OK in test suite?

2012-11-07 Thread Oleg Endo
Hi all,

In PR 24129 we've got a test case that originated from some BSD licensed
code.  Is it OK to add this test case 1:1 into the c torture tests?
Or is a disclaimer required such as found in
gcc/testsuite/gcc.c-torture/execute/pr20527-1.c ?

Thanks,
Oleg



Re: BSD licensed code OK in test suite?

2012-11-07 Thread Oleg Endo
On Wed, 2012-11-07 at 14:59 -0700, Jeff Law wrote:
> On 11/07/2012 02:47 PM, Oleg Endo wrote:
> > Hi all,
> >
> > In PR 24129 we've got a test case that originated from some BSD licensed
> > code.  Is it OK to add this test case 1:1 into the c torture tests?
> > Or is a disclaimer required such as found in
> > gcc/testsuite/gcc.c-torture/execute/pr20527-1.c ?
> Ideally the test would be reduced to its minimal form; at that point its 
> copyright status would be re-evaluated.
> 

I'm sorry, the PR in question is 48806.  24129 is the attachment ID:
http://gcc.gnu.org/bugzilla/attachment.cgi?id=24129

Do you think that this needs further reduction, or would it be
sufficient to rename the func/var names and reformat it?

Cheers,
Oleg



Re: BSD licensed code OK in test suite?

2012-11-07 Thread Oleg Endo
On Wed, 2012-11-07 at 15:12 -0700, Jeff Law wrote:
> On 11/07/2012 03:08 PM, Oleg Endo wrote:
> > On Wed, 2012-11-07 at 14:59 -0700, Jeff Law wrote:
> >> On 11/07/2012 02:47 PM, Oleg Endo wrote:
> >>> Hi all,
> >>>
> >>> In PR 24129 we've got a test case that originated from some BSD licensed
> >>> code.  Is it OK to add this test case 1:1 into the c torture tests?
> >>> Or is a disclaimer required such as found in
> >>> gcc/testsuite/gcc.c-torture/execute/pr20527-1.c ?
> >> Ideally the test would be reduced to its minimal form; at that point its
> >> copyright status would be re-evaluated.
> >>
> >
> > I'm sorry, the PR in question is 48806.  24129 is the attachment ID:
> > http://gcc.gnu.org/bugzilla/attachment.cgi?id=24129
> >
> > Do you think that this needs further reduction, or would it be
> > sufficient to rename the func/var names and reformat it?
> I think it's borderline as-is.
> 
> I suspect there's quite a bit of junk that can be zapped from that test.

OK, thanks for your feedback.  I'll have a closer look at it and post a
patch if I can successfully reduce it further.

Cheers,
Oleg



Re: [rant?] g++ bug (missing uninitialized warning), bug reporting, bug searching

2012-11-09 Thread Oleg Endo
Hello,

On Fri, 2012-11-09 at 12:18 -0800, Bruno Nery wrote:
> Howdy,
> 
> The following piece of code:
> 
> === snip ===
> #include <iostream>
> 
> struct warnme
> {
> bool member_;
> warnme(bool member) : member_(member_) {}
> };
> 
> int main()
> {
> warnme wm(true);
> std::cout << wm.member_ << std::endl;
> return 0;
> }
> === end snip ===
> 
> when compiled with g++ 4.7, gives me no warnings - even with
> -Wuninitialized (clang++ 3.1 is fine, by the way). I then decided to
> report a bug, but:
> 
> - I need to login to report a bug, and I have to create an account. Is
> this a way to reduce the number of bugs GCC gets?

This issue has been raised just recently on the gcc-help mailing list.
See the thread:
http://gcc.gnu.org/ml/gcc-help/2012-10/threads.html#00061

> - I searched for uninitialized and got 156 bugs. How easy would it be
> for one to check if a bug is a duplicate? Shouldn't we have some kind
> of code search for bug-related snippets?

I've just searched for "uninitialized missing" and got 22 bugs, some of
them seem related to yours, although I haven't checked/compared
the details.  In the worst case you can just file the bug and it will be
marked as duplicate eventually (if it is one).

Cheers,
Oleg



Re: [rant?] g++ bug (missing uninitialized warning), bug reporting, bug searching

2012-11-09 Thread Oleg Endo
On Fri, 2012-11-09 at 13:22 -0800, Bruno Nery wrote:
> Twenty two might be a more manageable number, but still... why do we
> need an account to report a bug?

This issue has been raised just recently on the gcc-help mailing list.
See the thread:
http://gcc.gnu.org/ml/gcc-help/2012-10/threads.html#00061

The answer to your question is in the first reply by Ian Lance Taylor.

Cheers,
Oleg



Re: Modeling predicate registers with more than one bit

2013-03-02 Thread Oleg Endo
Hi,

On Thu, 2013-02-28 at 11:10 +, Paulo Matos wrote:
> Hello,
> 
> I am looking at how to correctly model in GCC predicate registers that
> have more than one bit and the value set into the predicate register
> after a comparison depends on the size of the comparison.
> 
> I have looked into GCC backends but haven't really found any backend
> with a similar constraint. Have I missed a backend that has similar
> requirements? If not, is there any way to currently (as of HEAD) model
> this in GCC?

Have you had a look at the SH backend?  SH cores have a "T Bit"
register, which functions as carry bit, over/underflow, comparison
result and branch condition register.  In the SH backend it's treated as
a fixed SImode hard-reg (although BImode would suffice in this case, I
guess).
Comparison patterns set the T bit, like:

(define_insn "cmpeqsi_t"
  [(set (reg:SI T_REG)
(eq:SI (match_operand:SI 0 "arith_reg_operand" "r,z,r")
   (match_operand:SI 1 "arith_operand" "N,rI08,r")))]

Conditional branches use the T bit like:

(define_expand "branch_true"
  [(set (pc) (if_then_else (ne (reg:SI T_REG) (const_int 0))
   (label_ref (match_operand 0))
   (pc)))]

or:

(define_insn_and_split "*cbranch_t"
  [(set (pc) (if_then_else (match_operand 1 "cbranch_treg_value")
   (label_ref (match_operand 0))
   (pc)))]

where the predicate "cbranch_treg_value" looks like:

(define_predicate "cbranch_treg_value"
  (match_code "eq,ne,reg,subreg,xor,sign_extend,zero_extend")
{
  return sh_eval_treg_value (op) >= 0;
})

The predicate is for matching various forms of T bit negation patterns.

Maybe you could try the same approach for your case.
If your predicate register has multiple independent bit(fields), you
could try defining separate hard-regs for every bit(field).

Cheers,
Oleg



Re: ELF2.0: Linkable struct

2025-01-30 Thread Oleg Endo via Gcc
On Fri, 2025-01-31 at 15:49 +0900, The Cuthour via Gcc wrote:
> Suppose we have the following two classes:
> 
> === Vec.h ===
> class Vec {
>  int x, y, z;
> };
> === end Vec.h ===
> 
> === Pix.h ===
> class Pix: Vec {
>  int r, g, b;
> };
> === end Pix.h ===
> 
> If we add or remove a member variable in class Vec, it requires
> recompiling not only Vec.cc but also Pix.cc. I believe this is
> a problem. Pix.o should be relinkable.

In real-world software, such small classes are unlikely to be put into their
own .cc files; they tend to live in header-only libraries.

struct image
{
  vec2 pos[2];
  std::vector<pix> pixels;
};

Now if any of those structs is modified, with members added or removed, it
would require recompilation of all the code that uses it.  This is because
the memory offsets used to access the fields change and might fall outside
the displacement range of the processor's instructions.  So to support your
idea the compiler would need to generate worst-case code that can access
arbitrary offsets, which would drag down performance.  The linker would need
additional optimization steps to undo this ... it sounds like a can of worms.

Best regards,
Oleg Endo