RE: Backporting ARC patch to gcc7.x

2017-11-17 Thread Claudiu Zissulescu
> We've generally considered those clauses under the umbrella of the port
> maintainers.

Thank you for your guidance. I've backported the patch using the gcc guideline:

>svn commit
Sending.
Sendinggcc/ChangeLog
Sendinggcc/config.gcc
Sendinglibgcc/ChangeLog
Sendinglibgcc/config.host
Transmitting file data 
Committed revision 254864.

Thank you,
Claudiu


CFG operation leading to errors

2017-11-17 Thread Claudiu Zissulescu
Hi,

I've found a potential issue when performing CFG operations for hardware loops.

When a port is using hardware loops (like ARC) makes use of reorg_loops to find 
and analyze loops that end in loop_end instructions. The very same function can 
be set to reorder the cfg such that the loop end occurs after the loop start. 
This task is performed by reoder_loops function which at its turn calls 
cfg_layout_finalize -> fixup_reoreder_chain -> force_nonfallthru_and_redirect 
(cfgrtl.c:1476).
However, the latter is splitting a call and its corresponding CALL_ARG_LOCATION 
note, leading to an assert in dwarf2out_var_location() at dwarf2out.c:26391. 
Thus, I've made a hack which patches the force_nonfallthru_and_redirect() 
function to avoid splitting calls and their note:

diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index ae46908..38a739c 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -1626,6 +1626,9 @@ force_nonfallthru_and_redirect (edge e, basic_block 
target, rtx jump_label)
   else
new_head = BB_END (e->src);
   new_head = NEXT_INSN (new_head);
+  if (new_head && NOTE_P (new_head)
+ && NOTE_KIND (new_head) == NOTE_INSN_CALL_ARG_LOCATION)
+   new_head = NEXT_INSN (new_head);
 
   jump_block = create_basic_block (new_head, NULL, e->src);
   jump_block->count = count;

I do not know if this is the way forward and I would like to have your input on 
this subject.

To reproduce the issue one can use ARC backend and the following test code with 
option -O2 -mcpu=arc700 -g :

typedef void Trans_NS_std_new_handler();
void *operator new(unsigned) {
  void *p;
  while (__builtin_expect(p == 0, false)) {
Trans_NS_std_new_handler handler;
try {
  handler();
} catch (int) {
}
  }
  return (void*) 0xdead;
}

Thank you,
Claudiu


Re: Potential bug on Cortex-M due to used registers/interrupts.

2017-11-17 Thread David Brown
On 16/11/17 17:54, Vitalijus Jefišovas wrote:
> On Cortex-M mcu’s, when interrupt happens, NVIC copies r0-r3 and couple
> other registers onto the psp stack, and then jumps to interrupt routine,
> when it finishes, NVIC restores these registers, and jumps back to user’s
> function.
> What is happening under the hood, NVIC only stacks 4 registers, r0, r1, r2,
> r3. The other ones r4-r12 is developer’s responsibility.
> I was looking at assembly code generated by GCC and there are plenty of
> instructions using r4-r12 registers.
> 
> 
> 
> How does GCC handle scenario when execution is switched to unknown
> procedure, which changes all of these registers?
> 

These other registers - r4 to r12 - are "callee saved".  That means that
a C function that uses them will first save the old values (push them
onto the stack), and will restore them at function exit.  So the C
compiler can generate a normal function with normal ARM eabi linkage and
calling conventions, and use it directly for an interrupt function (this
is one of the nice features of the Cortex-M architecture).  "caller
saved" registers, r0-r3 - the ones that the C function can use freely -
are saved by the hardware.  The others are saved by compiler-generated
instructions as needed.




Re: Potential bug on Cortex-M due to used registers/interrupts.

2017-11-17 Thread Wilco Dijkstra
Hi,

> These other registers - r4 to r12 - are "callee saved".

To be precise, R4-R11 are callee-saved, R0-R3, R12, LR are caller-saves
and LR and PSR are clobbered by calls. LR is slightly odd in that it is
a callee-save in the prolog, but not in the epilog (since LR is assumed
clobbered after a call, it doesn't need to be restored, so you can use
pop {regs,PC} to return).

Cortex-M hardware will automatically save/restore R0-R3, R12, LR, PC, PSR
on interrupts. That perfectly matches the caller-saves and clobbered
registers, so there is no potential bug.

Wilco


Re: Missed possible branch elimination

2017-11-17 Thread Stefan Ring
On Thu, Oct 26, 2017 at 8:23 PM, Stefan Ring  wrote:
> While poring over the Transport Tycoon Deluxe disassembly, commonly
> known to have been hand-written in assembler, I stumbled across this
> tidbit, which I think is kinda neat:
>
> 004057F7 83 7D B8 01  cmp dword ptr [ebp-48h],1
> 004057FB 1B C0sbb eax,eax
> 004057FD F7 D8neg eax
> 004057FF 85 05 20 A9 41 00testdword ptr ds:[41A920h],eax
> 00405805 0F 84 91 00 00 00je  0040589C
>
> which basically says:
>
> if (((DWORD*) $ebp)[-0x12] == 0 && (*(DWORD*) 0x41a920 & 1)) {  }
>
> ... leaving aside possible side effects of the memory access, so
> treating short circuit eval like an arithmetic operation might not be
> legal in this specific case. But for the function
>
> void a(void (*fnc)(void), int *x, int y) { if ((*x == 0) && (y&1)) fnc(); }
>
> & and && should be absolutely equivalent. In fact, gcc realizes this
> equivalency and produces the exact same code for both variants. It is
> just using the two-branch version instead of a one-branch strategy
> (x86_64 now):
>
>  :
>0:   8b 06   mov(%rsi),%eax
>2:   85 c0   test   %eax,%eax
>4:   75 0a   jne10 
>6:   83 e2 01and$0x1,%edx
>9:   74 05   je 10 
>b:   ff e7   jmpq   *%rdi
>d:   0f 1f 00nopl   (%rax)
>   10:   c3  retq
>
> I would rather see:
>
>  :
>0:   31 c0   xor%eax,%eax
>2:   8b 0e   mov(%rsi),%ecx
>4:   85 c9   test   %ecx,%ecx
>6:   0f 94 c0sete   %al
>9:   85 c2   test   %eax,%edx
>b:   74 03   je 10 
>d:   ff e7   jmpq   *%rdi
>f:   90  nop
>   10:   c3  retq
>
> Either that, or even:
>
>  :
>0:   8b 0e   mov(%rsi),%ecx
>2:   85 c9   test   %ecx,%ecx
>4:   0f 94 c0sete   %al
>7:   84 c2   test   %al,%dl
>9:   74 05   je 10 
>b:   ff e7   jmpq   *%rdi
>d:   0f 1f 00nopl   (%rax)
>   10:   c3  retq
>
> Although this is touching an entirely different topic now.
>
> I'm just wondering if it should not rather lean towards eliminating
> the branch. I'd
> guess that this almost always a worthy goal.

I'm really curious about that. Why is it that two branches instead of
one are preferred, even if the compiler seems to realize equivalence,
given that it replaces an arithmetic operation (the & operator) with a
branch. It would seem that this is a case of if conversion, albeit in
the wrong direction for my taste.

Is this question not appropriate here?
Did nobody know what to make of it? Or my expectations? ;)


superior language dwarf records

2017-11-17 Thread James K. Lowden
Hello, 

We want to add source-level debugging to GNU Cobol.  

The Cobol compiler generates C, and then invokes gcc to produce ELF
object code, of course. We are entertaining approaches to
replace/amplify the DWARF records produced by gcc with new ones
referencing the Cobol source.  Before diving into 30 KLOC of
dwarf-generation logic, we thought it prudent to ask for advice here.  

We are considering how best to insert DWARF records that refer to the
Cobol source, while retaining the Cobol-to-C-to-ELF design.  One way to
do that would be to supply gcc with information about "superior"
language constructs (meaning "above", not "better").  ISTM Cobol is
neither the first nor last language to use C as a kind of idealized
assembler.  

AFAIK there is no interface to supply debugging information to gcc.  

I don't know if its possible in DWARF to have one position in the
object code be defined by two source files (Cobol and C).  Ideally,
we would find a way to keep the C records and add Cobol records, and
tell the debugger which ones to use.  

Absent direct gcc support, we're considering the following approach: 

1.  Maintain a map of Cobol-to-C constructs.
2.  Invoke gcc -g to produce dwarf records.
3.  Use libdwarf or similar to iterate over the dwarf records, 
and to generate new, Cobol records for (some of) the same
locations using the map from step #1.  

If the Cobol records cannot coexist with the C records in the same
file, then between steps 2 and 3 we would extract them to a dwo file
with objcopy. 

Is gcc capable of accepting additional dwarf information, or would the
team be interested in such support?  Other suggestions on how the goal
might be met?  

Kind regards, 

--jkl






Re: Please review writeup for fixing PR 78809 (inline strcmp for small constant strings)

2017-11-17 Thread Qing Zhao

> On Nov 16, 2017, at 5:55 PM, Martin Sebor  wrote:
>> 
>>   A. for strncmp (s1, s2, n)
>>  if one of "s1" or "s2" is a constant string, "n" is a constant, and 
>> larger than the length of the constant string:
>>  change strncmp (s1, s2, n) to strcmp (s1, s2);
> 
> Here and I think in some (all?) the other cases below, N doesn't
> strictly have to be constant but can be in a range whose lower
> bound is greater than the string length.
Yes, I agree.  If the value range info is available, we can relax N to be an 
expression 
whose value can be determined to be larger than the length of the constant 
string.

Do you know when the value range info is available in GCC, and the interface to 
use it?
> 
> FWIW, I also recently noticed bug 82950 in a related area.

For the opportunities mentioned in PR82950, I also realized during my 
implementation of B in tree-ssa-strlen.c. the major thing missing there is
the “content” of the string is NOT recorded in the strinfo, only the “length” 
of the string is recorded. So, the information is Not completely available 
now to support this opportunity. 

However, for PR 83026, the information should be enough in tree-ssa-strlen.c, I 
can add the support this in tree-ssa-strlen.c by using the length info
of the string recorded in strinfo. 


> 
>> 
>>   B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0
>>  if the result is ONLY used to do a simple equality test against zero, 
>> one of "s1" or "s2" is a small constant string, n is a constant, and the 
>> other non-constant string is guaranteed to not read beyond the end of the 
>> string:
>>  change strncmp (s1, s2, n) or strcmp (s1, s2) to corresponding memcmp 
>> (s1, s2, n);
>> 
> 
> It's probably not important but I'm not sure I understand what you
> mean by
> 
>  "the other non-constant string is guaranteed to not read beyond
>  the end of the string.”

We need to be sure that we can read valid memory from the non-constant string 
after the null-character when change to corresponding memcmp. 

> 
>>  (NOTE, currently, memcmp(s1, s2, N) (!)=0 has been optimized to a 
>> simple sequence to access all bytes and accumulate the overall result in GCC 
>> by
>>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52171
>>  )
>>  as a result, such str(n)cmp call would like to be replaced by simple 
>> sequences to access all types and accumulate the overall results.
> 
> Thinking about this case made me realize there's another opportunity
> here that could be exploited in tree-ssa-strlen even for non-constant
> strings by relying on its knowledge of string lengths.   I opened bug
> 83026 with the details.

Yes, I agree. and will add this in my implementation in Part B. should be very 
straightforward to implement. 

> 
>> 
>>   C. for strcmp (s1, s2), strncmp (s1, s2, n), and memcmp (s1, s2, n)
>>  if the result is NOT used to do simple equality test against zero, one 
>> of "s1" or "s2" is a small constant string, n is a constant, and the Min 
>> value of the length of the constant string and "n" is smaller than a 
>> predefined threshold T,
>>  inline the call by a byte-to-byte comparision sequence to avoid calling 
>> overhead.
> 
> The str{,n}cmp and memcmp cases must be treated differently because
> the former compares up to N bytes while the latter exactly N bytes.
> With that, I'm wondering if the precondition "one of "s1" or "s2"
> is a small constant string" is necessary for memcmp.  I.e., why
> not inline it regardless of whether s1 or s2 are constant?

I agree. memcmp can ONLY check on “n”, if its small enough, should be inlined.  

in addition to this, for strncmp,  if both s1 and s2 are NOT constant, but if 
“n” is small enough, we can inline it too, right? 
> 
>> 
>>   A would like to be put in gimple fold phase (in routine 
>> "gimple_fold_builtin_string_compare" of gimple-fold.c)
> 
> Except for constant strings gimple-fold doesn't know about string
> lengths so it will only be able to do so much.  I'm wondering what
> the benefits are of doing the transformation there instead of just
> in tree-ssa-strlen (along with B).

As Jeff mentioned in another email, 

“Most passes call into the folder when they make noteable changes to a
statement.  So if a later pass exposes one argument as a string constant
your code for "A" should get a chance to fold the call.”

We can see whether this is enough or not, If not enough, adding  A to B is very 
easy thing to do.

thanks a lot for your comments and suggestions.

Qing




Re: Please review writeup for fixing PR 78809 (inline strcmp for small constant strings)

2017-11-17 Thread Qing Zhao

> On Nov 16, 2017, at 6:24 PM, Martin Sebor  wrote:
>> 
>> In my current local implementation, I used the following routine to get the 
>> range info:  (and use the MINMAXLEN[1]+1 for the length of the non-constant 
>> string)
>> 
>> /* Determine the minimum and maximum value or string length that ARG
>>   refers to and store each in the first two elements of MINMAXLEN.
>>   For expressions that point to strings of unknown lengths that are
>>   character arrays, use the upper bound of the array as the maximum
>>   length.  For example, given an expression like 'x ? array : "xyz"'
>>   and array declared as 'char array[8]', MINMAXLEN[0] will be set
>>   to 3 and MINMAXLEN[1] to 7, the longest string that could be
>>   stored in array.
>>   Return true if the range of the string lengths has been obtained
>>   from the upper bound of an array at the end of a struct.  Such
>>   an array may hold a string that's longer than its upper bound
>>   due to it being used as a poor-man's flexible array member.  */
>> 
>> bool
>> get_range_strlen (tree arg, tree minmaxlen[2])
>> {
>> }
>> 
>> However, this routine currently miss a very obvious case as the following:
>> 
>> char s[100] = {'a','b','c','d’};
>> 
>> __builtin_strcmp(s, "abc") != 0
>> 
>> So, I have to change this routine to include such common case.
> 
> There was a discussion some time ago about converting CONSTRUCTOR
> trees emitted for array initializers like the above to STRING_CST
> (see bug 71625 for some background).  I think that would still be
> the ideal solution.  Then you wouldn't have to change
> get_range_strlen.

thanks for the info, Martin.

In my case, it’s the size of “100” cannot be collected in the MINMAXLEN[1] for 
the string “s”. 

I need to make sure that the size of variable string s is larger than the size 
of constant string “abc” to guarantee the safety of the transformation.

currently, “get_range_strlen” cannot identify the simple VAR_DECL with 
array_type to determine the maximum size of the string. 

Qing
> 
> Martin



Re: Please review writeup for fixing PR 78809 (inline strcmp for small constant strings)

2017-11-17 Thread Qing Zhao
Hi, Jeff,

> On Nov 16, 2017, at 7:14 PM, Jeff Law  wrote:
>> 
>> In my current local implementation, I used the following routine to get
>> the range info:  (and use the MINMAXLEN[1]+1 for the length of the
>> non-constant string)
>> 
>> /* Determine the minimum and maximum value or string length that ARG
>>refers to and store each in the first two elements of MINMAXLEN.
>>For expressions that point to strings of unknown lengths that are
>>character arrays, use the upper bound of the array as the maximum
>>length.  For example, given an expression like 'x ? array : "xyz"'
>>and array declared as 'char array[8]', MINMAXLEN[0] will be set
>>to 3 and MINMAXLEN[1] to 7, the longest string that could be
>>stored in array.
>>Return true if the range of the string lengths has been obtained
>>from the upper bound of an array at the end of a struct.  Such
>>an array may hold a string that's longer than its upper bound
>>due to it being used as a poor-man's flexible array member.  */
>> 
>> bool
>> get_range_strlen (tree arg, tree minmaxlen[2])
>> {
>> }
>> 
>> However, this routine currently miss a very obvious case as the following:
>> 
>> char s[100] = {'a','b','c','d’};
>> 
>> __builtin_strcmp(s, "abc") != 0
>> 
>> So, I have to change this routine to include such common case.  
>> 
>> do you think using this routine is good? or do you have other
>> suggestions (since I am still not very familiar with the internals of
>> GCC, might not find the best available one now…)
> The range information attached to an SSA_NAME is global data.  ie, it
> must hold at all locations where the object in question might be
> referenced.  This implies that it will sometimes (often?) be less
> precise than you might like.

do you mean the “value_range” attached to SSA_NAME?

For my purpose, I’d like to get the maximum length of char array s[100] is 100, 
which is larger than the size of constant string “abc”, then
I can safely apply the transformation to memcmp. 

can “value_range” info serve this purpose?

> 
> I am currently working towards an embeddable context sensitive range
> analyzer that in theory could be used within tree-ssa-strlen pass to
> give more precise range information.  I'm hoping to wrap that work up in
> the next day or so so that folks can use it in gcc-8.

such context sensitive range info should be useful when we relax the constant 
“N” to be an expression whose Min value is larger than the length
of constant string, with it, we can catch more opportunities. 
let me know when this info is available.

thanks.

Qing

> 
> 
> 
> 
>> 
>>> 
>>> 
 
   C. for strcmp (s1, s2), strncmp (s1, s2, n), and memcmp (s1, s2, n)
  if the result is NOT used to do simple equality test against
 zero, one of "s1" or "s2" is a small constant string, n is a
 constant, and the Min value of the length of the constant string and
 "n" is smaller than a predefined threshold T, 
  inline the call by a byte-to-byte comparision sequence to avoid
 calling overhead. 
>>> Also seems reasonable.
 
   A would like to be put in gimple fold phase (in routine
 "gimple_fold_builtin_string_compare" of gimple-fold.c
>> 
 OK.  Note that various optimizations can expose N or one of the strings
>>> to be a constant.  So having it as part of the folders makes a lot of
>>> sense .
>> 
>> I have finished this part of change and sent the patch to gcc-patch
>> alias already. 
>> https://patchwork.ozlabs.org/patch/838200/
>> 
>> Do you think it’s necessary to add the same functionality at other
>> places, such as tree-ssa-strlen.c, in order to catch more cases?
> For "A"?  No, I think having it in the gimple folder is fine.  Most
> passes call into the folder when they make noteable changes to a
> statement.  So if a later pass exposes one argument as a string constant
> your code for "A" should get a chance to fold the call.
> 
>>> 
   B would like to be put in strlen phase (add new
 "handle_builtin_str(n)cmp" routines in tree-ssa-strlen.c)
>>> Which is where you're most likely to have some kind of range information
>>> which ISTM you need to prove the non-constant string is large enough
>>> that you're not reading past its end.
>> Yes,that’s the main reason. another reason is: memcmp != 0 optimization
>> is implemented at -O2,  if we want to use this available work,
>> implement B at strlen phase is better.
> Noted.
> 
>> 
>>> 
   C would like to be put in expand phase (tree-ssa-strlen.c or
 builtins.c): run-time performance testing is needed to decide the
 predefined threshold T. 
>>> If you wanted to put it into expansion, I wouldn't object -- we could
>>> always do experiments to see if there's any value in moving it early
>>> (expansion to byte comparisons could possible expose other optimizations).
>> earlier to where? do you have any suggestion?  I can definitely do some
>> experiments. 
> The biggest question in my mind is what se

Re: Please review writeup for fixing PR 78809 (inline strcmp for small constant strings)

2017-11-17 Thread Qing Zhao

 
   A would like to be put in gimple fold phase (in routine
 "gimple_fold_builtin_string_compare" of gimple-fold.c
>> 
 OK.  Note that various optimizations can expose N or one of the strings
>>> to be a constant.  So having it as part of the folders makes a lot of
>>> sense .
>> 
>> I have finished this part of change and sent the patch to gcc-patch
>> alias already. 
>> https://patchwork.ozlabs.org/patch/838200/
>> 
>> Do you think it’s necessary to add the same functionality at other
>> places, such as tree-ssa-strlen.c, in order to catch more cases?
> For "A"?  No, I think having it in the gimple folder is fine.  Most
> passes call into the folder when they make noteable changes to a
> statement.  So if a later pass exposes one argument as a string constant
> your code for "A" should get a chance to fold the call.

Okay. 
> 
>>> 
   B would like to be put in strlen phase (add new
 "handle_builtin_str(n)cmp" routines in tree-ssa-strlen.c)
>>> Which is where you're most likely to have some kind of range information
>>> which ISTM you need to prove the non-constant string is large enough
>>> that you're not reading past its end.
>> Yes,that’s the main reason. another reason is: memcmp != 0 optimization
>> is implemented at -O2,  if we want to use this available work,
>> implement B at strlen phase is better.
> Noted.
> 
>> 
>>> 
   C would like to be put in expand phase (tree-ssa-strlen.c or
 builtins.c): run-time performance testing is needed to decide the
 predefined threshold T. 
>>> If you wanted to put it into expansion, I wouldn't object -- we could
>>> always do experiments to see if there's any value in moving it early
>>> (expansion to byte comparisons could possible expose other optimizations).
>> earlier to where? do you have any suggestion?  I can definitely do some
>> experiments. 
> The biggest question in my mind is what secondary opportunities arise
> when we expose the byte comparisons.  So things like if-conversion
> if-combination, propagation of equality, particularly in the single byte
> case, etc.

make sense to me.

> 
> The difficulty is tracking when exposure leads to these secondary
> opportunities.  I often end up looking for this kind of stuff by first
> identifying many source files where the transformation applies.  Then I
> generate dumps & assembly code for the transformation in each candidate
> location.  I first analyze the assembly files for differences.  If an
> assembly file shows a difference, then I look more closely to try and
> characterize the difference and if it looks interesting, then I work
> backwards to the dump files for more details.

usually, what kind of benchmarks or test cases are you recommending? 
I used to work a lot with SPEC, not sure whether that’s the same for working in 
GCC?

>>> 
>>> In general I like what you're suggesting.  And on a higher level I like
>>> that we're looking to rationalize where these kinds of things happen
>>> (compiler vs library).  It's something I've wanted to see happen for a
>>> long time.
>> Part of A and C has been implemented in glibc previously,  Wilco removed
>> them from Glibc  at
>>   https://sourceware.org/git/?p=glibc.git;a=commit;h=f7db120f67d853e0cfa2
> I know :-)  I loosely watch the glibc lists too and have expressed
> support for the glibc's team's effort to move transformations to the
> points where they make the most sense.

Yes, moving them to GCC makes good sense.

thanks.

Qing
> 
> jeff



Re: Please review writeup for fixing PR 78809 (inline strcmp for small constant strings)

2017-11-17 Thread Qing Zhao

> On Nov 17, 2017, at 1:50 AM, Jakub Jelinek  wrote:
> 
> On Thu, Nov 16, 2017 at 06:14:35PM -0700, Jeff Law wrote:
>>> However, this routine currently miss a very obvious case as the following:
>>> 
>>> char s[100] = {'a','b','c','d’};
>>> 
>>> __builtin_strcmp(s, "abc") != 0
>>> 
>>> So, I have to change this routine to include such common case.  
>>> 
>>> do you think using this routine is good? or do you have other
>>> suggestions (since I am still not very familiar with the internals of
>>> GCC, might not find the best available one now…)
>> The range information attached to an SSA_NAME is global data.  ie, it
>> must hold at all locations where the object in question might be
>> referenced.  This implies that it will sometimes (often?) be less
>> precise than you might like.
>> 
>> I am currently working towards an embeddable context sensitive range
>> analyzer that in theory could be used within tree-ssa-strlen pass to
>> give more precise range information.  I'm hoping to wrap that work up in
>> the next day or so so that folks can use it in gcc-8.
> 
> Well, one thing is a range of integral SSA_NAME at a certain point, the
> other is the length of the C string pointed by a certain pointer (that is
> something the strlen pass tracks), and another case is the content of that
> string, which you'd presumably need to optimize the strcmp at compile time.
> I think current strlen pass ought to find out that strlen (s) is 4 at that
> point, if it doesn't, please file a PR with a testcase.

for the safety checking purpose, when we try to convert

__builtin_strcmp(s, "abc") != 0

to 

__builtin_memcmp (s, “abc”, 4) != 0

we have to make sure that the size of variable “s” is larger than “4”. 

if  “s” is declared as

char s[100];

currently,  the “get_range_strlen” cannot determine its maximum length is 100. 
(it just return UNKNOWN).

so, I have to update “get_range_strlen” for such simple case. 

this does provide the information I want.  However, since the routine 
“get_range_strlen” is also used in other places, 
for example, in gimple-ssa-sprintf.c,  the implementation of the sprintf 
overflow warning uses the routine “get_range_strlen” 
to decide the string’s maximum size and buffer size. 

my change in “get_range_strlen” triggered some new warnings for  
-Werror=format-overflow (from gimple-ssa-sprintf.c
mentioned above) as following:

qinzhao@gcc116:~/Bugs/warning$ cat t.c
#include 

void foo(const char *macro)
{
  char buf1[256], buf2[256];
  sprintf (buf1, "%s=%s", macro, buf2);
  return;
}

with my private GCC:

qinzhao@gcc116:~/Bugs/warning$ /home/qinzhao/Install/latest/bin/gcc t.c 
-Werror=format-overflow -S
t.c: In function ‘foo’:
t.c:6:18: error: ‘sprintf’ may write a terminating nul past the end of the 
destination [-Werror=format-overflow=]
   sprintf (buf1, "%s=%s", macro, buf2);
  ^~~
t.c:6:3: note: ‘sprintf’ output 2 or more bytes (assuming 257) into a 
destination of size 256
   sprintf (buf1, "%s=%s", macro, buf2);
   ^~~~
cc1: some warnings being treated as errors

At this time, I am really not sure whether it’s good to expose such new 
warnings with my change? even though after some study, 
I think that these new warning are correct warnings, maybe we should keep? 

let me know your comments and suggestions.

thanks a lot.

Qing

>   Jakub



Re: Please review writeup for fixing PR 78809 (inline strcmp for small constant strings)

2017-11-17 Thread Martin Sebor

On 11/17/2017 04:08 PM, Qing Zhao wrote:



On Nov 17, 2017, at 1:50 AM, Jakub Jelinek  wrote:

On Thu, Nov 16, 2017 at 06:14:35PM -0700, Jeff Law wrote:

However, this routine currently miss a very obvious case as the following:

char s[100] = {'a','b','c','d’};

__builtin_strcmp(s, "abc") != 0

So, I have to change this routine to include such common case.

do you think using this routine is good? or do you have other
suggestions (since I am still not very familiar with the internals of
GCC, might not find the best available one now…)

The range information attached to an SSA_NAME is global data.  ie, it
must hold at all locations where the object in question might be
referenced.  This implies that it will sometimes (often?) be less
precise than you might like.

I am currently working towards an embeddable context sensitive range
analyzer that in theory could be used within tree-ssa-strlen pass to
give more precise range information.  I'm hoping to wrap that work up in
the next day or so so that folks can use it in gcc-8.


Well, one thing is a range of integral SSA_NAME at a certain point, the
other is the length of the C string pointed by a certain pointer (that is
something the strlen pass tracks), and another case is the content of that
string, which you'd presumably need to optimize the strcmp at compile time.
I think current strlen pass ought to find out that strlen (s) is 4 at that
point, if it doesn't, please file a PR with a testcase.


for the safety checking purpose, when we try to convert

__builtin_strcmp(s, "abc") != 0

to

__builtin_memcmp (s, “abc”, 4) != 0

we have to make sure that the size of variable “s” is larger than “4”.


Presumably you mean "is at least 4?"



if  “s” is declared as

char s[100];

currently,  the “get_range_strlen” cannot determine its maximum length is 100. 
(it just return UNKNOWN).

so, I have to update “get_range_strlen” for such simple case.

this does provide the information I want.  However, since the routine 
“get_range_strlen” is also used in other places,
for example, in gimple-ssa-sprintf.c,  the implementation of the sprintf 
overflow warning uses the routine “get_range_strlen”
to decide the string’s maximum size and buffer size.

my change in “get_range_strlen” triggered some new warnings for  
-Werror=format-overflow (from gimple-ssa-sprintf.c
mentioned above) as following:

qinzhao@gcc116:~/Bugs/warning$ cat t.c
#include 

void foo(const char *macro)
{
  char buf1[256], buf2[256];
  sprintf (buf1, "%s=%s", macro, buf2);
  return;
}

with my private GCC:

qinzhao@gcc116:~/Bugs/warning$ /home/qinzhao/Install/latest/bin/gcc t.c 
-Werror=format-overflow -S
t.c: In function ‘foo’:
t.c:6:18: error: ‘sprintf’ may write a terminating nul past the end of the 
destination [-Werror=format-overflow=]
   sprintf (buf1, "%s=%s", macro, buf2);
  ^~~
t.c:6:3: note: ‘sprintf’ output 2 or more bytes (assuming 257) into a 
destination of size 256
   sprintf (buf1, "%s=%s", macro, buf2);
   ^~~~
cc1: some warnings being treated as errors


When the length of one or more of the strings referenced by
the argument passed to get_range_strlen() is unknown
the -Wformat-overflow checker uses get_range_strlen() to compute
the length of the longest string that fits in an array reference
by the subexpression (i.e., sizeof array - 1) and uses it to
issue warnings.  This works with member arrays but because of
a bug/limitation it doesn't work for non-member arrays.  Bug
79538 tracks this.  So the warning above suggests your change
has fixed the problem -- good work! :)

Martin


Re: Please review writeup for fixing PR 78809 (inline strcmp for small constant strings)

2017-11-17 Thread Jeff Law
On 11/17/2017 03:45 PM, Qing Zhao wrote:
>>> do you think using this routine is good? or do you have other
>>> suggestions (since I am still not very familiar with the internals of
>>> GCC, might not find the best available one now…)
>> The range information attached to an SSA_NAME is global data.  ie, it
>> must hold at all locations where the object in question might be
>> referenced.  This implies that it will sometimes (often?) be less
>> precise than you might like.
> 
> do you mean the “value_range” attached to SSA_NAME?
> 
> For my purpose, I’d like to get the maximum length of char array s[100] is 
> 100, which is larger than the size of constant string “abc”, then
> I can safely apply the transformation to memcmp. 
> 
> can “value_range” info serve this purpose?
No it can't.  Sorry for leading you the wrong direction.  What you're
looking for is the object size interfaces.

See tree-object-size.[ch]

That's a pass that tries to compute the sizes of various objects
referenced by the IL.

Note that the object size is different than say the length of a string
stored in an object for which you'll probably be looking at
tree-ssa-strlen's interfaces.

Ranges are more for integer objects.  ie, i has the value [0,25] or ~[0,0].





> 
>>
>> I am currently working towards an embeddable context sensitive range
>> analyzer that in theory could be used within tree-ssa-strlen pass to
>> give more precise range information.  I'm hoping to wrap that work up in
>> the next day or so so that folks can use it in gcc-8.
> 
> such context sensitive range info should be useful when we relax the constant 
> “N” to be an expression whose Min value is larger than the length
> of constant string, with it, we can catch more opportunities. 
> let me know when this info is available.
Hoping to have the basics into the trunk within the next few days as
reviews flow in.

jeff


Re: Please review writeup for fixing PR 78809 (inline strcmp for small constant strings)

2017-11-17 Thread Jeff Law
On 11/17/2017 03:50 PM, Qing Zhao wrote:

>>
>> The difficulty is tracking when exposure leads to these secondary
>> opportunities.  I often end up looking for this kind of stuff by first
>> identifying many source files where the transformation applies.  Then I
>> generate dumps & assembly code for the transformation in each candidate
>> location.  I first analyze the assembly files for differences.  If an
>> assembly file shows a difference, then I look more closely to try and
>> characterize the difference and if it looks interesting, then I work
>> backwards to the dump files for more details.
> 
> usually, what kind of benchmarks or test cases are you recommending? 
> I used to work a lot with SPEC, not sure whether that’s the same for working 
> in GCC?
SPEC is always good.  Many of us also work with GCC's sources themselves
to help determine if a particluar transformation is helpful.

Jeff


Re: Please review writeup for fixing PR 78809 (inline strcmp for small constant strings)

2017-11-17 Thread Jeff Law
On 11/17/2017 03:20 PM, Qing Zhao wrote:
> 
>> On Nov 16, 2017, at 6:24 PM, Martin Sebor > > wrote:
>>>
>>> In my current local implementation, I used the following routine to
>>> get the range info:  (and use the MINMAXLEN[1]+1 for the length of
>>> the non-constant string)
>>>
>>> /* Determine the minimum and maximum value or string length that ARG
>>>   refers to and store each in the first two elements of MINMAXLEN.
>>>   For expressions that point to strings of unknown lengths that are
>>>   character arrays, use the upper bound of the array as the maximum
>>>   length.  For example, given an expression like 'x ? array : "xyz"'
>>>   and array declared as 'char array[8]', MINMAXLEN[0] will be set
>>>   to 3 and MINMAXLEN[1] to 7, the longest string that could be
>>>   stored in array.
>>>   Return true if the range of the string lengths has been obtained
>>>   from the upper bound of an array at the end of a struct.  Such
>>>   an array may hold a string that's longer than its upper bound
>>>   due to it being used as a poor-man's flexible array member.  */
>>>
>>> bool
>>> get_range_strlen (tree arg, tree minmaxlen[2])
>>> {
>>> }
>>>
>>> However, this routine currently miss a very obvious case as the
>>> following:
>>>
>>> char s[100] = {'a','b','c','d’};
>>>
>>> __builtin_strcmp(s, "abc") != 0
>>>
>>> So, I have to change this routine to include such common case.
>>
>> There was a discussion some time ago about converting CONSTRUCTOR
>> trees emitted for array initializers like the above to STRING_CST
>> (see bug 71625 for some background).  I think that would still be
>> the ideal solution.  Then you wouldn't have to change
>> get_range_strlen.
> 
> thanks for the info, Martin.
> 
> In my case, it’s the size of “100” cannot be collected in the
> MINMAXLEN[1] for the string “s”. 
> 
> I need to make sure that the size of variable string s is larger than
> the size of constant string “abc” to guarantee the safety of the
> transformation.
> 
> currently, “get_range_strlen” cannot identify the simple VAR_DECL with
> array_type to determine the maximum size of the string. 
It sounds more like you want the object_size interfaces.  See
tree-object-size.[ch]

Jeff


Re: Please review writeup for fixing PR 78809 (inline strcmp for small constant strings)

2017-11-17 Thread Richard Biener
On November 17, 2017 11:20:45 PM GMT+01:00, Qing Zhao  
wrote:
>
>> On Nov 16, 2017, at 6:24 PM, Martin Sebor  wrote:
>>> 
>>> In my current local implementation, I used the following routine to
>get the range info:  (and use the MINMAXLEN[1]+1 for the length of the
>non-constant string)
>>> 
>>> /* Determine the minimum and maximum value or string length that ARG
>>>   refers to and store each in the first two elements of MINMAXLEN.
>>>   For expressions that point to strings of unknown lengths that are
>>>   character arrays, use the upper bound of the array as the maximum
>>>   length.  For example, given an expression like 'x ? array : "xyz"'
>>>   and array declared as 'char array[8]', MINMAXLEN[0] will be set
>>>   to 3 and MINMAXLEN[1] to 7, the longest string that could be
>>>   stored in array.
>>>   Return true if the range of the string lengths has been obtained
>>>   from the upper bound of an array at the end of a struct.  Such
>>>   an array may hold a string that's longer than its upper bound
>>>   due to it being used as a poor-man's flexible array member.  */
>>> 
>>> bool
>>> get_range_strlen (tree arg, tree minmaxlen[2])
>>> {
>>> }
>>> 
>>> However, this routine currently miss a very obvious case as the
>following:
>>> 
>>> char s[100] = {'a','b','c','d’};
>>> 
>>> __builtin_strcmp(s, "abc") != 0
>>> 
>>> So, I have to change this routine to include such common case.
>> 
>> There was a discussion some time ago about converting CONSTRUCTOR
>> trees emitted for array initializers like the above to STRING_CST
>> (see bug 71625 for some background).  I think that would still be
>> the ideal solution.  Then you wouldn't have to change
>> get_range_strlen.

IIRC Martin had a patch for this. 

Richard. 

>thanks for the info, Martin.
>
>In my case, it’s the size of “100” cannot be collected in the
>MINMAXLEN[1] for the string “s”. 
>
>I need to make sure that the size of variable string s is larger than
>the size of constant string “abc” to guarantee the safety of the
>transformation.
>
>currently, “get_range_strlen” cannot identify the simple VAR_DECL with
>array_type to determine the maximum size of the string. 
>
>Qing
>> 
>> Martin