On 08/25/18 01:54, Jeff Law wrote:
> On 08/24/2018 11:26 AM, Bernd Edlinger wrote:
>> On 08/24/18 18:51, Jeff Law wrote:
>>>> Well, this is broken for wide character strings.
>>>> but I hope we can get rid of STRING_CST which are
>>>> not explicitly null terminated.
>>
>> I am afraid that is not going to happen.
>> Maybe we can get STRING_CST that are never longer
>> than the TYPE_UNIT_SIZE, but c_strlen and c_getstr
>> need to take care that the string is zero-terminated.
>>
>> string_constant should not promise the string is zero terminated.
>> But instead it can promise that:
>> 1) the STRING_CST is valid up to TREE_STRING_LENGTH
>> 2) mem_size is >= TREE_STRING_LENGTH
>> 3) memory between TREE_STRING_LENGTH and mem_size is ZERO.
>>
>> It will not guarantee anything about zero termination any more.
> Interesting because those conditions would be sufficient to deal with a
> regression I stumbled over after fixing Martin's patch to not assume
> that all STRING_CSTs are NUL terminated.
> 
> But I need to think about this a bit more.  Essentially the question
> we'd need to ask is whether or not these are sufficient in general or
> just in specific cases.
> 
> I tend to think they're not sufficient in general. If a string returned
> by string_constant lacked a terminating NUL but passed the tests
> above, and was ultimately passed to the runtime's str*
> routines, then the call may run off the end of the string.  We'd like to
> be able to warn for that.
> 
> So ISTM those rules are only valid in contexts where we know the result
> isn't going to be passed to str* and friends within the C library.
> 
> I do think they're sufficient to avoid problems with the
> tree-ssa-forwprop code we've looked at.  So what may make the most sense
> is to have that routine indicate it's willing to accept unterminated
> strings, then check the conditions above before optimizing the code.
> 

There are not too many callers of string_constant.
Not all need zero termination.

But I think if they are interested in zero-termination
they should simply call c_strlen or c_getstr.
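
To make that split concrete, here is a rough sketch in plain C. It is a
toy model only, not GCC code: struct toy_string_cst and toy_c_strlen are
hypothetical names, length stands in for TREE_STRING_LENGTH, and it
handles narrow strings only (wide character strings would need the
element size taken into account). It assumes the three promises quoted
above and shows how a c_strlen-like caller could decide about
termination on its own:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for what string_constant hands back.
   This is not a GCC type; it only models the three promises.  */
struct toy_string_cst
{
  const char *bytes;  /* promise 1: valid for `length` bytes */
  size_t length;      /* models TREE_STRING_LENGTH */
  size_t mem_size;    /* size of the object being initialized */
};

/* Hypothetical c_strlen-like helper.  Returns true and stores the
   string length in *len when a NUL is known to exist inside the
   object; returns false when the string is unterminated.  */
bool
toy_c_strlen (const struct toy_string_cst *s, size_t *len)
{
  /* promise 2: mem_size >= TREE_STRING_LENGTH */
  assert (s->mem_size >= s->length);

  /* A NUL inside the explicit bytes ends the string there.  */
  const char *nul = memchr (s->bytes, 0, s->length);
  if (nul != NULL)
    {
      *len = (size_t) (nul - s->bytes);
      return true;
    }

  /* promise 3: bytes between length and mem_size are zero, so any
     tail at all supplies the terminator.  */
  if (s->mem_size > s->length)
    {
      *len = s->length;
      return true;
    }

  /* No NUL anywhere in the object, e.g. char a[3] = "abc".  */
  return false;
}
```

With this shape, a caller that only copies or compares bytes can use the
raw result directly, while a caller that cares about termination asks
the helper and can warn or punt when it returns false.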


>>
>> In the end, the best approach might be to either merge my patch
>> with Martin's, or step-wise, first fixing wrong code, and then
>> implementing warnings without fixing wrong code.
> Unsure at this time.  I've been working with both.  I suspect that if we
> went with yours that we'd then turn around and layer Martin's on top of
> it because of the desire to signal to callers that we have an
> unterminated string and have the callers take appropriate action.  Which
> begs the question of whether or not we just go with Martin's -- ie, is
> there really any value in using both.  I haven't seen indications there
> is value in that approach, but I'm still poking at things.
> 

Well, ya call it "layer one patch over the other"
I call it "incremental improvements".


Bernd.
