erichkeane added a comment.

In D124221#3494000 <https://reviews.llvm.org/D124221#3494000>, @aaron.ballman 
wrote:

> In D124221#3493792 <https://reviews.llvm.org/D124221#3493792>, @erichkeane 
> wrote:
>
>> FWIW, I'm in favor of the patch as it sits.
>>
>> As a followup: So I was thinking about the "%s" specifier for string types.  
>> Assuming char-ptr types are all strings is a LITTLE dangerous, but more so 
>> the way we're doing it.  Its a shame we don't have some way of setting a 
>> 'max' limit to the number of characters we have for 2 reasons:
>>
>> 1- For safety: If the char-ptr points to non-null-terminated memory, it'll 
>> stop us from just arbitrarily printing into space by limiting at least the 
>> NUMBER of characters we print into nonsense.
>> 2- For readability: printing a 'long' string likely makes this output look 
>> like nonsense and breaks everything up.  Limiting us to only a few 
>> characters is likely a good idea.
>> 3- <Bonus #3 from @aaron.ballman >: It might discourage SOME level of 
>> attempts at using this for reflection, or at least make it a little harder.
>>
>> What I would love would be for something like a 10 char max:
>>
>>   struct S {
>>      char *C;
>>    };
>>    S s { "The Rest of this string is cut off"};
>>    print as:
>>    struct U20A a = {
>>      .c = 0x1234 "The Rest o"
>>    };
>>
>> Sadly, I don't see something like that in printf specifiers?  Unless someone 
>> smarter than me can come up with some trickery.  PERHAPS have the max-limit 
>> compile-time configurable, but I don't feel strongly.
>
> The C Standard has this in the specification of the %s format specifier:
>
>   If no l length modifier is present, the argument shall be a pointer to 
> storage of character
>   type. Characters from the storage are written up to (but not including) the 
> terminating
>   null character. If the precision is specified, no more than that many bytes 
> are written. If
>   the precision is not specified or is greater than the size of the storage, 
> the storage shall
>   contain a null character.
>
> So you can use the precision modifier on %s to limit the length to a 
> particular number of bytes. The only downside I can think of to picking a 
> limit is, what happens when the user stores valid UTF-8 data in their string 
> and prints it via `%.10s` (will we then potentially be splitting a codepoint 
> in half and that does something bad?

Ah! TIL!  That I think would be an excellent improvement to this builtin.  I 
suspect we could just choose a fixed value to start (maybe even in this patch 
@rsmith ?), then add a compile-time option in the future if folks care to 
extend it.  I would posit that 10-15 would be a good start?  Any shorter would 
be prohibitive I think?  I could be talked into a little longer...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124221/new/

https://reviews.llvm.org/D124221

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to