erichkeane added a comment. In D124221#3494000 <https://reviews.llvm.org/D124221#3494000>, @aaron.ballman wrote:
> In D124221#3493792 <https://reviews.llvm.org/D124221#3493792>, @erichkeane > wrote: > >> FWIW, I'm in favor of the patch as it sits. >> >> As a followup: So I was thinking about the "%s" specifier for string types. >> Assuming char-ptr types are all strings is a LITTLE dangerous, but more so >> the way we're doing it. Its a shame we don't have some way of setting a >> 'max' limit to the number of characters we have for 2 reasons: >> >> 1- For safety: If the char-ptr points to non-null-terminated memory, it'll >> stop us from just arbitrarily printing into space by limiting at least the >> NUMBER of characters we print into nonsense. >> 2- For readability: printing a 'long' string likely makes this output look >> like nonsense and breaks everything up. Limiting us to only a few >> characters is likely a good idea. >> 3- <Bonus #3 from @aaron.ballman >: It might discourage SOME level of >> attempts at using this for reflection, or at least make it a little harder. >> >> What I would love would be for something like a 10 char max: >> >> struct S { >> char *C; >> }; >> S s { "The Rest of this string is cut off"}; >> print as: >> struct U20A a = { >> .c = 0x1234 "The Rest o" >> }; >> >> Sadly, I don't see something like that in printf specifiers? Unless someone >> smarter than me can come up with some trickery. PERHAPS have the max-limit >> compile-time configurable, but I don't feel strongly. > > The C Standard has this in the specification of the %s format specifier: > > If no l length modifier is present, the argument shall be a pointer to > storage of character > type. Characters from the storage are written up to (but not including) the > terminating > null character. If the precision is specified, no more than that many bytes > are written. If > the precision is not specified or is greater than the size of the storage, > the storage shall > contain a null character. > > So you can use the precision modifier on %s to limit the length to a > particular number of bytes. The only downside I can think of to picking a > limit is, what happens when the user stores valid UTF-8 data in their string > and prints it via `%.10s` (will we then potentially be splitting a codepoint > in half and that does something bad? Ah! TIL! That I think would be an excellent improvement to this builtin. I suspect we could just choose a fixed value to start (maybe even in this patch @rsmith ?), then add a compile-time option in the future if folks care to extend it. I would posit that 10-15 would be a good start? Any shorter would be prohibitive I think? I could be talked into a little longer... Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D124221/new/ https://reviews.llvm.org/D124221 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits