> On Jan 30, 2018, at 7:35 AM, Pavel Labath <lab...@google.com> wrote:
>
> Hello all,
>
> I am looking for feedback regarding implementation of the case folding
> algorithm for .debug_names hashes.
>
> Unlike the Apple tables, the .debug_names hashes are computed from
> case-folded names (to enable case-insensitive lookups for languages
> where that makes sense). The DWARF 5 document specifies that the case
> folding should be done according to the "Caseless matching" Section
> of the Unicode standard (whose implementation is basically a long list
> of special cases). While certainly possible, implementing this would
> be much more complicated (and would probably make the code a bit
> slower) than a simple tolower(3) call. And the benefits of this are
> not really clear to me.

Assuming a UTF-8 encoding, will tolower(3) destroy any non-ASCII characters in the process? In Swift, for example, we allow a wide range of Unicode characters in identifiers, and I want to make sure that this doesn't cause any problems.

-- adrian

>
> Do you know if we already make any promises or assumptions about the
> encoding and/or locale of the symbol names (and here I mainly mean the
> names in the debug info metadata, not LLVM symbols)?
>
> If we don't already have a policy about this, then I propose to
> implement the case folding via tolower() (which is compatible with the
> full case folding algorithm, as long as one sticks to Basic Latin
> characters).
>
> What do you think?
> _______________________________________________
> lldb-dev mailing list
> lldb-dev@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
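
To make the UTF-8 question concrete, here is a minimal sketch of a fold that only touches ASCII 'A'..'Z' and passes every other byte through unchanged. It assumes the names are NUL-terminated UTF-8; the helper name ascii_fold_lower is hypothetical and this is not code from LLVM or LLDB. Because every byte of a multi-byte UTF-8 sequence is >= 0x80, such a fold cannot corrupt non-ASCII characters, though it also does not fold them (so "Ü" and "ü" would still hash differently).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not an LLVM/LLDB API): lower-case only the ASCII
 * letters 'A'..'Z' in a NUL-terminated string, in place. Bytes >= 0x80,
 * which is where all multi-byte UTF-8 sequences live, are never modified. */
static void ascii_fold_lower(char *s) {
    assert(s != NULL);
    size_t len = strlen(s);
    for (size_t i = 0; i < len; ++i) {
        /* Go through unsigned char so bytes >= 0x80 are not seen as negative. */
        unsigned char c = (unsigned char)s[i];
        if (c >= 'A' && c <= 'Z')
            s[i] = (char)(c - 'A' + 'a');
    }
}
```

In the "C" locale, tolower(3) maps the same 26 letters, but its result depends on the current locale, and passing it a negative char value (other than EOF) is undefined behaviour. A locale-independent loop like the one above sidesteps both issues, at the cost of not being the full Unicode "Caseless matching" fold that the DWARF 5 document asks for.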