sammccall added a comment. Thanks for the review, and sorry for the subtlety of the code and sparse comments. It should be a little better now, please let me know which parts aren't clear enough.
================ Comment at: clangd/FuzzyMatch.cpp:69 + : NPat(std::min<int>(MaxPat, Pattern.size())), NWord(0), + ScoreScale(0.5f / NPat) { + memcpy(Pat, Pattern.data(), NPat); ---------------- klimek wrote: > Why .5? The .5 and the 2 are the same thing. Extracted a constant with a comment. ================ Comment at: clangd/FuzzyMatch.cpp:92 +// Segmentation of words and patterns +// We mark each character as the Head or Tail of a segment, or a Separator. +// e.g. XMLHttpRequest_Async ---------------- klimek wrote: > Do you mean "part of the Head or Tail"? > Also, explain that these are the CharRoles. A reader reads this first, and > will search for what CharRole means in the code later. CharRole is defined in > a different file, without comments, so figuring out how that all relates is > super hard :) Rewrote this section and added more comments. CharRole is defined here now. Each character in a segment that isn't the Head is the Tail. It's a bit of a stretch, but it's short and evocative and (now) explained with examples. ================ Comment at: clangd/FuzzyMatch.cpp:110 +// 4 packed CharTypes per entry, indexed by char. +constexpr uint8_t CharTypes[] = { + 0x00, 0x00, 0x00, 0x00, // Control characters ---------------- klimek wrote: > Finding bugs in these will be hard :) Ack :-( ================ Comment at: clangd/FuzzyMatch.cpp:120 + 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, // Bytes over 127 -> Lower. + 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, // This includes UTF-8. + 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, ---------------- klimek wrote: > Can you expand in the comment why this works for utf-8? Done. It doesn't really "work", so much as we just give up... ================ Comment at: clangd/FuzzyMatch.cpp:212 + ScoreT S = 0; + // Bonus: pattern is part of a word prefix. + if (P == W) ---------------- klimek wrote: > Why does P == W imply that? Every pattern character must match in order, so a match with P < W is impossible, and P == W means the match is perfect so far. (Also explained in the comment) ================ Comment at: clangd/FuzzyMatch.cpp:215 + ++S; + // Bonus: case matches, or an asserted word break matches an actual. + if (Pat[P] == Word[W] || (PatRole[P] == Head && WordRole[W] == Head)) ---------------- klimek wrote: > This is the first time the term "asserted word break" shows up, perhaps > explain this when explaining the roles. Rephrased the comment here to use the familiar terminology. ================ Comment at: clangd/FuzzyMatch.h:39 + +private: + constexpr static int MaxPat = 63; ---------------- klimek wrote: > I find most of the abbreviations here non-intuitive, and thus needing > comments (Pat I get is for pattern :) > N - what does it mean? Number of characters in the pattern? I'd use Length > instead. > LPat and LWord (no idea what L could stand for). > Comments added throughout. > N - what does it mean? Number of characters in the pattern? I'd use Length > instead. `WordLength` is too long for something so common I think, these each have >20 refs. Changed `NWord` -> `WordN` which I think reads better - `Word` and `WordN` form a nice pair. (N rather than L for length because of confusion with Lower) Changed `LWord` to `LowWord` etc. ================ Comment at: clangd/FuzzyMatch.h:50 + + int NPat, NWord; + char Pat[MaxPat]; ---------------- klimek wrote: > I'd use a StringRef instead, and call the storage *Storage or something. What's the goal here? I have a couple of objections to this: - if you actually use StringRef[] to access the data, now you've got a gratuitous indirection everywhere - For `LPat`/`LWord` too? Now we have *more* members than in the first place, and two ways to write each bounds check. If the intent is to clean up the places where I construct `StringRef(Word, NWord)` explicitly, adding `StringRef word()` would certainly make sense. https://reviews.llvm.org/D40060 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits