On Fri, Nov 2, 2018 at 4:48 PM Nathan Willis <[email protected]> wrote:
> Hiyo. I'm revisiting the 'clusters' chapter in the User Manual, to make it
> more consistent with the rest and hopefully easier to understand. Rereading
> it has raised some questions....
>
> 1) The opening sentence says "a cluster is a sequence of code points [...]"
> ...which might be true for the initial buffer contents, but all the
> interesting stuff happens after replacing them with glyphs. So that will
> certainly need some changing. I know it's a can of worms to open, but what
> if we said "characters" here? Explaining the relationship between code
> points, characters, and glyphs can be tricky, but then again explaining
> clusters to new readers is already difficult....
>

Right. "code points" by itself doesn't mean anything. They might be
Unicode code points, aka characters, or glyph code points, aka glyphs.

A cluster refers to a sequence of characters and their corresponding
glyphs. Or the other way around, depending on your taste.

> 2) "Most clients will use UTF-8, UTF-16, or UTF-32 indices, but the actual
> number does not matter" ... is "indices" here referring to the buffer
> contents (code points)?
>

It refers to the position in the text that was passed to
hb_buffer_add_utf8/16/32.

> 3) "Moreover, it is not required for the cluster values to be
> monotonically increasing. Most of HarfBuzz's tests are performed on
> monotonically increasing cluster numbers but, there is no such assumption
> in the code itself."
>

Some of that sentence is implementation details and can go. The first
sentence is enough.

> ... This is the big one. The examples that follow in subsequent subsections
> hinge on the fact that the cluster values need to be monotonically
> increasing. Keeping them monotonic & increasing is given as the reason that
> clusters get merged when reordering (levels 0 and 1). So this sentence
> sticks out. I'm not sure how to resolve that discrepancy; can anyone
> explain how both of those pieces are supposed to fit together?
>

There are two things:

1. Whether or not the input clusters are monotonic,
2. Whether the buffer cluster-level is set to any of the monotonic enum
   values.

The promise is that *if* both of those are true, then the output cluster
values are monotonic. If either of them is false, there's no guarantee.

> Finally, I am adding a short "why your software cares about clusters"
> paragraph to the beginning. I've got cursor positioning, coloring
> diacritics, and line breaking in mind; anything else worth mentioning?
>

Text selection. It is the same as positioning, but still.

> Thanks,
> Nate
> --
> nathan.p.willis
> [email protected] <http://identi.ca/n8>
> _______________________________________________
> HarfBuzz mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/harfbuzz
>

--
behdad
http://behdad.org/
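
To illustrate the answers to questions 2 and 3, here is a minimal sketch in plain Python. It does not call HarfBuzz. The helper names `utf8_cluster_values` and `is_monotonic` are hypothetical, the positions are assumed to be counted in UTF-8 code units (bytes) as for text passed to hb_buffer_add_utf8, and treating a never-increasing sequence as monotonic is an assumption made to cover right-to-left output.

```python
def utf8_cluster_values(text):
    """Return, for each character, the UTF-8 byte offset where it starts.

    Assumption: this mirrors the initial cluster value a character would
    get when the same text is passed in as UTF-8.
    """
    offsets = []
    position = 0
    for ch in text:
        offsets.append(position)
        position += len(ch.encode("utf-8"))
    return offsets


def is_monotonic(clusters):
    """True if the values never decrease, or never increase.

    Assumption: a never-increasing run counts as monotonic too, which is
    what one would expect to see for right-to-left output.
    """
    pairs = list(zip(clusters, clusters[1:]))
    non_decreasing = all(a <= b for a, b in pairs)
    non_increasing = all(a >= b for a, b in pairs)
    return non_decreasing or non_increasing


# 'a', then U+0301 COMBINING ACUTE ACCENT (2 bytes in UTF-8), then 'b'
input_clusters = utf8_cluster_values("a\u0301b")
print(input_clusters, is_monotonic(input_clusters))
```

A client could run a check like `is_monotonic` on its input cluster values before relying on the promise above: with non-monotonic input, or a non-monotonic cluster level, the output order carries no guarantee.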
