On Wed, Jun 6, 2018 at 2:29 PM, Richard Wordingham <[email protected]> wrote:
> On Tue, 5 Jun 2018 09:42:38 -0500
> Nathan Willis <[email protected]> wrote:
>
> > Your feedback and help is appreciated!
>
> * Malayalam Remarks *
>
> In Sections 2.2 and 2.3, how are multiple vowels handled, such as
> U+0D4A and U+0D4B?  I'm particularly interested in the handling of
> multiple left matras.

Hmm. So, as I understand it, in HarfBuzz the presence of multiple
matras (on any side) would be an issue dealt with by the
syllable-identification regular expressions, before getting to the
reordering stuff. It seems like this is what is used (the same regexps
being used for all scripts in HarfBuzz's Indic shaper):

    matra_group = z{0,3}.M.N?.(H | forced_rakar)?;
    [...]
    halant_or_matra_group = (final_halant_group | (H.ZWJ)? matra_group{0,4});

... and that only permits four matras (total) per syllable.

I vaguely recall seeing a commit message or comment or something
indicating that this limit was there to maintain compatibility with how
Uniscribe matches syllables, but I searched around and couldn't find it
today. It was something along the lines of the Microsoft docs saying
"one matra for each type [L,R,T,B] is permitted," but it isn't clear
whether that's justified by orthography at all or is just a practical
concession that they made for some reason. Others with more Uniscribe
knowledge may know.

That having been said, I *think* that HarfBuzz doesn't rearrange two
adjacent codepoints that have the same sort-ordering tags. So
"Consonant,U+0D4A,U+0D4B" ought to get the matras decomposed, then the
two left-side parts move together as-is to the left of the consonant,
and the two right-side parts remain unchanged.

You could test that with

    hb-view /usr/share/fonts/truetype/noto/NotoSerifMalayalam-Regular.ttf --unicodes=0d15,0d4a,0d4b

I'm on a new machine right at the moment and apparently don't have all
of Noto installed just yet, or I'd just try it. Will update later.
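
To make the "same tag, same relative order" idea concrete, here is a
minimal Python sketch of what I'd expect. It is not HarfBuzz code: the
position tags, the LEFT_MATRAS set, and the reorder() helper are all
names I made up for illustration, and treating every non-letter as
post-base is a deliberate oversimplification. It only relies on the
canonical decompositions of U+0D4A and U+0D4B and on a stable sort.

```python
import unicodedata

# Hypothetical position tags (illustrative only, not HarfBuzz's own values).
PRE_BASE, BASE, POST_BASE = 0, 1, 2

# Malayalam vowel signs that are written to the left of the base:
# U+0D46 (E), U+0D47 (EE), U+0D48 (AI).
LEFT_MATRAS = {"\u0d46", "\u0d47", "\u0d48"}


def position(ch):
    """Assign a simplified position tag to one codepoint."""
    if ch in LEFT_MATRAS:
        return PRE_BASE
    if unicodedata.category(ch) == "Lo":
        return BASE  # assumption: any letter is the base consonant
    return POST_BASE  # simplification: everything else stays after the base


def reorder(syllable):
    """Decompose split matras, then stable-sort by position tag.

    Python's sorted() is stable, so codepoints that share a tag keep
    their original relative order.
    """
    decomposed = unicodedata.normalize("NFD", syllable)
    return "".join(sorted(decomposed, key=position))


if __name__ == "__main__":
    result = reorder("\u0d15\u0d4a\u0d4b")
    print(" ".join("U+%04X" % ord(c) for c in result))
```

With KA followed by U+0D4A and U+0D4B, both left-side halves end up in
front of the consonant in their original order, and the two right-side
halves stay behind it. Whether the real shaper does the same is exactly
what the hb-view run should confirm.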

In the meantime, I honestly can't speak to whether or not that's the
correct behavior for the script. Behdad? Any thoughts on that?

> In Section 3, how does tagging interact with substitutions?  Features
> can in general split and merge glyphs.

The tagging described in stage 2 is just the reordering /
syllable-position tags. So after all that is done, sorting the syllable
into its final order is (AIUI) the last point at which the tags come
into play. I do know that HarfBuzz keeps track of other sorts of state
that it may refer to internally as tags, but I don't think any of these
docs reference those, just the reordering position tags.

So applying the features in stage 3 doesn't interact with the tags, at
least not directly. If the tagging were wrong, of course, then the final
sorted order might be wrong and sequences wouldn't match up to the
substitution rules in GSUB. But, if I follow HarfBuzz's logic right, the
reordering stuff cannot be switched off, so it always happens completely
before any substitutions start, and that seems to be what other shapers
did first.

Should there be a wording change to address that in the document itself?

Thanks,
Nate

> Richard.
> _______________________________________________
> HarfBuzz mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/harfbuzz

-- 
nathan.p.willis
[email protected]
<http://identi.ca/n8>
