Hi Naomi,
I don't have a conclusive answer for you on this yet, but let me pick up on
a few points.
First, the apostrophe is probably being handled by the
ICUCollationKeyFilterFactory, which is ignoring punctuation.
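
To make the punctuation-versus-letter distinction concrete, here is a small sketch that looks up the Unicode general category of the code points mentioned in this thread. It assumes Python's standard unicodedata module; the choice of code points (including U+0301 as a contrasting true diacritic) is mine for illustration, and it says nothing about how the ICU collator in Solr is actually configured.

```python
import unicodedata

# Code points picked for illustration only (an assumption, not taken from any config):
# the ASCII apostrophe, the modifier letters discussed in this thread,
# and a combining accent as an example of a real diacritic.
code_points = ["\u0027", "\u02BB", "\u02BC", "\u02BE", "\u0301"]

# Map "U+XXXX" -> general category as recorded in the Unicode Character Database.
categories = {f"U+{ord(ch):04X}": unicodedata.category(ch) for ch in code_points}

for ch in code_points:
    label = f"U+{ord(ch):04X}"
    print(label, unicodedata.name(ch), categories[label])
```

The ASCII apostrophe reports a punctuation category (Po), the modifier letters report Lm, and the combining accent reports Mn, so a filter that ignores punctuation would treat the first differently from the rest.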
Alif isn't a diacritic but a letter, and its character properties would be
handled
y used for.
Charles
On Thu, May 24, 2012 at 1:41 PM, Naomi Dushay wrote:
> The alif and ayn can also be used as diacritic-like characters in Korean;
> this is a known practice. But thanks anyway.
>
> On May 24, 2012, at 9:30 AM, Charles Riley wrote:
>
> Hi Naomi,
>
>
"the encoding of the character used for alif (02BE) carries with it an
assigned property in the Unicode database of (Lm), putting it into the
category of 'Modifier_Letter'..."
Correction to what I put there: the code point is 02BC, not 02BE. The rest of that still
holds up; the data I'm looking at regarding proper