> On Jul 2, 2023, at 11:16 PM, Jer Haan <[email protected]> wrote:
>
> This table is copied from Wikipedia.
> <uencoding.pas>
> Hope it’s useful for you.
> If you improve the code pls let me know.

This is perfect, thanks! Much more complicated than I thought.

I'm curious now: if you were going the other direction and parsing a string of Unicode characters whose encoded sequences have different lengths, how would you know the length of each one? For example, I started off knowing which Unicode scalar to use by looking at a table, but what if I had to find the character in a stream of text? I think UTF-8 characters can be 1 to 4 bytes long, so you could encounter a 1-byte character followed by 4-byte characters, all interleaved, and there's no header or terminator for each character. How is this solved?

Regards,
Ryan Joseph

_______________________________________________
fpc-pascal maillist - [email protected]
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
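
On the question above about finding character boundaries: in UTF-8 the first byte of each sequence carries the length in its high bits (0xxxxxxx is 1 byte, 110xxxxx is 2, 1110xxxx is 3, 11110xxx is 4), and every following byte has the form 10xxxxxx, so no separate header or terminator is needed. Below is a minimal sketch of walking a byte buffer this way. The names Utf8SeqLen and Data are made up for illustration and are not taken from uencoding.pas, and the sketch does not validate continuation bytes or reject overlong encodings.

```pascal
program Utf8Lengths;
{$mode objfpc}{$H+}

{ Hypothetical helper: returns the total byte length of the UTF-8
  sequence that starts with Lead, or 0 if Lead cannot start a
  sequence (a continuation byte 10xxxxxx, or 11111xxx). }
function Utf8SeqLen(Lead: Byte): Integer;
begin
  if Lead < $80 then
    Result := 1                       // 0xxxxxxx: ASCII
  else if (Lead and $E0) = $C0 then
    Result := 2                       // 110xxxxx
  else if (Lead and $F0) = $E0 then
    Result := 3                       // 1110xxxx
  else if (Lead and $F8) = $F0 then
    Result := 4                       // 11110xxx
  else
    Result := 0;                      // not a lead byte
end;

const
  // 'a', U+00E9, U+20AC and U+1F600 as raw UTF-8 bytes (1 + 2 + 3 + 4)
  Data: array[0..9] of Byte =
    ($61, $C3, $A9, $E2, $82, $AC, $F0, $9F, $98, $80);

var
  I, Len: Integer;
begin
  I := 0;
  while I <= High(Data) do
  begin
    Len := Utf8SeqLen(Data[I]);
    if Len = 0 then
      Len := 1;  // stray byte: step over it and try to resynchronize
    WriteLn('offset ', I, ': ', Len, ' byte(s)');
    Inc(I, Len);
  end;
end.
```

With the sample bytes this reports sequences starting at offsets 0, 1, 3 and 6, with lengths 1, 2, 3 and 4. Because continuation bytes never look like lead bytes, a reader that lands in the middle of a character can skip bytes of the form 10xxxxxx until it reaches the next boundary.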
