The need to read a file (in the locale encoding) and to decode it
incrementally seems to be a common one.

With Gnulib, this can be done by using the mbfile module to read the
file one multibyte character at a time and then using the striconveh
module to convert each multibyte character to, say, UTF-8 or UTF-32.

This, however, doesn't seem to be very efficient because each
multibyte character has to be examined at least twice: once by the
mbfile iterator and once by the striconveh conversion.

Thus, I am wondering whether it makes sense to offer a stateful
decoder that accepts input one byte at a time and signals as soon as
a complete decoded sequence is available.
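
To make the idea concrete, here is a minimal sketch of such a
decoder. It is only an illustration, not an existing Gnulib API: the
names byte_decoder, byte_decoder_init, byte_decoder_push and the
DECODE_* constants are all made up for this mail. It uses the standard
C function mbrtowc as a stand-in for the real conversion, so it yields
wchar_t values, which are not guaranteed to be UTF-32 on every
platform.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* All names below are hypothetical; nothing here exists in Gnulib.  */
enum decode_status { DECODE_NEED_MORE, DECODE_READY, DECODE_ERROR };

struct byte_decoder
{
  mbstate_t state;   /* conversion state carried between pushed bytes */
};

/* Put the decoder into the initial conversion state.  */
static void
byte_decoder_init (struct byte_decoder *d)
{
  memset (&d->state, 0, sizeof d->state);
}

/* Feed one byte.  Returns DECODE_READY and stores the character in *OUT
   once a multibyte character is complete, DECODE_NEED_MORE while the
   character is still incomplete, and DECODE_ERROR on an invalid
   sequence.  */
static enum decode_status
byte_decoder_push (struct byte_decoder *d, unsigned char byte, wchar_t *out)
{
  char c = (char) byte;
  size_t ret = mbrtowc (out, &c, 1, &d->state);

  if (ret == (size_t) -2)
    return DECODE_NEED_MORE;
  if (ret == (size_t) -1)
    {
      /* The state is unspecified after an error, so reset it.  */
      byte_decoder_init (d);
      return DECODE_ERROR;
    }
  return DECODE_READY;   /* ret is 0 (null character) or 1 */
}
```

A caller would first call setlocale (LC_ALL, "") so that the locale
encoding is in effect, and then push the bytes obtained from getc one
at a time. Each byte is then looked at only once.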

On top of that, a Unicode-decoding counterpart to the mbfile
interface could be built, say ucfile.

Thanks,

Marc
