On Mon, Jan 17, 2022 at 10:46:29PM +0000, Colin Watson wrote:
> test_manfile (which despite the name is not a test function) calls
> find_name with file!="-" and encoding=NULL; that causes find_name to
> call get_page_encoding, which always returns something non-NULL
> ("ISO-8859-1" for English pages), and then call add_manconv from that to
> UTF-8.

I think there's a potential bug here; from attaching gdb and breaking on
iconv_open, it seems there are a lot of conversions from UTF-8 to UTF-8,
which should be no-ops (except that they might do some additional
well-formedness checking). Is that intentional?

Apart from that, I've given your patch a quick run, and it seems to cut
out nearly all of the unneeded overhead. So what we're potentially left
with, without doing strange things like multithreading or simplifying
the lexer, is:

 - Get rid of the unneeded conversions (~15–20% overhead, it seems).
 - Launch in the background, potentially.

Does that make sense? In any case, what we have here is already a huge
improvement.

/* Steinar */
-- 
Homepage: https://www.sesse.net/
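
To illustrate the "get rid of the unneeded conversions" point, here is a minimal sketch of a guard that reports whether a conversion stage would do any real work. The helper name conversion_needed is hypothetical and not taken from man-db; where such a check would belong (for example, before the add_manconv call) is an assumption, and skipping the stage would also skip any well-formedness checking that a UTF-8 to UTF-8 pass performs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical helper, not part of man-db: returns true only when a
 * conversion between the two named encodings could change the data. */
static bool conversion_needed(const char *from, const char *to)
{
    /* An unknown encoding cannot be shown to match, so stay conservative. */
    if (from == NULL || to == NULL)
        return true;

    /* Compare names case-insensitively ("utf-8" vs "UTF-8").
     * Aliases such as "UTF8" are not recognised by this simple check. */
    return strcasecmp(from, to) != 0;
}
```

A caller would only set up the conversion when conversion_needed(source_encoding, "UTF-8") returns true, so the ISO-8859-1 case from the quoted text is still converted.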