Hi,

>>>>> "Thanh" == Thanh Han The <[EMAIL PROTECTED]> writes:

> this is a feature/limitation/bug/<call-what-you-like> of pdftex:
> it doesn't remove unused Subs entries (this is very tricky and
> dangerous), but replaces them by a dummy one.

Let me give a few additional explanations.

Glyph descriptions are stored in the so-called CharString array.
CharStrings can contain subroutine calls. Subroutines are stored in
the Subrs array.

Though a Type1 font is a PostScript program, it is also possible to
write a parser which supports only the small subset of PostScript
instructions allowed in Type1 fonts. This is what most font
renderers do.

With a full PostScript interpreter there is no problem with omitting
unused subroutines, because PostScript supports sparse arrays.
However, some engines obviously do not even understand this minimal
subset of PostScript code. They just create an array and put all the
subroutines into it, ignoring the array index given in the font.
That means that if Subrs[x] is removed because it is not used by any
CharString of the subset, Subrs[x] will contain the value which is
supposed to be in Subrs[x+1]. In other words: some engines do not
support sparse arrays.

What pdftex and dvips currently do is replace unused subroutines by
subroutines which contain only a return statement. This is quite
safe.

I don't think that it is very dangerous to remove unused
subroutines, but it is a bit inconvenient and makes font inclusion a
bit slower. It is probably necessary to parse the CharStrings first,
create an array which maps original Subrs indices to new ones, and
then write out the new Subrs array and a CharString array which uses
the new indices.

There is another important point: if the current subset requires
Subrs[0] ... Subrs[x], subroutines with indices > x are removed.

I do not know how much space the dummy subroutines require in a PDF
file. Maybe I should create a font which has two identical glyphs,
where one glyph uses Subrs[4] and the other calls the identical
subroutine Subrs[1004].
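
The two strategies above (stubbing out unused Subrs entries versus
renumbering them) can be illustrated with a small Python sketch. This
is a toy model and not what writet1.c does: charstrings are plain
token lists instead of encrypted binary data, a call is written as
the index followed by "callsubr", and the function names
(used_subrs, stub_unused, renumber) are made up for this example. It
also ignores that real Type1 fonts reserve the first few Subrs
entries for special purposes.

```python
# Toy model: a charstring or subroutine is a list of opaque tokens.
# A call to Subrs[n] is the token pair: n, "callsubr".

def used_subrs(charstrings, subrs):
    """Collect the indices of all subroutines reachable from the glyphs."""
    used = set()
    pending = list(charstrings.values())
    while pending:
        tokens = pending.pop()
        for prev, tok in zip(tokens, tokens[1:]):
            if tok == "callsubr" and isinstance(prev, int) and prev not in used:
                used.add(prev)
                pending.append(subrs[prev])  # subroutines may call subroutines
    return used

def stub_unused(subrs, used):
    """Keep all indices; unused entries become a bare 'return'.
    Everything above the highest used index is dropped."""
    if not used:
        return []
    last = max(used)
    return [body if i in used else ["return"]
            for i, body in enumerate(subrs[:last + 1])]

def renumber(charstrings, subrs, used):
    """Drop unused entries and rewrite every call to the new dense indices."""
    order = sorted(used)
    mapping = {old: new for new, old in enumerate(order)}

    def rewrite(tokens):
        out = list(tokens)
        for i in range(1, len(out)):
            if out[i] == "callsubr" and isinstance(out[i - 1], int):
                out[i - 1] = mapping[out[i - 1]]
        return out

    new_subrs = [rewrite(subrs[old]) for old in order]
    new_charstrings = {name: rewrite(t) for name, t in charstrings.items()}
    return new_charstrings, new_subrs

# Made-up example data: one glyph calls Subrs[4], which calls Subrs[1].
subrs = [
    ["return"],
    ["hstem", "return"],
    ["vstem", "return"],
    ["rlineto", "return"],
    [1, "callsubr", "rrcurveto", "return"],
    ["closepath", "return"],
]
charstrings = {"e": [4, "callsubr", "endchar"]}

used = used_subrs(charstrings, subrs)
stubbed = stub_unused(subrs, used)                 # 5 entries, 3 of them dummies
new_cs, new_subrs = renumber(charstrings, subrs, used)  # 2 entries, no dummies
```

In this model the stubbed array still carries one entry for every
index up to the highest one used, while the renumbered array holds
only the subroutines that are actually called. The price is the extra
pass over all CharStrings and subroutines to rewrite the call indices.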

Anyway, as long as it is unclear how much is gained by changing
writet1.c, I think that the best way to experiment with such things
is to put the Subrs and CharString arrays into Lua tables. Maybe
this has already been done in LuaTeX.

> lmr10 has a very large number of subroutines (~550), while cmr10
> has only 3.

I don't think that the large number of subroutines is the problem.
As I said above, pdftex and dvips remove all subroutines which have
a larger index than the highest-numbered subroutine which is
actually needed.

The lm fonts are horribly inefficient in this respect. It would make
sense for characters which are used frequently (the ASCII alphabet,
for instance) to use subroutines which are close to Subrs[0]. But in
lm, the most common character in western languages, /e, calls
Subrs[534], /a calls Subrs[561], and so on.

> Another factor is that t1 fonts are encrypted and hence they
> cannot be compressed effectively, so all these dummy entries
> didn't get much smaller after compression and therefore the result
> is a much larger size.

That's true, but the only reasonable solution is to convert Type1
fonts to CFF. As you already said, that's not a weekend project.

I'm quite optimistic, though. If Taco doesn't want to depend on
external libraries when he provides OTF support, I assume that he
has to delve into it anyway.

On the other hand, in the future pdftex will support OTF. In OTF,
Type1 fonts will be in CFF format. The easiest way to decrease PDF
file size will be to convert Type1 fonts to OTF using an external
tool. AFAIK fontforge is able to convert Type1 to OTF already.

Regards,
  Reinhard

--
----------------------------------------------------------------------------
Reinhard Kotucha                                     Phone: +49-511-4592165
Marschnerstr. 25
D-30167 Hannover                                   mailto:[EMAIL PROTECTED]
----------------------------------------------------------------------------
Microsoft isn't the answer. Microsoft is the question, and the answer is NO.
----------------------------------------------------------------------------