In response to this issue, empty blanks are considered blanks again, so a <b/> or a <b pos="1,2,etc."/> in the output will emit an empty blank if the input had an empty blank in that position.
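To make the behaviour concrete, here is a minimal Python sketch of a blank queue. It is only a toy model under stated assumptions, not Apertium's actual implementation: the function name `render`, the list-of-tokens representation and the `fallback` parameter are all hypothetical, and what really happens when a rule asks for more blanks than the input supplied is not covered in this message.

```python
from collections import deque


def render(output_items, input_blanks, fallback=" "):
    """Toy model of the blank queue; not Apertium's real code."""
    queue = deque(input_blanks)  # blanks in the order they were read
    pieces = []
    for item in output_items:
        if item == "<b/>" or item.startswith("<b pos="):
            # <b/> and <b pos="N"/> are treated alike: pos is ignored and
            # the next blank from the input is emitted, even an empty one.
            # Using `fallback` once the queue is exhausted is an assumption.
            pieces.append(queue.popleft() if queue else fallback)
        else:
            pieces.append(item)
    return "".join(pieces)


# An ordinary space in the input is passed through.
print(repr(render(["two", "<b/>", "cows"], [" "])))       # 'two cows'
# An empty blank in the input stays empty in the output.
print(repr(render(["A", '<b pos="1"/>', "B"], [""])))      # 'AB'
# pos does not reorder anything: blanks come out in input order.
print(repr(render(["A", '<b pos="2"/>', "B", '<b pos="1"/>', "C"], [" ", "\n"])))  # 'A B\nC'
```

In this model a rule author never chooses which input blank goes where; the only decision left is whether a blank marker appears between two output items at all.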
Once again, a <b/> and a <b pos="1"/> now do the same thing. Earlier, <b pos="1,2.."/> was used to print a blank from the input and <b/> was used to print a literal space in the output. Now, a <b/> will print a blank from the input, in the order in which the blanks were read. Transfer rules written in the future don't need a pos attribute on <b/>, and existing ones that have it will behave the same as a plain <b/>. This means that there is no longer any way to reorder blanks from transfer rules, but that is by design.

Hèctor, let me know if this solved your issue :)

Thanks and Regards,

*तन्मय खन्ना*
*Tanmai Khanna*

On Fri, Sep 4, 2020 at 12:15 PM Hèctor Alòs i Font <[email protected]> wrote:

> Message from Tanmai Khanna <[email protected]> on Fri, 4 Sep 2020
> at 9:22:
>
>> Hèctor,
>> Yes, the new improvements aren't backwards compatible but that's because
>> they're better than the system we had earlier. Here are the changes:
>>
>>> So, you are saying that the new stuff is not backwards compatible, aren't
>>> you? There aren't any <b/> in the rule, but <b pos="1,2..."/>, which is not
>>> the same. Until now, <b/> means explicitly putting a blank, while <b
>>> pos="1,2..."/> means copying to the output whatever is in the input at a
>>> given point.
>>>
>>
>> <b pos="1,2..."/> and <b/> now do exactly the same thing. You don't need
>> to replace all of the former with the latter, and whether you do or not, it
>> won't change anything. Until now it meant what you said, but now it means
>> that if you see a <b/> or a <b pos="1,2,etc."/> then print one blank from
>> the blank queue in the output.
>>
>>> Superblanks most of the time are blanks, but, as you now probably know
>>> better than anyone else, they can be lots of things; they can even contain
>>> no blanks at all.
>>> Even in some cases, like in Romance-language enclitics,
>>> we know there shouldn't be any blank at all before them, but we had to
>>> add <b pos="1,2..."/> so as not to lose information on italics, bold
>>> letters, etc.
>>>
>>
>> You're right, except now we have a completely different system to deal
>> with italics, bold letters, and all markup, i.e. wordbound blanks, which
>> aren't considered blanks. Now that there is no information to lose, we
>> didn't want to burden the people who write transfer rules with explicitly
>> defining positions of blanks. In cases where you don't want a space in the
>> output, you just don't put a <b pos="1,2"/> in the output rule.
>>
>>
>>> I'm not really ready to change all <b pos="1,2..."/> in the hundreds of
>>> rules I've been writing in several language pairs. Specifically for
>>> apertium-fra-frp, I hope to be able to publish it before the new
>>> version of the Apertium core you are preparing, so they are needed right
>>> now.
>>>
>>
>> You won't have to change all of them. Most of them will work as they are.
>> The new system prints blanks in the same order as they were input, so it
>> won't harm most of the rules. The *only thing* you'll have to change is
>> rules where you don't want a space in the output between LUs: you remove
>> the <b pos="1"/> from those rules. This is because now, an empty blank
>> isn't considered a blank anymore. That was because we want the users to
>> have control over whether they want a blank or not between their output
>> LUs, regardless of the input blanks. If we consider an empty blank, your
>> problem will be solved, but other problems will come up, where empty blanks
>> will appear in the output regardless of <b/>s in the output.
>>
>> So to conclude, the only thing you need to remove is the <b pos="1"/>
>> from rules where you know you don't want a space in the output, like num_n,
>> and maybe some enclitics. Apart from that, everything will work as it is.
>> To improve the system, at some point we'll have to add a change that isn't
>> strictly backwards compatible, and several people agree that after
>> wordbound blanks, we should stop handling blank positions in transfer rules.
>>
>
> The problem is that in 99% of the cases I want a blank in num_n, that is,
> between the numeral and the noun. In most of the cases we have "two cows",
> "3 dogs", etc. In Romance languages, the rule is needed mostly for gender
> agreement. The problem is that sometimes, as we see, we get something else.
> So the question is not whether I want a blank there or not. I want whatever
> was there. So, let me try to formulate it in another way. If I want to
> preserve what was written between two words, I shouldn't write <b
> pos="1,2..."/>, but if I want to add a blank, I have to add <b/>. Am I
> right? If this is correct, it amounts to removing all <b pos="1,2..."/>. It
> seems it would be easier if they were simply not taken into account, thus
> avoiding any change in the language pairs. Am I missing something?
>
> Hèctor
>
>
>>
>> If this isn't acceptable, we can discuss other possible solutions :)
>>
>> *तन्मय खन्ना*
>> *Tanmai Khanna*
>> _______________________________________________
>> Apertium-stuff mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>>
