Thank you! At last I can start working :-)

Per

PS Maybe this should be added to the Wiki?

On Wed, Feb 13, 2013, at 23:37, Jimmy O'Regan wrote:
> On 13 February 2013 21:00, Per Tunedal <[email protected]> wrote:
> > Well,
> > I ran your script afterwards, and the Swedish characters were corrected
> > - but the Danish ones were damaged:
> >
> > Before:
> > <e><p><l>BlomkÃ¥l<s n="n"/></l><r>blomkål<s n="n"/></r></p></e>
> > <e><p><l>BlÃ¥mussla<s n="n"/></l><r>blåmusling<s n="n"/></r></p></e>
> > <e><p><l>Samlag<s n="n"/></l><r>bolde<s n="n"/></r></p></e>
> > <e><p><l>Bomb<s n="n"/></l><r>bombe<s n="n"/></r></p></e>
> > <e><p><l>Brandy_<s n="n"/></l><r>brandy<s n="n"/></r></p></e>
> > <e><p><l>Hallonsläktet<s n="n"/></l><r>brombær<s n="n"/></r></p></e>
> > <e><p><l>BröllopstÃ¥rta<s n="n"/></l><r>bryllupskage<s n="n"/></r></p></e>
> > <e><p><l>Kvinnobröst<s n="n"/></l><r>bryst<s n="n"/></r></p></e>
> > <e><p><l>Bröd<s n="n"/></l><r>brød<s n="n"/></r></p></e>
> > <e><p><l>Bulgur<s n="n"/></l><r>bulgur<s n="n"/></r></p></e>
> > <e><p><l>Bunsenbrännare<s n="n"/></l><r>bunsenbrænder<s n="n"/></r></p></e>
> > <e><p><l>Böna<s n="n"/></l><r>bønne<s n="n"/></r></p></e>
> > <e><p><l>Böna<s n="n"/></l><r>bønner<s n="n"/></r></p></e>
> >
> > After:
> > <e><p><l>Blomkål<s n="n"/></l><r>blomk?l<s n="n"/></r></p></e>
> > <e><p><l>Blåmussla<s n="n"/></l><r>bl?musling<s n="n"/></r></p></e>
> > <e><p><l>Samlag<s n="n"/></l><r>bolde<s n="n"/></r></p></e>
> > <e><p><l>Bomb<s n="n"/></l><r>bombe<s n="n"/></r></p></e>
> > <e><p><l>Brandy_<s n="n"/></l><r>brandy<s n="n"/></r></p></e>
> > <e><p><l>Hallonsläktet<s n="n"/></l><r>bromb?r<s n="n"/></r></p></e>
> > <e><p><l>Bröllopstårta<s n="n"/></l><r>bryllupskage<s n="n"/></r></p></e>
> > <e><p><l>Kvinnobröst<s n="n"/></l><r>bryst<s n="n"/></r></p></e>
> > <e><p><l>Bröd<s n="n"/></l><r>br?d<s n="n"/></r></p></e>
> > <e><p><l>Bulgur<s n="n"/></l><r>bulgur<s n="n"/></r></p></e>
> > <e><p><l>Bunsenbrännare<s n="n"/></l><r>bunsenbr?nder<s n="n"/></r></p></e>
> > <e><p><l>Böna<s n="n"/></l><r>b?nne<s n="n"/></r></p></e>
> > <e><p><l>Böna<s n="n"/></l><r>b?nner<s n="n"/></r></p></e>
> >
> > That's strange, because your script corrected the file translated in the
> > other direction OK.
>
> Yes, because it was expecting the corrupted characters to be on the
> right, so to go the other way it would need to be:
> perl -MEncode -ane 'chomp;if(m!(<e><p><l>)([^<]*)(<s n="n"/></l><r>)([^<]*)(<s n="n"/></r></p></e>)!){$rec=encode("iso-8859-1",decode("utf-8", $2));if($2 eq lc($2)){$rec=lc($rec);}; print "$1$rec$3$4$5\n";}'
>
> --
> <Sefam> Are any of the mentors around?
> <jimregan> yes, they're the ones trolling you
>
> ------------------------------------------------------------------------------
> Free Next-Gen Firewall Hardware Offer
> Buy your Sophos next-gen firewall before the end March 2013
> and get the hardware for free! Learn more.
> http://p.sf.net/sfu/sophos-d2d-feb
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
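
For reference: the quoted one-liner undoes one round of double encoding on one chosen side of each entry. That would explain why text on that side that was already correct comes out as "?". What follows is a minimal Python sketch of a direction-independent variant, not the script from the thread. It tries the repair on both sides and keeps the original text whenever the round trip fails. The names `ENTRY`, `fix_mojibake` and `repair_entry` are invented for illustration. Like the one-liner, it assumes the damage is UTF-8 misread as ISO-8859-1. It leaves out the one-liner's lowercasing step, and it passes non-matching lines through instead of dropping them.

```python
import re

# Matches one bidix entry of the shape shown in the thread:
# <e><p><l>LEFT<s n="n"/></l><r>RIGHT<s n="n"/></r></p></e>
ENTRY = re.compile(
    r'(<e><p><l>)([^<]*)(<s n="n"/></l><r>)([^<]*)(<s n="n"/></r></p></e>)'
)


def fix_mojibake(text: str) -> str:
    """Undo one round of UTF-8-read-as-Latin-1 damage, else return text as is."""
    try:
        # Map the characters back to the bytes they came from, then decode
        # those bytes as the UTF-8 they originally were.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Already-correct text such as "blomkål" fails the round trip,
        # so it is left alone instead of being damaged.
        return text


def repair_entry(line: str) -> str:
    """Repair both sides of an entry; lines that do not match pass through."""
    def _sub(m: re.Match) -> str:
        left, right = fix_mojibake(m.group(2)), fix_mojibake(m.group(4))
        return m.group(1) + left + m.group(3) + right + m.group(5)

    return ENTRY.sub(_sub, line)


if __name__ == "__main__":
    sample = '<e><p><l>BlomkÃ¥l<s n="n"/></l><r>blomkål<s n="n"/></r></p></e>'
    print(repair_entry(sample))
```

The guard is a heuristic. A correct string that happens to contain a sequence like "Ã¥" would be rewritten too, and damage that went through Windows-1252 rather than ISO-8859-1 is not covered.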
