On Tue, Mar 29, 2005 at 02:25:41AM +0200, Reuben Thomas wrote:
> >I meant that .emacs might have been erroneously saved as utf8 (that
> >sometimes happens, e.g., that recently happened to me with a .procmailrc
> >that tried to exclude some >128 chars combinations, and was accidentally
> >saved a utf8, so the strings were not correct and everything was messed up)
> 
> OK, that definitely DID happen with my test .emacs. I fixed it with 
> recode, and then (to avoid switching my .emacs around all the time) put 
> it in a file called emacs.ispell and ran emacs with
> 

That is why it is good to use the octal codes.

Anyway, I am back to my sid box with aspell-0.60 and could finally reproduce
your problem.

* When does the problem appear?

  When the aspell dict is built for a 'canonical' locale, but is run from
  emacs implicitly using a different one.

  I was testing everything in my es_ES (iso-8859-1) locale, so everything
  matched. That is why I could not reproduce your problem.

* What is the reason for that problem?

  With LANG=en_GB.UTF-8, aspell, when called from emacs, implicitly expects
  utf8 and returns utf8. emacs, however, thinks the dict is iso-8859-1, so it
  sends and expects iso-8859-1. For that reason, spell-checking rôle sends an
  iso-8859-1 string, but aspell returns a utf-8 string, which emacs reads as
  a two-word string, and it complains:

    Checking spelling of RÔLE...
    ispell-word: Ispell and its process have different character maps.

  or the equivalent problem you found for ispell-buffer. This happens with
  aspell-0.60, but not with previous versions that had no utf8 support.
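
  The byte-level mismatch described above can be reproduced outside emacs. The
  following is only a rough Python sketch, not what ispell.el actually does:
  the variable names are made up for illustration, and the character class is
  copied from the dictionary entry in this message. It shows how a utf-8 reply
  read as iso-8859-1 falls apart into two "words".

```python
import re

word = "rôle"

# What emacs sends when it believes the dictionary is iso-8859-1:
# the accented letter is a single byte.
sent_by_emacs = word.encode("iso-8859-1")

# What a utf-8 speaking aspell would send back for the same word:
# the accented letter is two bytes.
returned_by_aspell = word.encode("utf-8")

# emacs decodes the reply as iso-8859-1, so the two utf-8 bytes
# turn into two separate characters.
seen_by_emacs = returned_by_aspell.decode("iso-8859-1")

# Same "not a word character" class as the dictionary entry in this message
# (octal \321 \324 \361 \364 are 0xD1 0xD4 0xF1 0xF4).
not_casechars = re.compile("[^A-Z\xd1\xd4a-z\xf1\xf4]+")

print(sent_by_emacs)
print(returned_by_aspell)
print(not_casechars.split(seen_by_emacs))
```

  Neither of the two characters produced by the misread utf-8 sequence is in
  the word-character class, so the reply no longer looks like a single word.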

* How to work around this (quick & dirty):

  Explicitly set the communication encoding to the dict encoding, e.g.

; ------------------------------------------------------------
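(debian-ispell-add-dictionary-entry
  '("british+accs"                  ; dictionary name
    "[A-Z\321\324a-z\361\364]"      ; casechars (octal codes for the accented letters)
    "[^A-Z\321\324a-z\361\364]"     ; not-casechars
    "[']"                           ; otherchars
    nil                             ; many-otherchars-p
    ("-B" "-d" "british-w_accents" "--encoding=iso8859-1") ; args passed to aspell
    nil                             ; extended-character-mode
    iso-8859-1)                     ; coding system emacs uses for the process
  "aspell")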

(setq ispell-dictionary-alist debian-ispell-dictionary-alist)
; -----------------------------------------------------------

Please check whether this works for you, as it seems to work here. If so, I
will mention it in the README.emacs file.

I currently see no easy way of handling this more generally, unless aspell
has an option meaning 'do not implicitly re-encode according to LANG' that
could be appended to ispell-program-name where appropriate.
But I do not see one.

Another possibility might be to work out a table of charset name
equivalences and use it with aspell-0.60 to make sure the 'native'
encoding is used. I have to think about this.

I am cc'ing the aspell maintainer to make sure he is aware of this problem.

Cheers,

-- 
Agustin
