On Tue, Jun 20, 2006 at 08:18:52PM +0200, Rafael Laboissiere wrote:
> * Jakson A. Aquino <[EMAIL PROTECTED]> [2006-06-20 14:34]:
> 
> > If my guess is correct the problem happens when gawk is called in a
> > locale and the files "conjugue" and "verbos" were encoded in a
> > different locale.
> 
> I found the source of the problem: I was doing the tests in a system with
> gawk 3.1.4.  When I did it in my up-to-date unstable chroot with gawk
> 3.1.5, I could replicate what you reported.
> 
> I am now about to do the following for the package brazilian-conjugate:
> 
>   * Install the original conjugue script into /usr/bin/conjugue-ISO-8859-1.
>   * Create /usr/bin/conjugue-UTF-8 with the recode command as you
>     suggested.
>   * Create the appropriate /usr/lib/brazilian-conjugate/verbos-<char-enc>
>     files and change the contents of /usr/bin/conjugue-<char-enc>
>     accordingly.
>   * Create a simple wrapper script /usr/bin/conjugue that would call the
>     appropriate /usr/bin/conjugue-<char-enc> according to the current
>     locale, something like the following:
>   
>     #!/usr/bin/perl -w
>     my $encoding = "ISO-8859-1";   # default value
>     $ENV{LANG} =~ /[a-z]{2}?(?:(?:_[A-Z]{2}?)?(?:\.(.*))?)?/;
>     $encoding = $1 if defined $1;
>     system ("/usr/bin/conjugue-$encoding", @ARGV)
>     
> Notice that the script above relies on the environment variable LANG.  Do
> you think that this would be okay?

I tested it and it worked; I only had to change line 1810 of conjugue-*
to fix the name of the verbos-* files. But it doesn't work if I simply
export my locale without a charset, as in:

  $ export LC_ALL=pt_BR
  $ export LANG=pt_BR

I configured my system to have en_US.UTF-8 as the default locale and
added pt_BR.UTF-8 and pt_BR.ISO-8859-1 as additional locales. I think
that when I don't specify the charset encoding, it defaults to UTF-8
and not to ISO-8859-1, as the wrapper script assumes. One possible
solution would be to assume no default charset and make the script
exit if no encoding is found in the locale string. In that case, the
output would be a help message (in Portuguese and English) explaining
how to specify the locale unambiguously. This is just a suggestion. It
would certainly be better if the script could always discover the
correct encoding.
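
A minimal sketch of the stricter wrapper suggested above, written in
POSIX shell rather than Perl. It is only an illustration, not the
packaged script: the function names (encoding_from_locale,
effective_locale, conjugue_wrapper) and the wording of the help message
are hypothetical, and it assumes the /usr/bin/conjugue-<char-enc> naming
from the plan quoted above. It looks at LC_ALL, then LC_CTYPE, then LANG
(the usual POSIX precedence) and refuses to guess when the locale name
carries no charset.

```shell
#!/bin/sh
# Sketch of a strict /usr/bin/conjugue wrapper (function names are hypothetical).

# Print the charset part of a locale name such as "pt_BR.UTF-8" or
# "pt_BR.ISO-8859-1@euro". Return 1 if the name carries no charset.
encoding_from_locale() {
    case "$1" in
        *.*) ;;            # has a ".charset" part, keep going
        *) return 1 ;;     # e.g. "pt_BR": ambiguous, do not guess
    esac
    enc=${1#*.}            # drop everything up to the first dot
    enc=${enc%%@*}         # drop an optional "@modifier"
    [ -n "$enc" ] || return 1
    printf '%s\n' "$enc"
}

# Print the locale name that governs character handling:
# LC_ALL overrides LC_CTYPE, which overrides LANG.
effective_locale() {
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-}}}"
}

# Run the encoding-specific script, or explain how to fix the locale.
conjugue_wrapper() {
    if ! enc=$(encoding_from_locale "$(effective_locale)"); then
        # Messages are kept ASCII-only so they display in any encoding.
        {
            echo "conjugue: codificacao de caracteres ausente no locale."
            echo "  Exemplo: export LC_ALL=pt_BR.UTF-8"
            echo "conjugue: no character encoding found in the locale."
            echo "  Example: export LC_ALL=pt_BR.ISO-8859-1"
        } >&2
        return 1
    fi
    target="/usr/bin/conjugue-$enc"   # assumed naming, per the plan above
    if [ ! -x "$target" ]; then
        echo "conjugue: no script for encoding '$enc' ($target)" >&2
        return 1
    fi
    exec "$target" "$@"
}
```

To act as the wrapper, the file would end with a call to
conjugue_wrapper "$@". If discovering the encoding is preferred over
refusing, the standard "locale charmap" command prints the charset of
the current locale and could replace the string parsing, provided its
spelling of the charset matches the conjugue-<char-enc> file names.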

Best regards,

Jakson

-- 
Jakson
