Bug#395007: aptitude --show-deps is broken for multibyte descriptive text

Daniel Burrows Wed, 25 Oct 2006 18:40:15 -0700

On Tue, Oct 24, 2006 at 08:16:31PM +0900, Kobayashi Noritada <[EMAIL PROTECTED]> was heard to say:
> akira yamada (akira) reported on his book[1] about deb package and packaging
> that Japanese descriptive text from `aptitude --show-deps' is broken.
> Actually, following code at reason_string_list in
> src/cmdline/cmdline_prompt.cc cannot handle multibyte characters:
> 
>   s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
> 
> Here attached two patches to solve this problem: one is a version I made by
> referring to sample code[2] from Junichi Uekawa (dancer), and the other is a
> version privately presented from Kouhei Sutou.  I rebuilt with each of these
> patches and checked that both can work as a solution.  However, although I can
> understand basically what these patches are doing, I don't have a good C++
> skill, and cannot determine which is a better solution nor create a better and
> compact patch free of side effect.
> 
> Could you please choose one or create a better solution to fix this bug? :-)


  It seems to me like this does the job:

diff -rN -u old-head/src/cmdline/cmdline_prompt.cc new-head/src/cmdline/cmdline_prompt.cc
--- old-head/src/cmdline/cmdline_prompt.cc      2006-10-25 16:54:54.000000000 -0700
+++ new-head/src/cmdline/cmdline_prompt.cc      2006-10-25 16:54:54.000000000 -0700
@@ -19,6 +19,7 @@
 
 #include <vscreen/fragment.h>
 #include <vscreen/vscreen.h>
+#include <vscreen/transcode.h>
 
 #include <apt-pkg/algorithms.h>
 #include <apt-pkg/dpkgpm.h>
@@ -83,7 +84,8 @@
          first=false;
        }
 
-      s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
+      wstring dep_name = transcode(const_cast<pkgCache::DepIterator &>(why->dep).DepType());
+      s += transcode(dep_name.substr(0, 1));
       s+=": ";
       s+=why->pkg.Name();
     }

  The main drawback is that it doesn't do error-checking, which means
that a broken translation file will result in all the dependency types
turning into "?".  This leads me to a broader question, though: is the
first character really a suitable abbreviation for the dependency type
in all languages?  I wonder, for instance, whether just the first
Chinese character will be understood by Chinese speakers as a shortening
of the dependency type string.  Probably I should eventually add a
special set of "dependency abbreviation" translations, but right now
Christian will kill me if I do that. ;-)
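
  To make the "dependency abbreviation" idea concrete, here is a rough
sketch of what such a table might look like.  Everything in it is
hypothetical: the DepKind enumeration stands in for apt's real
dependency types, dep_abbreviation() does not exist in aptitude, and the
abbreviations themselves are invented.  In a real program each literal
would be marked for translation so that translators could pick a
sensible short form per language.

```cpp
#include <string>

// Hypothetical stand-in for apt's dependency-type enumeration.
enum class DepKind
{
  Depends, PreDepends, Suggests, Recommends, Conflicts, Replaces, Obsoletes
};

// Hypothetical helper: returns a short label for a dependency type.
// The labels are invented for illustration; in a translated program
// each one would be a separately translatable string instead of being
// derived from the first character of the full name.
std::string dep_abbreviation(DepKind kind)
{
  switch (kind)
    {
    case DepKind::Depends:    return "D";
    case DepKind::PreDepends: return "PD";
    case DepKind::Suggests:   return "S";
    case DepKind::Recommends: return "R";
    case DepKind::Conflicts:  return "C";
    case DepKind::Replaces:   return "Rp";
    case DepKind::Obsoletes:  return "O";
    }
  // Unknown value: show a placeholder rather than guessing.
  return "?";
}
```

  Note that even in English, "Recommends" and "Replaces" share a first
letter, so a dedicated table avoids ambiguity that the first-character
rule cannot.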


> --- src/cmdline/cmdline_prompt.cc.orig        2006-10-24 03:16:42.000000000 +0900
> +++ src/cmdline/cmdline_prompt.cc     2006-10-24 18:01:46.000000000 +0900
> @@ -83,7 +83,13 @@
>         first=false;
>       }
>  
> -      s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
> +      mbstate_t mbstate;
> +      size_t len;
> +      char *dep_type=strdup(const_cast<pkgCache::DepIterator &>(why->dep).DepType());
> +      memset(&mbstate, 0, sizeof(mbstate));
> +      len=mbrlen(dep_type, strlen(dep_type), &mbstate);
> +      dep_type[len]=0;
> +      s+=dep_type;
>        s+=": ";
>        s+=why->pkg.Name();
>      }

  I assume this is pretty efficient, but it's also not consistent with
the rest of the aptitude codebase (and I doubt that efficiency matters
here).
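
  For reference, the same mbrlen()-based idea can be written without
strdup() (the copy in the patch above is never freed) and with a check
on mbrlen()'s error returns.  This is only a sketch using the standard
C library: first_multibyte_char() is a hypothetical name, not an
aptitude function, and returning "?" on failure is my assumption about
what a reasonable fallback would be.  The result depends on the
current locale's encoding, so the caller is assumed to have called
setlocale() already.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <string>

// Hypothetical helper: returns the first multibyte character of s,
// interpreted in the current locale's encoding.
std::string first_multibyte_char(const std::string &s)
{
  if (s.empty())
    return std::string();

  // Start from the initial conversion state.
  std::mbstate_t state;
  std::memset(&state, 0, sizeof(state));

  // mbrlen() returns the byte length of the next character,
  // (size_t)-1 for an invalid sequence, (size_t)-2 for an incomplete
  // one, and 0 if the next character is the terminating null.
  const std::size_t len = std::mbrlen(s.c_str(), s.size(), &state);
  if (len == static_cast<std::size_t>(-1) ||
      len == static_cast<std::size_t>(-2) ||
      len == 0)
    return "?";  // Assumed fallback; the bytes cannot be shown as-is.

  return s.substr(0, len);
}
```

  Since std::string owns its storage, nothing needs to be freed and the
original string is never modified.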

> --- src/cmdline/cmdline_prompt.cc.orig        2006-10-24 03:16:42.000000000 +0900
> +++ src/cmdline/cmdline_prompt.cc     2006-10-24 18:36:45.000000000 +0900
> @@ -19,6 +19,7 @@
>  
>  #include <vscreen/fragment.h>
>  #include <vscreen/vscreen.h>
> +#include <vscreen/transcode.h>
>  
>  #include <apt-pkg/algorithms.h>
>  #include <apt-pkg/dpkgpm.h>
> @@ -83,7 +84,27 @@
>         first=false;
>       }
>  
> -      s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
> +
> +      bool converting_success = false;
> +      std::string dep_type;
> +      std::wstring w_dep_type;
> +
> +      dep_type = const_cast<pkgCache::DepIterator &>(why->dep).DepType();
> +      if (transcode(dep_type, w_dep_type))
> +        {
> +          std::string dep_type_first_char;
> +          std::wstring w_dep_type_first_char;
> +          w_dep_type_first_char = w_dep_type.substr(0, 1);
> +          if (transcode(w_dep_type_first_char, dep_type_first_char))
> +            {
> +              s+=dep_type_first_char;
> +              converting_success = true;
> +            }
> +        }
> +
> +      if (!converting_success)
> +        s+=dep_type[0];
> +
>        s+=": ";
>        s+=why->pkg.Name();
>      }

  This uses aptitude's conventions for transcoding strings, but the
verbosity is a bit awkward.  Moreover, if the transcoding fails, falling
back to displaying the first byte of the original string won't help: a
failed transcoding probably means that the string is in a charset that
can't be displayed!

  I'd lean in favor of the simple approach as a short-term solution, and
using a proper separate translation in the long term.

  Daniel

