On Tue, Oct 24, 2006 at 08:16:31PM +0900, Kobayashi Noritada <[EMAIL PROTECTED]> was heard to say:
> akira yamada (akira) reported on his book[1] about deb package and packaging
> that Japanese descriptive text from `aptitude --show-deps' is broken.
> Actually, following code at reason_string_list in
> src/cmdline/cmdline_prompt.cc
> cannot handle multibyte characters:
>
>   s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
>
> Here attached two patches to solve this problem: one is a version I made by
> referring to sample code[2] from Junichi Uekawa (dancer), and the other is a
> version privately presented from Kouhei Sutou.  I rebuilt with each of these
> patches and checked that both can work as a solution.  However, although I can
> understand basically what these patches are doing, I don't have a good C++
> skill, and cannot determine which is a better solution nor create a better and
> compact patch free of side effect.
>
> Could you please choose one or create a better solution to fix this bug? :-)
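
To make the failure concrete: DepType() returns a translated string in a
multibyte encoding, so indexing it with [0] yields only the first byte of
the first character.  Here is a minimal standalone sketch of the idea,
assuming the string is UTF-8 (aptitude itself goes through the locale's
charset, so this is an illustration only); first_utf8_char is a
hypothetical helper, not something that exists in aptitude or apt:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of aptitude or apt): return the first
// character of a UTF-8 string as a std::string.  Assumes UTF-8 input;
// malformed or truncated input yields "?".
std::string first_utf8_char(const std::string &s)
{
  if (s.empty())
    return std::string();

  const unsigned char lead = static_cast<unsigned char>(s[0]);
  std::size_t len;
  if (lead < 0x80)                len = 1;  // plain ASCII
  else if ((lead & 0xE0) == 0xC0) len = 2;  // 110xxxxx
  else if ((lead & 0xF0) == 0xE0) len = 3;  // 1110xxxx
  else if ((lead & 0xF8) == 0xF0) len = 4;  // 11110xxx
  else                            return "?";  // stray continuation or invalid lead byte

  if (s.size() < len)
    return "?";  // sequence is cut off

  // Every byte after the lead must be a continuation byte (10xxxxxx).
  for (std::size_t i = 1; i < len; ++i)
    if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80)
      return "?";

  return s.substr(0, len);
}
```

For a string such as the bytes E4 BE 9D E5 AD 98 (two three-byte
characters), s[0] is the lone byte E4, which is not displayable on its
own, whereas the helper returns the complete first character E4 BE 9D.
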
It seems to me like this does the job:

diff -rN -u old-head/src/cmdline/cmdline_prompt.cc new-head/src/cmdline/cmdline_prompt.cc
--- old-head/src/cmdline/cmdline_prompt.cc 2006-10-25 16:54:54.000000000 -0700
+++ new-head/src/cmdline/cmdline_prompt.cc 2006-10-25 16:54:54.000000000 -0700
@@ -19,6 +19,7 @@
 
 #include <vscreen/fragment.h>
 #include <vscreen/vscreen.h>
+#include <vscreen/transcode.h>
 
 #include <apt-pkg/algorithms.h>
 #include <apt-pkg/dpkgpm.h>
@@ -83,7 +84,8 @@
           first=false;
         }
 
-      s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
+      wstring dep_name = transcode(const_cast<pkgCache::DepIterator &>(why->dep).DepType());
+      s += transcode(dep_name.substr(0, 1));
       s+=": ";
       s+=why->pkg.Name();
     }

The main drawback is that it doesn't do error-checking, which means that
a broken translation file will result in all the dependency types turning
into "?".

This leads me to a broader question, though: is the first character
really a suitable abbreviation for the dependency type in all languages?
I wonder, for instance, whether just the first Chinese character will be
understood by Chinese speakers as a shortening of the dependency type
string.  Probably I should eventually add a special set of "dependency
abbreviation" translations, but right now Christian will kill me if I do
that.
;-)

> --- src/cmdline/cmdline_prompt.cc.orig 2006-10-24 03:16:42.000000000 +0900
> +++ src/cmdline/cmdline_prompt.cc 2006-10-24 18:01:46.000000000 +0900
> @@ -83,7 +83,13 @@
>            first=false;
>          }
>  
> -      s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
> +      mbstate_t mbstate;
> +      size_t len;
> +      char *dep_type=strdup(const_cast<pkgCache::DepIterator &>(why->dep).DepType());
> +      memset(&mbstate, 0, sizeof(mbstate));
> +      len=mbrlen(dep_type, strlen(dep_type), &mbstate);
> +      dep_type[len]=0;
> +      s+=dep_type;
>        s+=": ";
>        s+=why->pkg.Name();
>      }

I assume this is pretty efficient, but it's also not consistent with the
rest of the aptitude codebase (and I doubt that efficiency matters here).

> --- src/cmdline/cmdline_prompt.cc.orig 2006-10-24 03:16:42.000000000 +0900
> +++ src/cmdline/cmdline_prompt.cc 2006-10-24 18:36:45.000000000 +0900
> @@ -19,6 +19,7 @@
>  
>  #include <vscreen/fragment.h>
>  #include <vscreen/vscreen.h>
> +#include <vscreen/transcode.h>
>  
>  #include <apt-pkg/algorithms.h>
>  #include <apt-pkg/dpkgpm.h>
> @@ -83,7 +84,27 @@
>            first=false;
>          }
>  
> -      s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
> +
> +      bool converting_success = false;
> +      std::string dep_type;
> +      std::wstring w_dep_type;
> +
> +      dep_type = const_cast<pkgCache::DepIterator &>(why->dep).DepType();
> +      if (transcode(dep_type, w_dep_type))
> +        {
> +          std::string dep_type_first_char;
> +          std::wstring w_dep_type_first_char;
> +          w_dep_type_first_char = w_dep_type.substr(0, 1);
> +          if (transcode(w_dep_type_first_char, dep_type_first_char))
> +            {
> +              s+=dep_type_first_char;
> +              converting_success = true;
> +            }
> +        }
> +
> +      if (!converting_success)
> +        s+=dep_type[0];
> +
>        s+=": ";
>        s+=why->pkg.Name();
>      }

This uses aptitude's conventions for transcoding strings, but the
verbosity is a bit awkward.
Moreover, if the transcoding fails, falling back to displaying the first
byte of the original string won't help: a failed transcoding probably
means that the string is in a charset that can't be displayed!

I'd lean in favor of the simple approach as a short-term solution, and a
separate, properly translated abbreviation in the long term.

  Daniel