Package: po4a
Version: 0.45-1
Severity: normal
Tags: patch

Hi,

The man module does not handle tbl's T{ ... T} text blocks; it splits
each block into separate lines. For example, the following code from the
ps(1) man page:

  %cpu    %CPU    T{
  cpu utilization of the process in "##.#" format.  It is the CPU time
  used divided by the time the process has been running (cputime/realtime
  ratio), expressed as a percentage. It will not add up to 100% unless you
  are lucky.  (alias\ \fBpcpu\fR).
  T}

generates the following msgids:

  msgid "%cpu\t%CPU\tT{\n"
  msgid "cpu utilization of the process in \"##.#\" format.  It is the CPU time\n"
  msgid "used divided by the time the process has been running (cputime/realtime\n"
  msgid "ratio), expressed as a percentage. It will not add up to 100% unless you\n"
  msgid "are lucky.  (alias\\ B<pcpu>).\n"
  msgid "T}\n"

Translating such content seems to be a nightmare.


The attached patch tries to solve the issue by concatenating the lines
inside T{ ... T} - this approach works in most cases.
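
The concatenation idea can be sketched outside of po4a roughly as follows.
This is only a minimal Python illustration, not the Perl code of the patch:
the function name `join_text_blocks` is hypothetical, and the sketch assumes
that a text block opens with `T{` at the end of a line and closes with `T}`
at the start of a line.

```python
def join_text_blocks(lines):
    """Group tbl data lines so that each T{ ... T} body becomes one unit.

    Lines outside text blocks are kept as they are; the lines between
    T{ and T} are concatenated into a single string.
    """
    units = []
    buffer = None  # None means we are outside a text block
    for line in lines:
        if buffer is None:
            units.append(line)
            if line.rstrip().endswith("T{"):
                buffer = ""       # start collecting the block body
        elif line.startswith("T}"):
            units.append(buffer)  # the whole body is one unit
            units.append(line)
            buffer = None
        else:
            buffer += line
    if buffer:                    # unterminated block: keep what was read
        units.append(buffer)
    return units
```

With this grouping, the body of the ps(1) block above would reach the
translator as one string instead of four separate fragments.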

It does not work, however, if the T{ ... T} block contains embedded macros,
as in the following excerpt from the same ps.1 man page:

  class   CLS     T{
  scheduling class of the process.  (alias\ \fBpolicy\fR,\ \fBcls\fR).
  Field's possible values are:
  .br
  \-      not reported
  .br
  TS      SCHED_OTHER
  .br
  FF      SCHED_FIFO
  .br
  RR      SCHED_RR
  .br
  ?       unknown value
  T}

Having no idea how to handle this properly, I have chosen to fall back
to the previous behavior of translating the macro lines separately - see
the TODO comment in the patch. (To be honest, in my opinion even without
proper support for embedded macros the patched version is more
translator-friendly than the original one, so I would be grateful if you
could apply the patch in a following version of po4a.)
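
The fallback for macro lines can be pictured with the following Python
sketch. It is an illustration under assumptions, not the patch itself: the
name `split_block_body` is hypothetical, a "macro line" is taken to be any
line starting with `.` or `'`, and, as a choice made only in this sketch,
an empty buffer is skipped rather than emitted.

```python
def split_block_body(body_lines):
    """Split the body of a text block into units, isolating macro lines.

    Ordinary lines are accumulated; a line starting with '.' or "'"
    flushes the accumulated text and is emitted as a unit of its own.
    """
    units = []
    buffer = ""
    for line in body_lines:
        if line.startswith((".", "'")):
            if buffer:            # flush the text collected so far
                units.append(buffer)
                buffer = ""
            units.append(line)    # the macro line stands alone
        else:
            buffer += line
    if buffer:
        units.append(buffer)
    return units
```

For the `class` block quoted above, the two leading prose lines would form
one unit, while each `.br` and each value line between them would still be
a separate unit, which is the limitation the TODO comment refers to.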


Additionally, you may notice that the patch fixes the typo in the
`split "\\t"' line so that the tbl data is now properly split into msgids
on tab characters, which I believe was the intention of the author of the
code.
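
The effect of the typo is easy to reproduce in isolation. The snippet below
uses Python's `re` module, which treats these two patterns the same way
Perl does; the sample row is made up for illustration.

```python
import re

row = "%cpu\t%CPU\tcpu utilization"  # cells separated by real tab characters

# The pattern \\t matches a backslash followed by the letter "t", so a row
# separated by real tabs is not split at all.
unsplit = re.split(r"\\t", row)

# The pattern \t matches the tab character, giving one entry per cell.
cells = re.split(r"\t", row)

print(len(unsplit), len(cells))
```

This is why the unpatched code handed the whole row to the translator as a
single string.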


Regards,
robert

-- System Information:
Debian Release: jessie/sid
  APT prefers unstable
  APT policy: (990, 'unstable'), (200, 'testing')
Architecture: i386 (i686)

Kernel: Linux 3.11-2-686-pae (SMP w/1 CPU core)
Locale: LANG=pl_PL.UTF8, LC_CTYPE=pl_PL.UTF8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages po4a depends on:
ii  gettext        0.18.3.2-1
ii  libsgmls-perl  1.03ii-33
ii  perl           5.18.2-4
ii  perl-modules   5.18.2-4
ii  sp             1.3.4-1.2.1-47.3

Versions of packages po4a recommends:
ii  liblocale-gettext-perl     1.05-8
ii  libterm-readkey-perl       2.32-1
ii  libtext-wrapi18n-perl      0.06-7
pn  libunicode-linebreak-perl  <none>

po4a suggests no packages.

-- no debconf information

-- debsums errors found:
debsums: changed file /usr/share/perl5/Locale/Po4a/Man.pm (from po4a package)
Index: lib/Locale/Po4a/Man.pm
===================================================================
--- lib/Locale/Po4a/Man.pm	(revision 2754)
+++ lib/Locale/Po4a/Man.pm	(working copy)
@@ -2191,6 +2191,7 @@
 $macro{'TS'}=sub {
     my $self=shift;
     my ($in_headers,$buffer)=(1,"");
+    my ($in_textblock,$preline,$postline)=(0,"","");
     my ($line,$ref)=$self->shiftline();
 
     # Push table start
@@ -2206,18 +2207,49 @@
                 $in_headers = 0;
             }
             $self->pushline($self->r($line));
-        } elsif ($line =~ /\\$/) {
+        } elsif ($in_textblock && $line =~ /^T}\s*/) { # end of text block
+            $in_textblock = 0;
+            $preline = $&; # save the `T}' marker to be output later
+            $line = $';    # save the remaining part of the line
+            $self->pushline($self->translate($buffer,
+                                             $ref,
+                                            'tbl table'));
+            $buffer = "";
+            next; # continue processing with the remaining part of the line
+        } elsif ($in_textblock && $line =~ /^[.']/) {
+            # TODO: properly handle macros inside text blocks, currently we mark them
+            # for translations just like the previous version did
+            $self->pushline($self->translate($buffer,
+                                             $ref,
+                                            'tbl table'));
+            $self->pushline($self->translate($line,
+                                             $ref,
+                                            'tbl table'));
+            $buffer = "";
+        } elsif ($line =~ /\\$/ || $in_textblock) {
             # Lines are continued on \ at the end of line
             $buffer .= $line;
         } else {
+            if ($line =~ s/\s*T{\s*$//) { # start of text block
+              $in_textblock = 1;
+              $postline = $&; # save the `T{' to be output below
+            } elsif ($buffer eq "" && $line ne ""){ # single line data
+              chomp $line; # drop eol char from the entry to be translated
+              $postline = "\n"; # and save the eol for output below
+            }
+
             $buffer .= $line;
             # Arguments to translate are separated by \t
-            $self->pushline(join("\t",
-                                 map { $self->translate($buffer,
+            $self->pushline($preline
+                            .join("\t",
+                                 map { $self->translate($_,
                                                         $ref,
                                                         'tbl table')
-                                     } split (/\\t/,$line)));
-            $buffer = "";
+                                     } split (/\t/,$buffer))
+                           .$postline);
+
+            $buffer = $preline = $postline = "";
+ 
         }
         ($line,$ref)=$self->shiftline();
     }
