Package: catdoc
Followup-For: Bug #815109

(I'm not a Debian Developer; I'm just a debian-l10n-english regular)

Thomas Vincent wrote:
> Subject: catdoc: Typos in the package description
> 
> Package: catdoc
> Severity: minor
> 
> Dear Maintainer,
> 
> I spotted a couple of typos while translating catdoc's description:
> 
>  * s/insinde/inside/ in the first paragraph
>  * s/it's goal/its goal/ in the third paragraph

And there are quite a few other problems you didn't spot.  Here's the
nitpicky review I'd have given this description if it had come through
the d-l-e mailing list (plus patch):

| Convert Word, Excel, and PowerPoint files to plain text

That's a capitalised verb phrase; the Developer's Reference recommends
using an uncapitalised noun phrase (such as the old version's synopsis
"MS-Word to TeX or plain text converter").  If you really want to
mention the other MS apps here, you could make it:

  plain text extractor for MS-Word/Excel/PowerPoint files

or since those words already occur in the long description, maybe the
synopsis should summarise by using a cover term (providing an extra
keyword for searches):

  text extractor for MS-Office files

or even, this being the twenty-first century,

  text extractor for Microsoft/Open/LibreOffice files

but my patch sticks to "MS-Office".
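
For what it's worth, the mechanical parts of that convention can be
checked automatically.  Here's a rough Python sketch; the function
name and the heuristics are my own invention, not anything taken from
the Developer's Reference or from lintian, and the 80-character limit
is an assumption.  It can't tell a noun phrase from a verb phrase, so
all it catches is a capitalised ordinary first word, a leading
article, a trailing full stop, or an over-long line:

```python
import re

ARTICLES = {"a", "an", "the"}
MAX_LEN = 80  # assumed limit, adjust to taste


def synopsis_problems(synopsis):
    """Return a list of complaints about a package synopsis (crude heuristics)."""
    words = synopsis.split()
    if not words:
        return ["empty synopsis"]
    problems = []
    first = words[0]
    # "Convert" looks like a capitalised ordinary word; "MS-Word" does not.
    if re.fullmatch(r"[A-Z][a-z]+", first):
        problems.append("starts with a capitalised word")
    if first.lower() in ARTICLES:
        problems.append("starts with an article")
    if synopsis.rstrip().endswith("."):
        problems.append("ends with a full stop")
    if len(synopsis) > MAX_LEN:
        problems.append("longer than %d characters" % MAX_LEN)
    return problems


for s in ("Convert Word, Excel, and PowerPoint files to plain text",
          "text extractor for MS-Office files"):
    print(s, "->", synopsis_problems(s) or "ok")
```

The first-word test is deliberately narrow so that a synopsis opening
with something like "MS-Word" isn't flagged, but it will still
misfire on a proper noun such as "Microsoft".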
       
| The catdoc program reads one or more Microsoft word files and outputs text,
| contained insinde them to standard output.

Lost capitalisation: "Microsoft Word".

You've got a missing determiner or similar - "outputs the text".

Surplus comma: "the text contained inside them".

Typo: s/insinde/inside/.

It doesn't just find text and output it; it converts the contents
*into* text.  I'd phrase this as:

  The catdoc program reads one or more Microsoft Word files and outputs
  their contents to standard output as text.

I gather the current version supports things more recent than Word-97!

| It is now accompanied by xls2csv, a program which converts Excel spreadsheet
| into comma-separated value file, and catppt a utility to extract textual
| information from Powerpoint files.
       
That's not news - xls2csv was added back in the nineties.

You've got the missing-determiner problem again twice, though I notice
the original upstream version loses the indefinite articles too.  The
easiest fix this time would be to say "converts Excel spreadsheets
into comma-separated-value files" (with fully hyphenated "c-s-v").
Except that this makes it sound as if it modifies the .xls files in
place, rather than just extracting the data to stdout; say "c-s-v
format" instead.

Then "and catppt a utility" needs a comma, and PowerPoint should be
camelcase.

Maybe:

  It is accompanied by xls2csv, a program which converts Excel spreadsheets
  into comma-separated-values format, and catppt, a utility to extract textual
  information from PowerPoint files.
       
| It doesn't try to preserve Word formatting, it's goal is to extract plain text
| and allow you to read it and, probably, reformat with TeX.

That first comma needs to be at least a semicolon.

Bogus apostrophe.

Wonky elision: you can say "to read it and reformat it with TeX",
or drop the first "it", but dropping the second is ungrammatical.

The intended interpretation for "probably" here seems to be "you'll
probably want to", but after "allow you to" the more natural reading is
"it might not work".  Instead I'd suggest marking it as optional by
parenthesising it:

  It doesn't try to preserve Word formatting; its goal is to extract plain
  text and allow you to read it (and, probably, reformat it with TeX).

| This package suggests tk because it also includes wordview, an optional
| Tk-based GUI for catdoc. The MIME config provided in this package will use
| wordview if X is running, or catdoc directly if it is not.

No problems here.

Mind you, this description essentially predates the availability on
Debian systems of Open/LibreOffice, so really it needs some added text
to cover (for instance) the possibility that I might want to use
catdoc on my own LibreOffice .doc files!
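
Incidentally, if anyone would rather re-wrap reworded paragraphs than
apply my patch by hand, the layout of the Description field in
debian/control is easy to generate: the synopsis goes on the first
line, every continuation line starts with a space, and paragraphs are
separated by a line holding a space and a dot.  A minimal Python
sketch follows; the function name and the 79-column width are my own
assumptions, and it normalises runs of spaces, so double spaces after
full stops are lost:

```python
import textwrap


def format_description(synopsis, paragraphs, width=79):
    """Build a debian/control Description field from plain paragraphs."""
    lines = ["Description: " + synopsis]
    for i, para in enumerate(paragraphs):
        if i > 0:
            lines.append(" .")  # paragraph separator: a space, then a dot
        # Collapse whitespace, then wrap, leaving room for the leading space.
        wrapped = textwrap.wrap(" ".join(para.split()), width=width - 1)
        lines.extend(" " + line for line in wrapped)
    return "\n".join(lines)


print(format_description(
    "text extractor for MS-Office files",
    ["The catdoc program reads one or more Microsoft Word files and "
     "outputs their contents to standard output as text.",
     "It doesn't try to preserve Word formatting; its goal is to extract "
     "plain text and allow you to read it (and, probably, reformat it "
     "with TeX)."]))
```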
-- 
JBR     with qualifications in linguistics, experience as a Debian
        sysadmin, and probably no clue about this particular package
--- catdoc-0.94.3~git20160113.dbc9ec6+dfsg.pristine/debian/control	2016-01-13 22:44:37.000000000 +0000
+++ catdoc-0.94.3~git20160113.dbc9ec6+dfsg/debian/control	2016-02-19 00:30:43.299037609 +0000
@@ -12,16 +12,16 @@
 Depends: ${shlibs:Depends}, ${misc:Depends}
 Suggests: tk | wish
 Homepage: http://www.wagner.pp.ru/~vitus/software/catdoc/
-Description: Convert Word, Excel, and PowerPoint files to plain text
- The catdoc program reads one or more Microsoft word files and outputs text,
- contained insinde them to standard output.
+Description: text extractor for MS-Office files
+ The catdoc program reads one or more Microsoft Word files and outputs
+ their contents to standard output as text.
  .
- It is now accompanied by xls2csv, a program which converts Excel spreadsheet
- into comma-separated value file, and catppt a utility to extract textual
- information from Powerpoint files.
+ It is accompanied by xls2csv, a program which converts Excel spreadsheets
+ into comma-separated-values format, and catppt, a utility to extract textual
+ information from PowerPoint files.
  .
- It doesn't try to preserve Word formatting, it's goal is to extract plain text
- and allow you to read it and, probably, reformat with TeX.
+ It doesn't try to preserve Word formatting; its goal is to extract plain
+ text and allow you to read it (and, probably, reformat it with TeX).
  .
  This package suggests tk because it also includes wordview, an
  optional Tk-based GUI for catdoc.  The MIME config provided in this
