Package: catdoc
Followup-For: Bug #815109

(I'm not a Debian Developer, I'm just a debian-l10n-english regular)

Thomas Vincent wrote:
> Subject: catdoc: Typos in the package description
>
> Package: catdoc
> Severity: minor
>
> Dear Maintainer,
>
> I spotted a couple of typos while translating catdoc's description:
>
>  * s/insinde/inside/ in the first paragraph
>  * s/it's goal/its goal/ in the third paragraph

And there are quite a few other problems you didn't spot. Here's the
nitpicky review I'd have given this description if it had come through
the d-l-e mailing list (plus patch):

| Convert Word, Excel, and PowerPoint files to plain text

That's a capitalised verb phrase; the Developer's Reference recommends
using an uncapitalised noun phrase (such as the old version's synopsis
"MS-Word to TeX or plain text converter"). If you really want to
mention the other MS apps here, you could make it:

   plain text extractor for MS-Word/Excel/PowerPoint files

or since those words already occur in the long description, maybe the
synopsis should summarise by using a cover term (providing an extra
keyword for searches):

   text extractor for MS-Office files

or even, this being the twenty-first century,

   text extractor for Microsoft/Open/LibreOffice files

but my patch sticks to "MS-Office".

| The catdoc program reads one or more Microsoft word files and outputs text,
| contained insinde them to standard output.

Lost capitalisation: "Microsoft Word". You've got a missing determiner
or similar - "outputs the text". Surplus comma: "the text contained
inside them". Typo: s/insinde/inside/. It doesn't just find text and
output it; it converts the contents *into* text. I'd phrase this as:

   The catdoc program reads one or more Microsoft Word files and outputs
   their contents to standard output as text.

I gather the current version supports things more recent than Word-97!

| It is now accompanied by xls2csv, a program which converts Excel spreadsheet
| into comma-separated value file, and catppt a utility to extract textual
| information from Powerpoint files.

That's not news - xls2csv was added back in the nineties. You've got
the missing-determiner problem again twice, though I notice the
original upstream version loses the indefinite articles too. The
easiest fix this time would be to say "converts Excel spreadsheets into
comma-separated-value files" (with fully hyphenated "c-s-v"). Except
that this makes it sound as if it modifies the .xls files in place,
rather than just extracting the data to stdout; say "c-s-v format"
instead. Then "and catppt a utility" needs a comma, and PowerPoint
should be camelcase. Maybe:

   It is accompanied by xls2csv, a program which converts Excel spreadsheets
   into comma-separated-values format, and catppt, a utility to extract textual
   information from PowerPoint files.

| It doesn't try to preserve Word formatting, it's goal is to extract plain text
| and allow you to read it and, probably, reformat with TeX.

That first comma needs to be at least a semicolon. Bogus apostrophe.
Wonky elision: you can say "to read it and reformat it with TeX", or
drop the first "it", but dropping the second is ungrammatical. The
intended interpretation for "probably" here seems to be "you'll
probably want to", but after "allow you to" the more natural reading is
"it might not work". Instead I'd suggest marking it as optional by
parenthesising it:

   It doesn't try to preserve Word formatting; its goal is to extract plain
   text and allow you to read it (and, probably, reformat it with TeX).

| This package suggests tk because it also includes wordview, an optional
| Tk-based GUI for catdoc. The MIME config provided in this package will use
| wordview if X is running, or catdoc directly if it is not.

No problems here. Mind you, this description essentially predates the
availability on Debian systems of Open/LibreOffice, so really it needs
some added text to cover (for instance) the possibility that I might
want to use catdoc on my own LibreOffice .doc files!
-- 
JBR	with qualifications in linguistics, experience as a Debian sysadmin,
	and probably no clue about this particular package

--- catdoc-0.94.3~git20160113.dbc9ec6+dfsg.pristine/debian/control 2016-01-13 22:44:37.000000000 +0000
+++ catdoc-0.94.3~git20160113.dbc9ec6+dfsg/debian/control 2016-02-19 00:30:43.299037609 +0000
@@ -12,16 +12,16 @@
 Depends: ${shlibs:Depends}, ${misc:Depends}
 Suggests: tk | wish
 Homepage: http://www.wagner.pp.ru/~vitus/software/catdoc/
-Description: Convert Word, Excel, and PowerPoint files to plain text
- The catdoc program reads one or more Microsoft word files and outputs text,
- contained insinde them to standard output.
+Description: text extractor for MS-Office files
+ The catdoc program reads one or more Microsoft Word files and outputs
+ their contents to standard output as text.
  .
- It is now accompanied by xls2csv, a program which converts Excel spreadsheet
- into comma-separated value file, and catppt a utility to extract textual
- information from Powerpoint files.
+ It is accompanied by xls2csv, a program which converts Excel spreadsheets
+ into comma-separated-values format, and catppt, a utility to extract textual
+ information from PowerPoint files.
  .
- It doesn't try to preserve Word formatting, it's goal is to extract plain text
- and allow you to read it and, probably, reformat with TeX.
+ It doesn't try to preserve Word formatting; its goal is to extract plain
+ text and allow you to read it (and, probably, reformat it with TeX).
  .
  This package suggests tk because it also includes wordview, an optional
  Tk-based GUI for catdoc. The MIME config provided in this