Bug#805054: ITP: teckit -- Encoding conversion tools for plain text files

Jonas Smedegaard Sun, 15 Nov 2015 05:21:36 -0800

Quoting Daniel Glassey (2015-11-14 18:39:22)
> On Sat, Nov 14, 2015 at 02:02:47AM +0100, Jonas Smedegaard wrote:
> > Quoting Daniel Glassey (2015-11-14 01:01:19)
> > > On Sat, Nov 14, 2015 at 12:24:05AM +0100, Jonas Smedegaard wrote:
> > > > Quoting Daniel Glassey (2015-11-13 23:54:38)
> > > > > Package name    : teckit
> > > > 
> > > > > Description:
> > > > 
> > > > > TECkit provides a generic library and tools for converting data to
> > > > > and from Unicode and also from one Unicode encoding to another.
> > > > > It also includes a compiler for a description language that allows
> > > > > for bidirectional conversion description (the same description is
> > > > > used for conversion to and from Unicode, for example).
> > > > 
> > > > How is this different from recode?
> > 
> > > My understanding is that teckit is a way to (relatively) easily define 
> > > a mapping for legacy custom encodings to Unicode. It is particularly 
> > > useful for converting text documents using legacy non-Unicode fonts 
> > > which use a font specific encoding to Unicode.
> > > 
> > > It is possible to add custom character sets to recode but afaict that 
> > > is a programmer level thing whereas teckit mappings are easier for the 
> > > font people.
> > 
> > Thanks for those details.
> > 
> > Seems appropriate to me to include in long description.
> 
> Thanks Jonas. How do you think this looks as a long description?:
> 
> Description: Encoding conversion tools for plain text files
>  TECkit is a toolkit for encoding conversions. It offers a simple format for
>  describing the mapping between legacy 8-bit encodings and Unicode, and a
>  set of utilities based on such descriptions for converting text between 8-bit
>  and Unicode encodings.
>  .
>  It also includes a compiler for the mapping description language that allows
>  for bidirectional conversion description (i.e. the same description is used
>  for conversion to and from Unicode).


I see nothing wrong with the description paragraphs you propose above 
(nor the ones proposed initially), but my suggestion is to add, in 
_addition_ to those, a paragraph on what sets this package apart from 
other packages already in Debian, e.g. recode.

I must admit that I still do not understand which features of this tool 
are different from what recode provides (and I suspect you are using the 
term "fonts" wrong and really mean "encodings").

Here's how I would recode the name of the island I live on into 
ISO-8859-1 (a.k.a. latin1) or ISO-8859-13 (a.k.a. latin7):

  echo Orø | recode utf8..l1 | less

  echo Orø | recode utf8..l7 | less

(In a UTF-8 capable shell, the above should show how the ø (i.e. &oslash; 
as an HTML entity) is encoded differently in those two legacy encodings.)
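
For anyone without recode at hand, the same comparison can be sketched 
in Python. This is only an illustration, assuming Python 3.8 or newer 
and its standard iso8859_1 and iso8859_13 codecs; the helper name 
legacy_bytes is made up for this example and is not part of recode or 
TECkit.

```python
# Encode the same text into two legacy 8-bit encodings and show the raw bytes.
# Assumption: Python 3.8+ (bytes.hex with a separator argument).

def legacy_bytes(text: str, encoding: str) -> str:
    """Return the bytes of `text` in `encoding` as space-separated hex."""
    return text.encode(encoding).hex(" ")


if __name__ == "__main__":
    name = "Or\u00f8"  # "Orø", written with an escape to stay ASCII-safe
    print("ISO-8859-1 :", legacy_bytes(name, "iso8859_1"))
    print("ISO-8859-13:", legacy_bytes(name, "iso8859_13"))
```

The two results differ only in the last byte (f8 in ISO-8859-1 versus 
b8 in ISO-8859-13), which is the byte that represents ø.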

Do you mean to say that TECkit is useful for handling fully custom 
encodings, or perhaps that you were unaware that recode was a 
commandline tool?


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private
