Package: detox
Version: 1.2.0-5
Severity: normal
Tags: upstream patch

Dear Maintainer,

detox does not "pass through" characters that a table leaves unmapped.
That is, when a .tbl table file has no "default" line (or a "default"
line with nothing else on it) and lists only the changes I want made to
my filenames, detox does not produce the desired result. Instead, for
certain UTF-8 characters, it writes malformed (incomplete) output
characters.

Here's an example of what I cannot achieve with detox (changing only a
few problematic chars while keeping all the Greek, Cyrillic, etc.
chars), running in uxterm (or xterm -u8):

$ cat /home/justa/etc/detox/ztest.tbl
start
0x0026          _and_   # AMPERSAND

# Chars to translate to _
0x0020          _       # space
0x0021          _       # !
0x0022          _       # "
0x0024          _       # $
0x0027          _       # '
0x002a          _       # *
0x002f          _       # /
0x003a          _       # :
0x003b          _       # ;
0x003c          _       # <
0x003e          _       # >
0x003f          _       # ?
0x0040          _       # @
0x005c          _       # \
0x0060          _       # `
0x007c          _       # |

# Chars to translate to -
0x0028          -       # (
0x0029          -       # )
0x005b          -       # [
0x005d          -       # ]
0x007b          -       # {
0x007d          -       # }
end

$ cat ~/.detoxrc
sequence gnu {
   utf_8 {filename "/home/justa/etc/detox/ztest.tbl";};
};

$ env|egrep "LOC|LANG|UTF|utf|LC"
LC_ALL=zen.UTF-8
MAILCHECK=0
LANG=zen.UTF-8
XTERM_LOCALE=zen.UTF-8
 unset -v CLASSPATH_LOCAL;

$ # (my custom locale is just UTF-8 with a custom default date format)

$ touch "mÉ Æ.txt"

$ ls *txt; ls -l *txt; ls -lb *txt
mÉ Æ.txt
-rw------- 1 justa justa 0 Apr 30 22:04 mÉ Æ.txt
-rw------- 1 justa justa 0 Apr 30 22:04 mÉ\ Æ.txt


What I want is for the file "mÉ Æ.txt" to end up with the following
name:
 mÉ_Æ.txt


but instead as we can see:

$ detox -vs gnu *txt
Scanning: mÉ Æ.txt
mÉ Æ.txt -> m .txt

$ ls *txt; ls -l *txt; ls -lb *txt
m? ?.txt
-rw------- 1 justa justa 0 Apr 30 22:04 m? ?.txt
-rw------- 1 justa justa 0 Apr 30 22:04 m\207\ \204.txt


and we see that some malformed chars have been created: ls -lb shows
the lone bytes \207 and \204, which are not valid UTF-8 on their own
(I'm not sure exactly where they come from).
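
One way to confirm that a filename is no longer valid UTF-8 is to run
it through iconv. This is only a rough sketch: is_valid_utf8 is a helper
name I made up (it is not part of detox), and it assumes an iconv(1)
that exits non-zero on invalid input, as the glibc one does.

```shell
# Hypothetical helper, not part of detox: succeeds only if the argument
# is a well-formed UTF-8 byte sequence, as judged by iconv.
is_valid_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Example use (assumption: run in the directory holding the files):
#   for f in *txt; do
#       is_valid_utf8 "$f" || printf 'not valid UTF-8: %s\n' "$f"
#   done
```

With the names from the ls -lb output above, I would expect the original
name to pass this check and the renamed one to fail it.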


The patch, thanks to Vasily Kolobkov, is quite simple: it basically just
adds a missing "continue;" in a couple of places, fixing the clean_safe
and clean_iso8859_1 functions. A similar change is probably needed in
clean_utf_8 as well; this is not yet done.
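
To make the pass-through behaviour I am after concrete, here is a rough
shell sketch of it. It is not detox code and not part of the patches:
passthrough_name is a name I made up, and the character sets are copied
by hand from ztest.tbl above. Every character in that table is ASCII,
and UTF-8 multibyte sequences never contain ASCII bytes, so replacing
these bytes one at a time leaves all other characters intact.

```shell
# Hypothetical helper, not part of detox: applies the ztest.tbl mappings
# byte-wise and passes every other byte through unchanged.
passthrough_name() {
    printf '%s' "$1" |
        # 0x0026 (&) becomes _and_
        LC_ALL=C sed 's/&/_and_/g' |
        # the 16 "translate to _" characters from the table
        LC_ALL=C tr ' !"$'"'"'*/:;<>?@\\`|' '________________' |
        # the six "translate to -" characters: ( ) [ ] { }
        LC_ALL=C sed 's/[][(){}]/-/g'
}
```

Calling passthrough_name "mÉ Æ.txt" should print mÉ_Æ.txt, which is the
name I expected detox to produce with this table.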

Patch 1 is just the fix.

Patch 2 adds example table files for fine grained "cascading" in user
defined detox config sequences.

Patch 3 tidies up the "sample" filenames. They don't need to end with
".sample", and the files are required in a normal installation anyway,
so on e.g. Debian stable they end up as duplicate files (these should at
least be symlinks, but shouldn't be duplicated at all: DRY / remove
redundancy).




-- System Information:
Debian Release: 8.7
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'proposed-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.9.0-0.bpo.2-amd64 (SMP w/4 CPU cores)
Locale: LANG=zen.UTF-8, LC_CTYPE=zen.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to zen.UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages detox depends on:
ii  libc6  2.19-18+deb8u7

detox recommends no packages.

detox suggests no packages.

-- no debconf information
