Package: detox
Version: 1.2.0-5
Severity: normal
Tags: upstream patch
Dear Maintainer,
detox does not "pass through" unlisted characters. That is, when a .tbl
table file has no "default" line, or a "default" line with nothing else
on it, followed by the translations I want applied to my files, detox
does not produce the desired result. Instead, for certain UTF-8
characters, detox writes malformed (i.e. incomplete) characters to the
output.
Here's an example of what I cannot achieve with detox (changing only a
few problematic characters while keeping all the Greek, Cyrillic, etc.
characters), running in uxterm (or xterm -u8):
$ cat /home/justa/etc/detox/ztest.tbl
start
0x0026 _and_ # AMPERSAND
# Chars to translate to _
0x0020 _ # space
0x0021 _ # !
0x0022 _ # "
0x0024 _ # $
0x0027 _ # '
0x002a _ # *
0x002f _ # /
0x003a _ # :
0x003b _ # ;
0x003c _ # <
0x003e _ # >
0x003f _ # ?
0x0040 _ # @
0x005c _ # \
0x0060 _ # `
0x007c _ # |
# Chars to translate to -
0x0028 - # (
0x0029 - # )
0x005b - # [
0x005d - # ]
0x007b - # {
0x007d - # }
end
$ cat ~/.detoxrc
sequence gnu {
utf_8 {filename "/home/justa/etc/detox/ztest.tbl";};
};
$ env|egrep "LOC|LANG|UTF|utf|LC"
LC_ALL=zen.UTF-8
MAILCHECK=0
LANG=zen.UTF-8
XTERM_LOCALE=zen.UTF-8
unset -v CLASSPATH_LOCAL;
$ # (my custom locale is just UTF-8 with a custom default date format)
$ touch "mÉ Æ.txt"
$ ls *txt; ls -l *txt; ls -lb *txt
mÉ Æ.txt
-rw------- 1 justa justa 0 Apr 30 22:04 mÉ Æ.txt
-rw------- 1 justa justa 0 Apr 30 22:04 mÉ\ Æ.txt
What I want is for the file "mÉ Æ.txt" to end up with the following
name:
mÉ_Æ.txt
but instead as we can see:
$ detox -vs gnu *txt
Scanning: mÉ Æ.txt
mÉ Æ.txt -> m .txt
$ ls *txt; ls -l *txt; ls -lb *txt
m? ?.txt
-rw------- 1 justa justa 0 Apr 30 22:04 m? ?.txt
-rw------- 1 justa justa 0 Apr 30 22:04 m\207\ \204.txt
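and we see that malformed characters have been created: the bytes \207
and \204 are lone UTF-8 continuation bytes, so the new name is no longer
valid UTF-8 (I'm not sure exactly how detox arrives at them).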
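The octal escapes in the `ls -lb` listings can be checked independently of detox. Below is a minimal sketch in Python (the helper name `is_valid_utf8` is made up for illustration, and the byte strings are copied from the listings above) showing that the original name decodes as UTF-8 while the renamed one does not:

```python
# Byte strings taken from the `ls -lb` listings in this report.
# Nothing here calls detox; it only inspects the resulting names.
original = "m\u00c9 \u00c6.txt".encode("utf-8")  # b'm\xc3\x89 \xc3\x86.txt'
renamed = b"m\207 \204.txt"                      # octal escapes as printed by ls -lb


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(original))  # True
print(is_valid_utf8(renamed))   # False
```

Bytes in the range 0x80 to 0xBF (here 0x87 and 0x84) are only legal as continuation bytes after a lead byte, which is why `ls` can only show them as `?`.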
The patch, thanks to Vasily Kolobkov, is quite simple: a missing
"continue;" is added in a couple of places, fixing the clean_safe and
clean_iso8859_1 functions. A similar change is probably needed in the
clean_utf_8 function; this is not yet done.
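To make the expected "pass through" behaviour concrete, here is a small Python model of it. This is only a sketch of the desired result, not detox's implementation: `TABLE` and `translate` are hypothetical names, and the table holds just a few of the entries from ztest.tbl.

```python
# Hypothetical model of the intended behaviour; this is not detox's code.
# TABLE mirrors a few entries from ztest.tbl (code point -> replacement).
TABLE = {
    0x0020: "_",      # space
    0x0026: "_and_",  # AMPERSAND
    0x0028: "-",      # (
    0x0029: "-",      # )
}


def translate(name: str, table: dict) -> str:
    """Replace listed code points; pass every other character through."""
    out = []
    for ch in name:
        # dict.get falls back to the character itself when it is not listed,
        # which is the "pass through" this report asks for.
        out.append(table.get(ord(ch), ch))
    return "".join(out)


print(translate("m\u00c9 \u00c6.txt", TABLE))  # mÉ_Æ.txt
```

Working on whole code points rather than single bytes is what keeps É and Æ intact in this model; only the space is rewritten.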
Patch 1 is just the fix.
Patch 2 adds example table files for fine-grained "cascading" in
user-defined detox config sequences.
Patch 3 tidies up the "sample" filenames. They don't need to end with
".sample", and the files are required in a normal installation anyway,
so on e.g. Debian stable they end up as duplicate files (these should be
symlinks at least, but shouldn't be duplicated at all: DRY, remove
redundancy).
-- System Information:
Debian Release: 8.7
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'proposed-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 4.9.0-0.bpo.2-amd64 (SMP w/4 CPU cores)
Locale: LANG=zen.UTF-8, LC_CTYPE=zen.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to zen.UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Versions of packages detox depends on:
ii libc6 2.19-18+deb8u7
detox recommends no packages.
detox suggests no packages.
-- no debconf information