Package: file
Version: 4.26-2
Severity: normal

UTF-32BE files beginning with a Byte Order Mark are not properly detected
because the unicode magic doesn't match a properly encoded 32-bit big-endian
Byte Order Mark. The current match is for FE FF 00 00, but it should be
00 00 FE FF. The attached diff adds a new patch to fix magic/Magdir/unicode.

This was initially reported on Ubuntu's bug tracker, and has been confirmed
against the current Debian git:
https://bugs.launchpad.net/ubuntu/+source/file/+bug/285309

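The byte-order mismatch described above can be checked independently of file(1). The following is only a minimal sketch in Python: the two constants are copied from the old and new magic entries, while the helper name sample_utf32be and the sample text are made up for illustration.

```python
import codecs

# Byte patterns from magic/Magdir/unicode, written with the same octal escapes.
OLD_UTF32BE_MAGIC = b"\376\377\000\000"  # FE FF 00 00, the stock entry
NEW_UTF32BE_MAGIC = b"\000\000\376\377"  # 00 00 FE FF, the proposed entry


def sample_utf32be(text):
    """Hypothetical helper: encode text as UTF-32BE with a leading BOM."""
    return codecs.BOM_UTF32_BE + text.encode("utf-32-be")


data = sample_utf32be("hello")

print(data[:4].hex(" "))                   # first four bytes of the file
print(data.startswith(OLD_UTF32BE_MAGIC))  # does the stock entry match?
print(data.startswith(NEW_UTF32BE_MAGIC))  # does the proposed entry match?
```

With Python 3.8 or later this should print "00 00 fe ff", then False, then True, which is consistent with the stock entry never matching a correctly encoded UTF-32BE file.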
From 2856f3bb0451b9cf648374199b68a9d94b1201d9 Mon Sep 17 00:00:00 2001
From: Adam Buchbinder <adam.buchbin...@gmail.com>
Date: Thu, 29 Jan 2009 16:11:37 -0500
Subject: [PATCH] Fix UTF-32BE BOM magic.

The UTF-32 BOM is wrong for big-endian files. According to the Unicode FAQ
[http://unicode.org/faq/utf_bom.html#bom4], it should be 00 00 FE FF.
---
 debian/patches/00list                       |    1 +
 debian/patches/101-magic-fix-utf32be.dpatch |   22 ++++++++++++++++++++++
 2 files changed, 23 insertions(+), 0 deletions(-)
 create mode 100644 debian/patches/101-magic-fix-utf32be.dpatch

diff --git a/debian/patches/00list b/debian/patches/00list
index 82c74bf..61b1234 100644
--- a/debian/patches/00list
+++ b/debian/patches/00list
@@ -1,3 +1,4 @@
+101-magic-fix-utf32be.dpatch
 202-magic-update-awk.dpatch
 203-magic-update-reiserfs.dpatch
 204-magic-update-asf.dpatch
diff --git a/debian/patches/101-magic-fix-utf32be.dpatch b/debian/patches/101-magic-fix-utf32be.dpatch
new file mode 100644
index 0000000..a7c7667
--- /dev/null
+++ b/debian/patches/101-magic-fix-utf32be.dpatch
@@ -0,0 +1,22 @@
+#! /bin/sh /usr/share/dpatch/dpatch-run
+## 101-magic-fix-utf32be.dpatch by Adam Buchbinder <adam.buchbin...@gmail.com>
+##
+## All lines beginning with `## DP:' are a description of the patch.
+## DP: UTF-32BE text is detected by the presence of the Byte Order Mark, in
+## DP: UTF-32BE encoding. The stock version of the BOM is incorrect; it should
+## DP: read 00 00 FE FF, according to the Unicode FAQ.[1] (LP: #285309)
+## DP:
+## DP: [1] http://unicode.org/faq/utf_bom.html#bom4
+
+...@dpatch@
+diff -urNad file-4.24~/magic/Magdir/unicode file-4.24/magic/Magdir/unicode
+--- file-4.24~/magic/Magdir/unicode 2008-02-28 13:57:35.000000000 -0500
++++ file-4.24/magic/Magdir/unicode 2009-01-29 15:31:01.000000000 -0500
+@@ -9,6 +9,6 @@
+ 0 string +/v+ Unicode text, UTF-7
+ 0 string +/v/ Unicode text, UTF-7
+ 0 string \335\163\146\163 Unicode text, UTF-8-EBCDIC
+-0 string \376\377\000\000 Unicode text, UTF-32, big-endian
++0 string \000\000\376\377 Unicode text, UTF-32, big-endian
+ 0 string \377\376\000\000 Unicode text, UTF-32, little-endian
+ 0 string \016\376\377 Unicode text, SCSU (Standard Compression Scheme for Unicode)
-- 
1.5.6.3