control: tags 767666 confirmed 

Hi,

On Wed, Nov 05, 2014 at 08:33:51AM +0100, Mattia Rizzolo wrote:
> ssh://git.debian.org/git/collab-maint/sigil.git

Thanks.  I got the source.  Let's trace the situation.

$ DEBUG=f debmake -k
D: /usr/bin/debmake started
D: PYTHONPATH = /usr/bin:/usr/lib/python3.4:/usr/lib/python3.4/plat-x86_64-linux-gnu:/usr/lib/python3.4/lib-dynload:/usr/local/lib/python3.4/dist-packages:/usr/lib/python3/dist-packages
I: set parameters
Dp: @post-para para[*]:
  para[package] = ""
  para[version] = ""
  para[revision] = ""
  para[targz] = ""

I: compare debian/copyright with the source
I:  60 %, ext = c
I:  19 %, ext = media
I:   5 %, ext = ui
I:   5 %, ext = ts
I:   2 %, ext = javascript
I:   1 %, ext = python
I:   1 %, ext = dic
I:   1 %, ext = text
I:   1 %, ext = cmake
I:   1 %, ext = aff
I:   1 %, ext = md
I:   1 %, ext = qrc
I:   0 %, ext = icns
I:   0 %, ext = rc
I:   0 %, ext = epub
I:   0 %, ext = install
I:   0 %, ext = ini
I:   0 %, ext = source
I:   0 %, ext = 1
I:   0 %, ext = generic
I:   0 %, ext = plist
I:   0 %, ext = iss
I:   0 %, ext = desktop
I:   0 %, ext = dist
I: check_all_licenses
I: Df: check_all_licenses file=README.md
.Df: check_all_licenses file=CMakeLists.txt
.Df: check_all_licenses file=COPYING.txt
... snip ...
.Df: check_all_licenses file=src/Sigil/Resource_Files/dictionaries/en_US.aff
.Df: check_all_licenses file=src/Sigil/Resource_Files/dictionaries/en_GB.aff
.Df: check_all_licenses file=src/Sigil/Resource_Files/dictionaries/en_GB.dic
.Df: check_all_licenses file=src/Sigil/Resource_Files/dictionaries/hyph_fr.dic
.Df: check_all_licenses file=src/Sigil/Resource_Files/dictionaries/de_DE.aff
.W: Non-UTF-8 char found, using latin-1: src/Sigil/Resource_Files/dictionaries/de_DE.aff
Df: check_all_licenses file=src/Sigil/Resource_Files/dictionaries/hyph_es.dic
.W: Non-UTF-8 char found, using latin-1: src/Sigil/Resource_Files/dictionaries/hyph_es.dic
Df: check_all_licenses file=src/Sigil/Resource_Files/dictionaries/About.txt
.Df: check_all_licenses file=src/Sigil/Resource_Files/dictionaries/es.aff
.W: Non-UTF-8 char found, using latin-1: src/Sigil/Resource_Files/dictionaries/es.aff
Df: check_all_licenses file=src/Sigil/Resource_Files/dictionaries/es.dic
.W: Non-UTF-8 char found, using latin-1: src/Sigil/Resource_Files/dictionaries/es.dic
Df: check_all_licenses file=src/Sigil/Resource_Files/dictionaries/hyph_de_DE.dic
.W: Non-UTF-8 char found, using latin-1: src/Sigil/Resource_Files/dictionaries/hyph_de_DE.dic
Df: check_all_licenses file=src/Sigil/Resource_Files/dictionaries/fr.dic
.^CTraceback (most recent call last):
  File "/usr/bin/debmake", line 28, in <module>
    debmake.main()
  File "/usr/lib/python3/dist-packages/debmake/__init__.py", line 133, in main
    debmake.kludge.kludge(para['kludge'])
  File "/usr/lib/python3/dist-packages/debmake/kludge.py", line 156, in kludge
    basedata = copydiff(mode)
  File "/usr/lib/python3/dist-packages/debmake/kludge.py", line 117, in copydiff
    data_new = debmake.copyright.check_copyright(nonlink_files, mode=1)
  File "/usr/lib/python3/dist-packages/debmake/copyright.py", line 802, in check_copyright
    adata = check_all_licenses(files, encoding=encoding, mode=mode)
  File "/usr/lib/python3/dist-packages/debmake/copyright.py", line 730, in check_all_licenses
    (copyright_data, license_lines) = check_license(file, encoding=encoding)
  File "/usr/lib/python3/dist-packages/debmake/copyright.py", line 699, in check_license
    (copyright_data, license_lines) = check_lines(fd.readlines())
  File "/usr/lib/python3/dist-packages/debmake/copyright.py", line 677, in check_lines
    copyright_data = analyze_copyright(copyright_lines)
  File "/usr/lib/python3/dist-packages/debmake/copyright.py", line 134, in analyze_copyright
    line = normalize_year_span(line).strip()
  File "/usr/lib/python3/dist-packages/debmake/copyright.py", line 53, in normalize_year_span
    m = re_year_1900.search(line)
KeyboardInterrupt

Yes, this bug exists.  debmake stalls while reading this file (I had to
interrupt it with ^C):
 src/Sigil/Resource_Files/dictionaries/fr.dic

This file is huge, about 1 MB (but the larger de_DE.dic is processed fine).
There is no copyright data in it.
It is text data with non-ASCII characters.  (It looks like UTF-8, although
the files for the other languages are latin-1.)
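
The properties listed above can be checked with a short script.  This is
only a minimal sketch and not part of debmake: profile_file is a
hypothetical helper, and its "try UTF-8, then fall back to latin-1" logic
is an assumption that merely mirrors the "Non-UTF-8 char found, using
latin-1" warning in the log.

```python
def profile_file(path):
    """Return size, guessed encoding, line count, and longest line length.

    Hypothetical diagnostic helper; debmake's own logic may differ.
    """
    with open(path, "rb") as fd:
        data = fd.read()
    try:
        text = data.decode("utf-8")
        encoding = "utf-8"
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this cannot fail.
        text = data.decode("latin-1")
        encoding = "latin-1"
    lines = text.splitlines()
    return {
        "bytes": len(data),
        "encoding": encoding,
        "lines": len(lines),
        "longest_line": max((len(line) for line in lines), default=0),
    }
```

Calling profile_file() on fr.dic and on de_DE.dic and comparing the two
results should show whether the difference is the encoding, the number of
lines, or the length of individual lines.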

I need to think a bit to get this fixed.
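
One way to narrow this down might be to time a regex search on each line
of the file.  The traceback was interrupted inside re_year_1900.search(),
but a single ^C sample does not prove that the regex is the bottleneck, so
treat this as a rough sketch.  slowest_lines is a hypothetical helper, and
example_pattern is only a stand-in: the real re_year_1900 pattern is not
shown in the traceback.

```python
import re
import time


def slowest_lines(pattern, lines, top=3):
    """Time pattern.search() on each line and return the slowest ones.

    Returns a list of (seconds, line_number, line_length) tuples,
    slowest first.  Line numbers start at 1.
    """
    timings = []
    for number, line in enumerate(lines, start=1):
        start = time.perf_counter()
        pattern.search(line)
        elapsed = time.perf_counter() - start
        timings.append((elapsed, number, len(line)))
    timings.sort(reverse=True)
    return timings[:top]


# Stand-in pattern for illustration only; NOT the pattern used by debmake.
example_pattern = re.compile(r"(19|20)\d\d")
```

If one search never returns, the loop will hang on that line, so printing
the line number before each search would be a useful variation.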

Osamu

