[issue10551] mimetypes reading from registry in windows completely broken

2010-11-27 Thread Kovid Goyal

New submission from Kovid Goyal :

Hi,

I am the primary developer of calibre (http://calibre-ebook.com), and yesterday I 
released an upgrade of calibre based on Python 2.7. Here is a small sample of 
the diverse errors my users experienced, all related to reading mimetypes 
from the registry:

1. Permission denied when running from a non-privileged account

Traceback (most recent call last):
  File "site.py", line 103, in main
  File "site.py", line 84, in run_entry_point
  File "site-packages\calibre\__init__.py", line 31, in
  File "mimetypes.py", line 344, in add_type
  File "mimetypes.py", line 355, in init
  File "mimetypes.py", line 261, in read_windows_registry
WindowsError: [Error 5] Acceso denegado (Access not allowed)

The fix for this is to trap WindowsError and ignore it in mimetypes.py
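
A minimal sketch of that idea, not the actual mimetypes.py patch. It assumes a hypothetical `read_registry` callable standing in for the registry-reading step, and it catches `OSError` because on Python 3 `WindowsError` is an alias of `OSError`:

```python
def load_registry_types(read_registry):
    """Return registry MIME mappings, or an empty dict if the registry is unreadable.

    `read_registry` is a hypothetical callable standing in for the
    registry-reading step; it is not part of the mimetypes module.
    """
    try:
        return dict(read_registry())
    except OSError:
        # Access denied and similar errors: carry on without registry entries.
        return {}


def denied():
    # Simulates the "[Error 5] access denied" failure from the traceback.
    raise PermissionError(5, "Access is denied")


print(load_registry_types(denied))
print(load_registry_types(lambda: {".jpg": "image/jpeg"}))
```

With this shape, a locked-down account simply ends up with the built-in mappings instead of a crash at import time.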

2. Mishandling of encoding of registry entries

Traceback (most recent call last):
  File "site.py", line 103, in main
  File "site.py", line 84, in run_entry_point
  File "site-packages\calibre\__init__.py", line 31, in
  File "mimetypes.py", line 344, in add_type
  File "mimetypes.py", line 355, in init
  File "mimetypes.py", line 260, in read_windows_registry
  File "mimetypes.py", line 250, in enum_types
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe0 in position 0: invalid continuation byte

The fix for this is to change

except UnicodeEncodeError

to

except ValueError
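
Why the narrower handler misses the error can be checked directly: `UnicodeDecodeError` and `UnicodeEncodeError` are sibling classes, and both derive from `ValueError`. A small sketch, written for Python 3 and using the byte from the traceback (the helper name `decode_or_none` is made up for illustration):

```python
def decode_or_none(raw):
    """Decode raw bytes as UTF-8, returning None for undecodable input."""
    try:
        return raw.decode("utf-8")
    except ValueError:
        # Catches UnicodeDecodeError as well as UnicodeEncodeError.
        return None


print(issubclass(UnicodeDecodeError, UnicodeEncodeError))  # False
print(issubclass(UnicodeDecodeError, ValueError))          # True
print(decode_or_none(b"\xe0"))                             # None
```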

3. python -c "import mimetypes; print mimetypes.guess_type('img.jpg')"
('image/pjpeg', None)

Where the output should have been

('image/jpeg', None)

The fix for this is to load the registry entries before the default entries 
defined in mimetypes.py, so that the defaults take precedence.
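
The ordering argument can be illustrated with plain dictionaries. This is only a sketch: `registry_types` and `default_types` are made-up stand-ins, not the module's real tables.

```python
# Made-up sample data: what a registry might report vs. the standard mappings.
registry_types = {".jpg": "image/pjpeg", ".foo": "application/x-foo"}
default_types = {".jpg": "image/jpeg", ".png": "image/png"}


def build_types_map(registry, defaults):
    types_map = {}
    types_map.update(registry)   # registry entries are loaded first...
    types_map.update(defaults)   # ...so the standard mappings win on conflict
    return types_map


types_map = build_types_map(registry_types, default_types)
print(types_map[".jpg"])  # image/jpeg
print(types_map[".foo"])  # application/x-foo
```

Extensions known only to the registry are still picked up; only conflicting entries are resolved in favor of the defaults.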


Of course, IMHO, the best possible fix is to simply remove the reading of 
mimetypes from the registry. But that is up to whoever maintains this module. 

Duplicate (less comprehensive) tickets on this issue already in your tracker 
are: 9291, 10490, 104314

If the maintainer of this module is unable to fix these issues, let me know and 
I will submit a patch, either removing _winreg or fixing the issues 
individually.

--
components: Library (Lib)
messages: 122542
nosy: kovid
priority: normal
severity: normal
status: open
title: mimetypes reading from registry in windows completely broken
versions: Python 2.7

___
Python tracker 
<http://bugs.python.org/issue10551>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10551] mimetypes reading from registry in windows completely broken

2010-11-27 Thread Kovid Goyal

Kovid Goyal  added the comment:

And what about the third issue?

Allow me to elaborate:

mimetypes are a relatively standard set of mappings from well known file 
extensions to MIME descriptors. 

Reading mimetype mappings from the registry, a location that is writable by 
random programs the user may have installed on his machine, let alone malware, 
is a BAD idea.

It leads to situations like asking for the mimetype of file.jpg and getting 
image/pjpeg back. Or asking for the mimetype of file.png and getting image/x-png 
back.

If you still consider it good to read mimetypes from the registry, at the very 
least, they should be read before the standard mimetype mappings defined in 
mimetypes.py are applied. That way at least for that set of mappings, users of 
python can be assured of sane query results. 

As it stands now, mimetypes.py is useless, and to work around the problem I 
essentially had to define the mimetype mappings for all the mimetypes my 
program knows about by hand.
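
A rough sketch of what such a hand-maintained table could look like. This is not calibre's actual code; `KNOWN_TYPES` and the wrapper function are assumptions made for illustration.

```python
import mimetypes
import os

# Hypothetical application-level table; extend with the types the program needs.
KNOWN_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
}


def guess_type(filename):
    """Prefer the hand-maintained table, then fall back to the mimetypes module."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in KNOWN_TYPES:
        return KNOWN_TYPES[ext]
    return mimetypes.guess_type(filename)[0]


print(guess_type("file.jpg"))  # image/jpeg
print(guess_type("file.PNG"))  # image/png
```

For the extensions listed in the table, the result no longer depends on whatever the platform's type database contains.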

--
resolution: duplicate -> 
status: closed -> open




[issue10551] mimetypes read from the registry should not overwrite standard mime mappings

2010-11-27 Thread Kovid Goyal

Kovid Goyal  added the comment:

I apologize for the multiple issues in the ticket. To my mind they were all 
basically one issue, stemming from the decision to read mimetypes from the 
registry.

Since there are other tickets for the first two issues, I'll change the summary 
for this issue to reflect only the third.

--
title: mimetypes reading from registry in windows completely broken -> 
mimetypes read from the registry should not overwrite standard mime mappings




[issue10551] mimetypes read from the registry should not overwrite standard mime mappings

2010-11-30 Thread Kovid Goyal

Kovid Goyal  added the comment:

It is, of course, your decision, but IMO, since the mimetypes database in 
windows appears to be always broken, the default behavior of the mimetypes 
module in python 2.7 on windows is broken for most (all?) windows installs. For 
me personally, it doesn't matter anymore, as I have already fixed calibre, but 
it would be surprising/unexpected behavior for someone new to using 
mimetypes.py on windows. Certainly, my expectation (perhaps naively) was that 
guess_type('image.jpg') would always return 'image/jpeg'. 

Users on windows rarely (ever?) modify the registry to change mimetypes. The 
only thing that does change mimetypes is installed software, without the users' 
knowledge/consent. So treating the registry as a reliable store of mime 
information is not a good idea. 

On unix, the knownfiles are system files. I don't know about OS X, but on linux, 
since most software is installed by package managers, the package managers 
usually have policies that prevent application installs from clobbering system 
files. And of course, running userland applications don't have the necessary 
privileges to modify the files. 

Out of curiosity, what is the upside of reading mimetypes from the registry, 
given that its information cannot be trusted?

And you're most welcome, for calibre :)

--




[issue10551] mimetypes read from the registry should not overwrite standard mime mappings

2010-11-30 Thread Kovid Goyal

Kovid Goyal  added the comment:

I actually had in mind people that (like me) develop primarily on unix and 
assume that mimetypes works the same way on both windows and unix. Of course, 
the changed behavior is also a concern.

At the very least, I would encourage the addition of a warning to the 
documentation of the mimetypes module.

--
status: pending -> open




[issue38828] cookiejar.py broken in 3.8

2019-11-17 Thread Kovid Goyal


New submission from Kovid Goyal :

In python 3.8 cookiejar.py is full of code that compares cookie.version to 
integers, which raises an exception when cookie.version is None. For example, 
in set_ok_version() and set_ok_path(). Both the Cookie constructor and 
_cookie_from_cookie_tuple() explicitly assume version can be None, and setting 
version to None worked fine in previous python releases.
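
The underlying behavior is easy to see in isolation. The snippet below is a sketch of the general failure mode in Python 3, not a copy of any particular expression in cookiejar.py:

```python
version = None

try:
    version > 0  # ordering comparison between None and an int
except TypeError as err:
    outcome = "TypeError: %s" % err
else:
    outcome = "no error"

print(outcome)
```

On Python 3 the comparison raises `TypeError`, so any code path that orders `cookie.version` against an integer fails as soon as the version is `None`.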

--
components: Library (Lib)
messages: 356797
nosy: kovid
priority: normal
severity: normal
status: open
title: cookiejar.py broken in 3.8
type: crash
versions: Python 3.8

___
Python tracker 
<https://bugs.python.org/issue38828>
___



[issue38828] cookiejar.py broken in 3.8

2019-11-17 Thread Kovid Goyal


Kovid Goyal  added the comment:

The issue is obvious with a simple glance at the code. Either the Cookie 
constructor needs to change version = None to zero or some other integer, or the 
various methods in that module need to handle a None version. I don't personally 
care about this issue any more since I have worked around it in my code. Feel 
free to fix it or not, as you wish.
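
The two options could look roughly like this. Both helpers are hypothetical names invented for this sketch, and neither is taken from http.cookiejar; choosing 0 as the replacement value is likewise an assumption.

```python
def normalized_version(version):
    """Option 1: map a missing (None) version to an integer up front."""
    return 0 if version is None else version


def version_is_positive(version):
    """Option 2: make an individual `version > 0` check None-safe."""
    return version is not None and version > 0


print(normalized_version(None), normalized_version(1))
print(version_is_positive(None), version_is_positive(1))
```

The first option changes what callers see when they read the attribute back, while the second keeps `None` intact and only guards the comparisons.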

--




[issue38828] http.cookiejar handle cookie.version to be None

2019-11-17 Thread Kovid Goyal


Kovid Goyal  added the comment:

It's trivially true that it is a regression from python 2, since in python 2 
comparison to None is fine. Whether it ever worked in any python 3 version 
before 3.8, I'm not sure.

--




[issue38828] http.cookiejar handle cookie.version to be None

2019-11-17 Thread Kovid Goyal


Kovid Goyal  added the comment:

Here's a trivial script to reproduce:

from urllib.request import Request
from http.cookiejar import Cookie, CookieJar

jar = CookieJar()
jar.set_cookie(Cookie(
    None, 'test', 'test',
    None, False,
    '.test.com', True, False,
    '/', True,
    False, None, False, None, None, None
))
r = Request('http://www.test.com')
jar.add_cookie_header(r)

--




[issue16512] imghdr doesn't support jpegs with an ICC profile

2014-06-12 Thread Kovid Goyal

Kovid Goyal added the comment:

FYI, the test I currently use in calibre, which has not failed so far for 
millions of users:

def test_jpeg(h, f):
    if (h[6:10] in (b'JFIF', b'Exif')) or (
            h[:2] == b'\xff\xd8' and (b'JFIF' in h[:32] or b'8BIM' in h[:32])):
        return 'jpeg'
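
To see how the test behaves, here is the same logic restated under the name `detect_jpeg` and run against a few hand-built headers. The sample byte strings are synthetic and made up for this sketch; they are not taken from real files.

```python
def detect_jpeg(h, f=None):
    # Same condition as the test above: classic JFIF/Exif marker position,
    # or a JPEG start-of-image marker with JFIF or 8BIM in the first 32 bytes.
    if (h[6:10] in (b'JFIF', b'Exif')) or (
            h[:2] == b'\xff\xd8' and (b'JFIF' in h[:32] or b'8BIM' in h[:32])):
        return 'jpeg'


# Synthetic headers for illustration only.
jfif_header = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01'
photoshop_header = b'\xff\xd8\xff\xed\x00\x20Photoshop 3.0\x008BIM'
png_header = b'\x89PNG\r\n\x1a\n'

print(detect_jpeg(jfif_header))       # jpeg
print(detect_jpeg(photoshop_header))  # jpeg
print(detect_jpeg(png_header))        # None
```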

--

___
Python tracker 
<http://bugs.python.org/issue16512>
___



[issue16512] imghdr doesn't recognize variant jpeg formats

2014-06-21 Thread Kovid Goyal

Kovid Goyal added the comment:

You cannot assume the file-like object passed to imghdr is seekable. And IMO it 
is not the job of imghdr to check file validity, especially since it does not 
do that for all formats.
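
A small sketch of the constraint: with a stream that can only be read forward, header sniffing has to work from a single read of the first bytes. `OneWayStream` and `sniff_header` are names invented for this illustration.

```python
import io


class OneWayStream(io.RawIOBase):
    """A readable but non-seekable stream, similar to a pipe or socket."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, b):
        return self._buf.readinto(b)


def sniff_header(stream, size=32):
    """Read the first `size` bytes once; never seek or tell."""
    return stream.read(size)


stream = OneWayStream(b'\xff\xd8' + b'\x00' * 100)
header = sniff_header(stream)
print(len(header), header[:2] == b'\xff\xd8')

try:
    stream.seek(0)
except io.UnsupportedOperation:
    print("seek is not available on this stream")
```

Any check that needs to look at the end of the file, for example to validate a trailer, cannot be done on such a stream without buffering the whole thing.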

--




[issue16512] imghdr doesn't support jpegs with an ICC profile

2013-03-19 Thread Kovid Goyal

Kovid Goyal added the comment:

The attached patch is insufficient, for example, it fails on 
http://nationalpostnews.files.wordpress.com/2013/03/budget.jpeg?w=300&h=1571

Note that the linux file utility identifies a file as "JPEG Image data" if the 
first two bytes of the file are \xff\xd8.

A slightly stricter test that catches more jpeg files:

def test_jpeg(h, f):
    if (h[6:10] in (b'JFIF', b'Exif')) or (
            h[:2] == b'\xff\xd8' and b'JFIF' in h[:32]):
        return 'jpeg'

--
nosy: +kovid




[issue15500] Python should support naming threads

2014-11-06 Thread Kovid Goyal

Kovid Goyal added the comment:

Just FYI, a pure python2 implementation that monkey patches Thread.start() to 
set the OS level thread name intelligently.

import ctypes, ctypes.util, threading
libpthread_path = ctypes.util.find_library("pthread")
if libpthread_path:
    libpthread = ctypes.CDLL(libpthread_path)
    if hasattr(libpthread, "pthread_setname_np"):
        pthread_setname_np = libpthread.pthread_setname_np
        pthread_setname_np.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
        pthread_setname_np.restype = ctypes.c_int
        orig_start = threading.Thread.start
        def new_start(self):
            orig_start(self)
            try:
                name = self.name
                if not name or name.startswith('Thread-'):
                    name = self.__class__.__name__
                    if name == 'Thread':
                        name = self.name
                if name:
                    if isinstance(name, unicode):
                        name = name.encode('ascii', 'replace')
                    ident = getattr(self, "ident", None)
                    if ident is not None:
                        pthread_setname_np(ident, name[:15])
            except Exception:
                pass  # Don't care about failure to set name
        threading.Thread.start = new_start
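
The name-selection part of the patch can be tried on its own. The sketch below pulls it into a pure function for Python 3 (the function name `os_thread_name` is made up here); it does not call pthread at all. The 15-byte cut reflects the Linux limit of 16 bytes for a thread name including the terminating NUL.

```python
import threading


def os_thread_name(thread, limit=15):
    """Pick the OS-level name the way the monkey patch does, as bytes."""
    name = thread.name
    if not name or name.startswith('Thread-'):
        # Auto-generated name: prefer the subclass name if there is one.
        name = thread.__class__.__name__
        if name == 'Thread':
            name = thread.name
    return name.encode('ascii', 'replace')[:limit]


class Downloader(threading.Thread):
    pass


print(os_thread_name(Downloader()))
print(os_thread_name(threading.Thread(name='a-very-long-worker-name')))
```

A subclass with a default name is reported under its class name, and an explicitly chosen name is kept but truncated.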

--
nosy: +kovid

___
Python tracker 
<http://bugs.python.org/issue15500>
___



[issue25759] Python 2.7.11rc1 not building with Visual Studio 2015

2015-11-28 Thread Kovid Goyal

New submission from Kovid Goyal:

The Pcbuild/readme.txt file implies that it is possible to build python 
2.7.11rc1 with Visual Studio 2015 (although it is not officially supported). 
However, there are at least a couple of problems, that I have encountered so 
far:

1) timemodule.c uses timezone, tzname and daylight, which are no longer defined 
in Visual Studio. As a quick, hackish workaround, one can do
#if defined _MSC_VER && _MSC_VER >= 1900
#define timezone _timezone
#define tzname _tzname
#define daylight _daylight
#endif

2) More serious, the code in posixmodule.c to check if file descriptors are 
valid no longer links, since it relies on an internal structure from the Microsoft 
DLLs, __pioinfo, that no longer exists. See
https://bugs.python.org/issue23524 for discussion about this in the python 3.x 
branch.

As a quick and dirty fix one could just replace _PyVerify_fd with a stub 
implementation that does nothing for _MSC_VER >= 1900

However, a proper fix should probably be made.

--
components: Interpreter Core
messages: 20
nosy: kovidgoyal
priority: normal
severity: normal
status: open
title: Python 2.7.11rc1 not building with Visual Studio 2015
type: compile error
versions: Python 2.7

___
Python tracker 
<http://bugs.python.org/issue25759>
___



[issue25759] Python 2.7.11rc1 not building with Visual Studio 2015

2015-11-28 Thread Kovid Goyal

Kovid Goyal added the comment:

OK, I had hoped to avoid having to maintain my own fork of python 2 for a while 
longer, but, I guess not. 

Could you at least tell me if there are any other issues I should be aware of, 
to avoid me having to search through the python 3 sourcecode/commit history? 

I will be happy to make my work public so others can benefit from it as well.

--




[issue25759] Python 2.7.11rc1 not building with Visual Studio 2015

2015-11-28 Thread Kovid Goyal

Kovid Goyal added the comment:

I have it building with just two simple patches:

https://github.com/kovidgoyal/cpython/commit/fd1ceca4f21135f12ceb72f37d4ac5ea1576594d

https://github.com/kovidgoyal/cpython/commit/edb740218c04b38aa0f385188103100a972d608c

However, in developing the patches, I discovered what looks like a bug in the 
CRT close() function. If you double close a valid file descriptor it crashes, 
rather than calling the invalid parameter handler.

python -c "import os; os.close(2); os.close(2)"

crashes. This is true for python 2.7.10 built against VS 2008 as well. This 
contrasts with the behavior of double close() on other operating systems, where 
it sets errno to EBADF and does not crash.
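
For comparison, the expected (non-crashing) behavior can be sketched as follows. It uses a pipe rather than fd 2 so that nothing important gets closed. This shows what the text describes for other operating systems, not the Windows crash itself.

```python
import errno
import os

read_fd, write_fd = os.pipe()
os.close(read_fd)
os.close(write_fd)

try:
    os.close(write_fd)  # second close of the same descriptor
except OSError as err:
    second_close_errno = err.errno
else:
    second_close_errno = None

# On POSIX systems the second close is expected to fail with EBADF.
print(second_close_errno == errno.EBADF)
```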

I have not tested it with python 3.5, but I assume the bug is present there as 
well.

--
components:  -Build, Windows




[issue25759] Python 2.7.11rc1 not building with Visual Studio 2015

2015-11-29 Thread Kovid Goyal

Kovid Goyal added the comment:

I missed a few places in my initial patch, updated patch:
https://github.com/kovidgoyal/cpython/commit/a9ec814d466d3c0139d10b69666f88eed10e4940

Also fixed the code not clearing errno before calling CRT functions, while I 
was there. Whether or not you want to allow your fork to be compiled 
with VS 2015, I suggest you consider merging this patch, since 
clearing errno is the correct thing to do. You can always 
cherry-pick the errno clearing bits if you like :)

Just FYI, the code in my fork of 2.7 passes all tests on 64bit builds with VS 
2015, except for 5 small ones that I have yet to track down. (test_ctypes 
test_distutils test_gzip test_mailbox test_zipfile)

I don't anticipate any difficulty in fixing the remaining test failures. Famous 
last words ;)

--




[issue25759] Python 2.7.11rc1 not building with Visual Studio 2015

2015-11-29 Thread Kovid Goyal

Kovid Goyal added the comment:

Yes, I am aware. I embed python in my application, which includes large C++ 
libraries. Those libraries will soon require a modern compiler, which means 
I need python to also be compiled with a 
modern compiler. I already manually compile all python extensions in my build 
system, so that is not a problem. And before someone suggests I upgrade to 
python 3, porting half a million lines of python is simply not worth it for me. 

I'll be happy to open a separate bug report, but first I want some advice. I 
have got all the other tests passing as well, except one single test. 
test_gzip.test_many_append. 

The reason that test fails is apparently a buffering bug in the 
stdio C functions in VS 2015. Combining lots of seeks relative to SEEK_CUR 
causes read() to return incorrect data. I can make the test pass by modifying the 
gzip module to open files with buffering=0, or by putting in a seek(0, 0) to 
cause the stdio layer to flush its read buffer at the appropriate point. 
However, this is not an actual fix, just an inefficient workaround.
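
For readers who have not looked at the test, the pattern it exercises is roughly the following: a gzip file built up from many appended members, then read back in one go. This is a paraphrase written for Python 3, not the test's exact code, and the function name is made up.

```python
import gzip
import os
import tempfile


def many_append_roundtrip(appends=200):
    """Write one byte, append one byte `appends` times, then read it all back."""
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    try:
        with gzip.GzipFile(path, "wb", 9) as f:
            f.write(b"a")
        for _ in range(appends):
            # Each append adds a new gzip member to the same file.
            with gzip.GzipFile(path, "ab", 9) as f:
                f.write(b"a")
        with gzip.GzipFile(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


data = many_append_roundtrip()
print(data == b"a" * 201)
```

Reading a file with this many small members is what drives the long sequence of reads and relative seeks mentioned above.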

My question is, how do I properly work around this bug? And how come this bug is 
not triggered in Python 3.5.0? Am I diagnosing this correctly? Any other 
alternative explanations?

--




[issue25759] Python 2.7.11rc1 not building with Visual Studio 2015

2015-11-29 Thread Kovid Goyal

Kovid Goyal added the comment:

To answer part of my question, the reason the fseek()+fread() bug does not 
affect python 3.5.0 appears to be because it implements its own buffering and 
does not use fseek()/fread() at all. 

Sigh, I really hope the answer does not end up being that I have to 
re-implement fseek()/ftell()/fread()/fwrite() using lseek()/read()/write() on 
windows. Or I could wait and hope Microsoft fixes the bug :)

As a first step, to confirm that the bug is in the CRT, I'll have the gzip 
module record all reads/seeks/tells and then see if I can reproduce the bug in 
a plain C program.
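
One way to do that recording, sketched here for Python 3 with a made-up wrapper class (this is not the code actually used): wrap the file object and log every read, seek and tell so the sequence can be replayed elsewhere.

```python
import io


class RecordingFile:
    """Wrap a file object and keep a log of read/seek/tell calls."""

    def __init__(self, fileobj):
        self._f = fileobj
        self.calls = []

    def read(self, size=-1):
        data = self._f.read(size)
        self.calls.append(("read", size, len(data)))
        return data

    def seek(self, offset, whence=0):
        pos = self._f.seek(offset, whence)
        self.calls.append(("seek", offset, whence))
        return pos

    def tell(self):
        pos = self._f.tell()
        self.calls.append(("tell", pos))
        return pos

    def __getattr__(self, name):
        # Everything else (close, write, ...) goes straight to the real file.
        return getattr(self._f, name)


f = RecordingFile(io.BytesIO(b"0123456789"))
f.read(4)
f.seek(-2, io.SEEK_CUR)
f.tell()
chunk = f.read(3)
print(chunk)
print(f.calls)
```

The recorded list maps directly onto fread()/fseek()/ftell() calls with the same arguments, which is what a plain C reproduction would need.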

--




[issue25759] Python 2.7.11rc1 not building with Visual Studio 2015

2015-11-29 Thread Kovid Goyal

Kovid Goyal added the comment:

Doesn't seem like a bug in the CRT, I cannot reproduce in a plain CRT program, 
so now I get to try to figure out what is broken in fileobject.c by VS 2015. 
That's a relief :)

--




[issue25759] Python 2.7.11rc1 not building with Visual Studio 2015

2015-11-30 Thread Kovid Goyal

Kovid Goyal added the comment:

I take it back, my methodology in reproducing the function calls used by the 
gzip module was flawed. 

It does look like a bug in the CRT, but I have not been able to isolate a 
simple way of reproducing it. I have, however, found a workaround for it that 
has an acceptable performance impact. 

https://github.com/kovidgoyal/cpython/commit/72ae720ab057b1ac0402d67a7195d575d34afbbd

Now all tests pass (except for tcl/tk and distutils, neither of which I care 
about -- well I will probably need to fix up distutils at some point, but not 
now :). Running testsuite as

./PCbuild/amd64/python_d.exe Lib/test/regrtest.py -u 
network,cpu,subprocess,urlfetch

@steve: Thank you for all the work you did porting python 3.x to VS 2015, that 
certainly made my life a lot easier.

I would of course, be ecstatic if you were to consider merging my work into the 
python 2.7 branch, but if not, I understand -- no one likes to maintain a 
legacy codebase.

In any case, for interested third parties, my work is available here:

https://github.com/kovidgoyal/cpython (2.7 branch)

and instructions on building python on windows using a nice cygwin environment 
are here: 

https://github.com/kovidgoyal/calibre/blob/master/setup/installer/windows/notes2.rst

--




[issue25759] Python 2.7.11rc1 not building with Visual Studio 2015

2015-11-30 Thread Kovid Goyal

Kovid Goyal added the comment:

No worries, as I said, I understand, I would probably do the same, were I in 
your shoes. I have found that being a maintainer of a complex software project 
tends to naturally increase conservatism :)

--




[issue28591] imghdr doesn't recognize some jpeg formats

2016-11-02 Thread Kovid Goyal

Kovid Goyal added the comment:

FYI, the up-to-date version of imghdr I maintain is here:
https://github.com/kovidgoyal/calibre/blob/master/src/calibre/utils/imghdr.py

It uses memoryview for performance and can also read image sizes from file 
headers for jpeg, png, gif and jpeg2000. Note that it is only tested on python 
2.7.

I'm afraid I don't have the time to shepherd it through your review process, but 
feel free to take code from it if you want to. It is licensed GPLv3, but I am 
willing to re-license to another license if needed, as I am the sole 
contributor.
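
As an illustration of the kind of header parsing involved, here is a minimal sketch for PNG only. It is written from the PNG file layout (8-byte signature, then the IHDR chunk with big-endian width and height) and is not copied from the calibre module; `png_size` is a name made up for this example.

```python
import struct

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def png_size(header):
    """Return (width, height) from the first 24 bytes of a PNG, else None."""
    view = memoryview(header)  # slicing a memoryview avoids copying bytes
    if len(view) < 24 or view[:8] != PNG_SIGNATURE or view[12:16] != b'IHDR':
        return None
    # Width and height are two big-endian 32-bit integers at offset 16.
    return struct.unpack_from('>II', view, 16)


# Synthetic header: signature, IHDR chunk length, chunk type, width, height.
sample = PNG_SIGNATURE + struct.pack('>I4sII', 13, b'IHDR', 640, 480)
print(png_size(sample))
print(png_size(b'\xff\xd8\xff\xe0' + b'\x00' * 20))
```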

--

___
Python tracker 
<http://bugs.python.org/issue28591>
___