[issue46035] mimetypes.guess_type returns deprecated mimetype application/x-javascript

2021-12-10 Thread milahu


New submission from milahu :

deprecated mimetype?
per rfc4329, the technical term is "unregistered media type"

https://datatracker.ietf.org/doc/html/rfc4329#section-3

related

https://stackoverflow.com/a/9664327/10440128

https://github.com/danny0838/PyWebScrapBook/issues/53

quick fix

```py
# python/Lib/mimetypes.py

class MimeTypes:
    # ...
    def guess_type(self, url, strict=True):
        # ...

        if ext in _types_map_default:
            # prefer the python-internal values over /etc/mime.types
            return _types_map_default[ext], encoding

        if ext in types_map:
            return types_map[ext], encoding
```
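
until the library itself changes, a caller-side workaround is possible. the sketch below is not part of the proposed patch; it only uses the public `mimetypes.add_type` call to re-register the preferred type after the system files have been loaded. the file name `script.js` is just an example.

```python
import mimetypes

# Load the built-in table plus any system files such as /etc/mime.types.
mimetypes.init()

# Re-register the preferred type for ".js"; a later add_type call replaces
# whatever mapping was loaded before it.
mimetypes.add_type("application/javascript", ".js")

guessed, encoding = mimetypes.guess_type("script.js")
print(guessed)
```

this only affects the current process, and it has to run before the first lookup that should see the corrected value.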

why is `application/x-javascript` returned?

on linux, mimetypes.init() loads /etc/mime.types
source:
https://mirrors.kernel.org/gentoo/distfiles/mime-types-9.tar.bz2

/etc/mime.types is sorted alphabetically, so

```
cat /etc/mime.types | grep javascript
application/javascript    js
application/x-javascript  js
```

apparently, the last entry application/x-javascript
will overwrite the previous entry application/javascript
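
the overwrite can be reproduced without touching the real /etc/mime.types. this is a minimal sketch that feeds the two lines from above into a separate `mimetypes.MimeTypes` instance through `readfp`; the in-memory sample and the file name `script.js` are assumptions made for the demo.

```python
import io
import mimetypes

# In-memory stand-in for the two /etc/mime.types lines quoted above.
sample = io.StringIO(
    "application/javascript\tjs\n"
    "application/x-javascript\tjs\n"
)

# A private database, so the module-level tables are left alone.
db = mimetypes.MimeTypes()
db.readfp(sample)

# The entry read last wins for the ".js" extension.
guessed, encoding = db.guess_type("script.js")
print(guessed)  # application/x-javascript
```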

--
components: Library (Lib)
messages: 408197
nosy: milahu
priority: normal
severity: normal
status: open
title: mimetypes.guess_type returns deprecated mimetype application/x-javascript
type: behavior
versions: Python 3.11

___
Python tracker 
<https://bugs.python.org/issue46035>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46035] mimetypes.guess_type returns deprecated mimetype application/x-javascript

2021-12-10 Thread milahu


milahu  added the comment:

patch

https://github.com/milahu/cpython/commit/8a50633bb1b0c3e39fbe2cd467bb34a839ad068f

--




[issue46035] mimetypes.guess_type returns deprecated mimetype application/x-javascript

2022-01-18 Thread milahu


milahu  added the comment:

this issue is different from Issue32462
because here, both entries are valid

```
cat /etc/mime.types | grep javascript
application/javascript    js
application/x-javascript  js
```

but the alphabetical ordering of the file
makes the last entry take precedence

python could be smarter at parsing the /etc/mime.types file
in that it could give lower precedence to the deprecated types

pseudocode:

deprecated_mimetypes = set(...) # values from rfc4329
mimetype_of_ext = dict()
# parser loop
for ...
  ext = "..."
  mimetype = "..."
  if ext in mimetype_of_ext:
    old_mimetype = mimetype_of_ext[ext]
    if old_mimetype in deprecated_mimetypes:
      mimetype_of_ext[ext] = mimetype # replace old with new
      # assume that mimetype is not deprecated
  mimetype_of_ext[ext] = mimetype

--




[issue46035] mimetypes.guess_type returns deprecated mimetype application/x-javascript

2022-01-18 Thread milahu


milahu  added the comment:

edit:

-  mimetype_of_ext[ext] = mimetype
+  else:
+    # add new entry
+    mimetype_of_ext[ext] = mimetype
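
with the `else` branch in place, the pseudocode can be written as a small runnable sketch. this is not the code in `Lib/mimetypes.py`: the function name `parse_mime_types`, the one-element deprecated set, and the sample lines are assumptions for illustration, and a full deprecated list would have to be taken from rfc4329.

```python
# Assumption: illustrative set only; it holds just the type this issue is about.
DEPRECATED_MIMETYPES = {"application/x-javascript"}


def parse_mime_types(lines, deprecated=DEPRECATED_MIMETYPES):
    """Map extensions to types; a later entry replaces only a deprecated one."""
    mimetype_of_ext = {}
    for line in lines:
        # Drop trailing comments, then split into "type ext ext ...".
        words = line.split("#", 1)[0].split()
        if len(words) < 2:
            continue
        mimetype, exts = words[0], words[1:]
        for ext in exts:
            if ext in mimetype_of_ext:
                old_mimetype = mimetype_of_ext[ext]
                if old_mimetype in deprecated:
                    # replace old with new
                    mimetype_of_ext[ext] = mimetype
            else:
                # add new entry
                mimetype_of_ext[ext] = mimetype
    return mimetype_of_ext


sample = [
    "application/javascript    js",
    "application/x-javascript  js",
]
result = parse_mime_types(sample)
print(result)  # {'js': 'application/javascript'}
```

the result is the same if the two sample lines are swapped, because the deprecated entry is replaced when the other one is read.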

--




[issue46035] mimetypes.guess_type returns deprecated mimetype application/x-javascript

2022-01-18 Thread milahu


milahu  added the comment:

python-ideas thread
https://mail.python.org/archives/list/python-id...@python.org/thread/V53XGQPIY7ZAISMTQHPHKGWZNSN5EXQG/

--
