[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2022-02-12 Thread Éric Araujo
Éric Araujo added the comment: See also #46337 -- nosy: +eric.araujo versions: +Python 3.11 -Python 3.5

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2020-06-10 Thread Open Close
Open Close added the comment: I found another related issue (issue37969). I also filed one myself (issue 40938). --- One thing against the 'has_netloc' etc. solution is that while it guarantees round-trips (urlunsplit(urlsplit('...')) etc.), it is conditional on 'urlunsplit' getting 'SplitRe

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2018-08-04 Thread Martin Panter
Martin Panter added the comment: I like this option. I suppose choosing which option to take is a compromise between compatibility and simplicity. In the short term, the “allows_none” option requires user code to be updated. In the long term it may break compatibility. But the “has_netloc” etc

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2018-07-31 Thread Piotr Dobrogost
Change by Piotr Dobrogost: -- nosy: +piotr.dobrogost

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2018-07-31 Thread Chris Jerdonek
Chris Jerdonek added the comment: I just learned of this issue. Rather than adding has_netloc, etc. attributes, why not use None to distinguish missing values as is preferred above, but add a new boolean keyword argument to urlparse(), etc. to get the new behavior (e.g. "allow_none" to paral
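
The None-based idea in this comment can be sketched outside the standard library. The following is only an illustration under stated assumptions: `split_with_none`, `unsplit_with_none` and `SplitWithNone` are hypothetical names, not part of urllib.parse, and the parsing uses the regular expression given in RFC 3986, Appendix B rather than the urllib.parse implementation.

```python
import re
from typing import NamedTuple, Optional

# Regular expression from RFC 3986, Appendix B. Optional groups that do not
# take part in the match are reported as None by the re module.
_URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?",
    re.DOTALL,
)


class SplitWithNone(NamedTuple):
    """Hypothetical result type: None means 'component absent'."""
    scheme: Optional[str]
    netloc: Optional[str]
    path: str
    query: Optional[str]
    fragment: Optional[str]


def split_with_none(url: str) -> SplitWithNone:
    """Hypothetical helper: split a URL, keeping absent and empty apart."""
    m = _URI_RE.match(url)  # every part is optional, so this always matches
    return SplitWithNone(m.group(2), m.group(4), m.group(5),
                         m.group(7), m.group(9))


def unsplit_with_none(parts: SplitWithNone) -> str:
    """Hypothetical helper: emit a delimiter only if the component is present."""
    out = ""
    if parts.scheme is not None:
        out += parts.scheme + ":"
    if parts.netloc is not None:
        out += "//" + parts.netloc
    out += parts.path
    if parts.query is not None:
        out += "?" + parts.query
    if parts.fragment is not None:
        out += "#" + parts.fragment
    return out


for url in ("http://example.com/path#", "http://example.com/path", "path?#"):
    parts = split_with_none(url)
    print(parts, unsplit_with_none(parts) == url)
```

With this representation, round-tripping does not need extra flags, at the cost of callers having to handle None where they previously always received a string.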

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2015-08-02 Thread Robert Collins
Robert Collins added the comment: See also issue 6631 -- nosy: +rbcollins

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2015-05-31 Thread Berker Peksag
Changes by Berker Peksag: -- nosy: +berker.peksag

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2015-05-30 Thread Martin Panter
Martin Panter added the comment: Anyone want to review my new patch? This is a perennial issue; see all the duplicates I just linked. -- keywords: +needs review title: urllib.parse wrongly strips empty #fragment -> urllib.parse wrongly strips empty #fragment, ?query, //netloc

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-22 Thread Martin Panter
Martin Panter added the comment: Posting patch v2 with these changes: * Split out scheme documentation fixes to Issue 23684. * Renamed _NetlocResultMixinBase → _SplitParseBase * Explained the default values of the flags better, and what None means * Changed to Demian’s forward-looking “version c

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-16 Thread Demian Brecht
Demian Brecht added the comment: > I cannot imagine some existing code (other than an exploit) that would be > broken by restoring the empty “//” component; do you have an example? You're likely right about the usage (I can't think of a plausible use case at any rate). At first read of #23505

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-16 Thread Demian Brecht
Demian Brecht added the comment: > I avoided making them positional parameters, as they are not part of the > underlying tuple object. Ignore me, I was off my face and you're absolutely correct.

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-16 Thread Martin Panter
Martin Panter added the comment: Regarding unparsing of "evil.com", see Issue 23505, where the invalid behaviour is pointed out as a security issue. This was one of the bugs that motivated me to make this patch. I cannot imagine some existing code (other than an exploit) that would be brok

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-16 Thread Demian Brecht
Demian Brecht added the comment: urlsplit("evil.com").netloc > '' urlsplit("evil.com").has_netloc > True urlunsplit(urlsplit("evil.com")) # Adds “//” back > 'evil.com' RFC 3986, section 3.3: If a URI contains an authority component, then the path component

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-16 Thread Martin Panter
Martin Panter added the comment: ## Inferring flags ## The whole reason for the has_netloc etc flags is that I don’t think we can always infer their values, so we have to explicitly remember them. Consider the following two URLs, which I think should both have empty “netloc” strings for backw
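
Why the flags cannot always be inferred can be illustrated with a small sketch. The two URLs below are assumptions chosen for illustration and are not necessarily the pair the comment goes on to discuss: one contains an empty "//" authority and the other has none, yet both parse to an empty netloc string.

```python
from urllib.parse import urlsplit

# Illustrative URLs (assumptions): the first has no "//" at all,
# the second has "//" followed by an empty authority.
no_authority = urlsplit("file:/tmp/x")
empty_authority = urlsplit("file:///tmp/x")

# Both report netloc == '' and the same path, so the parsed tuples alone
# do not record whether "//" was present in the input.
print(repr(no_authority.netloc), repr(empty_authority.netloc))
print(no_authority.path, empty_authority.path)
print(no_authority == empty_authority)
```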

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-16 Thread Demian Brecht
Demian Brecht added the comment: I've done an initial pass in Rietveld and left some comments, mostly around docs. Here are some additional questions though: Given has_* flags can be inferred during instantiation of *Result classes, is there a reason to have them writable, meaning is there a r

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-12 Thread Demian Brecht
Changes by Demian Brecht: -- stage: -> patch review

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-12 Thread Martin Panter
Martin Panter added the comment: There have been a few recent bug reports (Issue 23505, Issue 23636) that may be solved by the has_netloc proposal. So I am posting a patch implementing it. The changes were a bit more involved than I anticipated, but should still be usable. I reused some of Sti

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-06 Thread Demian Brecht
Changes by Demian Brecht: -- nosy: +demian.brecht

[issue22852] urllib.parse wrongly strips empty #fragment

2015-02-08 Thread Martin Panter
Martin Panter added the comment: I also liked the idea of returning None to distinguish a missing URL component from an empty-but-present component, and it would make them more consistent with the “username” and “password” fields. But I agree it would break backwards compatibility too much. Th
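
The inconsistency mentioned here can be seen with a short sketch. The URLs are illustrative assumptions: the `username` and `password` attributes already use None for "absent" and '' for "present but empty", whereas a component such as `fragment` is '' in both cases.

```python
from urllib.parse import urlsplit

# Illustrative URLs (assumptions, not taken from the thread).
no_userinfo = urlsplit("http://example.com/")
empty_userinfo = urlsplit("http://@example.com/")
empty_password = urlsplit("http://user:@example.com/")

# username/password distinguish "absent" (None) from "empty" ('').
print(no_userinfo.username)            # no "@" in the netloc
print(repr(empty_userinfo.username))   # "@" present, nothing before it
print(repr(empty_password.password))   # ":" present, nothing after it

# The fragment attribute is '' when no "#" was present at all.
print(repr(no_userinfo.fragment))
```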

[issue22852] urllib.parse wrongly strips empty #fragment

2014-11-13 Thread Stian Soiland-Reyes
Stian Soiland-Reyes added the comment: I tried to make a patch for this, but I found it quite hard as the urllib/parse.py is fairly low-level, e.g. it is constantly encoding/decoding bytes and strings within each URI component. Basically the code assumes there are tuples of strings, with suppo

[issue22852] urllib.parse wrongly strips empty #fragment

2014-11-12 Thread Martin Panter
Changes by Martin Panter: -- nosy: +vadmium

[issue22852] urllib.parse wrongly strips empty #fragment

2014-11-12 Thread Georg Brandl
Changes by Georg Brandl: -- nosy: +orsenthil

[issue22852] urllib.parse wrongly strips empty #fragment

2014-11-12 Thread Stian Soiland-Reyes
New submission from Stian Soiland-Reyes: urllib.parse can't handle URIs with empty #fragments. The fragment is removed and not reconstituted. http://tools.ietf.org/html/rfc3986#section-3.5 permits empty fragment strings: URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
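
The reported behaviour can be reproduced with a short sketch. The URLs are illustrative assumptions rather than the ones from the report, and the sketch assumes urlsplit() is called with its long-standing default arguments: an empty fragment and an absent fragment then parse to the same result, so unsplitting has no way to restore the trailing "#".

```python
from urllib.parse import urlsplit, urlunsplit

# Illustrative URLs (assumptions, not from the original report).
with_empty_fragment = "http://example.com/path#"
without_fragment = "http://example.com/path"

a = urlsplit(with_empty_fragment)
b = urlsplit(without_fragment)

# Both results carry fragment == '' and therefore compare equal.
print(repr(a.fragment), repr(b.fragment))
print(a == b)

# Identical parsed tuples necessarily unsplit to the same string,
# so at most one of the two inputs can survive the round trip.
print(urlunsplit(a))
print(urlunsplit(b))
```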