[issue2986] difflib.SequenceMatcher not matching long sequences

2019-11-07 Thread Roundup Robot
Change by Roundup Robot: -- pull_requests: +16592 pull_request: https://github.com/python/cpython/pull/17082

[issue2986] difflib.SequenceMatcher not matching long sequences

2011-01-08 Thread Terry J. Reedy
Changes by Terry J. Reedy: -- stage: needs patch -> committed/rejected

[issue2986] difflib.SequenceMatcher not matching long sequences

2011-01-08 Thread Jesús Cea Avión
Changes by Jesús Cea Avión: -- nosy: +jcea

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-25 Thread Terry J. Reedy
Terry J. Reedy added the comment: Agreed. #10534. This is really a 'follow-on' rather than 'superseder', but the forward reference should be easy for anyone to find. -- resolution: -> fixed status: open -> closed superseder: -> difflib.SequenceMatcher: expose junk sets, deprecate und

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-24 Thread Eli Bendersky
Eli Bendersky added the comment: Terry, I agree with Simon re closing and opening a new feature request. This issue has too much baggage in it, and you can always link to it. A new feature request should be opened strictly for 3.2. If you want I can close this issue and open a new one, but I'm

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-24 Thread Simon Cross
Simon Cross added the comment: My vote is that this bug be closed and a new feature request be opened. Failing that, it would be good to have a concise description of what else we would like done (and the priority should be downgraded, I guess).

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-24 Thread Terry J. Reedy
Terry J. Reedy added the comment: Since I am not sure I will be able to do any more before the 3.2b1 feature freeze, I went ahead with the minimal patch after checking the differences from the 2.7 version and redoing the Misc/News entry. (I suspect putting a new entry immediately after the app

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-21 Thread Eli Bendersky
Eli Bendersky added the comment: Simon's patch fix for 3.2 looks good to me - applies cleanly to py3k and tests pass.

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-20 Thread Terry J. Reedy
Terry J. Reedy added the comment: Deadline is probably next Fri. However I will apply this or slight revision thereof in a couple of days to make sure this much is in. I have to fixup some work stuff today.

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-20 Thread Simon Cross
Simon Cross added the comment: I made the minor changes needed to get Eli Bendersky's patch to apply against 3.2. Diff attached. -- nosy: +hodgestar Added file: http://bugs.python.org/file19675/issue2986.fix32.5.patch

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-19 Thread Eli Bendersky
Eli Bendersky added the comment: Terry, when is the deadline for producing the patch for 3.2? Perhaps we should at least submit the 2.7 patch for now so that it goes in for sure?

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-12 Thread Terry J. Reedy
Terry J. Reedy added the comment: r86437 - correct and replicate version-added message

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-11 Thread Terry J. Reedy
Terry J. Reedy added the comment: issue2986.fix27.5.patch applied, with version note added to doc, as rev86418. Only thing left is patch for 3.2, which Eli and I will produce. -- stage: commit review -> needs patch versions: -Python 2.7

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-11 Thread Terry J. Reedy
Changes by Terry J. Reedy: -- Removed message: http://bugs.python.org/msg120925

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-11 Thread Terry J. Reedy
Terry J. Reedy added the comment: Tim told me to continue with this as he has no time. rev86401 - apply 3.1 doc fix. I cannot apply 2.7 patch. It has different header lines. In particular, TortoiseSVN cannot fetch nonexistent revision "Mon Aug 30 06:37:52 2010 +0300". Please regenerate against

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-11 Thread Eli Bendersky
Eli Bendersky added the comment: Attaching a new patch for 2.7 freshly generated vs. current 2.7 maintenance branch from SVN. -- Added file: http://bugs.python.org/file19569/issue2986.fix27.5.patch

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-11 Thread Terry J. Reedy
Terry J. Reedy added the comment: Tim told me to continue with this as he has no time. rev86401 - apply 3.1 doc fix -- assignee: tim_one -> terry.reedy

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-11 Thread Terry J. Reedy
Changes by Terry J. Reedy: -- versions: -Python 3.1

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-11-07 Thread Eli Bendersky
Eli Bendersky added the comment: Adding a documentation patch for 3.1 which is similar to the 2.6 documentation patch that's been committed by Georg into 2.6 -- Added file: http://bugs.python.org/file19538/issue2986.docs31.1.patch

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-09-07 Thread Terry J. Reedy
Terry J. Reedy added the comment: The patch changes the internal function that constructs the dict mapping b items to indexes to read as follows:
  create b2j mapping
  if isjunk function, move junk items to junk set
  if autojunk, move popular items to popular set
I helped write and test the
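
The three stages listed in this comment can be sketched roughly as below. This is an illustrative reconstruction, not the committed patch: the function name `build_b2j`, its return value, and the exact thresholds (sequences of at least 200 items, cutoff of `n // 100 + 1` occurrences) are assumptions based on how the heuristic is described elsewhere in this thread.

```python
def build_b2j(b, isjunk=None, autojunk=True):
    """Hypothetical helper: map each item of b to the indexes where it occurs,
    then set aside junk and 'popular' items as the comment describes."""
    # Stage 1: create the b2j mapping (item -> list of indexes in b).
    b2j = {}
    for i, elt in enumerate(b):
        b2j.setdefault(elt, []).append(i)

    # Stage 2: if an isjunk function was given, move junk items to a junk set.
    junk = set()
    if isjunk:
        junk = {elt for elt in b2j if isjunk(elt)}
        for elt in junk:
            del b2j[elt]

    # Stage 3: if autojunk is on, move popular items to a popular set.
    # Assumed thresholds: only for len(b) >= 200, and only items that occur
    # more than n // 100 + 1 times.
    popular = set()
    n = len(b)
    if autojunk and n >= 200:
        ntest = n // 100 + 1
        popular = {elt for elt, idxs in b2j.items() if len(idxs) > ntest}
        for elt in popular:
            del b2j[elt]

    return b2j, junk, popular
```

Keeping the three stages separate means junk and popular items end up in distinct sets, so a caller can tell user-declared junk apart from items dropped by the heuristic.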

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-09-02 Thread Eli Bendersky
Eli Bendersky added the comment: Attaching a patch (developed jointly with Terry Reedy) for 2.7 that adds an 'autojunk' parameter to SequenceMatcher's constructor. The parameter is True by default which retains the current behavior in 2.6 and earlier, but can be set by the user to False to di
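
A minimal sketch of how the new parameter might be used, assuming a Python version that includes this patch. The sample strings are made up for illustration: a long, highly repetitive run of digits, preceded by one differing character so that the two strings are not identical.

```python
from difflib import SequenceMatcher

# Made-up data: every digit occurs far more often than 1% of the time.
body = "0123456789" * 40
a = "x" + body
b = "y" + body

# Default behaviour: the popularity heuristic is active for long sequences.
with_heuristic = SequenceMatcher(None, a, b).ratio()

# Heuristic switched off through the parameter added by the patch.
without_heuristic = SequenceMatcher(None, a, b, autojunk=False).ratio()

print("autojunk=True :", with_heuristic)
print("autojunk=False:", without_heuristic)
```

With the heuristic on, every digit is treated as popular and the shared 400-character run goes unmatched; with `autojunk=False` the run is found again, at the cost of more work on repetitive input.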

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-09-01 Thread Terry J. Reedy
Terry J. Reedy added the comment: While refactoring the code for 2.7, I discovered that the description of the heuristic for 2.6 and in the code comments is off by 1. "items that appear more than 1% of the time" should actually be "items whose duplicates (after the first) appear more than 1%

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-08-02 Thread Barry A. Warsaw
Barry A. Warsaw added the comment: Georg committed this patch to the 2.6 tree, and besides, this doesn't seem like a blocking issue, so I'm kicking 2.6 off the list and knocking the priority down. -- priority: release blocker -> high versions: -Python 2.6

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-31 Thread Georg Brandl
Changes by Georg Brandl: -- priority: deferred blocker -> release blocker

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-31 Thread Georg Brandl
Georg Brandl added the comment: Committed 2.6 patch in r83314.

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-31 Thread Georg Brandl
Georg Brandl added the comment: Deferring to after 3.2a1. -- priority: release blocker -> deferred blocker

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-23 Thread Eli Bendersky
Eli Bendersky added the comment: Here's a patch for Doc/library/difflib.rst of the 2.6 branch, following Terry's suggested addition to the docs of the SequenceMatcher class. Tested 'make html'. -- keywords: +patch Added file: http://bugs.python.org/file18171/issue2986.docs26.1.patch

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-23 Thread Terry J. Reedy
Terry J. Reedy added the comment: For 2.6 and 3.1, this is a documentation only issue. For 2.7, this is a doc + behavior issue. For 3.2, this is a doc + behavior + new feature issue. For 2.6.6 (release candidate due Aug 2, 10 days), I propose to add the following paragraph after the current 'T

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-14 Thread Antoine Pitrou
Antoine Pitrou added the comment: On Wednesday 14 July 2010 at 01:45, Terry J. Reedy wrote:
> 2. Add a parameter that defaults to using the heuristic but allows turning it off. Perhaps better, but code that used the new API would crash if run on 2.7.0
Yes, but this is an except

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-13 Thread Terry J. Reedy
Terry J. Reedy added the comment: [copied from pydev post] Summary: adding an autojunk heuristic to difflib without also adding a way to turn it off was a bug because it disabled running code. 2.6 and 3.1 each have, most likely, one final version each. Don't fix for these but add something t

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-08 Thread Terry J. Reedy
Terry J. Reedy added the comment: My proposal F, to expose the common frequency threshold as a fourth positional parameter with default 1, would do that: repeat current behavior. We should, and Eli and I would, add some of the anomalous cases to the test suite and verify that the default is t

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-08 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> There is no problem with extending the API in 3.2. The debate there is over 2.7.
We could extend the API as long as it stays backwards-compatible (that is, the default value for the new argument produces the same behaviour as before).

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-08 Thread Terry J. Reedy
Terry J. Reedy added the comment: Anyone can post on Python-dev, but non-developers should do so judiciously and with respect for the purpose of the list. It is also polite to introduce oneself with the first post. In any case, Tim Peters has approved making some change. The remaining questio

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-07 Thread Terry J. Reedy
Changes by Terry J. Reedy:

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-07 Thread Vlastimil Brom
Vlastimil Brom added the comment: I guess, I am not supposed to post to python-dev - not being a python developer, hopefully it is appropriate to add a comment here - only based on my current usage of (a modified) difflib.SequenceMatcher. It seems, the mentions of text comparison in that threa

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-06 Thread Eli Bendersky
Eli Bendersky added the comment: I apologize for the previous message. It was created by mistake - by replying to Terry's mail which came from the bugtracker. I wish I knew how to remove it from here - is this possible and I'm missing the relevant privileges?

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-06 Thread Eli Bendersky
Changes by Eli Bendersky: Removed file: http://bugs.python.org/file17891/unnamed

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-06 Thread Eli Bendersky
Eli Bendersky added the comment: Thanks! Now let's see what the other devs say. The first response seems not to have understood what you meant completely :-)
Eli
On Wed, Jul 7, 2010 at 01:18, Terry J. Reedy wrote:
> Terry J. Reedy added the comment:
> [Also posted to pydev for additiona

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-06 Thread Terry J. Reedy
Terry J. Reedy added the comment: [Also posted to pydev for additional input, with Subject line Issue 2986: difflib.SequenceMatcher is partly broken Developed with input from Eli Bendersky, who will write patchfile(s) for whichever change option is chosen.] Summary: difflib.SequenceMatcher was

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-07-02 Thread Eli Bendersky
Eli Bendersky added the comment: The new "junk heuristic" was added to difflib.py in SVN revision 26661 in 2002 (which is, incidentally, the last revision to modify difflib.py). Its commit log says: - Mostly in SequenceMatcher.{__chain_b, find_

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-06-28 Thread Terry J. Reedy
Terry J. Reedy added the comment: The discussion on #152807 references two other closed tracker issues:
#1678339 Test case that currently fails
#1678345 Patch to change behavior - rejected because crippled behavior is supposedly intentional and removing the change would slow things down.
The p

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-06-25 Thread Terry J. Reedy
Terry J. Reedy added the comment: This appears to be one of at least three duplicate issues: #1528074, #2986, and #4622. I am closing two, leaving 2986 open, and merging the nearly disjoint nosy lists. (If no longer interested, you can delete yourself from 2986.) #1711800 appears to be slight

[issue2986] difflib.SequenceMatcher not matching long sequences

2010-04-19 Thread Vlastimil Brom
Vlastimil Brom added the comment: I just stumbled on some seemingly different unexpected behaviour of difflib.SequenceMatcher, but it turns out, it may have the same cause, i.e. the "popular" heuristics. I hopefully managed to replicate it on an illustrative sample text - in as included in the

[issue2986] difflib.SequenceMatcher not matching long sequences

2009-10-02 Thread Antoine Pitrou
Antoine Pitrou added the comment: The popularity heuristic could be tuned to depend on the number N of distinct elements in the sequence, and kick in if an element appears, say, more than 1/(N**0.5) of the time. -- nosy: +pitrou
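
The suggestion above could be sketched as follows. This is only an illustration of the proposed rule, not code from any patch on this issue; the helper name `popular_by_sqrt` is hypothetical, and measuring "of the time" as the fraction of positions an element occupies is an assumption.

```python
from collections import Counter
from math import sqrt


def popular_by_sqrt(seq):
    """Hypothetical helper: return the elements that occupy more than
    1/sqrt(N) of seq, where N is the number of distinct elements."""
    counts = Counter(seq)
    if not counts:
        return set()
    threshold = 1 / sqrt(len(counts))  # N = number of distinct elements
    total = len(seq)
    return {elt for elt, c in counts.items() if c / total > threshold}


# Ten digits, each 10% of the sequence: below 1/sqrt(10), so none is popular.
print(popular_by_sqrt("0123456789" * 40))

# One dominant element among eleven distinct ones: only 'a' is popular.
print(popular_by_sqrt("a" * 90 + "bcdefghijk"))
```

Under this rule a sequence over a small alphabet, such as a long string of digits, would no longer have all of its elements discarded, while a single overwhelmingly frequent element would still be caught.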

[issue2986] difflib.SequenceMatcher not matching long sequences

2009-10-01 Thread Geoffrey Bache
Changes by Geoffrey Bache: -- nosy: +gjb1002

[issue2986] difflib.SequenceMatcher not matching long sequences

2009-05-27 Thread R. David Murray
Changes by R. David Murray: -- components: +Documentation, Library (Lib) -Extension Modules priority: -> normal stage: -> test needed type: -> feature request versions: +Python 3.2 -Python 2.5

[issue2986] difflib.SequenceMatcher not matching long sequences

2009-03-29 Thread R. David Murray
R. David Murray added the comment: On Mon, 30 Mar 2009 at 00:40, Mike Rotondo wrote:
> This seems to mean that you won't actually get an accurate diff in certain cases, which seems odd. At the very least, this behavior should probably be documented. Do people think it should be changed to ge

[issue2986] difflib.SequenceMatcher not matching long sequences

2009-03-29 Thread Mike Rotondo
Mike Rotondo added the comment: From the source, it seems that there is undocumented behavior in SequenceMatcher which is causing this error. If b is longer than 200 characters, it will consider any element x in b that takes up more than 1% of its contents as "popular", and thus junk. So, in
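
The length dependence described above can be observed with a small experiment. The strings here are invented for illustration (they are not the submitter's data), and the helper name `ratio_for` is hypothetical; only the default `SequenceMatcher` constructor is used.

```python
from difflib import SequenceMatcher


def ratio_for(repeats):
    """Hypothetical helper: compare two strings that share a run of digits
    and differ only in their first character."""
    body = "0123456789" * repeats
    return SequenceMatcher(None, "x" + body, "y" + body).ratio()


# 101 characters: under the 200-element limit, so the shared run is matched.
short_ratio = ratio_for(10)

# 401 characters: each digit now counts as "popular" and is ignored.
long_ratio = ratio_for(40)

print("short:", short_ratio)
print("long: ", long_ratio)
```

The two inputs have the same structure and differ only in length, which isolates the 200-element limit as the cause of the missing match.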

[issue2986] difflib.SequenceMatcher not matching long sequences

2009-03-29 Thread Georg Brandl
Georg Brandl added the comment: Tim, I think you've had some enlightening comments about difflib issues in the past. -- assignee: -> tim_one nosy: +georg.brandl, tim_one

[issue2986] difflib.SequenceMatcher not matching long sequences

2008-05-27 Thread Nate
New submission from Nate: The following code shows no matches though the strings clearly match.
from difflib import *
a = '''39043203381559556628573221727792187279924711093861125152794523529732793117520068565885125032447020125028126531603069277213510312502702798781521250210