Re: [Python-Dev] Caching float(0.0)
On 3 Oct 2006, at 17:47, James Y Knight wrote:

> On Oct 3, 2006, at 8:30 AM, Martin v. Löwis wrote:
>> As Michael Hudson observed, this is difficult to implement, though:
>> You can't distinguish between -0.0 and +0.0 easily, yet you should.
>
> Of course you can. It's absolutely trivial. The only part that's even
> *the least bit* sketchy in this is assuming that a double is 64 bits.
> Practically speaking, that is true on all architectures I know of,

How about doing 1.0 / x, where x is the number you want to test? On systems with sane semantics, it should result in an infinity, the sign of which should depend on the sign of the zero. While I'm sure there are any number of places where it will break, on those platforms it seems to me that you're unlikely to care about the difference between +0.0 and -0.0 anyway, since it's hard to otherwise distinguish them.

e.g.

    double value_to_test;

    ...

    if (value_to_test == 0.0) {
      double my_inf = 1.0 / value_to_test;

      if (my_inf < 0.0) {
        /* We have a -ve zero */
      } else if (my_inf > 0.0) {
        /* We have a +ve zero */
      } else {
        /* This platform might not support infinities (though we might get a
           signal or something rather than getting here in that case...) */
      }
    }

(I should add that presently I've only tried it on a PowerPC, because it's late and that's what's in front of me. It seems to work OK here.)

Kind regards,

Alastair

--
http://alastairs-place.net
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
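For readers who want to try the idea from Python rather than C, here is a minimal sketch, assuming a current Python 3. Python raises ZeroDivisionError for 1.0 / 0.0 instead of returning an infinity, so the division test itself cannot be written directly; this sketch swaps in math.copysign, which reads the sign of the zero without dividing. The helper name zero_sign is hypothetical and does not come from the thread.

```python
import math

def zero_sign(x):
    """Return -1 for a negative zero, +1 for a positive zero, None if x is not zero."""
    if x != 0.0:
        return None
    # copysign carries the sign of x (including the sign of a zero) onto 1.0,
    # so no division, and therefore no trap or exception, is involved.
    return -1 if math.copysign(1.0, x) < 0.0 else 1

print(zero_sign(0.0))    # 1
print(zero_sign(-0.0))   # -1
print(zero_sign(2.5))    # None
```

The two zeros still compare equal with ==, which is why a cache keyed on equality alone would merge them.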
Re: [Python-Dev] Caching float(0.0)
On 4 Oct 2006, at 06:34, Martin v. Löwis wrote:

> Alastair Houghton wrote:
>> On 3 Oct 2006, at 17:47, James Y Knight wrote:
>>
>>> On Oct 3, 2006, at 8:30 AM, Martin v. Löwis wrote:
>>>> As Michael Hudson observed, this is difficult to implement, though:
>>>> You can't distinguish between -0.0 and +0.0 easily, yet you should.
>>>
>>> Of course you can. It's absolutely trivial. The only part that's even
>>> *the least bit* sketchy in this is assuming that a double is 64 bits.
>>> Practically speaking, that is true on all architectures I know of,
>>
>> How about doing 1.0 / x, where x is the number you want to test?
>
> This is a bad idea. It may cause a trap, leading to program
> termination.

AFAIK few systems have floating point traps enabled by default (in fact, isn't that what IEEE 754 specifies?), because they often aren't very useful. And in the specific case of the Python interpreter, why would you ever want them turned on? Surely in order to get consistent floating point semantics, they need to be *off* and Python needs to handle any exceptional cases itself; even if they're on, by your argument Python must do that to avoid being terminated.

(Not to mention the problem that floating point traps are typically delivered by a signal, the problems with which were discussed extensively in a recent thread on this list.)

And it does have two advantages over the other methods proposed:

1. You don't have to write the value to memory; this test will work entirely in the machine's floating point registers.

2. It doesn't rely on the machine using IEEE floating point. (Of course, neither does the binary comparison method, but it still involves a trip to memory, and assumes that the machine doesn't have multiple representations for +0.0 or -0.0.)

Even if you're saying that there's a significant chance of a trap (which I don't believe, not on common platforms anyway), the configure script could test to see if this will happen and fall back to one of the other approaches, or see if it can't turn them off using the C99 APIs. (I think I'd agree with you that handling SIGFPE is undesirable, which is perhaps what you were driving at.)

Anyway, it's only an idea, and I thought I'd point it out as nobody else had yet. If 0.0 is going to be cached, then I certainly think -0.0 and +0.0 should be two separate values if they exist on a given machine. I'm less concerned about exactly how that comes about.

Kind regards,

Alastair.
Re: [Python-Dev] Caching float(0.0)
On 4 Oct 2006, at 02:38, Josiah Carlson wrote:

> Alastair Houghton <[EMAIL PROTECTED]> wrote:
>
> There is, of course, the option of examining their representations in
> memory (I described the general technique in another posting on this
> thread). From what I understand of IEEE 754 FP doubles, -0.0 and +0.0
> have different representations, and if we look at the underlying
> representation (perhaps by a "*((uint64*)(&float_input))"), we can
> easily distinguish all values we want to cache...

Yes, though a trip via memory isn't necessarily cheap, and you're also assuming that the machine doesn't use an FP representation with multiple +0s or -0s. Perhaps they should be different anyway though, I suppose.

> And as I stated before, we can switch on those values. Alternatively,
> if we can't switch on the 64 bit values directly...
>
>     uint32* p = (uint32*)(&double_input);
>     if (!p[0]) { /* p[1] on big-endian platforms */
>         switch (p[1]) { /* p[0] on big-endian platforms */
>             ...
>         }
>     }

That's worse, IMHO, because it assumes more about the representation. If you're going to look directly at the binary, I think all you can reasonably do is a straight binary comparison. I don't think you should poke at the bits without first knowing that the platform uses IEEE floating point.

The reason I suggested 1.0/x is that it's one of the few ways (maybe the only way?) to distinguish -0.0 and +0.0 using arithmetic, which is what people that care about the difference between the two are going to care about.

Kind regards,

Alastair.
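The "straight binary comparison" discussed in this message can be sketched in Python with the struct module. This is an illustration only, not code from the thread: it relies on struct's standard ">d" format, which encodes a float as 8 bytes of IEEE 754 binary64, and same_bits is a hypothetical helper name.

```python
import struct

def same_bits(a, b):
    """Compare two floats by their 8-byte big-endian binary64 encoding."""
    return struct.pack(">d", a) == struct.pack(">d", b)

print(0.0 == -0.0)                      # True: arithmetic equality merges the zeros
print(same_bits(0.0, -0.0))             # False: the encodings differ in the sign bit
print(struct.pack(">d", -0.0).hex())    # 8000000000000000
```

Comparing whole encodings avoids picking individual words out of the double, which is the endianness-dependent step in the quoted uint32 version.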
Re: [Python-Dev] Caching float(0.0)
On Oct 4, 2006, at 8:14 PM, Martin v. Löwis wrote:

> If it breaks a few systems, that already is some systems too many.
> Python should never crash; and we have no control over the floating
> point exception handling in any portable manner.

You're quite right, though there is already plenty of platform dependent code in Python for just that purpose (see fpectlmodule.c, for instance).

Anyway, all I originally wanted was to point out that using division was one possible way to tell the difference that didn't involve relying on the representation being IEEE compliant. It's true that there are problems with FP exceptions.

Kind regards,

Alastair.
Re: [Python-Dev] Security Advisory for unicode repr() bug?
On Oct 7, 2006, at 3:36 PM, M.-A. Lemburg wrote:

> Georg Brandl wrote:
>> [EMAIL PROTECTED] wrote:
>>> I don't know if Apple has picked up on it (or if the version they
>>> currently distribute is affected - 2.3.5 built Oct 5 2005).
>
> Note that the bug refers to a UCS4 Python build. Most Linux
> distros ship UCS4 builds nowadays, so they care. The Windows
> builds are UCS2 (except maybe the ones for Win64 - don't know)
> which doesn't seem to be affected.

AFAIK the version Apple ship is a UCS2 build, therefore not affected.

Kind regards,

Alastair.
Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects
On 5 Dec 2006, at 09:02, Ben Wing wrote:

> Fredrik Lundh wrote:
>> Ka-Ping Yee wrote:
>> taking everything into account, I think we should simply map
>> __getitem__ to group, and stop there. no len(), no slicing, no
>> sequence or mapping semantics. if people want full sequence
>> behaviour with len and slicing and iterators and whatnot, they can
>> do list(m) first.
>>
> i'm ok either way -- that is, either with the proposal i previously
> published, or with this restricted idea.

I prefer your previous version. It matches my expectations as a user of regular expression matching and as someone with experience of other regexp implementations. (The current groups() method *doesn't* match those expectations, incidentally. I know I've been tripped up in the past because it didn't include the full match as element 0.)

Basically, I don't see the advantage in the restrictions Fredrik is proposing (other than possibly being simpler to implement, though not actually all that much, I think). Yes, it's a little unusual in that you'd be able to index the match "array" with either integer indices or using names, but I don't view that as a problem, and I don't see how not supporting len() or other list features like slicing and iterators helps.

What's more, I think it will be confusing for Python newbies because they'll see someone doing

    m[3]

and assume that m is a list-like object, then complain when things like

    for match in m:
      print match

or

    m[3:4]

fail to do what they expect.

Yes, you might say "it's a match object, not a list". But, it seems to me, that's really in the same vein as "don't type quit or exit, press Ctrl-D".

Kind regards,

Alastair.
Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects
On 5 Dec 2006, at 15:51, Fredrik Lundh wrote:

> Alastair Houghton wrote:
>
>> What's more, I think it will be confusing for Python newbies because
>> they'll see someone doing
>>
>>     m[3]
>>
>> and assume that m is a list-like object, then complain when things
>> like
>>
>>     for match in m:
>>       print match
>
> that'll work, of course, which might be confusing for people who think
> they understand how for-in works but don't ;)

Or (as in my case) guessed at how it works because they can't be bothered to check the code and can't remember from the last time they looked.

I don't spend a great deal of time in the guts of Python. But I do use it and have a couple of extensions that I've written for it (one of which I was contemplating releasing publicly and that is impacted by this change---it provides, amongst other things, an alternate implementation of the "re" API, so I'm going to want to implement this too).

>> or
>>
>>     m[3:4]
>>
>> fail to do what they expect.
>
> the problem with slicing is that people may 1) expect a slice to
> return a new object *of the same type*

What I would have expected is that it supported a similar set of sequence methods---that is, that it returned something with a similar signature. I don't see why code would care about it being the exact same type.

Anyway, clearly what people will expect here (talking about the match object API) is that m[3:4] would give them a list (or some equivalent sequence object) containing groups 3 and 4. Why do you think someone would expect a match object?

> 2) expect things like [::-1] to work, which opens up another can of
> worms.

As long as they aren't expecting it to return the same type of object, is there a can of worms here?

> I prefer the "If the implementation is easy to explain, it may be a
> good idea." design principle over "can of worms" design principle.

As someone who is primarily a *user* of Python, I prefer the idea that sequence objects should operate consistently to the idea that there might be some that don't. By which I mean that anything that supports indexing using integer values should ideally support slicing (including things like [::-1]).

Kind regards,

Alastair.
Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects
On 6 Dec 2006, at 20:29, Josiah Carlson wrote:

> The problem is that either we return a list (easy), or we return
> something that is basically another match object (not quite so easy).
> Either way, we would be confusing one set of users or another. By not
> including slicing functionality by default, we sidestep the confusion.

But I don't believe that *anyone* will find it confusing that it returns a list. It's much more likely to be confusing to people that they have to write

    list(m)[x:y]

or

    [m[i] for i in xrange(x,y)]

when m[x] and m[y] work just fine.

>> As someone who is primarily a *user* of Python, I prefer the idea
>> that sequence objects should operate consistently to the idea that
>> there might be some that don't. By which I mean that anything that
>> supports indexing using integer values should ideally support slicing
>> (including things like [::-1]).
>
> You are being inconsistent. You want list, tuple, etc. to be
> consistent, but you don't want match objects to be consistent. Sorry,
> but that is silly. Better to not support slices than to confuse the
> hell out of people by returning a tuple or list from a match slicing.

That's not true *and* I object to your characterisation of the idea as "silly". What I'm saying is that the idea of slicing always returning the same exact type of object is pointless consistency, because nobody will care *provided* the thing that is returned supports a sensible set of operations given the original type.

Look, I give in. There's no point trying to convince any of you further, and I don't have the time or energy to press the point. Implement it as you will. If necessary it can be an extension of my "re" replacement that slicing is supported on match objects.

Kind regards,

Alastair.
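To make the "slice returns a plain sequence" position concrete, here is a minimal sketch written for a current Python 3. It assumes only the documented group() and groups() methods of match objects; group_slice is a hypothetical helper name, not an API proposed in the thread. It builds a tuple with the whole match at index 0 and slices that.

```python
import re

def group_slice(m, start=None, stop=None, step=None):
    """Return the groups of m selected by a slice, as a plain tuple.

    Index 0 is the whole match, so indices line up with m.group(i).
    """
    all_groups = (m.group(0),) + m.groups()
    return all_groups[start:stop:step]

m = re.match(r"(A)(B)(C)(D)(E)", "ABCDE")
print(group_slice(m, 2, 5))             # ('B', 'C', 'D')
print(group_slice(m, 3, 4))             # ('C',)
print(group_slice(m, None, None, -1))   # ('E', 'D', 'C', 'B', 'A', 'ABCDE')
```

Because the result is an ordinary tuple, a negative step needs no special treatment, and none of the match object methods have to be given a meaning on a slice.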
Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects
On 7 Dec 2006, at 00:39, Mike Klaas wrote:

> Keep in mind when implementing that m[3:4] should contain only the
> element at index 3, not both 3 and 4, as you've seemed to imply twice.

Yes, you're quite right. I was writing off the top of my head and I'm still a relative newbie to Python coding.

Kind regards,

Alastair.
Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects
On 7 Dec 2006, at 01:01, Josiah Carlson wrote:

> *We* may not be confused, but it's not about us (I'm personally happy
> to use the .group() interface); it's about relative newbies who,
> generally speaking, desire/need consistency (see [1] for a paper
> showing that certain kinds of inconsistencies are bad - at least in
> terms of grading - for new computer science students). Being
> inconsistent because it's *easy*, is what I consider silly. We've got
> the brains, we've got the time, if we want slicing, let's produce a
> match object.

Oh, it isn't that I don't want to produce a match object; I think you've mistaken my intention in that respect. I'd be equally happy for it to be a match object, *but*...

> If we don't want slicing, or if producing a slice would produce a
> semantically questionable state, then let's not do it.

...if you return match objects from slicing, you have problems like m[::-1].groups(). *I* don't know what that should return.

What led me to think that a tuple or list would be appropriate is the idea that slicing was a useful operation and that I felt it was unlikely that anyone would want to call the match object methods on a slice, coupled with the fact that slices clearly have problems with some of the match object methods. A match object, plus sequence functionality, minus match object methods, is basically just a sequence.

If you're worried about types, you could do something like this:

              generic match object
                       |
          +------------+------------+
          |                         |
   real match object       match object slice

where the "generic match object" perhaps doesn't have all the methods that a "real match object" would have. (In the extreme case, generic match object might basically just be a sequence type.)

Then slicing something that was a "generic match object" always gives you a "generic match object", but it might not support all the methods that the original match object supported.

> Half-assing it is a waste.

Sure. We're agreed there :-)

>> Look, I give in. There's no point trying to convince any of you
>> further, and I don't have the time or energy to press the point.
>> Implement it as you will. If necessary it can be an extension of my
>> "re" replacement that slicing is supported on match objects.
>
> I'm sorry to see you give up so easily. One thing to realize/remember
> is that basically everyone who frequents python-dev has their own
> "make life easier" function/class library for those things that have
> been rejected for general inclusion in Python.

It's just that I'm tired and have lots of other things that need doing as well. Maybe I do have a bit more time to talk about it, we'll see.

Kind regards,

Alastair.
Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects
On 7 Dec 2006, at 02:01, Josiah Carlson wrote:

> Alastair Houghton <[EMAIL PROTECTED]> wrote:
>> On 7 Dec 2006, at 01:01, Josiah Carlson wrote:
>>> If we don't want slicing, or if producing a slice would produce a
>>> semantically questionable state, then let's not do it.
>>
>> ...if you return match objects from slicing, you have problems like
>> m[::-1].groups(). *I* don't know what that should return.
>
> I would argue that any 'step' != 1 has no semantically correct result
> for slicing on a match object, so we shouldn't support it.

OK, but even then, if you're returning a match object, how about the following:

    >>> m = re.match('(A)(B)(C)(D)(E)', 'ABCDE')
    >>> print m[0]
    ABCDE
    >>> n = m[2:5]
    >>> print list(n)
    ['B', 'C', 'D']
    >>> print n[0]
    B
    >>> print n.group(0)
    B

The problem I have with it is that it's violating the invariant that match objects should return the whole match in group(0).

It's these kinds of things that make me think that slices shouldn't have all of the methods of a match object. I think that's probably why various others have suggested not supporting slicing, but I don't think it's necessary to avoid it as long as it has clearly specified behaviour.

>> If you're worried about types, you could do something like this:
>>
>>               generic match object
>>                        |
>>           +------------+------------+
>>           |                         |
>>    real match object       match object slice
>
> I believe the above is unnecessary. Slicing a match could produce
> another match. It's all internal data semantics.

Sure. My point, though, was that you could view (from an external perspective) all results as instances of "generic match object", which might not have as many methods.

Interestingly, at present, the match object type itself is an implementation detail; e.g. for SRE, it's an _sre.SRE_Match object. It's only the API that's documented, not the type.

Kind regards,

Alastair.
Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects
On 7 Dec 2006, at 07:15, Fredrik Lundh wrote:

> Michael Urman wrote:
>
>> The idea that slicing a match object should produce a match object
>> sounds like a foolish consistency to me.
>
> well, the idea that adding m[x] as a convenience alias for m.group(x)
> automatically turns m into a list-style sequence that also has to
> support full slicing sounds like an utterly foolish consistency to me.

How about we remove the word "foolish" from the debate?

> the OP's original idea was to make a common use case slightly easier
> to use. if anyone wants to argue for other additions to the match
> object API, they should at least come up with use cases based on real
> existing code.

An example where it might be useful:

    m = re.match('(?:([0-9]+) ([0-9]+) ([0-9]+) ([0-9]+) (?P<rect>rect)'
                 '|([0-9]+) ([0-9]+) ([0-9]+) (?P<circle>circle))',
                 lineFromFile)

    if m['rect']:
      drawRectangle(m[1:5])
    elif m['circle']:
      drawCircle(m[1:3], m[3])

Is that really so outlandish? I'm not saying that this is necessarily the best way, but why force people to write

    list(m)[1:5]

or

    [m[i] for i in xrange(1,5)]

??

If the only reason is that some of the match object APIs, which I maintain are very unlikely to be wanted on a slice anyway, can't possibly produce consistent results, then why not just do away with the APIs and return a tuple or something instead? That way you can treat the match object as if it were just a tuple (which it could easily have been).

> (and while you guys are waiting, I suggest you start a new thread
> where you discuss some other inconsistency that would be easy to solve
> with more code in the interpreter, like why "-", "/", and "**" doesn't
> work for strings, lists don't have a "copy" method, sets and lists
> have different API:s for adding things, we have hex() and oct() but no
> bin(), str.translate and unicode.translate take different arguments,
> etc. get to work!)

Oh come on! Comparing this with exponentiating strings is just not helpful.

Kind regards,

Alastair.
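A runnable variant of the rectangle/circle use case, offered as a sketch rather than as the author's code. It targets a current Python 3 and assumes the named groups are spelled (?P<rect>rect) and (?P<circle>circle). The names SHAPE and parse_shape are hypothetical, and the drawing calls are replaced by returned tuples so the example is self-contained. One detail worth seeing in practice: numbered groups keep counting across the "|", so in this pattern the circle's numbers are groups 6 to 8.

```python
import re

SHAPE = re.compile(
    r"(?:([0-9]+) ([0-9]+) ([0-9]+) ([0-9]+) (?P<rect>rect)"
    r"|([0-9]+) ([0-9]+) ([0-9]+) (?P<circle>circle))"
)

def parse_shape(line):
    """Return ('rect', four ints), ('circle', two ints, int), or None."""
    m = SHAPE.match(line)
    if m is None:
        return None
    # Tuple of all groups with the whole match at index 0, so that
    # slice indices line up with m.group(i).
    groups = (m.group(0),) + m.groups()
    if m.group("rect"):
        return ("rect", tuple(int(g) for g in groups[1:5]))
    # Groups 1-5 belong to the rect branch and are None here;
    # the circle branch owns groups 6, 7, 8 and the named group 9.
    return ("circle", tuple(int(g) for g in groups[6:8]), int(groups[8]))

print(parse_shape("10 20 30 40 rect"))   # ('rect', (10, 20, 30, 40))
print(parse_shape("5 6 7 circle"))       # ('circle', (5, 6), 7)
```

The slices groups[1:5] and groups[6:8] are exactly the kind of expression the message argues should be writable directly on the match object.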
Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects
On 7 Dec 2006, at 18:54, Martin v. Löwis wrote:

> Alastair Houghton wrote:
>> How about we remove the word "foolish" from the debate?
>
> We should table the debate. If you really want that feature,
> write a PEP. You want it, some people are opposed; a PEP is
> the procedure to settle the difference.

As I said a couple of e-mails back, I don't really have the time (I have lots of other things to do, most of them more important [to me, anyway]). If someone else agrees and wants to do it, great. If not, as I said before, I'm happy to let whoever do whatever. I might not agree, but that's my problem.

Kind regards,

Alastair.
Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects
On 7 Dec 2006, at 21:47, Josiah Carlson wrote:

> Alastair Houghton <[EMAIL PROTECTED]> wrote:
>> On 7 Dec 2006, at 02:01, Josiah Carlson wrote:
>>> Alastair Houghton <[EMAIL PROTECTED]> wrote:
>>>> On 7 Dec 2006, at 01:01, Josiah Carlson wrote:
>>>>> If we don't want slicing, or if producing a slice would produce
>>>>> a semantically questionable state, then let's not do it.
>>>>
>>>> ...if you return match objects from slicing, you have problems
>>>> like m[::-1].groups(). *I* don't know what that should return.
>>>
>>> I would argue that any 'step' != 1 has no semantically correct
>>> result for slicing on a match object, so we shouldn't support it.
>>
>> OK, but even then, if you're returning a match object, how about the
>> following:
>>
>>     >>> m = re.match('(A)(B)(C)(D)(E)', 'ABCDE')
>>     >>> print m[0]
>>     ABCDE
>>     >>> n = m[2:5]
>>     >>> print list(n)
>>     ['B', 'C', 'D']
>>     >>> print n[0]
>>     B
>>     >>> print n.group(0)
>>     B
>>
>> The problem I have with it is that it's violating the invariant that
>> match objects should return the whole match in group(0).
>
> If we were going to go with slicing, then it would be fairly trivial
> to include the whole match range. Some portion of the underlying
> structure knows where the start of group 2 is, and knows where the end
> of group 5 is, so we can slice or otherwise use that for subsequent
> sliced groups.

But then you're proposing that this thing (which looks like a tuple, when you're indexing it) should slice in a funny way. i.e.

    >>> m = re.match('(A)(B)(C)(D)(E)', 'ABCDE')
    >>> print m[0]
    ABCDE
    >>> print list(m)
    ['ABCDE', 'A', 'B', 'C', 'D', 'E']
    >>> n = m[2:5]
    >>> print list(n)
    ['BCD', 'B', 'C', 'D']
    >>> print len(n)
    4
    >>> p = list(m)[2:5]
    >>> print p
    ['B', 'C', 'D']
    >>> print len(p)
    3

Or are you saying that

    m[2:5][0] != m[2:5].group(0)

but

    m[0] == m.group(0)

??

Either way I think that's *really* counter-intuitive.

Honestly, I don't think that slicing should be supported if it's going to have to result in match objects, because I can't see a way to make them make sense. I think that's Fredrik's objection also, but unlike me he doesn't feel that the slice operation should return something different (e.g. a tuple).

Kind regards,

Alastair.
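The two behaviours being contrasted can be reproduced with ordinary tuples in a current Python 3. This is only an illustration of the semantics under discussion, not an implementation of sliceable match objects; the variable names are made up for the example, and the "whole match range" of the slice is computed from the documented start() and end() methods.

```python
import re

m = re.match(r"(A)(B)(C)(D)(E)", "ABCDE")
groups = (m.group(0),) + m.groups()   # ('ABCDE', 'A', 'B', 'C', 'D', 'E')

# Plain sequence slicing: three items, as with list(m)[2:5] in the message.
plain = groups[2:5]

# "Include the whole match range": prepend the text spanning the first
# sliced group (2) through the last sliced group (4).
span = m.string[m.start(2):m.end(4)]
with_whole = (span,) + plain

print(plain, len(plain))            # ('B', 'C', 'D') 3
print(with_whole, len(with_whole))  # ('BCD', 'B', 'C', 'D') 4
```

A slice written as [2:5] that reports a length of 4 is the mismatch the message describes as counter-intuitive.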
Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects
On 8 Dec 2006, at 16:38, Josiah Carlson wrote:

> My statement in the email you replied to above was to say that if we
> wanted it to return a group, then we could include subsequent
> .group(0) with the same semantics as the original match object.

And my reply was simply to point out that that's not workable.

> At this point it doesn't matter, Fredrik will produce what he wants to
> produce, and I'm sure most of us will be happy with the outcome. Those
> that are unhappy will need to write their own patch or deal with being
> unhappy.

I believe I've already conceded that twice.

Kind regards,

Alastair.