Re: Negative array indices and slice()
Ok, hopefully this is better. I love my own e-mail editor...

I can see that the slice() function can pass in arbitrary arguments. I'm not sure, for lists -- which is what the range is applied to -- why an argument like "a" would be part of a slice. I *really* don't see what the advantage of a slice class is over a mere list in the order start, stop, step, eg: [ 1, 4, 9 ]. In a dictionary, where "a" could be a key -- I wasn't aware that there was a defined order that the idea of slice could apply to.

When I look at the documentation, http://www.python.org/doc//current/c-api/slice the only thing slice has which is special is that the length of the sequence can be given -- and the start and stop indexes are either trimmed, or an error (exception???) is thrown. Where is the information on the more general case of slice()? :-\

I am thinking: can one use the 'super' type of access to override -- within the list object itself -- the __getitem__ method, and after pre-processing, call the shadowed method with the modified parameters? That would allow me to use the normal a[-4:6] notation, without having to write a wrapper class that must be explicitly called. I'm thinking something like:

PyListObject.__getitem__ = lambda self, slice: ...
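For what it's worth, the wrapper doesn't have to be explicitly called if it is a subclass -- the normal a[-4:6] notation then works directly. A minimal sketch (assuming Python 3, where every slice reaches __getitem__ as a slice object; WrapList and its wraparound rule are only my illustration, not anything Python provides):

class WrapList(list):
    """A list whose slices wrap around when start < 0 <= stop."""
    def __getitem__(self, key):
        if (isinstance(key, slice) and key.step is None
                and key.start is not None and key.start < 0
                and key.stop is not None and key.stop >= 0):
            # Cross the boundary: tail piece first, then head piece.
            return (list.__getitem__(self, slice(key.start, None)) +
                    list.__getitem__(self, slice(None, key.stop)))
        return list.__getitem__(self, key)

a = WrapList([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(a[-4:6])   # [7, 8, 9, 10, 1, 2, 3, 4, 5, 6] -- wraps around
print(a[1:5])    # [2, 3, 4, 5] -- ordinary slices are untouched

--Andrew.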
Re: Negative array indices and slice()
On 10/29/2012 04:32 AM, Chris Angelico wrote:
> I wonder if what the OP is looking for is not slicing, but something
> more akin to map. Start with a large object and an iterator that
> produces keys, and create an iterator/list of their corresponding
> values. Something like:
>
> a=[1,2,3,4,5,6,7,8,9,10]
> b=[a[i] for i in xrange(-4,3)]
>
> It's not strictly a slice operation, but it's a similar sort of thing,
> and it can do the wraparound quite happily.
>
> ChrisA

A list comprehension? That does do what I am interested in, *very* much so. Quite a gem, Chris!

:-\ I am curious as to how quickly it constructs the result compared to a slice operation. Eg:

a[1:5] vs. [ a[i] for i in xrange(1,5) ]

But unless it were grossly slower -- so that if/then logic and slices were generally faster -- I will use it. Thanks.
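For the record, the speed question is easy to answer empirically with timeit; a rough sketch (absolute numbers will vary by machine and interpreter version):

import timeit

setup = "a = list(range(100))"
print(timeit.timeit("a[1:50]", setup=setup))                          # plain slice
print(timeit.timeit("[a[i] for i in range(1, 50)]", setup=setup))     # comprehension
print(timeit.timeit("list(map(a.__getitem__, range(1, 50)))", setup=setup))  # map

--Andrew.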
Re: Negative array indices and slice()
On 10/29/2012 04:19 AM, Steven D'Aprano wrote:
> On Mon, 29 Oct 2012 00:54:29 -0700, Andrew wrote:
>
> Slices and iterators have different purposes and therefore have not
> been made interchangeable. Yes, there are certain similarities between
> a slice and xrange, but there are also significant differences.

Aha, now we're getting to the actual subject.

[snip]

>> In 'C', where Python is written,
>
> That's a popular misapprehension. Python is written in Java, or Lisp,
> or Haskell, or CLR (dot Net), or RPython, or Ocaml, or Parrot. Each of
> those languages has, or had, at least one Python implementation. Oh,
> there's also a version written in C, or so I have heard. :-P

I didn't say it was only written in "C", but that "C" is where it is implemented. I will be porting Python 3.xx to a super-low-power embedded processor (MSP430), where both space and speed are at a premium. Running Python on top of Java would be a *SERIOUS* mistake. .NET won't even run on this system. Etc.

Thank you for the code snippet.

>> I don't think it likely that existing programs depend on nor use a
>> negative index and a positive index expecting to take a small chunk
>> in the center...
>
> On the contrary. That is the most straightforward and useful idea of
> slicing, to grab a contiguous slice of items.

Show me an example where someone would write a slice with a negative and a positive index (both in the same slice), and have that slice grab a contiguous slice in the *middle* of the list, oriented from lower index to greater index. I have asked before; it's not that I don't think it possible -- it's that I can't imagine a common situation.

> Why would you want to grab a slice from the end of the list, and a
> slice from the start of the list, and swap them around? Apart from
> simulating card shuffles and cuts, who does that?

Advanced statistics programmers using lookup tables that are symmetrical. Physicists too -- but they're notably weird.

My intended inferences about the iterator vs. slice question were perhaps not obvious to you. Notice: an iterator is not *allowed* in __getitem__().

> Actually, you can write __getitem__ for your own classes to accept
> anything you like.

Yes, I realize that. But why can't I just overload the existing __getitem__ for lists and not bother writing an entire class? Everything in Python is supposed to be an object, and one of the big supposed "selling" points is the ability to overload "any" object's methods. The lists aren't special -- they're just a bunch of constant decimal numbers, typically given as a large tuple.

> py> class Test:
> ...     def __getitem__(self, index):
> ...         return index
> ...

Better:

>>> class Test:
...     def __getitem__( self, *index ):
...         return index

No extra curlies required...

> You say that as if it were a bad thing.

Hmmm... and you, as if sarcastic? :-) It is a bad thing to have any strictly unnecessary and non-code-saving objects where memory is restricted.

> What existing class is that? It certainly isn't xrange. Because xrange
> represents a concrete sequence of numbers, all three of start, end and
> stride must be concrete, known, integers:
>
> py> xrange(4, None, 2)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: an integer is required

Let's up the ante. I'll admit xrange() won't do "later" fill-in-the-blank -- BUT -- xrange() is a subclass of an *existing* class called iterator. Iterators are very general. They can even be made random. The philosophy of Python is to have exactly one way to do something when possible; so why create a stand-alone class that does nothing an existing class could already do, and do it better?

Hmmm.. Let's try your example exactly as shown...

"hello world"[aslice]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'aslice' is not defined

WOW. Cool. Where did the blanks *actually* get filled in? Or HOW WILL they be in your next post?

> On the contrary, a simple list of three values not only could not do
> everything a slice does, but it's over twice the size!

Yeah, there is a definite issue there. But the case isn't decided by that number alone. A slice stores three integers -- and an integer's size is 12 bytes -- so slices can't be storing three full integer objects. If the type that *IS* used happens to be a real Python type, we may merely typecast integers to that type -- insert them in a tuple, and by definition they must be the same size.

Looking at some of the online programming notes, a slice apparently doesn't use an integer storage variable that is capable of arbitrary expansion. =-O -- and hence won't work for very large sized lists. That actually explains some crashes I have noted in the past when working with 20-million-element lists that I wanted a slice of. I had *plenty* of RAM on that system.

Besides: the program code to implement slice() is undoubtedly larger than 12 bytes of savings! How many slices() are typically found in memory simultaneously?
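For reference, the sizes in question can be checked directly from the interpreter; a quick probe (CPython-specific, and the exact figures differ between versions and 32- vs 64-bit builds):

import sys

print(sys.getsizeof(slice(1, 4, 9)))  # the slice object itself
print(sys.getsizeof((1, 4, 9)))       # a 3-tuple (its ints are separate objects)
print(sys.getsizeof([1, 4, 9]))       # a 3-element list
print(sys.getsizeof(1))               # a small int, for comparison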
Re: I need help installing pypng in Python 3.3
On 10/29/2012 05:23 AM, icgwh wrote:
> Hello all, I am very new to python. I am currently porting a little
> project of mine from java to python and I need to be able to construct
> and write png images. I naturally turned myself toward pypng to
> accomplish this.

I don't know if this will help, but: there is a package called ImageMagick, which can convert any image to any other type of image. If that would not be a burden to install (most OS's have pre-compiled working binaries), you could easily write a portable bitmap file to disk using python (even without a library) -- and then convert it to PNG.

I have a little script I wrote in python to create a canvas and allow you to draw on it using very simple line-drawing primitives, and then save it to PBM. It's simple enough (it only draws lines with different pens) that you could examine it and write your own script to do the same or better.

If this interests you, I will see if I can post the py script, or email it to you.

--Andrew.
Re: I need help installing pypng in Python 3.3
On 10/29/2012 06:39 AM, [email protected] wrote:
> That's very kind of you but I don't think it would be particularly
> fitted to my needs. The program I'm trying to code creates an image as
> a 2D array of "pixels" which is defined by RGBA value. My program needs
> to access and modify every component of every pixel in the image
> following a set of rules, kind of like the game of life, only more
> complex. In fact I only need a library to "push" this array of pixels
> in a displayable format for the GUI and in PNG format to write the
> image to disk. I don't need to do any fancy stuff with the image, just
> being able to display and write it.

Then, actually, what I am suggesting was *almost* perfect. To do transparency, you need to write the portable anymap (PAM) format. Simply print a text header to a file which says:

P7
WIDTH 10
HEIGHT 10
DEPTH 4
MAXVAL 255
TUPLTYPE RGB_ALPHA
ENDHDR

and then dump your 2D array to that same file. A very quick example in about 17 lines of code:

io = open( "anyname.pam", "wb" )    # binary mode -- raw pixel bytes follow the text header
x, y = 10, 10
gray = (128, 128, 128, 255)         # R,G,B,A value
picture = [ [ gray ] * x for _ in range( y ) ]  # blank gray canvas; one fresh row list per row

# Do whatever you want to the 2D picture array here!

io.write( "P7\nWIDTH %d\nHEIGHT %d\nDEPTH 4\nMAXVAL 255\nTUPLTYPE RGB_ALPHA\nENDHDR\n" % (x, y) )
for yi in xrange( y ):
    for xi in xrange( x ):
        pixel = picture[yi][xi]
        io.write( chr(pixel[0]) )   # R value
        io.write( chr(pixel[1]) )   # G value
        io.write( chr(pixel[2]) )   # B value
        io.write( chr(pixel[3]) )   # A value
io.flush()
io.close()

And that's it. You may of course make this more efficient -- I'm just showing it this way for clarity. Many programs can read PAM directly; but for those that can't, you can use the netpbm tools, or ImageMagick, to convert it to PNG.
Re: Negative array indices and slice()
On 10/29/2012 06:52 AM, Roy Smith wrote:
>> Show me an example where someone would write a slice with a negative
>> and a positive index (both in the same slice); and have that slice
>> grab a contiguous slice in the *middle* of the list with orientation
>> of lower index to greater index.
>
> It's possible in bioinformatics. ... seq[100:-100].

I decided to go to bed... I was starting to write very badly worded responses. :)

Thanks, Roy; what you have just shown is another example that agrees with what I am trying to do. FYI: I was asking for a reason why Python's present implementation is desirable... I wonder, for example, given an arbitrary list:

a=[1,2,3,4,5,6,7,8,9,10,11,12]

why would someone *want* to do:

a[-7:10]

instead of saying

a[5:10] or a[-7:-2] ?

eg: What algorithm would naturally *desire* the default behavior of slicing when using *mixed* negative and positive indexes?

In the case of a bacterial circular DNA/RNA ring, asking for codons[-10:10] would logically desire codons[-10:] + codons[:10], not an empty list, right? I think your example is a very reasonable thing the scientific community would want to do with Python. :)
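A minimal sketch of that circular behaviour as a plain helper (my own illustration -- cyclic_slice is not part of Python; it treats a negative start with a non-negative stop as a request to cross the sequence boundary):

def cyclic_slice(seq, start, stop):
    """Slice seq, wrapping past the end when start < 0 <= stop."""
    if start is not None and start < 0 and stop is not None and stop >= 0:
        return seq[start:] + seq[:stop]
    return seq[start:stop]

codons = list(range(30))              # stand-in for a circular genome
print(cyclic_slice(codons, -10, 10))  # last 10 elements, then first 10
print(codons[-10:10])                 # plain slicing: empty list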
Re: Negative array indices and slice()
On 10/29/2012 10:09 AM, Ian Kelly wrote:
> On Oct 29, 2012 7:10 AM, "Andrew Robinson" wrote:
>> I will be porting Python 3.xx to a super low power embedded processor
>> (MSP430), both space and speed are at a premium. Running Python on
>> top of Java would be a *SERIOUS* mistake. .NET won't even run on this
>> system. etc.
>
> If that's the case, then running Python at all is probably a mistake.
> You know the interpreter alone has an overhead of nearly 6 MB?

There's already a version of the python interpreter which fits in under 100K: http://code.google.com/p/python-on-a-chip/

It's not the 3.x series, though, and I don't want to redo this once 2.7 really does become obsolete.

>> Yes, I realize that. But, why can't I just overload the existing
>> __getitem__ for lists and not bother writing an entire class?
>
> You can just overload that one method in a subclass of list. Being
> able to monkey-patch __getitem__ for the list class itself would not
> be advisable, as it would affect all list slicing anywhere in your
> program and possibly lead to some unexpected behaviors.

That's what I am curious about. What unexpected behaviors would a "monkey patch" typically cause? If no one really uses negative and positive indexes in the same slice operation, because there is no reason to do so, it will only break the occasional esoteric application.

> 20 million is nothing. On a 32-bit system, sys.maxsize == 2 ** 31 - 1.
> If the error you were seeing was MemoryError, then more likely you
> were running into dynamic allocation issues due to fragmentation of
> virtual memory.

No, there was no error at all. Python just crashed & exited; not even an exception that I can recall. It was as if it exited normally! The list was generated in a single pass by many .append()'s, and then copied once -- the original was left in place -- and then I attempted to slice it. I am able to routinely work with 5-million-length lists -- copy, slice, cut, append, and delete from them -- without this ever happening. If fragmentation were the issue, I'd think the shorter lists would cause the problem after many manipulations...

It may not be a bug in python itself, though, of course. There are libraries it uses which might have a bug.
Re: Negative array indices and slice()
On 10/29/2012 06:53 AM, Chris Angelico wrote:
> Can you provide links to these notes? I'm looking at
> cpython/Include/sliceobject.h that has this comment:
>
> /*
> A slice object containing start, stop, and step data members (the
> names are from range). After much talk with Guido, it was decided to
> let these be any arbitrary python type. Py_None stands for omitted values.
> */
>
> Also, the code for slice objects in CPython works with Py_ssize_t (a
> signed quantity of the same length as size_t), which will allow at
> least 2**31 for an index. I would guess that your crashes were nothing
> to do with 20 million elements and slices.
>
> ChrisA
Let's look at the source code rather than the web notes -- the source
must be the true answer anyhow.
I downloaded the source code for python 3.3.0, as the tbz;
In the directory "Python-3.3.0/Python", look at Python-ast.c, line 2089
& ff.
Clearly a slice is malloced for a slice_ty type.
It has four elements: kind, lower, upper, and step.
So, tracing it back to the struct definition...
"Include/Python-ast.h" has "typedef struct _slice *slice_ty;"
And, here's the answer!:
enum _slice_kind {Slice_kind=1, ExtSlice_kind=2, Index_kind=3};
struct _slice {
    enum _slice_kind kind;
    union {
        struct {
            expr_ty lower;
            expr_ty upper;
            expr_ty step;
        } Slice;
        struct {
            asdl_seq *dims;
        } ExtSlice;
        struct {
            expr_ty value;
        } Index;
    } v;
};
So, slice() does indeed have arbitrary python types included in it;
contrary to what I read elsewhere.
expr_ty is a pointer to an arbitrary expression, so the actual structure
is 4 pointers, at 32 bits each = 16 bytes.
The size of the structure itself, given in an earlier post, is 20 bytes
-- which means one more pointer is involved, perhaps the one pointing to
the slice structure itself.
Hmm...!
An empty tuple gives sys.getsizeof( () ) = 24.
But I would expect a tuple to be merely a list of object pointers;
hence I would expect 4 bytes for len(), and then a pointer for each object.
3 objects gives 12 bytes, + 4 = 16 bytes.
Then we need one more pointer (the head pointer) so Python knows where the struct is...
So a Tuple of 3 objects ought to fit nicely into 20 bytes; the same size
as slice() --
but it's 24, even when empty...
And 36 when initialized...
What are the extra 16 bytes for?
All I see is:
typedef struct { object** whatever } PyTupleObject;
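Those layout guesses are easy to probe empirically, since tuples grow by exactly one pointer per item; the slope separates the per-item cost from the fixed header (CPython-specific):

import sys

for n in range(5):
    # Consecutive differences are one pointer each;
    # the n == 0 figure is the fixed per-object overhead.
    print(n, sys.getsizeof((0,) * n))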
Re: Negative array indices and slice()
On 10/29/2012 05:02 PM, Steven D'Aprano wrote:
> On Mon, 29 Oct 2012 08:42:39 -0700, Andrew Robinson wrote:
>> But, why can't I just overload the existing __getitem__ for lists and
>> not bother writing an entire class?
>
> You say that as if writing "an entire class" was a big complicated
> effort. It isn't. It is trivially simple, a single line:
>
> class MyList(list): ...

No, I don't think it big and complicated. I do think it has timing implications which are undesirable because of how *much* slices are used. In an embedded target I have to optimize, and I will have to reject certain parts of Python to make it fit and run fast enough to be useful.

>>> You can just overload that one method in a subclass of list. Being
>>> able to monkey-patch __getitem__ for the list class itself would not
>>> be advisable, as it would affect all list slicing anywhere in your
>>> program and possibly lead to some unexpected behaviors.
>>
>> That's what I am curious about. What unexpected behaviors would a
>> "monkey patch" typically cause?
>
> What part of "unexpected" is unclear?

Ahh -- the "I don't know" approach! It's only unexpected if one is a bad programmer...!

> Let me see if I can illustrate a flavour of the sort of things that
> can happen if monkey-patching built-ins were allowed. You create a
> list and print it:
>
> # simulated output
> py> x = [5, 2, 4, 1]
> py> print(x)
> [1, 2, 4, 5]
>
> Finally you search deep into the libraries used in your code, and
> *five days later* discover that your code uses library A which uses
> library B which uses library C which uses library D which installs a
> harmless monkey-patch to print, but only if library E is installed,
> and you just happen to have E installed even though your code never
> uses it, AND that monkey-patch clashes with a harmless monkey-patch to
> list.__getitem__ installed by library F. And even though each
> monkey-patch alone is harmless, the combination breaks your code's
> output.

Right, which means that the people developing the libraries made contradictory assumptions.

> Python allows, but does not encourage, monkey-patching of code written
> in pure Python, because it sometimes can be useful. It flat out
> prohibits monkey-patching of builtins, because it is just too
> dangerous. Ruby allows monkey-patching of everything. And the result
> was predictable:
>
> http://devblog.avdi.org/2008/02/23/why-monkeypatching-is-destroying-ruby/

I read that post carefully, and the author purposely notes that he is exaggerating. BUT your point is still well taken. What you are talking about is namespace preservation, and I am thinking about it. I can preserve it -- but only if I disallow true Python primitives in my own interpreter; I can't provide two sets in the memory footprint I am using.

From my perspective, the version of Python that I compile will not be supported by the normal python help. The predecessor which first forged this path, Pymite, has the same problems -- however, the benefits outweigh the disadvantages, and the experiment yielded useful information on what is redundant in Python (eg: range is not supported) and when that redundancy is important for some reason.

If someone had a clear explanation of the disadvantages of allowing an iterator, or a tuple, in place of a slice(), I would have no qualms dropping the subject. However, I am not finding that yet. I am finding very small optimization issues... The size of an object is at least 8 bytes. Hence, three numbers are going to be at least 24 bytes -- and that's 24 bytes in *excess* of the size of slice() or tuple(), which are merely containers.

So -- there *ARE* savings in memory when using slice(), but it isn't really 2x memory; it's more like 20%, once the actual objects are considered. The actual *need* for a slice() object still hasn't been demonstrated. I am thinking that the implementation of __getitem__() is very poor, probably because of legacy issues.

A tuple can also hold None, so ( 1, None, 2 ) is still a valid tuple. Alternately: an iterator, like xrange(), could be made which takes None as a parameter, or a special value like 'inf'. Since these two values would never be passed to xrange by already-developed code, allowing them would not break working code.

I am only aware of one possible reason that slice() was once thought to be necessary; and that is because accessing an element of a tuple would recursively call __getitem__ on the tuple. But even that is easily dismissed once fixed integer indexes are considered.

Your thoughts? Do you have any show-stopper insights?
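For what it's worth, stock CPython simply refuses the patch on built-ins, so the question only arises for a modified interpreter; a quick check (the error wording varies slightly by version):

try:
    list.__getitem__ = lambda self, key: key
except TypeError as e:
    print(e)  # e.g. "can't set attributes of built-in/extension type 'list'"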
Re: Negative array indices and slice()
On 10/29/2012 06:49 PM, Chris Kaynor wrote:
> Every Python object requires two pieces of data, both of which are
> pointer-sized (one is a pointer, one is an int the size of a pointer).
> These are: a pointer to the object's type, and the object's reference
> count. A tuple actually does not need a head pointer: the head pointer
> is merely an offset from the tuple's pointer. It merely has a ref
> count, type, an item count, and pointers to its contents. A slice has
> the same type pointer and reference count, then three pointers to the
> start, stop, and step objects. This means a slice object should be the
> same size as a two-item tuple: the tuple needs a count, while that is
> fixed at 3 for a slice (though some items may be unset).
>
> NOTE: The above is taken from reading the source code for Python 2.6.
> For some odd reason, I am getting that an empty tuple consists of 6
> pointer-sized objects (48 bytes on x64), rather than the expected 3
> pointer-sized (24 bytes on x64). Slices are showing up as the expected
> 5 pointer-sized (40 bytes on x64), and tuples grow at the expected 1
> pointer (8 bytes on x64) per item. I imagine I am missing something,
> but cannot figure out what that would be.
All I see is:
typedef struct { object** whatever } PyTupleObject;
It's fairly straightforward in 3.2.0. I debugged the code with GDB and
watched.
Perhaps it is the same in 2.6 ?
In addition to those items you mention -- of which the reference count is
not even *inside* the struct -- there is additional debugging information
not mentioned. Built-in objects contain a "line number", a "column
number", and a "context" pointer. These each require a full word of
storage.

Also, built-in types appear to have a "kind" field which indicates the
object "type" but is not a pointer. That suggests two "object" type
indicators: a generic pointer (probably pointing to "builtin", somewhere
outside the struct) and a specific one (an enum) inside the "C" struct.

Inside the tuple struct, I count 4 undocumented words of information.
Over all, there is a length, the list of pointers, a "kind", "line",
"col" and "context" -- making 6 pieces in total.
Although your comment says the head pointer is not required, I found in
3.3.0 that it is a true head pointer: the Tuple() function on line 2069
of Python-ast.c (3.3 version) is passed a pointer called *elts, and that
pointer is copied into the Tuple struct.
How ironic: slices don't have debugging info, and that's the main reason
they are smaller.
When I do slice(3,0,2), surprisingly "Slice()" is NOT called.
But when I do a[1:2:3], it *IS* called.
Re: Negative array indices and slice()
Hi Ian,

There are several interesting/thoughtful things you have written. I like the way you consider a problem before knee-jerk answering.

The copying you mention (or realloc) doesn't re-copy the objects on the list. It merely re-copies the pointer list to those objects. So let's see what it would do... I have seen doubling given as the supposed realloc method, but I'll assume 1.25 -- so 1.25**x = 20 million gives 76 copies (max). The final memory copy would leave about a 30MB hole, and my version of Python operates initially with a 7MB virtual footprint. Sooo... if the garbage collection didn't operate at all, the copying would waste around:

>>> z, w = 30e6, 0
>>> while (z > 1): w, z = w + z, z / 1.25
...
>>> print(w)
149999995.8589521

eg: ~150MB cumulative. The doubles would amount to 320 megs max. Not enough to fill virtual memory up, nor even cause a swap on a 2GB-memory machine. It can hold everything in memory at once.

So I don't think Python's memory management is the heart of the problem, although memory-wise it does require copying around 50% of the data. As an implementation issue, though, the large linear array may cause wasteful caching/swapping loops, especially on smaller machines.

On 10/29/2012 10:27 AM, Ian Kelly wrote:
> Yes, I misconstrued your question. I thought you wanted to change the
> behavior of slicing to wrap around the end when start > stop instead
> of returning an empty sequence. ... Chris has already given ... You
> could also use map for this:
>
> new_seq = list(map(old_seq.__getitem__, iterable))

MMM... interesting. I am not against changing the behavior, but I do want solutions like the ones you are offering. As I am going to implement a python interpreter in C, being able to do things differently could significantly reduce the interpreter's size. However, I want to break existing scripts very seldom...

> I'm aware of what is possible in C with pointer arithmetic. This is
> Python, though, and Python by design has neither pointers nor pointer
> arithmetic. In any case, initializing the pointer to the end of the
> array would still not do what you want, since the positive indices
> would then extend past the end of the array.

Yes, *and* if you have done assembly language programming you know that testing for sign is a trivial operation. It doesn't even require a subtraction. Hence, at the most basic machine level, changing the base pointer *once* during a slice operation is going to be far more efficient than performing multiple subtractions from the end of an array, as the Python API defines. I'll leave out further gory details... but it is a "Python interpreter built in C" issue.
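The growth policy itself can be watched from Python; a small probe (CPython-specific -- modern CPython over-allocates lists by roughly one-eighth, so the 1.25 assumed above is, if anything, generous):

import sys

a = []
last = sys.getsizeof(a)
for i in range(64):
    a.append(i)
    size = sys.getsizeof(a)
    if size != last:
        # A realloc just happened: the pointer array grew.
        print(len(a), size)
        last = size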
Re: Negative array indices and slice()
On 10/29/2012 04:01 PM, Ian Kelly wrote:
> On Mon, Oct 29, 2012 at 9:20 AM, Andrew Robinson wrote:
>> FYI: I was asking for a reason why Python's present implementation is
>> desirable...
>> I wonder, for example:
>> Given an arbitrary list:
>> a=[1,2,3,4,5,6,7,8,9,10,11,12]
>> Why would someone *want* to do:
>> a[-7:10]
>> Instead of saying
>> a[5:10] or a[-7:-2] ?
>
> A quick search of local code turns up examples like this:
>
> if name.startswith('{') and name.endswith('}'):
>     name = name[1:-1]
>
> Which is done to avoid explicitly calling the len() operator.
> If slices worked like ranges, then the result of that would be empty,
> which is obviously not desirable.
Yes, and that's an excellent point -- but note what I am showing in the
example. It is that example which I am specifying. There are only two
cases where I think the default behavior of Python gives undesirable
results:

* the step is positive, and the pair of indexes goes from negative to
positive;
* likewise, if the pair goes from positive to negative, and the step is
negative.

In all other combinations, the default behavior of python ought to
remain intact.
I apologize for not making this crystal clear -- I thought you would
focus on the specific example I gave.
> I don't know of a reason why one might need to use a negative start
> with a positive stop, though.
I've already given several examples; and another poster did too -- eg:
Gene sequences for bacteria. It's not uncommon to need this. If I do
some digging, I can also show some common graphics operations that
benefit greatly from this ability -- NOTE: in another thread I just
showed someone how to operate on RGBA values... Slicing becomes THE
major operation done when converting, or blitting, graphics data. etc.
Another example: JPEG uses discrete cosine transforms, which are a
naturally cyclic data type -- they repeat with a fixed period. I know
there are "C" libraries already made for JPEG, but that doesn't mean
many other applications with no "C" library aren't plagued by this
problem.

I don't know how to make this point more clear. There really *ARE*
applications that use cyclic lists of data, or which could avoid extra
logic to fix problems encountered from linear arrays which *end* at a
particular point.

Sometimes it is desirable for a truncation to occur; sometimes it's
NOT. The sign-convention test I outlined, I believe, clearly detects
when a cyclic data set is desired. If there are normal examples where
my test fails -- that's what's important to me.
Re: Negative array indices and slice()
On 10/29/2012 10:53 PM, Michael Torrie wrote:
> On 10/29/2012 01:34 PM, Andrew Robinson wrote:
>> No, I don't think it big and complicated. I do think it has timing
>> implications which are undesirable because of how *much* slices are
>> used. In an embedded target -- I have to optimize; and I will have to
>> reject certain parts of Python to make it fit and run fast enough to
>> be useful.
>
> Since you can't port the full Python system to your embedded machine
> anyway, why not just port a subset of python and modify it to suit
> your needs right there in the C code. It would be a fork, yes,

You're exactly right; that's what I *know* I am faced with.

> Without a libc, an MMU on the CPU, and a kernel, it's not going to
> just compile and run.

I have libc. The MMU is a problem, but the compiler implements the standard "C" math library -- floats, though, instead of doubles. That's the only problem there.

> What you want with slicing behavior changes has no place in the normal
> cPython implementation, for a lot of reasons. The main one is that it
> is already possible to implement what you are talking about in your
> own python class, which is a fine solution for a normal computer with
> memory and CPU power available.

If the tests I outlined in the previous post inaccurately describe a major performance improvement and at least a modest code-size reduction, or if they will *often* introduce bugs -- then I *AGREE* with you. Otherwise, I don't. I don't think wasting extra CPU power is a good thing -- extra CPU power can always be used by something else.

I won't belabor the point further. I'd love to see a counter-example to the specific criteria I just provided to Ian -- it would end my quest, and be a good reference to point others to.
Re: Negative array indices and slice()
On 10/29/2012 11:51 PM, Ian Kelly wrote:
> On Mon, Oct 29, 2012 at 4:39 PM, Andrew Robinson wrote:
>
> As above, you're looking at the compiler code, which is why you're
> finding things like "line" and "column". The tuple struct is defined
> in tupleobject.h and stores tuple elements in a tail array.

If you re-check my post to Chris, I listed the struct you mention. The C code I traced is what is actually run (by GDB breakpoint test) when a tuple is instantiated. If the tuple were stripped of the extra data, then it ought to be as small as slice(). But it's not as small -- so either sys.getsizeof() is lying, or the struct you mention is not complete. Which?

--Andrew.
Re: Negative array indices and slice()
On 10/30/2012 11:02 AM, Ian Kelly wrote:
> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman wrote:
>> File a bug report?
>
> Looks like it's already been wontfixed back in 2006:
> http://bugs.python.org/issue1501180

Thanks, Ian, you've answered the first of my questions and have been a great help. (And yes, I was debugging interactive mode... I took a nap after writing that post, as I realized I had reached my one really bad post for the day...)

At least I finally know why Python chooses to implement slice() as a separate object from tuple, even if I don't like the implications. I think there are three main consequences of the present implementation of slice():

1) The interpreter code size is made larger with no substantial improvement in functionality, which increases debugging effort.
2) No protection against perverted and surprising (are you surprised?! I am) memory operation exists.
3) There is a memory savings associated with not having garbage-collection overhead.

D'Aprano mentioned the named values start, stop, step in a slice(), which are an API and legacy issue; these three names must also be stored in the interpreter someplace. Since slice is defined at the "C" level as a struct, have you already found these names in the source code (hard-coded), or are they part of a .py file associated with the interface to the "C" code?
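For reference, the three names are visible from Python as ordinary attributes, and slice.indices() is what resolves them against a concrete sequence length:

s = slice(-7, 10, None)
print(s.start, s.stop, s.step)   # -7 10 None
print(s.indices(12))             # (5, 10, 1) -- resolved for a length-12 sequence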
Re: Negative array indices and slice()
On 10/30/2012 01:17 AM, Steven D'Aprano wrote:
> By the way Andrew, the timestamps on your emails appear to be off, or
> possibly the time zone. Your posts are allegedly arriving before the
> posts you reply to, at least according to my news client.

:D -- yes, I know about that problem. Every time I reboot it shows up again... It's a distribution issue: my hardware clock is in local time, but when the clock is read by different scripts in my distribution, some refuse to accept that the system clock is not UTC. I'll be upgrading in a few weeks -- so I'm just limping along until then. My apology.

> Then I look forward to seeing your profiling results that show that
> the overhead of subclassing list is the bottleneck in your
> application. Until then, you are making the classic blunder of the
> premature optimizer:
>
> "More computing sins are committed in the name of efficiency (without
> necessarily achieving it) than for any other single reason — including
> blind stupidity." — W.A. Wulf

I'm sure that's true. Optimization, though, is a very general word.

On a highway in my neighborhood, the government keeps trying to put more safety restrictions on it, because it statistically registers as the "highest accident rate road" in the *entire* region. Naturally, the government assumes that people in my neighborhood are worse drivers than usual and need to be policed more -- but the truth is, that highway is the *ONLY* access road in the region for dozens of miles in any direction for a densely populated area; so if there is going to be an accident, it will happen there. The extra safety precautions are not necessary when the accident rate is looked at from a per-capita perspective of those driving the highway.

I haven't made *the* blunder of the premature optimizer, because I haven't implemented anything yet. Premature optimizers don't bother to hold public conversation and take correction. OTOH: people who never optimize, out of fear, pay an increasing bloat price with time.

> I am not impressed by performance arguments when you have (apparently)
> neither identified the bottlenecks in your code, nor even measured the
> performance.

Someone else already did a benchmark between a discrete loop and a slice operation. The difference in speed was an order of magnitude. I benchmarked a map operation, which was *much* better -- but also still very slow in comparison.

Let's not confound the issue here -- I am going to implement the python interpreter, and am not bound by optimization considerations of the present python interpreter. There are things I can do which, as a python programmer, you can't. I have no choice but to re-implement and optimize the interpreter -- the question is merely how to go about it.

> You are essentially *guessing* where the bottlenecks are, and *hoping*
> that some suggested change will be an optimization rather than a
> pessimization. Of course I may be wrong, and you have profiled your
> code and determined that the overhead of inheritance is a problem. If
> so, that's a different ball game. But your posts so far suggest to me
> that you're trying to predict performance optimizations rather than
> measure them.

Not really; inheritance itself and its timing aren't my main concern. Even if the time were *0*, that wouldn't change my mind. There are man-hours in debugging time caused by not being able to wrap around in a slice. (I am not ignoring the contrary man-hours caused by an API change's bugs.) Human psychology is important, and it's a double-edged sword.

I would refer you to a book written by Steve Maguire, Writing Solid Code; Chapter 5, Candy machine interfaces. He uses the "C" function realloc() as an excellent example of a bad API, but still comments on one need that it *does* fulfill: "I've found it better to have one function that both shrinks and expands blocks so that I don't have to write *ifs* constructs every time I need to resize memory. True, I give up some extra argument checking, but this is offset by the *ifs* that I no longer need to write (*and possibly mess up*)."

* Extra steps that a programmer must take to achieve a task are places where bugs get introduced.
* APIs which must be debugged to see what particular operation they are performing, rather than knowing what that operation is from looking at the un-compiled code, are places where bugs get introduced.

These two points are not friendly with each other -- they are, in fact, generally in conflict.

>> Right, which means that people developing the libraries made
>> contradictory assumptions.
>
> Not necessarily. Not only can monkey-patches conflict, but they can
> combine in bad ways. It isn't just that Fred assumes X and Barney
> assumes not-X, but also that Fred assumes X and Barney assumes Y and
> *nobody* imagined that there was some interaction between X and Y.

They *STILL* made contradictory assumptions; each of them assumed the interaction mechanism would not be applied i
Re: Negative array indices and slice()
On 10/30/2012 04:48 PM, Mark Lawrence wrote:
> On 30/10/2012 15:47, Andrew Robinson wrote:
>> I would refer you to a book written by Steve Maguire, Writing Solid
>> Code; Chapter 5; Candy machine interfaces.
>
> The book that took a right hammering here
> http://accu.org/index.php?module=bookreviews&func=search&rid=467 ?
Yes, although Chapter 5 is the one where the realloc() issue is discussed.
If you have a library, see if you can check the book out -- rather than
spend $$$ on it.
But, in good humor, consider the only criticism the poster mentioned
about chapter 5's contents:
> Occasionally, he presents a code fragment with a subtle bug, such as:
>
> p = realloc(p,n);
>
> _I have to admit that I didn't spot the bug_, but then I never use
> realloc, knowing it to have pitfalls. What is the bug? If realloc
> cannot allocate the memory, it returns NULL and the assignment means
> you lose your original pointer.
>
> What are the pitfalls? Realloc may or may not copy the data to a
> new, larger, area of memory and return the address of that: many
> programmers forget this and end up with pointers into the old,
> deallocated, area. Even those programmers who remember will likely
> fall into the trap that Maguire shows.
>
> _Back to 'clever' code though, he prefers_:
His critique is a bit like the scene in Monty Python's the Life of
Br..an, where the aliens come, and crash, and leave -- totally
irrelevant to the plot-line of the movie. What does this comment have to
do with the critique??? Maguire didn't fail to notice the bug!

But the critic doesn't even notice the main *pointS* the author was
trying to make in that chapter.
There are, to be sure, recommendations in the book that I don't agree
with; he doesn't seem to do much unit testing, postmortems, etc. -- all
topics that I studied in a formal class on Software Engineering. It was
a wider perspective than Maguire brings to his book;
but that doesn't mean Maguire has nothing valuable to say!
A short Python homage for readers of the linked critique!:
I've had to follow GPL project style rules where the rule for a weird
situation would be:

while (*condition) /* nothing */ ;   // and yes, this will sometimes generate a warning...

But I have enough brains to take Maguire's *suggestion* to an improved,
Python-inspired conclusion:

#define PASS(x) {(void)NULL;}
while (*condition) PASS( Gas );   // There will be no warning
RE: Negative array indices and slice()
Ian,

> Looks like it's already been wontfixed back in 2006:
> http://bugs.python.org/issue1501180

Absolutely bloody typical, turned down because of an idiot. Who the hell is Tim Peters anyway?

> I don't really disagree with him, anyway. It is a rather obscure bug
> -- is it worth increasing the memory footprint of slice objects by 80%
> in order to fix it?

:D In either event, a *bug* does exist (at *least* 20% of the time). Tim Peters could have opened the *appropriate* bug complaint if he rejected the inappropriate one. The API ought to have either 1) included the garbage collection, or 2) raised an exception any time dangerous/leaky data was supplied to slice().

If it is worth getting rid of the 4 words of extra memory required for the GC -- on account of slice() refusing to support data with sub-objects -- then I'd also point out that a very large percentage of the time, tuples also contain data (typically integers or floats) which do not further sub-reference objects. Hence, it would be worth it there too. OTOH, if the GC is considered acceptable in non-sub-referenced tuples, GC ought to be acceptable in slice() as well. Inconsistency is the mother of surprises -- and of code bloat through exceptions.

> Note that the slice API also includes the slice.indices method. They
> also implement rich comparisons, but this appears to be done by
> copying the data to tuples and comparing the tuples, which is actually
> a bit ironic considering this discussion.

Yes, indeed! I didn't mention the slice.indices method, as its purpose is traditionally to *directly* feed the parameters of xrange or range. (I thought that might start a WAR!) :D

http://docs.python.org/release/2.3.5/whatsnew/section-slices.html

class FakeSeq:
    ...
            return FakeSeq([self.calc_item(i) for i in range(*indices)])
        else:
            return self.calc_item(i)

And here I'm wondering why we can't just pass range into it directly... :(

I came across some unexpected behavior in Python 3.2 when experimenting with ranges and replacement. Consider: xrange is missing, BUT:

>>> a=range(1,5,2)
>>> a[1]
3
>>> a[2]
5
>>> a[1:2]
range(3, 5, 2)

Now, I wondered if it would still print the array or not; eg: if this was a __str__ issue vs. __repr__.

>>> print( a[1:2] )   # Boy, I have to get used to the print's parentheses
range(3, 5, 2)

So the answer is *NOPE*. I guess I need to read the docs all over again... it's... well, quite different.

--Andrew.
Re: Negative array indices and slice()
On 10/30/2012 10:29 PM, Michael Torrie wrote:
> As this is the case, why this long discussion? If you are arguing for
> a change in Python to make it compatible with what this fork you are
> going to create will do, this has already been fairly thoroughly
> addressed early on, and reasons why the semantics will not change
> anytime soon have been given.

I'm not arguing for a change in the present release of Python, and I have never done so. Historically, if a fork happens to produce something surprisingly _useful_, the main code bank eventually accepts it on its own. If a fork is a mistake, it dies on its own. That really is the way things ought to be done.

> ... include this: The Zen of Python, by _Tim Peters_:
>
> Special cases aren't special enough to break the rules.
> Although _practicality beats purity_.

Now, I have seen several coded projects where the idea of cyclic lists is PRACTICAL, and the idea of iterating slices may be practical if they could be made *FASTER*. These warrant looking into -- carefully; and that means making an experimental fork, preferably before I attempt to micro-port the python.

Regarding the continuing discussion: the more I learn, the more informed decisions I can make regarding implementation. I am almost fully understanding the questions I originally asked, now. What remains are mostly questions about compatibility wrappers, and how to allow them to be used -- or selectively deleted when not necessary; and perhaps a demonstration or two about how slices and named tuples can (or can't) perform nearly the same function in slice processing.
Re: Negative array indices and slice()
On 10/31/2012 02:20 PM, Ian Kelly wrote:
> On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson wrote:
>> Then I'd note: the non-goofy purpose of slice is to hold three data
>> values, which are either numbers or None. These *normally*
>> encountered values can't create a memory loop. So, for as long as the
>> object representing slice does not contain an explicit GC pair, I
>> move that we mandate (yes, in the current python implementation, even
>> as a *fix*) that its named members may not be assigned any objects
>> other than None or numbers. eg: lists would be forbidden. Since
>> functions and subclasses can be test-evaluated by
>> int( the_thing_to_try ), and *[] can too, generality need not be lost
>> for generating nothing or numbers.
>
> PEP 357 requires that anything implementing the __index__ special
> method be allowed for slicing sequences (and also that __index__ be
> used for the conversion). For the most part, that includes ints and
> numpy integer types, but other code could be doing esoteric things
> with it.

I missed something... (but then, that's why we're still talking about it...)

Reading the PEP, it notes that *only* integers (or longs) are permitted in slice syntax (overlooking None, of course... which is strange...). The PEP gives the only exceptions as objects with the method __index__. Automatically, then, an empty list is forbidden (in slice syntax). However, what you did was circumvent the PEP by passing an empty list directly to slice(), avoiding running it through slice-syntax processing. So... is there documentation suggesting that a slice object is meant to hold anything other than what comes from processing a valid slice syntax [::]? (We know it *can* be done, but that's a different Q.)

> The change would be backward-incompatible in any case, since there is
> certainly code out there that uses non-numeric slices -- one example
> has already been given in this thread.

Hmmm. Now I'm thinking: the purpose of __index__(), specifically, is to notify when something which is not an integer may be used as an index; you've helpfully noted that __index__() also *converts* those objects into numbers. Ethan Fullman mentioned that he used the names of fields "instead of having to remember the _offsets_", which means that his values _do convert_ to offset numbers. His example was actually given in slice syntax notation [::]. Hence, his objects must have an __index__() method, correct?

Therefore, I still see no reason why it is permissible to assign non-numerical (non-None) items as an element of slice(). Or, let me re-word that more clearly: I see no reason that slice's named members, when used as originally intended, would ever need to be assigned a value which is not *already* converted to a number by __index__(). By definition, if it can't be coerced, it isn't a number.

A side note: at 80% less overhead, and three slots, slice is rather attractive to store the RGB values of a picture in! But I don't think anyone would have a problem saying "No, we won't support that, even if you do do it!"

So, what's the psychology behind allowing slice() to hold objects which are not converted to ints/longs in the first place?
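A minimal illustration of the PEP 357 hook being discussed (Offset is a toy class of my own, not from the thread): anything with __index__ is accepted by slice syntax, and the conversion to an int happens when the sequence is actually indexed:

class Offset:
    def __init__(self, n):
        self.n = n
    def __index__(self):
        return self.n

a = list(range(12))
print(a[Offset(2):Offset(5)])                    # [2, 3, 4]
print(slice(Offset(2), Offset(5)).indices(12))   # (2, 5, 1) -- converted here too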
Re: Negative array indices and slice()
On 11/01/2012 07:12 AM, Ethan Furman wrote:
> Andrew Robinson wrote:
>> On 10/31/2012 02:20 PM, Ian Kelly wrote:
>>> On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson wrote:
>>>> Then I'd note: the non-goofy purpose of slice is to hold three data
>>>> values, which are either numbers or None. These *normally*
>>>> encountered values can't create a memory loop. So, FOR AS LONG as
>>>> the object representing slice does not contain an explicit GC pair;

A little review... The premise of my statement here is that Tim Peters closed the bug report

http://bugs.python.org/issue1501180

with the *reason* being that using GC was *goofy* on account of what slice() was intended to hold: None and a number. So my first attempt at a bug fix was simply to take Tim Peters at his word... since we all assume he *isn't* a "Bloody Idiot". Hey, isn't that a swear-word somewhere in the world? It's not where I live, but I seem to recall... oh, well... whatever.

>> I missed something... (but then, that's why we're still talking about
>> it...) Reading the PEP, it notes that *only* integers (or longs) are
>> permitted in slice syntax.
>
> Keep in mind that PEPs represent Python /at that time/ -- as Python
> moves forward, PEPs are not updated (this has gotten me a couple
> times).

And since I am reading them in the order written (but in 3.0), trying to get the whole of Python into my mind on the journey to prep for porting it into a tiny chip -- I'm frustrated by not being finished yet...

> Furman, actually. :)

:-!

> And my values do *not* convert to indices (at least, not
> automatically).

Ahhh. (Rhetorical & sarcastic:) I was wondering how you added an __index__() method to strings, without access to them, and still followed the special PEP we are talking about, when you gave that example using unwrapped strings. -- Hmmm, was that PEP the active state of Python when Tim rejected the bug report? eg: have we "moved on" into a place where the bug report ought to be re-issued, since that PEP is now *effectively* passe, and Tim could thus be vindicated from being a "b... Idiot"? (Or has he been given the first-place Python Twit award -- and his *man* the bug list been stripped?)

> In other words, the slice contains the strings, and my code calculates
> the offsets -- Python doesn't do it for me.
>
> ~Ethan~

I see; so the problem is that the PEP wants you to implement __index__(), but that is going to cause you to subclass string, and add a wrapper interface every time you need to index something. eg: doing something like

mydbclass[ MyString( 'fromColumn' ) : MyString( 'toColumn' ) ]

and the code becomes a candy-machine interface issue (Chapter 5, Writing Solid Code). My favorite line there uses no swearing: "If they had just taken an extra *30* seconds thinking about their design, they could have saved me, and I'm sure countless others, from getting something they didn't want." I laugh; if they didn't get it already, an extra *30* seconds is WAY too optimistic. Try minutes at least, with a policeman glaring over their shoulder.

But anyhow -- the problem lies in *when* the conversion to an integer is to take place, not so much in whether it is going to happen. Your indexes, no matter how disguised, eventually will become numbers; and you have a way that minimizes coding cruft. (The very reason I started the thread, actually... subclassing trivially to fix candy-machine interfaces leads to perpetual code increases. In cPython source code, "realloc" wrappers and "malloc" wrappers are found; I've seen these wrappers *re*-invented in nearly every C program I've ever looked at! Talk about MAN-hours, wasted space, and cruft.)

So: is this a reasonable summary of salient features (status quo)?

* Enforcing strict numerical indexes (in the slice [::] operator) causes much psychological angst when attempting to write clear code without lots of wrapper cruft.
* PEP 357 merely added cruft with __index__(), but really solved nothing. Everything __index__() does could be implemented in __getitem__, and usually is.
* slice()'s members are merely a container for *whatever* was passed to [::].
* slice() is
* slice is also a full-blown object, which implements a trivial method to dump the contents of itself to a tuple.
* Presently slice() allows memory leaks through GC loops.
* slice(), even though an object with a constructor, does no error checking to deny construction of memory leaks.

If people would take 30 seconds to think about this: the more details added, the more comprehensive my understanding can be -- and perhaps a consensus reached about the problem. These are a list of relevant options, without respect to feasibility:

* Don't bother to fix the bug; allow Python to crash with a subtle bug that often ta
Re: Negative array indices and slice()
On 11/01/2012 12:07 PM, Ian Kelly wrote:
> On Thu, Nov 1, 2012 at 5:32 AM, Andrew Robinson wrote:
>> Hmmm, was that PEP the active state of Python when Tim rejected the
>> bug report?
>
> Yes. The PEP was accepted and committed in March 2006 for release in
> Python 2.5. The bug report is from June 2006 and has a version
> classification of Python 2.5, although 2.5 was not actually released
> until September 2006.

That explains Peters' remark. Thank you. He looks *much* smarter now.

>> PEP 357 merely added cruft with __index__(), but really solved
>> nothing. Everything __index__() does could be implemented in
>> __getitem__ and usually is.
>
> No. There is a significant difference between implementing this on the
> container versus implementing it on the indexes. Ethan implemented his
> string-based slicing on the container, because the behavior he wanted
> was specific to the container type, not the index type. Custom index
> types like numpy integers on the other hand implement __index__ on the
> index type, because they apply to all sequences, not specific
> containers.

Hmmm... D'Aprano didn't like the monkey-patch, and subclassing was his fix-all. Part of my summary is based on that conversation with him, and you touched on one of the unfinished points; I responded to him that I thought __getitem__ was under-developed. The object slice() has no knowledge of the size of the sequence, nor can it get that size on its own; it must passively wait for it to be given to it. The bottom line is: __getitem__ must always *PASS* len( seq ) to slice() each *time* the slice() object is used. Since this is the case, it would have been better to have list itself have a default member which takes the raw slice indices and does the conversion itself. The size would not need to be duplicated or passed -- memory savings & speed savings...

I'm just clay-pigeoning an idea out here... Let's apply D'Aprano's logic to numpy: numpy could just have subclassed *list*, so let's ignore pure python as a reason to do anything on behalf of numpy. Then let's consider all third-party classes. These are where subclassing becomes a pain -- BUT: I think those could all have been injected.

>>> class ThirdParty( list ):   # Pretend this is someone else's...
...     def __init__(self): return
...     def __getitem__(self, aSlice): return aSlice
...

We know it will default-work like this:

>>> a = ThirdParty()
>>> a[1:2]
slice(1, 2, None)

So, here's an injection...

>>> ThirdParty.superOnlyOfNumpy__getitem__ = MyClass.__getitem__
>>> ThirdParty.__getitem__ = lambda self, aSlice: ( 1, 3, self.superOnlyOfNumpy__getitem__( aSlice ).step )
>>> a[5:6]
(1, 3, None)

Numpy could have exported a (workable) function that would modify other list functions to affect ONLY numpy data types (eg: a filter). This allows users creating their own classes to inject them with Numpy's filter only when they desire. Recall Tim Peters' "explicit is better than implicit" Zen?

Most importantly, normal programs not using Numpy wouldn't have had to carry around an extra API check for __index__() *every* single time the heavily used [::] happened. Memory & speed, both. It's also a monkey-patch, in that __index__() allows *conflicting* assumptions, in violation of the unexpected monkey-patch interaction worry. eg: Numpy *CAN* release an __index__() function on their floats -- at which point a basic no-touch class (list itself) will now accept float as an index, in direct contradiction of PEP 357's comment on floats... see?

My point isn't that this particular implementation I have shown is the best (or even really safe; I'd have to think about that for a while). Go ahead and shoot it down... My point is that the methods found in slice() and __index__() have moved all the code regarding a sequence *out* of the object which has information on that sequence. It smacks of legacy.

The Python parser takes values from many other syntactical constructions and passes them directly to their respective objects -- but in the case of list(), we have a complicated relationship, and not for any reason that can't be handled in a simpler way.

Don't consider the present API legacy for a moment; I'm asking hypothetical design questions: How many users actually keep slice() around from every instance of [::] they use? If it is rare, why create the slice() object in the first place and constantly be allocating and de-allocating memory, twice over (once for the original, and once for the repetitive method which computes dynamic values)? Would a single mutable have less overhead, since it is destroyed anyway?
Re: Negative array indices and slice()
Hi Ian,

I apologize for trying your patience with the badly written code example. All objects were meant to be ThirdParty(); the demo was only to show how a slice() filter could have been applied for the reasons PEP 357 made __index__() exist -- eg: because numpy items passed to __getitem__ via slice syntax [::] were illegal values.

PEP 357 is the one which specifically mentioned Numpy types -- which is the only reason I used the name in the example; I could have just as well used a string. I am fully aware of what numpy does -- I have used it, modified the fortran interfaces underneath, etc. The __index__() method, however, affects *all* list objects in Python, not just Numpy's -- correct?

I'll write a working piece of code tomorrow to demonstrate the filter very clearly, rather than a skeleton, and test it before posting.
Memory profiling: Python 3.2
When Python 3.2 is running, is there an easy way within Python to capture the *total* amount of heap space the program is actually using (eg: real memory)? And how much of that heap space is allocated to variables (including re-capturable data not yet GC'd)?
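(A minimal sketch of one way to probe this from inside the process, assuming a Unix host; the units of ru_maxrss are platform-dependent -- kilobytes on Linux, bytes on some BSDs:)

    import gc, sys, resource

    gc.collect()    # collect re-capturable garbage first
    usage = resource.getrusage(resource.RUSAGE_SELF)
    print("peak resident set size:", usage.ru_maxrss)

    # Individual objects can be sampled with sys.getsizeof; note that
    # it counts only the object itself, not the things it references.
    data = list(range(1000))
    print("the list object alone:", sys.getsizeof(data), "bytes")

--
http://mail.python.org/mailman/listinfo/python-list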
Fwd: Re: Negative array indicies and slice()
Forwarded to python list:

Original Message
Subject: Re: Negative array indicies and slice()
Date: Sat, 03 Nov 2012 15:32:04 -0700
From: Andrew Robinson
Reply-To: [email protected]
To: Ian Kelly <>

On 11/01/2012 05:32 PM, Ian Kelly wrote: On Thu, Nov 1, 2012 at 4:25 PM, Andrew Robinson: The bottom line is: __getitem__ must always *PASS* len( seq ) to slice() each *time* the slice() object is used. Since this is the case, it would have been better to have list, itself, have a default member which takes the raw slice indices and does the conversion itself. The size would not need to be duplicated or passed -- memory savings, & speed savings...

And then tuple would need to duplicate the same code. As would deque. And str. And numpy.array, and anything else that can be sliced, including custom sequence classes.

I don't think that's true. A generic function can be shared among different objects without being embedded in an external index data structure, to boot! If *self* were passed to an index conversion function (as would naturally happen anyway if it were a method), then the method could take len( self ) without knowing what the object is; should the object be sliceable, len() will definitely return the required piece of information.

Numpy arrays are very different internally from lists.

Of course! (Although lists do allow nested lists.)

I'm not understanding what this is meant to demonstrate. Is "MyClass" a find-replace error of "ThirdParty"? Why do you have __getitem__ returning slice objects instead of items or subsequences? What does this example have to do with numpy?

Here's a very cleaned-up example file, cut-and-pastable:

#!/bin/env python
# File: sliceIt.py --- a pre-PEP 357 hypothesis test skeleton

class Float16():
    """
    Numpy creates a float type with very limited precision -- float16.
    Rather than force you to install np for this test, I'm just making
    a faux object; normally we'd just "import np".
    """
    def __init__(self, value):
        self.value = value

    def AltPEP357Solution(self):
        """ This is doing exactly what __index__ would be doing. """
        return None if self.value is None else int( self.value )


class ThirdParty( list ):
    """
    A simple class to implement a list wrapper, having all the
    properties of a normal list -- but explicitly showing portions
    of the interface.
    """
    def __init__(self, aList):
        self.aList = aList

    def __getitem__(self, aSlice):
        print( "__getitems__", aSlice )
        temp = []
        edges = aSlice.indices( len( self.aList ) )  # *unavoidable* call
        for i in range( *edges ):
            temp.append( self.aList[i] )
        return temp


def Inject_FloatSliceFilter( theClass ):
    """
    This is a courtesy function to allow injecting (duck punching)
    a float index filter into a user object.
    """
    def Filter_FloatSlice( self, aSlice ):
        # Single-index retrieval filter:
        try:
            start = aSlice.AltPEP357Solution()
        except AttributeError:
            pass
        else:
            return self.aList[ start ]
        # Slice retrieval filter:
        try:
            start = aSlice.start.AltPEP357Solution()
        except AttributeError:
            start = aSlice.start
        try:
            stop = aSlice.stop.AltPEP357Solution()
        except AttributeError:
            stop = aSlice.stop
        try:
            step = aSlice.step.AltPEP357Solution()
        except AttributeError:
            step = aSlice.step
        print( "Filter To", start, stop, step )
        return self.super_FloatSlice__getitem__( slice(start, stop, step) )
    theClass.super_FloatSlice__getitem__ = theClass.__getitem__
    theClass.__getitem__ = Filter_FloatSlice
# EOF: sliceIt.py

Example run:

>>> from sliceIt import *
>>> test = ThirdParty( [1,2,3,4,5,6,7,8,9] )
>>> test[0:6:3]
('__getitems__', slice(0, 6, 3))
[1, 4]
>>> f16 = Float16(8.3)
>>> test[0:f16:2]
('__getitems__', slice(0, <sliceIt.Float16 instance at 0x...>, 2))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sliceIt.py", line 26, in __getitem__
    edges = aSlice.indices( len( self.aList ) )  # *unavoidable* call
TypeError: object cannot be interpreted as an index
>>> Inject_FloatSliceFilter( ThirdParty )
>>> test[0:f16:2]
('Filter To', 0, 8, 2)
('__getitems__', slice(0, 8, 2))
[1, 3, 5, 7]
>>> test[f16]
9

We could also require the user to explicitly declare when they're performing arithmetic on variables that might not be floats. Then we can turn off run-time type checking unless the user explicitly requests it, all in the name of micro-optimization and explicitness. :)

None of those would help micro-optimization that I can see. Seriously, whether x is usab
Re: Multi-dimensional list initialization
On 11/04/2012 10:27 PM, Demian Brecht wrote: So, here I was thinking "oh, this is a nice, easy way to initialize a 4D matrix" (running 2.7.3, non-core libs not allowed): m = [[None] * 4] * 4 The way to get what I was after was: m = [[None] * 4, [None] * 4, [None] * 4, [None * 4]]

FYI: the behavior is the same in Python 3.2. m=[[None]*4]*4 produces a nested list whose four outer elements are all references to the *same* inner list. I agree, the result is very counter-intuitive; hmmm... but I think you meant:
m = [[None] * 4, [None] * 4, [None] * 4, [None] * 4]
rather than:
m = [[None] * 4, [None] * 4, [None] * 4, [None * 4]]
? :) ?

I asked a why question on another thread and watched several dodges to the main question; I'll be watching to see if you get anything other than "that's the way it's defined in the API". IMHO, that's not a real answer. My guess is that the original implementation never considered anything beyond a 1-D list. :)

A more precise related question might be: is there a way to force the replication operator to use copying rather than referencing? :/
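For reference, the stock idiom that does give four independent rows is a comprehension (nothing hypothetical here -- this is ordinary Python):

    m = [[None] * 4 for _ in range(4)]   # four distinct inner lists
    m[0][0] = 1                          # changes only the first row
    # With m = [[None]*4]*4 the same assignment would show up in all
    # four rows, because every row is the same list object.

--
http://mail.python.org/mailman/listinfo/python-list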
Re: Multi-dimensional list initialization
On 11/04/2012 11:27 PM, Chris Angelico wrote: On Mon, Nov 5, 2012 at 6:07 PM, Chris Rebert wrote:

>>> x = None
>>> x.a = 42
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'a'

Python needs a YouGottaBeKiddingMeError for times when you do something utterly insane like this. Attributes of None??!? :) ChrisA

Hmmm? Everything in Python is an object. Therefore! SURE. None *does* have attributes! (even if not useful ones...) eg: None.__getattribute__( "__doc__" ) doesn't produce an error.

In C, in Linux, at the end of the file "errno.h", where all the error codes are listed (EIO, EAGAIN, EBUSY, ...), they had a final error like the one you dreamed up. It was called "EIEIO"; and the comment read something like, "All the way around Elmer's barn". :)

The poster just hit that strange wall -- *all* built-in types are injection-proof; and that property is both good and bad...

--
http://mail.python.org/mailman/listinfo/python-list
Re: Multi-dimensional list initialization
On 11/05/2012 06:30 PM, Oscar Benjamin wrote:
On 6 November 2012 02:01, Chris Angelico wrote:
On Tue, Nov 6, 2012 at 12:32 PM, Oscar Benjamin
wrote:
I was just thinking to myself that it would be a hard thing to change
because the list would need to know how to instantiate copies of all
the different types of the elements in the list. Then I realised it
doesn't. It is simply a case of how the list multiplication operator
is implemented and whether it chooses to use a reference to the same
list or make a copy of that list. Since all of this is implemented
within the same list type it is a relatively easy change to make
(ignoring backward compatibility concerns).
I don't see this non-copying list multiplication behaviour as
contradictory but has anyone ever actually found a use for it?
Stupid example of why it can't copy:
bad = [open("test_file")] * 4
How do you clone something that isn't Plain Old Data? Ultimately,
that's where the problem comes from. It's easy enough to clone
something that's all scalars (strings, integers, None, etc) and
non-recursive lists/dicts of scalars, but anything more complicated
than that is rather harder.
That's not what I meant. But now you've made me realise that I was
wrong about what I did mean. In the case of
stuff = [[obj] * n] * m
I thought that the multiplication of the inner list ([obj] * n) by m
could create a new list of lists using copies. On closer inspection I
see that the list being multiplied is in fact [[obj] * n] and that
this list can only know that it is a list of lists by inspecting its
element(s) which makes things more complicated.
I retract my claim that this change would be easy to implement.
Oscar
Hi Oscar,
In general, people don't use element multiplication (that I have *ever*
seen) to make lists where all elements of the outermost list point to
the same sub-*list* by reference. The most common use of the
multiplication is to fill an array with a constant, or short list of
constants; hence, almost everyone has to work around the issue as the
initial poster did by using a much longer construction.
The most compact notation in programming really ought to reflect the
most *commonly* desired operation. Otherwise, we're really just making
people do extra typing for no reason.
Further, list comprehensions take quite a bit longer to run than
low-level copies -- by a factor of roughly 10. So, it really would be worth
implementing the underlying logic -- even if it wasn't super easy.
I really don't think doing a shallow copy of lists would break anyone's
program.
The non-list elements, whatever they are, can be left as reference
copies -- but any element which is a list ought to be shallow copied.
The behavior observed in the opening post where modifying one element of
a sub-list modifies all elements of all sub-lists is never desired, as
far as I have ever witnessed.
The underlying implementation of Python can check an object type
trivially, and the only routine needed is a shallow list copy. So, no,
it really isn't a complicated operation to do shallow copies of lists.
:)
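A rough sketch of the copy rule I have in mind -- hypothetical semantics, not what Python does today; nested *lists* are duplicated all the way down, everything else stays a reference:

    def list_mul(seq, n):
        def copy_lists(obj):
            # duplicate lists recursively; keep references to all else
            if isinstance(obj, list):
                return [copy_lists(x) for x in obj]
            return obj
        return [copy_lists(item) for _ in range(n) for item in seq]

    m = list_mul([[None] * 4], 4)   # four independent rows
    m[0][0] = 1                     # m[1][0] is still None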
--
http://mail.python.org/mailman/listinfo/python-list
Re: Multi-dimensional list initialization
On 11/05/2012 10:07 PM, Chris Angelico wrote: On Tue, Nov 6, 2012 at 4:51 PM, Andrew Robinson wrote: I really don't think doing a shallow copy of lists would break anyone's program.

Well, it's a change, a semantic change. It's almost certainly going to break _something_. But for the sake of argument, we can suppose that the change could be made. Would it be the right thing to do? Shallow copying by default would result in extremely weird behaviour. All the same confusion would result, only instead of comparing [None]*4 with [[None]]*4, there'd be confusion over the difference between [[None]]*4 and [[[None]]]*4. I don't think it would help anything, and it'd result in a lot more work for no benefit. ChrisA

I don't follow.

a=[ None ]*4 would give a=[ None, None, None, None ] as usual. All four None's would be the same object, but there are automatically 4 different pointers to it. Hence, a[0]=1 would give a=[ 1, None, None, None ] as usual.

a=[ [None] ]*4 would give a=[ [None], [None], [None], [None] ] as usual. BUT: a[0][0] = 1 would no longer give a=[ [1], [1], [1], [1] ]; *rather* it would give a=[ [1], [None], [None], [None] ]. The None objects are all still the same one, BUT the lists themselves are different.

Again, a=[ ["alpha","beta"] ]*4 would give:
a=[ ["alpha","beta"], ["alpha","beta"], ["alpha","beta"], ["alpha","beta"] ]
All four strings, "alpha", are the same object -- but there are 5 different lists; the pointers inside the initial list are copied four times -- not the string objects; but the *lists* themselves are created new for each replication.

If you nest it another time, [[[None]]]*4, the same would happen; all lists would be independent -- but the objects which aren't lists would be referenced, not copied.

a=[[["alpha","beta"]]]*4 would yield:
a=[[['alpha', 'beta']], [['alpha', 'beta']], [['alpha', 'beta']], [['alpha', 'beta']]]
and a[0][0]=1 would give:
a=[[1], [['alpha', 'beta']], [['alpha', 'beta']], [['alpha', 'beta']]]
rather than a=[[1], [1], [1], [1]].

Or at another level down, a[0][0][0]=1 would give:
a=[[[1, 'beta']], [['alpha', 'beta']], [['alpha', 'beta']], [['alpha', 'beta']]]
rather than:
a=[[[1, 'beta']], [[1, 'beta']], [[1, 'beta']], [[1, 'beta']]]

The point is, there would be no difference at all noticed in what data is found where in the array; the *only* thing that would change is that replacing an item by assignment would only affect the *location* assigned to -- all other locations would not be affected. That really is what people *generally* want. If the entire list is meant to be read-only, the change would affect *nothing* at all.

See if you can find *any* Python program where people desired the multiplication to have the side effect that changing an object in one of the sub-lists changes all the objects in the other sub-lists. I'm sure you're not going to find it -- and even if you do, it's going to be 1 program in 1000's.

--
http://mail.python.org/mailman/listinfo/python-list
Re: Multi-dimensional list initialization
On 11/06/2012 06:35 AM, Oscar Benjamin wrote:

> In general, people don't use element multiplication (that I have *ever* seen) to make lists where all elements of the outermost list point to the same sub-*list* by reference. The most common use of the multiplication is to fill an array with a constant, or short list of constants; hence, almost everyone has to work around the issue as the initial poster did, by using a much longer construction.

That's what I have seen as well. I've never seen an example where someone wanted this behaviour.

> The most compact notation in programming really ought to reflect the most *commonly* desired operation. Otherwise, we're really just making people do extra typing for no reason.

It's not so much the typing as the fact that this is a common gotcha. Apparently many people expect different behaviour here. I seem to remember finding this surprising at first.

:) That's true as well.

> Further, list comprehensions take quite a bit longer to run than low-level copies; by a factor of roughly 10. So, it really would be worth implementing the underlying logic -- even if it wasn't super easy.
>
> I really don't think doing a shallow copy of lists would break anyone's program. The non-list elements, whatever they are, can be left as reference copies -- but any element which is a list ought to be shallow copied. The behavior observed in the opening post, where modifying one element of a sub-list modifies all elements of all sub-lists, is never desired as far as I have ever witnessed.

It is a semantic change that would, I imagine, break many things in subtle ways.

?? Do you have any guesses how?

> The underlying implementation of Python can check an object type trivially, and the only routine needed is a shallow list copy. So, no, it really isn't a complicated operation to do shallow copies of lists.

Yes, but if you're inspecting the object to find out whether to copy it, what do you test for? If you check for a list type, what about subclasses? What if someone else has a custom list type that is not a subclass? Should there be a dunder method for this?

No dunder methods. :) Custom non-subclass list types aren't a common usage for list multiplication in any event. At present one has to use list comprehensions for that, and that would simply remain so. Subclasses, however, are something I hadn't considered...

I don't think it's such a simple problem. Oscar

You made a good point, Oscar; I'll have to think about the subclassing a bit. :)

--
http://mail.python.org/mailman/listinfo/python-list
Re: Multi-dimensional list initialization
On 11/06/2012 09:32 AM, Prasad, Ramit wrote: Ian Kelly wrote: On Tue, Nov 6, 2012 at 1:21 AM, Andrew Robinson [snip] See if you can find *any* Python program where people desired the multiplication to have the side effect that changing an object in one of the sub-lists changes all the objects in the other sub-lists. I'm sure you're not going to find it -- and even if you do, it's going to be 1 program in 1000's.

Per the last thread where we discussed extremely rare scenarios, shouldn't you be rounding "1 in 1000s" up to 20%? ;-)

:D -- Ian -- also consider that I *am* willing to use extra memory. Not everything can be shrunk to nothing and still remain functional. :) So, it isn't *all* about *micro* optimization -- it's also about psychology and flexibility.

Actually, I would be surprised if it was even 1 in 1000. Of course, consistency makes it easier to learn and *remember*. I value that far more than a minor quirk that is unlikely to bother me now that I know of it. Well, at least not as long as I do not forget my morning coffee/tea :)

But, having it copy lists -- when the only purpose of the multiplication is for lists -- is only a minor quirk as well.

~Ramit

--
http://mail.python.org/mailman/listinfo/python-list
Re: Multi-dimensional list initialization
On 11/06/2012 01:19 AM, Ian Kelly wrote: On Tue, Nov 6, 2012 at 1:21 AM, Andrew Robinson: If you nest it another time, [[[None]]]*4, the same would happen; all lists would be independent -- but the objects which aren't lists would be referenced, not copied. [examples snipped]

You wrote "shallow copy". When the outer-level list is multiplied, the mid-level lists would be copied. Because the copies are shallow, although the mid-level lists are copied, their contents are not. Thus the inner-level lists would still be all referencing the same list. To demonstrate:

I meant all lists are shallow copied, from the innermost level out. Equivalently, it's a deep copy of list objects -- but a shallow copy of any list contents except other lists.

>>> from copy import copy
>>> class ShallowCopyList(list):
...     def __mul__(self, number):
...         new_list = ShallowCopyList()
...         for _ in range(number):
...             new_list.extend(map(copy, self))
...         return new_list
...

That type of copy is not equivalent to what I meant; it's a shallow copy only of non-list objects.

This shows that assignments at the middle level are independent with a shallow copy on multiplication, but assignments at the inner level are not. In order to achieve the behavior you describe, a deep copy would be needed.

Yes, it can be considered a deep copy of *all* list objects -- but not of non-list contents. It's a terminology issue -- and you're right -- I need to be more precise.

That really is what people *generally* want. If the entire list is meant to be read-only, the change would affect *nothing* at all.

The time and memory cost of the multiplication operation would become quadratic instead of linear.

Perhaps, but the copy would still not be _nearly_ as slow as a list comprehension! Being super fast when no one uses the output is "going nowhere fast." I think it's better to get to the right place at a medium speed than nowhere fast; list comprehensions *do* get to the right place, but *quite* slowly. They are both quadratic, *and* multiple tokenized steps. :)
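The demonstration Ian describes can be reproduced with a short driver (his class verbatim; the driver lines are a guess at what he showed):

    from copy import copy

    class ShallowCopyList(list):
        def __mul__(self, number):
            new_list = ShallowCopyList()
            for _ in range(number):
                new_list.extend(map(copy, self))
            return new_list

    a = ShallowCopyList([[[0]]])   # outer -> mid -> inner nesting
    b = a * 3
    b[0].append(1)      # mid-level lists were copied: only b[0] changes
    b[0][0].append(2)   # inner list is shared: every row now sees [0, 2]
    print(b)            # [[[0, 2], 1], [[0, 2]], [[0, 2]]]

--
http://mail.python.org/mailman/listinfo/python-list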
Re: Multi-dimensional list initialization
On 11/06/2012 01:04 AM, Steven D'Aprano wrote:
On Mon, 05 Nov 2012 21:51:24 -0800, Andrew Robinson wrote:
The most compact notation in programming really ought to reflect the
most *commonly* desired operation. Otherwise, we're really just making
people do extra typing for no reason.
There are many reasons not to put minimizing of typing ahead of all other
values:
I didn't. I put it ahead of *some* values for the sake of practicality
and human psychology.
" Practicality beats purity. "
* Typically, code is written once and read many times. Minimizing
typing might save you a second or two once, and then cost you many
seconds every time you read the code. That's why we tell people to
choose meaningful variable names, instead of naming everything "a"
and "b".
Yes. But this isn't going to cost any more time than figuring out
whether or not the list multiplication is going to cause quirks,
itself. Human psychology *tends* (it's a FAQ!) to automatically assume
the purpose of the list multiplication is to pre-allocate memory for the
equivalent (using lists) of a multi-dimensional array. Note the OP even
said "4d array".
The OP's original construction was simple, elegant, easy to read and
very commonly done by newbies learning the language because it's
*intuitive*. His second try was still intuitive, but less easy to read,
and not as elegant.
* Consistency of semantics is better than a plethora of special
cases. Python has a very simple and useful rule: objects should
not be copied unless explicitly requested to be copied. This is
much better than having to remember whether this operation or
that operation makes a copy. The answer is consistent:
Bull. Even in the last thread I noted the range() object produces
special cases.
>>> range(0,5)[1]
1
>>> range(0,5)[1:3]
range(1, 3)
>>>
The principle involved is that it gives you what you *usually* want; I
read some of the documentation on why Python 3 chose to implement it
this way.
(pardon me for belabouring the point here)
Q: Does [0]*10 make ten copies of the integer object?
A: No, list multiplication doesn't make copies of elements.
Neither would my idea for the vast majority of things on your first list.
Q: What about [[]]*10?
A: No, the elements are never copied.
YES! For the obvious reason that such a construction is making mutable
lists that the user wants to populate later. If they *didn't* want to
populate them later, they ought to have used tuples -- which take less
overhead. Who even does this thing you are suggesting?!
>>> a=[[]]*10
>>> a
[[], [], [], [], [], [], [], [], [], []]
>>> a[0].append(1)
>>> a
[[1], [1], [1], [1], [1], [1], [1], [1], [1], [1]]
Oops! Damn, not what anyone normal wants
Q: How about if the elements are subclasses of list?
A: No, the elements are never copied.
Another poster brought that point up -- it's something I would have to
study before answering.
It's a valid objection.
Q: What about other mutable objects like sets or dicts?
A: No, the elements are never copied.
They aren't list multiplication compatible in any event! It's a total
nonsense objection.
If these are inconsistent in my idea -- OBVIOUSLY -- they are
inconsistent in Python's present implementation. You can't even
reference-duplicate them NOW.
>>> { 1:'a', 2:'b', 3:'c' } * 2
Traceback (most recent call last):
File "", line 1, in
TypeError: unsupported operand type(s) for *: 'dict' and 'int'
Q: How about on Tuesdays? I bet they're copied on Tuesdays.
A: No, the elements are never copied.
That's really a stupid objection, and everyone knows it.
" Although that way may not be obvious at first unless you're Dutch. "
Your proposal throws away consistency for a trivial benefit on a rare use-
case, and replaces it with a bunch of special cases:
RARE? You are NUTS!
Q: What about [[]]*10?
A: Oh yeah, I forgot about lists, they're copied.
Yup.
Q: How about if the elements are subclasses of list?
A: Hmmm, that's a good one, I'm not actually sure.
Q: How about if I use delegation to proxy a list?
A: Oh no, they definitely won't be copied.
Give an example usage of why someone would want to do this. Then we can
discuss it.
Q: What about other mutable objects like sets or dicts?
A: No, definitely not. Unless people complain enough.
now you're just repeating yourself to make your contrived list longer --
but there are no new objections...
Losing consistency in favour of saving a few characters for something as
uncommon as list multiplication is a poor tradeoff. That&
Re: Multi-dimensional list initialization
Hi IAN!
On 11/06/2012 03:52 PM, Ian Kelly wrote:
On Tue, Nov 6, 2012 at 3:41 PM, Andrew Robinson
The objection is not nonsense; you've merely misconstrued it. If
[[1,2,3]] * 4 is expected to create a mutable matrix of 1s, 2s, and
3s, then one would expect [[{}]] * 4 to create a mutable matrix of
dicts. If the dicts are not copied, then this fails for the same
reason
:) The idea does create a mutable list of dicts; just not a mutable
list of different dicts.
Q: How about if I use delegation to proxy a list?
A: Oh no, they definitely won't be copied.
Give an example usage of why someone would want to do this. Then we can
discuss it.
Seriously? Read a book on design patterns. You might start at SO:
http://stackoverflow.com/questions/832536/when-to-use-delegation-instead-of-inheritance
:) I wasn't discarding the argument, I was asking for a use case to
examine.
I know what delegation *is*; but I'm not spending lots of time
thinking about this issue.
(Besides this thread just went more or less viral, and I can't keep up)
I have a book on design patterns -- in fact, the one called "Design
Patterns" by Gamma, Helm, Johnson, Vlissides. (Is it out of date
already or something?)
Please link to the objection being proposed to the developers, and
their reasoning for rejecting it. I think you are exaggerating.
From Google:
http://bugs.python.org/issue1408
http://bugs.python.org/issue12597
http://bugs.python.org/issue9108
http://bugs.python.org/issue7823
Note that in two out of these four cases, the reporter was trying to
multiply lists of dicts, not just lists of lists.
That's helpful. Thanks. I'll look into these.
Besides, 2D arrays are *not* rare and people *have* to copy internals of
them very often.
The copy speed will be the same or *faster*, and the typing less -- and the
psychological mistakes *less*, the elegance more.
List multiplication is not potentially useful for copying 2D lists,
only for initializing them. For copying an existing nested list,
you're still stuck with either copy.deepcopy() or a list
comprehension.
Yes, I totally agree.
But, as far as I know -- the primary use of list multiplication is
initialization.
That was my point about the most compact notation ought to be for the
most common case.
Initialization is a very common use case.
List comprehensions are appropriate for the other's.
Even D'Aprano thought the * operator was not a common operation; and I
suppose that when compared to other operations done in a program
(relative counting) he's correct; most programs are not primarily matrix
or initialization oriented.
It's hardly going to confuse anyone to say that lists are copied with list
multiplication, but the elements are not.
Every time someone passes a list to a function, they *know* that the list is
passed by value -- and the elements are passed by reference. People in
Python are USED to lists being "the" way to weird behavior that other
languages don't do.
Incorrect. Python uses what is commonly known as call-by-object, not
call-by-value or call-by-reference. Passing the list by value would
imply that the list is copied, and that appends or removes to the list
inside the function would not affect the original list.
Interesting, you avoided the main point "lists are copied with list
multiplication".
But, in any event:
_Pass_ by value (not call by value) is a term stretching back 30 years;
eg: when I learned the meaning of the words. Rewording it as "Call by
value" is something that happened later, and the nuance is lost on those
without a very wide programming knowledge *and* age.
In any event:
All objects in Python are based on pointers; all parameters passed to
functions, etc, are *copies* of those pointers; (by pointer value).
I made the distinction between contents of the list and the list object
itself for that reason; I gave an explicit correction to the _pass_ by
"value" generalization by saying: ("the elements are passed by reference").
The concept I gave, although archaically stated -- still correctly
represents what actually happens in Python and can be seen from its
source code(s).
The point I am making is not generally true of everyone learning
Python, for some people obviously learn it from scratch. But, for
people who learn the language after a transition, this is a common FAQ;
how do I modify the variables by reference and not by value; -- the
answer is, you can't -- you must embed the return value in another
object; parameters are always passed the *same* way.
Every function written, then, has to decide when objects are passed to
it -- whether to modify or copy the object (internals) when modifying
it. That's all I meant.
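In code, the distinction I mean (a stock example, nothing hypothetical):

    def foo(x, y):
        x = [9]          # rebinds the local name; the caller never sees it
        y.append(9)      # mutates the object; the caller sees it

    a, b = [1], [2]
    foo(a, b)
    print(a, b)          # [1] [2, 9]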
--
http://mail.python.org/mailman/listinfo/python-list
Re: Multi-dimensional list initialization
On 11/06/2012 05:55 PM, Steven D'Aprano wrote: On Tue, 06 Nov 2012 14:41:24 -0800, Andrew Robinson wrote: Yes. But this isn't going to cost any more time than figuring out whether or not the list multiplication is going to cause quirks, itself. Human psychology *tends* (it's a FAQ!) to automatically assume the purpose of the list multiplication is to pre-allocate memory for the equivalent (using lists) of a multi-dimensional array. Note the OP even said "4d array".

I'm not entirely sure what your point is here. The OP screwed up -- he didn't generate a 4-dimensional array. He generated a 2-dimensional array. If his intuition about the number of dimensions is so poor, why should his intuition about list multiplication be treated as sacrosanct?

Yes, he did screw up. There is a great deal of value in studying how people screw up, and designing interfaces which tend to discourage it. "Candy machine interfaces."

As they say, the only truly intuitive interface is the nipple.

No it's not -- that interface really sucks. :) Have you ever seen a cat trying to suck a human nipple? Or have you ever asked a young child who was weaned early, and doesn't remember nursing, what a breast is for? Once the oral stage is left, remaining behavior must be re-learned.

There are many places where people's intuition about programming fails. And many places where Fred's intuition is the opposite of Barney's intuition.

OK. But that doesn't mean that *all* places have opposite intuition; nor does it mean that one intuition which is statistically *always* wrong shouldn't be discouraged, or re-routed into useful behavior.

Take the candy machine: if the items being sold are listed by number, and the prices are also numbers, it's very easy to type in the price instead of the item number, because one *forgets* that the numbers have different meanings and the machine can't always tell from the price which item a person wanted (duplicate prices...). Hence a common mistake: people get the wrong item by typing in the price. By merely avoiding a numeric keypad, the user is re-routed into choosing the correct item by not being able to make the mistake. For this reason, Python tends to *like* things such as named parameters, and occasionally enforces their use, etc.

Even more exciting, there are places where people's intuition is *inconsistent*, where they expect a line of code to behave differently depending on their intention, rather than on the code. And intuition is often sub-optimal: e.g. isn't it intuitively obvious that "42" + 1 should give 43? (Unless it is intuitively obvious that it should give 421.)

I agree, and in places where an *exception* can be raised, it's appropriate to do so. Ambiguity, like the candy machine, is *bad*.

So while I prefer intuitively obvious behaviour where possible, it is not the holy grail, and I am quite happy to give it up.

"Where possible" -- OK, fine, I agree. I'm not "happy" to give it up, but I am willing. I don't like the man-hours wasted on ambiguous behavior; and I don't think that should ever make someone "happy".

The OP's original construction was simple, elegant, easy to read and very commonly done by newbies learning the language because it's *intuitive*. His second try was still intuitive, but less easy to read, and not as elegant.

Yes. And list multiplication is one of those areas where intuition is suboptimal -- it produces a worse outcome overall, even if one minor use-case gets a better outcome. I'm not disputing that [[0]*n]*m is intuitively obvious and easy.
I'm disputing that this matters. Python would be worse off if list multiplication behaved intuitively.

How would it be worse off?

I can agree, for example, that in "C", realloc is too general. One can't look at the line where realloc is being used and decide whether it is 1) mallocing, 2) deleting, or 3) resizing. Number (3) is the only non-redundant behavior the function provides.

There is, perhaps, a very clear reason that I haven't discovered why the extra functionality in list multiplication would be bad. That reason is *not* because list multiplication is unable to solve all the copying problems in the world (realloc is bad precisely because of that); but a function ought to do at least *one* thing well.

Draw up some use cases for the multiplication operator (I'm calling on your experience; let's not trust mine, right?): what are all the typical ways people *do* use it now? If those use cases do not *primarily* center around *wanting* an effect explicitly caused by reference duplication -- then it may be better to abolish list multiplication altogether, and rather improve the list comprehensions to overcome the memory, clarity,
Re: Multi-dimensional list initialization
On 11/06/2012 10:56 PM, Demian Brecht wrote: My question was *not* based on what I perceive to be intuitive (although most of this thread has now seemed to devolve into that and become more of a philosophical debate), but was based on what I thought may have been inconsistent behaviour (which was quickly cleared up with None being immutable and causing it to *seem* that the behaviour was inconsistent to the forgetful mind).

I originally brought up "intuitive"; and I don't consider the word to mean an "exclusive" BEST way -- I meant it to mean easily guessed or understood. An intelligent person can see when there may be more than one reasonable explanation -- ergo: I just called your OP intelligent, even if you were wrong; and D'Aprano ripped you for being wrong. The debate is degenerating because people are _subjectively_ judging other people's intelligence. The less intelligent a person is, the more black-and-white their judgements _tend_ to be.

As you touch on here, "intuition" is entirely subjective. If you're coming from a C/C++ background, I'd think that your intuition would be that everything's passed by value unless explicitly stated.

Yup -- that's my Achilles' heel and bias, I'm afraid. I learned BASIC, then assembly, then Pascal, and then FORTRAN 77 with C (historically in that order). In my view, pass by value vs. reference always exists at the hardware/CPU level regardless of the language, and regardless of whether the language hides the implementation details or not. I'm an EE; I took software engineering to understand the clients who use my hardware, and to make my hardware drivers understandable to them by good programming practices. An EE's perspective often leads to doing efficient things which are hard to understand; that's why I look for a consensus (not a compromise) before implementing speed/memory improvements, and ways to clarify what is being done.

Someone coming from another background (Lua perhaps?) would likely have entirely different intuition.

Yes, they might be ignorant of what Lua is doing at the hardware level, even though it *is* doing it.

So while I prefer intuitively obvious behaviour where possible, it is not the holy grail, and I am quite happy to give it up.

I fail to see where there has been any giving up on intuitiveness in the context of this particular topic. In my mind, intuitiveness is generally born of repetitiveness and consistency.

YES. I think a good synonym would be habit; and when a habit is good it's called strength, or "virtue"; when it's bad it's called "vice" or "sin" or "bad programming habit." :) Virtues don't waste people's time in debugging.

As everything in Python is a reference, it would seem to me to be inconsistent to treat expressions such as [[obj]*4]*4 un-semantically (Pythonically speaking) and making it *less* intuitive. I agree that Python would definitely be worse off.

That's a fair opinion. I was pleasantly surprised when the third poster actually answered the "WHY" question with the idea that Python always copies by reference unless forced to do a deep copy. That's intuitive, and as a habit (not a requirement) Python implements things that way.

I've already raised the question about why one would want a multiplier at all, if it were found that the main desired use case never *wants* all objects to change together.
I laid out a potential modification of list comprehensions, which, BTW, copy by re-instantiating rather than by reference; so the paradigm of Python is wrong in that case. But I think the modifications in that context can't be argued against as easily as list multiplication (for the same reason that comprehensions already break the copy-by-reference mold).

--
http://mail.python.org/mailman/listinfo/python-list
Re: Multi-dimensional list initialization
On 11/07/2012 05:39 AM, Joshua Landau wrote: On 7 November 2012 11:11, Oscar Benjamin wrote: On Nov 7, 2012 5:41 AM, "Gregory Ewing" wrote:

> If anything is to be done in this area, it would be better
> as an extension of list comprehensions, e.g.
>
> [[None times 5] times 10]
>
> which would be equivalent to
>
> [[None for _i in xrange(5)] for _j in xrange(10)]

Oscar, I'm really in agreement with you; I think it's better to group all *special* array/list constructions into a single logical unit which will show up in the same part of the Python documentation.

A multidimensional list comprehension would be useful even for people who are using numpy, as it's common to use a list comprehension to initialise a numpy array.

I hadn't paid that much attention, but I think that's true of people using the newer releases of numpy. A very interesting point... thank you for mentioning it.

A more modest addition for the limited case described in this thread could be to use exponentiation:
>>> [0] ** (2, 3)
[[0, 0, 0], [0, 0, 0]]

I'm against overusing the math operators, for the reason that matrix and vector algebra have meanings mathematicians rightly desire to maintain. Numpy users might find matrices overloaded to do these things in the future -- and then it becomes unclear whether an initialization is happening or a mathematical operation. I think it best just not to set up an accident waiting to happen in the first place.

Hold on: why not just use multiplication?
>>> [0] * (2, 3)

Would you consider that better than [0].nest(2).nest(3)? Or [0].nest(2,3)? (I'm against multiplication, but I'm still interested in what you find attractive about it.)

We do have to think of the potential problems, though. There are definitely some. For one, code that relies on lst * x throwing an error would break. It may confuse others -- although I don't see how.

Excellent observation: people relying on an exception would be inside a try: block. So, since lst * int does not cause an exception, they would need a reason to be concerned that someone passed in a list instead of an integer. Semantically, the same KIND of result happens: lst is in some way duplicated; so if the result is accepted, it likely would work in place of an integer.

So, the concern would be where someone wanted to detect the difference between an integer and a list, so as to run some alternate algorithm -- say, a vector multiply or similar operation. The design would want to shadow * and call a method to do the multiply; you'd have a fragment possibly like the following:

    ...
    try:
        ret = map( lambda x: x * rightSide, leftSide )
    except TypeError:
        for i in rightSide:
            self.__mul__( rightSide, i )   # recursive call to __mul__
    ...

That's a common technique for type checking dating from earlier releases of Python, where the "type" attribute wasn't available. It also works based on functionality, not specific type -- so objects which "work" alike (subclasses, alternate re-inventions of the wheel) can also be handled.
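A runnable variant of that duck-typed dispatch, as a sketch (note: probing with x * rightSide alone would not raise for sequences, since integer-times-list is itself legal Python, so this version probes with len() instead):

    class Vec(list):
        def __mul__(self, right):
            try:
                len(right)            # sequences have a length...
            except TypeError:         # ...scalars don't: scale each element
                return Vec(x * right for x in self)
            return Vec(x * r for x, r in zip(self, right))  # element-wise

    print( Vec([1, 2, 3]) * 2 )           # [2, 4, 6]
    print( Vec([1, 2, 3]) * [4, 5, 6] )   # [4, 10, 18]

--
http://mail.python.org/mailman/listinfo/python-list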
Re: Multi-dimensional list initialization
On 11/07/2012 01:01 PM, Ian Kelly wrote: On Wed, Nov 7, 2012 at 12:51 PM, Andrew Robinson wrote: Interesting, you avoided the main point, "lists are copied with list multiplication".

It seems that each post is longer than the last. If we each responded to every point made, this thread would fill a book.

It already is. :)

Anyway, your point was to suggest that people would not be confused by having list multiplication copy lists but not other objects, because passing lists into functions as parameters works in basically the same way.

Not quite, although I wasn't clear: the variable passed in is by *value*, in contradistinction to the list, which is by reference. Python does NOT always default to copy by reference *when it could*; that's the point. Hence the programmer has to remember that in foo( x, y ), the names x and y, when assigned to, *DON'T* affect the variables from which they came -- but any object internals do affect the objects everywhere. A single exception exists; my thesis is for a single exception as well -- I think Python allows that kind of thinking.

So actually I did address this point with the "call-by-object" tangent; I just did not explicitly link it back to your thesis.

My apologies for not proofreading my statements for clarity. It was definitely time for a nap back then.

Potayto, potahto. The distinction that you're describing is between "strict" versus "non-strict" evaluation strategies. Hinging the distinction on the non-descriptive words "call" and "pass" is lazy terminology that should never have been introduced in the first place.

I would do it again. Others have already begun to discuss terminology with you -- I won't double-team you.

--
http://mail.python.org/mailman/listinfo/python-list
Re: Multi-dimensional list initialization
On 11/07/2012 03:39 PM, Ian Kelly wrote: Why? Just to get rid of an FAQ? :-) Here's one of the more interesting uses from my own code:

OK, and is this a main use case? (I'm not saying it isn't; I'm asking.)

Replacing the list multiplication in that function with a list comprehension would be awkward, as the obvious replacement of [iter(iterable) for _ in range(n)] would produce different results.

Yes. I have a thought on that.

How exactly do you propose to indicate to the compiler which parts of the expressions are meant to be cached, and which are not?

Exactly? OK; here's what I would consider a safe implementation -- but it could be improved later. There is a special keyword which signals the new type of comprehension. A normal comprehension would say, eg: '[ foo for i in xrange ]'; but when the 'for i in' is reduced to a specific keyword such as 'ini' (instead of the problematic 'in'), the caching form of list comprehension would start.

So then, just like a comprehension, the interpreter will begin to evaluate the code from the opening bracket '['; but anything other than a function/method will raise a TypeError (people might want to change that, but it's safe). The interpreter then caches all functions/initialiser methods it comes into contact with. Since every function/method has a parameter list (even if empty), the interpreter would evaluate the parameter list on the first pass through the comprehension, and cache each parameter list with its respective function. When the 'ini' keyword is parsed a second time, Python would then evaluate each cached function on its cached parameter list, and the result would be stored in the created list. This cached execution would be repeated as many times as is needed.

Now, for your example:

    values = zip( samples, times * num_groups )
    if len(values) < len(times) * num_groups:
        # raise an error

might be done with:

    values = zip( samples, [ lambda:times, ini xrange(num_groups) ] )
    if len(values) < len(times) * num_groups:

The comma after the lambda is questionable, and this construction would be slower since lambda automatically invokes the interpreter; but it's correct. If you could provide a built-in which returns a reference to the parameter passed to it, that would run at max system speed; by default, all built-in object initializers are maximally fast.

The key difference is that the ini syntax evaluates the parameter lists only once; and ini's purpose is for repeating an initialization of the same kind of object in multiple different places.

As an aside, how would you do the lambda inside a list comprehension?

    [ lambda:6 for i in xrange(10) ]    # Nope.

Generic lists allow a spurious comma, so that [ 3,3,3, ] == [ 3,3,3 ];

    [ lambda:6, for i in xrange(10) ]   # but this is no good.

I have to do:

    def ref(): return 6
    [ ref() for i in xrange(10) ]

Of course you got an integer. You took an index of the range object, not a slice. The rule is that taking an index of a sequence returns an element; taking a slice of a sequence returns a sub-sequence. You still have not shown any inconsistency here.

Because it's an arbitrary rule which operates differently than the traditional idea shown in the Python docs? slice.indices() is *for* (QUOTE) "representing the _set of indices_ specified by _range_(start, stop, step)". http://docs.python.org/2/library/functions.html#slice

There are examples of Python doing this; use Google... They use slice.indices() to convert negative indexes into positive ones _compatible with range()_:

    def some_getitem_method_in_a_subclass_foo( self, aSlice ):
        ret = []
        for i in xrange( *aSlice.indices( len(self) ) ):
            ret.append( self.thing[i] )
        return ret

The return is equivalent to a range object in the sense that it is an iterator object, but it's not the same iterator object. It will still work with legacy code, since different iterators can be interchanged so long as they return the same values.

No, he wasn't. He was talking about multiplying lists of dicts, and whether the dicts are then copied or not, just like every other Q&A item in that dialogue was concerning whether item X in a list should expect to be copied when the containing list is multiplied.

I already told him several times what the answer was: it doesn't copy anything except the list itself. Then he asks "does it multiply dicts", with no mention of it being inside a list. He's browbeating a dead horse.

Perhaps you're not aware that on the Internet, TYPING IN ALL CAPS is commonly construed as SHOUTING.

Sure, and people say: THIS IS YELLING, AND I AM DOING IT HERE AS AN EXAMPLE. This is STRESS. This is SHOCK! I don't recall typing any _full sentence_ in all caps; if I did, I'm awfully sorry. I didn't mean it.

Yes, he is beginning to get condescendingly exasperating. Everyone else seems to understand 85+% of what I say correctly. He doesn't; and now
Re: Multi-dimensional list initialization
On 11/07/2012 04:00 PM, Steven D'Aprano wrote: Andrew, it appears that your posts are being eaten or rejected by my ISP's news server, because they aren't showing up for me. Possibly a side-effect of your dates being in the distant past?

The date has been corrected since two days ago. It will remain until a reboot. Ignorance, though, might be bliss...

Every now and again I come across somebody who tries to distinguish between "call by foo" and "pass by foo", but nobody has been able to explain the difference (if any) to me.

I think "call by foo" came into vogue around the time of C++; eg: it's in books like "C++ for C Programmers". I never saw it used before then, so I *really* don't know for sure... I know "pass by value" existed all the way back in the 1960's. I see "pass by" in my professional books from those times and even most newer ones; but I only find "call by value" in popular programming books of more recent times. (Just my experience.)

So I "guess" the reason is that when invoking a subroutine, early hardware often had an assembler mnemonic by the name "call". See for example Intel x86 hardware books from the 1970's. Most early processors (like the MC6809E and 8080) allow both direct and indirect *references* to a function (C would call them function pointers); so, occasionally early assembly programs comment things like "; dynamic VESA libraries are called by value in register D." -- and they meant that register D is storing a function call address from two or more VESA cards. It had little to do with the function's parameters (which might be globals anyway); it's procedural dynamic binding!

Today, I don't know for sure -- so I just don't use it. "Pass" indicates a parameter of the present call, but not the present call itself.

--
http://mail.python.org/mailman/listinfo/python-list
Re: Multi-dimensional list initialization
On 11/07/2012 11:09 PM, Ian Kelly wrote: On Wed, Nov 7, 2012 at 8:13 PM, Andrew Robinson wrote: OK, and is this a main use case? (I'm not saying it isn't; I'm asking.)

I have no idea what is a "main" use case.

Well, then we can't evaluate if it's worth keeping a list multiplier around at all. You don't even know how it is routinely used.

FYI, the Python devs are not very fond of adding new keywords. Any time a new keyword is added, existing code that uses that word as a name is broken. 'ini' is particularly bad, because 1) it's not a word, and 2) it's the name of a common type of configuration file and is probably frequently used as a variable name in relation to such files.

Fine; it's a keyword TBD then; I should have said 'foo'. 'in' is worse than 'ini', 'ini' is worse than something else -- at the end of the rainbow, maybe there is something.

values = zip( samples, [ lambda:times, ini xrange(num_groups) ] )
if len(values) < len(times) * num_groups

How is this any better than the ordinary list comprehension I already suggested as a replacement? For that matter, how is this any better than list multiplication?

You _asked it to implement_ a list multiplication of the traditional kind, by doing copies by *REFERENCE*; so of course it's not better. My intentions were for copying issues, not non-copying ones.

Your basic complaint about list multiplication as I understand it is that the non-copying semantics are unintuitive.

No. 1) My basic complaint is that people (I think, from watching) primarily use it to make initializer lists and independent mutable placeholders; list multiplication doesn't do that well. 2) If it is indeed very rare (as D'Aprano commented), then it has a second defect in looking, to casual inspection, the same as vector multiplication; which obscures which operation is being done when matrix packages are potentially being used.

Well, the above is even less intuitive. It is excessively complicated and almost completely opaque. If I were to come across it outside the context of this thread, I would have no idea what it is meant to be doing.

Nor would I *know* what this list-multiplier look-alike does: [1,2,3]*aValue -- without checking to see if someone imported a vector library and the variable aValue has a special multiplication operator.

As an aside, how would you do the lambda inside a list comprehension?

As a general rule, I wouldn't. I would use map instead.

OK: then copy by reference using map:

values = zip( samples, map( lambda _: times, xrange(num_groups) ) )
if len(values) < len(times) * num_groups

... Done. It's clearer than a list comprehension, and you still really don't need a list multiply. I'm not going to bother explaining what the construction I offered would be really good at. It's pointless to explain to the disinterested.

That constructs a list of 10 functions and never calls them. If you want to actually call the lambda, then:

Yep, I was very tired.

slice.indices() has nothing to do with it. Indexing a sequence and calling the .indices() method on a slice are entirely different operations.

Yes, but you're very blind to history and code examples implementing the slice operation. Slice usually depends on index; index does not depend on slice. Slice is suggested to be implemented by multiple calls to single indexes in traditional usage and documentation. The xrange(,,)[:] implementation breaks the tradition, because it doesn't call index multiple times, nor does it return a result identical to doing that. It's different, period.
You're not convincing in the slightest by splitting hairs. -- http://mail.python.org/mailman/listinfo/python-list
Re: Preventing tread collisions
On 12/12/2012 12:29 PM, Dave Angel wrote:
On 12/12/2012 03:11 PM, Wanderer wrote:
I have a program that has a main GUI and a camera. In the main GUI,
you can manipulate the images taken by the camera. You can also use
the menu to check the camera's settings. Images are taken by the
camera in a separate thread, so the long exposures don't block the
GUI. I block conflicts between the camera snapshot thread and the
main thread by setting a flag called self.cameraActive. I check to
see if the cameraActive flag is false and set the cameraActive to
True just before starting the thread. I generate an event on exiting
the thread which sets the cameraActive flag to False. I also check
and set and reset the flag in all the menu commands that access the
camera. Like this.
def onProperties(self, event):
""" Display a message window with the camera properties
event -- The camera properties menu event
"""
# Update the temperature
if not self.cameraActive:
self.cameraActive = True
self.camera.getTemperature()
camDict = self.camera.getPropertyDict()
self.cameraActive = False
else:
camDict = {'Error': 'Camera Busy'}
dictMessage(camDict, 'Camera Properties')
This works
I don't think so. in between the if and the assignment, another thread
could get in there and also set the flag. Then when either one of them
finishes, it'll clear the flag and the other code is unprotected.
For semaphores between multiple threads, you either have to define only
a single thread at any given moment being permitted to modify it, or you
have to use lower-level primitives, sometimes called test+set operation.
i don't know the "right" way to do this in Python, but this isn't it.
but my question is, is there a better way using semaphores, locks or
something else to prevent collisions between threads?
Thanks
if you already have the cameraActive variable reset by an event at
thread termination, it's not necessary to set it false in the menu
command. It's better NOT to do that. Your GUI menu functions need only
test to see if self.cameraActive is false, and then set it to true just
before the launch of the second thread. The second thread, itself,
ought never change the cameraActive variable.
I'm also not sure why you are able to obtain the information from the
camera sequentially (camDict?) when you say you are not blocking the GUI.
I assume self.camera.getTemperature() launches the second thread ? Is it,
somehow, explicitly allowing the continued processing of GUI events that
accessing the camera straight in the GUI would not allow?
If you are talking about the Python semaphore library, I don't think you
need it.
Semaphores are really for use when multiple threads wish to access a
resource where more than one thread can use the resource at a time;
That would mean multiple threads using the camera at once... not a good
idea.
The Lock() object essentially does the same thing, but assumes only 1
thread may use it at a time; hence that would be sufficient (if it
were needed at all!).
A lock is in the "thread" library (Python 2.xx) or the "threading"
library (Python 3.xx). Semaphores aren't part of the thread library in
Python 2.xx... (another reason not to bother with them...)
However, Locking will cause the GUI thread to block when the camera is
in use, which isn't what you want -- correct?
There is a way to test the lock but not block, which is equivalent to
your variable (to be honest!); I'm pretty sure that Python doesn't use
true Posix threads but only the GNU Pth library. That means that the
only time threads truly switch is determined by the Python interpreter.
In that case, all python variable assignments are going to be effectively
atomic anyhow... and a variable, like you are using, is identical to a lock.
(Atomic merely means the write can't be interrupted by a thread switch partway
through).
If you have a multi-processing environment, there is a multiprocessor
library -- where the lock or semaphore mechanism would be important.
But I really don't think you need it.
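For reference, the non-blocking test mentioned above looks like this with a threading.Lock (a sketch using the method names from the original post; the camera object is assumed to be injected):

    import threading

    class CameraGui:
        def __init__(self, camera):
            self.camera = camera
            self.cameraLock = threading.Lock()   # replaces the boolean flag

        def onProperties(self, event):
            # acquire(False) is an atomic test-and-set: True means the lock
            # was free and is now ours; False means the camera is busy.
            # Either way it returns immediately -- the GUI never blocks.
            if self.cameraLock.acquire(False):
                try:
                    self.camera.getTemperature()
                    camDict = self.camera.getPropertyDict()
                finally:
                    self.cameraLock.release()
            else:
                camDict = {'Error': 'Camera Busy'}
            return camDict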
--
http://mail.python.org/mailman/listinfo/python-list
Re: Running a python script under Linux
On 12/13/2012 06:45 PM, Steven D'Aprano wrote:
I understand this is not exactly a Python question, but it may be of
interest to other Python programmers, so I'm asking it here instead of a
more generic Linux group.
I have a Centos system which uses Python 2.4 as the system Python, so I
set an alias for my personal use:
[steve@ando ~]$ which python
alias python='python2.7'
/usr/local/bin/python2.7
When I call "python some_script.py" from the command line, it runs under
Python 2.7 as I expected. So I give the script a hash-bang line:
#!/usr/bin/env python
and run the script directly, but instead of getting Python 2.7, it runs
under Python 2.4 and gives me system errors.
When I run env directly, it ignores my alias:
[steve@ando ~]$ /usr/bin/env python -V
Python 2.4.3
What am I doing wrong?
After seeing the lecture on Bad Ideas ... this might backfire on me :)
But ... if you really want to make the alias show up in all bash shells,
you can put it in your ~/.bashrc file which is executed every time a
shell is created.
alias python='python2.7'
However, alias expansions do not work in non-interactive shells -- so I
don't think it will launch with the #!/usr/bin/env technique.
OTOH -- Shell functions DO operate in non-interactive mode, so you could
add something like:
function python() {
    python3 "$@"    # whichever version of python you want as default
}
# eof of function example to add to ~/.bashrc
OR
In a bash shell, you can also do:
function python {
    python3 "$@"    # or 2.7 ...
}
export -f python
And that would give the same effect as an alias, but functions can be
exported to child processes.
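Of course, the least magical fix is to skip aliases and functions
entirely and point the script's hash-bang straight at the interpreter
that "which" reported above:
#!/usr/local/bin/python2.7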
It's your system, and we're adults here -- screw it up however you want to.
Cheers!
--
--Jesus Christ is Lord.
--
http://mail.python.org/mailman/listinfo/python-list
Re: Running a python script under Linux
On 12/13/2012 06:45 PM, Steven D'Aprano wrote:
What am I doing wrong?

By the way, the command line parameters ("$@") do have to be passed
along explicitly in the function definitions above, or the function
won't act like a generic alias.

Also, (alternately), you could define a generic python shell
script/import with the duty of checking for a compatible python version;
and if the wrong one is executing -- it could then import the shell
command execution function, and *fork* the correct version of python on
the script; then it could exit. ... or whatever ;)
--
http://mail.python.org/mailman/listinfo/python-list
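A minimal sketch of that version-check idea (assuming python2.7 is on
the PATH; os.execvp replaces the running interpreter outright, which is
simpler than a literal fork-and-exit here):

import sys, os

if sys.version_info[:2] < (2, 7):
    # wrong interpreter: re-exec this same script under the one we want
    os.execvp('python2.7', ['python2.7'] + sys.argv)

# ...the real script continues here under a known-good interpreter...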
Re: where to view open() function's C implementation source code ?
On 12/18/2012 07:03 AM, Chris Angelico wrote:
On Wed, Dec 19, 2012 at 1:28 AM, Roy Smith wrote:
In article, iMath wrote:
Download the source for the version you're interested in.
but which python module is open() in ?
I met you half-way, I showed you where the source code is. Now you need
to come the other half and look at the code. Maybe start by grepping the
entire source tree for "open"?
Ouch, that mightn't be very effective! With some function names, you
could do that. Not so much "open". Still, it'd be a start...
ChrisA

In Python 3.3.0, the built-in open() appears in
Python-3.3.0/Modules/_io/_iomodule.c. There is another open defined in
an object in Python-3.3.0/Modules/_io/fileio.c, but I don't think that's
the one called when a lone x=open(...) is done.
Cheers.
--Andrew.
--
http://mail.python.org/mailman/listinfo/python-list
Vote tallying...
Hi,
I have a problem which may fit in a mysql database, but for which I only
have python as an alternate tool to solve... so I'd like to hear some
opinions...

I'm building an experimental content management program on a standard
Linux web server, and I need to keep track of archived votes and their
voters -- for years. Periodically, a python program could be given a
batch of new votes removed from the database, and some associated
comments, which are no longer real-time necessary; and then a python
script needs to take that batch of votes and apply them to an
appropriate archive file. It's important to note that it won't just be
appending new votes; it will be sorting through a list of 10's of
thousands of votes, changing a *few* of them, and appending the rest.

XML may not be the ideal solution, but I am easily able to see how it
might work. I imagine a file like the following might be inefficient,
but capable of solving the problem:
12345A3 FF734B5D 7FBED
The woodstock games
I think you're on drugs, man.!
It would have been better if they didn't wake up in the morning.
10 1 3

The questions I have are: is using XML for vote recording going to be
slow compared to other stock solutions that Python may have to offer?
The voter IDs are unique, 32 bits long, and the votes are only from 1 to
10 (4 bits). I'm free to use any import that comes with python 2.5, so
if there's something better than XML, I'm interested.

And secondly, how likely is this to still work once the vote count
reaches 10 million? Is an XML file with millions of entries something
someone has already tried successfully?
--
http://mail.python.org/mailman/listinfo/python-list
Re: Vote tallying...
On 01/18/2013 08:47 AM, Stefan Behnel wrote:
Andrew Robinson, 18.01.2013 00:59:
I have a problem which may fit in a mysql database
Everything fits in a MySQL database - not a reason to use it, though.
Py2.5 and later ship with sqlite3 and if you go for an external
database, why use MySQL if you can have PostgreSQL for the same price?

MySQL is provided by the present server host. It's pretty standard at
web hosting sites. It works through "import MySQLdb" -- and it means an
IP call for every action... PostgreSQL isn't available :( otherwise, I'd
use it.

I'm mildly concerned about scaling issues, but don't have a lot of time
(just a few days) to come to a decision. I don't need high performance,
just no grotesque degradation when the system is scaled up, and no
maintenance nightmare. The votes table is going to get monstrous if all
votes are held in one table.

Your comment about sqlite is interesting; I've never used it before. At
a glance, it uses individual files as databases, which is good... But it
wants to lock the entire database against reads as well as writes when
any access of the database happens. Which is bad...
http://www.sqlite.org/different.html
http://www.sqlite.org/whentouse.html

... XML files are a rather static thing and meant to be processed from
start to end on each run. That adds up if the changes are small and
local while the file is ever growing. You seem to propose one file per
article, which might work. That's unlikely to become too huge to
process, and Python's cElementTree is a very fast XML processor.

Yes, that's exactly what I was thinking: one file per article. It's
attractive, I think, because many Python programs are allowed to read
the XML file concurrently, but only one periodically updates it as a
batch/cron/or triggered process; eg: the number/frequency of updates is
actually controllable. eg: MySQL accumulates a list of new votes and
vote changes, and python occasionally flushes the database into the
archive file. That way, MySQL only maintains a small database of
real-time changes, and the speed/accuracy of the vote tally can be
tailored to the user's need.

However, your problem sounds a lot like you could map it to one of the
dbm databases that Python ships. They work like dicts, just on disk.

Doing a Google search, I see some of these that you are mentioning --
yes, they may have some potential.

IIUC, you want to keep track of comments and their associated votes,
maybe also keep a top-N list of the highest voted comments. So, keep
each comment and its votes in a dbm record, referenced by the comment's
ID (which, I assume, you keep a list of in the article that it comments
on).

The comments themselves are just ancillary information; the votes only
apply to the article itself at this time. The two pieces of feedback
information are independent, occasionally having a user that gives both
kinds. Statistically, there are many votes -- and few comments. Each
archive file has the same filename as the article that is being
commented or voted on, but with a different extension (eg: .xml, or .db,
or...) so there's no need to store article information on each vote or
comment; (unlike the MySQL database, which has to store all that
information for every vote ugh!)

You can use pickle (see the shelve module) or JSON or whatever you like
for storing that record. Then, on each votes update, look up the
comment, change its votes and store it back. If you keep a top-N list
for an article, update it at the same time.
Consider storing it either as part of the article or in another record
referenced by the article, depending on how you normally access it. You
can also store the votes independent of the comment (i.e. in a separate
record for each comment), in case you don't normally care about the
votes but read the comments frequently. It's just a matter of adding an
indirection for things that you use less frequently and/or that you use
in more than one place (not in your case, where comments and votes are
unique to an article). You see, lots of options, even just using the
stdlib...
Stefan

Yes, lots of options. Let's see... you've noticed just about everything
important, and have lots of helpful thoughts; thank you.

There are implementation details I'm not aware of regarding how the
file-system dictionaries (dbm) work, and I wouldn't know how to compare
them to XML access speed either; but I do know some general information
about how the data might be handled algorithmically, which might suggest
a better Python import to use.

If I were to sort all votes by voter ID (a 32 bit number), and append
the vote value (a 4 to 8 bit number), then a vote becomes a chunk of 40
bits, fixed length; and I can stack one right after another in a compact
format. Blocks of compacted votes are ideal for bin
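A sketch of those fixed-width records, using only the stdlib struct
module (available well before python 2.5); the file layout is my own
illustration, not a standard:

import struct

FMT = '<IB'                      # 4-byte voter ID + 1-byte vote value
SIZE = struct.calcsize(FMT)      # 5 bytes (40 bits) per record

def append_votes(path, votes):
    # votes: iterable of (voter_id, value) pairs
    f = open(path, 'ab')
    for voter_id, value in votes:
        f.write(struct.pack(FMT, voter_id, value))
    f.close()

def read_votes(path):
    f = open(path, 'rb')
    data = f.read()
    f.close()
    return [struct.unpack_from(FMT, data, i)
            for i in range(0, len(data), SIZE)]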
XML/XHTML/HTML differences, bugs... and howto
Good day :),

I've been exploring XML parsers in python, particularly
xml.etree.cElementTree, and I'm trying to figure out how to do it
incrementally, for very large XML files -- although I don't think the
problems are restricted to incremental parsing.

First problem: I've come across an issue where etree silently drops text
without telling me. I am under the impression that XHTML is a subset of
XML (eg: defined tags), and that once an HTML file is converted to
XHTML, the body of the document can be handled entirely as XML.

If I convert a (partial/contrived) html file like:
This is example bold text.
to XHTML, I might do --right or wrong-- (1):
This is example bold text.
or, alternate difference: (2):
" This is example bold text. "

But, when I parse with etree, in example (1) both "This is an example"
and "text." are dropped; the missing text is part of the start, or end
event tags, in the incrementally parsed method. Likewise: in example
(2), only "text" gets dropped. So, etree is silently dropping all text
following a close tag, but before another open tag happens.

Q: Isn't XML supposed to error out when invalid xml is parsed? Is there
a way in etree to recover/access the dropped text? If not -- is this a
python library issue, or the underlying expat.so, etc. library?

Secondly: I have an XML file which will grow larger than memory on a
target machine, so here's what I want to do. Given a source XML file,
and a destination file:
1) iteratively scan part of the source tree.
2) Optionally modify some of the scanned tree.
3) Write the partial scan/tree out to the destination file.
4) Free memory of no-longer needed (partial) source XML.
5) continue scanning a new section of the source file... eg: goto step 1
until the source file is exhausted.

But, I don't see a way to write portions of an XML tree, or iteratively
write a tree to disk. How can this be done? :)
Thanks!
--
http://mail.python.org/mailman/listinfo/python-list
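For the scanning half of this, the stock answer is iterparse() plus
clear(), which keeps memory roughly constant no matter how large the
file grows -- a sketch only, with a hypothetical tag name:

import xml.etree.cElementTree as etree

def scan(source, tag):
    # 'end' events fire as each close tag is seen, so the element's
    # text -- and its tail -- are complete by the time we get it
    for event, elem in etree.iterparse(source, events=('end',)):
        if elem.tag == tag:
            yield elem
        elem.clear()          # discard the subtree we no longer need

# usage:  for rec in scan('big.xml', 'record'): ...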
Re: XML/XHTML/HTML differences, bugs... and howto
On 01/24/2013 06:42 AM, Stefan Behnel wrote:
Andrew Robinson, 23.01.2013 16:22:
Good day :),
Nope, you should read the manual on this. Here's a tutorial:
http://lxml.de/tutorial.html#elements-contain-text

I see, so it should be under the "tail" attribute, not the "text"
attribute. That's why I missed it.

But, I don't see a way to write portions of an XML tree, or iteratively
write a tree to disk. How can this be done?
There are several ways to do it. Python has a couple of external
libraries available that are made specifically for generating markup
incrementally. lxml also gained that feature recently. It's not
documented yet, but here are usage examples:
https://github.com/lxml/lxml/blob/master/src/lxml/tests/test_incremental_xmlfile.py
Stefan

Thanks Stefan! I'll look that over. :)
--
http://mail.python.org/mailman/listinfo/python-list
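The text/tail distinction is easy to see in a quick session (the sample
markup is my own):

>>> import xml.etree.ElementTree as ET
>>> p = ET.fromstring('<p>This is <b>example bold</b> text.</p>')
>>> p.text
'This is '
>>> p[0].text
'example bold'
>>> p[0].tail
' text.'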
XML validation / exception.
A quick question:
On xml.etree, when I scan in a handwritten XML file and there are
mismatched tags, it will throw an exception, and the exception will
contain a line number of the closing tag which does not have a mate of
the same kind.

Is there a way to get the line number of the earlier tag which caused
the XML parser to know the closing tag was mismatched, so I can narrow
down the location of the mismatches for a manual repair? (I don't want
auto-repair like beautiful soup, but google is worthless for finding a
solution...)

And secondly, for times when I want to throw a software/content-specific
error on valid XML files: I don't see which attribute of an element, or
method, allows me to find out the line number and column number that an
element I am examining is found at. How do I get it?
Cheers,
--Andrew.
--
http://mail.python.org/mailman/listinfo/python-list
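One partial answer, sketched below: etree's ParseError carries a
.position tuple for where the parser gave up, and an iterparse loop over
'start'/'end' events lets you keep your own stack of still-open tags to
narrow down the likely culprit (tag names only -- etree does not record
the source line of each start tag):

import xml.etree.ElementTree as ET

def diagnose(path):
    stack = []                       # tags currently open
    try:
        for event, elem in ET.iterparse(path, events=('start', 'end')):
            if event == 'start':
                stack.append(elem.tag)
            else:
                stack.pop()
    except ET.ParseError as err:
        line, column = err.position  # where parsing stopped
        print 'stopped at line %d, column %d: %s' % (line, column, err)
        print 'tags still open (outermost first):', stack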
Comparisons and sorting of a numeric class....
Hi,
I'm building a custom numeric class that works with values that have
uncertainty, and I want to make it as compatible with floating point
objects as possible -- so as to be usable in legacy code with minimal
rewrites; but I am having trouble understanding how to handle magnitude
comparison return values for __lt__, __gt__, etc.

The problem with magnitude comparisons in my class is that the
comparison may fail to be True for various reasons, and in various ways,
but it's important for programming to retain the general reason a
comparison wasn't strictly 'True'. For example, to 1 significant figure,
I can write two values a=mynum(0.1) and b=mynum(0.01); as written, both
of them have an uncertainty of 1 least significant figure, so it is
possible both values are really 0. But there is a bias/mean/average
which suggests that more often than not, 0.1 will be bigger than 0.01.
So, what is the proper response of a>b? The answer is that a>b depends
on the context of how the values were obtained and what is being done
with them; strictly speaking, 'a' is not greater than 'b' in this case,
although 'a' has a much better chance of being greater than 'b' on
average, which may be important for the user to know.

Where I'm getting stuck is how to encode the return values of the
comparison operators to be compatible with python floating point object
return values (and python's sort algorithms for lists of floats) but
still give the extra functionality that is needed for uncertainty... and
as I'm writing in python 2.xx but with an eye toward python 3 in the
future, I've become concerned that __cmp__ has been discontinued, so
that I need to write this using only __gt__() and friends. I don't know
how to do it.

What I would like is for operators like '>' to return a boolean True or
False equivalent, which also encodes the reason for failure in one of
five ways:
True: Sufficient information and confidence exists that the comparison
is thoroughly True.
PartTrue: The comparison is uncertain for any single sample, but True is
at least minimally more probable.
Unbiased: The comparison is uncertain for any single sample.
PartFalse: The comparison is uncertain for any single sample, but False
is at least minimally more probable.
False: Sufficient information and confidence exists that the comparison
is thoroughly False.

By default, only True would evaluate in a conditional statement as a
logical True. All other values would be equivalent to False; but
hopefully, a programmer could find out which 'False' was returned by
some kind of object inspection or operator, like:

if (a>b) is PartTrue: print "I don't really know if 'a' really is
greater than 'b', but it might be"
if (a>b) > Unbiased: print "a sorts after b because it's at least more
probable that a>b than not."
if (a>b): print "this message will not print if the value 'b' might
occasionally be less than or equal to 'a'."

For sorting, it would be ideal if the sorting algorithms min(), max(),
etc. could automatically recognize that:
False < PartFalse < Unbiased < PartTrue < True.
But even if that can't be done -- if there were a way, or a method I
could add to my classes, which would intercept sort functions and
replace an absolutely certain compare function with a merely
Unbiased-detecting one, sort functions would all operate properly.
However, I'm not sure how to create these return values, nor do I know
how to get the sort() functions to use them.
I've tried subclassing float() to see if I could actually make a
subclass that inherited all that float has, and be able to add extra
methods for my own use -- but Python doesn't seem to allow subclassing
float. Either I am not doing it right, or it can't be done.

So, I'm not sure I can subclass boolean either, because that too is a
built-in class... but I'm not sure how else to make an object that acts
as boolean False, but can be differentiated from False by the 'is'
operator. It's frustrating -- what good is subclassing, if one can't
subclass all the base classes the language has? What other approaches
can I take?

'False' is a singleton, so I was pretty sure this wouldn't work -- but I
tried it...
PartTrue = False
if (1>2) is PartTrue: print "This is an obvious failure... False is not
the object PartTrue."

And I am seriously stumped --
--
https://mail.python.org/mailman/listinfo/python-list
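For what it's worth, one way to get five shared return values without
touching bool at all is module-level constants of a tiny class -- a
sketch only; __nonzero__ supplies the truth-test behavior (it would be
__bool__ in python 3), and a rank supports ordering:

class TruthValue(object):
    def __init__(self, name, rank, truthy):
        self.name, self.rank, self.truthy = name, rank, truthy
    def __nonzero__(self):           # only the fully-True value tests true
        return self.truthy
    def __repr__(self):
        return self.name

# one shared instance each, so 'is' comparisons work like singletons
AllFalse  = TruthValue('AllFalse',  0, False)
PartFalse = TruthValue('PartFalse', 1, False)
Unbiased  = TruthValue('Unbiased',  2, False)
PartTrue  = TruthValue('PartTrue',  3, False)
AllTrue   = TruthValue('AllTrue',   4, True)

# a comparison method would return one of these, and sorting can use
# the rank:  sorted(results, key=lambda t: t.rank)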
Re: Comparisons and sorting of a numeric class....
So, I'm not sure I can subclass boolean either, because that too is a
built-in class... but I'm not sure how else to make an object that acts
as boolean False, but can be differentiated from False by the 'is'
operator. It's frustrating -- what good is subclassing, if one can't
subclass all the base classes the language has?
As I said above, make sure you have a constructor. If you still get an
error, post a message that shows exactly what you did, and what
exception you saw.

OK. I tried to subclass bool, using __new__, just to see if it would
even accept the definition... eg: python 2.7.5

>>> class UBool( bool ):
...     def __new__( self, default ): return bool.__new__( self, default )
...
Traceback (most recent call last):
  File "", line 1, in
TypeError: Error when calling the metaclass bases
    type 'bool' is not an acceptable base type

I also tried using return int.__new__( self, bool(default) ), but that
too failed the exact same way. I came across this in my searches;
perhaps it has something to do with why I can't do this?
https://mail.python.org/pipermail/python-dev/2002-March/020822.html

I thought about this last night, and realized that you shouldn't be
allowed to subclass bool at all! A subclass would only be useful when it
has instances, but the mere existance of an instance of a subclass of
bool would break the invariant that True and False are the only
instances of bool! (An instance of a subclass of C is also an instance
of C.) I think it's important not to provide a backdoor to create
additional bool instances, so I think bool should not be subclassable.
... --Guido van Rossum

So, I think Guido may have done something so that there are only two
instances of bool, ever: False and True, which aren't truly singletons
-- but follow the singleton restrictive idea of making a single instance
of an object do the work for everyone; eg: of False being the only
instance of bool returning False, and True being the only instance of
bool returning True.

Why this is so important to Guido, I don't know... but it's making it
VERY difficult to add named aliases of False which will still be
detected as False and type-checkable as a bool. If my objects don't type
check right -- they will likely break some people's legacy code... and I
really don't even care to create a new instance of the bool object in
memory, which is what Guido seems worried about; rather, I'm really only
after the ability to detect the subclass wrapper name as distinct from
bool False or bool True with the 'is' operator.

If there were a way to get the typecheck to match, I wouldn't mind
making a totally separate class which returned the False instance; eg:
something like an example I modified from searching on the web:

class UBool(object):
    def __init__( self, default=False ):
        self.default = bool(default)
    def __nonzero__( self ):
        return self.default
    def set_default( self, default=False ):  # renamed: a method named
        self.default = bool(default)         # 'default' would shadow the data

but, saying:
>>> error=UBool(False)
>>> if error is False: print "type and value match"
...
>>>
failed to have type and value match, which suggests that 'is' tests
object identity and never expands the value. It's rather non-intuitive,
and will break code -- for clearly error expands to 'False' when
evaluated in boolean context:
>>> if not error: print "yes it is false"
...
yes it is false
>>> print error.__nonzero__()
False
>>> if error==False: print "It compares to False properly"
...
>>>
So 'is' and == both fail here: 'is' compares identity, and == without a
custom __eq__ falls back to identity rather than expanding the value.
As a simple cross check, I tried to make a one-valued tuple.
>>> a=(False,None)
>>> print a
(False, None)
>>> a=(False,)
>>> if a is False: print "yes!!!"
...
>>>
>>> if not a: print "a is False"
...
>>> if a == False: print "a is False"
but that obviously failed, too; and if == fails to say False==False...
well, it's just too sensitive for wrapper classes to be involved unless
they are a subclass of bool...
Any ideas?
--
https://mail.python.org/mailman/listinfo/python-list
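The == half of that is fixable, though: give the wrapper an __eq__ that
compares by truth value, and only the 'is' test stays unmatchable -- a
sketch; 'is' can never be made to match, because it compares object
identity, not value:

class UBool(object):
    def __init__(self, default=False):
        self.default = bool(default)
    def __nonzero__(self):               # truth tests: "if not error"
        return self.default
    def __eq__(self, other):             # value comparison: error == False
        return bool(self) == bool(other)
    def __ne__(self, other):
        return not self.__eq__(other)

# UBool(False) == False  -->  True
# UBool(False) is False  -->  still False: they are different objects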
Re: Comparisons and sorting of a numeric class....
On 01/06/2015 06:02 AM, Dave Angel wrote:
On 01/06/2015 08:30 AM, Andrew Robinson wrote:
So, I'm not sure I can subclass boolean either, because that too is a
built-in class... but I'm not sure how else to make an object that acts
as boolean False, but can be differentiated from False by the 'is'
operator. It's frustrating -- what good is subclassing, if one can't
subclass all the base classes the language has?
I said earlier that I don't think it's possible to do what you're doing
without your users' code being somewhat aware of your changes.

Aye. You did. And I didn't disagree. :) The goal is merely to trip up
those who don't know what I'm doing as little as possible, and only
break their code where the very notion of uncertainty is incompatible
with what they are doing, or where they did something very stupid
anyway... eg: to break it where there is a good reason for it to be
broken. I may not achieve my goal, but I at least hope to come close...

But as long as the user doesn't check for the subclass-ness of your
bool-like function, you should manage. In Python, duck-typing is
encouraged, unlike java or C++, where the only substitutable classes are
subclasses.

But if you can't subclass a built-in type -- you can't duck type it --
for I seem to recall that Python forbids duck typing any built-in class,
but not subclasses. So your two solutions are mutually damaged by
Guido's decision; and there seem to be a lot of classes that python
simply won't allow anyone to subclass. (I still need to retry
subclassing float; that might still be possible.) Removing both options
in one blow is like hamstringing the object-oriented reusability
principle completely. You must always re-invent the wheel from near
scratch in Python.

--Guido van Rossum
So, I think Guido may have done something so that there are only two
instances of bool, ever: False and True, which aren't truly singletons
-- but follow the singleton restrictive idea of making a single instance
of an object do the work for everyone; eg: of False being the only
instance of bool returning False, and True being the only instance of
bool returning True. Why this is so important to Guido, I don't know...
but it's making it VERY difficult to add named aliases of False which
will still be detected as False and type-checkable as a bool. If my
objects don't type check right -- they will likely break some people's
legacy code... and I really don't even care to create a new instance of
the bool object in memory, which is what Guido seems worried about;
rather, I'm really only after the ability to detect the subclass wrapper
name as distinct from bool False or bool True with the 'is' operator. If
there were a
There's already a contradiction in what you want. You say you don't want
to create a new bool object (distinct from True and False), but you have
to create an instance of your class. If it WERE a subclass of bool, it'd
be a bool, and break singleton.

Yes, there seems to be a contradiction, but I'm not sure there is... and
it stems in part from too little sleep and familiarity with other
languages... Guido mentioned subclassing in 'C' as part of his
justification for not allowing subclassing bool in python. That's what
caused me to digress a bit... consider: In 'C++' I can define a subclass
without ever instantiating it; I can define static member functions of
the subclass that operate even when there exists not a single instance
of the class; and I can typecast an instance of the base class as being
an instance of the subclass.
So -- (against what Guido seems to have considered) I can define a
function anywhere which returns my new subclass object as its return
value without ever instantiating the subclass -- because my new function
can simply return a typecast of a base class instance. The user of my
function would never need to know that the subclass itself was never
instantiated... for they would only be allowed to call static member
functions on the subclass anyway, but all the usual methods found in the
superclass(es) would still be available to them. All the benefits of
subclassing still exist, without ever needing to violate the singleton
character of the base class instance.

So part of Guido's apparent reason for enforcing the singleton (dual
singleton / dualton?) nature of 'False' and 'True' isn't really
justified by what 'C++' would allow, because C++ could still be made to
enforce singleton instances while allowing subclassing -- *both* at the
same time. There seems to be some philosophical reason for what Guido
wants that he hasn't fully articulated...? If I understood him better --
I wouldn't be making wild ass guesses and testing e
Re: Comparisons and sorting of a numeric class....
On 01/06/2015 05:35 AM, Chris Angelico wrote:
On Wed, Jan 7, 2015 at 12:30 AM, Andrew Robinson wrote:
Why this is so important to Guido, I don't know... but it's making it
VERY difficult to add named aliases of False which will still be
detected as False and type-checkable as a bool. If my objects don't type
check right -- they will likely break some people's legacy code... and I
really don't even care to create a new instance of the bool object in
memory, which is what Guido seems worried about; rather, I'm really only
after the ability to detect the subclass wrapper name as distinct from
bool False or bool True with the 'is' operator. If there were a way to
get the typecheck to match, I wouldn't mind making a totally separate
class which returned the False instance; eg: something like an example I
modified from searching on the web:
Okay, so why not just go with your own class, and deal with the question
of the type check? Simple solution: Instead of fiddling with
__gt__/__lt__, create your own method, and use your own comparison
function to sort these things.
ChrisA

Because defining a bunch of special methods defeats the very purpose of
making my class compatible with float variables. eg: No legacy code
would work... I know (belatedly) that I am going to have to define my
own class. That's pretty much a given, but I want to do it in a way
which requires my users to make very few changes to their traditional
floating point algorithms and code.

The type check issue is mostly about compatibility in the first place;
eg: users typecheck either unintentionally -- (novice's syndrome) -- or
because they need all the capabilities of a given type, and the only
simple way to find out if they are all there is to typecheck. eg: That's
the whole point of subclassing bool... to let the user know they have at
their disposal (in a portable, simple way) all the features of the base
type.

Well, if I make a new object -- type checking is pointless. No user
thinking they were coding for floating point in the past would know that
my new return type is totally compatible with bool. They would have to
have written individual tests for the existence of every method in bool,
and why would they be crazy enough to do that? It's a lot of work for
nothing...
--
https://mail.python.org/mailman/listinfo/python-list
Re: Comparisons and sorting of a numeric class....
On 01/06/2015 06:34 PM, Terry Reedy wrote:
On 1/6/2015 9:01 PM, Andrew Robinson wrote:
[snip]
There are very few (about 4) builtin classes that cannot be subclassed.
bool is one of those few, float is not. Go ahead and subclass it.

>>> class F(float): pass
...
>>> F
<class '__main__.F'>
>>> F(2.3) + F(3.3)
5.6

Thanks Terry! That's a relief. I've just managed to find a few classes
that won't subtype by trial and error in the last two months and was
getting pessimistic. (eg: doing web pages, I wanted to customize the
error output traceback stack from a python script based on where the
exception occurred. GH! I worked around the no-subtyping issue, but it
took a lot of guessing to trick python into accepting a fake class to
the print-traceback functions the webserver used...)
--
https://mail.python.org/mailman/listinfo/python-list
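So, back to the original numeric class: a float subclass really does
work, and it can carry the uncertainty along as an extra attribute. A
bare-bones sketch (the attribute name is my own):

class UFloat(float):
    """A float that remembers its uncertainty."""
    def __new__(cls, value, uncertainty=0.0):
        self = float.__new__(cls, value)
        self.uncertainty = uncertainty
        return self

# UFloat(0.1, 0.05) + 0.2 falls back to plain float arithmetic, so
# legacy code keeps working; only the comparison methods would need
# overriding to return the richer truth values discussed above.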
Re: Comparisons and sorting of a numeric class....
On 01/06/2015 06:31 PM, Chris Angelico wrote:
The type check issue is mostly about compatibility in the first place;
eg: users typecheck either unintentionally -- (novice's syndrome) -- or
because they need all the capabilities of a given type, and the only
simple way to find out if they are all there is to typecheck. eg: That's
the whole point of subclassing bool... to let the user know they have at
their disposal (in a portable, simple way) all the features of the base
type.
Thing is, you're not fulfilling bool's contract, so it's better to not
subclass, and just make your new type always falsy. If your users are
type-checking bools, you might just have to let it break, and tell them
not to do that.
ChrisA

Explain; how does mere subclassing of bool break the contract that bool
has? eg: What method or data would the superclass have that my subclass
would not? Are you speaking about the quasi-singleton nature of bool?

If so, I spent a little time browsing my design patterns book by Gamma,
Helm, Johnson, and Vlissides, and I'm looking at the singleton pattern
on p.127. The authors write:
"Use the singleton pattern when:
-- There must be exactly one instance of a class, and it must be
accessible to clients from a well-known access point.
-- When the _sole instance_ should be extensible by subclassing, and
clients should be able to use an extended instance *without modifying
their code*."

So, it's clear that in typical programming scenarios -- objects which
are even more restrictive than bool, by having only a single allowed
instance rather than TWO -- are *still* intentionally allowed to be
subclassed for compatibility reasons. And later in Design Patterns, the
authors continue:
"2. Subclassing the singleton class. The _main issue is not so much
defining the subclass_ but installing its unique instance so that
clients will be able to use it."

So, the general programming community is aware of the issue van Rossum
brings up about a singleton's subclass having an instance; it's just
apparent that there are ways to work around the issue and preserve a
singleton's character while still allowing a subclass.

So: I'm really curious -- if subclassing is generally permitted for
singletons as an industrial practice, why is it wrong to allow it in
python? I mean, if this is because Python doesn't support subclasses for
singletons, then it seems that Python is lacking something that should
be added. This isn't limited to bool, for as a library writer I might
want to create a singleton class for my own purposes that has nothing to
do with any of python's built-in types. And so, it would be appropriate
to have a mechanism for subclassing user-created singletons as well.

I already KNOW that 'C++' does have a workaround mechanism, as I've
mentioned in a different e-mail, so that there's no reason to
instantiate an instance of the subclass of a singleton if you don't want
to. That objection is really spurious... so I really don't understand
why van Rossum cut off subclassability itself... wasn't there any other
way he could have prevented instantiation of subclasses without
preventing the definition of a subclass itself? I mean, even in python I
can execute some methods of a class without actually INSTANTIATING that
class. eg:
import decimal
decimal.getcontext()

So, I don't understand your objection. How does merely defining a
subclass of bool violate the contract that bool puts out?
--
https://mail.python.org/mailman/listinfo/python-list
Re: Comparisons and sorting of a numeric class....
On 01/07/2015 04:04 PM, Ethan Furman wrote:
On 01/06/2015 07:37 PM, Andrew Robinson wrote:
Explain; how does mere subclassing of bool break the contract that bool
has? eg: What method or data would the superclass have that my subclass
would not?
bool's contract is that there are only two values (True and False) and
only one instance each of those two values (True and False). If bool
were subclassable, new values could be added with either completely
different values (PartTrue) or with more of the same value (True,
ReallyTrue, AbsolutelyTrue) -- hence, broken contract.
--
~Ethan~

Huh? I'm not adding any values when I merely subclass bool; and even if
the subclass could be instantiated -- that doesn't mean a new value or
instance of the base class (bool) must exist. For I could happily work
with a new subclass that contains no new data, but only an already
_existing instance_ of 'True' or 'False' as its value source. That means
there is no new value... but at most (and even that could be worked
around) a new instance of a subclass containing an existing instance of
its base class.

Note: Guido only mentioned that he didn't want multiple instances of the
base class bool -- but that's not technically the same as objecting to a
subclass having an instance which CONTAINS an original bool instance and
not a new one. There are other ways Guido could have modified the Python
language to prevent creation of new values without preventing the
creation of a subclass -- if that's what he was really after. So -- no
-- I disagree with you.

Subclassing is allowed in many other OOP languages (not just C++) when
working with singletons, and dualtons, n-tons... and the very PURPOSE of
those objects is to prevent multiple instances, or spreading of control.
BUT -- most object-oriented languages I know of allow, as standard
practice, subclassing of n-tons -- while (often) simultaneously
controlling or even eliminating the number of instances a subclass may
have and/or the values it may take. eg: depending on how flexible the
type/class definition of a language is -- languages handle subclassing
of singletons differently.

Besides, a contract for a class can only be assumed to be valid for code
designed for that specific class -- not code made for a subclass. The
contract for bool simply says nothing about programs designed for any
other classes that bool is found inside of -- either as a subclass or a
subelement; eg: I can still put bool inside another object with
different methods: I just write (False,) as proof -- so Guido couldn't
possibly have been trying to limit the methods which can operate on bool
or the number of links to bool.

So I don't understand why Guido cared to restrict subclassing of bool --
and what the contract you mention has to do with it -- eg: what was his
actual goal? Was it memory conservation, or compatibility of built-in
return types -- or what? There is no written 'contract' saying exactly
what Guido's design objectives were in detail and, more importantly,
'WHY'; Guido only said that subclassing instances allowed a 'back door'
(not that subclassing itself was bad, just that it allowed some side
effect...) to whatever Guido really didn't want to happen.

Python generally allows subclassing of singleton instances; so that
makes 'bool' an arbitrary exception to the general rule which Guido
decided to make... and he did so without any very clear explanation as
to why.
eg: he cited what C/C++ *must* do as part of his reasoning -- but C/C++
can apparently do something that Guido thought it couldn't (Guido was
flat wrong), and for that reason I really wonder if Guido's decision was
some kind of spur-of-the-moment erroneous epiphany -- bolstered by the
fact that he said "I realized last night..." like something he had never
thought of before, or thought through carefully.
https://mail.python.org/pipermail/python-dev/2002-March/020822.html

And worse, Guido's problem even apparently extends to making 'duck
types', which other writers in this thread have been proposing I do. eg:
They, too, are going against canonical Guido's epiphany night...
Of course, you can define your own subclass of int similar to the bool
class I show in the PEP, and you can give it any semantics you want --
but that _would also defeat the purpose of having a standard bool_.

Not to mention that George Boole's two initial values, True and False,
are NOT the only ones used in computer engineering and science; boolean
logic only became useful at a time when engineers and mathematicians
realized that at least a third type, AKA 'don't care', was necessary for
doing correctness testing of logic in a tractable way. So -- Gu
Re: Comparisons and sorting of a numeric class....
On 01/12/2015 02:35 PM, Chris Angelico wrote:
On Tue, Jan 13, 2015 at 9:27 AM, Andrew Robinson wrote:
Huh? I'm not adding any values when I merely subclass bool; and even if
the subclass could be instantiated -- that doesn't mean a new value or
instance of the base class (bool) must exist. For I could happily work
with a new subclass that contains no new data, but only an already
existing instance of 'True' or 'False' as its value source. That means
there is no new value... but at most (and even that could be worked
around) a new instance of a subclass containing an existing instance of
its base class.

Hmmm. That may be true in python as it is now, but that doesn't mean
that Guido had to leave it that way when he decided to change the
language to single out bool and make its subclassing rules abnormal in
the first place. He was changing the language when he made the decision,
after all!!

What I am wanting to know is WHY did Guido think it so important to do
that? Why was he so focused on a strict inability to have any instances
of a bool subclass at all -- that he made a very arbitrary exception to
the general rule that base types in Python can be subclassed?

There's no reason in object-oriented programming principles in general
that requires a new subclass instance to be a COMPLETELY DISTINCT
instance from an already existing superclass instance; nor have I ever
seen Guido say that Python is designed intentionally to force this to
always be the case... so I'm not sure that it's anything more than a
non-guaranteed implementation detail that Python acts the way you say it
does.

I don't see, within Python, an intrinsic reason (other than lack of
support/foresight in the historical evolution of Python to date) why a
subclass couldn't be instantiated with the data coming from an *already*
existing instance of its superclass. There is no need to copy data from
an initialized superclass instance into a subclass instance that has no
new data, but only rebind -- or add a binding/proxy object -- to bind
the superclass instance to the subclass methods. eg: what is now
standard practice to create a new copy of the superclass:

class myFalse( bool ):
    def __new__( cls, data ):
        return super( myFalse, cls ).__new__( cls, data )

could be replaced by a general-purpose proxy meant to handle singleton
subclassing:

class myFalse( bool ):
    def __new__( cls ):
        # hypothetical helper: binds the existing False instance
        # to the subclass's methods instead of copying it
        return bind_superinstance_to_subclass( False, myFalse )

The Python bool type has the following invariant, for any object x:
assert not isinstance(x, bool) or x is True or x is False

Really!??? Where, in the language definition, did Guido explicitly
guarantee this invariant?

(You can fiddle with this in Py2 by rebinding the names True and False,
but you could replace those names with (1==1) and (1==0) if you want to
be completely safe. Likewise, the name "bool" could be replaced with
(1==1).__class__ to avoid any stupidities there. But conceptually,
that's the invariant.)

Interesting... but rebinding True and False won't extend the new
capabilities to modules which are imported. They will still, I think, be
bound to the old True and False values. I know, for example, I can
redefine the class bool altogether; although the type string becomes
'main.bool' -- none the less, it does not exist in default scope when I
switch namespaces; eg: in a module being imported, 'bool' still means
the old version of class bool. I have to do something more drastic,
like:

class bool(int): ...
__builtins__.bool = bool
And then, even modules will recognize the changed class definition.
Hmmm.
>>> __builtins__.True='yes'
>>> True
'yes'
However, such actions -- I think -- are rather drastic; because they
produce situations where another third-party library in competition with
mine might also have need of subclassing 'bool', and then we are in a
fight for a static binding name with winner takes all... rather than
sharing dynamically compatible definitions based on subclasses.

Subclassing bool breaks this invariant, unless you never instantiate the
subclass, in which case it's completely useless.

Well, we can't subclass bool now -- and the language would have to
change in order for us to be able to subclass it; so -- I don't think
your assert statement guarantees anything in case the language
changes... On the other hand, I also don't see that your assert
statement would ever return False even as it is written -- even if the
object (x) was subclassed or a totally different object than True or
False -- and so I *definitely* don't see why your assert statement would
fail if the language changed in one of several subtle ways I think it
could. I mean, even right now -- with the language as-is -- l
Re: Comparisons and sorting of a numeric class....
Hmm LOL... no exception was raised... and we know if the assertion
failed, an exception ought to be raised:
The assertion did not fail. There are three parts, and as long as one of
them is true, the assertion will pass:
1) x isn't an instance of bool
2) x is the object known as True
3) x is the object known as False
You just gave an example of the first part of the invariant. That's an
instance of tuple, which is not a subclass of bool, ergo isinstance(x,
bool) returns False, negating that makes True, and the assertion passes.
[1]

Uh... yeah... and so an assertion meant to test if something is or is
not a bool let a non-bool pass the assertion. That seems rather...
silly, and useless... and so I really doubt the assertion -- based on
its behavior -- can distinguish an actual bool from a subclassed one or
a totally unrelated object... I mean, let's start by testing whether x
as an actual boolean will cause the assertion to act differently from a
fake non-bool object, which we already tried.

>>> x=True
>>> assert not isinstance(x, bool) or x is True or x is False
>>>

Wow. No difference in behavior. So (as a test) it can't distinguish
between an actual boolean and a faked one. They both pass the assertion.
So -- what good is this assertion? It tells us nothing useful when
executed.

Also, what if we put in a subclass? Instead of pretending 'what if' --
let's actually REPLACE python's built-in bool class with an emulation
that ALLOWS subclassing, and THEN let's TEST my hypothesis that the
assert statement you gave me can't tell the difference between bools and
anything else by its actions... (heh... another back-door that Guido
forgot about... or perhaps purposely left open...)

class bool( int ):
    def __new__( cls, data ):
        return super( bool, cls ).__new__( cls, [0,1][data!=0] )
    def __repr__( self ):
        return [ 'False', 'True' ][self>0]

__builtins__.bool=bool
__builtins__.True=bool( 1==1 )
__builtins__.False=bool( 1==0 )

And, running it in the python interpreter...
>>> class bool( int ):
...     def __new__( cls, data ):
...         return super( bool, cls ).__new__( cls, [0,1][data!=0] )
...     def __repr__( self ):
...         return [ 'False', 'True' ][self>0]
...
>>> __builtins__.bool=bool
>>> __builtins__.True=bool( 1==1 )
>>> __builtins__.False=bool( 1==0 )
>>>
>>> type(True)
<class '__main__.bool'>

I now have proof that the replacement succeeded. So let's subclass
bool!!!
>>> class subBool(bool): pass
...
>>>
and now, let's see if your assertion believes a subclass of bool is a
bool...
>>> x=subBool(0)
>>> assert not isinstance(x, bool) or x is True or x is False
>>>
Wow. No change. So -- it doesn't fail when the object ISN'T a bool, it
doesn't fail when the object IS a bool, and it doesn't fail when the
object is a subclass of bool -- whether or not #1 matches that of a true
bool:
>>> isinstance( x, bool )
True
>>> isinstance( True, bool )
True

Therefore, your explanation (so far) of how to interpret the invariant
is consistent with it explicitly and *precisely* 'passing' all possible
instances of subclasses of bool, all instances of non-bools, and all
instances of bools. Yes, the assertion you chose passes precisely
ANYTHING! (facetious use of precise.) It's a worthless assertion in the
sense that it has no explicit logic to DISTINGUISH what Guido/Python
does want a bool to be from what you have said and implied Guido doesn't
want (even though I've never seen Guido agree with you on this assertion
thing...).
So -- your assertion, at least as shown, is pretty useless in helping
determine why subclassing is not allowed, or why instances of subclasses
that are not distinct from their superclass's existing instances can't
exist.

It exactly defines the nature of Python's bool type: there are precisely
two instances of it.
ChrisA
[1] Which puts me in mind of https://www.youtube.com/watch?v=D0yYwBzKAyY

Uh, no -- your assertion excludes nothing you've been telling me is not
a bool by contract -- so it doesn't 'define' anything precisely, because
it's by definition inaccurate. It's simply an invariant form of spam
with three terms you can interpret any way you like. Where did you get
that assertion from anyway, and how is it related to Guido and formal
definitions of the python language??? Are you really trying to imply
that Guido wrote that assertion?

On Tue, Jan 13, 2015 at 12:59 PM, Andrew Robinson wrote:
There is no need to copy data from an initialized superclass instance
into a subclass instance that has no new data, but only rebind -- or add
a binding/proxy object -- to bind the superclass ins
Re: Comparisons and sorting of a numeric class....
On 01/12/2015 09:32 PM, Steven D'Aprano wrote:
On Mon, 12 Jan 2015 17:59:42 -0800, Andrew Robinson wrote:
[...]
What I am wanting to know is WHY did Guido think it so important to do
that? Why was he so focused on a strict inability to have any instances
of a bool subclass at all -- that he made a very arbitrary exception to
the general rule that base types in Python can be subclassed?

It's not arbitrary. All the singleton (doubleton in the case of bool)
classes cannot be subclassed. E.g. NoneType:

py> class X(type(None)):
...     pass
...
Traceback (most recent call last):
  File "", line 1, in
TypeError: Error when calling the metaclass bases
    type 'NoneType' is not an acceptable base type

Likewise for the NotImplemented and Ellipsis types. The reason is the
same: if a type promises that there is one and only one instance (two in
the case of bool), then allowing subtypes will break that promise in the
99.99% of cases where the subtype is instantiated.

Ok. That's something I did not know. So much for the "just four classes
can't be subtyped" remark someone else made... At least Guido is
consistent. But that doesn't give me any idea of why he thought it
important.

I suppose in principle Python could allow you to subclass singleton
classes to your heart's content, and only raise an error if you try to
instantiate them, but that would probably be harder and more error-prone
to implement, and would *definitely* be harder to explain. There may be
others too:

py> from types import FunctionType
py> class F(FunctionType):
...     pass
...
Traceback (most recent call last):
  File "", line 1, in
TypeError: Error when calling the metaclass bases
    type 'function' is not an acceptable base type

My guess here is that functions are so tightly coupled to the Python
interpreter that allowing you to subclass them, and hence break required
invariants, could crash the interpreter. Crashing the interpreter from
pure Python code is *absolutely not allowed*, so anything which would
allow that is forbidden.

There's no reason in object oriented programming principles in general
that requires a new subclass instance to be a COMPLETELY DISTINCT
instance from an already existing superclass instance

True. But what is the point of such a subclass? I don't think you have
really thought this through in detail.

I have. Such a subclass allows refining of the meaning/precision of a
previous type while improving compatibility with existing applications
-- and without being so easy to do that everyone will abuse it. That's
the standard kind of thing which goes into deciding that a singleton is
appropriate... Subclasses of singletons are a tried and true way of
improving productivity, regardless of how they are implemented.

Suppose we allowed bool subclasses, and we implement one which *only*
returns True and False, without adding a third instance:

class MyBool(bool):
    def __new__(cls, arg):
        if cls.condition(arg):
            return True
        else:
            return False
    @classmethod
    def condition(cls, obj):
        # decide whether obj is true-ish or false-ish.
        pass
    def spam(self):
        return self.eggs()
    def eggs(self):
        return 23

And then you do this:
flag = MyBool(something)
flag.spam()

What do you expect to happen? Flag is not an instance of MyBool, so it's
going to generate an exception. Since flag can *only* be a regular bool,
True or False, it won't have spam or eggs methods.

Correct. It would generate an exception.
You might think of writing the code using unbound methods:
MyBool.spam(flag)
(assuming that methods don't enforce the type restriction that "self"
must be an instance of their class), but that fails when the spam method
calls "self.eggs". So you have to write your methods like this:

def spam(self):
    return MyBool.eggs(self)

hard-coding the class name! You can't use type(self), because that's
regular bool, not MyBool. This is a horrible, error-prone, confusing
mess of a system. If you're going to write code like this, you are
better off making MyBool a module with functions instead of a class with
methods.

Every design method has its trade-offs... but how you organize your code
will affect whether it is messy or clean. In terms of information
encoding -- both an instance of a type, or a class definition held in a
type variable -- eg: a class name -- are pretty much interchangeable
when it comes to being able to tell whether two items are the same one
or not. So -- even a cursory thought shows that the information could be
encoded in a very few lines even without an instance of a subclass:

class CAllFalse():
    @classmethod
    def __nonzero__(Kls): return False

class CPartFalse():
    @classmethod
    def __nonzero__(
Re: Comparisons and sorting of a numeric class....
And most of this thread has been nothing more than me asking "why" did
Guido say to do that -- and people avoiding answering the question.
Wait, are you actually asking why bool is a doubleton? If nobody has
answered that, I think probably nobody understood you were asking it,
because it shouldn't need to be explained.

In some ways, yes, I am asking that -- but in another I am not; to be
more precise -- why did Guido refuse to allow refinement of meaning,
when in industry logic equations generally are not limited to George
Boole's original work, but include practical enhancements of his
theories which make them far more useful?

A subclass is generally backward compatible in any event -- as it is
built upon a class, so that one can almost always revert to the base
class's meaning when desired -- but subclassing allows extended meanings
to be carried. eg: A subclass of bool is a bool -- but it can be MORE
than a bool in many ways. One example: it can also be a union. So when
Guido chose to cut off subclassing -- his decision had a wider impact
than just the one he mentioned; eg: extra *instances* of True and False,
as if he were trying to save memory or something.

The reason Guido's action puzzles me is twofold -- first, it has been
standard industry practice to subclass singleton (or n-ton) objects to
expand their meaning in new contexts, and this practice has been
documented for many years. So -- why did Guido go against the general
OOP practice, unless he didn't know about it? The mere existence of a
subclass does not threaten the integrity of bools or security in Python
in any way I can see -- as one can do a type check, or a base type
check, on an instance to decide how to handle a subclass vs. base class
instance. So I'm guessing he was concerned about something else, but I
don't know what.

In general -- it's not the goal of subclassing to create more instances
of the base types -- but rather to refine meaning in a way that can be
automatically reverted to the base class value (when appropriate), and
to signal to users that the type can be passed to functions that require
a bool because of backward compatibility.

Boolean algebra has two values: true and false, or 1 and 0, or humpty
and dumpty, or whatever you like to call them.

You're speaking to an electrical engineer. I know there are 10 kinds of
people, those who know binary and those who don't. But you're way off
base if you think most industrial-quality code for manipulating boolean
logic uses (or even can be restricted to use) only two values and still
accomplishes its tasks correctly and within a finite amount of time.

The bool class represents the values of boolean algebra. Therefore,
there are two of them. If I have an object that I know is an instance of
bool, the implication is that it is one of those two values, not
something potentially completely different. Can you name any other
language that *does* allow subclassing of booleans or creation of new
boolean values?

Yes. Several off the top of my head -- and I have mentioned these
before. They generally come with the extra subclasses pre-created, and
the user doesn't get to create the classes, but only use them; none the
less -- they have more than two values with which to do logic equations.
VHDL, Verilog, HDL, Silos III, and there are IEEE variants also. C/C++
historically allowed you to do it with instances included, although I am
not sure it still does. The third value is usually called "tri-state" or
"don't care".
(Though it's sometimes a misnomer which means -- don't know, but do
care.) Most of these hardware description languages are used to do
things like design microprocessors... eg: the very intel or arm
processor you typically run python on --- because trying to do it with
the boolean logic and theorems of the past, in a pencil-and-paper
compatible strict re-incarnation of what George Boole did in his own
time (even if done by computer) -- rather than including De Morgan and
all the many other people who contributed afterward -- is about as
red-neck backward as one can get -- and often doomed to failure (though
for small applications you might get away with it).

Often, only one extra (tri-state) value is needed to do logic
verification and testing; but in some cases, notably where the exclusive
'or' function is involved, the relationship between don't-care inputs
can become important and more values are required; eg: to detect when,
in deeply nested logic, various sources of don't-care inputs interfere
with each other and themselves in constructive or destructive ways to
produce constant logic Trues or Falses.

We've discovered that we live in a quantum-mechanical universe -- yet
people still don't grasp the pragmatic issue that basic logic can be
indeterminate at least some of the time?!

The name 'boolean logic' has never been re-named in honor of the many
people who developed the advancements i
Re: Comparisons and sorting of a numeric class....
On 01/15/2015 12:41 AM, Steven D'Aprano wrote:
On Wed, 14 Jan 2015 23:23:54 -0800, Andrew Robinson wrote:
[...]
A subclass is generally backward compatible in any event -- as it is
built upon a class, so that one can almost always revert to the base
class's meaning when desired -- but subclassing allows extended meanings
to be carried. eg: A subclass of bool is a bool -- but it can be MORE
than a bool in many ways.
You don't have to explain the benefits of subclassing here.
I'm still trying to understand why you think you *need* to use a bool
subclass. I can think of multiple alternatives:
- don't use True and False at all, create your own multi-valued
truth values ReallyTrue, MaybeTrue, SittingOnTheFence, ProbablyFalse,
CertainlyFalse (or whatever names you choose to give them);
- use delegation to proxy True and False;
- write a class to handle the PossiblyTrue and PossiblyFalse cases,
and use True and False for the True and False cases;
There may be other alternatives, but what problem are you solving that
you think
class MyBool(bool): ...
is the only solution?
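For concreteness, the first of those alternatives can be sketched in a few lines (the constant names are from the list above; everything else is one guess at the intent):

class Truth(object):
    def __init__(self, name, collapses_to):
        self.name = name
        self.collapses_to = collapses_to   # the plain bool legacy code will see
    def __nonzero__(self):                 # Python 2 truth test
        return self.collapses_to
    __bool__ = __nonzero__                 # same thing under Python 3
    def __repr__(self):
        return self.name

ReallyTrue        = Truth('ReallyTrue', True)
MaybeTrue         = Truth('MaybeTrue', True)
SittingOnTheFence = Truth('SittingOnTheFence', False)
ProbablyFalse     = Truth('ProbablyFalse', False)
CertainlyFalse    = Truth('CertainlyFalse', False)

if MaybeTrue:      # truth-tests like a bool, without subclassing bool
    print('taking the True branch')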
That's an unfair question that has multiple overlapping answers.
Especially since I never said subclassing bool is the 'only' solution; I
have indicated it's a far better solution than many.
So -- I'll just walk you through my thought processes and you will see
what I consider problems:
Start with the concept that as an engineer, I have spent well over
twenty years on and off dealing with boolean values that are very often
mixed indistinguishably with 'don't care' or 'tri-state' or 'metastable
states'. A metastable state *is* going to be True or False once the
metastability resolves by some condition of measurement/timing/etc.; but
that value can not be known in advance. eg: similar to the idea that
there is early and late binding in programming. Sometimes there is a
very good reason to delay making a final decision until the last
possible moment; and it is good to have a default value defined if no
decision is made at all.
So -- from my perspective, Guido making Python go from an open-ended and
permissive 'anything goes' as a return value -- which can handle
metastable states -- to a historical version of 'logic' having *only*
two values in a very puritanical sense, is rather -- well --
disappointing. It makes me wonder -- what hit the fan?! Is it lemmings
syndrome? A fight? No idea -- and is there any hope of recovery or a
workaround?
eg: To me -- (as an engineer) undefined *IS* equivalent in usage to an
actual logic value, just as infinity is a floating point value that is
returned as a 'float'. You COULD/CAN separate the two values from each
other -- but always with penalties. They generally share an OOP 'is'
relationship with respect to how and when they are used. (inf) 'IS' a
float value, and -- uncertain -- 'IS' a logic value.
That is why I automatically thought before I ever started writing on
this list (and you are challenging me to change...) -- that 'uncertain'
should share the same type (or at least subtype) as Bool.
Mathematicians can argue all they want that 'infinity' is not a float
value, and uncertain is not a True or False. And they are/will be
technically right -- But as a practical matter -- I think programmers
have demonstrated over the years that good code can handle 'infinity'
most efficiently by considering it a value rather than an exception.
And I think the same kind of considerations very very likely apply to
Truth values returned from comparisons found in statistics, quantum
mechanics, computer logic design, and several other fields that I am
less familiar with.
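The infinity half of that analogy is directly checkable -- Python already treats IEEE-754 infinity as an ordinary float value rather than an exception:

import math

inf = float('inf')
print(inf > 1e308)             # True -- compares like any other float
print(isinstance(inf, float))  # True -- (inf) 'IS' a float value
print(math.isinf(inf))         # True -- yet still distinguishable on request
print(inf + 1 == inf)          # True -- arithmetic keeps it a value, no exception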
So -- let's look at the examples you gave:
- don't use True and False at all, create your own multi-valued
truth values ReallyTrue, MaybeTrue, SittingOnTheFence, ProbablyFalse,
CertainlyFalse (or whatever names you choose to give them);
OK. So -- what do I think about when I see your suggestion:
First I need to note where my booleans come from -- although I've never
called it multi-valued logic... so jargon drift is an issue... though
you're not wrong, please note the idea of multi-value is mildly misleading.
The return values I'm concerned about come from a decimal value after a
comparison with another decimal value.
eg:
a = magicFloat( '2.15623423423(1)' )
b = magicFloat('3()')
myTruthObject = a>b
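magicFloat and its '(1)' uncertainty notation are the poster's hypothetical -- nothing like them exists in the standard library -- but one plausible sketch of the idea, where a comparison returns a truth object carrying both a verdict and its certainty, is:

class TruthObject(object):
    def __init__(self, verdict, certainty):
        self.verdict = bool(verdict)
        self.certainty = certainty     # 1.0 = certain, 0.5 = could go either way
    def __nonzero__(self):             # a legacy truth test collapses to the verdict
        return self.verdict
    __bool__ = __nonzero__

class magicFloat(object):
    def __init__(self, text):
        # '2.15(1)' is read as 2.15 with +/- 1 unit in the last decimal place
        num, _, err = text.partition('(')
        self.value = float(num)
        places = len(num.partition('.')[2])
        self.err = float(err.rstrip(')') or 0) * 10 ** -places
    def __gt__(self, other):
        gap = self.value - other.value
        certain = abs(gap) > self.err + other.err
        return TruthObject(gap > 0, 1.0 if certain else 0.5)

a = magicFloat( '2.15623423423(1)' )
b = magicFloat('3()')
myTruthObject = a > b
print(bool(myTruthObject), myTruthObject.certainty)   # False 1.0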
Then I look at Python development historically and look at the built-in
classes' return values for compares; and I notice they have over time
become more and more tied to the 'type' bool. I expect sometime in the
future that Python may implement an actual type check on all comparison
operators so they can not return anything but an actual bool.
Re: Comparisons and sorting of a numeric class....
On 01/15/2015 09:05 AM, Ian Kelly wrote: On Thu, Jan 15, 2015 at 12:23 AM, Andrew Robinson wrote:

Can you name any other language that *does* allow subclassing of booleans or creation of new boolean values?

Yes. Several off the top of my head -- and I have mentioned these before. They generally come with the extra subclasses pre-created and the user doesn't get to create the classes, but only use them; nonetheless -- they have more than two values with which to do logic equations. VHDL, Verilog, HDL, Silos III, and there are IEEE variants also. C/C++ historically allowed you to do it with instances included, although I am not sure it still does.

Sorry, let me rephrase my question. Of course there will be special-purpose languages that allow you to do interesting things with the logic values and operators. Can you name any other *general-purpose* language that allows subclassing of booleans or creation of new boolean values? If not, it seems rather unfair to single out Python and marvel that this isn't allowed when it's actually quite normal to disallow it. Unless you care to provide an example, I am fairly sure your claim of C/C++ is wrong. The bool type in C++ is a primitive type, none of which can be inherited from. C doesn't even have a bool type; at most you have macros for true and false to 1 and 0, so the "booleans" there are just ordinary integers.

Ian, I agree with you mostly; there is good reason to pick on other languages, too, with respect to what a bool is. Although, I have to laugh -- Verilog can synthesize a CPU -- implement memory -- and then load a program and run Python on the virtual machine. When the Pentium was first developed, I watched as Intel actually booted MS-DOS using Xilinx chips to run the Verilog program's output; they could physically run anything a Pentium processor could run. That *IS* what I consider "general purpose".

But you're sort of confounding my need for type information in my new class, as a way to advertise compatibility with bool, with subclassing -- which is only one way that I was exploring to get the 'type' name attached to the new object; that's a mistake that D'Aprano seems to be repetitively making as well. But please note: type checking for 'bool' is only needed for legacy/compatibility reasons -- new code can use any type; so subtyping is not strictly necessary if there is another way to get the 'bool' type attached to my return object for advertising purposes; for what I am interested in is an object that presents as bool for legacy code, while new code can convert it to anything at all.

C++ *DOES* allow the necessary kind of type checking and subclassing for what I need to do, in spite of not having a subclass mechanism built into the language for base types; eg: C++ allows a semantic subclass to be constructed which can be typecast to a bool for compatibility, but otherwise presents extra data, and the type the enhanced object reports is irrelevant. As I've mentioned before, people can do object oriented programming in C. So, to satisfy your curiosity -- I'll show you a mixed C/C++ example, where I make a semantic subclass that has five values AllFalse, PartFalse, Uncertain, PartTrue, True; and these five values will have a typeid() of bool and be 100% compatible with legacy C++ bool; but upon request, these 'bool' types will re-cast into a semantic subtype that provides additional certainty data. See the program at end of e-mail.
It compiles with gcc 4.8.2 with no warnings; g++ filename.cc ; ./a.out

But let me explain a bit more why I'm picking on Python: for even if we set aside the electronic engineering concerns that I've raised (and they are valid, as OOP is supposed to model reality, not reality be bent to match OOP) -- people's facile explanations about why Python's version of bool is the way it is still bother me here in the Python mailing list -- because people seem to have a very wrong idea about bool's nature as a dualton being somehow justified solely by the fact that there are only two values in Boolean logic.

For singleton-style programming is not justified by the number of values an object has in reality. And I know Charles Bool didn't use singletons in his algebra -- just read his work and you'll see he never mentions them or describes them, but he does actually use dozens of *instances* of the True and False objects he was talking about -- for the obvious reason that he would have needed special mirrors, dichroic or partially silvered, to even be able to attempt to make one instance of True written on paper show up in multiple places; and that's silly to do when there's no compelling reason to do it. Yet -- people here seem to want to insist that the bool type with only two *instances* is some kind of pure re-creation of what Charles Bool did -- when clearly it isn't.
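The mixed C/C++ program referred to above is not reproduced here, but the five-value idea it describes can be sketched in Python for comparison (a guess at the shape only; AllTrue stands in for the name True, which shouldn't be shadowed):

AllFalse, PartFalse, Uncertain, PartTrue, AllTrue = range(5)

class Certainty(object):
    """Five-level truth that collapses to a plain True/False on request."""
    def __init__(self, level):
        self.level = level
    def __nonzero__(self):
        # the "cast to bool": one possible collapse policy is that
        # anything at least PartTrue counts as True
        return self.level >= PartTrue
    __bool__ = __nonzero__    # Python 3 spelling of the truth test

x = Certainty(Uncertain)
print(bool(x))    # False under this collapse policy
print(x.level)    # but the certainty data is still there on request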
Fwd: Re: Comparisons and sorting of a numeric class....
Original Message Subject: Re: Comparisons and sorting of a numeric class Date: Mon, 26 Jan 2015 05:38:22 -0800 From: Andrew Robinson To: Steven D'Aprano

On 01/24/2015 12:27 AM, Steven D'Aprano wrote: Andrew Robinson wrote:

But let me explain a bit more why I'm picking on Python: For even if we set the electronic engineering concerns aside that I've raised (and they are valid, as OOP is supposed to model reality, not reality be bent to match OOP) -- People's facile explanations about why Python's version of bool is the way it is -- still bothers me here in the python mail list -- because people seem to have a very wrong idea about bool's nature as a dualton being somehow justified solely by the fact that there are only two values in Boolean logic;

Nobody has suggested that except you.

Yes, they did 'suggest' it.

Earlier I even stated that I didn't know GvR's motivation in making True and False singletons, but suggested that it might have been a matter of efficiency.

True... but you are not the only person on this list. Although I seriously doubt that either efficiency or memory conservation plays a part in this in any measurable manner. eg: You accused me of 'pre-optimizing' earlier, and if anything looks like a pre-optimization, it's the bool itself -- because computers have tons of memory now, and I seriously doubt modern Python versions can even run in a small micro-controller, eg: without an almost complete rewrite of the language; and as far as efficiency goes, I know speed won't change significantly whether a singleton or multiple instances are allowed, except under very unusual circumstances. So, even now -- I have no idea why Guido chose to make bool so restrictive, other than because he thought C++ was absolutely restrictive, when in fact C++'s type system is more flexible than he seems to have noticed.

-- And I know Charles bool didn't use singletons in his algebra -- just read his work and you'll see he never mentions them or describes them, but he does actually use dozens of instances of the True and False objects he was talking about -- for the obvious reason that he would have needed special mirrors, dichroic or partially silvered, to be even able to attempt to make one instance of True written on paper show up in multiple places; and that's silly to do when there's no compelling reason to do it.

In the words of physicist Wolfgang Pauli, that is not even wrong. http://en.wikipedia.org/wiki/Not_even_wrong I'm actually gobsmacked that you could seriously argue that because Boole wrote down true and false (using whatever notation he chose) more than once, that proves that they aren't singletons. That's as sensible as claiming that if I write your name down twice, you must be two people.

Clearly: Charles bool did not use singletons, and you are wasting your breath splitting hairs that are moot.

Yet -- people here seem to want to insist that the bool type with only two instances is some kind of pure re-creation of what Charles Bool did -- when clearly it isn't.

Nobody has argued that Boole (note the spelling of his name) considered True and False to be singletons. Being a mathematician, he probably considered that there is a single unique True value and a single unique False value, in the same way that there is a single unique value pi (3.1415...) and a single unique value 0.

The spelling caveat is great -- and in Python the object named in bool's honor is spelled bool (lowercase too).
;) Another point about the inconsistency of the object with the historical author -- I just love it, which is part of why I'm going to keep on spelling it like that. For even the spelling suggests Python is really acting like a lemming and just doing bool because Guido thought other languages do bool... so I'll just continue, because it's fitting that anyone who mocks my use of the mis-spelling bool also mocks Python's.

But "the number of instances" and "singleton" are concepts from object oriented programming, which didn't exist when Boole was alive.

Yep -- I made that point myself in an earlier e-mail. Do you feel brilliant, or something, copying my remarks?

It is not even wrong to ask the question whether Boole thought of true and false as singleton objects. He no more had an opinion on that than Julius Caesar had an opinion on whether the Falkland Islands belong to the UK or Argentina.

Oh! So you want people to think you commune with the dead, and know he never thought about it alive and/or dead? D'Aprano speaks posthumously for Dr. bool? You're being very condescending and arrogant and arguing in pointless circles! I said, and I quote, "He didn't use singletons in his algebra" -- if you can show where he did, I'
Re: Fwd: Re: Comparisons and sorting of a numeric class....
Hi Mark! I hope you are well, and haven't been injured falling out of your chair laughing. There are probably 12 to 14 emails that I haven't been able to read in my inbox from the python list on the subject of this 'drivel', because I have a real life besides the semi-comedy act that goes on here. I seriously doubt I will ever read them all...

On 01/26/2015 08:31 AM, Mark Lawrence wrote: On 26/01/2015 13:38, Andrew Robinson wrote: *plonk*

Ah well, now that I have actually bothered to read your three replies, I suppose the most surprising part of your emails is this:

-My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

You do realize that such a statement actually encourages more drivel? :D Cheers.
-- https://mail.python.org/mailman/listinfo/python-list
Re: Comparisons and sorting of a numeric class....
Hi, Rob. Sorry I'm so slow in getting back to you; there's too much to read and I can't catch up with the backlog. But I wanted to reply to you at least, as I think you made some very good points that make more sense to me than other people's replies.

On 01/15/2015 09:54 AM, Rob Gaddi wrote: On Wed, 14 Jan 2015 23:23:54 -0800 Andrew Robinson wrote:

Boolean algebra has two values: true and false, or 1 and 0, or humpty and dumpty, or whatever you like to call them.

You're speaking to an Electrical engineer. I know there are 10 kinds of people, those who know binary and those who don't. But you're way off base if you think most industrial quality code for manipulating boolean logic uses (or even can be restricted to use) only two values and still accomplishes its tasks correctly and within a finite amount of time.

[snip]

Absolutely it does and can. You store anything that's non-boolean in a non-boolean value, and keep it as fuzzy as you like. But at the end of the day, an if statement has no "kinda". You do, or you don't. 1 or 0. And so you must ultimately resolve to a Boolean decision.

Yes -- at the *point* of decision; a logic expression always collapses to a True or False value. Metastability must always resolve to a final value, or else the system will malfunction. (Although cache branch prediction can choose to take both paths, one of them eventually gets flushed.) That's the same in digital circuitry as well, even with uncertainty meta-information appended to it.

You have to ask if x = '1', or if (x = '1') or (x = 'H'). Because you are comparing one value of an enumerated type against others, the result of that '=' operation being, in fact, a boolean, defined again on the range (true, false).

That's fine, too. An enumerated type is still a semantic subtype, even if not a formal one recognized by type(). So -- I don't see that you are arguing the two types must be semantically distinct until the if statement is actually executed, at which point a true/false decision must be made. I totally agree: digital systems must make a final decision at some point. Your example here, BTW, is almost exactly what I was talking about in the original few posts of the thread; eg: a way of comparing the uncertainty value returned by a float subtype's compare -- against an enumerated bool meta-type -- to resolve that value to a final True or False for an if statement.

[snip]

We've discovered that we live in a quantum-mechanical universe -- yet people still don't grasp the pragmatic issue that basic logic can be indeterminate at least some of the time?!

But actions can't be. You're not asking the software about its feelings, you're telling it to follow a defined sequence of instructions. Do this, or don't do this.

Right! And, in quantum mechanics -- a wave packet 'collapses' to one and only one final decision. So, I agree with you; and let me get your opinion: I admit there can be good reasons to prevent direct subtyping; I know, for example, that in C++ no built-in base type is allowed to be subclassed -- but not because of OOP concerns or ideology about bool; I'm fairly sure the reason had something to do with difficulties in implementing base class overloading in the compiler, due to compile-time binding issues. But that's useless reasoning with an interpreter like Python, which *already* allows subtyping of at least some base classes and does runtime type() tests instead of compile-time tests.
So: the major question I have been asking about is 'when' must the decision be made, and 'why' (besides oversight / copying other languages) is the bool variable/object in Python designed in such a way as to force the decision to be made early, rather than late -- and to prevent bool from carrying extended information about the bool itself; eg: meta-information -- so that the final decision can be delayed until an 'if' statement is actually used, and the context is known. eg:

x = a > b   # x is an uncertain compare that generates meta data along with a boolean True or False.
# this value 'x' can be used in two ways:
if x > bool_meta_threshold:   # if statement's branch chosen by meta data.
if x:   # if statement's branch chosen by default/base bool value contained in x; meta data is ignored.

I don't know what you mean about composition vs. sub-classing. Would you care to show an example of how it would solve the problem and still allow hierarchical sorting? I don't see how you can get pre-existing python functions (like sort, max, and min) to accept a complex bool value and have that value automatically revert to a True or False in an intelligent manner without overloading the operators define
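One sketch of an answer to that last question, assuming the goal is just that sort, max, and min keep working: since bool itself refuses subclasses, subclass int -- bool's own base class. Instances then compare and sort as 0/1 for legacy code while carrying the extra meta data (an illustration, not anyone's endorsed design):

class MetaBool(int):
    def __new__(cls, value, certainty=1.0):
        self = super(MetaBool, cls).__new__(cls, bool(value))
        self.certainty = certainty        # the extended information
        return self
    def __repr__(self):
        return 'MetaBool(%s, certainty=%s)' % (bool(self), self.certainty)

vals = [MetaBool(True, 0.9), MetaBool(False, 0.5), MetaBool(True, 0.2)]
print(sorted(vals))     # sort/max/min just see the ints 0 and 1
print(max(vals))

x = MetaBool(True, certainty=0.3)
if x.certainty > 0.25:  # "late" decision: branch chosen by meta data
    print('branch chosen by meta data')
if x:                   # "early" decision: plain base-class truth value
    print('branch chosen by base bool value')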
Re: Fwd: Re: Comparisons and sorting of a numeric class....
On 01/26/2015 02:22 PM, Ian Kelly wrote: On Jan 26, 2015 6:42 AM, "Andrew Robinson" <[email protected]> wrote: > ...

If you're going to descend into insults and name-calling, then I'm not going to waste any more of my time on this thread.

I don't believe I've actually, intentionally, insulted you. That's generally not in my nature, although annoyance sometimes comes out. If my response to D'Aprano's abrasive remarks upset you -- well, sorry. It honestly wasn't aimed at you, and it wasn't derogatory name calling. eg: If I write 'bool' instead of 'Boole', that's just because Python does it.

The restriction on inheriting from bool isn't likely to change. There have been several suggestions as to how you can do what you want. I recommend you pick one and get on with it.

FYI: I already have implemented the code twice, so there's no need to continue the conversation for my sake. I got 'on' with it a long time ago; but thanks for the advice, for what it's worth. I hope you enjoyed seeing a C++ type cast of a non-bool type into a bool that can be used as a bool at leisure; although if you've already stated your pleasure or displeasure -- it may take me a while to find the email in the backlog, so don't feel pressure to repeat yourself.

--Cheers.
-- https://mail.python.org/mailman/listinfo/python-list
Re: Fwd: Re: Comparisons and sorting of a numeric class....
On 01/27/2015 02:04 AM, Gregory Ewing wrote: Andrew Robinson wrote:

The spelling caveat is great -- and in Python the object named in bool's honor is spelled bool (lowercase too). ;)

That doesn't change the fact that the man was called George Boole (not Charles!). If you're going to refer to him by name, it's only courteous to make some effort to get it right.

I stand corrected. Thank you. http://en.wikipedia.org/wiki/George_Boole
-- https://mail.python.org/mailman/listinfo/python-list
Cross compiling C Python2.7.10 for arm on x86_64 linux box.
Hi,
I need to get Python 2.7.10 to cross compile correctly for an ARM
embedded device.
I'm very close, as it does build with warnings, but the result is
defective and I'm not sure how to fix it.
For some odd reason, the interpreter does run -- but I either get random
segfaults -- or if I configure --without-pymalloc, I get messages when I
import libraries saying that:
---
Python 2.7.10 (default, Jun 29 2015, 23:00:31)
[GCC 4.8.1 20130401 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import math
Traceback (most recent call last):
File "", line 1, in
ImportError: /mnt/user/lib/python2.7/lib-dynload/math.so: undefined
symbol: Py_InitModule4
[40857 refs]
>>>
---
That message suggests this might be an issue with 32/64 bit machine
architecture conflicts, according to information I googled on it.
I am also seeing many warnings during build like the following which
confirms some kind of build size mismatch:
*** WARNING: renaming "pyexpat" since importing it failed:
build/lib.linux-x86_64-2.7/pyexpat.so: wrong ELF class: ELFCLASS32
building '_elementtree' extension
I think Py_InitModule4 is normally found in libpython2.7.so, as the
symbol at least exists in the 32 and 64 bit compilations of python that I
have; but when I check the cross compiled version, it's not there.
So I think the python ./configure script might be getting confused about
the size of the target it's trying to build as I am building it on a 64
bit x86 machine, and the target is a 32 bit arm processor.
The following is what I am doing from bash shell (linux) to try and
cross compile python.
Any idea what I am doing wrong?
---
# A parser generator and build-system version of python are supposed to be
# needed to run parts of the cross compilation; I do see python used in the
# Makefile, but no references to a build version of PGEN are defined,
# so I don't know if PGEN gets used or not -- but I build it anyway...
# As this is what recipes on the web say to do...
make distclean
./configure
make Parser/pgen python
mv python python_for_build
mv Parser/pgen Parser/pgen_for_build
make distclean
# fix setup.py to handle installing to the target system's fake install
# directory found on the build system at $DEVICEROOT.
if grep -q "DEVICEROOT" setup.py ; then echo "Already patched" ; else
  sed -i "s%^[[:space:]]*math_libs = \[\].*$%if 'DEVICEROOT' in os.environ:\n    lib_dirs += [ os.environ['DEVICEROOT']+'/mnt/user/lib' ]\n    lib_dirs += [ os.environ['DEVICEROOT']+'/mnt/user/include' ]%" setup.py
fi
# We want utf-8, unicode terminal handling -- so make sure python compiles
# with ncursesw substituted for curses.
CURSESFLAGS=`pkg-config --cflags ncursesw`
# Configure python to be built
CFLAGS="${CFLAGS} ${CURSESFLAGS} -g3 -ggdb -gdwarf-4" ./configure \
  --host=${CROSSTARGET} --build=i686-linux --enable-unicode \
  --enable-shared --with-pydebug --prefix=/mnt/user --disable-ipv6 \
  --without-pymalloc ac_cv_file__dev_ptmx=yes ac_cv_file__dev_ptc=no \
  ac_cv_have_long_long_format=yes PYTHON_FOR_BUILD=${PWD}/python_for_build
# Fix a bug in the Makefile:
# the build version of python ought not to try to actually use the ARM
# libraries.
sed -i -e 's%\([[:space:]]\)\(PYTHONPATH=$(DESTDIR)$(LIBDEST)\)%\1-\2%' Makefile
echo "Fix the makefile if you can"
sleep 10
make PYTHON_FOR_BUILD=${PWD}/python_for_build CROSS_COMPILE_TARGET=yes
echo " Waiting to allow you to see error messages before installing "
sleep 10
# Optionally, binary file stripping could be carried out on the python
binary
# Don't strip if you are doing debugging of python
# strip --strip-unneeded python
make install DESTDIR=${DEVICEROOT} \
  PYTHON_FOR_BUILD=${PWD}/python_for_build \
  PGEN_FOR_BUILD=${PWD}/pgen_for_build
multi-result set MySQLdb queries.
Hi, I'm being forced to use "import MySQLdb" to access a server, and am not getting all my data back. I'm trying to send multiple queries all at once (for time reasons) and then extract the rows in bulk. The queries have different numbers of columns. For a contrived example:

script.db.query( '(SELECT Id, password, sessionFlags FROM account WHERE Id=(SELECT Id FROM ident where email="%s")); (SELECT * FROM identify);' % (prefix) )
resultStore = (script.db.store_result())
result = resultStore.fetch_row( maxrows=10 )
result = str(result) + ":::" + str( resultStore.fetch_row(maxrows=10) )

This ought to return two result sets; and under php -- it does. The first query returns 1 row; the second returns 4 rows. However, when I try to get the results using python's MySQLdb, I get:

((2L, 'abcdefg', 0L),):::()

which is wrong... I tried doing a "result store" twice, but executing it twice causes an exception (besides breaking the bulk transfer paradigm that I want...). Is this a bug in python/MySQLdb -- or is there another way this is supposed to be done?

--- Appendix (full code, with sensitive information foobar'd) ---

def m_database():
    global script
    import MySQLdb
    script.db = MySQLdb.connect( host = "foo.ipagemysql.com", user = "bar", passwd = "fobar", db = "bar" )
    prefix = "[email protected]"
    script.db.query( '(SELECT Id, password, sessionFlags FROM account WHERE Id=(SELECT Id FROM identify where email="%s")); (SELECT * FROM identify);' % (prefix) )
    # Get the ID number for the account,
    # and then go get the password for verification purposes.
    resultStore = (script.db.store_result())
    result = resultStore.fetch_row( maxrows=10 )
    # attempt to retrieve second query set
    # resultStore = (script.db.store_result())   # This will cause an exception... so commented out.
    result = str(result) + ":::" + str( resultStore.fetch_row(maxrows=10) )
    script.db.close()
    return result

-- http://mail.python.org/mailman/listinfo/python-list
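For what it's worth, MySQLdb's raw connection does expose a way to walk multiple result sets -- next_result() -- and multi-statement queries generally require the MULTI_STATEMENTS client flag at connect time. A sketch of the direction (not tested against this schema, so treat it as a pointer rather than a fix):

import MySQLdb
from MySQLdb.constants import CLIENT

db = MySQLdb.connect( host = "foo.ipagemysql.com", user = "bar",
                      passwd = "fobar", db = "bar",
                      client_flag = CLIENT.MULTI_STATEMENTS )
db.query( 'SELECT 1; SELECT 2, 3;' )
results = []
while True:
    store = db.store_result()
    if store is not None:
        results.append( store.fetch_row( maxrows=0 ) )  # maxrows=0 fetches all rows
    if db.next_result() == -1:   # -1 means there are no more result sets
        break
db.close()
print(results)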
Re: call from pthon to shell
On 02/12/2013 05:38 AM, Bqsj Sjbq wrote:
>>> import os
>>> os.system("i=3")
0
>>> os.system("echo $i")
0
why i can not get the value of i?
First:
os.system is only defined to give the return value (exit code) of the
sub-process.
However, one way to get the output of shell commands is to use subprocess.
import subprocess
x = subprocess.check_output( [ "echo", "3,5,7" ] )
However, bash built-ins are not executables; nor is shell expansion
performed; so you will actually need to do something like:
x=subprocess.check_output( [ "bash", "-c", "i=3; echo $i" ] )
>>> x
'3\n'
To get the result you're interested in.
There may be better ways to get the result you want -- a shorter
variant using shell=True is sketched below -- but hopefully
you understand the problem better.
:)
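The shell=True variant mentioned above hands the whole string to /bin/sh, so the assignment and the expansion happen in one shell:

import subprocess
# the string is run by the shell, so "i=3" and "$i" both work
x = subprocess.check_output( "i=3; echo $i", shell=True )
# x == '3\n'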
--
http://mail.python.org/mailman/listinfo/python-list
Re: FYI: AI-programmer
On 02/22/2013 07:21 PM, Ian Kelly wrote: On Fri, Feb 22, 2013 at 4:41 AM, Chris Angelico wrote:

That's not artificial intelligence, though. It's artificial program generation based on a known target output. The "Fitness" calculation is based on a specific target string. This is fine for devising a program that will produce the entire works of Shakespeare, since there is a target string for that (actually, several targets, plus you have to work out whether you want the works of Shakespeare or the works of some guy named Bacon... mmm bacon), but I suggest that a more sophisticated and useful goal be implemented.

Indeed, it seems to me that this is basically Richard Dawkins' weasel program, with the addition of a transformation step in the fitness function that amounts to running the string through a Brainfuck interpreter. There is a rather large gap between this and getting computers to generate programs that do anything interesting. I am curious about how he deals with infinite loops in the generated programs. Probably he just kills the threads after they pass some time threshold?

I'm under the impression that Python doesn't really allow you to kill a thread after a time period. It's not portable to do so:
http://eli.thegreenplace.net/2011/08/22/how-not-to-set-a-timeout-on-a-computation-in-python/
-- http://mail.python.org/mailman/listinfo/python-list
Re: FYI: AI-programmer
On 02/22/2013 08:23 PM, Ian Kelly wrote: On Fri, Feb 22, 2013 at 5:09 AM, Andrew Robinson wrote: On 02/22/2013 07:21 PM, Ian Kelly wrote:

I am curious about how he deals with infinite loops in the generated programs. Probably he just kills the threads after they pass some time threshold?

I'm under the impression that Python doesn't really allow you to kill a thread after a time period. It's not portable to do so.

He's using C#, not Python. And anyway, you can cooperatively request threads to shut down; just have the Brainfuck interpreter thread check for a shutdown request every N cycles, or even just place the timeout in the interpreter thread itself.

1st) It's still surprising that even C# would allow the killing of threads. One of the comments made on the site I linked noted that resources can be allocated by a thread and tied up; so those resources could be permanently tied up until process exit if the thread is killed. eg: killing a thread is different from killing the process it is running in. I'm not familiar enough with C# to know if it does garbage collection or not, or how de-allocation is handled; so perhaps I missed something.

2nd) How would you get an interpreter thread to check for a shutdown request every N cycles? I've read about how to set a timeout based on time, but not on any kind of cycle (eg: instruction cycle?) count. Do you have a python example? Thanks!
-- http://mail.python.org/mailman/listinfo/python-list
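On the '2nd)' question, the usual pattern counts interpreter cycles rather than wall time and polls a threading.Event; a sketch with a stub interpreter (every instruction is a no-op here -- the real Brainfuck step logic is assumed, not shown):

import threading

def interpret(program, stop_event, check_every=1000):
    pc = 0
    cycles = 0
    while pc < len(program):
        # ... execute program[pc] here; this stub treats every instruction as a no-op ...
        pc += 1
        cycles += 1
        if cycles % check_every == 0 and stop_event.is_set():
            print('interpreter: shutdown requested, abandoning run')
            return
    print('interpreter: finished normally')

stop = threading.Event()
t = threading.Thread(target=interpret, args=('+' * 10000, stop))
t.start()
t.join(5.0)          # give it up to 5 seconds of wall time
if t.is_alive():
    stop.set()       # cooperative request; the loop notices within N cycles
    t.join()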
Re: Suggested feature: slice syntax within tuples (or even more generally)?
On 02/14/2013 05:23 AM, Terry Reedy wrote: On 2/13/2013 2:00 PM, [email protected] wrote:

Hello, Would it be feasible to modify the Python grammar to allow ':' to generate slice objects everywhere rather than just indexers and top-level tuples of indexers? Right now in Py2.7, Py3.3: "obj[:,2]" yields "obj[slice(None),2]" but "obj[(:,1),2]" is an error, instead of "obj[(slice(None), 1), 2]" Also, more generally, you could imagine this working in (almost?) any expression without ambiguity: "a = (1:2)" could yield "a = slice(1,2)"

I've read through the whole of the subject, and the answer is no, although I think allowing it in (::) is a *very* good idea, including as a replacement for range or xrange.

s=1:2:3
for i in s:
for i in (1:2:3) :

and I really don't even mind

for i in s[a]:

or even a[1,2,5,11], where the indices are equivalent to a *sequence* other than xrange. Python evaluates right to left; this is semantically an iterator giving a[1],a[2],a[5],a[11].

This is not a new idea: eg: 2002 (which is still status OPEN). http://osdir.com/ml/python.patches/2002-06/msg00319.html

The python code in c-python is quite bloated; consolidating some of it, making it more consistent, and raising other parts to a high level language are, I think, the way of the future. I'm a fan of this to the point of implementing Python without a parser in the core, but as a script implicitly loaded *on demand*; much simpler and easier to modify at will and reuse mixed legacy code...

On Travis Oliphant: I agree... The numpy community's desire for readable slice functionality (and matrix compatible/intuitive code) is only going to get stronger with time. Python is attractive to the scientific community, but legacy-biased against clean matrix math... http://technicaldiscovery.blogspot.com/2011/06/python-proposal-enhancements-i-wish-i.html PEP 225's... desire for readability is important to me too... even if a fork happens. (An aside: I hate line noise, and fights, so UTF8 in the python interpreter, please...! a × b · c)

I doubt even people would know what s(x) does without looking around confusedly for a moment or three and searching for a definition buried in an import somewhere... Maybe D'Aprano likes it harder? I mean -- D'Aprano -- a comment on a real world case?

Oliphant says: """The biggest wart this would remove is the (ab)use of getitem to return new ranges and grids in NumPy (go use *mgrid* and *r_* in NumPy to see what I mean)."""

#= Stephenwlin ! (biggrin) """ But if there's no difference, then why have ':' work specially for '[]' operations at all instead of requiring the user to build slice objects manually all the time? """ YES! YES! YES! Oh yeah!

#= Duncan: (???) """ Would this be a dict with a slice as key or value, or a set with a slice with a step?: {1:2:3} """ I think it would be a syntax error, just like it is now. It's a syntax error anywhere a slice WOULD precede a colon. The syntax is simple parser LALR logic, and is already in place. But I doubt Stephen meant using it everywhere anyway; he did say """(almost?)""" Stephen, I'm sure, knew ahead of time that, eg: not 1+::1 is 2:: Besides, Stephen's already mentioned parentheses at least 4 times... A programmer can always add () where an ambiguity exists, and the parser can generate syntax errors in all places where an ambiguity could arise.

if x: # is never a slice, if 1: 2:
-- http://mail.python.org/mailman/listinfo/python-list
Re: Suggested feature: slice syntax within tuples (or even more generally)?
Errata: I made a typo in the middle of the night: eg: """Python evaluates right to left; this is semantically an iterator giving a[1],a[2],a[5],a[11]""" Sigh: Python iterates from left to right; --Andrew.
-- http://mail.python.org/mailman/listinfo/python-list
Re: Suggested feature: slice syntax within tuples (or even more generally)?
On 02/25/2013 10:28 AM, Ian Kelly wrote: On Sun, Feb 24, 2013 at 6:10 PM, Andrew Robinson wrote:

I've read through the whole of the subject, and the answer is no, although I think allowing it in (::) is a *very* good idea, including as a replacement for range or xrange. s=1:2:3 for i in s: for i in (1:2:3) :

Eww, no. I can appreciate the appeal of this syntax, but the problem is that ranges and slices are only superficially similar. For one, ranges require a stop value; slices do not. What should Python do with this: for i in (:):

The same thing it would do with slices. A slice is converted to an iterator at the time __getitem__ is called; it in fact has methods to compute the actual start and stop, based on the parameters given and the size of the object it is applied to. Slices are, therefore, *not* in fact infinite.

Intuitively, it should result in an infinite loop starting at 0. But ranges require a stop value for a very good reason -- it should not be this easy to accidentally create an infinite for loop.

It wouldn't; but even if it did, an *effective* infinite loop is already easy to create with xrange:

a = 10
...
a = 1.1e12
...
for i in xrange( int(a) ):

and, besides, the same is true with other constructions of loops:

while a: # Damn easy, if a is accidentally true!

I can go on, but it's rather pointless. Build a better protective device, and the world will find a luckier idiot for you. There isn't enough concrete to stop terrorists -- and not enough typing to stop bad programmers and pass the good ones.

So I would advocate that this should raise an error instead. If the user really wants an unlimited counting loop, let them continue to be explicit about it by using itertools.count. On the other hand, this would mean that the semantics of (:) would be different depending on whether the slice is used as a slice or a range.

No, it would be different depending on whether or not it was applied to an iterable; which is already true.

The next problem you run into is that the semantics of negative numbers are completely different between slices and ranges. Consider this code:

s = (-5:6)
for i in s: print(i)
for i in range(6)[s]: print(i)

I don't find this difference to be necessary, nor objectionable. It is less inconsistent, in my view, to allow that ([ 1,2,3,4,5 ])[-1:2] produce [5,1,2] than an empty list; and ([ 1,2,3,4,5])[2:-1] does produce an empty list. I have been looking for actual programs that this would break for over two months now, and I haven't been finding any. I am willing to run any mainstream application you can find on test-patched python!

Intuitively, both loops should print the same thing. After all, one is using the slice s as a range, and the other is using the very same slice s as a slice of a sequence where the indices and values are the same. This expectation fails, however. The first loop prints the integers from -5 to 5 inclusive, and the second loop only prints the integers from 1 to 5 inclusive.

YES! I like the way you think about consistency and intuition.

For these reasons, I disagree that allowing slices to be implicitly converted to ranges or vice versa is a good idea.

I respect your opinion and agree to simply disagree.
-- http://mail.python.org/mailman/listinfo/python-list
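The "methods to compute the actual start and stop" mentioned above are concrete: slice.indices(length) clamps and normalizes a slice for a given sequence length, which is exactly how a slice collapses to a finite range:

s = slice(-5, 6)
print(s.indices(6))     # (1, 6, 1): start/stop clamped for a length-6 sequence
start, stop, step = s.indices(6)
print(list(range(start, stop, step)))   # [1, 2, 3, 4, 5] -- matches range(6)[s]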
Re: Suggested feature: slice syntax within tuples (or even more generally)?
On 02/25/2013 04:54 PM, Ian Kelly wrote: On Mon, Feb 25, 2013 at 12:41 AM, Andrew Robinson wrote:

Intuitively, it should result in an infinite loop starting at 0. But ranges require a stop value for a very good reason -- it should not be this easy to accidentally create an infinite for loop. ... and, besides, the same is true with other constructions of loops: while a: # Damn easy, if a is accidentally true!

Notice I specifically said an "infinite *for* loop".

OK, so tit for tat. Notice I already showed an effective *accidental* "infinite" for loop, because I did notice you spoke about a *for* loop. And, obviously, in the case of the while loop I showed -- it was not meant to be True forever. It's a variable, which is subject to change.

I really do respect your opinion; but it's one of about 5 people that dominate this list, albeit those same people spend a lot of time helping others. Stephen is someone new to me, and I want to encourage his probing of the issue more than I want to advance my view.

P.S. I apologize about the e-mail clock; it seems I am sending my local time again -- and it's different from your timezone. I *wish* the python list computer would politely adjust it when *accidents* happen, or my OS's distribution would fix their bug -- but c'est la vie. I limp along with the status quo for now.
-- http://mail.python.org/mailman/listinfo/python-list
