Re: Negative array indicies and slice()

2012-10-31 Thread Andrew Robinson

On 10/30/2012 10:29 PM, Michael Torrie wrote:
> As this is the case, why this long discussion? If you are arguing for
> a change in Python to make it compatible with what this fork you are
> going to create will do, this has already been fairly thoroughly
> addressed early on, and reasons why the semantics will not change
> anytime soon have been given.


I'm not arguing for a change in the present release of Python, and I 
have never done so.
Historically, if a fork happens to produce something surprisingly 
_useful_, the main code base eventually accepts it on its own.  If a 
fork is a mistake, it dies on its own.


That really is the way things ought to be done.

   include this
   The Zen of Python, by _Tim Peters_
   
   Special cases aren't special enough to break the rules.
   Although _practicality beats purity_.
   

Now, I have seen several coded projects where the idea of cyclic lists 
is PRACTICAL;
and the idea of iterating slices may be practical if they could be made 
*FASTER*.


These warrant looking into -- and carefully -- and that means making an 
experimental fork, preferably before I attempt to micro-port Python.


Regarding the continuing discussion:
The more I learn, the more informed decisions I can make regarding 
implementation.

I now almost fully understand the questions I originally asked.

What remains are mostly questions about compatibility wrappers, and how 
to allow them to be used -- or selectively deleted when not necessary; 
and perhaps a demonstration or two about how slices and named tuples can 
(or can't) perform nearly the same function in slice processing.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Negative array indicies and slice()

2012-10-31 Thread Ian Kelly
On Tue, Oct 30, 2012 at 4:25 PM, Andrew Robinson
 wrote:
> Ian,
>
>> Looks like it's already been wontfixed back in 2006:
>
>> http://bugs.python.org/issue1501180
>
> Absolutely bloody typical, turned down because of an idiot.  Who the hell is
> Tim Peters anyway?
>
>> I don't really disagree with him, anyway.  It is a rather obscure bug
>> -- is it worth increasing the memory footprint of slice objects by 80%
>> in order to fix it?
>
> :D
>
> In either event, a *bug* does exist (at *least* 20% of the time.)  Tim
> Peters could have opened the *appropriate* bug complaint if he rejected the
> inappropriate one.

Where are you getting that 20% figure from?  Reference cycles
involving slice objects would be extremely rare, certainly far less
than 20%.

> The API ought to have either 1) included the garbage collection, or 2)
> raised an exception anytime dangerous/leaky data was supplied to slice().

How would you propose detecting the latter?  At the time data is
supplied to slice() it cannot refer to the slice, as the slice does
not exist yet.  The cycle has to be created after.
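
A minimal sketch of that ordering, with illustrative names (`container` and `s` are not from the thread). The slice can only end up in a cycle after it exists and has been stored in something it refers to; whether such a cycle is ever reclaimed depends on the interpreter version, which this sketch does not try to show.

```python
# A slice may hold arbitrary objects, not just ints or None.
container = []
s = slice(container)      # at this point `container` cannot yet refer to `s`

# The cycle is only created afterwards, by mutating the argument.
container.append(s)

# s.stop is the list, and the list's first element is the slice itself.
cycle_exists = s.stop[0] is s
print(cycle_exists)
```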

> If it is worth getting rid of the 4 words of extra memory required for the
> GC -- on account of slice() refusing to support data with sub-objects; then
> I'd also point out that a very large percentage of the time, tuples also
> contain data (typically integers or floats,) which do not further
> sub-reference objects.  Hence, it would be worth it there too.

I disagree.  The proportion of the time that a tuple contains other
collection objects is *much* greater.  This happens regularly.  OTOH,
if I had to hazard a guess at the frequency with which non-atomic
objects are used in slices, it would be a fraction of a fraction of a
fraction of a percent.

> I came across some unexpected behavior in Python 3.2 when experimenting with
> ranges and replacement
>
> Consider, xrange is missing, BUT:

More accurately, range is gone, and xrange has been renamed range.

> >>> a=range(1,5,2)
> >>> a[1]
> 3
> >>> a[2]
> 5
> >>> a[1:2]
> range(3, 5, 2)
>
> Now, I wondered if it would still print the array or not; eg: if this was a
> __str__ issue vs. __repr__.
>
> >>> print( a[1:2] ) # Boy, I have to get used to the print's parentheses
> range(3, 5, 2)
>
> So, the answer is *NOPE*.

I'm not sure why you would expect it to print a list here, without an
explicit conversion.  The result of calling range in Python 3 is a
range object, not a list.
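
A short sketch of the distinction, assuming Python 3.2 or later (where range objects support slicing):

```python
a = range(1, 5, 2)

# Slicing a range yields another range, and print() shows its repr.
piece = a[1:2]
print(piece)             # range(3, 5, 2)

# An explicit conversion is needed to see the elements as a list.
print(list(a))           # [1, 3]
print(list(piece))       # [3]
```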


Re: calling one staticmethod from another

2012-10-31 Thread Ulrich Eckhardt

Am 30.10.2012 18:23, schrieb Jean-Michel Pichavant:

- Original Message -
[snip]

I haven't figured out the justification for staticmethod,


http://en.wikipedia.org/wiki/Namespace
+
"Namespaces are one honking great idea -- let's do more of those!"

Someone may successfully use only modules as namespaces, but classes
can be used as well. It's up to you.


Indeed, see e.g. Steven D'Aprano's approach at formalizing that:

http://code.activestate.com/recipes/578279/
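
For illustration only, a minimal sketch of a class used purely as a namespace, with one static method calling another through the class name. The class and method names are made up for this example and are not taken from the recipe linked above.

```python
class Geometry:
    """Class used purely as a namespace; it is never instantiated."""

    @staticmethod
    def square(x):
        return x * x

    @staticmethod
    def hypotenuse_squared(a, b):
        # One static method reaches another through the class name.
        return Geometry.square(a) + Geometry.square(b)


print(Geometry.hypotenuse_squared(3, 4))  # 25
```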


Greetings!

Uli



Re: Negative array indicies and slice()

2012-10-31 Thread Steven D'Aprano
On Tue, 30 Oct 2012 21:33:32 +, Mark Lawrence wrote:

> On 30/10/2012 18:02, Ian Kelly wrote:
>> On Tue, Oct 30, 2012 at 10:14 AM, Ethan Furman 
>> wrote:
>>> File a bug report?
>>
>> Looks like it's already been wontfixed back in 2006:
>>
>> http://bugs.python.org/issue1501180
>>
>>
> Absolutely bloody typical, turned down because of an idiot.  Who the
> hell is Tim Peters anyway? :)

I see your smiley, but for the benefit of those who actually don't know 
who Tim Peters, a.k.a. the Timbot, is, he is one of the gurus of Python 
history. He invented Python's astonishingly excellent sort routine, 
Timsort, and popularised the famous adverbial phrase signoffs you will 
see in a lot of older posts.

Basically, he is in the pantheon of early Python demigods.


stop-me-before-i-start-gushing-over-the-timbot-ly y'rs,



-- 
Steven


working with yml files in python and opencv

2012-10-31 Thread inshu chauhan
How can I load a yml file in Python and work with it?

I used:

import cv
data = cv.Load("Z:/data/xyz_0_300.yml")

But when I print data, it just gives the details of the image, like the number
of rows and columns etc.
I want to read what is in the pixels of the image. Can somebody help?
Thanks in advance!


Re: datetime issue

2012-10-31 Thread Grant Edwards
On 2012-09-16,  ??  wrote:

> Iam positng via google groups using chrome, thats all i know.

Learn something else.  Google Groups is seriously and permanently
broken, and all posts from Google Groups are filtered out and ignored
by many people (including myself -- I only saw this because somebody
else replied to it).

IMO, the best option is to point a newsreader at an NNTP server that
carries comp.lang.python (gmane.org also provides an NNTP server that
gateways to the mailing list).  Pointing a newsreader at gmane's NNTP
server is also an excellent option.

If all you can do is run a browser, then I suggest using gmane.org:

   http://news.gmane.org/gmane.comp.python.general

> Whats a mailing list?

  http://en.wikipedia.org/wiki/Electronic_mailing_list
  http://www.python.org/community/lists/

The python mailing list is gatewayed to the Usenet newsgroup
comp.lang.python (which is where I read/post from):

  http://en.wikipedia.org/wiki/Usenet

> Can i get responses to my mail instead of constantly check the google
> groups site?

Yes.  You can subscribe directly to the list (which means you'll
receive a _lot_ of e-mail every day).

-- 
Grant Edwards   grant.b.edwards at gmail.com
Yow! If I felt any more SOPHISTICATED I would DIE of EMBARRASSMENT!


sort order for strings of digits

2012-10-31 Thread djc
I learn lots of useful things from the list, some not always welcome. No 
sooner had I found a solution to a minor inconvenience in my code, than 
a recent thread here drew my attention to the fact that it will not work 
for python 3. So suggestions please:


TODO 2012-10-22: sort order numbers first then alphanumeric
>>> n
('1', '10', '101', '3', '40', '31', '13', '2', '2000')
>>> s
('a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4')

>>> sorted(n)
['1', '10', '101', '13', '2', '2000', '3', '31', '40']
>>> sorted(s)
['1a', '222 bb', 'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']
>>> sorted(n+s)
['1', '10', '101', '13', '1a', '2', '2000', '222 bb', '3', '31', '40', 
'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']




Possibly there is a better way but for Python 2.7 this gives the 
required result


Python 2.7.3 (default, Sep 26 2012, 21:51:14)

>>> sorted(int(x) if x.isdigit() else x for x in n+s)
[1, 2, 3, 10, 13, 31, 40, 101, 2000, '1a', '222 bb', 'a', 'a1', 'ab', 
'acd', 'b a 4', 'bcd']



>>> [str(x) for x in sorted(int(x) if x.isdigit() else x for x in n+s)]
['1', '2', '3', '10', '13', '31', '40', '101', '2000', '1a', '222 bb', 
'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']



But not for Python 3
Python 3.2.3 (default, Oct 19 2012, 19:53:16)

>>> sorted(n+s)
['1', '10', '101', '13', '1a', '2', '2000', '222 bb', '3', '31', '40', 
'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']


>>> sorted(int(x) if x.isdigit() else x for x in n+s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: str() < int()
>>>

The best I can think of is to split the input sequence into two lists, 
sort each and then join them.



--
djc



Re: sort order for strings of digits

2012-10-31 Thread Hans Mulder
On 31/10/12 16:17:14, djc wrote:
> Python 3.2.3 (default, Oct 19 2012, 19:53:16)
> 
> >>> sorted(n+s)
> ['1', '10', '101', '13', '1a', '2', '2000', '222 bb', '3', '31', '40',
> 'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']
> 
> >>> sorted(int(x) if x.isdigit() else x for x in n+s)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: unorderable types: str() < int()


>>> sorted(n+s, key=lambda x:(x.__class__.__name__, x))
['1', '10', '101', '13', '1a', '2', '2000', '222 bb', '3', '31', '40',
'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']
>>>

> The best I can think of is to split the input sequence into two lists,
> sort each and then join them.

That might well be the most readable solution.
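
A minimal sketch of that split-sort-join approach, using the sample data from the original post. The function name `sort_labels` is illustrative, and the sketch assumes that "looks like a number" means `str.isdigit()` is true and the digits are plain ASCII.

```python
def sort_labels(labels):
    """Digit-only strings first in numeric order, then the rest lexically."""
    numeric = [x for x in labels if x.isdigit()]
    other = [x for x in labels if not x.isdigit()]
    return sorted(numeric, key=int) + sorted(other)


n = ('1', '10', '101', '3', '40', '31', '13', '2', '2000')
s = ('a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4')
result = sort_labels(n + s)
print(result)
```

Because no int is ever compared with a str, this runs unchanged on Python 2.7 and Python 3.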


Hope this helps,

-- HansM


Re: sort order for strings of digits

2012-10-31 Thread Ian Kelly
On Wed, Oct 31, 2012 at 9:17 AM, djc  wrote:
> The best I can think of is to split the input sequence into two lists, sort
> each and then join them.

In the example you have given they already seem to be split, so you
could just do:

sorted(n, key=int) + sorted(s)

If that's not really the case, then you could construct (str, int)
tuples as sort keys:

sorted(n+s, key=lambda x: ('', int(x)) if x.isdigit() else (x, -1))

Note that the empty string sorts before all numbers here, which may or
may not be desirable.
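
To see the effect of that note, here is a small sketch with the same key wrapped in a named function (`sort_key` and the sample list are illustrative):

```python
def sort_key(x):
    # Same key as above: digit strings compare as ('', number),
    # everything else as (string, -1).
    return ('', int(x)) if x.isdigit() else (x, -1)


mixed = ['10', 'b', '', '2', 'a1']
result = sorted(mixed, key=sort_key)
print(result)   # ['', '2', '10', 'a1', 'b']
```

The empty string gets the key `('', -1)`, which is smaller than `('', n)` for any non-negative `n`, so it lands in front of the numbers.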


Re: Float to String

2012-10-31 Thread Mark Dickinson
  3ds.com> writes:

> When formatting a float using the exponential format, the rounding is
> different in Python-2.6 and Python-2.7. See example below.  Is this
> intentional?

Yes, in a sense.  Python <= 2.6 uses the OS-provided functionality (e.g., the C
library's strtod, dtoa and sprintf functions) to do float-to-string and
string-to-float conversions, and hence behaves differently from platform to
platform.  In particular, it's common for near halfway cases (like the one
you're looking at here) and tiny numbers to give different results on different
platforms.  Python >= 2.7 has its own built-in code for performing
float-to-string and string-to-float conversions, so those conversions are
platform-independent and always correctly rounded.  (Nitpick: it's still
theoretically possible for Python 2.7 to use the OS code if it can't determine
the floating-point format, or if it can't find a way to ensure the proper FPU
settings, but I don't know of any current platforms where that's the case.)
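
As an illustration of a near-halfway case (the value 2.675 is an example chosen here, not the one from the original question), this sketch assumes Python 2.7 or later. It shows the exact binary value behind the literal and the correctly rounded result that follows from it:

```python
from decimal import Decimal

x = 2.675
# The binary double nearest to 2.675 is slightly below it.
exact = Decimal(x)
print(exact)

# Correctly rounded conversions therefore round down.
print('%.2f' % x)    # 2.67
print('%.2e' % x)    # 2.67e+00
```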

> Is there any way of forcing the Python-2.6 behavior (for compatibility
> reasons when testing)?

Not easily, no.

--
Mark




Re: Negative array indicies and slice()

2012-10-31 Thread Mark Lawrence

On 31/10/2012 10:07, Steven D'Aprano wrote:

On Tue, 30 Oct 2012 21:33:32 +, Mark Lawrence wrote:


Absolutely bloody typical, turned down because of an idiot.  Who the
hell is Tim Peters anyway? :)


I see your smiley, but for the benefit of those who actually don't know
who Tim Peters, a.k.a. the Timbot, is, he is one of the gurus of Python
history. He invented Python's astonishingly excellent sort routine,
Timsort, and popularised the famous adverbial phrase signoffs you will
see in a lot of older posts.

Basically, he is in the pantheon of early Python demigods.

stop-me-before-i-start-gushing-over-the-timbot-ly y'rs,



4 / 10, must try harder, the omission of the Zen of Python is considered 
a very serious matter :)


--
Cheers.

Mark Lawrence.



Need 4 Python Developer // Times Sq - New York // 15+ Months Contract.

2012-10-31 Thread raj
Hi Friends,
 
Hope you are doing great.
 
This is Rajesh from NYTP.
I wanted to let you know about New Job opening in Times Sq - New York. It is a 
15+ months Contract.

Role :  Python Developer
Location   :  Times Sq - New York
Duration   :  15+ Months Contract

Positions :  4
 
Project currently in the Development Phase

Qualifications:

   4-6 years of development experience with object-oriented languages
   3+ years of Python development experience
   Knowledge of HTML5 and Javascript is a plus
   Experience developing applications for AWS
   Agile development experience a plus
   Demonstrated ability to work in a team environment
   Good unit testing practices
   Good communication and documentation skills
   Willingness to interact and work with different teams across 
organizations in different time zones
   Willingness to work overtime and weekends if required
   Bachelor’s degree in Computer Science
   Required Skills:
   Strong in object-oriented concepts and Python language
   Experience developing web applications with Tornado
   Working knowledge of software design patterns
   Familiar with N-Tier caching strategies
   Familiar with REST and JSON
   Knowledgeable about MongoDB and REDIS 

If you are available and interested in this positions. Please send me an 
updated resume. Please feel free to contact me for any further information.
___
 
New York Technology Partners – Rochester

Rajesh Kaluri
332 Jefferson Rd.
Rochester, NY 14623 
Phone: (201) 680 - 0200 x7023   

Fax: (201) 474 - 8533
[email protected]
www.nytp.com
Profile : http://in.linkedin.com/pub/k-rajeshwar/8/51a/13

 



Re: Obnoxious postings from Google Groups

2012-10-31 Thread rurpy
On 10/30/2012 11:07 PM, Robert Miles wrote:
> On 9/16/2012 8:18 AM, Ben Finney wrote:
>> Νικόλαος Κούρας  writes:
>>
>>> Iam sorry i didnt do that on purpose and i dont know how this is done.
>>>
>>> Iam positng via google groups using chrome, thats all i know.
>>
>> It is becoming quite clear that some change has happened recently to
>> Google Groups that makes posts coming from there rather more obnoxious
>> than before. And there doesn't seem to be much its users can do except
>> use something else.

You (BF) are wrong that there "doesn't seem to be much its users
can do..." and I explained why previously.  However, since you have
advocated killfiling anyone using GG (which I do) you probably didn't
see my post.  If you choose intentional ignorance that is your choice
but you do a disservice to the community by advocating that others
do the same.

(Officer, I don't deserve this ticket because I couldn't see the 
traffic signal was red; I had my eyes closed. :-)

> You're probably referring to their change in the way they handle
> end-of-lines, which is now incompatible with most newsreaders,
> especially with multiple levels of quoting.

It's a minor pain to fix this when posting, but

1. It is fixable (and a previous post of mine gave a couple of ways)
2. The double spacing is obvious in Google's compose window
 so if one posts anyway, it is a matter of laziness.

> The incompatibility tends to insert a blank line after every line.
> With multiple levels of quoting, this gives blank line groups that
> often roughly double in size for every level of quoting.
> 
> Robert Miles


Re: datetime issue

2012-10-31 Thread rurpy
On 10/31/2012 09:11 AM, Grant Edwards wrote:
> On 2012-09-16,  ??  wrote:
> 
>> Iam positng via google groups using chrome, thats all i know.
> 
> Learn something else.  Google Groups is seriously and permanently
> broken, and all posts from Google Groups are filtered out and ignored
> by many people (including myself -- I only saw this because somebody
> else replied to it).

"Broken"?  Yes.  But so is every piece of software in one way 
or another.  Thunderbird is one of the most perpetually buggy
pieces of software I have ever used on a continuing basis.

"Seriously"?  That's pretty subjective.  I manage to use it
without major problems so it couldn't be that bad.  I posted
previously on how to use it without the double posts or the
double spacing.

"Permanently"?  Your ability to foretell the future leaves me
in awe.  :-)

Feel free to filter whatever you want but be aware that in 
doing so you risk missing information that could help you
avoid disseminating erroneous info.  Of course, carrying out
some kind of private war against Google Groups may be more
important to you than that... 


RE: date and time comparison how to

2012-10-31 Thread Prasad, Ramit
Gary Herron wrote:
> On 10/29/2012 04:13 PM, noydb wrote:
> > All,
> >
> > I need help with a date and time comparison.
> >
> > Say a user enters a date-n-time and a file on disk.  I want to compare the
> > date and time of the file to the entered date-n-time; if the file is newer
> > than the entered date-n-time, add the file to a list to process.
> >
> > How best to do?  I have looked at the datetime module, tried a few things,
> > no luck.
> >
> > Is os.stat a part of it?  Tried, not sure of the output, the
> > st_mtime/st_ctime doesnt jive with the file's correct date and time.  ??
> >
> > Any help would be appreciated!
> 
> Use the datetime module (distributed with Python) to compare date/times.
> 
> You can turn a filesystem time into a datetime with something like the
> following:
>  import datetime, os, stat
>  mtime = os.lstat(filename)[stat.ST_MTIME]   # the file's modification time
>  dt = datetime.datetime.fromtimestamp(mtime)
> 

You could also write that as:

datetime.datetime.fromtimestamp( os.path.getmtime( path ) )
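
Putting that together with the original question, a minimal sketch of the comparison might look like this. The helper name `is_newer_than`, the temporary file and the hard-coded cutoff are all assumptions made for the demonstration; in real use the cutoff would be parsed from the user's input. Both values are naive local-time datetimes, so they can be compared directly.

```python
import datetime
import os
import tempfile


def is_newer_than(path, cutoff):
    """Return True if the file at `path` was modified after `cutoff`."""
    mtime = datetime.datetime.fromtimestamp(os.path.getmtime(path))
    return mtime > cutoff


# Demonstration with a throwaway file, which has just been modified.
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name

cutoff = datetime.datetime(2012, 10, 29, 16, 13)   # stand-in for user input
to_process = [p for p in [path] if is_newer_than(p, cutoff)]
print(to_process == [path])
os.remove(path)
```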


Ramit P




Re: Negative array indicies and slice()

2012-10-31 Thread Ian Kelly
On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson
wrote:

> Then; I'd note:  The non-goofy purpose of slice is to hold three data
> values;  They are either numbers or None.  These *normally* encountered
> values can't create a memory loop.
> So, FOR AS LONG, as the object representing slice does not contain an
> explicit GC pair; I move that we mandate (yes, in the current python
> implementation, even as a *fix*) that its named members may not be assigned
> any objects other than None or numbers
>
> eg: Lists would be forbidden
>
> Since functions, and subclasses, can be test evaluated by int(
> the_thing_to_try ) and *[] can too,
> generality need not be lost for generating nothing or numbers.
>

PEP 357 requires that anything implementing the __index__ special method be
allowed for slicing sequences (and also that __index__ be used for the
conversion).  For the most part, that includes ints and numpy integer
types, but other code could be doing esoteric things with it.
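
A small sketch of what that means in practice. The `Offset` class is hypothetical and exists only to show an object that is not an int being accepted as an index and as a slice bound because it implements `__index__`:

```python
class Offset:
    """Hypothetical integer-like type; not an int subclass."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Called by sequences when this object is used as an index or slice bound.
        return self.value


data = list(range(10))
print(data[Offset(2):Offset(5)])   # [2, 3, 4]
print(data[Offset(-1)])            # 9
```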

The change would be backward-incompatible in any case, since there is
certainly code out there that uses non-numeric slices -- one example has
already been given in this thread.

> And more wonderful yet, when I do extended slice replacement -- it gives me
> results beyond my wildest imaginings!


> >>> a=[0,1,2,3,4,5]
> >>> a[4:5]=range( 0, 3 ) # Size origin=1, Size dest =3
> >>> a
> [0, 1, 2, 3, 0, 1, 2, 5]  # Insert on top of replacement
> >>>
> But !!!NOT!!! if I do it this way:
> >>> a[4]=range( 0, 3 )
> >>> a
> [0, 1, 2, 3, range(0, 3), 1, 2, 5]
> >>>
>

That's nothing to do with range or Python 3.  It's part of the difference
between slice assignment and index assignment.  The former unpacks an
iterable, and the latter assigns a single object.  You'd get the same
behavior with lists:

>>> a = list(range(6))
>>> a[4:5] = list(range(3))
>>> a
[0, 1, 2, 3, 0, 1, 2, 5]
>>> a = list(range(6))
>>> a[4] = list(range(3))
>>> a
[0, 1, 2, 3, [0, 1, 2], 5]

Slice assignment unpacks the list; index assignment assigns the list itself
at the index.


Re: sort order for strings of digits

2012-10-31 Thread Mark Lawrence

On 31/10/2012 18:17, Dennis Lee Bieber wrote:


Why -- I doubt Python 3.x .sort() and sorted() have removed the
optional key and cmp keywords.



Nope.  I'm busy porting my own code from 2.7 to 3.3 and cmp seems to be 
very dead.


This doesn't help either.

c:\Users\Mark\Cash\Python>2to3.py
Traceback (most recent call last):
  File "C:\Python33\Tools\Scripts\2to3.py", line 3, in <module>
    from lib2to3.main import main
ImportError: No module named main

--
Cheers.

Mark Lawrence.



Re: datetime issue

2012-10-31 Thread Mark Lawrence

On 31/10/2012 19:35, [email protected] wrote:

On 10/31/2012 09:11 AM, Grant Edwards wrote:> On 2012-09-16,  ?? 
 wrote:.

"Broken"?  Yes.  But so is every piece of software in one way
or another.  Thunderbird is one of the most perpetually buggy
pierces of software I have ever used on a continuing basis



Please provide evidence that Thunderbird is buggy.  I use it quite 
happily, don't have problems, and have never seen anybody complaining 
about it.


--
Cheers.

Mark Lawrence.



Re: sort order for strings of digits

2012-10-31 Thread Ian Kelly
On Wed, Oct 31, 2012 at 3:33 PM, Mark Lawrence wrote:

> Nope.  I'm busy porting my own code from 2.7 to 3.3 and cmp seems to be
> very dead.
>
> This doesn't help either.
>
> c:\Users\Mark\Cash\Python>2to3.py
>
> Traceback (most recent call last):
>   File "C:\Python33\Tools\Scripts\2to3.py", line 3, in <module>
>     from lib2to3.main import main
> ImportError: No module named main
>

Perhaps you have a sys.path conflict?

Use functools.cmp_to_key for porting cmp functions.  "sorted(x, my_cmp)"
becomes "sorted(x, key=cmp_to_key(my_cmp))"

The cmp builtin is also gone.  If you need it, the suggested replacement
for "cmp(a, b)" is "(b < a) - (a < b)".
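
A short sketch combining both suggestions on Python 3. The comparison function `by_abs` and the sample list are illustrative:

```python
from functools import cmp_to_key


def cmp(a, b):
    # Replacement for the removed builtin: returns -1, 0 or 1.
    return (b < a) - (a < b)


def by_abs(a, b):
    # Old-style comparison function, kept as it was written for Python 2.
    return cmp(abs(a), abs(b))


numbers = [5, -6, 1, -2, 9, -8, 4, 3, -7, 2, -3]
result = sorted(numbers, key=cmp_to_key(by_abs))
print(result)   # [1, -2, 2, 3, -3, 4, 5, -6, -7, -8, 9]
```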

Cheers,
Ian


Re: Obnoxious postings from Google Groups

2012-10-31 Thread Steven D'Aprano
On Wed, 31 Oct 2012 12:32:57 -0700, rurpy wrote:

[...]
>> You're probably referring to their change in the way they handle
>> end-of-lines, which is now incompatible with most newsreaders,
>> especially with multiple levels of quoting.
> 
> It's a minor pain to fix this when posting, but
> 
> 1. It is fixable (and previous post of mine gave a couple ways) 2. The
> double spacing is obvious in Google's compose window
>  so if one posts anyway, it is a matter of laziness.

I don't killfile merely for posting from Gmail or Google Groups, but 
regarding your second point, it has seemed to me for some years now that 
Gmail is the new Hotmail, which was the new AOL. Whenever there is an 
inane, lazy, mind-numbingly stupid question or post, chances are 
extremely high that the sender has a Gmail address.



-- 
Steven


Re: sort order for strings of digits

2012-10-31 Thread Steven D'Aprano
On Wed, 31 Oct 2012 15:17:14 +, djc wrote:

> The best I can think of is to split the input sequence into two lists,
> sort each and then join them.

According to your example code, you don't have to split the input because 
you already have two lists, one filled with numbers and one filled with 
strings.

But I think that what you actually have is a single list of strings, and 
you are supposed to sort the strings such that they come in numeric order 
first, then alphanumerical. E.g.:

['9', '1000', 'abc2', '55', '1', 'abc', '55a', '1a']
=> ['1', '1a', '9', '55', '55a', '1000', 'abc', 'abc2']

At least that is what I would expect as the useful thing to do when 
sorting.

The trick is to take each string and split it into a leading number and a 
trailing alphanumeric string. Either part may be "empty". Here's a pure 
Python solution:

from sys import maxsize  # use maxint in Python 2
def split(s):
    for i, c in enumerate(s):
        if not c.isdigit():
            break
    else:  # aligned with the FOR, not the IF
        return (int(s), '')
    return (int(s[:i] or maxsize), s[i:])

Now sort using this as a key function:

py> L = ['9', '1000', 'abc2', '55', '1', 'abc', '55a', '1a']
py> sorted(L, key=split)
['1', '1a', '9', '55', '55a', '1000', 'abc', 'abc2']


The above solution is not quite general:

* it doesn't handle negative numbers or numbers with a decimal point;

* it doesn't handle the empty string in any meaningful way;

* in practice, you may or may not want to ignore leading whitespace,
  or trailing whitespace after the number part;

* there's a subtle bug if a string contains a very large numeric prefix,
  finding and fixing that is left as an exercise.
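
One possible variant, offered as a sketch rather than as the intended answer to the exercise: it assumes the bug in question is that a numeric prefix at least as large as maxsize no longer sorts before the strings with no numeric prefix. Using an explicit leading flag instead of a sentinel number avoids that, and also gives the empty string a defined place. The name `split_key` is made up here, and only plain ASCII digits are assumed.

```python
def split_key(s):
    """Key: strings with a numeric prefix first, ordered by that number."""
    digits = 0
    while digits < len(s) and s[digits].isdigit():
        digits += 1
    if digits == 0:
        # No numeric prefix (this includes the empty string): sort after numbers.
        return (1, 0, s)
    return (0, int(s[:digits]), s[digits:])


L = ['9', '1000', 'abc2', '55', '1', 'abc', '55a', '1a']
print(sorted(L, key=split_key))

huge = '9' * 30 + 'x'          # prefix far larger than sys.maxsize
print(sorted(['abc', huge], key=split_key) == [huge, 'abc'])
```

Negative numbers, decimal points and surrounding whitespace are still not handled.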



-- 
Steven


Re: sort order for strings of digits

2012-10-31 Thread Steven D'Aprano
On Wed, 31 Oct 2012 19:05:17 -0400, Dennis Lee Bieber wrote:

>> The cmp builtin is also gone.  If you need it, the suggested
>> replacement for "cmp(a, b)" is "(b < a) - (a < b)".
>>
>   OUCH... Just another reason for me to hang onto the 2.x series as
> long as possible 

On the contrary. If you are using cmp with sort, your sorts are slow, and 
you should upgrade to using a key function as soon as possible.

For small lists, you may not notice, but for large lists using a 
comparison function is a BAD IDEA.

Here's an example: sorting a list of numbers by absolute value.

py> L = [5, -6, 1, -2, 9, -8, 4, 3, -7, 2, -3]
py> sorted(L, key=abs)
[1, -2, 2, 3, -3, 4, 5, -6, -7, -8, 9]
py> sorted(L, lambda a, b: cmp(abs(a), abs(b)))
[1, -2, 2, 3, -3, 4, 5, -6, -7, -8, 9]

But the amount of work done is radically different. Let's temporarily 
shadow the built-ins with patched versions:

py> _abs = abs
py> _abs, _cmp = abs, cmp
py> c1 = c2 = 0
py> def abs(x):
...     global c1
...     c1 += 1
...     return _abs(x)
...
py> def cmp(a, b):
...     global c2
...     c2 += 1
...     return _cmp(a, b)
...

Now we can see just how much work is done under the hood using a key 
function vs a comparison function:

py> sorted(L, key=abs)
[1, -2, 2, 3, -3, 4, 5, -6, -7, -8, 9]
py> c1
11

So the key function is called once for each item in the list. But:


py> c1 = 0  # reset the count
py> sorted(L, lambda a, b: cmp(abs(a), abs(b)))
[1, -2, 2, 3, -3, 4, 5, -6, -7, -8, 9]
py> c1, c2
(54, 27)

The comparison function is called 27 times for a list of eleven items (an 
average of about 2.5 calls to cmp per item), and abs is called twice for each 
call to cmp. (Well, duh.)

If the list is bigger, it gets worse:

py> c2 = 0
py> x = sorted(L*10, lambda a, b: cmp(abs(a), abs(b)))
py> c2
592

That's an average of 5.4 calls to cmp per item. And it gets even worse as 
the list gets bigger.
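
The session above relies on Python 2's cmp argument and builtin. A rough Python 3 equivalent of the counting experiment is sketched below, using functools.cmp_to_key and illustrative counter names. The exact number of comparisons depends on the sort implementation and the input order, so no figure is quoted for it.

```python
from functools import cmp_to_key

key_calls = 0
cmp_calls = 0


def counted_abs(x):
    global key_calls
    key_calls += 1
    return abs(x)


def counted_cmp(a, b):
    global cmp_calls
    cmp_calls += 1
    return (abs(b) < abs(a)) - (abs(a) < abs(b))


L = [5, -6, 1, -2, 9, -8, 4, 3, -7, 2, -3]
by_key = sorted(L, key=counted_abs)
by_cmp = sorted(L, key=cmp_to_key(counted_cmp))

print(key_calls)   # one call per item
print(cmp_calls)   # varies with the implementation and the data
```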

As your lists get bigger, the amount of work done calling the comparison 
function gets ever bigger still. Sorting large lists with a comparison 
function is SLOOOW.

py> del abs, cmp  # remove the monkey-patched versions
py> L = L*100
py> with Timer():
...     x = sorted(L, key=abs)
...
time taken: 9.165448 seconds
py> with Timer():
...     x = sorted(L, lambda a, b: cmp(abs(a), abs(b)))
...
time taken: 63.579679 seconds


The Timer() context manager used can be found here:

http://code.activestate.com/recipes/577896
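
For readers without access to the recipe, a minimal stand-in can be sketched as follows. This is not the recipe's code, only an assumption about the least such a context manager needs to do, and it requires Python 3.3 or later for time.perf_counter.

```python
import time


class Timer:
    """Minimal stand-in: times the body of a `with` block."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self.start
        print('time taken: %f seconds' % self.elapsed)
        return False   # do not swallow exceptions


with Timer() as t:
    x = sorted(range(1000), key=abs)
```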



-- 
Steven


Re: sort order for strings of digits

2012-10-31 Thread DJC

On 31/10/12 23:09, Steven D'Aprano wrote:

On Wed, 31 Oct 2012 15:17:14 +, djc wrote:


The best I can think of is to split the input sequence into two lists,
sort each and then join them.


According to your example code, you don't have to split the input because
you already have two lists, one filled with numbers and one filled with
strings.


Sorry for the confusion, the pair of strings was just a way of testing 
variations on the input. So a sequence with any combination of strings 
that can be read as numbers and strings of chars that don't look like 
numbers (even if that string includes digits) is the expected input




But I think that what you actually have is a single list of strings, and
you are supposed to sort the strings such that they come in numeric order
first, then alphanumerical. E.g.:

['9', '1000', 'abc2', '55', '1', 'abc', '55a', '1a']
=> ['1', '1a', '9', '55', '55a', '1000', 'abc', 'abc2']


Not quite, what I want is to ensure that if the strings look like 
numbers they are placed in numerical order. ie 1 2 3 10 100 not 1 10 100 
2 3. Cases where a string has some leading digits can be treated as 
strings like any other.



At least that is what I would expect as the useful thing to do when
sorting.


Well it depends on the use case. In my case the strings are column and 
row labels for a report. I want them to be presented in a convenient to 
read sequence. Which the lexical sorting of the strings that look like 
numbers is not. I want a reasonable do-what-i-mean default sort order 
that can handle whatever strings are used.





The trick is to take each string and split it into a leading number and a
trailing alphanumeric string. Either part may be "empty". Here's a pure
Python solution:

from sys import maxsize  # use maxint in Python 2
def split(s):
    for i, c in enumerate(s):
        if not c.isdigit():
            break
    else:  # aligned with the FOR, not the IF
        return (int(s), '')
    return (int(s[:i] or maxsize), s[i:])

Now sort using this as a key function:

py> L = ['9', '1000', 'abc2', '55', '1', 'abc', '55a', '1a']
py> sorted(L, key=split)
['1', '1a', '9', '55', '55a', '1000', 'abc', 'abc2']


The above solution is not quite general:

* it doesn't handle negative numbers or numbers with a decimal point;

* it doesn't handle the empty string in any meaningful way;

* in practice, you may or may not want to ignore leading whitespace,
   or trailing whitespace after the number part;

* there's a subtle bug if a string contains a very large numeric prefix,
   finding and fixing that is left as an exercise.


That looks more than general enough for my purposes! I will experiment 
along those lines, thank you.





Re: Obnoxious postings from Google Groups

2012-10-31 Thread Arnaud Delobelle
On 31 October 2012 22:33, Steven D'Aprano
 wrote:
[...]
> I don't killfile merely for posting from Gmail

And we are humbly grateful.

-- 
Arnaud


Re: datetime issue

2012-10-31 Thread Robert Miles

On 10/31/2012 2:35 PM, [email protected] wrote:

On 10/31/2012 09:11 AM, Grant Edwards wrote:
> On 2012-09-16, ?? wrote:



I am posting via Google Groups using Chrome; that's all I know.


Learn something else.  Google Groups is seriously and permanently
broken, and all posts from Google Groups are filtered out and ignored
by many people (including myself -- I only saw this because somebody
else replied to it).


"Seriously"?  That's pretty subjective.  I manage to use it
without major problems so it couldn't be that bad.  I posted
previously on how to use it without the double posts or the
double spacing.


If you're using it for reasonable purposes, you won't encounter
its worst flaw.  It's much too easy for spammers to use for
posting spam.  I'd estimate that about 99% of the world's
newsgroups spam in English is posted through Google Groups.



Re: datetime issue

2012-10-31 Thread Robert Miles

On 10/31/2012 4:38 PM, Mark Lawrence wrote:

On 31/10/2012 19:35, [email protected] wrote:

On 10/31/2012 09:11 AM, Grant Edwards wrote:
> On 2012-09-16, ?? wrote:

"Broken"?  Yes.  But so is every piece of software in one way
or another.  Thunderbird is one of the most perpetually buggy
pieces of software I have ever used on a continuing basis



Please provide evidence that Thunderbird is buggy.  I use it quite
happily, don't have problems, and have never seen anybody complaining
about it.


Why should they complain about it in this newsgroup?

Most of the people who complain about it know that complaining
in newsgroup mozilla.support.thunderbird is much more likely
to get any problems fixed.  Rather few newsgroups servers are
allowed to carry that newsgroup; news.mozilla.org is one of them.

The newsgroups section of Thunderbird seems to have more bugs
than the email section, partly because there are more volunteers
interested in working on the email section.



Re: sort order for strings of digits

2012-10-31 Thread Mark Lawrence

On 31/10/2012 22:24, Ian Kelly wrote:

On Wed, Oct 31, 2012 at 3:33 PM, Mark Lawrence wrote:


Nope.  I'm busy porting my own code from 2.7 to 3.3 and cmp seems to be
very dead.

This doesn't help either.

c:\Users\Mark\Cash\Python>2to3.py

Traceback (most recent call last):
  File "C:\Python33\Tools\Scripts\2to3.py", line 3, in <module>
    from lib2to3.main import main
ImportError: No module named main



Perhaps you have a sys.path conflict?


Correct, now fixed, thanks.



Use functools.cmp_to_key for porting cmp functions.  "sorted(x, my_cmp)"
becomes "sorted(x, key=cmp_to_key(my_cmp))"

The cmp builtin is also gone.  If you need it, the suggested replacement
for "cmp(a, b)" is "(b < a) - (a < b)".
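A minimal sketch of both suggestions together, assuming Python 3. The comparison function by_length_then_alpha and the word list are made up for illustration:

```python
from functools import cmp_to_key

def cmp(a, b):
    """Stand-in for the removed builtin: returns -1, 0 or 1."""
    return (b < a) - (a < b)

def by_length_then_alpha(a, b):
    """Hypothetical old-style comparison: shorter first, then alphabetical."""
    return cmp(len(a), len(b)) or cmp(a, b)

words = ['pear', 'fig', 'apple', 'kiwi']
result = sorted(words, key=cmp_to_key(by_length_then_alpha))
print(result)   # ['fig', 'kiwi', 'pear', 'apple']
```

The same ordering could be written directly as key=lambda w: (len(w), w), which is usually the better end state once the port is working.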


As it's my own small code base I've blown away all references to cmp, 
it's rich comparisons all the way.




Cheers,
Ian



--
Cheers.

Mark Lawrence.



Re: datetime issue

2012-10-31 Thread Mark Lawrence

On 01/11/2012 00:23, Robert Miles wrote:

On 10/31/2012 4:38 PM, Mark Lawrence wrote:

On 31/10/2012 19:35, [email protected] wrote:

On 10/31/2012 09:11 AM, Grant Edwards wrote:
> On 2012-09-16, ?? wrote:

"Broken"?  Yes.  But so is every piece of software in one way
or another.  Thunderbird is one of the most perpetually buggy
pieces of software I have ever used on a continuing basis



Please provide evidence that Thunderbird is buggy.  I use it quite
happily, don't have problems, and have never seen anybody complaining
about it.


Why should they complain about it in this newsgroup?


I'm reading all the Python *MAILING LISTS* that I'm interested in with 
Thunderbird.  Here.  Now.  So do a lot of other people.  If they weren't 
happy with it, they'd be stating so here when this type of discussion 
came up.


--
Cheers.

Mark Lawrence.



Re: Python garbage collector/memory manager behaving strangely

2012-10-31 Thread Robert Miles

On 9/16/2012 9:12 PM, Dave Angel wrote:

On 09/16/2012 09:07 PM, Jadhav, Alok wrote:

Hi Everyone,



I have a simple program which reads a large file containing a few million
rows, parses each row (`numpy array`), converts it into an array of
doubles (`python array`), and later writes it into an `hdf5 file`. I repeat
this loop for multiple days. After reading each file, I delete all the
objects and call the garbage collector.  When I run the program, the first
day is parsed without any error, but on the second day I get a `MemoryError`.
I monitored the memory usage of my program: during the first day of parsing,
memory usage is around **1.5 GB**. When the first day's parsing is
finished, memory usage goes down to **50 MB**. Now when the second day starts
and I try to read the lines from the file, I get a `MemoryError`. Following
is the output of the program.


Is it a 32-bit program?  If so, expect the maximum amount of memory it
can use to hold the program, its current dataspace, and images of all
the files it has open to be about 3.5 GB, even if it is running on a
64-bit computer with over 4 GB of memory.  It seems that 32-bit
addresses can only refer to 4 GB of memory, and part of that 4 GB
must be used for whatever the operating system needs for running
32-bit programs.  With some of the older compilers, only 2 GB can be
used for the program; the other 2 GB is reserved for the operating system.
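Whether the interpreter itself is a 32-bit build can be checked from inside Python. A minimal sketch using only the standard library (the helper names pointer_bits and is_64bit are made up for illustration):

```python
import struct
import sys

def pointer_bits():
    """Width of a native pointer in bits for this interpreter build."""
    return struct.calcsize("P") * 8

def is_64bit():
    """True when sys.maxsize does not fit in 32 bits."""
    return sys.maxsize > 2**32

print(pointer_bits(), is_64bit())
```

This reports on the running Python process, not on the operating system or the hardware, so a 32-bit Python on a 64-bit machine shows up as 32.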

How practical would it be to have that program run twice a day?
The first time, it should ignore all the data for the second half
of the day; the second time, it should ignore all the data for the
first half of the day.



Re: sort order for strings of digits

2012-10-31 Thread Chris Angelico
On Thu, Nov 1, 2012 at 10:44 AM, Steven D'Aprano
 wrote:
> On the contrary. If you are using cmp with sort, your sorts are slow, and
> you should upgrade to using a key function as soon as possible.
>

But cmp_to_key doesn't actually improve anything. So I'm not sure how
Py3 has achieved anything; Py2 supported key-based sorting already.
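The difference between the two styles can be made visible by counting calls. A rough sketch, assuming Python 3. The counters, helper names and test data are invented for illustration, and the exact comparison count depends on the input:

```python
from functools import cmp_to_key

key_calls = 0
cmp_calls = 0

def length_key(s):
    """Key function: called once per item."""
    global key_calls
    key_calls += 1
    return len(s)

def length_cmp(a, b):
    """Old-style comparison: called once per pair the sort compares."""
    global cmp_calls
    cmp_calls += 1
    return (len(b) < len(a)) - (len(a) < len(b))

# 101 strings with distinct lengths in a scrambled order.
data = ['x' * ((i * 37) % 101) for i in range(101)]

by_key = sorted(data, key=length_key)
by_cmp = sorted(data, key=cmp_to_key(length_cmp))
print(key_calls, cmp_calls)
```

The key function runs exactly once per element, while the wrapped comparison function runs on every comparison the sort performs, so wrapping an old cmp function keeps its cost rather than removing it.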

ChrisA


Re: sort order for strings of digits

2012-10-31 Thread Arnaud Delobelle
On 31 October 2012 23:09, Steven D'Aprano
 wrote:
> The trick is to take each string and split it into a leading number and a
> trailing alphanumeric string. Either part may be "empty". Here's a pure
> Python solution:
>
> from sys import maxsize  # use maxint in Python 2
> def split(s):
>     for i, c in enumerate(s):
>         if not c.isdigit():
>             break
>     else:  # aligned with the FOR, not the IF
>         return (int(s), '')
>     return (int(s[:i] or maxsize), s[i:])
>
> Now sort using this as a key function:
>
> py> L = ['9', '1000', 'abc2', '55', '1', 'abc', '55a', '1a']
> py> sorted(L, key=split)
> ['1', '1a', '9', '55', '55a', '1000', 'abc', 'abc2']

You don't actually need to split the string, it's enough to return a
pair consisting of the number of leading digits followed by the string
as the key. Here's an implementation using takewhile:

>>> from itertools import takewhile
>>> def prefix(s):
...     return sum(1 for c in takewhile(str.isdigit, s)) or 1000, s
...
>>> L = ['9', '1000', 'abc2', '55', '1', 'abc', '55a', '1a']
>>> sorted(L, key=prefix)
['1', '1a', '9', '55', '55a', '1000', 'abc', 'abc2']

Here's why it works:

>>> list(map(prefix, L))
[(1, '9'), (4, '1000'), (1000, 'abc2'), (2, '55'), (1, '1'), (1000,
'abc'), (2, '55a'), (1, '1a')]

-- 
Arnaud


Re: Negative array indicies and slice()

2012-10-31 Thread Andrew Robinson

On 10/31/2012 02:20 PM, Ian Kelly wrote:

On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson wrote:

Then; I'd note:  The non-goofy purpose of slice is to hold three
data values;  They are either numbers or None.  These *normally*
encountered values can't create a memory loop.
So, FOR AS LONG, as the object representing slice does not contain
an explicit GC pair; I move that we mandate (yes, in the current
python implementation, even as a *fix*) that its named members may
not be assigned any objects other than None or numbers

eg: Lists would be forbidden

Since functions, and subclasses, can be test evaluated by int(
the_thing_to_try ) and *[] can too,
generality need not be lost for generating nothing or numbers.


PEP 357 requires that anything implementing the __index__ special 
method be allowed for slicing sequences (and also that __index__ be 
used for the conversion).  For the most part, that includes ints and 
numpy integer types, but other code could be doing esoteric things 
with it.
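For reference, a small sketch of what PEP 357 enables. The class Offset is a made-up integer-like type, not something from this thread:

```python
import operator

class Offset(object):
    """Hypothetical integer-like type that does not subclass int."""
    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Called by sequences when the object is used as an index or slice bound.
        return self.value

letters = list('abcdefgh')
print(letters[Offset(2):Offset(5)])   # ['c', 'd', 'e']
print(letters[Offset(-1)])            # h
print(operator.index(Offset(3)))      # 3
```

The conversion is done by the sequence being sliced (here, list), which calls __index__ on each bound.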


I missed something... (but then that's why we're still talking about it...)

Reading the PEP, it notes that *only* integers (or longs) are permitted 
in slice syntax.

(Overlooking None, of course... which is strange...)

The PEP gives the only exceptions as objects with method "__index__".

Automatically, then, an empty list is forbidden (in slice syntax).
However,  What you did, was circumvent the PEP by passing an empty list 
directly to slice(), and avoiding running it through slice syntax 
processing.


So...
Is there documentation suggesting that a slice object is meant to be 
used to hold anything other than what comes from processing a valid 
slice syntax [::]? (We know it can be done, but that's a different question.)



The change would be backward-incompatible in any case, since there is 
certainly code out there that uses non-numeric slices -- one example 
has already been given in this thread.
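As an illustration of what such code can look like (this is not the example from earlier in the thread; the class Record and its field names are invented here), a container whose __getitem__ accepts field names as slice bounds:

```python
class Record(object):
    """Hypothetical container sliced by field name instead of by offset."""
    fields = ['name', 'street', 'city', 'zip']

    def __init__(self, *values):
        self.values = list(values)

    def __getitem__(self, item):
        if isinstance(item, slice):
            # The slice arrives with its bounds untouched; translate them here.
            start = None if item.start is None else self.fields.index(item.start)
            stop = None if item.stop is None else self.fields.index(item.stop)
            return self.values[start:stop:item.step]
        return self.values[self.fields.index(item)]

r = Record('Ann', '1 Main St', 'Springfield', '00000')
print(r['street':'zip'])   # ['1 Main St', 'Springfield']
print(r['city'])           # Springfield
```

In this sketch the strings reach __getitem__ inside an ordinary slice object, and the class itself decides how to turn them into offsets.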

Hmmm.

Now, I'm thinking -- The purpose of __index__(), specifically, is to notify 
when something which is not an integer may be used as an index;  You've 
helpfully noted that __index__() also *converts* those objects into numbers.


Ethan Furman mentioned that he used the names of fields, "instead of 
having to remember the _offsets_"; which means that his values _do 
convert_ to offset numbers.


His example was actually given in slice syntax notation [::].
Hence, his objects must have an __index__() method, correct?

Therefore, I still see no reason why it is permissible to assign 
non-numerical (non-None) items as an element of slice().  Or, let me 
re-word that more clearly -- I see no reason that slice named members, 
when used as originally intended, would ever need to be assigned a value 
which is not *already* converted to a number by __index__().  By 
definition, if it can't be coerced, it isn't a number.


A side note:
At 80% less overhead, and three slots -- slice is rather attractive to 
store RGB values in for a picture!  But, I don't think anyone would have 
a problem saying "No, we won't support that, even if you do do it!"


So, what's the psychology behind allowing slice() to hold objects which 
are not converted to ints/longs in the first place?




how to perform word sense disambiguation?

2012-10-31 Thread nachiket
An initial part of my project involves assigning a sense to each word in a 
sentence. I came across a tool called WordNet. Do share your views.