Re: "

2012-05-04 Thread Stefan Behnel
Ian Kelly, 04.05.2012 01:02:
> BeautifulSoup is supposed to parse like a browser would

Not at all, that would be html5lib.

Stefan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: key/value store optimized for disk storage

2012-05-04 Thread Paul Rubin
Steve Howell  writes:
> compressor = zlib.compressobj()
> s = compressor.compress("foobar")
> s += compressor.flush(zlib.Z_SYNC_FLUSH)
>
> s_start = s
> compressor2 = compressor.copy()

I think you also want to make a decompressor here, and initialize it
with s and then clone it.  Then you don't have to reinitialize every
time you want to decompress something.

I also seem to remember that the first few bytes of compressed output
are always some fixed string or checksum, that you can strip out after
compression and put back before decompression, giving further savings in
output size when you have millions of records.
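
A minimal sketch of the primed compressor/decompressor idea discussed above, assuming Python 3 (where zlib works on bytes). SALT, compress_record and decompress_record are made-up names for illustration and are not taken from salted_compressor.py:

```python
import zlib

# Hypothetical "salt": text that resembles the records to be stored.
SALT = b"representative sample text " * 20

# Prime a compressor with the salt, then sync-flush so that everything
# emitted so far (zlib header included) ends on a byte boundary.
_base_compressor = zlib.compressobj()
_prefix = _base_compressor.compress(SALT)
_prefix += _base_compressor.flush(zlib.Z_SYNC_FLUSH)

# Prime a decompressor once with the same prefix.
_base_decompressor = zlib.decompressobj()
_base_decompressor.decompress(_prefix)


def compress_record(data):
    # Clone the primed state; only the incremental bytes are returned.
    c = _base_compressor.copy()
    return c.compress(data) + c.flush(zlib.Z_SYNC_FLUSH)


def decompress_record(blob):
    # Clone the primed decompressor instead of re-feeding the salt.
    d = _base_decompressor.copy()
    return d.decompress(blob)


print(decompress_record(compress_record(b"foobar")))
```

Because each call works on a copy, the primed objects are never modified and can be reused for every record.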


Re: key/value store optimized for disk storage

2012-05-04 Thread Dan Stromberg
On Thu, May 3, 2012 at 11:03 PM, Paul Rubin  wrote:

> > Sort of as you suggest, you could build a Huffman encoding for a
> > representative run of data, save that tree off somewhere, and then use
> > it for all your future encoding/decoding.
>
> Zlib is better than Huffman in my experience, and Python's zlib module
> already has the right entry points.
>
Isn't zlib kind of dated?  Granted, it's newer than Huffman, but there's
been bzip2 and xz since then, among numerous others.

Here's something for xz:
http://stromberg.dnsalias.org/svn/xz_mod/trunk/
An xz module is in the CPython 3.3 alphas - the above module wraps it if
available, otherwise it uses ctypes or a pipe to an xz binary.

And I believe bzip2 is in the standard library for most versions of CPython.
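
A quick way to compare the candidates on your own data, sketched under the assumption of Python 3.3 or later (where lzma is in the standard library). The sample record is invented; for very short inputs the fixed container overhead of each format may dominate, so it is worth measuring rather than guessing:

```python
import bz2
import lzma
import zlib

record = b"key=42;name=example;status=ok"  # hypothetical small record

codecs = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma (xz)": (lzma.compress, lzma.decompress),
}

sizes = {}
for name, (compress, decompress) in codecs.items():
    packed = compress(record)
    # Sanity check: every codec must round-trip the record.
    assert decompress(packed) == record
    sizes[name] = len(packed)
    print(name, len(record), "->", len(packed))
```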


Re: key/value store optimized for disk storage

2012-05-04 Thread Steve Howell
On May 3, 11:59 pm, Paul Rubin  wrote:
> Steve Howell  writes:
> >     compressor = zlib.compressobj()
> >     s = compressor.compress("foobar")
> >     s += compressor.flush(zlib.Z_SYNC_FLUSH)
>
> >     s_start = s
> >     compressor2 = compressor.copy()
>
> I think you also want to make a decompressor here, and initialize it
> with s and then clone it.  Then you don't have to reinitialize every
> time you want to decompress something.

Makes sense.  I believe I got that part correct:

  https://github.com/showell/KeyValue/blob/master/salted_compressor.py

> I also seem to remember that the first few bytes of compressed output
> are always some fixed string or checksum, that you can strip out after
> compression and put back before decompression, giving further savings in
> output size when you have millions of records.

I'm pretty sure this happens for free as long as the salt is large
enough, but maybe I'm misunderstanding.


Re: key/value store optimized for disk storage

2012-05-04 Thread Paul Rubin
Steve Howell  writes:
> Makes sense.  I believe I got that part correct:
>
>   https://github.com/showell/KeyValue/blob/master/salted_compressor.py

The API looks nice, but your compress method makes no sense.  Why do you
include s.prefix in s and then strip it off?  Why do you save the prefix
and salt in the instance, and have self.salt2 and s[len(self.salt):]
in the decompress?  You should be able to just get the incremental bit.

> I'm pretty sure this happens for free as long as the salt is large
> enough, but maybe I'm misunderstanding.

No I mean there is some fixed overhead (a few bytes) in the compressor
output, to identify it as such.  That's fine when the input and output
are both large, but when there's a huge number of small compressed
strings, it adds up.


Re: key/value store optimized for disk storage

2012-05-04 Thread Steve Howell
On May 4, 1:01 am, Paul Rubin  wrote:
> Steve Howell  writes:
> > Makes sense.  I believe I got that part correct:
>
> >  https://github.com/showell/KeyValue/blob/master/salted_compressor.py
>
> The API looks nice, but your compress method makes no sense.  Why do you
> include s.prefix in s and then strip it off?  Why do you save the prefix
> and salt in the instance, and have self.salt2 and s[len(self.salt):]
> in the decompress?  You should be able to just get the incremental bit.

This is fixed now.

https://github.com/showell/KeyValue/commit/1eb316d6e9e44a37ab4f3ca73fcaf4ec0e7f22b4#salted_compressor.py


> > I'm pretty sure this happens for free as long as the salt is large
> > enough, but maybe I'm misunderstanding.
>
> No I mean there is some fixed overhead (a few bytes) in the compressor
> output, to identify it as such.  That's fine when the input and output
> are both large, but when there's a huge number of small compressed
> strings, it adds up.

If it's in the header, wouldn't it be part of the output that comes
before Z_SYNC_FLUSH?
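
One way to check this, as a rough sketch assuming Python 3 and default zlib settings (the salt text here is arbitrary): look at where the stream header ends up and what each record carries after the flush.

```python
import zlib

compressor = zlib.compressobj()
prefix = compressor.compress(b"salt text")
prefix += compressor.flush(zlib.Z_SYNC_FLUSH)

# Compress one record on a copy of the primed compressor.
record_compressor = compressor.copy()
tail = record_compressor.compress(b"foobar")
tail += record_compressor.flush(zlib.Z_SYNC_FLUSH)

print(prefix[:2].hex())  # the two-byte zlib stream header lives in the prefix
print(tail[-4:].hex())   # each sync flush ends with an empty stored block

# Prefix and tail together still form a decodable stream.
restored = zlib.decompressobj().decompress(prefix + tail)
print(restored)
```

So the header is paid once, in the prefix. What each record still carries is the sync-flush marker at its end.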





Re: Create directories and modify files with Python

2012-05-04 Thread Hans Mulder
On 1/05/12 17:34:57, [email protected] wrote:
> from __future__ import print_function   #1
>
> 
> 
> #1: Not sure whether you're using Python 2 or 3.  I ran
>  this on Python 2.7 and think it will run on Python 3 if
>  you remove this line.

You don't have to remove that line: Python3 will accept it.
It doesn't do anything in python3, since 'print' is a function
whether or not you include that line, but for backward
compatibility, you're still allowed to say it.

Incidentally, the same is true for all __future__ features.
For example, Python3 still accepts:

from __future__ import nested_scopes

, even though it's only really needed if you're using python
2.1, since from 2.2 onwards scopes have nested with or without
that command.

HTH,

-- HansM


Re: pyjamas / pyjs

2012-05-04 Thread Terry Reedy

On 5/4/2012 12:52 AM, John O'Hagan wrote:


Just read the thread on pyjamas-dev. Even without knowing anything about the
lead-up to the coup, its leader's linguistic contortions trying to justify it


And what is the name of the miscreant, so we know who to have nothing to
do with?


--
Terry Jan Reedy



Re: key/value store optimized for disk storage

2012-05-04 Thread Paul Rubin
Steve Howell  writes:
>> You should be able to just get the incremental bit.
> This is fixed now.

Nice.

> If it's in the header, wouldn't it be part of the output that comes
> before Z_SYNC_FLUSH?

Hmm, maybe you are right.  My version was several years ago and I don't
remember it well, but I half-remember spending some time diddling around
with this issue.


Re: pyjamas / pyjs

2012-05-04 Thread james hedley
On Thursday, 3 May 2012 12:52:36 UTC+1, alex23  wrote:
> Anyone else following the apparent hijack of the pyjs project from its
> lead developer?

Yes, me. The guy now in control got the owner of the domain name to turn it 
over to him, which is probably ok legally, but he had no public mandate or 
support. As far as I can see from the mailing list, only 3 or 4 out of the 650 
subscribers actively support his actions. He's a long time contributor and 
genuinely seems quite talented. However there's no getting away from the fact 
that he's done this undemocratically, when he could have forked the project. To 
my mind he hasn't made a good enough reasoned justification of his arguments 
and he's coming across as being very defensive at the moment.

The former leader, Luke Leighton, seemed to have vanished from the face of the 
earth but I mailed him yesterday and he's on holiday so trying not to pay too 
much attention to it at the moment.

There's also an allegation, which I am not making myself at this point - only 
describing its nature, that a person may have lifted data from the original 
mail server without authorisation and used it to recreate the mailing list on a 
different machine. *If* that were to be true, then the law has been broken in 
at least one country.

I'm arguing that there should be a public consultation over who gets to run 
this project and I'm also thinking of making a suggestion to the python 
software foundation or maybe other bodies such as the FSF (I'm not a FOSS 
expert but they were suggested by others) that they host a fork of this project 
so that we can have a legitimate and stable route forward.

The problem for me with all this is that I use pyjamas in a commercial capacity 
and (sorry if this sounds vague but I have to be a bit careful) there are 
probably going to be issues with our clients - corporate people distrust FOSS 
at the best of times and this kind of thing will make them run for the bloody 
hills.

In fact, there appear to be a lot of "sleeper" users who make a living out of 
this stuff and the actions of the new de-facto leader have jeopardised this, 
pretty needlessly in our opinion.

James


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Peng Yu
On Thu, May 3, 2012 at 11:16 PM, Terry Reedy  wrote:
> On 5/3/2012 8:36 PM, Peng Yu wrote:
>>
>> Hi,
>>
>> list(a_set)
>>
>> When convert two sets with the same elements to two lists, are the
>> lists always going to be the same (i.e., the elements in each list are
>> ordered the same)? Is it documented anywhere?
>
>
> "A set object is an unordered collection of distinct hashable objects".
> If you create a set from unequal objects with equal hashes, the iteration
> order may (should, will) depend on the insertion order as the first object
> added with a colliding hash will be at its 'natural' position in the hash
> table while succeeding objects will be elsewhere.
>
> Python 3.3.0a3 (default, May  1 2012, 16:46:00)
> >>> hash('a')
> -292766495615408879
> >>> hash(-292766495615408879)
> -292766495615408879
> >>> a = {'a', -292766495615408879}
> >>> b = {-292766495615408879, 'a'}
> >>> list(a)
> [-292766495615408879, 'a']
> >>> list(b)
> ['a', -292766495615408879]

Thanks. This is what I'm looking for. I think that this should be
added to the python document as a manifestation (but nonnormalized) of
what "A set object is an unordered collection of distinct hashable
objects" means.
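
Since string hashes are randomized per process in current Python 3 releases, the session above is hard to reproduce exactly. A sketch with small integers instead, assuming CPython's small-set table of 8 slots (an implementation detail, so the printed order is deliberately not promised here):

```python
# On CPython, 1 and 9 typically compete for the same slot of a small set
# table (9 & 7 == 1), so whichever is added first gets the natural position.
a = set()
a.add(1)
a.add(9)

b = set()
b.add(9)
b.add(1)

print(a == b)            # True: same elements, so the sets are equal
print(list(a), list(b))  # the two lists may or may not be in the same order
```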

-- 
Regards,
Peng


Re: pyjamas / pyjs

2012-05-04 Thread james hedley
By the way, there's a lot more to say on this, which I'll cover another time. 
There are arguments for and against what's happened; at this stage I'm just 
trying to flag up that there is *not* unanimity and we are not just carrying on 
as normal.


Re: syntax for code blocks

2012-05-04 Thread Kiuhnm

On 5/4/2012 4:44, alex23 wrote:

On May 4, 2:17 am, Kiuhnm  wrote:

On 5/3/2012 2:20, alex23 wrote:

locals() is a dict. It's not injecting anything into func's scope
other than a dict so there's not going to be any name clashes. If you
don't want any of its content in your function's scope, just don't use
that content.


The clashing is *inside* the dictionary itself. It contains *all* local
functions and variables.


This is nonsense.

locals() produces a dict of the local scope. I'm passing it into a
function. Nothing in the local scope clashes, so the locals() dict has
no "internal clashing". Nothing is injecting it into the function's
local scope, so _there is no "internal clashing"_.

To revise, your original "pythonic" example was, effectively:

 def a(): pass
 def b(): pass

 func_packet = {'a': a, 'b': b}
 func(arg, func_packet)

My version was:

 def a(): pass
 def b(): pass

 func_packet = locals()
 func(arg, func_packet)

Now, please explain how that produces name-clashes that your version
does not.


It doesn't always produce name-clashes but it may do so.
Suppose that func takes some functions named fn1, fn2 and fn3. If you 
only define fn2 but you forget that you already defined somewhere before 
fn1, you inadvertently pass to func both fn1 and fn2.

Even worse, if you write
  def a(): pass
  def b(): pass
  func(arg, locals())
and then you want to call func again with c() alone, you must write this:
  def c(): pass
  a = b = None
  func(arg, locals())
Moreover, think what happens if you add a function whose name is equal 
to that of a function accepted by func.

That's what I call name-clashing.
My solution avoids all these problems, promotes encapsulation and lets you 
program in a more functional way, which is sometimes more concise than the 
OOP way.
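
To make the point under dispute concrete, here is a small sketch (func, fn1, fn2 and helper are placeholder names, not taken from the codeblocks module) showing what each calling style actually hands to func:

```python
def func(arg, callbacks):
    # Report which callables the caller actually passed in.
    return sorted(name for name, value in callbacks.items()
                  if callable(value))


def caller_with_locals():
    def fn1(): pass
    def fn2(): pass
    helper = len  # unrelated local that happens to be callable
    return func(None, locals())


def caller_explicit():
    def fn1(): pass
    def fn2(): pass
    helper = len
    return func(None, {"fn2": fn2})


print(caller_with_locals())  # every local name is passed along
print(caller_explicit())     # only what was listed explicitly
```

Whether the extra names matter depends on whether func looks only at the keys it knows about or reacts to every key it receives.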



That's not the same thing. If a function accepts some optional
callbacks, and you call that function more than once, you will have
problems. You'll need to redefine some callbacks and remove others.
That's total lack of encapsulation.


Hand-wavy, no real example, doesn't make sense.


Really? Then I don't know what would make sense to you.


You haven't presented *any* good code or use cases.


Says who? You and some others? Not enough.


So far, pretty much everyone who has tried to engage you on this
subject on the list. I'm sorry we're not all ZOMGRUBYBLOCKS111
like the commenters on your project page.


It's impossible to have a constructive discussion while you and others 
feel that way. You're so biased that you don't even see how biased you are.



The meaning is clear from the context.


Which is why pretty much every post in this thread mentioned finding
it confusing?


I would've come up with something even better if only Python wasn't so rigid.


The inability for people to add 6 billion mini-DSLs to solve any
stupid problem _is a good thing_. It makes Python consistent and
predictable, and means I don't need to parse _the same syntax_ utterly
different ways depending on the context.


If I and my group of programmers devised a good and concise syntax and 
semantics to describe some applicative domain, then we would want to 
translate that into the language we use.

Unfortunately, Python doesn't let you do that.
I also think that uniformity is the death of creativity. What's worse, 
uniformity in language is also uniformity in thinking.
As I said in some other posts, I think that Python is a good language, 
but as soon as you need to do something a little different or just 
differently, it's a pain to work with.



Because that would reveal part of the implementation.
Suppose you have a complex visitor. The OOP way is to subclass, while
the FP way is to accept callbacks. Why the FP way? Because it's more
concise.
In any case, you don't want to reveal how the visitor walks the data
structure or, better, the user doesn't need to know about it.


Again, nothing concrete, just vague intimations of your way being
better.


Sigh.


So define&use a different scope! Thankfully module level isn't the
only one to play with.


We can do OOP even in ASM, you know?


???


You can do whatever you want by hand: you can certainly define your 
functions inside another function or a class, but that's just more noise 
added to the mix.



I'm sorry but it is still clear-as-mud what you're trying to show
here. Can you show _one_ practical, real-world, non-toy example that
solves a real problem in a way that Python cannot?


I just did. It's just that you can't see it.


"I don't understand this example, can you provide one." "I just did,
you didn't understand it."


Your rephrasing is quite wrong. You asked for a practical example and I 
said that I already showed you one. It's just that you can't see it (as 
practical).



Okay, done with this now.  Your tautologies and arrogance are not
clarifying your position at all, and I really don't give a damn, so
*plonk*


I don't care if you don't read this post. 

Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Chris Angelico
On Fri, May 4, 2012 at 8:14 PM, Peng Yu  wrote:
> Thanks. This is what I'm looking for. I think that this should be
> added to the python document as a manifestation (but nonnormalized) of
> what "A set object is an unordered collection of distinct hashable
> objects" means.

There are other things that can prove it to be unordered, too; the
exact pattern and order of additions and deletions can affect the
iteration order. The only thing you can be sure of is that you can't
be sure of it.

ChrisA


Re: syntax for code blocks

2012-05-04 Thread Chris Angelico
On Fri, May 4, 2012 at 9:12 PM, Kiuhnm
 wrote:
> If I and my group of programmers devised a good and concise syntax and
> semantics to describe some applicative domain, then we would want to
> translate that into the language we use.
> Unfortunately, Python doesn't let you do that.

No, this is not unfortunate. Python does certain things and does them
competently. If Python doesn't let you write what you want the way you
want, then you do not want Python.

This is not an insult to Python, nor is it a cop-out whereby the
Python Cabal tells you to shut up and go away, you aren't doing things
the Proper Way, you need to change your thinking to be more in line
with Correct Syntax. It is simply a reflection of the nature of
languages.

If I want to write a massively-parallel program that can be divided
across any number of computers around the world, Python isn't the best
thing to use.

If I want to write a MUD with efficient reloading of code on command,
Python isn't the best thing to use.

If I want to write a device driver, Python isn't the best thing to use.

If I want to write a simple script that does exactly what it should
and didn't take me long to write, then Python quite likely IS the best
thing to use.

But whatever you do, play to the strengths of the language you use,
don't play to its weaknesses. Don't complain when C leaks the memory
that you forgot to free(), don't bemoan LISP's extreme parenthesizing,
don't fight the Python object model. You'll only hurt yourself.

In any case, you know where to find Ruby any time you want it.

ChrisA


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Peng Yu
On Fri, May 4, 2012 at 6:21 AM, Chris Angelico  wrote:
> On Fri, May 4, 2012 at 8:14 PM, Peng Yu  wrote:
>> Thanks. This is what I'm looking for. I think that this should be
>> added to the python document as a manifestation (but nonnormalized) of
>> what "A set object is an unordered collection of distinct hashable
>> objects" means.
>
> There are other things that can prove it to be unordered, too; the
> exact pattern and order of additions and deletions can affect the
> iteration order. The only thing you can be sure of is that you can't
> be sure of it.

I agree. My point was just to suggest adding more explanations on the
details in the manual.

-- 
Regards,
Peng


set PYTHONPATH for a directory?

2012-05-04 Thread Neal Becker
I'm testing some software I'm building against an alternative version of a 
library.  So I have an alternative library in directory L.  Then I have in an 
unrelated directory, the test software, which I need to use the library version 
from directory L.

One approach is to set PYTHONPATH whenever I run this test software.  Any 
suggestion on a more foolproof approach?



Re: numpy (matrix solver) - python vs. matlab

2012-05-04 Thread someone

On 05/04/2012 05:52 AM, Steven D'Aprano wrote:

On Thu, 03 May 2012 19:30:35 +0200, someone wrote:



So how do you explain that the natural frequencies from FEM (with
condition number ~1e6) generally correlates really good with real
measurements (within approx. 5%), at least for the first 3-4 natural
frequencies?


I would counter your hand-waving ("correlates really good", "within
approx 5%" of *what*?) with hand-waving of my own:


Within 5% of experiments of course.
There is not much else to compare with.


"Sure, that's exactly what I would expect!"

*wink*

By the way, if I didn't say so earlier, I'll say so now: the
interpretation of "how bad the condition number is" will depend on the
underlying physics and/or mathematics of the situation. The
interpretation of loss of digits of precision is a general rule of thumb
that holds in many diverse situations, not a rule of physics that cannot
be broken in this universe.

If you have found a scenario where another interpretation of condition
number applies, good for you. That doesn't change the fact that, under
normal circumstances when trying to solve systems of linear equations, a
condition number of 1e6 is likely to blow away *all* the accuracy in your
measured data. (Very few physical measurements are accurate to more than
six digits.)


Not true, IMHO.

Eigenfrequencies (I think that is a very typical physical measurement 
and I cannot think of something that is more typical) don't need to be 
accurate with 6 digits. I'm happy with below 5% error. So if an 
eigenfrequency is measured to 100 Hz, I'm happy if the numerical model 
gives a result in the 5%-range of 95-105 Hz. This I got with a condition 
number of approx. 1e6 and it's good enough for me. I don't think anyone 
expects 6-digit accuracy with eigenfrequencies.
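
For reference, the rule of thumb being debated can be illustrated with a tiny linear system in plain Python. This is a worst-case sketch with a made-up 2x2 matrix whose condition number is roughly 4e6; it says nothing about how an eigenfrequency computed from a particular FEM model behaves:

```python
def solve_2x2(a, b, c, d, r1, r2):
    """Solve [[a, b], [c, d]] * [x, y] = [r1, r2] by Cramer's rule."""
    det = a * d - b * c
    return ((r1 * d - b * r2) / det, (a * r2 - c * r1) / det)


eps = 1e-6

# Nearly singular matrix [[1, 1], [1, 1 + eps]].
exact = solve_2x2(1.0, 1.0, 1.0, 1.0 + eps, 2.0, 2.0 + eps)

# Same matrix, right-hand side changed by eps in one component
# (a relative change of about 5e-7).
nudged = solve_2x2(1.0, 1.0, 1.0, 1.0 + eps, 2.0, 2.0 + 2 * eps)

print(exact)   # close to (1, 1)
print(nudged)  # close to (0, 2)
```

A change in the sixth or seventh digit of the input moves the solution by 100% here, which is the kind of amplification the condition number bounds. Whether a given problem actually hits that worst case is a separate question.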






Re: numpy (matrix solver) - python vs. matlab

2012-05-04 Thread someone

On 05/04/2012 06:15 AM, Russ P. wrote:

On May 3, 4:59 pm, someone  wrote:

On 05/04/2012 12:58 AM, Russ P. wrote:
Ok, but I just don't understand what's in the "empirical" category, sorry...


I didn't look it up, but as far as I know, empirical just means based
on experiment, which means based on measured data. Unless I am


FEM based on measurement data? Still, I don't understand it, sorry.


mistaken, a finite element analysis is not based on measured data.


I'm probably a bit narrow-thinking because I just worked with this small 
FEM-program (in Matlab), but can you please give an example of a 
matrix-problem that is based on measurement data?



Yes, the results can be *compared* with measured data and perhaps
calibrated with measured data, but those are not the same thing.


Exactly. That's why I don't understand what solving a matrix system 
using measurement/empirical data, could typically be an example of...?



I agree with Steven D's comment above, and I will reiterate that a
condition number of 1e6 would not inspire confidence in me. If I had a
condition number like that, I would look for a better model. But
that's just a gut reaction, not a hard scientific rule.


I don't have any better model and don't know anything better. I still 
think that 5% accuracy is good enough and that nobody needs 6-digit 
precision for practical/engineering/empirical work... Maybe quantum 
physicists need more than 6 digits of accuracy, but most 
practical/engineering problems are ok with an accuracy of 5%, I think, 
IMHO... Please tell me if I'm wrong.





Re: pyjamas / pyjs

2012-05-04 Thread Duncan Booth
james hedley  wrote:

> There's also an allegation, which I am not making myself at this point
> - only describing its nature, that a person may have lifted data from
> the original mail server without authorisation and used it to recreate
> the mailing list on a different machine. *If* that were to be true,
> then the law has been broken in at least one country. 
> 
I don't know whether they moved it to another machine or not, but what they 
definitely did do was start sending emails to all the people on the list 
who had sending of emails disabled (including myself) which resulted in a 
flood of emails and from the sound of it a lot of annoyed people. If he 
wanted community support for the takeover that probably wasn't a good 
start.

In case it isn't obvious why I might be subscribed but emails turned off, I 
read mailing lists like that through gmane in which case I still need to 
sign up to the list to post but definitely don't want to receive emails.

-- 
Duncan Booth http://kupuguy.blogspot.com


Re: set PYTHONPATH for a directory?

2012-05-04 Thread Dave Angel
On 05/04/2012 08:21 AM, Neal Becker wrote:
> I'm testing some software I'm building against an alternative version of a 
> library.  So I have an alternative library in directory L.  Then I have in an 
> unrelated directory, the test software, which I need to use the library
> version from directory L.
>
> One approach is to set PYTHONPATH whenever I run this test software.  Any 
> suggestion on a more foolproof approach?
>
Simply modify  sys.path  at the beginning of your test software.  That's
where import searches.
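
A sketch of what that could look like, assuming Python 3. A temporary directory stands in for directory L and altlib_demo is an invented module name, so that the example runs on its own; in real test software you would insert the actual path to L instead:

```python
import importlib
import os
import sys
import tempfile

# Stand-in for directory L, containing the alternative library.
lib_dir = tempfile.mkdtemp()
with open(os.path.join(lib_dir, "altlib_demo.py"), "w") as f:
    f.write("VERSION = 'alternative'\n")

# Put L at the front of the search path so it wins over any other copy.
sys.path.insert(0, lib_dir)
importlib.invalidate_caches()

import altlib_demo

print(altlib_demo.VERSION)
print(altlib_demo.__file__)  # confirms which copy was picked up
```

Printing __file__ once at startup is a cheap guard against silently testing the wrong version.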



-- 

DaveA



Re: set PYTHONPATH for a directory?

2012-05-04 Thread Pedro Larroy
Isn't virtualenv for this kind of scenario?


Pedro.

On Fri, May 4, 2012 at 3:46 PM, Dave Angel  wrote:
> On 05/04/2012 08:21 AM, Neal Becker wrote:
>> I'm testing some software I'm building against an alternative version of a
>> library.  So I have an alternative library in directory L.  Then I have in an
>> unrelated directory, the test software, which I need to use the library
>> version from directory L.
>>
>> One approach is to set PYTHONPATH whenever I run this test software.  Any
>> suggestion on a more foolproof approach?
>>
> Simply modify  sys.path  at the beginning of your test software.  That's
> where import searches.
>
>
>
> --
>
> DaveA
>
> --
> http://mail.python.org/mailman/listinfo/python-list


Re: "

2012-05-04 Thread Ian Kelly
On Fri, May 4, 2012 at 12:57 AM, Stefan Behnel  wrote:
> Ian Kelly, 04.05.2012 01:02:
>> BeautifulSoup is supposed to parse like a browser would
>
> Not at all, that would be html5lib.

Well, I guess that depends on whether we're talking about
BeautifulSoup 3 (a regex-based screen scraper with methods for
navigating and searching the resulting tree) or 4 (purely a parse tree
navigation library that relies on external libraries to do the actual
parsing).

According to the BS3 documentation, "The BeautifulSoup class is full
of web-browser-like heuristics for divining the intent of HTML
authors."

If we're talking about BS4, though, then the problem in this instance
would have nothing to do with BS4 and instead would be an issue of
whatever underlying parser the OP is using.


Re: syntax for code blocks

2012-05-04 Thread Michael Torrie
On 05/04/2012 05:12 AM, Kiuhnm wrote:
>> Hand-wavy, no real example, doesn't make sense.
> 
> Really? Then I don't know what would make sense to you.

Speaking as an observer here, I've read your blog post, and looked at
your examples.  They don't make sense to me either.  They aren't real
examples.  They are abstract examples.  They do not answer the
questions, "what actual, real world python problems does this solve?"
and "how is this better than a plain python solution?"  For example,
I've seen ruby code where blocks are used in a real-world way.  Could
you not put in something similar in your examples?  Since you've written
this code you must use it in everyday python coding.  Show us what
you've been doing with it.

Also while some of your blog snippets are snippets, other code examples
you provide purport to be complete examples, when in fact they are not.
 For example, about 45% of the way down your blog page you have a block
of code that looks to be self-contained.  It has "import logging" and
"import random" at the top of it.  Yet it cannot run as it's missing an
import of your module.


>>>> You haven't presented *any* good code or use cases.
>>>
>>> Says who? You and some others? Not enough.

How many people do you need to tell you this before it's good enough?
Doesn't matter how genius your code is if no one knows when or how to
use it.

> It's impossible to have a constructive discussion while you and others 
> feel that way. You're so biased that you don't even see how biased you are.

Having followed the conversation somewhat, I can say that you have been
given a fair hearing.  People aren't just dissing on it because it's
ruby.  You are failing to listen to them just as much as you claim they
are failing to listen to you.

>>> The meaning is clear from the context.

Not really.  For one, we're not Ruby programmers here, and as has been
said, where is a real example of real code that's not just some abstract
"hello this is block1, this is block 2" sort of thing? Providing
non-block code to compare is important too.


> Unfortunately, communication is a two-people thing. It's been clear from 
> the first post that your intention wasn't to understand what I'm proposing.
> There are some things, like what I say about name-clashing, that you 
> should understand no matter how biased you are.
> If you don't, you're just pretending or maybe you weren't listening at all.

well there's my attempt.


Re: key/value store optimized for disk storage

2012-05-04 Thread Steve Howell
On May 3, 6:10 pm, Miki Tebeka  wrote:
> > I'm looking for a fairly lightweight key/value store that works for
> > this type of problem:
>
> I'd start with a benchmark and try some of the things that are already in the 
> standard library:
> - bsddb
> - sqlite3 (table of key, value, index key)
> - shelve (though I doubt this one)
>

Thanks.  I think I'm ruling out bsddb, since it's recently deprecated:

http://www.gossamer-threads.com/lists/python/python/106494

I'll give sqlite3 a spin.  Has anybody out there wrapped sqlite3
behind a hash interface already?  I know it's simple to do
conceptually, but there are some minor details to work out for large
amounts of data (like creating the index after all the inserts), so if
somebody's already tackled this, it would be useful to see their
code.
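
For what it's worth, a rough sketch of such a wrapper, assuming Python 3 and a bulk-load-then-read workload. SqliteStore, the kv table and the kv_key index are names invented for this example:

```python
import sqlite3


class SqliteStore:
    """Read-mostly key/value store backed by a single sqlite3 table."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT, value BLOB)")

    def bulk_load(self, items):
        # Insert everything first, then build the index once at the end.
        with self.conn:  # commits on success, rolls back on error
            self.conn.executemany(
                "INSERT INTO kv (key, value) VALUES (?, ?)", items)
            self.conn.execute(
                "CREATE UNIQUE INDEX IF NOT EXISTS kv_key ON kv (key)")

    def __getitem__(self, key):
        row = self.conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def __contains__(self, key):
        try:
            self[key]
        except KeyError:
            return False
        return True


store = SqliteStore()
store.bulk_load([("alpha", b"1"), ("beta", b"2")])
print(store["alpha"])
```

Creating the index after the inserts follows the detail mentioned above. Whether it is actually faster than declaring a primary key up front is something to benchmark on the real data.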

> You might find that for a little effort you get enough out of one of these.
>
> Another module which is not in the standard library is hdf5/PyTables and in 
> my experience very fast.

Thanks.


Re: syntax for code blocks

2012-05-04 Thread Temia Eszteri
You know what I find rich about all of this?

>>>[ ... ]> I'd like to change the syntax of my module 'codeblocks' to make it more
>>>[ ... ]> pythonic.

Kiuhnm posted a thread to the group asking us to help him make it more
Pythonic, but he has steadfastly refused every single piece of help he
was offered because he feels his code is good enough after all.

So why are we perpetuating it?

~Temia
--
When on earth, do as the earthlings do.


Re: key/value store optimized for disk storage

2012-05-04 Thread Tim Chase
On 05/04/12 10:27, Steve Howell wrote:
> On May 3, 6:10 pm, Miki Tebeka  wrote:
>>> I'm looking for a fairly lightweight key/value store that works for
>>> this type of problem:
>>
>> I'd start with a benchmark and try some of the things that are already in 
>> the standard library:
>> - bsddb
>> - sqlite3 (table of key, value, index key)
> 
> Thanks.  I think I'm ruling out bsddb, since it's recently deprecated:

Have you tested the standard library's anydbm module (certainly not
deprecated)?  In a test I threw together, after populating one gig
worth of data, lookups were pretty snappy (compared to the lengthy
time it took to populate the 1gb of junk data).
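
For readers on Python 3, where anydbm became the dbm package, a minimal sketch of the same kind of usage (the file name and keys are made up, and which backend gets picked depends on what is installed):

```python
import dbm
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mydata")

# 'c' opens the database for read/write, creating it if necessary.
with dbm.open(path, "c") as db:
    db[b"key-1"] = b"value-1"
    db[b"key-2"] = b"value-2"

# Reopen read-only and look a key up; keys and values are bytes.
with dbm.open(path, "r") as db:
    value = db[b"key-1"]
    print(value)

# Shows which backend was chosen on this machine.
print(dbm.whichdb(path))
```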

-tkc




Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Terry Reedy

On 5/4/2012 8:00 AM, Peng Yu wrote:
> On Fri, May 4, 2012 at 6:21 AM, Chris Angelico  wrote:
>> On Fri, May 4, 2012 at 8:14 PM, Peng Yu  wrote:
>>> Thanks. This is what I'm looking for. I think that this should be
>>> added to the python document as a manifestation (but nonnormalized) of
>>> what "A set object is an unordered collection of distinct hashable
>>> objects" means.
>>
>> There are other things that can prove it to be unordered, too; the
>> exact pattern and order of additions and deletions can affect the
>> iteration order. The only thing you can be sure of is that you can't
>> be sure of it.
>
> I agree. My point was just to suggest adding more explanations on the
> details in the manual.

I am not sure how much clearer we can be in the language manual. The 
word 'unordered' means just that. If one imposes an arbitrary linear 
order on an unordered collection, it is arbitrary. It is frustrating 
that people do not want to believe that, and even write tests depending 
on today's arbitrary serialization order being deterministic 
indefinitely. There is a section about this in the doctest doc, but 
people do it anyway. I will think about a sentence to add.


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: key/value store optimized for disk storage

2012-05-04 Thread Tim Chase
On 05/04/12 12:22, Steve Howell wrote:
> Which variant do you recommend?
> 
> """ anydbm is a generic interface to variants of the DBM database
> — dbhash (requires bsddb), gdbm, or dbm. If none of these modules
> is installed, the slow-but-simple implementation in module
> dumbdbm will be used.
> 
> """

If you use the stock anydbm module, it automatically chooses the
best it knows from the ones available:

  import os
  import hashlib
  import random
  from string import letters

  import anydbm

  KB = 1024
  MB = KB * KB
  GB = MB * KB
  DESIRED_SIZE = 1 * GB
  KEYS_TO_SAMPLE = 20
  FNAME = "mydata.db"

  i = 0
  md5 = hashlib.md5()
  db = anydbm.open(FNAME, 'c')
  try:
    print("Generating junk data...")
    while os.path.getsize(FNAME) < 6*GB:
      key = md5.update(str(i))[:16]
      size = random.randrange(1*KB, 4*KB)
      value = ''.join(random.choice(letters)
        for _ in range(size))
      db[key] = value
      i += 1
    print("Gathering %i sample keys" % KEYS_TO_SAMPLE)
    keys_of_interest = random.sample(db.keys(), KEYS_TO_SAMPLE)
  finally:
    db.close()

  print("Reopening for a cold sample set in case it matters")
  db = anydbm.open(FNAME)
  try:
    print("Performing %i lookups" % len(keys_of_interest))
    for key in keys_of_interest:
      v = db[key]
    print("Done")
  finally:
    db.close()


(your specs said ~6gb of data, keys up to 16 characters, values of
1k-4k, so this should generate such data)

-tkc
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: most efficient way of populating a combobox (in maya)

2012-05-04 Thread Prasad, Ramit
> > I'm making a GUI in maya using python only and I'm trying to see which
> > is more efficient. I'm trying to populate an optionMenuGrp / combo box
> > whose contents come from os.listdir(folder). Now this is fine if the
> > folder isn't that full but the folder has a few hundred items (almost in
> > the thousands), it is also on the (work) network and people are
> > constantly reading from it as well. Now I'm trying to write the GUI so
> > that it makes the interface, and using threading - Thread, populate the
> > box. Is this a good idea? Has anyone done this before and have
> > experience with any limitations on it? Is the performance not
> > significant?
> > Thanks for any advice
> 
> 
> Why don't you try it and see?
> 
> 
> It's not like populating a combobox in Tkinter with the contents of
> os.listdir requires a large amount of effort. Just try it and see whether
> it performs well enough.

In my experience, a generic combobox with hundreds or thousands of elements is 
difficult and annoying to use. Not sure if the Tkinter version has scroll bars 
or auto-completion, but if not you may want to subclass and add those features.

Ramit


Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
712 Main Street | Houston, TX 77002
work phone: 713 - 216 - 5423

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: key/value store optimized for disk storage

2012-05-04 Thread Emile van Sebille

On 5/4/2012 10:46 AM Tim Chase said...

I hit a few snags testing this on my winxp w/python2.6.1 in that getsize 
wasn't finding the file as it was created in two parts with .dat and 
.dir extension.


Also, setting key failed as update returns None.

The changes I needed to make are marked below.

Emile


>   import os
>   import hashlib
>   import random
>   from string import letters
>
>   import anydbm
>
>   KB = 1024
>   MB = KB * KB
>   GB = MB * KB
>   DESIRED_SIZE = 1 * GB
>   KEYS_TO_SAMPLE = 20
>   FNAME = "mydata.db"

  FDATNAME = r"mydata.db.dat"

>   i = 0
>   md5 = hashlib.md5()
>   db = anydbm.open(FNAME, 'c')
>   try:
>     print("Generating junk data...")
>     while os.path.getsize(FNAME) < 6*GB:

    while os.path.getsize(FDATNAME) < 6*GB:

>       key = md5.update(str(i))[:16]

      md5.update(str(i))
      key = md5.hexdigest()[:16]

>       size = random.randrange(1*KB, 4*KB)
>       value = ''.join(random.choice(letters)
>         for _ in range(size))
>       db[key] = value
>       i += 1
>     print("Gathering %i sample keys" % KEYS_TO_SAMPLE)
>     keys_of_interest = random.sample(db.keys(), KEYS_TO_SAMPLE)
>   finally:
>     db.close()
>
>   print("Reopening for a cold sample set in case it matters")
>   db = anydbm.open(FNAME)
>   try:
>     print("Performing %i lookups" % len(keys_of_interest))
>     for key in keys_of_interest:
>       v = db[key]
>     print("Done")
>   finally:
>     db.close()




--
http://mail.python.org/mailman/listinfo/python-list


Re: key/value store optimized for disk storage

2012-05-04 Thread Tim Chase
On 05/04/12 14:14, Emile van Sebille wrote:
> On 5/4/2012 10:46 AM Tim Chase said...
> 
> I hit a few snags testing this on my winxp w/python2.6.1 in that getsize 
> wasn't finding the file as it was created in two parts with .dat and 
> .dir extension.

Hrm...must be a Win32 vs Linux thing.

> Also, setting key failed as update returns None.

Doh, that's what I get for not testing my hand-recreation of the
test program I cobbled together and then deleted.  Thanks for
tweaking that.
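
For anyone following along, here is a minimal sketch of the update() point, written in Python 3 syntax rather than the 2.x used in the thread. The helper name make_key is hypothetical and only for illustration. It builds a fresh hash object per key, which is one possible way to derive the key, not necessarily what the test program intended.

```python
import hashlib


def make_key(i):
    """Return a 16-character hex key derived from the integer i."""
    # A fresh hash object per key; hexdigest() is what yields the string.
    digest = hashlib.md5(str(i).encode("ascii"))
    return digest.hexdigest()[:16]


md5 = hashlib.md5()
result = md5.update(b"0")  # update() mutates the object and returns None

keys = [make_key(i) for i in range(3)]
```

Slicing the return value of update() fails because there is nothing to slice; the digest has to be requested separately.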

-tkc



-- 
http://mail.python.org/mailman/listinfo/python-list


pickle question: sequencing of operations

2012-05-04 Thread Russell E. Owen
What is the sequence of calls when unpickling a class with __setstate__?

From experimentation I see that __setstate__ is called and __init__ is 
not, but I think I need more info.

I'm trying to pickle an instance of a class that is a subclass of 
another class that contains unpickleable objects.

What I'd like to do is basically just pickle the constructor parameters 
and then use those to reconstruct the object on unpickle, but I'm not 
sure how to go about this. I'd appreciate an example if anyone has one.
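
To make the goal concrete, here is a rough sketch of the kind of thing being asked for, assuming the __reduce__ hook is an acceptable route. The classes Base and Device and their attributes are made up for illustration, and the lock stands in for whatever unpickleable object the real base class holds.

```python
import pickle
import threading


class Base(object):
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()  # lock objects cannot be pickled


class Device(Base):
    def __init__(self, name, port):
        Base.__init__(self, name)
        self.port = port

    def __reduce__(self):
        # Ask pickle to rebuild the object by calling Device(name, port),
        # so __init__ runs again on unpickle and creates a fresh lock.
        return (self.__class__, (self.name, self.port))


original = Device("probe", 8080)
clone = pickle.loads(pickle.dumps(original))
```

With this approach only the constructor arguments travel through the pickle, and __setstate__ is not needed at all.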

-- Russell

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Peng Yu
On Fri, May 4, 2012 at 12:43 PM, Terry Reedy  wrote:
> On 5/4/2012 8:00 AM, Peng Yu wrote:
>>
>> On Fri, May 4, 2012 at 6:21 AM, Chris Angelico  wrote:
>>>
>>> On Fri, May 4, 2012 at 8:14 PM, Peng Yu  wrote:

>>>> Thanks. This is what I'm looking for. I think that this should be
>>>> added to the python document as a manifestation (but nonnormalized) of
>>>> what "A set object is an unordered collection of distinct hashable
>>>> objects" means.
>>>
>>>
>>> There are other things that can prove it to be unordered, too; the
>>> exact pattern and order of additions and deletions can affect the
>>> iteration order. The only thing you can be sure of is that you can't
>>> be sure of it.
>>
>>
>> I agree. My point was just to suggest adding more explanations on the
>> details in the manual.
>
>
> I am not sure how much clearer we can be in the language manual. The word
> 'unordered' means just that. If one imposes an arbitrary linear order on an
> unordered collection, it is arbitrary. It is frustrating that people do not
> want to believe that, and even write tests depending on today's arbitrary
> serialization order being deterministic indefinitely. There is a section
> about this in the doctest doc, but people do it anyway. I will think about a
> sentence to add.

You can just add the example that you posted to demonstrate what the
unordered means. A curious user might want to know under what
condition the "unorderness" can affect the results, because for
trivial examples (like the following), it does seem that there is some
orderness in a set.

set(['a', 'b', 'c'])
set(['c', 'b', 'a'])

-- 
Regards,
Peng
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: key/value store optimized for disk storage

2012-05-04 Thread Emile van Sebille

On 5/4/2012 12:49 PM Tim Chase said...

> On 05/04/12 14:14, Emile van Sebille wrote:
>> On 5/4/2012 10:46 AM Tim Chase said...
>>
>> I hit a few snags testing this on my winxp w/python2.6.1 in that getsize
>> wasn't finding the file as it was created in two parts with .dat and
>> .dir extension.
>
> Hrm...must be a Win32 vs Linux thing.


Or an anydbm thing -- you may get different results depending...
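
A small sketch of how one might check this, using the Python 3 module names (dbm, dbm.dumb and dbm.whichdb took over from anydbm, dumbdbm and whichdb). The backend is forced to the dumb one here purely for illustration; which backend the generic open picks on a given machine is exactly the part that varies.

```python
import dbm
import dbm.dumb
import os
import shutil
import tempfile

tmpdir = tempfile.mkdtemp()
base = os.path.join(tmpdir, "mydata.db")

# Force the pure-Python fallback backend for this demonstration.
db = dbm.dumb.open(base, "c")
db[b"key"] = b"value"
db.close()

files = sorted(os.listdir(tmpdir))  # the files actually created on disk
base_exists = os.path.exists(base)  # is there a file named exactly "mydata.db"?
backend = dbm.whichdb(base)         # which dbm flavour wrote these files

shutil.rmtree(tmpdir)
```

If the backend stores its data under names other than the one passed to open, a getsize call on the bare name has nothing to find.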

Emile

--
http://mail.python.org/mailman/listinfo/python-list


for loop: weird behavior

2012-05-04 Thread ferreirafm
Hi there,
I simply can't print anything in the second for-loop below:

#!/usr/bin/env python

import sys

filename = sys.argv[1]
outname = filename.split('.')[0] + '_pdr.dat'
begin = 'Distance distribution'
end = 'Reciprocal'
first = 0
last = 0
with open(filename) as inf:
    for num, line in enumerate(inf, 1):
        #print num, line
        if begin in line:
            first = num
        if end in line:
            last = num
    for num, line in enumerate(inf, 1):
        print 'Ok!'
        print num, line
        if num in range(first + 5, last - 1):
            print line
print first, last
print range(first + 5, last - 1)

The output goes here:
http://pastebin.com/egnahct2

Expected: at least the string 'Ok!' from the second for-loop.

What am I doing wrong?
thanks in advance.
Fred







-- 
http://mail.python.org/mailman/listinfo/python-list


Re: for loop: weird behavior

2012-05-04 Thread Terry Reedy

On 5/4/2012 4:33 PM, ferreirafm wrote:

> Hi there,
> I simply can't print anything in the second for-loop below:
>
> #!/usr/bin/env python
>
> import sys
>
> filename = sys.argv[1]
> outname = filename.split('.')[0] + '_pdr.dat'
> begin = 'Distance distribution'
> end = 'Reciprocal'
> first = 0
> last = 0
> with open(filename) as inf:
>     for num, line in enumerate(inf, 1):
>         #print num, line
>         if begin in line:
>             first = num
>         if end in line:
>             last = num

The file pointer is now at the end of the file. As an iterator, the file 
is exhausted. To reiterate, return the file pointer to the beginning 
with inf.seek(0).

>     for num, line in enumerate(inf, 1):
>         print 'Ok!'
>         print num, line
>         if num in range(first + 5, last - 1):
>             print line
> print first, last
> print range(first + 5, last - 1)
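
A minimal sketch of the effect, in Python 3 syntax and using a small throwaway file created just for the demonstration (its contents are invented):

```python
import os
import tempfile

# Hypothetical three-line input file, created only for this demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta\ngamma\n")
    path = tmp.name

with open(path) as inf:
    first_pass = [line.rstrip("\n") for line in inf]   # consumes the iterator
    exhausted = [line.rstrip("\n") for line in inf]    # nothing left to read
    inf.seek(0)                                        # rewind to the start
    second_pass = [line.rstrip("\n") for line in inf]  # reads every line again

os.remove(path)
```

The second comprehension plays the role of the second for-loop in the original script: without the seek, its body never runs.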


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: recruiter spam

2012-05-04 Thread Chris Withers

Please don't spam the list with job adverts, post to the job board instead:

http://www.python.org/community/jobs/howto/

cheers,

Chris

On 03/05/2012 22:13, Preeti Bhattad wrote:

> Hi there,
> If you have USA work visa and if you reside in USA;



--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk
--
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Cameron Simpson
On 04May2012 15:08, Peng Yu  wrote:
| On Fri, May 4, 2012 at 12:43 PM, Terry Reedy  wrote:
| > On 5/4/2012 8:00 AM, Peng Yu wrote:
| >> On Fri, May 4, 2012 at 6:21 AM, Chris Angelico  wrote:
| >>> On Fri, May 4, 2012 at 8:14 PM, Peng Yu  wrote:
| >>>> Thanks. This is what I'm looking for. I think that this should be
| >>>> added to the python document as a manifestation (but nonnormalized) of
| >>>> what "A set object is an unordered collection of distinct hashable
| >>>> objects" means.
| >>>
| >>> There are other things that can prove it to be unordered, too; the
| >>> exact pattern and order of additions and deletions can affect the
| >>> iteration order. The only thing you can be sure of is that you can't
| >>> be sure of it.
| >>
| >> I agree. My point was just to suggest adding more explanations on the
| >> details in the manual.
| >
| > I am not sure how much clearer we can be in the language manual. The word
| > 'unordered' means just that. [...]
| 
| You can just add the example that you posted to demonstrate what the
| unordered means. A curious user might want to know under what
| condition the "unorderness" can affect the results, because for
| trivial examples (like the following), it does seem that there is some
| orderness in a set.

I'm with Terry here: anything else in the line you suggest would
complicate things for the reader, and potentially mislead.

Future implementation changes (and, indeed, _other_ implementations like
Jython) can change any of this. So there _are_ no ``condition the
"unorderness" can affect the results'': a set is unordered, and you
could even _legitimately_ get different orders from the same set
if you iterate over it twice. It is unlikely, but permissible.

Any attempt to describe such conditions beyond "it might happen at any
time" would be misleading.

| set(['a', 'b', 'c'])
| set(['c', 'b', 'a'])

The language does not say these will get the same iteration order. It
happens that the Python you're using, today, does that.

You can't learn the language specification from watching behaviour;
you learn the guaranteed behaviour -- what you may rely on happening --
from the specification, and you can test that an implementation obeys (or
at any rate, does not disobey) the specification by watching behaviour.

You seem to be trying to learn the spec from behaviour.

Cheers,
-- 
Cameron Simpson  DoD#743
http://www.cskk.ezoshosting.com/cs/

Loud pipes make noise. Skill and experience save lives.
- EdBob Morandi
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Peng Yu
On Fri, May 4, 2012 at 6:12 PM, Cameron Simpson  wrote:
> On 04May2012 15:08, Peng Yu  wrote:
> | On Fri, May 4, 2012 at 12:43 PM, Terry Reedy  wrote:
> | > On 5/4/2012 8:00 AM, Peng Yu wrote:
> | >> On Fri, May 4, 2012 at 6:21 AM, Chris Angelico  wrote:
> | >>> On Fri, May 4, 2012 at 8:14 PM, Peng Yu  wrote:
> | >>>> Thanks. This is what I'm looking for. I think that this should be
> | >>>> added to the python document as a manifestation (but nonnormalized) of
> | >>>> what "A set object is an unordered collection of distinct hashable
> | >>>> objects" means.
> | >>>
> | >>> There are other things that can prove it to be unordered, too; the
> | >>> exact pattern and order of additions and deletions can affect the
> | >>> iteration order. The only thing you can be sure of is that you can't
> | >>> be sure of it.
> | >>
> | >> I agree. My point was just to suggest adding more explanations on the
> | >> details in the manual.
> | >
> | > I am not sure how much clearer we can be in the language manual. The word
> | > 'unordered' means just that. [...]
> |
> | You can just add the example that you posted to demonstrate what the
> | unordered means. A curious user might want to know under what
> | condition the "unorderness" can affect the results, because for
> | trivial examples (like the following), it does seem that there is some
> | orderness in a set.
>
> I'm with Terry here: anything else in the line you suggest would
> complicate things for the reader, and potentially mislead.
>
> Future implementation changes (and, indeed, _other_ implementations like
> Jython) can change any of this. So there _are_ no ``condition the
> "unorderness" can affect the results'': a set is unordered, and you
> could even _legitimately_ get different orders from the same set
> if you iterate over it twice. It is unlikely, but permissible.
>
> Any attempt to describe such conditions beyond "it might happen at any
> time" would be misleading.
>
> | set(['a', 'b', 'c'])
> | set(['c', 'b', 'a'])
>
> The language does not say these will get the same iteration order. It
> happens that the Python you're using, today, does that.
>
> You can't learn the language specification from watching behaviour;
> you learn the guaranteed behaviour -- what you may rely on happening --
> from the specification, and you can test that an implementation obeys (or
> at any rate, does not disobey) the specification by watching behaviour.
>
> You seem to be trying to learn the spec from behaviour.

My point is that if something is said in the document, it is better
substantiated by an example. I don't think that this has anything to do
with "learn the spec from behaviour."

-- 
Regards,
Peng
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Mark Lawrence

On 05/05/2012 00:37, Peng Yu wrote:
>
> My point is that if something is said in the document, it is better
> substantiated by an example. I don't think that this has anything to do
> with "learn the spec from behaviour."



I side with the comments made by Terry Reedy and Cameron Simpson so 
please give it a rest, you're flogging a dead horse.


--
Cheers.

Mark Lawrence.

--
http://mail.python.org/mailman/listinfo/python-list


Re: When convert two sets with the same elements to lists, are the lists always going to be the same?

2012-05-04 Thread Terry Reedy

Peng, I actually am thinking about it.

Underlying problem: while unordered means conceptually unordered as far 
as the collection goes, the items in the collection, if homogeneous 
enough, may have a natural order, which users find hard to ignore. Even 
if not comparable, an implementation such as CPython that uses linear 
sequential memory will impose some order. Even if the implementation 
uses unordered (holographic?) memory, order will be imposed to iterate, 
as when creating a serialized representation of the collection. Abstract 
objects, concrete objects, and serialized representations are three 
different things, but people tend to conflate them.


Order consistency issues: if the unordered collection is iterated, when 
can one expect the order to be the same? Theoretically, essentially 
never, except that iterating dicts by keys, values, or key-value pairs 
is guaranteed to be consistent, which means that re-iterating has to be 
consistent. I actually think the same might as well be true for sets, 
although there is no doc that says so.


If two collections are equal, should the iteration order be the same? It 
has always been true that if hash values collide, insertion order 
matters. However, a good hash function avoids hash collisions as much as 
possible in practical use cases. Without doing something artificial, as 
I did with the example, collisions should be especially rare on 64-bit 
builds. If one collection has a series of additions and deletions so 
that the underlying hash table has a different size than an equal 
collection built just from insertions, then order will also be different.
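
A small sketch of what that means for code that compares collections. The three sets below are equal but were built with different histories; the particular integers are arbitrary choices for illustration. Whether their iteration orders agree is left to the implementation, so the sketch makes no claim about it and relies only on equality and sorting.

```python
# Three equal sets with different construction histories.
a = set([0, 8, 16])
b = set([16, 8, 0])          # same items, reversed insertion order
c = set(range(100))
for n in range(100):
    if n not in (0, 8, 16):
        c.discard(n)         # many additions and deletions, same final content

equal_as_sets = (a == b == c)

# Iteration orders: an implementation detail, may or may not agree.
orders = [list(s) for s in (a, b, c)]

# Sorted copies: always agree, so these are safe to compare or print in tests.
canonical = [sorted(s) for s in (a, b, c)]
```

Comparing the sets themselves, or sorted copies, sidesteps the question of iteration order entirely.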


If the same collection is built by insertion in the same order, but in 
different runs, bugfix versions, or language versions, will iteration 
order be the same? Historically, it has been for CPython for about a 
decade, and people have come to depend on that constancy, in spite of 
warnings not to. Even core developers have not been immune, as the 
CPython test suite had a few set or dict iteration order dependencies 
until it was recently randomized.


Late last year, it became obvious that this constancy is a practical 
denial-of-service security hole. The option to randomize hashing for 
each run was added to 2.6, 2.7, 3.1, and 3.2. Randomized hashing by run 
is part of 3.3. So some of the discussion above is obsolete. The example 
I gave only works for that one run, as hash('a') changes each run. So 
iteration order now changes with each run in fact as well as in theory.
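
One way to observe this, sketched in Python 3 syntax: run the same one-line program in child interpreters with different values of the PYTHONHASHSEED environment variable. The helper name order_with_seed and the sample letters are made up for the demonstration. The orders printed for different seeds may differ, though nothing promises that any two particular seeds disagree.

```python
import os
import subprocess
import sys

# The one-line program each child interpreter runs.
SNIPPET = "print(list(set(['a', 'b', 'c', 'd', 'e', 'f'])))"


def order_with_seed(seed):
    """Return the printed set iteration order under a fixed hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output([sys.executable, "-c", SNIPPET], env=env)
    return out.decode().strip()


orders = {seed: order_with_seed(seed) for seed in (1, 2, 3)}
repeat = order_with_seed(1)  # same seed again: same string hashes as before
```

Fixing the seed is useful for reproducing a failure, but tests that only pass under one seed are depending on the order.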


For the doc, the problem is what to say and where without being 
repetitious (and to get multiple people to agree ;-).


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list