Re: building Python 2.4.2 on Mac OS X

2006-01-09 Thread Walter Overby
Not sure if this will meet your needs, but I have had good luck using
the "Fink" package manager, which has 2.4.2 in unstable.  It takes a
while to update the package list and build, but it worked for me
without errors.

Regards,

Walter.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: 2.6, 3.0, and truly independent interpreters

2008-11-06 Thread Walter Overby
Hi,

I've been following this discussion, and although I'm not nearly the
Python expert that others on this thread are, I think I understand
Andy's point of view.  His premises seem to include at least:

1. His Python code does not control the creation of the threads.  That
is done "at the app level".
2. Perhaps more importantly, his Python code does not control the
allocation of the data he needs to operate on.  He's got, for example,
"an opaque OS object" that is manipulated by CPU-intensive OS
functions.

sturlamolden suggests a few approaches:

> 1. Check if a NumPy record array may suffice (dtypes may be nested).
> It will if you don't have dynamically allocated pointers inside the
> data structure.

I suspect that the OS has dynamically allocated pointers inside its
opaque structures.
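For readers unfamiliar with nested dtypes, here is a minimal sketch of what sturlamolden is proposing (my own toy example, not Andy's actual data), and the comment notes why pointer-bearing structures rule it out:

```python
import numpy as np

# A record array with a nested dtype: every field is a fixed-size value
# stored inline in the array's buffer.  That is exactly why it fails for
# Andy's case: an opaque OS handle is a pointer into process-private
# memory, and a record array has nowhere to put what it points at.
point_t = np.dtype([("x", np.float64), ("y", np.float64)])
rec_t = np.dtype([("id", np.int32), ("pos", point_t)])  # nested dtype

arr = np.zeros(3, dtype=rec_t)
arr[0] = (7, (1.5, 2.5))
print(arr[0]["pos"]["x"])  # -> 1.5
```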

> 2. Consider using multiprocessing's proxy objects or outproc ActiveX
> objects.

I don't understand how this would help.  If these large data
structures reside only in one remote process, then the overhead of
proxying the data into another process for manipulation requires too
much IPC, or at least so Andy stipulates.

> 3. Go to http://pyro.sourceforge.net, download the code and read the
> documentation.

I don't see how this solves the problem I raised with suggestion 2.  I
admit I have only cursory knowledge, but I understand "remoting"
approaches to share the same weakness.

I understand Andy's problem to be that he needs to operate on a large
amount of in-process data from several threads, and each thread mixes
CPU-intensive C functions with callbacks to Python utility functions.
He contends that, even though he releases the GIL in the CPU-bound C
functions, the reacquisition of the GIL for the utility functions
causes unacceptable contention slowdowns in the current implementation
of CPython.
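The pattern, as I understand it, looks roughly like this (my sketch, using hashlib as a stand-in for the CPU-bound C functions, since CPython releases the GIL while hashing large buffers):

```python
import hashlib
import threading

# Stand-in for Andy's workload: each thread runs CPU-heavy C code that
# releases the GIL (hashlib does, for buffers this big), then calls
# back into a small Python helper, which must reacquire the GIL.  With
# many threads, it is those reacquisitions that serialize execution.
def py_callback(digest):
    return digest[:8]                      # trivial, but needs the GIL

def worker(data, out, idx):
    h = hashlib.sha256(data).hexdigest()   # GIL released during hashing
    out[idx] = py_callback(h)              # GIL reacquired here

data = b"x" * (1 << 20)                    # 1 MB of input per thread
results = [None] * 4
threads = [threading.Thread(target=worker, args=(data, results, i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```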

After reading Martin's posts, I think I also understand his point of
view.  Is the time spent in these Python callbacks really so large,
compared to the C functions, that the contention matters?  If so, then
Andy has
crossed over into writing performance-critical code in Python.  Andy
proposes that the Python community could work on making that possible,
but Martin cautions that it may be very hard to do so.

If I understand them correctly, none of these concerns are silly.

Walter.


Re: 2.6, 3.0, and truly independent interpreters

2008-11-06 Thread Walter Overby
On Nov 6, 2:03 pm, sturlamolden <[EMAIL PROTECTED]> wrote:
> On Nov 6, 6:05 pm, Walter Overby <[EMAIL PROTECTED]> wrote:
>
> > I don't understand how this would help.  If these large data
> > structures reside only in one remote process, then the overhead of
> > proxying the data into another process for manipulation requires too
> > much IPC, or at least so Andy stipulates.
>
> Perhaps it will, or perhaps not. Reading or writing to a pipe has
> slightly more overhead than a memcpy. There are things that Python
> needs to do that are slower than the IPC. In this case, the real
> constraint would probably be contention for the object in the server,
> not the IPC. (And don't blame it on the GIL, because putting a lock
> around the object would not be any better.)

(I'm not blaming anything on the GIL.)

I read Andy to stipulate that the pipe needs to transmit "hundreds of
megs of data and/or thousands of data structure instances."  I doubt
he'd be happy with memcpy either.  My instinct is that contention for
a lock could be the quicker option.

And don't forget, he says he's got an "opaque OS object."  He asked
the group to explain how to send that via IPC to another process.  I
surely don't know how.

> > > 3. Go to http://pyro.sourceforge.net, download the code and read the
> > > documentation.
>
> > I don't see how this solves the problem with 2.
>
> It puts Python objects in shared memory. Shared memory is the fastest
> form of IPC there is. The overhead is basically zero. The only
> constraint will be contention for the object.

I don't think he has Python objects to work with.  I'm persuaded when
he says: "when you're talking about large, intricate data structures
(which include opaque OS object refs that use process-associated
allocators), even a shared memory region between the child process and
the parent can't do the job."

Why aren't you persuaded?



> Yes, callbacks to Python are expensive. But is the problem the GIL?
> Instead of contention for the GIL, he seems to prefer contention for a
> complex object. Is that any better? It too has to be protected by a
> lock.

At a couple points, Andy has expressed his preference for a "single
high level sync object" to synchronize access to the data, at least
that's my reading.  What he doesn't seem to prefer is the slowdown
arising from the Python callbacks acquiring the GIL.  I think that
would be an additional lock, and that's near the heart of Andy's
concern, as I read him.
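As I read him, the shape he wants is something like this (my sketch, not Andy's code): one coarse lock taken once per logical operation, rather than a GIL acquisition inside every callback along the way.

```python
import threading

# One application-level lock guarding the whole shared structure,
# taken once per logical operation -- as opposed to the GIL, which
# every Python callback inside that operation must take separately.
class SharedState:
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def update(self, key, value):
        with self._lock:          # single coarse acquisition
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)

state = SharedState()
state.update("frames", 42)
print(state.get("frames"))  # -> 42
```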

> > If I understand them correctly, none of these concerns are silly.
>
> No they are not. But I think he underestimates what multiple processes
> can do. The objects in 'multiprocessing' are already a lot faster than
> their 'threading' and 'Queue' counterparts.

Andy has complimented 'multiprocessing' as a "huge huge step."  He
just offers a scenario where multiprocessing might not be the best
solution, and so far, I see no evidence he is wrong.  That's not
underestimation, in my estimation!

Walter.


Re: Use Regular Expressions to extract URL's

2010-05-01 Thread Walter Overby
A John Gruber post from November 2009 seems relevant.  I have not
tried his regex in any language.

http://daringfireball.net/2009/11/liberal_regex_for_matching_urls
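For anyone who wants something runnable right away, here is a much-simplified matcher in the same spirit (my own toy pattern, deliberately stricter than Gruber's far more liberal one):

```python
import re

# A deliberately simplified URL matcher: grab http(s) URLs, then trim
# common trailing punctuation that the greedy match swallows.  Gruber's
# pattern handles many more cases (bare domains, balanced parens, etc.).
url_re = re.compile(r'https?://[^\s<>"]+')

def extract_urls(text):
    return [u.rstrip(".,);]") for u in url_re.findall(text)]

text = ("See http://daringfireball.net/2009/11/liberal_regex_for_matching_urls,"
        " and https://python.org.")
print(extract_urls(text))
```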

Regards,

Walter.