Re: building Python 2.4.2 on Mac OS X
Not sure if this will meet your needs, but I have had good luck using the "Fink" package manager, which has 2.4.2 in unstable. It takes a while to update the package list and build, but it worked for me without errors.

Regards,
Walter.
Re: 2.6, 3.0, and truly independent interpreters
Hi, I've been following this discussion, and although I'm not nearly the Python expert that others on this thread are, I think I understand Andy's point of view. His premises seem to include at least:

1. His Python code does not control the creation of the threads. That is done "at the app level".

2. Perhaps more importantly, his Python code does not control the allocation of the data he needs to operate on. He's got, for example, "an opaque OS object" that is manipulated by CPU-intensive OS functions.

sturlamolden suggests a few approaches:

> 1. Check if a NumPy record array may suffice (dtypes may be nested). It will if you don't have dynamically allocated pointers inside the data structure.

I suspect that the OS is very likely to have dynamically allocated pointers inside its opaque structures.

> 2. Consider using multiprocessing's proxy objects or outproc ActiveX objects.

I don't understand how this would help. If these large data structures reside only in one remote process, then the overhead of proxying the data into another process for manipulation requires too much IPC, or at least so Andy stipulates.

> 3. Go to http://pyro.sourceforge.net, download the code and read the documentation.

I don't see how this solves the problem with 2. I admit I have only cursory knowledge, but I understand "remoting" approaches to have the same weakness.

I understand Andy's problem to be that he needs to operate on a large amount of in-process data from several threads, and each thread mixes CPU-intensive C functions with callbacks to Python utility functions. He contends that, even though he releases the GIL in the CPU-bound C functions, the reacquisition of the GIL for the utility functions causes unacceptable contention slowdowns in the current implementation of CPython.

After reading Martin's posts, I think I also understand his point of view: is the time spent in these Python callbacks really so large, compared to the time spent in the C functions, that the waiting matters? If so, then Andy has crossed over into writing performance-critical code in Python. Andy proposes that the Python community could work on making that possible, but Martin cautions that it may be very hard to do so.

If I understand them correctly, none of these concerns are silly.

Walter.
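For concreteness, a nested record dtype along the lines of sturlamolden's first suggestion might look like the sketch below. The field names and layout are invented for illustration, and the approach only applies when the data is fixed-size, plain-old-data, with no pointers or opaque OS handles inside:

import numpy as np

# A nested record dtype: every element holds a 3-D position plus a short label.
# Everything is fixed-size, plain-old-data: no pointers, no opaque OS handles.
point_t = np.dtype([("x", np.float64), ("y", np.float64), ("z", np.float64)])
record_t = np.dtype([("pos", point_t), ("label", "S16")])

data = np.zeros(1000, dtype=record_t)    # one contiguous block of memory
data[0] = ((1.0, 2.0, 3.0), b"origin")   # assign a whole record at once

print(data[0]["pos"]["x"])               # 1.0
print(data.nbytes)                       # 40000 bytes, all in a single buffer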
Re: 2.6, 3.0, and truly independent interpreters
On Nov 6, 2:03 pm, sturlamolden <[EMAIL PROTECTED]> wrote:
> On Nov 6, 6:05 pm, Walter Overby <[EMAIL PROTECTED]> wrote:
>
> > I don't understand how this would help. If these large data structures reside only in one remote process, then the overhead of proxying the data into another process for manipulation requires too much IPC, or at least so Andy stipulates.
>
> Perhaps it will, or perhaps not. Reading or writing to a pipe has slightly more overhead than a memcpy. There are things that Python needs to do that are slower than the IPC. In this case, the real constraint would probably be contention for the object in the server, not the IPC. (And don't blame it on the GIL, because putting a lock around the object would not be any better.)

(I'm not blaming anything on the GIL.)

I read Andy to stipulate that the pipe needs to transmit "hundreds of megs of data and/or thousands of data structure instances." I doubt he'd be happy with memcpy either. My instinct is that contention for a lock could be the quicker option.

And don't forget, he says he's got an "opaque OS object." He asked the group to explain how to send that via IPC to another process. I surely don't know how.

> > > 3. Go to http://pyro.sourceforge.net, download the code and read the documentation.
>
> > I don't see how this solves the problem with 2.
>
> It puts Python objects in shared memory. Shared memory is the fastest form of IPC there is. The overhead is basically zero. The only constraint will be contention for the object.

I don't think he has Python objects to work with. I'm persuaded when he says: "when you're talking about large, intricate data structures (which include opaque OS object refs that use process-associated allocators), even a shared memory region between the child process and the parent can't do the job." Why aren't you persuaded?

> Yes, callbacks to Python are expensive. But is the problem the GIL? Instead of contention for the GIL, he seems to prefer contention for a complex object. Is that any better? It too has to be protected by a lock.

At a couple points, Andy has expressed his preference for a "single high level sync object" to synchronize access to the data, at least that's my reading. What he doesn't seem to prefer is the slowdown arising from the Python callbacks acquiring the GIL. I think that would be an additional lock, and that's near the heart of Andy's concern, as I read him.

> > If I understand them correctly, none of these concerns are silly.
>
> No they are not. But I think he underestimates what multiple processes can do. The objects in 'multiprocessing' are already a lot faster than their 'threading' and 'Queue' counterparts.

Andy has complimented 'multiprocessing' as a "huge huge step." He just offers a scenario where multiprocessing might not be the best solution, and so far, I see no evidence he is wrong. That's not underestimation, in my estimation!

Walter.
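A minimal sketch of the shared-memory arrangement sturlamolden describes, using the stock multiprocessing module. The buffer, the doubling workload, and all names below are invented for illustration, and the sketch assumes the data is a flat numeric buffer; it says nothing about opaque OS object refs, which is the sticking point above:

import ctypes
import multiprocessing as mp

def worker(buf, total, lock, start, stop):
    # CPU-bound work on this worker's slice of the shared buffer.
    # Nothing is pickled or pushed through a pipe: every process sees
    # the same underlying memory.
    partial = 0.0
    for i in range(start, stop):
        buf[i] *= 2.0
        partial += buf[i]
    # One coarse lock (a 'single high level sync object') guards the
    # only state the workers mutate in common.
    with lock:
        total.value += partial

if __name__ == "__main__":
    n = 100000
    buf = mp.Array(ctypes.c_double, [1.0] * n, lock=False)   # shared, unsynchronized
    total = mp.Value(ctypes.c_double, 0.0, lock=False)
    lock = mp.Lock()
    step = n // 4
    procs = [mp.Process(target=worker, args=(buf, total, lock, k * step, (k + 1) * step))
             for k in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(total.value)   # 200000.0: every element doubled exactly once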
Re: Use Regular Expressions to extract URLs
A John Gruber post from November seems relevant. I have not tried his regex in any language.

http://daringfireball.net/2009/11/liberal_regex_for_matching_urls

Regards,
Walter.
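For a rough idea of the technique, here is a simplified sketch using Python's re module. The pattern is an invented stand-in and is far less liberal than Gruber's actual regex at the link above:

import re

# A deliberately simplified pattern, for illustration only. Gruber's regex
# (linked above) is far more liberal about punctuation, parentheses, and
# bare domain names.
URL_RE = re.compile(r"""\bhttps?://[^\s<>"']+[^\s<>"'.,;:!?)]""")

text = ("Gruber explains it at "
        "http://daringfireball.net/2009/11/liberal_regex_for_matching_urls, "
        "worth a read.")
print(URL_RE.findall(text))
# ['http://daringfireball.net/2009/11/liberal_regex_for_matching_urls']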
