Re: OT I before E [was Re: Lies in education [was Re: The "loop and a half"]]
On Oct 5, 2017 2:02 PM, "Roel Schroeven" wrote:

Thomas Jollans wrote on 5/10/2017 10:30:
> On 2017-10-05 06:47, Steve D'Aprano wrote:
>> On Thu, 5 Oct 2017 02:54 pm, Chris Angelico wrote:
>>> (There are exceptions even to the longer form of the rule, but only a
>>> handful. English isn't a tidy language.)
>>
>> Even with the longer version of the rule, there are so few applicable
>> cases, and enough difficulty in applying the rule correctly, that the
>> rule is not worth the breath it takes to say it.
>
> Maybe we should just all switch to Dutch on this list. Might be easier.
> Certainly more consistent.

Although that way may not be obvious at first unless you're Dutch. (Or
Flemish.) There are a lot of exceptions, weird rules and other difficulties
in Dutch as well; we native speakers just don't notice them that much.

--
The saddest aspect of life right now is that science gathers knowledge
faster than society gathers wisdom. -- Isaac Asimov

Roel Schroeven

Esperanto. https://en.m.wikipedia.org/wiki/Esperanto_grammar

Thanks,
Cem Karan
--
https://mail.python.org/mailman/listinfo/python-list
FYI apparmor and lxc in Ubuntu
Hi all, this is just an FYI in case anyone else runs into the same issue I
just did. If you use Python 3.6 or 3.7 under Ubuntu with lxc, you may
discover that your site-packages aren't being imported correctly within the
container, even though everything works correctly when you SSH in. If that
happens, check the apparmor config file; on my system (Ubuntu 16.04), it
disallowed Python 3.6+. You will also need to set the language environment
variable for lxc to use UTF-8, or your strings may get discarded (it
defaults to ASCII). SSH goes through a different path for setting up the
environment, so sshing in won't trigger the error. (A quick diagnostic
sketch follows this message.)

Thanks,
Cem Karan
--
https://mail.python.org/mailman/listinfo/python-list
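A minimal diagnostic for both symptoms above; run it inside the container
and again over SSH and compare (nothing here is lxc-specific, it only
reports what the interpreter sees):

"""
import locale
import sys

print(sys.version)                    # which interpreter actually ran
print(sys.path)                       # is site-packages on the path?
print(locale.getpreferredencoding())  # 'UTF-8' vs 'ANSI_X3.4-1968' (ASCII)
"""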
Re: Cross platform mutex to prevent script running more than one instance?
What about using flock()? I don't know if it works on Windows, but it works
really well on Unix/Linux systems. I typically create a log file in a known
location using any atomic method that doesn't replace/overwrite an existing
file, and flock() it for the duration of the script. (See the sketch after
this message.)

Thanks,
Cem Karan

On Mon, Sep 3, 2018, 11:39 PM Cameron Simpson wrote:

> On 03Sep2018 07:45, Malcolm Greene wrote:
>> Use case: Want to prevent 2+ instances of a script from running ...
>> ideally in a cross-platform manner. I've been researching this topic and
>> am surprised how complicated this capability appears to be and how
>> diverse the solution set is. I've seen solutions ranging from using
>> directories, named temporary files, named sockets/pipes, etc. Is there
>> any consensus on best practice here?
>
> I like os.mkdir of a known directory name. This tends to be atomic and
> forbidden when the name already exists, on all UNIX platforms, over
> remote filesystems. And, I expect, likewise on Windows.
>
> All the other modes like opening files O_EXCL etc tend to be platform
> specific and not reliable over network filesystems.
>
> And pid-based approaches don't work cross-machine, if that is an issue.
>
> Cheers,
> Cameron Simpson
> --
> https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
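A minimal sketch of the flock() approach, assuming a POSIX system; the
lock-file path is a placeholder:

"""
import fcntl
import sys

# The kernel drops the lock automatically when the process exits, so a
# crash can't leave it stuck.
lock_file = open("/tmp/myscript.lock", "w")
try:
    fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
except BlockingIOError:
    sys.exit("another instance is already running")

# ... the rest of the script runs while the lock is held ...
"""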
Re: Python environment on mac
There are two variables you will need to set: PATH and PYTHONPATH. You set
your PYTHONPATH correctly, but for executables like pip, you need to set
the PATH as well. You MUST do that for each account! The reason it didn't
work as root is that once you su to root, your PYTHONPATH and PATH (and all
other environment variables) are replaced with root's. sudo shouldn't have
that problem. BE VERY CAREFUL CHANGING THESE VARIABLES FOR ROOT! I managed
to wedge a system until I reverted my environment. (A small diagnostic
sketch follows this message.)

Thanks,
Cem Karan

On Jul 26, 2016 9:58 AM, "Crane Ugly" wrote:

> Mac OS X comes with its own version of Python and a structure to support
> it. So far it was good enough for me. Then I started to use modules
> distributed through MacPorts, and this is where I got lost.
> I do not quite understand how the Python environment is set, or how to
> set it to use, say, the MacPorts distribution alone.
> For example: the standard location for the pip utility is
> /usr/local/bin/pip. The MacPorts structure has it too, but as a link:
> lrwxr-xr-x 1 root admin 67 May 23 22:32 /opt/local/bin/pip-2.7 ->
> /opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin/pip
> Which means that the standard utility will be used.
> The thing is that depending on the way I run pip I get different results:
> $ pip list | grep pep8
> pep8 (1.7.0)
> $ sudo pip list | grep pep8
> $
> pep8 was installed through MacPorts.
> In the second case pip is using a stripped environment and pointing to
> the standard Mac OS Python repository.
> But to install anything with pip I have to use sudo.
> In my profile I have the variable PYTHONPATH:
> PYTHONPATH=/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages
> It is pointing to the MacPorts structure. But when I use sudo (in the
> case of using pip) it gets stripped.
> How do I set up and maintain a Python environment in a trustworthy way,
> so that it is clear where all installed modules are?
>
> Leonid
> --
> https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
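A small diagnostic sketch: run it plain, under sudo, and after su, and
compare the output to see exactly which environment each invocation gets:

"""
import os
import sys

print(sys.executable)               # which python binary actually ran
print(os.environ.get("PATH"))
print(os.environ.get("PYTHONPATH"))
print(sys.path)                     # where imports will come from
"""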
Re: Diff between object graphs?
On Wed, Apr 22, 2015 at 8:11 AM, Rustom Mody wrote:

> On Wednesday, April 22, 2015 at 4:07:35 PM UTC+5:30, Cem Karan wrote:
>> Hi all, I need some help. I'm working on a simple event-based simulator
>> for my dissertation research. The simulator has state information that I
>> want to analyze as a post-simulation step, so I currently save (pickle)
>> the entire simulator every time an event occurs; this lets me analyze
>> the simulation at any moment in time, and ask questions that I haven't
>> thought of yet. The problem is that pickling this amount of data is both
>> time-consuming and a space hog. This is true even when using bz2.open()
>> to create a compressed file on the fly.
>
> No answer to your questions...
> But you do know that bzip2 is rather worse than gzip in time and not
> really so much better in space, don't you?
> http://tukaani.org/lzma/benchmarks.html

I had no idea; I'll try my tests using gzip as well, just to see (the swap
is sketched after this message). That said, I could still use the diff
between object graphs; saving less state is definitely going to be a
speed/space improvement over saving everything!

Thanks,
Cem Karan
--
https://mail.python.org/mailman/listinfo/python-list
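A minimal sketch of the gzip variant being suggested; "sim" stands in for
whatever object holds the simulator state:

"""
import gzip
import pickle

def save_state(sim, path):
    # One checkpoint per event; gzip trades a little compression ratio
    # for speed relative to bz2.
    with gzip.open(path, "wb") as f:
        pickle.dump(sim, f, protocol=pickle.HIGHEST_PROTOCOL)

def load_state(path):
    with gzip.open(path, "rb") as f:
        return pickle.load(f)
"""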
Issue 19904
Does anyone know where issue 19904 (http://bugs.python.org/issue19904) is
at? I don't see it as being in Python 3.5, but I was wondering if I just
missed it. I could use support for __uint128_t so that I can interface with
external C code via ctypes. (A possible stopgap is sketched after this
message.)

Thanks,
Cem Karan
--
https://mail.python.org/mailman/listinfo/python-list
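Until ctypes grows a native c_uint128, a common workaround is to model the
value as two 64-bit halves. A hypothetical sketch (the field order assumes
the low word comes first; verify against the C side's ABI):

"""
import ctypes

class UInt128(ctypes.Structure):
    # Stand-in for __uint128_t: low and high 64-bit words.
    _fields_ = [("lo", ctypes.c_uint64),
                ("hi", ctypes.c_uint64)]

    def as_int(self):
        # Reassemble the full 128-bit value as a Python int.
        return (self.hi << 64) | self.lo
"""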
Re: Issue 19904
OK, thanks. I know it depends on differing compiler intrinsics, so it isn't
a trivial patch. Maybe in 3.6 or 3.7.

Thanks,
Cem Karan

-------- Original message --------
From: Zachary Ware
Date: 08/27/2015 3:23 PM (GMT-05:00)
To: [email protected]
Subject: Re: Issue 19904

On Thu, Aug 27, 2015 at 2:02 PM, CFK wrote:
> Does anyone know where issue 19904 (http://bugs.python.org/issue19904) is
> at? I don't see it as being in python 3.5, but I was wondering if I just
> missed it. I could use support for __uint128_t so that I can interface
> with external C code via ctypes.

The issue is still open and no commits are listed in the messages, so it's
still just a request.

--
Zach
--
https://mail.python.org/mailman/listinfo/python-list
Battle of the garbage collectors, or ARGGHHHHHH!!!!
TLDR version: the bdwgc garbage collector (http://www.hboehm.info/gc/) and
Python's collector are not playing nice with one another, and I need to
make them work with each other.

Long version: I'm trying to write bindings for Python via ctypes to control
a library written in C that uses the bdwgc garbage collector
(http://www.hboehm.info/gc/). The bindings mostly work, except when either
bdwgc or Python's garbage collector decides to get into an argument over
what is garbage and what isn't, in which case I get a segfault because one
or the other collector has already reaped the memory. I need the two sides
to play nice with one another. I can think of two solutions.

First, I can replace Python's memory allocators via the functions described
at https://docs.python.org/3/c-api/memory.html#customize-memory-allocators
so that they use the bdwgc functions instead. However, this leads me to a
whole series of questions:

1. Has anyone done anything like this before? Is there any reason to
   believe it won't work?

2. Since I'm going through ctypes, the Python interpreter will be up and
   running before my library's code is called. I'm guessing that this will
   lead to horribleness, but I'm hoping that Python is able to do better
   than that somehow.

Second, I could hope that there is some way of getting memory from Python,
using it in C, and letting the Python garbage collector deal with it
(essentially replacing bdwgc in the C code with Python's garbage
collector).

I don't have a great deal of hope for either method working, but I'm hoping
I'm wrong, and that someone can save me from the headaches I'm having. Is
there hope, or am I stuck?

Thanks,
Cem Karan
--
https://mail.python.org/mailman/listinfo/python-list
Unable to subclass ctypes.c_uint64: was: Re: Battle of the garbage collectors, or ARGGHHHHHH!!!!
On Wed, Apr 26, 2017 at 10:38 PM, Cem Karan wrote:

> On Apr 24, 2017, at 8:54 PM, Jon Ribbens wrote:
>
>> On 2017-04-24, CFK wrote:
>>> Long version: I'm trying to write bindings for Python via ctypes to
>>> control a library written in C that uses the bdwgc garbage collector
>>> (http://www.hboehm.info/gc/). The bindings mostly work, except for when
>>> either bdwgc or Python's garbage collector decide to get into an
>>> argument over what is garbage and what isn't, in which case I get a
>>> segfault because one or the other collector has already reaped the
>>> memory.
>>
>> Make your Python C objects contain a pointer to a
>> GC_MALLOC_UNCOLLECTABLE block that contains a pointer to the bdwgc
>> object it's an interface to? And GC_FREE it in tp_dealloc? Then bdwgc
>> won't free any C memory that Python is referencing.
>
> OK, I realized today that there was a miscommunication somewhere. My
> Python code is all pure Python, and the library is pure C, and it is not
> designed to be called by Python (it's intended to be language neutral, so
> if someone wants to call it from a different language, they can). That
> means that tp_dealloc (which is part of the Python C API) is probably not
> going to work.
>
> I got interrupted (again), so I didn't have a chance to try the next
> trick and register the ctypes objects as roots from which to scan in
> bdwgc, but I'm hoping that roots aren't removed. If that works, I'll post
> it to the list.
>
> Thanks,
> Cem Karan

I'm still working on fixing the battle of the garbage collectors, but as
part of that work I've realized that it would be handy for me to subclass
various ctypes like so:

"""
from ctypes import *

class foo(c_uint64):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Additional setup work here, including marking this type as
        # something bdwgc should leave alone

    def __del__(self):
        # Allow anything not owned by Python to be reclaimed by bdwgc
        super().__del__()
"""

Where the additional work would shift the type from being in the root set
to out of it, and (I hope) stop the battle of the garbage collectors. The
issue is that while the above code works in Python 3.4 and earlier, I get
the following from Python 3.6.1:

"""
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __class__ set to <class '__main__.foo'> defining 'foo' as
<class '__main__.foo'>
"""

Is this the way of the future, or is this a bug that should be reported
appropriately? Relevant info:

"""
$ python
Python 3.6.1 (default, Apr 24 2017, 08:00:07)
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
$ uname -a
Darwin Mac-Pro.local 16.5.0 Darwin Kernel Version 16.5.0: Fri Mar 3
16:52:33 PST 2017; root:xnu-3789.51.2~3/RELEASE_X86_64 x86_64
$ sw_vers
ProductName:    Mac OS X
ProductVersion: 10.12.4
BuildVersion:   16E195
"""

Thanks,
Cem Karan
--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Jun 21, 2017 1:38 AM, "Paul Rubin" wrote:

Cem Karan writes:
> I'm not too sure how much of a performance impact that will have. My
> code generates a very large number of tiny, short-lived objects at a
> fairly high rate of speed throughout its lifetime. At least in the last
> iteration of the code, garbage collection consumed less than 1% of the
> total runtime. Maybe this is something that needs to be done and
> profiled to see how well it works?

If the gc uses that little runtime and your app isn't suffering from the
added memory fragmentation, then it sounds like you're doing fine.

Yes, and this is why I suspect CPython would work well too. My usage
pattern may be similar to typical Python usage patterns. The only way to
know for sure is to try it and see what happens.

> I **still** can't figure out how they managed to do it,

How it works (i.e. what the implementation does) is quite simple and
understandable. The amazing thing is that it doesn't leak memory
catastrophically.

I'll have to read through the code then, just to see what they are doing.

Thanks,
Cem Karan
--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy (Posting On Python-List Prohibited)
On Jun 22, 2017 12:38 AM, "Paul Rubin" wrote:

Lawrence D’Oliveiro writes:
> while “memory footprint” depends on how much memory is actually being
> retained in accessible objects.

If the object won't be re-accessed but is still retained by gc, then
refcounting won't free it either.

> Once again: The trouble with GC is, it doesn’t know when to kick in: it
> just keeps on allocating memory until it runs out.

When was the last time you encountered a problem like that in practice?
It's almost never an issue. "Runs out" means reaching an allocation
threshold that's usually much smaller than the program's memory region.
And as you say, you can always manually trigger a gc if the need arises.

I'm with Paul and Steve on this. I've had to do a **lot** of profiling on
my simulator to get it to run at a reasonable speed. Memory usage seems to
follow an exponential decay curve, approaching a strict maximum that
strongly correlates with the number of live objects in a given simulation
run. When I draw memory usage graphs, I see sawtooth waves in the memory
usage, which suggests that garbage builds up until the GC kicks in and
reaps it. In short, only an exceptionally poorly written GC would exhaust
memory before reaping garbage.

Thanks,
Cem Karan
--
https://mail.python.org/mailman/listinfo/python-list
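A quick illustration of the allocation-threshold point above; both calls
are standard parts of CPython's gc module:

"""
import gc

# The cyclic collector runs when allocations minus deallocations cross a
# small per-generation threshold, not when memory is exhausted:
print(gc.get_threshold())   # defaults to (700, 10, 10)

# ...and a full collection can always be triggered by hand:
unreachable = gc.collect()
print(unreachable)          # number of unreachable objects found
"""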
Re: Progress on the Gilectomy (Posting On Python-List Prohibited)
On Jun 22, 2017 9:32 AM, "Chris Angelico" wrote:

On Thu, Jun 22, 2017 at 11:24 PM, CFK wrote:
> When I draw memory usage graphs, I see sawtooth waves to the memory usage
> which suggest that the garbage builds up until the GC kicks in and reaps
> the garbage.

Interesting. How do you actually measure this memory usage? Often, when a
GC frees up memory, it's merely made available for subsequent allocations,
rather than actually given back to the system - all it takes is one
still-used object on a page and the whole page has to be retained.

As such, a "create and drop" usage model would tend to result in memory
usage going up for a while, but then remaining stable, as all allocations
are being fulfilled from previously-released memory that's still owned by
the process.

I'm measuring it using a bit of a hack; I use psutil.Popen
(https://pypi.python.org/pypi/psutil) to open a simulation as a child
process, and in a tight loop gather the size of the child's resident set
and the number of virtual pages it currently has in use (a sketch follows
this message). The sawtooths are about 10% (and decreasing) of the size of
the overall memory usage, and are probably due to different stages of the
simulation doing different things. That is an educated guess, though; I
don't have strong evidence to back it up.

And, yes, what you describe is pretty close to what I'm seeing. The longer
the simulation has been running, the smoother the memory usage gets.

Thanks,
Cem Karan
--
https://mail.python.org/mailman/listinfo/python-list
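A rough sketch of that sampling hack; "sim.py" and the sampling interval
are placeholders, not the actual harness:

"""
import time
import psutil

child = psutil.Popen(["python3", "sim.py"])
samples = []
while child.poll() is None:        # until the simulation exits
    try:
        mem = child.memory_info()  # rss/vms of the child right now
    except psutil.NoSuchProcess:
        break                      # child exited between checks
    samples.append((time.time(), mem.rss, mem.vms))
    time.sleep(0.1)
"""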
Re: Progress on the Gilectomy (Posting On Python-List Prohibited)
On Jun 22, 2017 4:03 PM, "Chris Angelico" wrote:

On Fri, Jun 23, 2017 at 5:22 AM, CFK wrote:
> On Jun 22, 2017 9:32 AM, "Chris Angelico" wrote:
>
>> On Thu, Jun 22, 2017 at 11:24 PM, CFK wrote:
>>> When I draw memory usage graphs, I see sawtooth waves to the memory
>>> usage which suggest that the garbage builds up until the GC kicks in
>>> and reaps the garbage.
>>
>> Interesting. How do you actually measure this memory usage? Often, when
>> a GC frees up memory, it's merely made available for subsequent
>> allocations, rather than actually given back to the system - all it
>> takes is one still-used object on a page and the whole page has to be
>> retained.
>>
>> As such, a "create and drop" usage model would tend to result in memory
>> usage going up for a while, but then remaining stable, as all
>> allocations are being fulfilled from previously-released memory that's
>> still owned by the process.
>
> I'm measuring it using a bit of a hack; I use psutil.Popen
> (https://pypi.python.org/pypi/psutil) to open a simulation as a child
> process, and in a tight loop gather the size of the resident set and the
> number of virtual pages currently in use of the child. The sawtooths are
> about 10% (and decreasing) of the size of the overall memory usage, and
> are probably due to different stages of the simulation doing different
> things. That is an educated guess though, I don't have strong evidence
> to back it up.
>
> And, yes, what you describe is pretty close to what I'm seeing. The
> longer the simulation has been running, the smoother the memory usage
> gets.

Ah, I think I understand. So the code would be something like this:

Phase one:
    Create a bunch of objects
    Do a bunch of simulation
    Destroy a bunch of objects
    Simulate more
    Destroy all the objects used in this phase, other than the result
Phase two:
    Like phase one

In that case, yes, it's entirely possible that the end of a phase could
signal a complete cleanup of intermediate state, with the consequent
release of memory to the system. (Or, more likely, a near-complete cleanup,
with release of MOST of memory.)

Very cool bit of analysis you've done there.

Thank you! And, yes, that is essentially what is going on (or was in that
version of the simulator; I'm in the middle of a big refactor to speed
things up and expect the memory usage patterns to change).

Thanks,
Cem Karan
--
https://mail.python.org/mailman/listinfo/python-list
Re: Battle of the garbage collectors, or ARGGHHHHHH!!!!
On Wed, Apr 26, 2017 at 10:38 PM, Cem Karan wrote:

> On Apr 24, 2017, at 8:54 PM, Jon Ribbens wrote:
>
>> On 2017-04-24, CFK wrote:
>>> Long version: I'm trying to write bindings for Python via ctypes to
>>> control a library written in C that uses the bdwgc garbage collector
>>> (http://www.hboehm.info/gc/). The bindings mostly work, except for when
>>> either bdwgc or Python's garbage collector decide to get into an
>>> argument over what is garbage and what isn't, in which case I get a
>>> segfault because one or the other collector has already reaped the
>>> memory.
>>
>> Make your Python C objects contain a pointer to a
>> GC_MALLOC_UNCOLLECTABLE block that contains a pointer to the bdwgc
>> object it's an interface to? And GC_FREE it in tp_dealloc? Then bdwgc
>> won't free any C memory that Python is referencing.
>
> OK, I realized today that there was a miscommunication somewhere. My
> Python code is all pure Python, and the library is pure C, and it is not
> designed to be called by Python (it's intended to be language neutral, so
> if someone wants to call it from a different language, they can). That
> means that tp_dealloc (which is part of the Python C API) is probably not
> going to work.
>
> I got interrupted (again), so I didn't have a chance to try the next
> trick and register the ctypes objects as roots from which to scan in
> bdwgc, but I'm hoping that roots aren't removed. If that works, I'll post
> it to the list.
>
> Thanks,
> Cem Karan

Right, apparently I win at the 'late reply' game. That said, I wanted to
give Jon Ribbens credit for his idea, because it was very close to what I
used in the end. The only difference is that I also used weakref.finalize()
to tie a finalizer to the lifetime of the ctypes pointer that I was using.
The finalizer called GC_FREE() to free the uncollectable block, which
allowed the C allocator to clean up the memory. (A sketch of the scheme
follows this message.)

The only thing I never figured out was how to get a C block to hold onto
Python memory. I didn't need it, but it felt like it would make for a nice
duality with the method above.

Thanks,
Cem Karan
--
https://mail.python.org/mailman/listinfo/python-list
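A minimal sketch of that scheme, under stated assumptions: the library name
("libgc.so") and the pointer-sized payload are placeholders, not the exact
code that was used:

"""
import ctypes
import weakref

libgc = ctypes.CDLL("libgc.so")
libgc.GC_malloc_uncollectable.restype = ctypes.c_void_p
libgc.GC_malloc_uncollectable.argtypes = [ctypes.c_size_t]
libgc.GC_free.argtypes = [ctypes.c_void_p]

class Handle:
    # Pins a bdwgc-managed object for as long as Python holds a reference.
    def __init__(self, obj_ptr):
        # An uncollectable block acts as a root: bdwgc won't reap anything
        # reachable from it, including obj_ptr's target.
        self._block = libgc.GC_malloc_uncollectable(
            ctypes.sizeof(ctypes.c_void_p))
        ctypes.cast(self._block,
                    ctypes.POINTER(ctypes.c_void_p))[0] = obj_ptr
        # When the Python wrapper dies, drop the root so bdwgc can collect.
        weakref.finalize(self, libgc.GC_free, self._block)
"""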
