On Sep 11, 2015 6:35 AM, "Anne Archibald" <archib...@astron.nl> wrote:
>
> On Fri, Sep 11, 2015 at 3:20 PM Sebastian Berg <sebast...@sipsolutions.net> wrote:
>>
>> On Fr, 2015-09-11 at 13:10 +0000, Daniel Manson wrote:
>> > Originally posted as issue 6301 on github.
>> >
>> > Presumably any block of code that modifies an ndarray's buffer is
>> > wrapped in a (thread safe?) check of the writable flag. Would it be
>> > possible to hold a counter rather than a simple bool flag and then
>> > increment the counter whenever you test the flag? Hopefully this would
>> > introduce only a tiny additional overhead, but would permit
>> > pseudo-hashing to test whether a mutable array has changed since you
>> > last encountered it.
>> >
>>
>> Just a quick note. This is a bit more complex than it might appear. The
>> reason being that when a view of the array is changed, you would have to
>> "notify" the array itself that it has changed. So propagation from top
>> to bottom does not seem straightforward to me. (The other way is fine,
>> since on check you could check all parents, but you cannot check all
>> children.)
>
> Actually not so much. Like the writable flag, you'd make the counter be a
> per-buffer piece of information. Each array already has a pointer to the
> array object that "owns" the buffer, so you'd just go there in one hop.
> This does mean that modifying one view would affect the modified flag on
> all views sharing the same buffer, whether there's data overlap or not,
> but for caching purposes that's not so bad.

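The "one hop" to the owning array described above can be sketched from Python as follows. This is only an illustration: `buffer_owner` is a hypothetical helper invented here, not a NumPy function, and it assumes that following `.base` until it stops being an ndarray is a reasonable stand-in for "the array that owns the buffer".

```python
import numpy as np


def buffer_owner(arr):
    """Hypothetical helper: follow .base to the ndarray that owns the memory."""
    owner = arr
    # .base is None for an array that owns its data; it may also be a
    # non-ndarray object (e.g. a bytes buffer), in which case we stop at the
    # last ndarray in the chain.
    while isinstance(owner.base, np.ndarray):
        owner = owner.base
    return owner


a = np.arange(10)
left = a[:5]    # these two views share a's buffer...
right = a[5:]   # ...but do not overlap each other

same_owner = buffer_owner(left) is buffer_owner(right)
```

Here `left` and `right` resolve to the same owner even though they share no elements, which is the coarse granularity mentioned above: a per-buffer counter would mark both as modified after a write to either one.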
Weirdly, the writeable flag is maintained on a per-view basis -- you can
have a read-only view of a writeable array, or even vice versa (e.g. if
you set the base array to be read-only after creating the view). This
doesn't matter for this purpose though -- it just means that if you want
to detect array changes, you can't assume that read-only arrays never
change.

> I think a more serious concern is that it may be customary to simply
> check the writable flag by hand rather than calling an is_writable
> function, so to make this idea work you'd have to change all code that
> checks the writable flag, including user code.

There is PyArray_FailUnlessWriteable (which must be called with the GIL
held, because it can raise an exception, so thread safety is OK). It's
used consistently inside numpy itself these days. But that's new API in
1.6, so yeah, there is surely a ton of legacy code out there that is doing
its own checking. I guess we'd have to somehow deprecate
PyArray_ISWRITEABLE and other methods of direct flag checking, and wait
for people to switch over? There are some arguments for doing this anyway
I suppose... though it would certainly be a hassle and take a while.

> You'd also have to make sure that all code that tries to write to an
> array really checks the writable flag.

For purposes of something like updating Spyder's GUI, I guess it might be
OK if buggy code that doesn't check the RO flag is also buggy in that it
fails to update the GUI? Maybe it would even cause people to finally
notice the bug and get it fixed? Agreed that you certainly could not rely
on this to be 100% accurate in practice though :-/

> Rather than making this happen for all arrays, does it make sense to use
> an array subclass with a "dirty flag", maybe even if this requires
> manual setting in some cases?

I think the problem with this would be that you can have an ndarray view
of your special subclass, or vice versa, and code that goes via the
regular ndarray will not update your flag.
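The view problem just described can be sketched in a few lines. `DirtyArray` and its `dirty` attribute are hypothetical names invented for this illustration, not anything NumPy provides, and the sketch assumes the flag is only ever set from `__setitem__`.

```python
import numpy as np


class DirtyArray(np.ndarray):
    """Hypothetical ndarray subclass that tries to track writes."""

    def __array_finalize__(self, obj):
        # Called for every new instance or view; start out clean.
        self.dirty = False

    def __setitem__(self, key, value):
        # Only writes that go through this subclass are noticed.
        self.dirty = True
        super().__setitem__(key, value)


tracked = np.zeros(4).view(DirtyArray)
plain = tracked.view(np.ndarray)   # ordinary ndarray view of the same memory

plain[0] = 5.0                     # changes tracked's data...
missed = not tracked.dirty         # ...but the flag was never touched

tracked[1] = 2.0                   # a write through the subclass is seen
caught = tracked.dirty
```

The write through `plain` lands in the shared buffer without ever reaching `DirtyArray.__setitem__`, so the flag on `tracked` stays clear even though its contents changed.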
And unfortunately that includes most numpy functions implicitly (e.g.
np.sin(a, out=b) doesn't much care whether b is a subclass, it's just
going to blindly do its thing).

Actually, now that I think about it there is a much worse general version
of this problem, which probably kills the idea dead. Via the buffer
interface (among others), it is totally legal and encouraged to create
non-numpy views of numpy arrays. They'll check the flags once at creation
time, but after that they're free to write directly to the underlying
buffer whenever they please:

    import numpy as np

    a = np.ones(10)
    a2 = memoryview(a)  # non-numpy view; flags are checked only here
    a2[0] = 0           # writes straight into a's buffer
    # now what?

-n
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion