Re: [Numpy-discussion] feature request - increment counter on write check

Nathaniel Smith Fri, 11 Sep 2015 10:44:55 -0700

On Sep 11, 2015 6:35 AM, "Anne Archibald" <archib...@astron.nl> wrote:
>
>
>
> On Fri, Sep 11, 2015 at 3:20 PM Sebastian Berg <sebast...@sipsolutions.net> wrote:
>>
>> On Fr, 2015-09-11 at 13:10 +0000, Daniel Manson wrote:
>> > Originally posted as issue 6301 on github.
>> >
>> >
>> > Presumably any block of code that modifies an ndarray's buffer is
>> > wrapped in a (thread safe?) check of the writable flag. Would it be
>> > possible to hold a counter rather than a simple bool flag and then
>> > increment the counter whenever you test the flag? Hopefully this would
>> > introduce only a tiny additional overhead, but would permit
>> > pseudo-hashing to test whether a mutable array has changed since you
>> > last encountered it.
>> >
>>
>> Just a quick note. This is a bit more complex than it might appear. The
>> reason being that when a view of the array is changed, you would have to
>> "notify" the array itself that it has changed. So propagation from top
>> to bottom does not seem straightforward to me. (the other way is fine,
>> since on check you could check all parents, but you cannot check all
>> children).
>
>
> Actually not so much. Like the writable flag, you'd make the counter be a
> per-buffer piece of information. Each array already has a pointer to the
> array object that "owns" the buffer, so you'd just go there in one hop.
> This does mean that modifying one view would affect the modified flag on
> all views sharing the same buffer, whether there's data overlap or not, but
> for caching purposes that's not so bad.


Weirdly, the writeable flag is maintained on a per-view basis -- you can
have a read only view of a writeable array, or even vice-versa (e.g. if you
set the base array to be read-only after creating the view). This doesn't
matter for this purpose though -- it just means that if you want to detect
array changes, you can't assume that read-only arrays never change.
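
A minimal sketch of both cases, assuming a reasonably recent numpy (the
variable names are only for illustration). It shows a read-only view
observing a change made through its base, and a view that stays writeable
after its base has been locked:

```python
import numpy as np

base = np.zeros(4)

# Case 1: a read-only view of a writeable array.
ro_view = base[:]
ro_view.flags.writeable = False
base[0] = 1.0          # allowed: base itself is still writeable
# ro_view[0] is now 1.0 even though ro_view is read-only

# Case 2: a writeable view of an array that is locked afterwards.
rw_view = base[:]      # created while base is still writeable
base.flags.writeable = False
rw_view[1] = 2.0       # allowed: the view kept its own flag
# base[1] is now 2.0 even though base is read-only
```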

> I think a more serious concern is that it may be customary to simply
> check the writable flag by hand rather than calling an is_writable
> function, so to make this idea work you'd have to change all code that
> checks the writable flag, including user code.

There is PyArray_FailUnlessWriteable (which must be called with the GIL
held, because it can raise an exception, so thread safety is OK). It's used
consistently inside numpy itself these days. But that's new API in 1.7, so
yeah, there is surely a ton of legacy code out there that is doing its own
checking.

I guess we'd have to somehow deprecate PyArray_ISWRITEABLE and other
methods of direct flag checking, and wait for people to switch over? There
are some arguments for doing this anyway I suppose... though it would
certainly be a hassle and take a while.

> You'd also have to make sure that all code that tries to write to an
> array really checks the writable flag.

For purposes of something like updating Spyder's gui, I guess it might be
OK if buggy code that doesn't check the RO flag is also buggy in that it
fails to update the GUI? Maybe it would even cause people to finally notice
the bug and get it fixed? Agreed that you certainly could not rely on this
to be 100% accurate in practice though :-/

> Rather than making this happen for all arrays, does it make sense to use
> an array subclass with a "dirty flag", maybe even if this requires manual
> setting in some cases?

I think the problem with this would be that you can have an ndarray view of
your special subclass, or vice-versa, and code that goes via the regular
ndarray will not update your flag. And unfortunately that includes most
numpy functions implicitly (e.g. np.sin(a, out=b) doesn't much care whether
b is a subclass, it's just going to blindly do its thing).
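
To make that concrete, here is a rough sketch. DirtyArray and its dirty
attribute are hypothetical, invented for this example, and setting the flag
in __setitem__ is just one plausible way such a subclass might be written:

```python
import numpy as np

class DirtyArray(np.ndarray):
    """Hypothetical subclass that flags itself on item assignment."""

    def __array_finalize__(self, obj):
        # Every newly created DirtyArray starts out clean.
        self.dirty = False

    def __setitem__(self, key, value):
        self.dirty = True
        super().__setitem__(key, value)

b = np.zeros(3).view(DirtyArray)
b[0] = 1.0                  # goes through DirtyArray.__setitem__
seen_direct_write = b.dirty
b.dirty = False             # reset, as a cache would after re-reading

plain = b.view(np.ndarray)  # plain ndarray sharing b's buffer
plain[1] = 2.0              # ndarray.__setitem__ runs; b is not told
np.sin(np.ones(3), out=b)   # the ufunc fills the buffer without __setitem__
# b.dirty is still False here, although the data changed twice
```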

Actually, now that I think about it there is a much worse general version
of this problem, which probably kills the idea dead. Via the buffer
interface (among others), it is totally legal and encouraged to create
non-numpy views of numpy arrays. They'll check the flags once at creation
time, but after that they're free to write directly to the underlying
buffer whenever they please:

import numpy as np

a = np.ones(10)
a2 = memoryview(a)  # writeability is checked here, once
a2[0] = 0           # writes straight into a's buffer
# now what?
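
A variation on the same theme, as a sketch that assumes current numpy and
CPython behaviour. Because the exporter's flag is only consulted when the
memoryview is created, locking the array afterwards does not stop writes
through the existing view:

```python
import numpy as np

a = np.ones(10)
m = memoryview(a)           # a is writeable at this point, so m is too
a.flags.writeable = False   # lock the array after the view exists
m[0] = 0.0                  # still succeeds: numpy is never consulted
# a[0] is now 0.0 although a.flags.writeable is False
```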

-n
