Hi there,
I have written a Go package [1] that can read and write simple arrays in the
NumPy file format [2].
When I wrote it, it was meant for simple interoperability use cases, but now
people would like to be able to read back ragged arrays [3].
Unless I am mistaken, this means I need to interpret pieces of pickled data
(`ndarray`, `multiarray` and `dtype`).
So I am trying to understand how to unpickle `dtype` values. Disassembling a
pickled `int32` dtype:
```python
import numpy as np
import pickle
import pickletools as pt
pt.dis(pickle.dumps(np.dtype("int32"), protocol=4), annotate=True)
```
gives:
```
    0: \x80 PROTO      4                Protocol version indicator.
    2: \x95 FRAME      55               Indicate the beginning of a new frame.
   11: \x8c SHORT_BINUNICODE 'numpy'    Push a Python Unicode string object.
   18: \x94 MEMOIZE    (as 0)           Store the stack top into the memo. The stack is not popped.
   19: \x8c SHORT_BINUNICODE 'dtype'    Push a Python Unicode string object.
   26: \x94 MEMOIZE    (as 1)           Store the stack top into the memo. The stack is not popped.
   27: \x93 STACK_GLOBAL                Push a global object (module.attr) on the stack.
   28: \x94 MEMOIZE    (as 2)           Store the stack top into the memo. The stack is not popped.
   29: \x8c SHORT_BINUNICODE 'i4'       Push a Python Unicode string object.
   33: \x94 MEMOIZE    (as 3)           Store the stack top into the memo. The stack is not popped.
   34: \x89 NEWFALSE                    Push False onto the stack.
   35: \x88 NEWTRUE                     Push True onto the stack.
   36: \x87 TUPLE3                      Build a three-tuple out of the top three items on the stack.
   37: \x94 MEMOIZE    (as 4)           Store the stack top into the memo. The stack is not popped.
   38: R    REDUCE                      Push an object built from a callable and an argument tuple.
   39: \x94 MEMOIZE    (as 5)           Store the stack top into the memo. The stack is not popped.
   40: (    MARK                        Push markobject onto the stack.
   41: K        BININT1    3            Push a one-byte unsigned integer.
   43: \x8c     SHORT_BINUNICODE '<'    Push a Python Unicode string object.
   46: \x94     MEMOIZE    (as 6)       Store the stack top into the memo. The stack is not popped.
   47: N        NONE                    Push None on the stack.
   48: N        NONE                    Push None on the stack.
   49: N        NONE                    Push None on the stack.
   50: J        BININT     -1           Push a four-byte signed integer.
   55: J        BININT     -1           Push a four-byte signed integer.
   60: K        BININT1    0            Push a one-byte unsigned integer.
   62: t        TUPLE      (MARK at 40) Build a tuple out of the topmost stack slice, after markobject.
   63: \x94 MEMOIZE    (as 7)           Store the stack top into the memo. The stack is not popped.
   64: b    BUILD                       Finish building an object, via __setstate__ or dict update.
   65: .    STOP                        Stop the unpickling machine.
highest protocol among opcodes = 4
```
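As far as I can tell from the disassembly, a reader has to handle two steps: `REDUCE` calls `numpy.dtype` with the three-tuple, then `BUILD` hands the eight-tuple to `__setstate__`. Here is a minimal sketch of that, using only the standard library. The byte string is my own transcription of the opcodes above, and `DTypeStub` and `StubUnpickler` are hypothetical names for illustration: the stub stands in for `numpy.dtype` and only records what the pickle machine passes to it.

```python
import io
import pickle

# Bytes transcribed by hand from the disassembly (protocol 4, 66 bytes).
PICKLED_INT32 = (
    b"\x80\x04"                              # PROTO 4
    b"\x95\x37\x00\x00\x00\x00\x00\x00\x00"  # FRAME 55
    b"\x8c\x05numpy\x94"                     # 'numpy', MEMOIZE
    b"\x8c\x05dtype\x94"                     # 'dtype', MEMOIZE
    b"\x93\x94"                              # STACK_GLOBAL, MEMOIZE
    b"\x8c\x02i4\x94"                        # 'i4', MEMOIZE
    b"\x89\x88\x87\x94"                      # False, True, TUPLE3, MEMOIZE
    b"R\x94"                                 # REDUCE, MEMOIZE
    b"(K\x03"                                # MARK, BININT1 3
    b"\x8c\x01<\x94"                         # '<', MEMOIZE
    b"NNN"                                   # None, None, None
    b"J\xff\xff\xff\xffJ\xff\xff\xff\xff"    # BININT -1, BININT -1
    b"K\x00"                                 # BININT1 0
    b"t\x94b."                               # TUPLE, MEMOIZE, BUILD, STOP
)


class DTypeStub:
    """Hypothetical stand-in for numpy.dtype: records what pickle passes in."""

    def __init__(self, *args):
        self.args = args      # arguments supplied by REDUCE
        self.state = None     # filled in by BUILD

    def __setstate__(self, state):
        self.state = state


class StubUnpickler(pickle.Unpickler):
    """Resolve numpy.dtype to the stub and refuse every other global."""

    def find_class(self, module, name):
        if (module, name) == ("numpy", "dtype"):
            return DTypeStub
        raise pickle.UnpicklingError(f"unexpected global {module}.{name}")


stub = StubUnpickler(io.BytesIO(PICKLED_INT32)).load()
print(stub.args)   # ('i4', False, True)
print(stub.state)  # (3, '<', None, None, None, -1, -1, 0)
```

This shows where each tuple enters the process, but not what the individual fields mean.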
I have tried to find the usual `__reduce__` and `__setstate__` methods to
understand what the various arguments are, to no avail.
So, in:
```python
>>> np.dtype("int32").__reduce__()[1]
('i4', False, True)
>>> np.dtype("int32").__reduce__()[2]
(3, '<', None, None, None, -1, -1, 0)
```
what is the meaning of each argument?
Thanks in advance,
sebastien.
[1] https://github.com/sbinet/npyio
[2] https://numpy.org/neps/nep-0001-npy-format.html
[3] https://github.com/sbinet/npyio/issues/20