If you convert an array of strings to datetime64 and 'NaT' (or one of
its variants, such as 'nat') appears in the array, all subsequent values
are also converted to NaT. This is with 1.7.1, but the problem is present
in the current development version as well:
>>> import numpy as np
>>> a = np.array(['2010', 'nat', '2030'])
>>> a.astype(np.datetime64)
array(['2010', 'NaT', 'NaT'], dtype='datetime64[Y]')
The fix is to re-initialize 'dt' inside the loop in
_strided_to_strided_string_to_datetime, so that a NaT left in it by one
element does not carry over to the elements that follow
(patch attached).
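To illustrate the failure mode without building NumPy, here is a small
pure-Python model of the loop. It is only a sketch under assumptions: NAT,
NOT_NAT, parse_into, cast_buggy and cast_fixed are hypothetical names, the
parser only distinguishes 'nat' from a plain year, and none of this is the
actual C code in dtype_transfer.c.

```python
NAT = -2**63      # stand-in for NPY_DATETIME_NAT (assumed here: minimum int64)
NOT_NAT = ~NAT    # a non-NaT start value, mirroring ~NPY_DATETIME_NAT


def parse_into(text, dt):
    """Toy parser: return NAT for 'nat', otherwise leave dt unchanged."""
    return NAT if text.strip().lower() == "nat" else dt


def cast_buggy(strings):
    """Model of the unpatched loop: dt is initialized once, before the loop."""
    out = []
    dt = NOT_NAT
    for text in strings:
        dt = parse_into(text, dt)
        if dt != NAT:                # only convert when the value is not NaT
            dt = int(text) - 1970    # years since the epoch, as in datetime64[Y]
        out.append(dt)
    return out


def cast_fixed(strings):
    """Model of the patched loop: dt is re-initialized for every element."""
    out = []
    for text in strings:
        dt = NOT_NAT
        dt = parse_into(text, dt)
        if dt != NAT:
            dt = int(text) - 1970
        out.append(dt)
    return out


print(cast_buggy(["2010", "nat", "2030"]))  # [40, NAT, NAT]
print(cast_fixed(["2010", "nat", "2030"]))  # [40, NAT, 60]
```

In the buggy model, once one element sets dt to NAT, the "not NaT" check
fails for every later element, which matches the output shown above.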
Correct behavior (with patch):
>>> import numpy as np
>>> a = np.array(['2010', 'nat', '2020'])
>>> a.astype(np.datetime64)
array(['2010', 'NaT', '2020'], dtype='datetime64[Y]')
diff --git a/numpy/core/src/multiarray/dtype_transfer.c b/numpy/core/src/multiarray/dtype_transfer.c
index f758139..3bd362c 100644
--- a/numpy/core/src/multiarray/dtype_transfer.c
+++ b/numpy/core/src/multiarray/dtype_transfer.c
@@ -884,12 +884,13 @@ _strided_to_strided_string_to_datetime(char *dst, npy_intp dst_stride,
                        NpyAuxData *data)
 {
     _strided_datetime_cast_data *d = (_strided_datetime_cast_data *)data;
-    npy_int64 dt = ~NPY_DATETIME_NAT;
     npy_datetimestruct dts;
     char *tmp_buffer = d->tmp_buffer;
     char *tmp;
     while (N > 0) {
+        npy_int64 dt = ~NPY_DATETIME_NAT;
+
         /* Replicating strnlen with memchr, because Mac OS X lacks it */
         tmp = memchr(src, '\0', src_itemsize);
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion