Package: dask
Version: 2022.01.0+dfsg-1
Severity: serious
The autopkgtest for dask is once again failing on 32-bit architectures.
This is blocking the fix for
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1005962
from migrating to testing.
> assert buf.getvalue() == expected
E assert "<class 'dask...312.0 bytes\n" == "<class 'dask...496.0 bytes\n"
E <class 'dask.dataframe.core.DataFrame'>
E Int64Index: 4 entries, 0 to 3
E Data columns (total 3 columns):
E # Column Non-Null Count Dtype
E --- ------ -------------- -----
E 0 x 4 non-null int64
E 1 y 4 non-null category...
E
E ...Full output truncated (7 lines hidden), use '-vv' to show
/usr/lib/python3/dist-packages/dask/dataframe/tests/test_dataframe.py:3612:
AssertionError
I think the most likely explanation is that the memory usage of the
data structure is indeed lower on 32-bit architectures and that this
does not indicate a bug. As such I have prepared a patch that adjusts
the expected result in the test based on the pointer size on the current
architecture.
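
As a sanity check, the formula used in the patch can be evaluated for both
pointer sizes. The sketch below is illustrative only: the function name
expected_memory_bytes is hypothetical, and it uses struct.calcsize("P") from
the standard library instead of the numpy call in the patch, on the assumption
that both report the native pointer size. The constants 128 and 46 are taken
from the patch and have only been checked against the two values seen in this
report (496 on 64-bit, 312 on 32-bit).

```python
import struct


def expected_memory_bytes(pointer_size: int) -> int:
    """Expected 'memory usage' byte count for a given pointer size.

    The constants 128 and 46 come from the patch in this report. They are
    fitted to the two observed values and are not derived from pandas
    internals.
    """
    return 128 + pointer_size * 46


# Pointer size of the running interpreter: 4 on 32-bit, 8 on 64-bit.
# (Assumption: equivalent to np.array([0], dtype=np.uintp).itemsize.)
native_pointer_size = struct.calcsize("P")

for size in (4, 8):
    print(f"{size * 8}-bit: memory usage: {expected_memory_bytes(size)}.0 bytes")

print(f"native: memory usage: {expected_memory_bytes(native_pointer_size)}.0 bytes")
```

With a pointer size of 4 this gives 312 and with 8 it gives 496, matching the
two sides of the failing assertion above.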
diff -Nru dask-2022.01.0+dfsg/debian/changelog dask-2022.01.0+dfsg/debian/changelog
--- dask-2022.01.0+dfsg/debian/changelog 2022-02-21 01:11:53.000000000 +0000
+++ dask-2022.01.0+dfsg/debian/changelog 2022-02-27 07:27:00.000000000 +0000
@@ -1,3 +1,11 @@
+dask (2022.01.0+dfsg-1.1) UNRELEASED; urgency=medium
+
+ * Non-maintainer upload.
+ * Adjust test_dataframe.py for different data structure sizes on 32-bit
+ architectures.
+
+ -- Peter Michael Green <plugw...@debian.org> Sun, 27 Feb 2022 07:27:00 +0000
+
dask (2022.01.0+dfsg-1) unstable; urgency=medium
* New upstream release
diff -Nru dask-2022.01.0+dfsg/debian/patches/series dask-2022.01.0+dfsg/debian/patches/series
--- dask-2022.01.0+dfsg/debian/patches/series 2022-02-20 22:18:36.000000000 +0000
+++ dask-2022.01.0+dfsg/debian/patches/series 2022-02-27 07:27:00.000000000 +0000
@@ -9,3 +9,4 @@
use-youtube-nocookie.patch
reproducible-config-autofunction.patch
32bit-comatibility.patch
+test-dataframe-32-bit.patch
diff -Nru dask-2022.01.0+dfsg/debian/patches/test-dataframe-32-bit.patch dask-2022.01.0+dfsg/debian/patches/test-dataframe-32-bit.patch
--- dask-2022.01.0+dfsg/debian/patches/test-dataframe-32-bit.patch 1970-01-01 00:00:00.000000000 +0000
+++ dask-2022.01.0+dfsg/debian/patches/test-dataframe-32-bit.patch 2022-02-27 07:27:00.000000000 +0000
@@ -0,0 +1,25 @@
+Description: Adjust test_dataframe.py for different data structure sizes on 32-bit architectures.
+Author: Peter Michael Green <plugw...@debian.org>
+
+Index: dask-2022.01.0+dfsg/dask/dataframe/tests/test_dataframe.py
+===================================================================
+--- dask-2022.01.0+dfsg.orig/dask/dataframe/tests/test_dataframe.py
++++ dask-2022.01.0+dfsg/dask/dataframe/tests/test_dataframe.py
+@@ -3597,6 +3597,8 @@ def test_categorize_info():
+ # Verbose=False
+ buf = StringIO()
+ ddf.info(buf=buf, verbose=True)
++ pointersize = np.array([0],dtype=np.uintp).itemsize
++ bytecount = 128 + pointersize * 46
+ expected = (
+ "<class 'dask.dataframe.core.DataFrame'>\n"
+ "Int64Index: 4 entries, 0 to 3\n"
+@@ -3607,7 +3609,7 @@ def test_categorize_info():
+ " 1 y 4 non-null category\n"
+ " 2 z 4 non-null object\n"
+ "dtypes: category(1), object(1), int64(1)\n"
+- "memory usage: 496.0 bytes\n"
++ "memory usage: "+str(bytecount)+".0 bytes\n"
+ )
+ assert buf.getvalue() == expected
+