Package: dask
Version: 2022.01.0+dfsg-1
Severity: serious

The autopkgtest for dask is once again failing on 32-bit architectures.
This is blocking the fix for https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1005962
from migrating to testing.

>       assert buf.getvalue() == expected
E       assert "<class 'dask...312.0 bytes\n" == "<class 'dask...496.0 bytes\n"
E           <class 'dask.dataframe.core.DataFrame'>
E           Int64Index: 4 entries, 0 to 3
E           Data columns (total 3 columns):
E            #   Column  Non-Null Count  Dtype
E           ---  ------  --------------  -----
E            0   x       4 non-null      int64
E            1   y       4 non-null      category...
E
E         ...Full output truncated (7 lines hidden), use '-vv' to show

/usr/lib/python3/dist-packages/dask/dataframe/tests/test_dataframe.py:3612: AssertionError

I think the most likely explanation is that the memory usage of the
data structure is indeed lower on 32-bit architectures and that this
does not indicate a bug. As such I have prepared a patch that adjusts
the expected result in the test based on the pointer size on the current
architecture.
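
As a rough illustration of the arithmetic behind the patch, the sketch below
evaluates the same formula (128 + pointer size * 46) for both pointer sizes.
It reproduces the two values in the assertion failure above: 496 bytes with
8-byte pointers and 312 bytes with 4-byte pointers. The function name and the
use of struct.calcsize("P") in place of the numpy call are assumptions made
for this sketch and are not part of the patch; reading the formula as a fixed
part plus a pointer-scaled part is likewise only an interpretation.

```python
import struct

# Constants taken from the formula in the patch. Treating them as a fixed
# part plus a pointer-scaled part is an interpretation, not a verified
# breakdown of pandas internals.
FIXED_BYTES = 128
POINTER_SCALED_UNITS = 46


def expected_memory_usage(pointer_size: int) -> str:
    """Return the 'memory usage' line expected for a given pointer size.

    Hypothetical helper for illustration; the patch computes this inline.
    """
    bytecount = FIXED_BYTES + pointer_size * POINTER_SCALED_UNITS
    return f"memory usage: {bytecount}.0 bytes\n"


# struct.calcsize("P") is the size in bytes of a C pointer (void *) for the
# running interpreter: 8 on 64-bit builds, 4 on 32-bit builds.
native_pointer_size = struct.calcsize("P")

print(expected_memory_usage(native_pointer_size), end="")
print(expected_memory_usage(8), end="")  # value seen on 64-bit
print(expected_memory_usage(4), end="")  # value seen on 32-bit
```

The formula is fitted to the two observed values, so it should be rechecked
if a later pandas release changes the reported memory usage.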

diff -Nru dask-2022.01.0+dfsg/debian/changelog dask-2022.01.0+dfsg/debian/changelog
--- dask-2022.01.0+dfsg/debian/changelog        2022-02-21 01:11:53.000000000 +0000
+++ dask-2022.01.0+dfsg/debian/changelog        2022-02-27 07:27:00.000000000 +0000
@@ -1,3 +1,11 @@
+dask (2022.01.0+dfsg-1.1) UNRELEASED; urgency=medium
+
+  * Non-maintainer upload.
+  * Adjust test_dataframe.py for different data structure sizes on 32-bit
+    architectures.
+
+ -- Peter Michael Green <plugw...@debian.org>  Sun, 27 Feb 2022 07:27:00 +0000
+
 dask (2022.01.0+dfsg-1) unstable; urgency=medium
 
   * New upstream release
diff -Nru dask-2022.01.0+dfsg/debian/patches/series dask-2022.01.0+dfsg/debian/patches/series
--- dask-2022.01.0+dfsg/debian/patches/series   2022-02-20 22:18:36.000000000 +0000
+++ dask-2022.01.0+dfsg/debian/patches/series   2022-02-27 07:27:00.000000000 +0000
@@ -9,3 +9,4 @@
 use-youtube-nocookie.patch
 reproducible-config-autofunction.patch
 32bit-comatibility.patch
+test-dataframe-32-bit.patch
diff -Nru dask-2022.01.0+dfsg/debian/patches/test-dataframe-32-bit.patch dask-2022.01.0+dfsg/debian/patches/test-dataframe-32-bit.patch
--- dask-2022.01.0+dfsg/debian/patches/test-dataframe-32-bit.patch      1970-01-01 00:00:00.000000000 +0000
+++ dask-2022.01.0+dfsg/debian/patches/test-dataframe-32-bit.patch      2022-02-27 07:27:00.000000000 +0000
@@ -0,0 +1,25 @@
+Description:  Adjust test_dataframe.py for different data structure sizes on 32-bit architectures.
+Author: Peter Michael Green <plugw...@debian.org>
+
+Index: dask-2022.01.0+dfsg/dask/dataframe/tests/test_dataframe.py
+===================================================================
+--- dask-2022.01.0+dfsg.orig/dask/dataframe/tests/test_dataframe.py
++++ dask-2022.01.0+dfsg/dask/dataframe/tests/test_dataframe.py
+@@ -3597,6 +3597,8 @@ def test_categorize_info():
+     # Verbose=False
+     buf = StringIO()
+     ddf.info(buf=buf, verbose=True)
++    pointersize = np.array([0],dtype=np.uintp).itemsize
++    bytecount = 128 + pointersize * 46
+     expected = (
+         "<class 'dask.dataframe.core.DataFrame'>\n"
+         "Int64Index: 4 entries, 0 to 3\n"
+@@ -3607,7 +3609,7 @@ def test_categorize_info():
+         " 1   y       4 non-null      category\n"
+         " 2   z       4 non-null      object\n"
+         "dtypes: category(1), object(1), int64(1)\n"
+-        "memory usage: 496.0 bytes\n"
++        "memory usage: "+str(bytecount)+".0 bytes\n"
+     )
+     assert buf.getvalue() == expected
+ 
