Public bug reported:

Hi,
I was checking a build failure in Ubuntu on armhf.
=> https://launchpad.net/ubuntu/+source/netgen/6.2.2006+really6.2.1905+dfsg-2/+build/20717107

The actual build worked fine, but it then crashes in the self tests:

  $ export PYTHONPATH="$PYTHONPATH:/root/netgen-6.2.2006+really6.2.1905+dfsg/debian/tmp/usr/lib/python3/dist-packages"
  $ apt install python3-tk python3-numpy
  $ cd ~/netgen-6.2.2006+really6.2.1905+dfsg/tests/pytest
  $ LD_LIBRARY_PATH=/root/netgen-6.2.2006+really6.2.1905+dfsg/debian/tmp/usr/lib/$DEB_HOST_MULTIARCH python3 -m pytest -k test_pickling -s
  ...
  test_pickling.py Bus error (core dumped)

This seems to be 100% reproducible if one follows the steps that the
Debian package build does.

The other tests pass:

  test_pickling.py::test_pickle_stl PASSED
  test_pickling.py::test_pickle_occ PASSED
  test_pickling.py::test_pickle_geom2d PASSED
  test_pickling.py::test_pickle_mesh PASSED

Only test_pickle_csg fails, and in this test the failing line is:

  geo_dump = pickle.dumps(geo)

with geo being:

  <netgen.libngpy._csg.CSGeometry object at 0xf6da99b0>

Running that in python3-dbg and loading the core file in gdb shows the
crash deep in netgen's code during pickling (which is better than a
generic pickling issue, I guess):

#0  0xf659c99e in ngcore::BinaryOutArchive::Write<double> (x=10000000000, this=0xffa90cc4)
    at ./libsrc/stlgeom/../general/../core/archive.hpp:732
#1  ngcore::BinaryOutArchive::operator& (this=0xffa90cc4, d=@0x26aa6d8: 10000000000)
    at ./libsrc/stlgeom/../general/../core/archive.hpp:681
#2  0xf641d4de in netgen::Surface::DoArchive (archive=..., this=0x26aa6d0)
    at ./libsrc/csg/surface.hpp:68
#3  netgen::OneSurfacePrimitive::DoArchive (archive=..., this=0x26aa6d0)
    at ./libsrc/csg/surface.hpp:344
#4  netgen::QuadraticSurface::DoArchive (this=0x26aa6d0, ar=...)
    at ./libsrc/csg/algprim.hpp:52
#5  0xf641dc00 in netgen::Sphere::DoArchive (this=0x26aa6d0, ar=...)
    at ./libsrc/csg/algprim.hpp:151
#6  0xf6434c28 in ngcore::Archive::operator&<netgen::Surface, void> (val=..., this=0xffa90cc4)
    at ./libsrc/csg/../general/../core/archive.hpp:307
#7  ngcore::Archive::operator&<netgen::Surface> (this=this@entry=0xffa90cc4, p=@0x2727718: 0x26aa6d0)
    at ./libsrc/csg/../general/../core/archive.hpp:490
#8  0xf6430dca in ngcore::Archive::Do<netgen::Surface*, void> (n=<optimized out>, data=<optimized out>, this=0xffa90cc4)
    at ./libsrc/csg/../general/../core/archive.hpp:280
#9  ngcore::Archive::operator&<netgen::Surface*> (v=std::vector of length 32, capacity 32 = {...}, this=0xffa90cc4)
    at ./libsrc/csg/../general/../core/archive.hpp:209
#10 ngcore::SymbolTable<netgen::Surface*>::DoArchive<netgen::Surface*> (ar=..., this=0x2843c64)
    at ./libsrc/csg/../general/../core/symboltable.hpp:44
#11 ngcore::Archive::operator&<ngcore::SymbolTable<netgen::Surface*>, void> (val=..., this=0xffa90cc4)
    at ./libsrc/csg/../general/../core/archive.hpp:307
#12 netgen::CSGeometry::DoArchive (this=0x2843c60, archive=...)
    at ./libsrc/csg/csgeom.cpp:329
#13 0xf648a958 in ngcore::Archive::operator&<netgen::CSGeometry, void> (val=..., this=0xffa90cc4)
    at ./libsrc/csg/../general/../core/archive.hpp:305
#14 ngcore::Archive::operator&<netgen::CSGeometry> (this=this@entry=0xffa90cc4, p=@0xffa90ba4: 0x2843c60)
    at ./libsrc/csg/../general/../core/archive.hpp:518
#15 0xf64a4218 in ngcore::NGSPickle<netgen::CSGeometry, ngcore::BinaryOutArchive, ngcore::BinaryInArchive>()::{lambda(netgen::CSGeometry*)#1}::operator()(netgen::CSGeometry*) const (
    self=<optimized out>, this=<optimized out>) at /usr/include/pybind11/pytypes.h:199
....

That is:
./libsrc/stlgeom/../general/../core/archive.hpp:732

721   private:
722     template <typename T>
723     Archive & Write (T x)
724     {
725       if (unlikely(ptr > BUFFERSIZE-sizeof(T)))
726         {
727           stream->write(&buffer[0], ptr);
728           *reinterpret_cast<T*>(&buffer[0]) = x; // NOLINT
729           ptr = sizeof(T);
730           return *this;
731         }
732       *reinterpret_cast<T*>(&buffer[ptr]) = x; // NOLINT
733       ptr += sizeof(T);
734       return *this;
735     }
736   };

With the variables in the crash file being:

(gdb) p &buffer
$5 = (std::array<char, 1024> *) 0xffa90d40
(gdb) p ptr
$3 = 1

Depending on how the real code (as opposed to gdb on the crash file)
evaluates this pointer addition, that might explain the SIGBUS: if
&buffer[ptr] comes out as just 0xffa90d41 (which is what gdb computes),
then the store of a double at line 732 goes to an unaligned address,
and that fails.

I'm a bit lost, as the .hpp backends used to serialize/pickle Python
objects really aren't my home turf :-/
Therefore I wanted to reach out to you, as the experts on netgen, to ask
whether this makes sense to you.

I can keep the repro systems around for a while, so if you have debug
questions or small modifications to try, I should be able to test them.

P.S. The reason this didn't show up in the past is that the tests were
not run correctly at build time; the last Debian upload fixed that, and
since then it is an FTBFS. But it does not seem to trigger in all
environments, e.g. in the Debian builds it did not crash the same way.

FYI: There is also this recent bug about unaligned access. I'm not
entirely sure it is related, since the logs linked there didn't look to
be "the same". Still, as FYI:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=984439

Note: I've reported the very same bug upstream and will link it; this
LP bug is meant as a tracker to be found via the update-excuse tag.
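
To illustrate the suspected unaligned store, here is a minimal, self-contained sketch. It is not netgen's code: MiniOutArchive, NextWriteIsAligned and Flush are hypothetical names made up for this example, and the buffer is deliberately over-aligned so the check is deterministic. The std::memcpy variant of Write is only one common way to avoid storing a double through a misaligned pointer, not necessarily what upstream would choose.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the buffer handling quoted from archive.hpp.
class MiniOutArchive {
  static constexpr std::size_t BUFFERSIZE = 1024;
  // Over-aligned on purpose so that offset 0 is aligned for every scalar type.
  alignas(std::max_align_t) std::array<char, BUFFERSIZE> buffer{};
  std::size_t ptr = 0;
  std::ostream& stream;

public:
  explicit MiniOutArchive(std::ostream& out) : stream(out) {}

  // True if &buffer[ptr] is suitably aligned for an object of type T.
  template <typename T>
  bool NextWriteIsAligned() const {
    auto addr = reinterpret_cast<std::uintptr_t>(&buffer[ptr]);
    return addr % alignof(T) == 0;
  }

  // Copy the bytes of x into the buffer instead of assigning through a
  // reinterpret_cast<T*>; memcpy has no alignment requirement on its target.
  template <typename T>
  MiniOutArchive& Write(const T& x) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be copied byte-wise");
    if (ptr > BUFFERSIZE - sizeof(T)) {
      Flush();
    }
    assert(ptr + sizeof(T) <= BUFFERSIZE);
    std::memcpy(&buffer[ptr], &x, sizeof(T));
    ptr += sizeof(T);
    return *this;
  }

  // Hand the buffered bytes to the stream and start over at offset 0.
  void Flush() {
    stream.write(buffer.data(), static_cast<std::streamsize>(ptr));
    ptr = 0;
  }
};
```

Writing a single char first leaves ptr at 1, which mirrors the state seen in the core file; NextWriteIsAligned<double>() then reports false, yet Write<double>() still stores the value because the bytes are copied rather than assigned through a double pointer.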

** Affects: netgen
     Importance: Unknown
         Status: Unknown

** Affects: netgen (Ubuntu)
     Importance: Undecided
         Status: New

** Affects: opencascade (Ubuntu)
     Importance: Undecided
         Status: New

** Affects: netgen (Debian)
     Importance: Unknown
         Status: Unknown

** Tags: update-excuse

** Bug watch added: github.com/NGSolve/netgen/issues #89
   https://github.com/NGSolve/netgen/issues/89

** Also affects: netgen via
   https://github.com/NGSolve/netgen/issues/89
   Importance: Unknown
       Status: Unknown

** Bug watch added: Debian Bug tracker #984439
   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=984439

** Also affects: netgen (Debian) via
   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=984439
   Importance: Unknown
       Status: Unknown

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1919335

Title:
  FTBFS (test fail with sigbus) on armhf in Hirsute

To manage notifications about this bug go to:
https://bugs.launchpad.net/netgen/+bug/1919335/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs