Public bug reported:

Hi,
I was checking a build failure in Ubuntu on armhf:
https://launchpad.net/ubuntu/+source/netgen/6.2.2006+really6.2.1905+dfsg-2/+build/20717107
The actual build works fine, but the self-tests then crash:

$ export PYTHONPATH="$PYTHONPATH:/root/netgen-6.2.2006+really6.2.1905+dfsg/debian/tmp/usr/lib/python3/dist-packages"
$ apt install python3-tk python3-numpy
$ cd ~/netgen-6.2.2006+really6.2.1905+dfsg/tests/pytest
$ LD_LIBRARY_PATH=/root/netgen-6.2.2006+really6.2.1905+dfsg/debian/tmp/usr/lib/$DEB_HOST_MULTIARCH python3 -m pytest -k test_pickling -s
...
test_pickling.py Bus error (core dumped)
This seems to be 100% reproducible if one follows the steps that the Debian
package build does.

The other tests pass:

test_pickling.py::test_pickle_stl PASSED
test_pickling.py::test_pickle_occ PASSED
test_pickling.py::test_pickle_geom2d PASSED
test_pickling.py::test_pickle_mesh PASSED
Only test_pickle_csg fails.
In this test the failing line is: geo_dump = pickle.dumps(geo)
with geo being <netgen.libngpy._csg.CSGeometry object at 0xf6da99b0>

Running that in python3-dbg and loading the core file into gdb shows the
crash deep inside netgen's pickling code (which is better than a generic
pickling issue, I guess):

#0  0xf659c99e in ngcore::BinaryOutArchive::Write<double> (x=10000000000, this=0xffa90cc4) at ./libsrc/stlgeom/../general/../core/archive.hpp:732
#1  ngcore::BinaryOutArchive::operator& (this=0xffa90cc4, d=@0x26aa6d8: 10000000000) at ./libsrc/stlgeom/../general/../core/archive.hpp:681
#2  0xf641d4de in netgen::Surface::DoArchive (archive=..., this=0x26aa6d0) at ./libsrc/csg/surface.hpp:68
#3  netgen::OneSurfacePrimitive::DoArchive (archive=..., this=0x26aa6d0) at ./libsrc/csg/surface.hpp:344
#4  netgen::QuadraticSurface::DoArchive (this=0x26aa6d0, ar=...) at ./libsrc/csg/algprim.hpp:52
#5  0xf641dc00 in netgen::Sphere::DoArchive (this=0x26aa6d0, ar=...) at ./libsrc/csg/algprim.hpp:151
#6  0xf6434c28 in ngcore::Archive::operator&<netgen::Surface, void> (val=..., this=0xffa90cc4) at ./libsrc/csg/../general/../core/archive.hpp:307
#7  ngcore::Archive::operator&<netgen::Surface> (this=this@entry=0xffa90cc4, p=@0x2727718: 0x26aa6d0) at ./libsrc/csg/../general/../core/archive.hpp:490
#8  0xf6430dca in ngcore::Archive::Do<netgen::Surface*, void> (n=<optimized out>, data=<optimized out>, this=0xffa90cc4) at ./libsrc/csg/../general/../core/archive.hpp:280
#9  ngcore::Archive::operator&<netgen::Surface*> (v=std::vector of length 32, capacity 32 = {...}, this=0xffa90cc4) at ./libsrc/csg/../general/../core/archive.hpp:209
#10 ngcore::SymbolTable<netgen::Surface*>::DoArchive<netgen::Surface*> (ar=..., this=0x2843c64) at ./libsrc/csg/../general/../core/symboltable.hpp:44
#11 ngcore::Archive::operator&<ngcore::SymbolTable<netgen::Surface*>, void> (val=..., this=0xffa90cc4) at ./libsrc/csg/../general/../core/archive.hpp:307
#12 netgen::CSGeometry::DoArchive (this=0x2843c60, archive=...) at ./libsrc/csg/csgeom.cpp:329
#13 0xf648a958 in ngcore::Archive::operator&<netgen::CSGeometry, void> (val=..., this=0xffa90cc4) at ./libsrc/csg/../general/../core/archive.hpp:305
#14 ngcore::Archive::operator&<netgen::CSGeometry> (this=this@entry=0xffa90cc4, p=@0xffa90ba4: 0x2843c60) at ./libsrc/csg/../general/../core/archive.hpp:518
#15 0xf64a4218 in ngcore::NGSPickle<netgen::CSGeometry, ngcore::BinaryOutArchive, ngcore::BinaryInArchive>()::{lambda(netgen::CSGeometry*)#1}::operator()(netgen::CSGeometry*) const (self=<optimized out>, this=<optimized out>) at /usr/include/pybind11/pytypes.h:199
....
That is:
./libsrc/stlgeom/../general/../core/archive.hpp:732

 721   private:
 722     template <typename T>
 723     Archive & Write (T x)
 724     {
 725       if (unlikely(ptr > BUFFERSIZE-sizeof(T)))
 726         {
 727           stream->write(&buffer[0], ptr);
 728           *reinterpret_cast<T*>(&buffer[0]) = x; // NOLINT
 729           ptr = sizeof(T);
 730           return *this;
 731         }
 732       *reinterpret_cast<T*>(&buffer[ptr]) = x; // NOLINT
 733       ptr += sizeof(T);
 734       return *this;
 735     }
 736   };
With the variables in the crash file being:
(gdb) p &buffer
$5 = (std::array<char, 1024> *) 0xffa90d40
(gdb) p ptr
$3 = 1

Depending on how the real code (as opposed to gdb on the core file)
carries out this pointer addition, that might explain the SIGBUS: a bus
error typically indicates an unaligned access, and if &buffer[ptr] really
evaluates to 0xffa90d41 (which is what gdb computes), then the store of a
double to that odd address is unaligned and fails.
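
To illustrate the suspicion above, here is a minimal sketch of an
alignment-safe variant of such a buffer write. It assumes the fault really
is the unaligned store at archive.hpp:732, which is not confirmed. The names
SketchBuffer, Read and unaligned_roundtrip are hypothetical stand-ins and not
netgen code, and the sketch leaves out the flush-to-stream branch. It swaps
the reinterpret_cast store for std::memcpy, which has no alignment
requirement on either pointer:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the archive's write buffer; NOT netgen's code.
struct SketchBuffer
{
  static constexpr std::size_t BUFFERSIZE = 1024;
  std::array<char, BUFFERSIZE> buffer{};
  std::size_t ptr = 0;

  // Copy the bytes of x into the buffer at the current offset.
  // std::memcpy works for any offset, including odd ones.
  template <typename T>
  void Write (T x)
  {
    assert(ptr + sizeof(T) <= BUFFERSIZE);  // sketch only: no flushing
    std::memcpy(&buffer[ptr], &x, sizeof(T));
    ptr += sizeof(T);
  }

  // Read a T back from an arbitrary (possibly unaligned) offset.
  template <typename T>
  T Read (std::size_t offset) const
  {
    T x;
    std::memcpy(&x, &buffer[offset], sizeof(T));
    return x;
  }
};

// The address gdb computes for &buffer[ptr] (0xffa90d40 + 1) is odd,
// so it cannot be word-aligned, let alone aligned for a double.
static_assert((0xffa90d40ull + 1) % 4 != 0, "odd address is not word-aligned");

// Mimic the state in the core file: ptr == 1, then a double is written.
double unaligned_roundtrip ()
{
  SketchBuffer b;
  b.Write<char>(1);       // leaves ptr == 1
  b.Write<double>(1e10);  // store at an odd offset, done via memcpy
  return b.Read<double>(1);
}
```

Whether a change along these lines is appropriate for netgen is for
upstream to judge; the sketch only shows that the same bytes can be written
without a misaligned double store.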

I'm a bit lost, as .hpp backends for Python serialization/pickling really
aren't my home turf :-/
Therefore I wanted to reach out to you as the netgen experts to ask whether
this makes sense to you.
I can keep the repro systems around for a while, so if you have debugging
questions or small modifications to try, I should be able to test them.

P.S. The reason this didn't show up in the past is that the tests were
not run correctly at build time; the last Debian upload fixed that, and
since then it is an FTBFS. But it does not seem to trigger in all
environments, e.g. in the Debian builds it did not crash the same way.

FYI: there is also this recent bug about unaligned access. I'm not
entirely sure it is related, as the logs linked there didn't look to be
"the same", but for reference:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=984439

Note: I've reported the very same bug upstream and will link it; this LP
bug is meant as a tracker to be found via the update-excuse tag.

** Affects: netgen
     Importance: Unknown
         Status: Unknown

** Affects: netgen (Ubuntu)
     Importance: Undecided
         Status: New

** Affects: opencascade (Ubuntu)
     Importance: Undecided
         Status: New

** Affects: netgen (Debian)
     Importance: Unknown
         Status: Unknown


** Tags: update-excuse

** Bug watch added: github.com/NGSolve/netgen/issues #89
   https://github.com/NGSolve/netgen/issues/89

** Also affects: netgen via
   https://github.com/NGSolve/netgen/issues/89
   Importance: Unknown
       Status: Unknown

** Bug watch added: Debian Bug tracker #984439
   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=984439

** Also affects: netgen (Debian) via
   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=984439
   Importance: Unknown
       Status: Unknown

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1919335

Title:
  FTBFS (test fail with sigbus) on armhf in Hirsute

To manage notifications about this bug go to:
https://bugs.launchpad.net/netgen/+bug/1919335/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
