Hi Timo,

On Sun, Sep 15, 2024 at 04:30:44PM +0200, Timo Röhling wrote:
> * Helmut Grohne <hel...@subdivi.de> [2024-09-09 18:38]:
> > > And it will not solve cross-building for packages which interrogate
> > > numpy.get_include() directly, as SciPy does in its Meson build.
> >
> > But this is not a matter of numpy-config then. Would you mind detailing
> > how numpy.get_include() works? Generally speaking, this sounds like a
> > bad idea in any case.
>
> The get_include() upstream implementation essentially boils down to
>
>     import numpy._core as _core
>     return os.path.join(os.path.dirname(_core.__file__), "include")

On my system, _core.__file__ becomes
"/usr/lib/python3/dist-packages/numpy/_core/__init__.py"

Do you mean numpy.core rather than numpy._core? (Not sure what the
difference is.)

> This is actually the One True Source from which everything else we discussed
> derives, i.e., numpy-config is a wrapper script around numpy._configtool,
> which computes --pkgconfigdir as
>
>     Path(numpy.get_include()) / ".." / "lib" / "pkgconfig"

This looks next to unfixable to me. If we want to support cross building
here, we need to communicate the host architecture in some way. The
current upstream way of communicating it is "The architecture of the
Python interpreter" and that's bad for cross compiling on Debian. The
other way that's really bad is DEB_HOST_* environment variables, as that
only works in Debian package builds and not when using Debian as a base
to cross build software manually.

> Therefore, in order to keep everything consistent, my patch [1] primarily
> changes the numpy.get_include() behavior and resorts to the aforementioned
> $PKG_CONFIG hack in order to determine which path should be returned.

Effectively, you propose that the architecture is implicitly communicated
via the PKG_CONFIG environment variable. As mentioned elsewhere, this is
a very unusual way, but given the options available I increasingly see
merit here.

> This works because I *also* let python3-numpy-dev install a sane numpy.pc at
> /usr/lib/${DEB_HOST_MULTIARCH}/pkgconfig (which in turn is the reason why I
> need to patch test_configtool in [1]: the assumption that pkgconfigdir
> points to the NumPy module tree no longer holds).
>
> [1]
> https://sources.debian.org/src/numpy/1:2.1.1+ds-1/debian/patches/0008-Fix-path-configuration-to-work-with-Debian-specific-.patch/

Let me suggest yet another way! Since numpy is a Python extension, I
suspect that the majority of its users are building some dependent Python
extension. Is there actually another way of using it? (If yes, the rest
may be a bad idea.)

We already communicate the host architecture to extension builds via the
_PYTHON_SYSCONFIGDATA_NAME environment variable. Once it is set, the
sysconfig module behaves differently. For instance, the MULTIARCH variable
changes. How about changing numpy.get_include() to look into sysconfig
for producing an architecture-dependent path and then symlinking the .pc
file into place such that the standard upstream way via
PKG_CONFIG_LIBDIR works again? I am proposing this because
_PYTHON_SYSCONFIGDATA_NAME is already a widely established way of doing
this.

> > [ Yet another solution snipped ]
> Okay, I admit this is actually not such a bad idea if we are looking to keep
> numpy-config around. It does not solve the get_include() problem, but it
> makes numpy-config more robust.

Indeed.

> But now that I have trapped you sufficiently deep in the rabbit hole of
> NumPy build tooling, do you have another good idea to make get_include()
> work?

Yes, above.

In any case, the central question to ask here is: How should a builder
communicate the host architecture to numpy? Once that is answered, the
rest falls into place naturally (or if it doesn't, the answer to that
question is not a good one). Regardless of what the answer is, the
question and answer should be documented in README.multiarch or
something similar.

Helmut
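
For illustration, the sysconfig-based lookup suggested above might look
roughly like the sketch below. This is not the upstream or the Debian
implementation. The function names and the include directory layout are
hypothetical and chosen only for this example; only the pkgconfig
directory follows the /usr/lib/${DEB_HOST_MULTIARCH}/pkgconfig location
mentioned in the quoted text.

```python
import posixpath
import sysconfig


def host_multiarch(multiarch=None):
    """Return the multiarch triplet reported by sysconfig (hypothetical helper)."""
    if multiarch is None:
        # With _PYTHON_SYSCONFIGDATA_NAME pointing at the host's
        # sysconfigdata module, MULTIARCH is expected to describe the host
        # rather than the interpreter that runs the build.
        multiarch = sysconfig.get_config_var("MULTIARCH")
    if not multiarch:
        raise RuntimeError("sysconfig does not provide MULTIARCH")
    return multiarch


def numpy_include_dir(prefix="/usr", multiarch=None):
    """Build an architecture-dependent include path.

    The "numpy/include" layout below is an assumption made for this sketch.
    """
    return posixpath.join(prefix, "lib", host_multiarch(multiarch), "numpy", "include")


def pkgconfig_dir(prefix="/usr", multiarch=None):
    """Build the multiarch pkgconfig directory where numpy.pc would live."""
    return posixpath.join(prefix, "lib", host_multiarch(multiarch), "pkgconfig")
```

Called without a multiarch argument, both helpers follow whatever
sysconfig reports, so a cross build that sets
_PYTHON_SYSCONFIGDATA_NAME would need no numpy-specific variable.
Passing the triplet explicitly, as in
numpy_include_dir(multiarch="aarch64-linux-gnu"), is only meant for
trying the sketch out.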