On Wed, May 07, 2025 at 02:20:44PM +0200, Simon Josefsson wrote:
> Thanks for answers! Surprisingly I now find myself agreeing that your approach is reasonable and is consistent with existing Debian practices. I just wish that the existing practices were more libre and more consistent with documented policies, but I also think this is not the popular opinion.
So, let's delve deeper into the practical impact of such consistency (or the lack of it). Say we have a hypothetical package called gnipgnop-rattrap. It's an accessibility tool that tracks elements of your face using pretrained Haar cascade classifier models and, based on where you look, moves the "mouse" pointer. The models it ships with were trained solely on 75 gigabytes of images captured from Disney films, which are not available anywhere because the people who trained the models are afraid of being sued. What should Debian do? Remove the package from the archive so no one can use it? Patch it to download the models from a random URL which may or may not be accessible? Construct 75 gigabytes of DFSG-free annotated training data to stuff into the source package?
