On 2024-06-23 08:26, Jon Turney via Cygwin-apps wrote:
On 06/06/2024 20:03, Brian Inglis via Cygwin-apps wrote:
I found the github/nexB/license-expression Python package for doing SPDX licence checks. It is developed by the same team that does SPDX-toolkit for SPDX, uses the same current data, and the team works with the Fedora folks et al.

Thanks for taking a look at this problem.

Having a package for this seems fine, but: this package is what calm uses, and still has the drawbacks I mentioned:

* embeds the SPDX license data, doesn't dynamically fetch it
* can't really handle LicenseRef reasonably

Thanks Jon,

They appear to be looking at splitting the code and data packages, as the rules appear to be settling down somewhat, whereas the data will only expand, and they seem to want to make it easier to keep up the release cadence.

[But looking back at zoneinfo tzdb packages tzcode and tzdata: they used to be split, and still are logically, but every release now happens to both simultaneously, and for our distro, it makes no sense to not release tzcode, including the utilities, with tzdata, as that way we always pick up the latest tweaks.]

Given the trapping problems you solved below, it should be possible in the SPDX instantiation, to trap the unpackaged LicenseRef-* and ExceptionRef-* entries and downgrade their reporting to warnings.
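
A minimal sketch of that downgrade, assuming the checker can hand back a plain list of the keys it did not recognise (how that list is obtained is not shown here). The function names and message wording are hypothetical; only the two prefixes come from the text above.

```python
import sys

# Prefixes named in the paragraph above; anything starting with one of
# these is reported as a warning instead of an error.
REF_PREFIXES = ("LicenseRef-", "ExceptionRef-")


def split_unknown_keys(unknown_keys):
    """Split unrecognised keys into (warnings, errors) by prefix."""
    warnings, errors = [], []
    for key in unknown_keys:
        if key.startswith(REF_PREFIXES):
            warnings.append(key)
        else:
            errors.append(key)
    return warnings, errors


def report(unknown_keys, stream=sys.stderr):
    """Print both groups; return non-zero only when real errors remain."""
    warnings, errors = split_unknown_keys(unknown_keys)
    for key in warnings:
        print(f"warning: unlisted reference: {key}", file=stream)
    for key in errors:
        print(f"error: unknown licence key: {key}", file=stream)
    return 1 if errors else 0
```

With this split, an expression using only LicenseRef-IANA-TZ-Public-Domain would produce a warning and a zero status, while a genuinely unknown key would still fail.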

[I do not agree with their packaging of ...Refs based on the reporting or requesting organization (for example, ScanCode, Fedora, etc.) rather than the package source: for example, I use LicenseRef-IANA-TZ-Public-Domain rather than something like LicenseRef-ScanCode-Public-Domain. They also only allow ExceptionRefs after WITH, even when, for example, Google grants IP rights in addition to those in the licence, tying both together, so they are *NOT* independent.

Having raised a few issues and made a few points about various public domain packages and licences, SPDX appear to be fixated on licence texts as the embodiment of each variety of public domain licence, despite there probably only being a few major sources: expired copyrights (coming up for really ancient sources), US government department sources, and individual US developers; there may be occasional non-US individual developers, but as there is no real public domain concept elsewhere, they have often come up with equivalent expressions, like WTFPL, etc.]

I also am not sure we want to have to jump thru SPDX hoops for each Cygwin package licence we hit before Fedora does?

Successful attempt to package Python license-expression (without tests):

     https://cygwin.com/cgi-bin2/jobs.cgi?id=8210

log at:

     https://github.com/cygwin/scallywag/actions/runs/9293093201

cygport attached and at:

https://cygwin.com/cgit/cygwin-packages/playground/commit/?id=3626386b10c967f780547d1703ad23bd50f6331a

The package installs and runs using the attached PoC spdx-license-expression.py script, hooked into /usr/share/cygport/lib/pkg_pkg.cygpart by the attached licence hint addition patch.

I'm not super-keen on adding a cygport dependency on python, just to do this check.

It would probably be preferable to do this check initially after the .cygport is read, rather than only telling you about problems when you get around to doing the package step.

Add after the mandatory variables checks for LICENSE, etc.?
Could be optional additional packages - install python-SPDX-licen[cs]e-expression package which depends on python-license-expression to do checks - cygport checks for SPDX and runs licence checks only if present?
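
One way the "only if present" test might look on the Python side, as a rough sketch: probe for the module without importing it and skip the check when it is missing. The helper names are hypothetical, and `license_expression` is assumed to be the import name of the python-license-expression package; how cygport itself would invoke this is left open.

```python
import importlib.util


def spdx_checker_available(module_name="license_expression"):
    """Return True if the named module is installed, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def maybe_check_licence(expression, checker, module_name="license_expression"):
    """Run checker(expression) only when the module is installed.

    Returns None when the check is skipped, otherwise the checker's result.
    """
    if not spdx_checker_available(module_name):
        return None
    return checker(expression)
```

A caller would treat None as "check skipped, carry on", so a missing optional package never blocks a build.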

I also ran a test of the Python script and module against all package source cygport files declaring licences which I maintain or ever looked at, including a git/cygwin-packages/*.cygport download from 2023-02, showing the results in the attached log. I also attempted to trap the exceptions in the script, but that does not seem to work in any obvious documented manner, and I do not know enough Python to address this fully.

Yeah, the way validate() handles parse errors is bizarre and unhelpful.

What I ended up doing is calling parse() first to catch those errors, so something like:

     try:
         licensing.parse(expression)
         errs = licensing.validate(expression).errors
     except (ExpressionError, ExpressionParseError) as e:
         print(e, file=sys.stderr)
         return 2
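
The same pattern wrapped in a function, as a sketch only: the `licensing` object and the tuple of exception classes are passed in, so the wrapper makes no claims about the library beyond the `parse()` and `validate(...).errors` calls shown above. The function name and the 0/1/2 status convention are assumptions for illustration.

```python
import sys


def check_expression(licensing, expression, parse_errors=(Exception,)):
    """Return 0 if valid, 1 if validate() lists errors, 2 on a parse failure."""
    try:
        # Call parse() first so that malformed expressions raise here.
        licensing.parse(expression)
        errs = licensing.validate(expression).errors
    except parse_errors as e:
        print(e, file=sys.stderr)
        return 2
    for err in errs:
        print(err, file=sys.stderr)
    return 1 if errs else 0
```

With the real package one would presumably pass its licensing object and `(ExpressionError, ExpressionParseError)` as `parse_errors`, matching the snippet above.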

Thanks for that tip, I will have another look at calm, and see if I can work on adding that, also checking for ...Refs, and warning or erroring (non-fatal) as appropriate, in python-SPDX-license-expression.

Then another approach early in cygport to detect and check.

--
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                -- Antoine de Saint-Exupéry
