Bug#1004184: gcc-11: generate bad code for matplotlib with -O1/-O2 on mips64el

2023-02-24 Thread James Addison
Source: gcc-11
Followup-For: Bug #1004184
X-Debbugs-Cc: frederic-emmanuel.pi...@synchrotron-soleil.fr
Control: forwarded -1 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914

Hi Frederic: I'm linking a forwarded GCC GNU bug report that I _think_ is the
upstream report matching this bug.  I found it from a related GitHub issue[1]
for matplotlib.

The GCC bug reporter has done some work to create a minimal reproducer case.
Could you check whether the issue reported there is the same one as here?  (I
will do so eventually, but I am not familiar with C and do not have MIPS
hardware available, so it may take some time.)

[1] - https://github.com/matplotlib/matplotlib/issues/21789



Bug#1023666: gcc-10 should not be shipped in bookworm

2023-02-28 Thread James Addison
Package: gcc-10
Version: 10.4.0-7
Followup-For: Bug #1023666

Bug #1004184 implies that gcc-11 cannot build correct mips64 code for a key
Debian package (source: matplotlib) without buildflag adjustments.  However,
gcc-10 does emit correct code for the same package and architecture.

Should that be considered a blocker for this bug?



Bug#1005863: gcc-11: invalid opcode for Geode LX on i386

2023-03-19 Thread James Addison
Package: gcc-11
Followup-For: Bug #1005863
X-Debbugs-Cc: martin-eric.rac...@iki.fi
Control: affects -1 - sudo net-tools
Control: affects -1 + libjavascriptcoregtk-4.0-18
Control: affects -1 + gobjc++-12-x86-64-linux-gnu
Control: affects -1 + libfsapfs-dev

Dear Maintainer and Martin-Éric,

Using a recent (2023-03-19) mirror of i386 packages from 'main' and 'contrib'
of bookworm, in combination with an ad-hoc script[1], the following packages
currently appear susceptible to this bug:

  * libjavascriptcoregtk-4.0-18_2.38.5-1_i386
    /usr/lib/i386-linux-gnu/libjavascriptcoregtk-4.0.so.18.21.8

  * gobjc++-12-x86-64-linux-gnu_12.2.0-14cross1_i386
    /usr/lib/gcc-cross/x86_64-linux-gnu/12/cc1objplus

  * libfsapfs-dev_20201107-1+b3_i386
    /usr/lib/i386-linux-gnu/libfsapfs.a


That's three potential positives; in total, the check ran on approximately
thirty-two thousand (32340, to be more precise) packages.

Thanks,
James

[1] - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1033065#22


Bug#1005863: gcc-11: invalid opcode for Geode LX on i386

2023-03-19 Thread James Addison
Package: gcc-11
Followup-For: Bug #1005863
X-Debbugs-Cc: martin-eric.rac...@iki.fi
Control: affects -1 + sudo

> That's three potential positives; in total, the check ran on approximately
> thirty-two thousand (32340, to be more precise) packages.

My apologies: there was a bug in the script that runs these checks, and it had
not, in fact, run on every one of the 32340 packages -- instead only a sample
(of yet-to-be-determined size) was checked.

Complete results should follow; initial output appears to show that, in fact,
many more than three packages (including a recent 'sudo' package) are indeed
affected.



Bug#1005863: gcc-11: invalid opcode for Geode LX on i386

2023-03-19 Thread James Addison
Package: gcc-11
Followup-For: Bug #1005863
X-Debbugs-Cc: martin-eric.rac...@iki.fi

Ok; I should have realised that the scan of the entire i386 bookworm archive
for particular opcodes, across _all_ files and on a single machine, had
completed surprisingly quickly.

Please find attached an updated check-script (check.sh) that is running
currently.

It makes some tradeoffs for scanning performance reasons: in particular, it's
only inspecting files that have the executable bit set, or that end with the
suffix '.so' or '.a'.

It seems that it's going to take a while to run to completion on the available
hardware here: my estimate would be approximately another two days (48 hours).

I'm uncertain whether the script will run to completion uninterrupted, and also
it is not written to be easily-resumable, so.. let's at least gather some
summary statistics from the output while it's in progress.
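
One way to make the run resumable, sketched under assumptions: the output of
check.sh is appended to affected.txt (the file that report.sh reads), and a
small driver skips any package that already has a "Checking" line there.  The
names needs_check and scan_all, the LOG variable and the mirror path are
hypothetical and are not part of the attached scripts.

```shell
#!/bin/bash
# Hypothetical resumable driver around check.sh (names and paths are assumptions).
LOG="${LOG:-affected.txt}"

# Succeeds if the log has no "Checking <package> ... " line for this .deb yet.
needs_check() {
    local package
    package=$(basename "$1" .deb)
    ! grep -q -F -- "Checking ${package} ... " "$2" 2>/dev/null
}

# Scan every .deb under a directory, skipping packages that are already logged.
scan_all() {
    local dir="$1" deb
    while IFS= read -r -d '' deb; do
        if needs_check "$deb" "$LOG"; then
            ./check.sh "$deb" >> "$LOG"
        fi
    done < <(find "$dir" -type f -name '*.deb' -print0)
}

# Usage (not run here): scan_all /path/to/mirror
```

A run that is interrupted part-way through a package could leave that package
with a "Checking" line but incomplete results, so the last logged package is
probably worth re-checking by hand after a restart.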

Please also find attached a reporting script (report.sh) that summarises the
total number of packages scanned, the number of packages where at least one
file was inspected, and the number of packages where at least one inspected
file contained a 'nopl' opcode.

The current report.sh output at the time of writing is:

  2441
  2042
  130


So my guess is that approximately 6-7% of i386 packages in bookworm _that
contain binaries or shared libraries_ are susceptible to this bug.
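
As a rough check of that estimate, the third report.sh figure can be divided by
the second.  The helper name pct is hypothetical and is not part of the
attached scripts; this is only a sketch of the arithmetic.

```shell
#!/bin/bash
# pct AFFECTED INSPECTED: print the share of inspected packages that were
# flagged, as a percentage with one decimal place.
pct() {
    awk -v a="$1" -v n="$2" \
        'BEGIN { if (n > 0) printf "%.1f\n", 100 * a / n; else print "n/a" }'
}

pct 130 2042   # prints 6.4
```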

The opcode may not be encountered at runtime when those packages are used,
and analysing where the packages sit in Debian's dependency graph would
indicate the level of impact on a system.  However, my initial sense is that
this could indeed be a fairly critical issue on Geode LX hardware for Debian
bookworm.

It's also a larger number of packages than we could expect individual
maintainers to adjust their buildflags for on any realistic timescale - so
either a Debian-specific patch or upstream fix would be required to continue to
support Geode LX (in my opinion, and assuming that the script and report are
accurate-enough to be guiding indicators).

check.sh:

#!/bin/bash

FULLPATH="$1"
PACKAGE=$(basename "${FULLPATH}" .deb)

dpkg -x "${FULLPATH}" "${PACKAGE}";

echo -n "Checking ${PACKAGE} ... ";
find "$PACKAGE" -type f -a \( -executable -o -name '*.so' -o -name '*.a' \) -print | wc -l

while IFS= read -r -d '' file; do
    objdump --architecture=i386 --disassemble-all "$file" | grep -q -w "nopl" && echo "E $file"
done < <(find "${PACKAGE}" -type f -a -executable -o -name '*.so' -print0) 2>/dev/null;

rm -rf "${PACKAGE}";

report.sh:

#!/bin/bash

# total number of packages checked
grep "^Checking" affected.txt | wc -l

# packages that contained at least one binary/shared-library to inspect
grep "^Checking" affected.txt | grep -v " 0$" | wc -l

# packages where at least one error was found in a binary/shared-library
grep "^Checking" affected.txt -A 1 | grep "^E" | wc -l


Bug#1005863: gcc-11: invalid opcode for Geode LX on i386

2023-03-19 Thread James Addison
Package: gcc-11
Followup-For: Bug #1005863
X-Debbugs-Cc: martin-eric.rac...@iki.fi

> So my guess is that approximately 6-7% of i386 packages in bookworm _that
> contain binaries or shared libraries_ are susceptible to this bug.

...

> It's also a larger number of packages than we could expect individual
> maintainers to adjust their buildflags for on any realistic timescale - so
> either a Debian-specific patch or upstream fix would be required to continue to
> support Geode LX (in my opinion, and assuming that the script and report are
> accurate-enough to be guiding indicators).

This reporting was, in fact, too optimistic; the check.sh script had a bug
that meant it wasn't inspecting '*.a' files (even though it was including them
in the per-package binary/library counts).

With this fix in place:

-done < <(find "${PACKAGE}" -type f -a -executable -o -name '*.so' -print0) 2>/dev/null;
+done < <(find "${PACKAGE}" -type f -a \( -executable -o -name '*.so' -o -name '*.a' \) -print0) 2>/dev/null;

... the report summary at the time-of-writing is:

  2467
  1639
  345

... and so it seems that the percentage of susceptible packages for arch i386
in bookworm could be closer to 20%.
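
The effect of the parentheses in the fix above can be sketched in a scratch
directory.  The file names below are hypothetical stand-ins rather than real
package contents, and the example assumes GNU find (for -executable).  Without
grouping, find reads the expression as "(-type f -a -executable) -o (-name
'*.so' -a -print)", so the print action is attached only to the '*.so' branch.

```shell
#!/bin/bash
# Sketch: how grouping changes which files find(1) prints.
demo=$(mktemp -d)
touch "$demo/tool" "$demo/libx.so" "$demo/liby.a" "$demo/README"
chmod +x "$demo/tool"

# Ungrouped, as in the original loop: -print binds only to the '*.so' test.
ungrouped=$(find "$demo" -type f -a -executable -o -name '*.so' -print | sort)

# Grouped, as in the fix: -type f and -print apply to all three alternatives.
grouped=$(find "$demo" -type f -a \( -executable -o -name '*.so' -o -name '*.a' \) -print | sort)

echo "ungrouped:"; echo "$ungrouped"
echo "grouped:";   echo "$grouped"
rm -rf "$demo"
```

In this sketch the ungrouped form should list only libx.so, while the grouped
form should list tool, libx.so and liby.a.  If that reading is right, the
earlier run may have skipped executables as well as '*.a' files.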



Bug#1005863: gcc-11: invalid opcode for Geode LX on i386

2023-03-19 Thread James Addison
Package: gcc-11
Followup-For: Bug #1005863
X-Debbugs-Cc: debian-gcc@lists.debian.org, debian-rele...@lists.debian.org, debian-pol...@lists.debian.org

Hi folks,

Bug #1005863 describes a gcc-11 behaviour that results in software that exits
ungracefully on Geode LX i686 hardware.  Despite sometimes self-reporting as
i586, Geode LX is in fact an i686 CPU (without Physical Address Extension and
multi-byte no-op instructions such as NOPL -- both optional per spec).

My assessment -- which may be incorrect -- is that something like 20% of
packages in the bookworm i386 suite are susceptible to the bug, so I think that
installing bookworm on a Geode LX system would present users with a poor
experience of Debian.

Would it be fair to raise the severity of this bug to a release-critical
level?

I understand that toolchains are an important part of the ecosystem and that
changes to them -- especially ones that may affect many packages -- should be
undertaken with care, and that we are into bookworm's pre-release hard freeze.

Thank you,
James



Bug#1005863: gcc-11: invalid opcode for Geode LX on i386

2023-03-20 Thread James Addison
On Mon, 20 Mar 2023 at 07:22, Bastian Blank wrote:
>
> On Sun, Mar 19, 2023 at 11:47:21PM +, James Addison wrote:
> > Would it be fair to raise the severity of this bug to a release-critical
> > level?
>
> No, it would be fair to remove Geode LX from the set of supported
> processors.  Those are now over 15 years old.

Ok, thank you; understood.

It looks like this was previously documented[1] for the Debian 9.0
(stretch) release in 2017, and later discussed[2] further.

I'll continue following the upstream bug, but I clearly don't fully
understand the problem yet.

My hope was that we could continue to maintain (in fact, with my
updated understanding: restore) support for the affected Geode LX
platform.  I can accept that that may not be possible.

[1] - https://www.debian.org/releases/stretch/i386/release-notes/ch-information.html#i386-is-now-almost-i686

[2] - https://lists.debian.org/debian-user/2019/04/msg01091.html



Bug#1005863: binutils: invalid opcode for Geode LX on i386

2023-03-20 Thread James Addison
Followup-For: Bug #1005863
X-Debbugs-Cc: ballo...@debian.org
Control: reassign -1 binutils 2.38-1

Reassigning this from package 'gcc' to 'binutils':

It looks like it is GNU binutils[1] (and in particular, the GNU assembler)
that is responsible for producing the assembly opcodes for a binary compiled
with gcc.

On Mon, 20 Mar 2023 11:27:40 +0100, Bill Allombert wrote:
> From a purely engineering perspective, without a way to address this problem,
> increasing the severity will not achieve much.

Yep, agreed.  I'd like to learn more about technical fix feasibility before
adjusting the severity.

There was a commit[2] to GNU binutils in 2010 to stop emitting NOPL on (32-bit)
i686 targets.  I'm wondering whether a regression since then may have caused
the opcodes to reappear.

(it continues to be equally likely that I've completely misunderstood and am
creating noise without making any useful progress)

[1] - https://www.gnu.org/software/binutils/

[2] - https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=2210942396dab942a86cb6777c705554b84ebb0e



Bug#1005863: gcc: should reject combination of i686 architecture and fcf-protection feature

2023-03-24 Thread James Addison
Followup-For: Bug #1005863
Control: affects -1 = sudo



Bug#1004184: gcc-11: generate bad code for matplotlib with -O1/-O2 on mips64el

2024-08-25 Thread James Addison
Source: gcc-11
Followup-For: Bug #1004184
Control: fixed -1 gcc-14/14.1.0-1

I haven't yet confirmed that an -O1/-O2 build produces correct output when
compiling on MIPS, but the relevant patches have arrived in gcc v14.1 and are
packaged in Debian, so I'm marking this bug as fixed in that version to record
that.