Package: ghc
Version: 7.6.3-7
Severity: important
Tags: upstream patch

Hi,

ghc has been removed from the archive on s390x because it hangs randomly
during the build process. This has been reported upstream as ticket #7993,
which hasn't progressed so far. In the meantime the same issue has been
reported as ticket #8134 for powerpc64. It turns out that the problem is
the same and that it affects 64-bit big-endian platforms. A patch is
provided in that upstream ticket, and has been committed upstream.

I have tried this patch and I have been able to build ghc successfully
3 times in a loop after bootstrapping it from the last available binary
in snapshot.d.o.

I have attached this patch to this bug report for convenience, so that
it can be dropped into debian/patches. Would it be possible to upload a
fixed version with this patch? I will then take care of bootstrapping
the binary on s390x again and uploading the package to the archive.

Thanks,
Aurelien

-- System Information:
Debian Release: jessie/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: s390x

Kernel: Linux 3.2.0-4-s390x (SMP w/2 CPU cores)
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

commit a4b1a43542b11d09dd3b603d82c5a0e99da67d74
Author: Austin Seipp <aus...@well-typed.com>
Date:   Fri Nov 1 22:17:01 2013 -0500

    Fix loop on 64bit Big-Endian platforms (#8134)

    This is a fun one.

    In the RTS, `cas` expects a pointer to StgWord which will translate to
    unsigned long (8 bytes under LP64.) But we had previously declared
    token_locked as *StgBool* - which evaluates to 'int' (4 bytes under
    LP64.) That means we fail to provide enough storage for the cas
    primitive, causing it to corrupt memory on a 64bit platform.

    Hilariously, this somehow did not affect little-endian platforms (ARM,
    x86, etc) before. That's because to clear our lock token, we would
    say:

        token_locked = 0;

    But because token_locked is 32bits technically, this only writes to
    half of the 64bit quantity. On a Big-Endian machine, this won't do
    anything. That is, token_locked starts as 0:

     / token_locked
     |
     v
     0x00000000

    and the first cas modifies the memory to:

     / valid    / corrupted
     |          |
     v          v
     0x00000000 0x00000001

    We then clear token_locked, but this doesn't change the corrupted 4
    bytes of memory. And then we try to lock the token again, spinning
    until it is released - clearly a deadlock.

    Related: Windows (amd64) doesn't follow LP64, but LLP64, where both
    int and long are 4 bytes, so this shouldn't change anything on these
    platforms.

    Thanks to Reid Barton for helping the diagnosis. Also, thanks to Jens
    Peterson who confirmed this also fixes building GHC on Fedora/ppc64
    and Fedora/s390x.

    Authored-by: Gustavo Luiz Duarte <gustav...@linux.vnet.ibm.com>
    Signed-off-by: Austin Seipp <aus...@well-typed.com>

diff --git a/rts/STM.c b/rts/STM.c
index e342ebf..bea0356 100644
--- a/rts/STM.c
+++ b/rts/STM.c
@@ -949,7 +949,7 @@ void stmPreGCHook (Capability *cap) {
 static volatile StgInt64 max_commits = 0;
 
 #if defined(THREADED_RTS)
-static volatile StgBool token_locked = FALSE;
+static volatile StgWord token_locked = FALSE;
 
 static void getTokenBatch(Capability *cap) {
   while (cas((void *)&token_locked, FALSE, TRUE) == TRUE) { /* nothing */ }