Here is a follow-up to the story, for those curious about what happens on the similar IA64 architecture. And this should be it.
As for the problem on E2K itself, we should discuss it with MCST and/or
investigate whether the missing information about the faults can be
recovered to better satisfy POSIX.

On Sat, 29 Dec 2018, Ivan Zakharyaschev wrote:

> > > As for the SIGILL peculiarity, it has a reason in the Elbrus
> > > architecture.
>
> I've studied the assembler code and found the other true
> reason in this specific case: these are faults "hidden" in an explicitly
> "speculative" computation which ultimately result in SIGILL. (The E2K ISA
> is reminiscent of IA64; this can help get the idea.) The specific kind of
> the fault is "forgotten", unfortunately.
>
> Besides, in many aspects including the newly mentioned by me explicitly
> speculative instructions, E2K reminds IA64.
>
> And it'd be interesting to have a look how they treat faults coming from
> speculative computations in Linux/ia64 to get an idea whether it can be
> done in a manner with better conformance to POSIX.
>
> * * *
>
> BTW, saving and forgetting the type of the original fault doesn't seem

I meant "not forgetting".

> to be something expensive to implement (after some thought): when a
> register is marked as invalid, it shouldn't matter anymore what value
> it holds. So, the same register can be used to save the information
> about the type of the fault.

As Dmitry Levin pointed out, probably not, because there can be too much
information (the fault, and the associated address) for a single register.

> * * *
>
> I wanted to see how Linux/ia64 handles these complications arising
> from speculative computations possibly causing a fault; and powered on
> such a machine, and had a look at the above examples with SIGILL on
> E2K: the third one, and the fifth one (speculative division by zero).
>
> The third example from above:
>
> imz@rx2620:~/test-speculative-SIGSEGV$ cc -Wall -O3 -xc - -S -o c.s && cat c.s
> int main(int argc, char ** argv) {
>   if (0 < argc)
>     ++*(char*)0xbad;
>   return 0xbeef;
> }
>         .file ""
>         .pred.safe_across_calls p1-p5,p16-p63
>         .section .text.startup,"ax",@progbits
>         .align 16
>         .align 64
>         .global main#
>         .type main#, @function
>         .proc main#
> main:
>         .prologue
>         .body
>         .mmi
>         cmp4.ge p6, p7 = 0, r32
>         addl r14 = 2989, r0
>         addl r8 = 48879, r0
>         ;;
>         .mmi
>         (p7) ld1 r15 = [r14]
>         ;;
>         (p7) adds r15 = 1, r15
>         nop 0
>         ;;
>         .mib
>         (p7) st1 [r14] = r15
>         nop 0
>         br.ret.sptk.many b0
>         .endp main#
>         .ident "GCC: (Debian 4.6.3-14) 4.6.3"
>         .section .note.GNU-stack,"",@progbits
> imz@rx2620:~/test-speculative-SIGSEGV$ cc -Wall -O3 c.s && ./a.out; echo $?
> Segmentation fault
> 139
>
> Notes on the assembler: the possible groupings into VLIWs are
> separated by double semicolons (";;"). Predicative execution of
> instructions is marked by a prefix with the corresponding predicate
> register in parentheses, like "(p7)" in the code above:
>
>         .mmi
>         (p7) ld1 r15 = [r14]
>         ;;
>         (p7) adds r15 = 1, r15
>         nop 0
>         ;;
>         .mib
>         (p7) st1 [r14] = r15
>
> These are the "load", "add", and "store" instructions corresponding to:
>     ++*(char*)0xbad
>
> All this shows that gcc-4.6 on IA-64 doesn't generate speculative
> computations for the same examples that had speculative computations
> on E2K. Unfortunately, this means that we couldn't compare the
> interesting bits of the behavior between Linux/e2k and Linux/ia64
> quickly. Perhaps, editing the IA64 assembler code can give a desired
> example.

Cool! Linux/ia64 also produces SIGILL in the same situation; it seems to
have no magic. (But there is a second part of the story!)

imz@rx2620:~/test-speculative-SIGSEGV$ diff c.s c_s.s
18c18
<         (p7) ld1 r15 = [r14]
---
>         (p7) ld1.s r15 = [r14]
imz@rx2620:~/test-speculative-SIGSEGV$ cc c_s.s && ./a.out; echo $?
Illegal instruction
132

"ld1.s" is the "load 1 byte" instruction with the "speculative" flag.

If we do not use the "invalid" register in a "store" instruction, then
there is no fault:

imz@rx2620:~/test-speculative-SIGSEGV$ diff c_s.s c_nost.s
24,25d23
<         (p7) st1 [r14] = r15
<         nop 0
imz@rx2620:~/test-speculative-SIGSEGV$ cc c_nost.s && ./a.out; echo $?
239

And the second part:

The problem has a solution on IA64. The compiler would know how to replay
the faulty speculative computation, so it would be able to generate code
to do this non-speculatively and trigger the real fault. And there is an
instruction that checks whether a register is "valid"[1] and helps to jump
to the recovery code[2]: "chk.s".

I've implemented this approach manually in c_chk.s like this (but I have
not seen what a compiler would actually do; IA64 has other flavors of
speculative instructions, like "ld.a" etc., so there are rich
possibilities):

        .file ""
        .pred.safe_across_calls p1-p5,p16-p63
        .section .text.startup,"ax",@progbits
        .align 16
        .align 64
        .global main#
        .type main#, @function
        .proc main#
main:
        .prologue
        .body
        .mmi
        addl r14 = 2989, r0
        addl r8 = 48879, r0
        ;;
        .mmi
        ld1.s r15 = [r14]
        ;;
        .mmi
        cmp4.ge p6, p7 = 0, r32
        ;;
        (p7) adds r15 = 1, r15
        nop 0
        ;;
        (p7) chk.s r15, .recovery
        ;;
.back:
        .mib
        (p7) st1 [r14] = r15
        nop 0
        br.ret.sptk.many b0
.recovery:
        ld1 r15 = [r14]
        //adds r15 = 1, r15
        br.cond.sptk .back
        .endp main#
        .ident "GCC: (Debian 4.6.3-14) 4.6.3"
        .section .note.GNU-stack,"",@progbits

imz@rx2620:~/test-speculative-SIGSEGV$ cc c_chk.s && ./a.out; echo $?
Segmentation fault
139

It produced a normal behavior, better satisfying POSIX.

[1]: https://blogs.msdn.microsoft.com/oldnewthing/20040119-00/?p=41003
[2]: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/ia64/strchr.S;h=3a29e80b52c350a76e880cbb8daa66c91fa98964;hb=HEAD#l87

[1] seems to be outdated because it shows a wrong variant of "chk.s", but
it has a story about the registers being 65-bit, with an additional bit
for their "validity".
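
To make the 65-bit register picture and the "chk.s" recovery more concrete,
here is a toy model in C. It is only a sketch under loud assumptions: all
toy_* names are invented for illustration, "memory" is a plain byte array,
and turning the use of an invalid register into SIGILL merely mirrors what
the sessions above showed. It says nothing about how the hardware or the
kernel are actually implemented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy register: 64 value bits plus one "invalid" bit (hypothetical model). */
typedef struct {
    uint64_t value;
    bool nat;               /* set when a speculative fault was deferred */
} toy_reg;

typedef enum { TOY_OK, TOY_SIGSEGV, TOY_SIGILL } toy_outcome;

/* Stand-in for the MMU: only offsets inside the array are "mapped". */
static bool toy_addr_ok(size_t size, size_t addr)
{
    return addr < size;
}

/* Like "ld1.s": a bad address does not fault, it only marks the register. */
static toy_reg toy_ld1_s(const uint8_t *mem, size_t size, size_t addr)
{
    toy_reg r = { 0, false };
    if (toy_addr_ok(size, addr))
        r.value = mem[addr];
    else
        r.nat = true;       /* the kind of fault is forgotten here */
    return r;
}

/* Like plain "ld1": a bad address faults immediately. */
static toy_outcome toy_ld1(const uint8_t *mem, size_t size, size_t addr,
                           toy_reg *out)
{
    if (!toy_addr_ok(size, addr))
        return TOY_SIGSEGV;
    out->value = mem[addr];
    out->nat = false;
    return TOY_OK;
}

/* Like "adds": the invalid bit travels along with the result. */
static toy_reg toy_adds(toy_reg r, uint64_t imm)
{
    r.value += imm;
    return r;
}

/* Like "st1": storing an invalid register is the point where it blows up. */
static toy_outcome toy_st1(uint8_t *mem, size_t size, size_t addr, toy_reg r)
{
    if (r.nat)
        return TOY_SIGILL;
    if (!toy_addr_ok(size, addr))
        return TOY_SIGSEGV;
    mem[addr] = (uint8_t)r.value;
    return TOY_OK;
}

/* The ++*(char*)addr sequence; use_chk selects the c_chk.s variant. */
static toy_outcome toy_increment(uint8_t *mem, size_t size, size_t addr,
                                 bool use_chk)
{
    toy_reg r = toy_adds(toy_ld1_s(mem, size, addr), 1);
    if (use_chk && r.nat) {                 /* chk.s r15, .recovery */
        toy_outcome o = toy_ld1(mem, size, addr, &r);   /* replay the load */
        if (o != TOY_OK)
            return o;                       /* the real fault shows up */
        r = toy_adds(r, 1);                 /* redo the add if replay worked */
    }
    return toy_st1(mem, size, addr, r);
}
```

With a 4-byte array as memory, toy_increment(mem, 4, 0xbad, false) yields
TOY_SIGILL (the c_s.s case), and the same call with true yields TOY_SIGSEGV
(the c_chk.s case).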
[2] is a manually written example of this approach, which I quickly
googled up by searching for "chk.s" "ia64".

-- 
Best regards,
Ivan
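
A note on the numbers printed by "echo $?" in the sessions above: 139 is
128 + 11 (SIGSEGV), 132 is 128 + 4 (SIGILL), and 239 is the low byte of
0xbeef. The small C sketch below spells out that arithmetic. It assumes the
usual shell convention of reporting 128 + N for a process killed by signal
N, and Linux signal numbers in the range 1..64; the function names are made
up for illustration. The decoding is a heuristic, since a program can also
exit normally with a status above 128.

```c
#include <signal.h>

/* Assumption: Linux-style signal numbers 1..64 (hypothetical constant). */
enum { TOY_MAX_SIGNAL = 64 };

/* Guess the signal from a shell-reported status; 0 if it is not 128 + N. */
static int signal_from_shell_status(int status)
{
    int n = status - 128;
    return (n >= 1 && n <= TOY_MAX_SIGNAL) ? n : 0;
}

/* Only the low 8 bits of main's return value reach the parent process. */
static int exit_status_from_return(int ret)
{
    return ret & 0xff;
}
```

So signal_from_shell_status(139) gives SIGSEGV, signal_from_shell_status(132)
gives SIGILL, and exit_status_from_return(0xbeef) gives 239, which is not
of the 128 + N form for any signal in that range.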