On Sat, Apr 25, 2026 at 07:29:27PM -0600, Todd C. Miller wrote:
> On Mon, 20 Apr 2026 10:15:47 +0200, Walter Alejandro Iglesias wrote:
>
> > With the other diff I posted for this issue the segfault still happened
> > in some scenarios, plus some regressions. With this one, I haven't had
> > any problem so far.
>
> Nice work. I started to try debugging this when it was first
> reported but didn't get very far.

It's a wormhole. :-)

None of what I posted is a solution.  Whatever I do with
sp->ep->refcnt, and wherever I do it, I keep finding more ways to
reproduce the segfault.  Reloading *ecp avoids the write-after-free
aborts, but it only dodges the real issue which, maybe (I'm still not
sure), has a common root.

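To show what I mean by "reloading", here is a toy model, not nvi code.
struct cmd, struct owner and callee_replaces_cmd() are made-up names,
and it assumes the callee frees the command the caller had cached and
installs a new one in the owning structure.  Re-reading the pointer
after the call makes the caller safe, but it says nothing about why the
old object went away.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not nvi's EXCMD or GS structures. */
struct cmd {
	int id;
};

struct owner {
	struct cmd *cur;	/* the owner's idea of the current command */
};

/* A callee that throws away the current command and installs a new one. */
static int
callee_replaces_cmd(struct owner *o)
{
	struct cmd *n = malloc(sizeof(*n));

	if (n == NULL)
		return (1);
	n->id = o->cur->id + 1;
	free(o->cur);		/* any pointer cached by the caller now dangles */
	o->cur = n;
	return (0);
}

/* Returns the id seen after the call, or -1 on allocation failure. */
static int
caller(void)
{
	struct owner o;
	struct cmd *c;
	int id;

	if ((o.cur = malloc(sizeof(*o.cur))) == NULL)
		return (-1);
	o.cur->id = 1;
	c = o.cur;		/* cached copy, like a local ecp */
	assert(c->id == 1);
	if (callee_replaces_cmd(&o)) {
		free(o.cur);	/* callee failed before freeing anything */
		return (-1);
	}
	c = o.cur;		/* reload: the cached copy is stale */
	id = c->id;
	free(o.cur);
	return (id);
}
```

Without the reload, the read of c->id after the call would touch freed
memory, which is the kind of access that triggers the aborts.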
If the hours I spent with gdb (from ports) didn't lie to me, this is
what happens. Both *sp and *ecp get screwed after ex_visual() is called
by ex_cmd() (ex/ex.c:1372):

	/*
	 * Call the underlying function for the ex command.
	 *
	 * XXX
	 * Interrupts behave like errors, for now.
	 */
	if (ecp->cmd->fn(sp, ecp) || INTERRUPTED(sp)) {
		if (F_ISSET(gp, G_SCRIPTED))
			F_SET(sp, SC_EXIT_FORCE);
		goto err;
	}

The problem comes here (ex/ex_visual.c:122):

	if (F_ISSET(sp, SC_EX_GLOBAL)) {
		/*
		 * When the vi screen(s) exit, we don't want to lose our hold
		 * on this screen or this file, otherwise we're going to fail
		 * fairly spectacularly.
		 */
		++sp->refcnt;
		++sp->ep->refcnt;

		/*
		 * Fake up a screen pointer -- vi doesn't get to change our
		 * underlying file, regardless.
		 */
		tsp = sp;
HERE ->		if (vi(&tsp))
			return (1);

		/*
		 * !!!
		 * Historically, if the user exited the vi screen(s) using an
		 * ex quit command (e.g. :wq, :q) ex/vi exited, it was only if
		 * they exited vi using the Q command that ex continued.  Some
		 * early versions of nvi continued in ex regardless, but users
		 * didn't like the semantic.
		 *
		 * Reset the screen.
		 */
		if (ex_init(sp))
			return (1);

		/* Move out of the vi screen. */
		(void)ex_puts(sp, "\n");

Is that "fake up a screen pointer" based on a false assumption?  Does
nvi end up using the same SCR for everything?  Anyway, while vi()
(vi/vi.c:63) is working on tsp, sp gets wiped as well.  It happens at
this point (vi/vi.c:391):

		/*
		 * If the current screen is still displayed, it will
		 * need a new status line.
		 */
		F_SET(sp, SC_STATUS);

		/* Switch screens, change focus. */
		sp = sp->nextdisp;
		vip = VIP(sp);
		(void)sp->gp->scr_rename(sp, sp->frp->name, 1);

		/* Don't trust the cursor. */
		F_SET(vip, VIP_CUR_INVALID);

		/* Refresh so we can display messages. */
HERE ->		if (vs_refresh(sp, 1))
			return (1);
	}

When this whole walk ends and we return to ex_cmd(), *sp and *ecp are
filled with garbage.
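
For what it's worth, the hold that ex_visual() takes before calling
vi() can be modeled in a few lines.  This is not nvi code: struct
screen, screen_release() and nested_loop() are made-up names, and the
model assumes that every teardown path goes through the reference
count.  It only shows what the ++sp->refcnt is supposed to buy us.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a screen object; not nvi's real SCR. */
struct screen {
	int refcnt;		/* number of holders */
};

static int screens_freed;	/* counts frees so the demo can be checked */

static struct screen *
screen_new(void)
{
	struct screen *s = calloc(1, sizeof(*s));

	if (s != NULL)
		s->refcnt = 1;
	return (s);
}

/* Drop one reference; free the object only when the last holder lets go. */
static void
screen_release(struct screen *s)
{
	assert(s->refcnt > 0);
	if (--s->refcnt == 0) {
		free(s);
		screens_freed++;
	}
}

/* Stand-in for a nested editor loop that ends the screen it was handed. */
static void
nested_loop(struct screen *s)
{
	screen_release(s);
}

/* Returns 1 if the screen survived the nested call, 0 otherwise. */
static int
run_with_hold(void)
{
	struct screen *s = screen_new();
	int before = screens_freed;
	int survived;

	if (s == NULL)
		return (0);
	++s->refcnt;			/* take our own hold first */
	nested_loop(s);
	survived = (screens_freed == before);
	if (survived)
		screen_release(s);	/* drop our hold */
	return (survived);
}

/* Same as above but without the extra hold. */
static int
run_without_hold(void)
{
	struct screen *s = screen_new();
	int before = screens_freed;

	if (s == NULL)
		return (0);
	nested_loop(s);		/* s dangles from here on; do not touch it */
	return (screens_freed == before);
}
```

In this model the hold works because the only way to destroy a screen
is screen_release().  If some path under vi() frees or reuses the
screen without looking at refcnt, the hold buys nothing, which would
match what I see, but that part is a guess.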

How can we create a real separate screen for global-visual, I mean one
that is created and destroyed as if it were a separate process?  I
don't know how to do this.  Even though I've made a lot of progress in
understanding the code, I'm still lost.

>
> - todd
--
Walter