Hi all,
While debugging Calc for missing parts of my migration, I came across
a bug in the code.
I separated the documentation from the code commit. Maybe it is worth
cherry picking it for trunk?
Here is the relevant part:
## 14. Calc crash-on-open AV: latent UAF, debug-CRT-deterministic (NOT a migration bug)
**Component:** `sc` view init. **Stock AOO defect**; the source is
byte-identical to upstream.
**Root cause (CONFIRMED):** `ScViewData::ReadUserDataSequence`
([viewdata.cxx:2821](main/sc/source/ui/view/viewdata.cxx#L2821)) does
`delete pTabData[nTab]; pTabData[nTab] = new ScViewDataTable;` per sheet
but **never refreshes `pThisTab`** (which pointed at `pTabData[nTabNo]`).
Back in `ScTabView::SetTabNo`, line 1660 reads `aViewData.GetActivePart()` →
`pThisTab->eWhichActive` **before** line 1663 fixes `pThisTab` →
use-after-free → AV in `sc!ScTabView::SetTabNo` at
`mov ecx,[eax+edx*4+0x664]` with **edx=0xDDDDDDDD** (`pGridWin[0xdddddddd]`).
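
The pattern behind this root cause can be reduced to a few lines. What follows is only a minimal sketch, not the AOO code: `TableView`, `ViewData`, `reloadTables`, `activePart` and `cacheIsFresh` are hypothetical stand-ins for `ScViewDataTable`, `ScViewData`, `ReadUserDataSequence`, `GetActivePart` and a sanity check, and it uses C++11 syntax for brevity. It shows a cached pointer into an array of owned objects, and the single assignment that keeps it valid after the objects are replaced.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for ScViewDataTable.
struct TableView {
    int activePart = 0;  // plays the role of eWhichActive
};

// Hypothetical, simplified stand-in for ScViewData.
class ViewData {
public:
    static const std::size_t kMaxTabs = 4;

    ViewData() {
        for (std::size_t i = 0; i < kMaxTabs; ++i) tabs_[i] = new TableView;
        current_ = tabs_[tabNo_];  // cached pointer, like pThisTab
    }
    ~ViewData() {
        for (std::size_t i = 0; i < kMaxTabs; ++i) delete tabs_[i];
    }
    ViewData(const ViewData&) = delete;
    ViewData& operator=(const ViewData&) = delete;

    // Mirrors the per-sheet "delete, then new" described above.
    void reloadTables() {
        for (std::size_t i = 0; i < kMaxTabs; ++i) {
            delete tabs_[i];
            tabs_[i] = new TableView;
        }
        // Without this line, current_ keeps the address of a deleted
        // object, and a later activePart() call is a use-after-free.
        current_ = tabs_[tabNo_];
    }

    int activePart() const { return current_->activePart; }
    bool cacheIsFresh() const { return current_ == tabs_[tabNo_]; }

private:
    TableView* tabs_[kMaxTabs];
    std::size_t tabNo_ = 0;
    TableView* current_ = nullptr;
};
```

With the refresh in place, `cacheIsFresh()` holds after every `reloadTables()` regardless of which address the allocator returns.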
**Why it shows here but not in normal use, and why "release is fine" is
too strong.**
This is a genuine latent **UAF** (reading through a dangling pointer is UB);
it is *masked* in release by **unspecified allocator behaviour**, not made
correct. The `delete` and `new` are same-thread, adjacent, same size class,
with nothing between them (the `ScViewDataTable` ctor runs *after*
`operator new`). Mainstream allocators keep per-size free lists with MRU/LIFO
reuse, so `new` almost always returns the **just-freed block**. When it does,
`pThisTab == pTabData[nTabNo]` again and points at the *live, freshly
constructed* object, so the read sees valid data rather than stale bytes that
merely look plausible. That is why it has survived ~20 years in the field.
But it is **not guaranteed**, and can surface non-deterministically even
in release:
- Windows **LFH randomizes** allocation slots (Win8+ exploit mitigation),
  so `new` may return a *different* slot. In release the freed block isn't
  poisoned, so the immediate read usually still sees plausible old bytes.
  But the window here is **not** two instructions: the rest of
  `ReadUserDataSequence` parses *all other sheets* (lots of allocation), and
  any of those allocations can reclaim that block and overwrite the
  `eWhichActive` offset with a large value → `pGridWin[garbage]` → a
  **rare, timing-dependent release crash**.
- Hardened/diagnostic allocators (PageHeap, Application Verifier, ASan, a
  custom `operator new`) typically don't do MRU reuse at all, so they crash
  in release builds too.
- The debug CRT (`MSVCR90D`) is just the *deterministic* trigger: it
  poison-fills freed blocks with `0xDD` and **delays** reuse, so `new`
  returns a different block every time.
cdb proof (`.frame 0; dv /t` + `?? &this->aViewData`): `eOldActive =
0xDDDDDDDD`, but `this`/`pDoc`/`pViewShell` are all valid, and crucially
`pThisTab = 0x09a5f118` (old/low region, freed) **≠**
`pTabData[nTabNo=0] = 0x13deca08` (fresh/valid). The new table object is
fine; only the stale pointer dangles.
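
Whether `new` hands back the just-freed address can be measured directly on a given toolchain. The snippet below is a small sketch for that purpose only; `Block` and `countImmediateReuse` are hypothetical names, and the result is deliberately not predicted here because it depends on the allocator, the build type, and any diagnostic heap settings in effect. The address is saved as an integer before `delete`, so the probe itself never touches a dangling pointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical payload; any fixed-size type serves for this probe.
struct Block {
    int payload[16];
};

// Counts how often `new` returns the address that was deleted
// immediately before it. The result is allocator-specific.
int countImmediateReuse(int rounds) {
    assert(rounds >= 0);
    int reused = 0;
    for (int i = 0; i < rounds; ++i) {
        Block* p = new Block();
        // Record the address as an integer while p is still valid.
        const std::uintptr_t before = reinterpret_cast<std::uintptr_t>(p);
        delete p;
        Block* q = new Block();
        if (reinterpret_cast<std::uintptr_t>(q) == before) ++reused;
        delete q;
    }
    return reused;
}
```

A count equal to `rounds` would mean the allocator is currently masking this class of bug; anything lower means a stale pointer like `pThisTab` would sometimes miss the new object.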
It's a latent UAF whose release manifestation is rare and
non-deterministic (the worst kind: invisible in testing, occasionally
lethal in the field), so the 1-line upstream fix is warranted:
`pThisTab = pTabData[nTabNo];` at the end of `ReadUserDataSequence`
([viewdata.cxx:2955-2960](main/sc/source/ui/view/viewdata.cxx#L2955)),
mirroring `ScViewData::SetTabNo` line 1502. It removes the dependence on
allocator luck entirely.

**Triage rule:** a debug-build `0xDD`/`0xFEEE` AV is a *migration* bug only
if an upstream *failure* feeds it (a throw, a missing staged file/data, as in
§1); if every object is valid and only one pointer dangles across a
`delete`/`new`, it's a debug-CRT-exposed latent UAF. Reproducing in release
tells you whether the allocator is masking it; a *clean* release run is
**not** proof of correctness, only that MRU reuse held that time.
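
As an aid for the first step of that triage, recognising the fill value in a register or local, here is a small lookup sketch. `classifyFill` is a hypothetical helper, not part of any AOO or Microsoft API; the patterns are the well-known fill values of the MSVC debug CRT, the Windows debug heap and MSVC runtime checks. It says only where a value came from, not whether the bug is a migration bug.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Maps a 32-bit value seen in a crash dump to the fill pattern it matches.
std::string classifyFill(std::uint32_t v) {
    switch (v) {
        case 0xDDDDDDDDu: return "debug CRT: freed heap memory";
        case 0xCDCDCDCDu: return "debug CRT: allocated, uninitialized heap memory";
        case 0xFDFDFDFDu: return "debug CRT: guard bytes around a heap block";
        case 0xFEEEFEEEu: return "Windows debug heap: freed memory";
        case 0xCCCCCCCCu: return "MSVC /RTC: uninitialized stack memory";
        default:          return "no known fill pattern";
    }
}
```

For the crash above, `classifyFill(0xDDDDDDDD)` reports freed heap memory, while a value such as `0x13deca08` matches no pattern and is treated as ordinary data.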
all the best
peter