And how much of the L1 instruction cache? Intel's Pentium, Pentium Pro, Pentium III, Celeron 1.7 GHz each have only 8 KB of it. Make one call to snprintfv, and the cache is emptied. AMD64 has 64 KB of it; it's a bit better.
Let's talk L2 cache. I doubt glibc has a smaller printf, and anyway I wouldn't expect my L1 cache to hold much more than the hot loop the program is currently running.
Paolo