On Tue, Mar 17, 2026 at 7:49 PM Nathan Bossart <[email protected]>
wrote:
> On Sat, Mar 14, 2026 at 11:43:38PM +0100, KAZAR Ayoub wrote:
> > Just a small concern about where some varlenas have a larger binary size
> > than its text representation ex:
> > SELECT pg_column_size(to_tsvector('SIMD is GOOD'));
> > pg_column_size
> > ----------------
> > 32
> >
> > its text representation is less than sizeof(Vector8) so currently v3 would
> > enter SIMD path and exit out just from the beginning (two extra branches)
> > because it does this:
> > + if (TupleDescAttr(tup_desc, attnum - 1)->attlen == -1 &&
> > + VARSIZE_ANY_EXHDR(DatumGetPointer(value)) > sizeof(Vector8))
> >
> > I thought maybe we could do * 2 or * 4 its binary size, depends on the type
> > really but this is just a proposition if this case is something concerning.
>
> Can we measure the impact of this? How likely is this case?
>
I'll respond to this separately in a different email.
>
> > +static pg_attribute_always_inline void CopyAttributeOutText(CopyToState cstate, const char *string,
> > +                                                             bool use_simd, size_t len);
> > +static pg_attribute_always_inline void CopyAttributeOutCSV(CopyToState cstate, const char *string,
> > +                                                            bool use_quote, bool use_simd, size_t len);
>
> Can you test this on its own, too? We might be able to separate this and
> the change below into a prerequisite patch, assuming they show benefits.
>
I tested inlining alone and found an improvement of about 1% to 4% across
all configurations.
The inlining is only meaningful in combination with the SIMD work, for the
reason described below.
>
> > if (is_csv)
> > - CopyAttributeOutCSV(cstate, string,
> > -                     cstate->opts.force_quote_flags[attnum - 1]);
> > + {
> > + if (use_simd)
> > + CopyAttributeOutCSV(cstate, string,
> > +                     cstate->opts.force_quote_flags[attnum - 1],
> > +                     true, len);
> > + else
> > + CopyAttributeOutCSV(cstate, string,
> > +                     cstate->opts.force_quote_flags[attnum - 1],
> > +                     false, len);
>
> There isn't a terrible amount of branching on use_simd in these functions,
> so I'm a little skeptical this makes much difference. As above, it would
> be good to measure it
I compiled three variants:

v3: use_simd passed as a compile-time constant, CopyAttribute functions
inlined.
v3_variable (v3_var in the tables): use_simd passed as a runtime variable,
CopyAttribute functions inlined.
v3_variable_noinline (v3_var_noinl in the tables): use_simd passed as a
runtime variable, CopyAttribute functions not inlined.

None of the helpers are explicitly marked inline by us.
The assembly reveals two things:
1) The CSV SIMD helpers (CopyCheckCSVQuoteNeedSIMD, CopySkipCSVEscapeSIMD)
are inlined by the compiler on its own in all three variants, whereas
CopySkipTextSIMD is never inlined by the compiler in any variant.
2) The constant-passing approach (v3) does matter (apparently just a
little), specifically for CopySkipTextSIMD.
It's the same story as the first commit of the COPY FROM patch: the
compiler just emits code without the use_simd branch:
    jbe  ...                  ; len > sizeof(Vector8)
    je   ...                  ; need_transcoding
    call CopySkipTextSIMD
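
To illustrate why passing a literal can drop the use_simd test, here is a
minimal standalone sketch. It is not the patch code: count_escapes,
skip_clean_prefix, needs_escape, VECTOR_WIDTH and sketch_always_inline are
hypothetical stand-ins (the last one for pg_attribute_always_inline, using
the GCC/Clang always_inline attribute), and the "SIMD" helper is plain
scalar code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VECTOR_WIDTH 16		/* hypothetical stand-in for sizeof(Vector8) */

#if defined(__GNUC__) || defined(__clang__)
#define sketch_always_inline inline __attribute__((always_inline))
#else
#define sketch_always_inline inline
#endif

/* Characters that would need escaping in this toy text format. */
static bool
needs_escape(char c)
{
	return c == '\\' || c == '\n' || c == '\t';
}

/*
 * Hypothetical stand-in for a SIMD skip helper: returns the length of the
 * leading run that contains no character needing an escape.  Scalar here.
 */
static size_t
skip_clean_prefix(const char *s, size_t len)
{
	size_t		i = 0;

	while (i < len && !needs_escape(s[i]))
		i++;
	return i;
}

/*
 * Counts characters needing an escape.  Because this is always inlined,
 * a caller passing a literal true/false lets the compiler fold the
 * use_simd test away in that copy of the body.
 */
static sketch_always_inline size_t
count_escapes(const char *s, size_t len, bool use_simd)
{
	size_t		i = 0;
	size_t		n = 0;

	assert(s != NULL);
	if (use_simd && len > VECTOR_WIDTH)
		i = skip_clean_prefix(s, len);
	for (; i < len; i++)
		if (needs_escape(s[i]))
			n++;
	return n;
}

/* Call site in the style of v3: one branch, then literals only. */
static size_t
count_escapes_dispatch(const char *s, size_t len, bool use_simd)
{
	if (use_simd)
		return count_escapes(s, len, true);
	else
		return count_escapes(s, len, false);
}
```

In the v3_var style the call site would instead be a single
count_escapes(s, len, use_simd), leaving the test on use_simd inside the
inlined body.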
Whether the extra branching needed for constant passing is worth it is
shown by the benchmark below.

Test                  Master   v3      v3_var  v3_var_noinl
TEXT clean            1504ms   -24.1%  -23.0%  -21.5%
CSV clean             1760ms   -34.9%  -32.7%  -33.0%
TEXT 1/3 backslashes  3763ms   +4.6%   +6.9%   +4.1%
CSV 1/3 quotes        3885ms   +3.1%   +2.7%   -0.8%

Wide table TEXT (integer columns):

Cols  Master   v3     v3_var  v3_var_noinl
50    2083ms   -0.7%  -0.6%   +3.5%
100   4094ms   -0.1%  -0.5%   +4.5%
200   1560ms   +0.6%  -2.3%   +3.2%
500   1905ms   -1.0%  -1.3%   +4.7%
1000  1455ms   +1.8%  +0.4%   +4.3%

Wide table CSV:

Cols  Master   v3     v3_var  v3_var_noinl
50    2421ms   +4.0%  +6.7%   +5.8%
100   4980ms   +0.1%  +2.0%   +0.1%
200   1901ms   +1.4%  +3.5%   +1.4%
500   2328ms   +1.8%  +2.7%   +2.2%
1000  1815ms   +2.0%  +2.8%   +2.5%
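
For anyone who prefers absolute timings: assuming each percentage is
relative to the Master time in the same row (negative meaning faster), it
can be converted back with a small helper like the hypothetical
delta_to_ms below. For example, TEXT clean on v3 works out to roughly
1504 * (1 - 0.241), i.e. about 1142 ms.

```c
#include <assert.h>

/*
 * Hypothetical helper: convert a relative delta (in percent, negative
 * meaning faster than master) back into milliseconds.  Assumes the delta
 * is relative to master_ms.
 */
static double
delta_to_ms(double master_ms, double delta_pct)
{
	assert(master_ms >= 0.0);
	return master_ms * (1.0 + delta_pct / 100.0);
}
```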

I'm not sure there is a practical difference between v3 and v3_var.
What do you think?

Regards,
Ayoub