Re: Learning EXPLAIN

2017-12-08 Thread Guillaume Lelarge
Hi,

2017-12-08 2:12 GMT+01:00 Flávio Henrique :

> Hi experts!
>
> I read this nice article about Understanding EXPLAIN [1] weeks ago that
> opened my mind about the tool, but it seems no enough to explain a lot of
> plans that I see in this list.
>

Thanks.

I often read responses to a plan that are not covered by the article.
>
> I need/want to know EXPLAIN better.
>
> Can you kindly advise me a good reading about advanced EXPLAIN?
>
>
There's not much out there. This document was written after reading this
list, viewing some talks (you may find a lot of them on youtube), and
reading the code.

I intend to update this document, since I learned quite more since 2012.
Though I didn't find the time yet :-/

Anyway, thanks.


-- 
Guillaume.


Re: Learning EXPLAIN

2017-12-08 Thread Flavio Henrique Araque Gurgel
Em sex, 8 de dez de 2017 às 14:20, Guillaume Lelarge 
escreveu:

> Hi,
>
> 2017-12-08 2:12 GMT+01:00 Flávio Henrique :
>
>> Hi experts!
>>
>> I read this nice article about Understanding EXPLAIN [1] weeks ago that
>> opened my mind about the tool, but it seems no enough to explain a lot of
>> plans that I see in this list.
>>
>
> Thanks.
>
> I often read responses to a plan that are not covered by the article.
>>
>> I need/want to know EXPLAIN better.
>>
>> Can you kindly advise me a good reading about advanced EXPLAIN?
>>
>>
> There's not much out there. This document was written after reading this
> list, viewing some talks (you may find a lot of them on youtube), and
> reading the code.
>
> I intend to update this document, since I learned quite more since 2012.
> Though I didn't find the time yet :-/
>
> Anyway, thanks.
>
>
Hello all

I would like to make clear that there are two "Flavio Henrique" on the
lists, me beeing one of them, I'd like to say that I'm not the OP.
A bit off-topic anyway, thanks for understanding.

Flavio Gurgel


Re: Learning EXPLAIN

2017-12-08 Thread Gustavo Velasquez
Dude,

You can rest assured that at least the Brazilians members will always know
based on your last name you are not the same :-).

What's the point of explaining that anyways? Got curious.

As to what pertains to the topic:

This is another simple yet effective doc:

https://wiki.postgresql.org/images/4/45/Explaining_EXPLAIN.pdf

Execution plans are tricky and reading them accurately to make good
decisions require a lot of experience and awareness of the situation. It
does not only require that you know how to read the tool itself but also
know how the DB and schemas have been designed, if stats are up to date,
how tables are populated, frequency and type of queries, adequate indexing
in place, the hardware it sits on, etc.

It's a mix of science, broaden knowledge, perspicacity, and why not
say, it's an art.

Have a great weekend.

___
Gustavo Velasquez
+1 (256) 653-9725


On Fri, Dec 8, 2017 at 7:32 AM, Flavio Henrique Araque Gurgel <
[email protected]> wrote:

>
> Em sex, 8 de dez de 2017 às 14:20, Guillaume Lelarge <
> [email protected]> escreveu:
>
>> Hi,
>>
>> 2017-12-08 2:12 GMT+01:00 Flávio Henrique :
>>
>>> Hi experts!
>>>
>>> I read this nice article about Understanding EXPLAIN [1] weeks ago that
>>> opened my mind about the tool, but it seems no enough to explain a lot of
>>> plans that I see in this list.
>>>
>>
>> Thanks.
>>
>> I often read responses to a plan that are not covered by the article.
>>>
>>> I need/want to know EXPLAIN better.
>>>
>>> Can you kindly advise me a good reading about advanced EXPLAIN?
>>>
>>>
>> There's not much out there. This document was written after reading this
>> list, viewing some talks (you may find a lot of them on youtube), and
>> reading the code.
>>
>> I intend to update this document, since I learned quite more since 2012.
>> Though I didn't find the time yet :-/
>>
>> Anyway, thanks.
>>
>>
> Hello all
>
> I would like to make clear that there are two "Flavio Henrique" on the
> lists, me beeing one of them, I'd like to say that I'm not the OP.
> A bit off-topic anyway, thanks for understanding.
>
> Flavio Gurgel
>
>


Re: Table with large number of int columns, very slow COPY FROM

2017-12-08 Thread Andres Freund
Hi,

On 2017-12-07 20:21:45 -0800, Alex Tokarev wrote:
> I have a set of tables with fairly large number of columns, mostly int with
> a few bigints and short char/varchar columns. I¹ve noticed that Postgres is
> pretty slow at inserting data in such a table. I tried to tune every
> possible setting: using unlogged tables, increased shared_buffers, etc; even
> placed the db cluster on ramfs and turned fsync off. The results are pretty
> much the same with the exception of using unlogged tables that improves
> performance just a little bit.

> I have made a minimally reproducible test case consisting of a table with
> 848 columns, inserting partial dataset of 100,000 rows with 240 columns. On
> my dev VM the COPY FROM operation takes just shy of 3 seconds to complete,
> which is entirely unexpected for such a small dataset.

I don't find this to be this absurdly slow. On my laptop loading with a
development checkout this takes 1223.950 ms. This is 20mio fields
parsed/sec, rows with 69mio fields/sec inserted.  Removing the TRUNCATE
and running the COPYs concurrently scales well to a few clients, and
only stops because my laptop's SSD stops being able to keep up.


That said, I do think there's a few places that could stand some
improvement. Locally the profile shows up as:
+   15.38%  postgres  libc-2.25.so[.] __GI_strtoll_l_internal
+   11.79%  postgres  postgres[.] heap_fill_tuple
+8.00%  postgres  postgres[.] CopyFrom
+7.40%  postgres  postgres[.] CopyReadLine
+6.79%  postgres  postgres[.] ExecConstraints
+6.68%  postgres  postgres[.] NextCopyFromRawFields
+6.36%  postgres  postgres[.] heap_compute_data_size
+6.02%  postgres  postgres[.] pg_atoi

the strtoll is libc functionality triggered by pg_atoi(), something I've
seen show up in numerous profiles. I think it's probably time to have
our own optimized version of it rather than relying on libcs.

That heap_fill_tuple(), which basically builds a tuple from the parsed
datums, takes time somewhat proportional to the number of columns in the
table seems hard to avoid, especially because this isn't something we
want to optimize for with the price of making more common workloads with
fewer columns slower. But there seems quite some micro-optimization
potential.

That ExecConstraints() shows up seems unsurprising, it has to walk
through all the table's columns checking for constraints. We could
easily optimize this so we have a separate datastructure listing
constraints, but that'd be slower in the very common case of more
reasonable numbers of columns.

The copy implementation deserves some optimization too...

> Here¹s a tarball with test schema and data:
> http://nohuhu.org/copy_perf.tar.bz2; it¹s 338k compressed but expands to
> ~50mb. Here¹s the result of profiling session with perf:
> https://pastebin.com/pjv7JqxD

Thanks!

Greetings,

Andres Freund



Faster str to int conversion (was Table with large number of int columns, very slow COPY FROM)

2017-12-08 Thread Andres Freund
Hi,


On 2017-12-08 10:17:34 -0800, Andres Freund wrote:
> the strtoll is libc functionality triggered by pg_atoi(), something I've
> seen show up in numerous profiles. I think it's probably time to have
> our own optimized version of it rather than relying on libcs.

Attached is a hand-rolled version. After quickly hacking up one from
scratch, I noticed we already kind of have one for int64 (scanint8), so
I changed the structure of this one to be relatively similar.

It's currently using the overflow logic from [1], but that's not
fundamentally required, we could rely on fwrapv for this one too.

This one improves performance of the submitted workload from 1223.950ms
to 1020.640ms (best of three). The profile's shape changes quite
noticeably:

master:
+   15.38%  postgres  libc-2.25.so  [.] __GI_strtoll_l_internal
+   11.79%  postgres  postgres  [.] heap_fill_tuple
+8.00%  postgres  postgres  [.] CopyFrom
+7.40%  postgres  postgres  [.] CopyReadLine
+6.79%  postgres  postgres  [.] ExecConstraints
+6.68%  postgres  postgres  [.] NextCopyFromRawFields
+6.36%  postgres  postgres  [.] heap_compute_data_size
+6.02%  postgres  postgres  [.] pg_atoi
patch:
+   13.70%  postgres  postgres  [.] heap_fill_tuple
+   10.46%  postgres  postgres  [.] CopyFrom
+9.31%  postgres  postgres  [.] pg_strto32
+8.39%  postgres  postgres  [.] CopyReadLine
+7.88%  postgres  postgres  [.] ExecConstraints
+7.63%  postgres  postgres  [.] InputFunctionCall
+7.41%  postgres  postgres  [.] heap_compute_data_size
+7.21%  postgres  postgres  [.] pg_verify_mbstr
+5.49%  postgres  postgres  [.] NextCopyFromRawFields


This probably isn't going to resolve Alex's performance concerns
meaningfully, but seems quite worthwhile to do anyway.

We probably should have int8/16/64 version coded just as use the 32bit
version, but I decided to leave that out for now. Primarily interested
in comments.  Wonder a bit whether it's worth providing an 'errorOk'
mode like scanint8 does, but surveying its callers suggests we should
rather change them to not need it...

Greetings,

Andres Freund

[1] 
http://archives.postgresql.org/message-id/20171030112751.mukkriz2rur2qkxc%40alap3.anarazel.de
>From 98fbe53be0a3046f8ace687f846f91a0043deee8 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Sun, 29 Oct 2017 22:13:54 -0700
Subject: [PATCH 1/3] Provide overflow safe integer math inline functions.

Author: Andres Freund, with some code stolen from Greg Stark
Reviewed-By:
Discussion: https://postgr.es/m/
Backpatch:
---
 config/c-compiler.m4  |  22 
 configure |  33 ++
 configure.in  |   4 +
 src/include/common/int.h  | 229 ++
 src/include/pg_config.h.in|   3 +
 src/include/pg_config.h.win32 |   3 +
 6 files changed, 294 insertions(+)
 create mode 100644 src/include/common/int.h

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 6dcc7906491..0d91e52a28f 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -296,6 +296,28 @@ fi])# PGAC_C_BUILTIN_CONSTANT_P
 
 
 
+# PGAC_C_BUILTIN_OP_OVERFLOW
+# -
+# Check if the C compiler understands __builtin_$op_overflow(),
+# and define HAVE__BUILTIN_OP_OVERFLOW if so.
+#
+# Check for the most complicated case, 64 bit multiplication, as a
+# proxy for all of the operations.
+AC_DEFUN([PGAC_C_BUILTIN_OP_OVERFLOW],
+[AC_CACHE_CHECK(for __builtin_mul_overflow, pgac_cv__builtin_op_overflow,
+[AC_COMPILE_IFELSE([AC_LANG_PROGRAM([],
+[PG_INT64_TYPE result;
+__builtin_mul_overflow((PG_INT64_TYPE) 1, (PG_INT64_TYPE) 2, &result);]
+)],
+[pgac_cv__builtin_op_overflow=yes],
+[pgac_cv__builtin_op_overflow=no])])
+if test x"$pgac_cv__builtin_op_overflow" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_OP_OVERFLOW, 1,
+  [Define to 1 if your compiler understands __builtin_$op_overflow.])
+fi])# PGAC_C_BUILTIN_OP_OVERFLOW
+
+
+
 # PGAC_C_BUILTIN_UNREACHABLE
 # --
 # Check if the C compiler understands __builtin_unreachable(),
diff --git a/configure b/configure
index 4ecd2e19224..f66899488cc 100755
--- a/configure
+++ b/configure
@@ -14467,6 +14467,39 @@ esac
 
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_mul_overflow" >&5
+$as_echo_n "checking for __builtin_mul_overflow... " >&6; }
+if ${pgac_cv__builtin_op_overflow+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+PG_INT64_TYPE result;
+__builtin_mul_overflow((PG_INT64_TYPE) 1, (PG_INT64_TYPE) 2, &result);
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv__builtin_op_overflow=yes
+else
+  pgac_cv__builtin_op_overflow=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{

Re: Learning EXPLAIN

2017-12-08 Thread Sam Gendler
What about the many-part explanation posted on the blog that accompanies
explain.depesz.com.  Here is the first installment. I seem to remember that
there are 5 or 6 installments.

https://www.depesz.com/2013/04/16/explaining-the-unexplainable/

On Fri, Dec 8, 2017 at 8:44 AM, Gustavo Velasquez 
wrote:

> Dude,
>
> You can rest assured that at least the Brazilians members will always know
> based on your last name you are not the same :-).
>
> What's the point of explaining that anyways? Got curious.
>
> As to what pertains to the topic:
>
> This is another simple yet effective doc:
>
> https://wiki.postgresql.org/images/4/45/Explaining_EXPLAIN.pdf
>
> Execution plans are tricky and reading them accurately to make good
> decisions require a lot of experience and awareness of the situation. It
> does not only require that you know how to read the tool itself but also
> know how the DB and schemas have been designed, if stats are up to date,
> how tables are populated, frequency and type of queries, adequate indexing
> in place, the hardware it sits on, etc.
>
> It's a mix of science, broaden knowledge, perspicacity, and why not
> say, it's an art.
>
> Have a great weekend.
>
> 
> ___
> Gustavo Velasquez
> +1 (256) 653-9725 <(256)%20653-9725>
>
>
> On Fri, Dec 8, 2017 at 7:32 AM, Flavio Henrique Araque Gurgel <
> [email protected]> wrote:
>
>>
>> Em sex, 8 de dez de 2017 às 14:20, Guillaume Lelarge <
>> [email protected]> escreveu:
>>
>>> Hi,
>>>
>>> 2017-12-08 2:12 GMT+01:00 Flávio Henrique :
>>>
 Hi experts!

 I read this nice article about Understanding EXPLAIN [1] weeks ago that
 opened my mind about the tool, but it seems no enough to explain a lot of
 plans that I see in this list.

>>>
>>> Thanks.
>>>
>>> I often read responses to a plan that are not covered by the article.

 I need/want to know EXPLAIN better.

 Can you kindly advise me a good reading about advanced EXPLAIN?


>>> There's not much out there. This document was written after reading this
>>> list, viewing some talks (you may find a lot of them on youtube), and
>>> reading the code.
>>>
>>> I intend to update this document, since I learned quite more since 2012.
>>> Though I didn't find the time yet :-/
>>>
>>> Anyway, thanks.
>>>
>>>
>> Hello all
>>
>> I would like to make clear that there are two "Flavio Henrique" on the
>> lists, me beeing one of them, I'd like to say that I'm not the OP.
>> A bit off-topic anyway, thanks for understanding.
>>
>> Flavio Gurgel
>>
>>
>


Re: Learning EXPLAIN

2017-12-08 Thread Sam Gendler
Ah, I see now that the article you linked to in the OP is written by the
same author and is maybe the very same content. If so, that sure seems
pretty comprehensive to me, though I've also been reading this list, off
and on, for many years, which has probably filled a lot of the gaps left by
the original blog posts.

On Fri, Dec 8, 2017 at 8:47 PM, Sam Gendler 
wrote:

> What about the many-part explanation posted on the blog that accompanies
> explain.depesz.com.  Here is the first installment. I seem to remember
> that there are 5 or 6 installments.
>
> https://www.depesz.com/2013/04/16/explaining-the-unexplainable/
>
> On Fri, Dec 8, 2017 at 8:44 AM, Gustavo Velasquez  > wrote:
>
>> Dude,
>>
>> You can rest assured that at least the Brazilians members will always
>> know based on your last name you are not the same :-).
>>
>> What's the point of explaining that anyways? Got curious.
>>
>> As to what pertains to the topic:
>>
>> This is another simple yet effective doc:
>>
>> https://wiki.postgresql.org/images/4/45/Explaining_EXPLAIN.pdf
>>
>> Execution plans are tricky and reading them accurately to make good
>> decisions require a lot of experience and awareness of the situation. It
>> does not only require that you know how to read the tool itself but also
>> know how the DB and schemas have been designed, if stats are up to date,
>> how tables are populated, frequency and type of queries, adequate indexing
>> in place, the hardware it sits on, etc.
>>
>> It's a mix of science, broaden knowledge, perspicacity, and why not
>> say, it's an art.
>>
>> Have a great weekend.
>>
>> 
>> ___
>> Gustavo Velasquez
>> +1 (256) 653-9725 <(256)%20653-9725>
>>
>>
>> On Fri, Dec 8, 2017 at 7:32 AM, Flavio Henrique Araque Gurgel <
>> [email protected]> wrote:
>>
>>>
>>> Em sex, 8 de dez de 2017 às 14:20, Guillaume Lelarge <
>>> [email protected]> escreveu:
>>>
 Hi,

 2017-12-08 2:12 GMT+01:00 Flávio Henrique :

> Hi experts!
>
> I read this nice article about Understanding EXPLAIN [1] weeks ago
> that opened my mind about the tool, but it seems no enough to explain a 
> lot
> of plans that I see in this list.
>

 Thanks.

 I often read responses to a plan that are not covered by the article.
>
> I need/want to know EXPLAIN better.
>
> Can you kindly advise me a good reading about advanced EXPLAIN?
>
>
 There's not much out there. This document was written after reading
 this list, viewing some talks (you may find a lot of them on youtube), and
 reading the code.

 I intend to update this document, since I learned quite more since
 2012. Though I didn't find the time yet :-/

 Anyway, thanks.


>>> Hello all
>>>
>>> I would like to make clear that there are two "Flavio Henrique" on the
>>> lists, me beeing one of them, I'd like to say that I'm not the OP.
>>> A bit off-topic anyway, thanks for understanding.
>>>
>>> Flavio Gurgel
>>>
>>>
>>
>