Oracle number to PostgreSQL

2019-05-01 Thread Charlin Barak
Hi,
When migrating from Oracle number(10,3) to PostgreSQL, is numeric(10,3)
ideal or should I consider some other data types?

Thanks.

Charlin


Re: Oracle number to PostgreSQL

2019-05-01 Thread Stephen Frost
Greetings,

* Charlin Barak (charlinba...@gmail.com) wrote:
> When migrating from Oracle number(10,3) to PostgreSQL, is numeric(10,3)
> ideal or should I consider some other data types?

This really depends on what data is actually in that field and if you
need it to be exact.

If the field actually contains integers and always will, then using an
integer type field is *much* more efficient, either 'integer' if the
values will always be less than 2B, or 'bigint' if it'll be larger.

If the field actually contains floating point and can be inexact, then
using 'real' or 'float8' will be more efficient than numeric.

If the field needs to be exact (eg: monetary fields), then using
numeric(10,3) is correct.

I strongly recommend considering the data that goes into the field when
making this decision - there really is no 'one size fits all' when going
from number to numeric.
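
For illustration, a minimal sketch with a hypothetical table; the
exactness difference is the key point:

CREATE TABLE migrated (
    qty    bigint,         -- if the values are really integers
    ratio  float8,         -- if inexact floating point is acceptable
    amount numeric(10,3)   -- if exact decimal semantics are required
);

-- numeric is exact, float8 is not:
SELECT 0.1::numeric(10,3) + 0.2::numeric(10,3);  -- exactly 0.300
SELECT 0.1::float8 + 0.2::float8;                -- 0.30000000000000004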

Thanks,

Stephen




Re: Oracle number to PostgreSQL

2019-05-01 Thread Charlin Barak
Thanks Stephen. Something for me to consider.

Charlin



Query not producing expected result

2019-05-01 Thread Chuck Martin
I need help figuring out why a query is not returning the records I expect
it to. I'm searching on a DateTime column (timestamp without time zone -
not nullable). The query includes:

 AND event.Primaryresp_fkey = 511 AND event.DateTime <= 'May-1-2019'
 AND event.EventDone < 1

This does not return a record that has a DateTime value "May 1, 2019 9:52
AM". If I modify the query to:

 AND event.Primaryresp_fkey = 511 AND event.DateTime <= 'May-2-2019'
 AND event.EventDone < 1

it does return the record in question. I assume I am not doing the data
comparison correctly, but don't know why.

Chuck Martin
Avondale Software


Re: Query not producing expected result

2019-05-01 Thread Julien Rouhaud
On Wed, May 1, 2019 at 6:27 PM Chuck Martin  wrote:
>
> I need help figuring out why a query is not returning the records I expect it 
> to. I'm searching on a DateTime column (timestamp without time zone - not 
> nullable). The query includes:
>
>  AND event.Primaryresp_fkey = 511 AND event.DateTime <= 'May-1-2019'  AND 
> event.EventDone < 1
>
> This does not return a record that has a DateTime value "May 1, 2019 9:52 
> AM". If I modify the query to:
>
>  AND event.Primaryresp_fkey = 511 AND event.DateTime <= 'May-2-2019'  AND 
> event.EventDone < 1
>
> it does return the record in question. I assume I am not doing the data 
> comparison correctly, but don't know why.

That's because it's using a timestamp comparison, not a date
comparison.  So 'May-1-2019' is cast to timestamp, which gives
2019-05-01 00:00:00.

If you want a comparison based on date instead of timestamp, you have
to ask for it explicitly.  For instance: event.DateTime::date <=
'May-1-2019'
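
A quick way to see the implicit cast at work (a sketch; the bare date
literal folds to midnight):

select 'May-1-2019'::timestamp;
-- 2019-05-01 00:00:00

select '2019-05-01 09:52'::timestamp <= 'May-1-2019'::timestamp;   -- false
select '2019-05-01 09:52'::timestamp::date <= 'May-1-2019'::date;  -- true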




Re: Query not producing expected result

2019-05-01 Thread David G. Johnston
On Wed, May 1, 2019 at 9:27 AM Chuck Martin 
wrote:

> I need help figuring out why a query is not returning the records I expect
> it to. I'm searching on a DateTime column (timestamp without time zone -
> not nullable). The query includes:
>
>  AND event.Primaryresp_fkey = 511 AND event.DateTime <= 'May-1-2019'
>  AND event.EventDone < 1
>
> This does not return a record that has a DateTime value "May 1, 2019 9:52
> AM". If I modify the query to:
>
>  AND event.Primaryresp_fkey = 511 AND event.DateTime <= 'May-2-2019'
>  AND event.EventDone < 1
>
> it does return the record in question. I assume I am not doing the data
> comparison correctly, but don't know why.
>
>
The field is a timestamp - which means there is a time involved.  If you
don't specify one explicitly that time is going to be midnight.  9:52AM on
the 1st is after midnight on the 1st so the first query doesn't return the
9:52AM record.

IOW, the DateTime field remains as-is and the comparator is turned into a
timestamp without time zone.

David J.


Re: Query not producing expected result

2019-05-01 Thread Tom Lane
Chuck Martin  writes:
> I need help figuring out why a query is not returning the records I expect
> it to. I'm searching on a DateTime column (timestamp without time zone -
> not nullable). The query includes:

>  AND event.Primaryresp_fkey = 511 AND event.DateTime <= 'May-1-2019'
>  AND event.EventDone < 1

> This does not return a record that has a DateTime value "May 1, 2019 9:52
> AM".

Well, no, since the implied value of the constant is 'May-1-2019 00:00'.

If you only want 1-day precision of the comparison, maybe you should cast
or truncate the timestamp down to date.
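
Both spellings of the day-precision comparison, as a sketch (note the
cast applies to the column, which can matter for index use, as
discussed downthread):

 AND event.DateTime::date <= '2019-05-01'
 -- or, equivalently:
 AND date_trunc('day', event.DateTime) <= '2019-05-01 00:00:00'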

regards, tom lane




Re: Query not producing expected result

2019-05-01 Thread Chuck Martin
Thanks, guys. It should have been obvious to me, but wasn't.

I found the correct result was returned with either

AND event.DateTime <= 'May-1-2019 24:00'

or

AND event.DateTime::date <= 'May-1-2019'

The latter seems best.

Chuck Martin
Avondale Software



Re: Query not producing expected result

2019-05-01 Thread Ron

On 5/1/19 11:39 AM, Julien Rouhaud wrote:

> If you want a comparison based on date instead of timestamp, you have
> to ask for it explicitly.  For instance: event.DateTime::date <=
> 'May-1-2019'

Wouldn't that stop the query planner from using any index on event.DateTime?

--
Angular momentum makes the world go 'round.


Re: Query not producing expected result

2019-05-01 Thread Francisco Olarte
Chuck:

On Wed, May 1, 2019 at 6:56 PM Chuck Martin  wrote:
> Thanks, guys. It should have been obvious to me, but wasn't.
> I found the correct result was returned with either
> AND event.DateTime <= 'May-1-2019 24:00'
> or
> AND event.DateTime::date <= 'May-1-2019'
> The latter seems best.

The latter may prevent index usage, if you've got one.

One thing I've said before: dates are integer-like ( they are
countable ), but timestamps are real-like ( they may be countable due
to finite precision, like floats or doubles are, but you should not
count on it ). For real-like stuff it is normally better to work with
half-open ranges, which in your case would translate to querying for

   event.DateTime < 'May-2-2019'  ( the minus-infinity <= DateTime
would be for the other end ).

The reason is you can cover the whole DateTime domain with
non-intersecting half-open ranges, but not with open or closed ones
and, as a side effect, the end point of one range is the same as the
starting point of the next ( also, this does not need a cast, which is
better for the optimizer ) ( If your input is an end date I normally
pass this to timestamp using ts < '$inclusive_end_date'::date + 1 ) (
I do lots of reports of this kind, and using this helps a lot once you
get the hang of it after a couple of tests ).

( I use half-open for dates too, for uniformity, and for being able to
use FEBRUARY = dateColumn >= 'YEAR-02-01' and dateColumn <
'YEAR-03-01', no need to worry about leap years or remembering how
many days each month has. Generally they are easier, the only con I've
found is inability to use between ).
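
A sketch of the half-open rewrite against the query from upthread
(assuming the same event table); the cast and the +1 fold to a
constant, so an index on event.DateTime remains usable:

 AND event.Primaryresp_fkey = 511
 AND event.DateTime < 'May-1-2019'::date + 1   -- i.e. < 2019-05-02 00:00:00
 AND event.EventDone < 1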

Francisco Olarte.




Re: Query not producing expected result

2019-05-01 Thread Chuck Martin
Thanks for the extra information. It is indeed an indexed column. I'll have
to think some more about how to address this in a general way, as this
issue can come up all over. I suppose using

AND datetime <= 'May 1, 2019 24:00'

would produce the same as

AND datetime < 'May 2, 2019'

wouldn't it? I'm not sure one is easier to implement than the other.

Chuck Martin
Avondale Software




Re: Query not producing expected result

2019-05-01 Thread Adrian Klaver

On 5/1/19 10:15 AM, Francisco Olarte wrote:

> ( I use half-open for dates too, for uniformity, and for being able to
> use FEBRUARY = dateColumn >= 'YEAR-02-01' and dateColumn <
> 'YEAR-03-01', no need to worry about leap years or remembering how
> many days each month has. Generally they are easier, the only con I've
> found is inability to use between ).


Would daterange help/simplify?:

create table dt_test(id integer, dt_fld date);

insert into dt_test values (1, '2019-02-03'), (2, '2019-02-26'), (3, 
'2019-03-01');


select dt_fld from dt_test where  dt_fld  <@  daterange('2019-02-01',
'2019-03-01');

   dt_fld
------------
 2019-02-03
 2019-02-26








--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Query not producing expected result

2019-05-01 Thread Chuck Martin
Something like daterange would be a solution in some circumstances, but
this query is a user-generated one, and they don't have that much control
over the query. It has to be modified as needed behind the scenes so that
it produces the results they expect. In this instance, I'm now (given the
advice received here) inclined to check the value entered when searching
for a date, and if no time is entered, add '24:00' to the date.

Chuck Martin
Avondale Software




Re: Query not producing expected result

2019-05-01 Thread Francisco Olarte
Chuck:

On Wed, May 1, 2019 at 7:23 PM Chuck Martin  wrote:
>
> Thanks for the extra information. It is indeed an indexed column. I'll have 
> to think some more about how to address this in a general way, as this issue 
> can come up all over. I suppose using
>
> AND datetime <= 'May 1, 2019 24:00'
>
> would produce the same as
>
> AND datetime < 'May 2, 2019'
>
> wouldn't it? I'm not sure one is easier to implement than the other.

At first the <= seems easier, but it is deceiving. So deceiving that
it is incorrect; you should use < in both.

I've made a test:

http://sqlfiddle.com/#!17/9eecb/29310
It basically says << select 'May 1 2019 24:00'::timestamp, 'May 2
2019'::timestamp >>
Gives the same result for both, << 2019-05-02T00:00:00Z >>.

So your first condition, using <=, is wrong, as it will select data
just at 00:00:00 of the next day.

This is why I was telling you to use half-open ranges, and once you
use half-open ( < ) it is easier to see what is going on using 'May
2 2019', or 'May 1 2019'::date+1, than remembering that 24:00:00 folds
to the next day due to the peculiarities of text to timestamp
conversion ( which allows just this value, but advances the date ).

Also, see that even if you use 'May 1 2019'::date+1, an index can be
used, as the casting and conversions can be constant-folded.
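
A quick check of that folding (a sketch you can run anywhere):

select 'May 1 2019 24:00'::timestamp;        -- 2019-05-02 00:00:00
select ('May 1 2019'::date + 1)::timestamp;  -- 2019-05-02 00:00:00, same instant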

I normally recommend everyone to get used to half-open ranges for time
intervals and conditions; they are much easier to get right. Also,
24:00(:00.000) is just one above the limit for dates now, but
the earth's rotation is slowing, and it may be fixed by either adding
more seconds or more hours, so who knows. But ::date+1 will be correct
for as long as people keep maintaining postgres time arithmetic routines.

Francisco Olarte.




Re: Query not producing expected result

2019-05-01 Thread Adrian Klaver

On 5/1/19 10:37 AM, Chuck Martin wrote:
Something like daterange would be a solution in some circumstances, but 
this query is a user-generated one, and they don't have that much 
control over the query. It has to be modified as needed behind the 
scenes so that it produces the results they expect. In this instance, 
I'm now (given the advice received here) inclined to check the value 
entered when searching for a date, and if no time is entered, add 
'24:00' to the date.


I should have made it clearer, my suggestion was mostly directed at
Francisco's example.


Still:

select tsrange('2019-05-01', '2019-05-02') @> '2019-05-01 9:52'::timestamp;
 ?column?
----------
 t

If you are modifying anyway:)

For this sort of thing, I have found range types to be a time and sanity 
saver. Just throwing it out there.







--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Query not producing expected result

2019-05-01 Thread Francisco Olarte
On Wed, May 1, 2019 at 7:37 PM Chuck Martin  wrote:
>
> Something like daterange would be a solution in some circumstances, but this 
> query is a user-generated one, and they don't have that much control over the 
> query. It has to be modified as needed behind the scenes so that it produces 
> the results they expect. In this instance, I'm now (given the advice received 
> here) inclined to check the value entered when searching for a date, and if 
> no time is entered, add '24:00' to the date.

What I normally do for that is, if the user must enter a date, use
$user_input::date+1 and always go to less than.

But anyway your solution with <= is incorrect. And you have made the
mistake probably because the 24:00 led you to think postgres will
split the timestamp, comparing
the date with May 1 and the time with 24:00, and that is not true. The
less-than option plus one-day add will not lead you to that error.

You can use <= with 23:59:59.999, will be good for
some years if leap-seconds do not bite you.

The problem is when the user enters a date, he wants a date search, so
cast(dateTime as date) <= limit. When he enters a time he does not
usually know what he is asking for ( normally when my users ask for
'May 1 23:15' they want to include up to 23:15:59; users think in
"truncate to my precision, then search inclusively" ). But they begin
to understand it when I ask "ok, twice a month bills, go 1..15 and
16..what ?", much easier to say [YYYY-MM-01, YYYY-MM-16) and
[YYYY-MM-16, (YYYY-MM-01)+1 month)

Francisco Olarte.




Re: Query not producing expected result

2019-05-01 Thread Adrian Klaver

On 5/1/19 10:51 AM, Francisco Olarte wrote:

> But anyway your solution with <= is incorrect. And you have made the
> mistake probably because the 24:00 led you to think postgres will
> split the timestamp, comparing the date with May 1 and the time with
> 24:00, and that is not true.


You will have to explain further as I am not seeing it:

test_(postgres)# select '2019-05-01 9:52' <= '2019-05-01 24:00'::timestamp;
 ?column?
----------
 t

test_(postgres)# select '2019-05-01 24:00' <= '2019-05-01 24:00'::timestamp;
 ?column?
----------
 t






--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Query not producing expected result

2019-05-01 Thread Chuck Martin
Ok, I see that my assumptions were incorrect. In this instance, the use of
< date+1 will return what is expected, where my solution might not have.
For other circumstances, I want to explore tsrange.

And, no, I'm not in Avondale, CA, but Decatur, GA (a few miles from
Avondale Estates, where I once lived).

Chuck Martin
Avondale Software




Re: Query not producing expected result

2019-05-01 Thread Francisco Olarte
Adrian..

On Wed, May 1, 2019 at 7:50 PM Adrian Klaver  wrote:
> I should have made it clearer, my suggestion was mostly directed at
> Franciso's example.
...
> For this sort of thing, I have found range types to be a time and sanity
> saver. Just throwing it out there.

I've had problems with the functions, being used to the [start,end)
notation on paper. I'll look at them again.

But anyway, after so many years of not having intervals and operators,
I read "$start <= $val and $val < $end" as "$val in [$start,$end)", I
think it shares brain paths with "for(;;)" parsing to "forever /
loop". I would like to have the "$start <= $val < $end" which some
language whose name I do not remember has, for complex $vals.

Francisco Olarte.




Re: Query not producing expected result

2019-05-01 Thread Adrian Klaver

On 5/1/19 10:58 AM, Francisco Olarte wrote:

Adrian..

On Wed, May 1, 2019 at 7:50 PM Adrian Klaver  wrote:

I should have made it clearer, my suggestion was mostly directed at
Franciso's example.

...

For this sort of thing, I have found range types to be a time and sanity
saver. Just throwing it out there.


I've had problems with the functions, being used to the [start,end)
notation on paper. I'll look at them again.


You don't have to use the functions:

test_(postgres)# select dt_fld from dt_test where dt_fld  <@
'[2019-02-01, 2019-03-01)'::daterange ;

   dt_fld
------------
 2019-02-03
 2019-02-26








--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Query not producing expected result

2019-05-01 Thread Francisco Olarte
Adrian:

On Wed, May 1, 2019 at 7:57 PM Adrian Klaver  wrote:
> You will have to explain further as I am not seeing it:
> test_(postgres)# select '2019-05-01 9:52' <= '2019-05-01 24:00'::timestamp;
>   ?column?
> --
>   t
>
> test_(postgres)# select '2019-05-01 24:00' <= '2019-05-01 24:00'::timestamp;
>   ?column?
> --
>   t

Because you are using two selected examples. The one with 9:52 is ok.

The last one is misleading because you are using a constant for a
particular timestamp in MAY THE SECOND which can be written to look
like it is in MAY THE FIRST.

Rewrite it as
select '2019-05-02'::timestamp <= '2019-05-01 24:00'::timestamp;

And you'll see an out-of-range date selected.

This is why <= AND 24:00 are bad and misleading.

You may not have problems with 00:00:00 times, but work a bit billing
phone calls and you'll find about one in 86400 hits it ( more in my
case as traffic distribution is skewed ). Use that kind of condition
and you end up chasing why the monthly report has a dozen fewer calls
than the sum of the daily ones the billing guys made using excel.

Francisco Olarte.




Re: Query not producing expected result

2019-05-01 Thread Francisco Olarte
On Wed, May 1, 2019 at 8:04 PM Adrian Klaver  wrote:
> You don't have to use the functions:
> test_(postgres)# select dt_fld from dt_test where dt_fld  <@
> '[2019-02-01, 2019-03-01)'::daterange ;

I knew there had to be a cast syntax ( I should have said I try to
avoid casts as well ).  I will take your advice and learn more of
these, but all those operators and casts take quite a bit to learn (
especially as I do not do that much sql and I tend to forget operators
as I routinely have to use about a dozen languages, that's why I try
to use the more common / old way ( and 30+ years of habit take a bit
to change ) ).

Thanks.
Francisco Olarte.




Re: Query not producing expected result

2019-05-01 Thread Adrian Klaver

On 5/1/19 11:04 AM, Francisco Olarte wrote:

> Rewrite it as
> select '2019-05-02'::timestamp <= '2019-05-01 24:00'::timestamp;
>
> And you'll see an out-of-range date selected.


Technically it is correct as:

test_(postgres)# select '2019-05-02'::timestamp;
      timestamp
---------------------
 2019-05-02 00:00:00

which is midnight and is both the end of one day and the start of another.

It comes down to where you want to draw the line between days.








--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Upgrading locale issues

2019-05-01 Thread Peter Geoghegan
On Mon, Apr 29, 2019 at 7:45 AM rihad  wrote:
> Hi. Today we ran pg_ctl promote on a slave server (10.7) and started
> using it as a master. The OS also got upgraded FreeBSD 10.4 -> FreeBSD
> 11.2. And you guessed it, most varchar indexes got corrupted because
> the system locale changed in subtle ways. So I created the extension
> amcheck and reindexed all bad indexes one by one. Is there any way to
> prevent such things in the future? Will switching to ICU fix all such issues?

Not necessarily, but it will detect the incompatibility more or less
automatically, making it far more likely that the problem will be
caught before it does any harm. ICU versions collations, giving
Postgres a way to reason about their compatibility over time. The libc
collations are not versioned, though (at least not in any standard way
that Postgres can take advantage of).

> The problem with it is that ICU collations are absent in pg_collation,
> initdb should be run to create them, but pg_basebackup only runs on an
> empty base directory, so I couldn't run initdb + pg_basebackup to
> prepare the replica server. I believe I can run the create collation
> command manually, but what would it look like for en-x-icu?

It is safe to call pg_import_system_collations() directly, which is
all that initdb does. This is documented, so you wouldn't be relying
on a hack.
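
For example (the import call is the documented function; the CREATE
COLLATION line is a sketch of the manual alternative for a single ICU
collation):

-- import everything initdb would have created:
SELECT pg_import_system_collations('pg_catalog');

-- or create one ICU collation by hand:
CREATE COLLATION "en-x-icu" (provider = icu, locale = 'en');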

-- 
Peter Geoghegan




Re: Upgrading locale issues

2019-05-01 Thread Thomas Munro
On Thu, May 2, 2019 at 8:26 AM Peter Geoghegan  wrote:
> Not necessarily, but it will detect the incompatibility more or less
> automatically, making it far more likely that the problem will be
> caught before it does any harm. ICU versions collations, giving
> Postgres a way to reason about their compatibility over time. The libc
> collations are not versioned, though (at least not in any standard way
> that Postgres can take advantage of).

As discussed over on -hackers[1], I think it's worth pursuing that
though.  FWIW I've proposed locale versioning for FreeBSD's libc[2].
The reason I haven't gone further with that yet even though the code
change has been accepted in principle by FreeBSD reviewers is because
I got stuck on the question of how exactly to model the versions.  If,
say, just Turkish changes, I don't want to be rebuilding my French
indexes, which means that I don't think you can use the CLDR version
string.  Frustratingly, you could probably do a good job of that by
just checksumming the collation definition files, but that seems a bit
too crude.

There is also the question of how PostgreSQL should model versions,
and as I've argued in [1], I think we should track them at the level
of database object dependencies.

I'm hoping to reopen this can of worms for PostgreSQL 13 (and the
corresponding support could in theory be in FreeBSD 13... coincidence,
or a sign!?)

> > The problem with it is that ICU collations are absent in pg_collation,
> > initdb should be run to create them, but pg_basebackup only runs on an
> > empty base directory, so I couldn't run initdb + pg_basebackup to
> > prepare the replica server. I believe I can run the create collation
> > command manually, but what would it look like for en-x-icu?
>
> It is safe to call pg_import_system_collations() directly, which is
> all that initdb does. This is documented, so you wouldn't be relying
> on a hack.

Unfortunately you can't use ICU collations as a database default yet
(though there was some WIP code[3]), so ICU only saves you from
versioning problems if you explicitly set collations for columns or
expressions, and even then the version tracking is currently just a
warning that you clear manually with a command, not a mechanism that
really tracks which database objects were last rebuilt/validated with
a given version.
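
For example, an explicit per-column ICU collation looks like this (a
sketch with a hypothetical table):

CREATE TABLE t (
    name text COLLATE "en-x-icu"   -- collation pinned on this column only
);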

[1] 
https://www.postgresql.org/message-id/flat/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
[2] https://reviews.freebsd.org/D17166
[3] https://www.postgresql.org/message-id/flat/3366.1498183854%40sss.pgh.pa.us

-- 
Thomas Munro
https://enterprisedb.com




Starting Postgres when there is no disk space

2019-05-01 Thread Igal Sapir
I have Postgres running in a Docker container with PGDATA mounted from the
host.  Postgres consumed all of the disk space, 130GB [1], and cannot be
started [2].  The database has a lot of bloat due to many deletions.
The problem is that now I cannot start Postgres at all.

I mounted an additional partition with 100GB, hoping to fix the bloat with
a TABLESPACE in the new mount, but how can I do anything if Postgres will
not start in the first place?

I expected there to be a tool that can defrag the database files, e.g. a
"vacuumdb" utility that can run without Postgres.  Or maybe run Postgres
and disable the WAL so that no new disk space will be required.

Surely, I'm not the first one to experience this issue.  How can I fix this?

Thank you,

Igal

[1]
root@ff818ff7550a:/# du -h --max-depth=1 /pgdata
625M    /pgdata/pg_wal
608K    /pgdata/global
0       /pgdata/pg_commit_ts
0       /pgdata/pg_dynshmem
8.0K    /pgdata/pg_notify
0       /pgdata/pg_serial
0       /pgdata/pg_snapshots
16K     /pgdata/pg_subtrans
0       /pgdata/pg_twophase
16K     /pgdata/pg_multixact
130G    /pgdata/base
0       /pgdata/pg_replslot
0       /pgdata/pg_tblspc
0       /pgdata/pg_stat
0       /pgdata/pg_stat_tmp
7.9M    /pgdata/pg_xact
4.0K    /pgdata/pg_logical
0       /pgdata/tmp
130G    /pgdata

[2]
postgres@1efd26b999ca:/$ /usr/lib/postgresql/11/bin/pg_ctl start
waiting for server to start...2019-05-01 20:43:59.301 UTC [34] LOG:
listening on IPv4 address "0.0.0.0", port 5432
2019-05-01 20:43:59.301 UTC [34] LOG:  listening on IPv6 address "::", port
5432
2019-05-01 20:43:59.303 UTC [34] LOG:  listening on Unix socket
"/var/run/postgresql/.s.PGSQL.5432"
2019-05-01 20:43:59.322 UTC [35] LOG:  database system shutdown was
interrupted; last known up at 2019-05-01 19:37:32 UTC
2019-05-01 20:43:59.863 UTC [35] LOG:  database system was not properly
shut down; automatic recovery in progress
2019-05-01 20:43:59.865 UTC [35] LOG:  redo starts at 144/4EFFFC18
...2019-05-01 20:44:02.389 UTC [35] LOG:  redo done at 144/74FFE060
2019-05-01 20:44:02.389 UTC [35] LOG:  last completed transaction was at
log time 2019-04-28 05:05:24.687581+00
.2019-05-01 20:44:03.474 UTC [35] PANIC:  could not write to file
"pg_logical/replorigin_checkpoint.tmp": No space left on device
2019-05-01 20:44:03.480 UTC [34] LOG:  startup process (PID 35) was
terminated by signal 6: Aborted
2019-05-01 20:44:03.480 UTC [34] LOG:  aborting startup due to startup
process failure
2019-05-01 20:44:03.493 UTC [34] LOG:  database system is shut down
 stopped waiting
pg_ctl: could not start server
Examine the log output.


strange nested loop row count estimates

2019-05-01 Thread Sergey Koposov
Hi, 

I'm currently trying to understand the expected row counts for a query
involving a nested loop join and a bitmap index scan on a functional
index with a custom operator. And the numbers that I see don't make
sense to me currently. Hopefully somebody here can shed some light on
it, or confirm this is some kind of issue.

Here is the query and explain analyze:

explain analyze select * from twomass.psc as t, gaia_dr2.gaia_source as g
where
(
    (q3c_ang2ipix(g.ra, g.dec) between q3c_nearby_it(t.ra, t.decl, 0.0003, 0)
                                   and q3c_nearby_it(t.ra, t.decl, 0.0003, 1))
    or
    (q3c_ang2ipix(g.ra, g.dec) between q3c_nearby_it(t.ra, t.decl, 0.0003, 1)
                                   and q3c_nearby_it(t.ra, t.decl, 0.0003, 3))
)
and
0.0003 ==<<>>== (g.ra, g.dec, t.ra, t.decl)::q3c_type limit 10;

https://explain.depesz.com/s/vcNd

What I can't understand at all is how the estimate of 3E15 rows is
obtained by the nested loop, given that the bitmap heap scan is
expected to return *one* single row for each row of the 'left' table.
So in my mind the estimate of the total number of rows should be ~ 1e9
rows after the nested loop. Because of this crazy overestimate, I
actually have to force the nested loop in this query by disabling
seqscan.
(If I don't disable the seqscan, this is the plan I get, which ignores
the indices: https://explain.depesz.com/s/EIiG )

Some more details about the query: 
q3c_ang2ipix(ra,dec) is the function mapping (double,double) -> bigint and the 
tables have a functional index on that.
Like this: 
   Table "gaia_dr2.gaia_source"
  Column  |   Type| Modifiers 
--+---+---
 ra   | double precision  | 
 dec  | double precision  | 
...
Indexes:
"gaia_source2_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec"))

The q3c_nearby_it() function just returns a bigint.

The ==<<>>== is the custom operator with a custom low selectivity (1e-12 in
this case).

The tables in the join in question have 450 million and 1.5 billion rows.

I hope somebody can help me understand what's going on. 

Thank you in advance. 

  Sergey


PS the kind of query that I show  comes from the q3c module ( 
https://github.com/segasai/q3c ) 
that is used for spatial queries of large astronomical catalogues. 



Re: Starting Postgres when there is no disk space

2019-05-01 Thread David Rowley
On Thu, 2 May 2019 at 12:07, Igal Sapir  wrote:
> I mounted an additional partition with 100GB, hoping to fix the bloat with a 
> TABLESPACE in the new mount, but how can I do anything if Postgres will not 
> start in the first place?

You could move the pg_wal directory over to the new partition and ln
it back to its original location.
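
Something like this, as a sketch (the paths are assumptions: PGDATA at
/pgdata, the new partition mounted at /mnt/newvol; do it with the
server stopped):

mv /pgdata/pg_wal /mnt/newvol/pg_wal
ln -s /mnt/newvol/pg_wal /pgdata/pg_wal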

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: Starting Postgres when there is no disk space

2019-05-01 Thread Michael Loftis
Best option: copy/move the entire pgdata to a larger space. It may also
be enough to just move the WAL (leaving a symlink), freeing up the 623M, but
I doubt it since VACUUM FULL occurs in the same table space and can need an
equal amount of space (130G) depending on how much it can actually free up.

You may also get away with just moving (and leaving a symlink) the base
directory, but I don't recall if that works or not.


"Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds."
-- Samuel Butler


Re: Starting Postgres when there is no disk space

2019-05-01 Thread Igal Sapir
Thank you both.  The symlink sounds like a very good idea.  My other disk
is 100GB and the database is already 130GB so moving the whole thing will
require provisioning that will take more time.  I will try the symlinks
first.  Possibly moving some tables to a tablespace on the other partition
to make more room.

I have a scheduled process that runs daily to delete old data and do full
vacuum.  Not sure why this happened (again).

Thanks,

Igal



Re: Starting Postgres when there is no disk space

2019-05-01 Thread Ron

To get the cluster up and running, you only need to move a GB or two.




--
Angular momentum makes the world go 'round.


Re: strange nested loop row count estimates

2019-05-01 Thread Tom Lane
Sergey Koposov  writes:
> I'm currently trying to understand the expected row counts for a query 
> involving a nested loop join and bitmap index scan 
> on the functional index and a custom operator. And the numbers that I see 
> don't make sense to me currently.

What sort of selectivity estimator have you got attached to that custom
operator?

regards, tom lane




Re: strange nested loop row count estimates

2019-05-01 Thread Sergey Koposov
On Thu, 2019-05-02 at 00:36 -0400, Tom Lane wrote:
> Sergey Koposov  writes:
> > 
> > I'm currently trying to understand the expected row counts for a query 
> > involving a nested loop join and bitmap index scan 
> > on the functional index and a custom operator. And the numbers that I see 
> > don't make sense to me currently.
> What sort of selectivity estimator have you got attached to that custom
> operator?

This is the code, but basically it is just a constant based on the search 
radius (which is the leftmost float argument of the operator)
https://github.com/segasai/q3c/blob/361140d4f1f36bf16c9c53721d1c4f03cb4de930/q3c.c#L89
For the query in question it should be ~ 1e-12

The whole idea of the operator was to specifically inform PG that this
query returns a small number of rows.

(The underlying idea of the query is that it does a positional
crossmatch between datasets on the sphere with a certain small radius,
and obviously the selectivity of this is extremely tiny.)



Re: strange nested loop row count estimates

2019-05-01 Thread Tom Lane
Sergey Koposov  writes:
> On Thu, 2019-05-02 at 00:36 -0400, Tom Lane wrote:
>> What sort of selectivity estimator have you got attached to that custom
>> operator?

> This is the code, but basically it is just a constant based on the search 
> radius (which is the leftmost float argument of the operator)
> https://github.com/segasai/q3c/blob/361140d4f1f36bf16c9c53721d1c4f03cb4de930/q3c.c#L89

Hm, that query should be paying attention to join selectivity, and
you don't have a join selectivity function.

I think that it applies the restriction selectivity while
estimating the size of the bitmap scan's output.  But that's not
what's going to determine the estimated size of the join output.

Too tired to look at this really closely, but I think basically
the inconsistency boils down to the lack of consistency between
your restriction estimator (1e-12) and your join estimator
(which, since you haven't got one, is going to default to
something way larger, possibly 0.5).
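
The general shape of the fix, as a sketch (the function names here are
hypothetical placeholders, not the actual q3c definitions):

CREATE OPERATOR ==<<>>== (
    LEFTARG   = float8,
    RIGHTARG  = q3c_type,
    PROCEDURE = q3c_join_op,   -- hypothetical implementation function
    RESTRICT  = q3c_sel,       -- restriction selectivity: col OP constant
    JOIN      = q3c_joinsel    -- join selectivity: t1.col OP t2.col
);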

regards, tom lane




Re: strange nested loop row count estimates

2019-05-01 Thread Sergey Koposov
On Thu, 2019-05-02 at 01:05 -0400, Tom Lane wrote:
> Sergey Koposov  writes:
> > 
> > On Thu, 2019-05-02 at 00:36 -0400, Tom Lane wrote:
> > > 
> > > What sort of selectivity estimator have you got attached to that custom
> > > operator?
> > 
> > This is the code, but basically it is just a constant based on the search 
> > radius (which is the leftmost float argument of the operator)
> > https://github.com/segasai/q3c/blob/361140d4f1f36bf16c9c53721d1c4f03cb4de930/q3c.c#L89
> Hm, that query should be paying attention to join selectivity, and
> you don't have a join selectivity function.
> 
> I think that it applies the restriction selectivity while
> estimating the size of the bitmap scan's output.  But that's not
> what's going to determine the estimated size of the join output.
> 
> Too tired to look at this really closely, but I think basically
> the inconsistency boils down to the lack of consistency between
> your restriction estimator (1e-12) and your join estimator
> (which, since you haven't got one, is going to default to
> something way larger, possibly 0.5).

Thanks very much for checking, Tom!
Adding the join selectivity estimator fixed the problem. 
I think I initially tried it, but it wasn't clear whether it was called at all 
or not.
Plus I was confused by the fact that the bitmap scan prediction showed 1 row, 
so it looked like the selectivity worked. 

        Sergey


Re: Starting Postgres when there is no disk space

2019-05-01 Thread Michael Nolan
Assuming you get the database back online, I would suggest you put a
procedure in place to monitor disk space and alert you when it starts to
get low.
--
Mike Nolan


Re: Upgrading locale issues

2019-05-01 Thread rihad

On 05/02/2019 12:26 AM, Peter Geoghegan wrote:

> It is safe to call pg_import_system_collations() directly, which is
> all that initdb does. This is documented, so you wouldn't be relying
> on a hack.

Thanks for the reply. Do you know what a "decent" ICU collation would be
to bind to a field's schema definition so it would mimic a UTF-8
encoding for a multilingual column? Maybe und-x-icu? We aren't as
concerned about sortability in most cases; we just want indexes to
better handle future PG/ICU upgrades. But what does und(efined) even
mean with respect to collations? With UTF-8 at least some default
collation is specified, like en_US.UTF-8. Will results be in a
completely undefined order as a result of ORDER BY "icu_und_column"?