Re: Exclusion constraints on overlapping text arrays?

2019-08-30 Thread Achilleas Mantzios

On 30/8/19 3:42 a.m., Ken Tanzer wrote:

Hi.  Using 9.6.14, I was setting up a table with this:

EXCLUDE using gist (grant_number_codes with &&)

Where grant_number_codes is a varchar[].  I get this error:

ERROR:  data type character varying[] has no default operator class for access method 
"gist"
HINT:  You must specify an operator class for the index or define a default 
operator class for the data type.

I did some Googling, and it looks like you could do this for integer arrays with the intarray extension.  I didn't see anything I'd recognize as an equivalent for varchar (or text) arrays.  Is there 
any way to do this now?  And if not, is there much prospect of this being implemented at some point?


I found a couple of old threads about this.  Not sure if they are still 
relevant, but listed below.

Also, on a side note, I tried using grant_number_codes::text[] with &&, but 
that got a syntax error.  Does that mean casting isn't allowed at all in these 
constraints?

Maybe take a look at 
https://dba.stackexchange.com/questions/185174/error-data-type-text-has-no-default-operator-class-for-access-method-gist


Thanks in advance,
Ken

1) This 2014 thread asked about this:

_*array exclusion constraints*_
https://www.postgresql.org/message-id/flat/20141113183843.E8AC620362%40smtp.hushmail.com

and pointed toward this 2013 discussion:

_*Todo item: Support amgettuple() in GIN*_
https://www.postgresql.org/message-id/flat/5297DC17.7000608%40proxel.se



--
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://demo.agency-software.org/client/
ken.tan...@agency-software.org 
(253) 245-3801

Subscribe to the mailing list to learn more about AGENCY or follow the discussion.



--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



Bad Estimate for multi tenant database queries

2019-08-30 Thread Peter Grman
Hello,

I've noticed that our queries have very bad estimates, which leads to
the planner using slow nested loops. Here is a subset of the query without
tenant separation (correct estimates):

explain (ANALYZE, COSTS, BUFFERS, FORMAT text)
select *
from "Reservation"."Reservation" r
join "Rates"."TimeSliceDefinition" t on r."TimeSliceDefinitionId" =
t."Id"
join "Reservation"."NoShowFee" f on r."NoShowFeeId" = f."Id"
where r."DepartureUtc" > '2018-01-01' and r."ArrivalUtc" < '2018-09-30'

Gather  (cost=14034.74..22788.40 rows=12346 width=793) (actual
time=23.815..57.178 rows=12263 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=18997
  ->  Hash Join  (cost=13034.74..20553.80 rows=5144 width=793) (actual
time=20.869..49.029 rows=4088 loops=3)
Hash Cond: (r."TimeSliceDefinitionId" = t."Id")
Buffers: shared hit=18997
->  Parallel Hash Join  (cost=12907.55..20413.09 rows=5144
width=662) (actual time=19.210..45.177 rows=4088 loops=3)
  Hash Cond: (f."Id" = r."NoShowFeeId")
  Buffers: shared hit=18683
  ->  Parallel Seq Scan on "NoShowFee" f  (cost=0.00..7343.25
rows=61825 width=143) (actual time=0.006..15.481 rows=49460 loops=3)
Buffers: shared hit=6725
  ->  Parallel Hash  (cost=12843.25..12843.25 rows=5144
width=519) (actual time=19.071..19.072 rows=4088 loops=3)
Buckets: 16384  Batches: 1  Memory Usage: 4832kB
Buffers: shared hit=11958
->  Parallel Seq Scan on "Reservation" r
 (cost=0.00..12843.25 rows=5144 width=519) (actual time=0.971..14.919
rows=4088 loops=3)
  Filter: (("DepartureUtc" > '2018-01-01
00:00:00'::timestamp without time zone) AND ("ArrivalUtc" < '2018-09-30
00:00:00'::timestamp without time zone))
  Rows Removed by Filter: 43126
  Buffers: shared hit=11958
->  Hash  (cost=96.53..96.53 rows=2453 width=131) (actual
time=1.586..1.586 rows=2453 loops=3)
  Buckets: 4096  Batches: 1  Memory Usage: 457kB
  Buffers: shared hit=216
  ->  Seq Scan on "TimeSliceDefinition" t  (cost=0.00..96.53
rows=2453 width=131) (actual time=0.009..0.697 rows=2453 loops=3)
Buffers: shared hit=216
Planning Time: 0.424 ms
Execution Time: 58.146 ms

and with tenant separation (wrong estimates):

explain (ANALYZE, COSTS, BUFFERS, FORMAT text)
select *
from "Reservation"."Reservation" r
join "Rates"."TimeSliceDefinition" t on r."TimeSliceDefinitionId" =
t."Id"
join "Reservation"."NoShowFee" f on r."NoShowFeeId" = f."Id"
where r."DepartureUtc" > '2018-01-01' and r."ArrivalUtc" < '2019-12-31'
and r."AccountCode" = 'Code1' and t."AccountCode" = 'Code1' and
f."AccountCode" = 'Code1'

Nested Loop  (cost=419.37..6656.11 rows=3 width=793) (actual
time=1.367..95.051 rows=8992 loops=1)
  Buffers: shared hit=41970
  ->  Nested Loop  (cost=418.95..6504.77 rows=49 width=650) (actual
time=1.355..49.789 rows=8992 loops=1)
Buffers: shared hit=5980
->  Bitmap Heap Scan on "TimeSliceDefinition" t  (cost=4.39..39.99
rows=14 width=131) (actual time=0.015..0.035 rows=14 loops=1)
  Recheck Cond: ("AccountCode" = 'Code1'::text)
  Heap Blocks: exact=7
  Buffers: shared hit=9
  ->  Bitmap Index Scan on
"IX_TimeSliceDefinition_AccountCode_PropertyId_Name"  (cost=0.00..4.39
rows=14 width=0) (actual time=0.010..0.010 rows=14 loops=1)
Index Cond: ("AccountCode" = 'Code1'::text)
Buffers: shared hit=2
->  Bitmap Heap Scan on "Reservation" r  (cost=414.56..461.66
rows=11 width=519) (actual time=1.104..2.987 rows=642 loops=14)
  Recheck Cond: (("TimeSliceDefinitionId" = t."Id") AND
("AccountCode" = 'Code1'::text))
  Filter: (("DepartureUtc" > '2018-01-01 00:00:00'::timestamp
without time zone) AND ("ArrivalUtc" < '2019-12-31 00:00:00'::timestamp
without time zone))
  Rows Removed by Filter: 14
  Heap Blocks: exact=4776
  Buffers: shared hit=5971
  ->  BitmapAnd  (cost=414.56..414.56 rows=12 width=0) (actual
time=1.057..1.057 rows=0 loops=14)
Buffers: shared hit=1195
->  Bitmap Index Scan on
"IX_Reservation_TimeSliceDefinitionId"  (cost=0.00..13.84 rows=189 width=0)
(actual time=0.063..0.063 rows=665 loops=14)
  Index Cond: ("TimeSliceDefinitionId" = t."Id")
  Buffers: shared hit=90
->  Bitmap Index Scan on
"IX_Reservation_AccountCode_EntityId"  (cost=0.00..398.31 rows=8786
width=0) (actual time=1.056..1.056 rows=9225 loops=13)
  Index Cond: ("AccountCode" = 'Code1'::text)
  Buffers: shared hit=1105
  ->  Index Scan using "PK_NoShowFee" on "NoShowFee" f  (cost=0.42..3.09
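The per-tenant "AccountCode" predicates above are strongly correlated with the join keys, which is a classic cause of such misestimates. One commonly suggested mitigation, available since PostgreSQL 10, is extended statistics. A hedged sketch using the column names from the plans; note that functional-dependency statistics only apply to equality clauses, so this may or may not improve the join estimate here:

```sql
-- Tell the planner that "AccountCode" and "TimeSliceDefinitionId" are
-- correlated, so combining both predicates is not treated as independent.
CREATE STATISTICS reservation_tenant_stats (dependencies)
    ON "AccountCode", "TimeSliceDefinitionId"
    FROM "Reservation"."Reservation";

ANALYZE "Reservation"."Reservation";
```

After the ANALYZE, re-running EXPLAIN on the tenant-filtered query shows whether the row estimates improve.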

Re: Exclusion constraints on overlapping text arrays?

2019-08-30 Thread Achilleas Mantzios

On 30/8/19 10:20 a.m., Achilleas Mantzios wrote:

On 30/8/19 3:42 a.m., Ken Tanzer wrote:

Hi.  Using 9.6.14, I was setting up a table with this:

EXCLUDE using gist (grant_number_codes with &&)

Where grant_number_codes is a varchar[].  I get this error:

ERROR:  data type character varying[] has no default operator class for access method 
"gist"
HINT:  You must specify an operator class for the index or define a default 
operator class for the data type.

I did some Googling, and it looks like you could do this for integer arrays with the intarray extension.  I didn't see anything I'd recognize as an equivalent for varchar (or text) arrays.  Is 
there any way to do this now?  And if not, is there much prospect of this being implemented at some point?


I found a couple of old threads about this.  Not sure if they are still 
relevant, but listed below.

Also, on a side note, I tried using grant_number_codes::text[] with &&, but 
that got a syntax error.  Does that mean casting isn't allowed at all in these 
constraints?

Maybe take a look at 
https://dba.stackexchange.com/questions/185174/error-data-type-text-has-no-default-operator-class-for-access-method-gist


So download the latest version of smlar from here:
http://sigaev.ru/git/gitweb.cgi?p=smlar.git
The following commands are for FreeBSD, but you get the point:

% tar xvfz smlar-92dc9c7.tar.gz
% cd smlar-92dc9c7
% gmake (or make on Linux)
If it complains about not finding /contrib/contrib-global.mk, then do
% setenv USE_PGXS 1 (export USE_PGXS=1 on Linux)
and repeat the make step.
% sudo make install (again, solve problems as above)

When installed successfully, then:
psql
create extension smlar;
--and then create your exclude constraint:
alter table your_table_name ADD constraint constrname EXCLUDE USING gist 
(grant_number_codes _text_sml_ops with &&);
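Once the constraint exists, inserting rows with overlapping arrays should fail. A quick sketch (table name, column, and values are illustrative):

```sql
-- Assumes the smlar extension and the exclusion constraint above.
INSERT INTO your_table_name (grant_number_codes) VALUES ('{G-100,G-200}');

-- A second row sharing any code should fail with an error like
-- "conflicting key value violates exclusion constraint":
INSERT INTO your_table_name (grant_number_codes) VALUES ('{G-200,G-300}');
```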



Thanks in advance,
Ken

1) This 2014 thread asked about this:

_*array exclusion constraints*_
https://www.postgresql.org/message-id/flat/20141113183843.E8AC620362%40smtp.hushmail.com

and pointed toward this 2013 discussion:

_*Todo item: Support amgettuple() in GIN*_
https://www.postgresql.org/message-id/flat/5297DC17.7000608%40proxel.se



--
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://demo.agency-software.org/client/
ken.tan...@agency-software.org 
(253) 245-3801

Subscribe to the mailing list to learn more about AGENCY or follow the discussion.



--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



Re: literal vs dynamic partition constraint in plan execution

2019-08-30 Thread Luca Ferrari
On Fri, Aug 30, 2019 at 8:29 AM Luca Ferrari  wrote:
> testdb=># \d+ respi.y2018
> ...
> Partition of: respi.root FOR VALUES IN ('2018')
> Partition constraint: ((date_part('year'::text, mis_ora) IS NOT NULL)
> AND (date_part('year'::text, mis_ora) = '2018'::double precision))
> Partition key: LIST (date_part('month'::text, mis_ora))
> Check constraints:
> "y2018_mis_ora_check" CHECK (date_part('year'::text, mis_ora) =
> 2018::double precision)
> "y2018_mis_ora_check1" CHECK (mis_ora >= make_timestamp(2018, 1,
> 1, 0, 0, 0::double precision) AND mis_ora <= make_timestamp(2018, 12,
> 31, 23, 59, 59::double precision))
> Partitions: respi.y2018m01 FOR VALUES IN ('1'),
> respi.y2018m02 FOR VALUES IN ('2'),
> respi.y2018m03 FOR VALUES IN ('3'),
> respi.y2018m04 FOR VALUES IN ('4'),
> respi.y2018m05 FOR VALUES IN ('5'),
> respi.y2018m06 FOR VALUES IN ('6'),
> respi.y2018m07 FOR VALUES IN ('7'),
> respi.y2018m08 FOR VALUES IN ('8'),
> respi.y2018m09 FOR VALUES IN ('9'),
> ...
>


While the condition
mis_ora >= current_timestamp
does not cut off the 2018 branch, the following does

=# explain select * from respi.root where ts >= '2019-08-28 23:35:00.007245'
and  extract( year from mis_ora ) = extract( year from current_timestamp )
and extract( month from mis_ora ) >= extract( month from
current_timestamp )order by ts;

 Sort  (cost=7246692.21..7246692.28 rows=26 width=36)
   Sort Key: r.ts
   ->  Nested Loop  (cost=0.00..7246691.60 rows=26 width=36)
 Join Filter: (r.sen_id = s.sen_id)
 ->  Seq Scan on sensori s  (cost=0.00..13.57 rows=329 width=16)
   Filter: interesting
 ->  Materialize  (cost=0.00..7246465.93 rows=43 width=32)
   ->  Append  (cost=0.00..7246465.72 rows=43 width=32)
 Subplans Removed: 31
 ->  Seq Scan on y2019m08 r  (cost=0.00..623008.30
rows=2 width=32)
   Filter: ((ts >= '2019-08-28
23:35:00.007245'::timestamp without time zone) AND
(date_part('month'::text, mis_ora) >= date_part('month'::text,
CURRENT_TIMESTAMP)) AND (date_part('year'::text, mis_ora) =
date_part('year'::text, CURRENT_TIMESTAMP)))
 ->  Seq Scan on y2019m09 r_1  (cost=0.00..49.00
rows=1 width=32)
   Filter: ((ts >= '2019-08-28
23:35:00.007245'::timestamp without time zone) AND
(date_part('month'::text, mis_ora) >= date_part('month'::text,
CURRENT_TIMESTAMP)) AND (date_part('year'::text, mis_ora) =
date_part('year'::text, CURRENT_TIMESTAMP)))
 ->  Seq Scan on y2019m10 r_2  (cost=0.00..49.00
rows=1 width=32)
   Filter: ((ts >= '2019-08-28
23:35:00.007245'::timestamp without time zone) AND
(date_part('month'::text, mis_ora) >= date_part('month'::text,
CURRENT_TIMESTAMP)) AND (date_part('year'::text, mis_ora) =
date_part('year'::text, CURRENT_TIMESTAMP)))
 ->  Seq Scan on y2019m11 r_3  (cost=0.00..49.00
rows=1 width=32)
   Filter: ((ts >= '2019-08-28
23:35:00.007245'::timestamp without time zone) AND
(date_part('month'::text, mis_ora) >= date_part('month'::text,
CURRENT_TIMESTAMP)) AND (date_part('year'::text, mis_ora) =
date_part('year'::text, CURRENT_TIMESTAMP)))

The fact that I have to make the conditions on the year and the month
(the top-level partition constraint) explicit makes me think that, if the
condition is not filtered at the top level, the executor will walk down
every branch to the leaves. Even if I don't understand why.
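For comparison, constants compared against the partition-key expression can be pruned at plan time, whereas stable expressions such as current_timestamp are only pruned at executor startup (the "Subplans Removed" line in the plan above). A hedged sketch using the schema from this thread:

```sql
-- Constant comparisons against the partition-key expressions allow
-- plan-time pruning, so non-matching partitions never appear in the plan:
EXPLAIN SELECT *
FROM respi.root
WHERE date_part('year', mis_ora) = 2019
  AND date_part('month', mis_ora) >= 8;
```

With constants, only the y2019m08 and later partitions should be listed, without any "Subplans Removed" runtime step.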

Luca




Regarding db dump with Fc taking very long time to completion

2019-08-30 Thread Durgamahesh Manne
Hi
To the respected international PostgreSQL team,

I am using PostgreSQL version 11.4.
I have a scheduled logical dump job which runs once daily at the database level.
There is one table with write-intensive activity (a write every 40 seconds).
The size of the table is about 88 GB.
The logical dump of that table is taking more than 7 hours to complete.

I need to reduce the dump time of that table (88 GB in size).


Regards
Durgamahesh Manne


Re: Regarding db dump with Fc taking very long time to completion

2019-08-30 Thread Luca Ferrari
On Fri, Aug 30, 2019 at 11:51 AM Durgamahesh Manne
 wrote:
>  Logical dump of that table is taking more than 7 hours to be completed
>
>  I need to reduce to dump time of that table that has 88GB in size

Good luck!
I would see two possible solutions to the problem:
1) use physical backups and switch to incremental ones (e.g., pgbackrest);
2) partition the table and back up single pieces, if possible
(constraints?), and be aware it will become harder to maintain (added
partitions, and so on).

Are all of the 88 GB written during a bulk process? I guess not, so maybe
with partitioning you can avoid locking the whole dataset and reduce
contention (and thus time).
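Another knob worth trying before restructuring the schema is a parallel directory-format dump. A hedged sketch (database name and path are illustrative); note that pg_dump parallelism is per table, so a single 88 GB table still dumps in one job, and this mainly pays off once the table is partitioned into pieces:

```shell
# -Fd (directory format) is required for parallel dumps;
# -j sets the number of parallel worker jobs.
pg_dump -Fd -j 4 -f /backups/mydb.dump mydb
```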

Luca




"storing" a calculated value in plsql function ?

2019-08-30 Thread stan
I have created a PL/pgSQL function that does a complex SELECT, with joins on
various tables and views, and returns a table.

In the resultant table, I have raw data and adjusted data. The adjusted data is
all adjusted by a common factor, which is calculated in the SELECT. Presently, I
calculate this same adjustment factor several times in the SELECT.

Is there a way to reference this value multiple times, once it is calculated? Or
would I have to create a second SELECT that calculates this adjustment factor and
stores it in a PL/pgSQL variable? And if I do that, can I reference this stored
value in the SELECT?
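One common pattern is to compute the factor once in a CTE and join it in. A minimal sketch with made-up table and column names:

```sql
-- raw_data, value, and weight are placeholders for the real query.
WITH factor AS (
    SELECT avg(weight) AS adj FROM raw_data   -- computed exactly once
)
SELECT r.value           AS raw_value,
       r.value * f.adj   AS adjusted_value
FROM raw_data r
CROSS JOIN factor f;
```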

-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin




SSPI auth and mixed case usernames

2019-08-30 Thread Niels Jespersen
Hello

Postgresql 11.2 on Windows. 

I have a user mapping in pg_hba.conf 

sspi map=domain

In pg_ident.conf, I have the following: 

domain        /^(([A-Z|a-z]{3}[A|a]?)|([Xx]\d{2}))@DOMAIN$    \1

This maps a Windows logon name to a Postgres username. However, for reasons I 
cannot explain, sometimes the username comes in all lowercase, and at other times 
it comes in all caps. This is dependent on the Windows host the client is 
connected to. 

I do not want to create both XXX and xxx as users on Postgres. I would prefer 
to translate all usernames to lowercase in the map. 

Is that possible, and if so, how? 

Regards Niels Jespersen







Re: SSPI auth and mixed case usernames

2019-08-30 Thread Magnus Hagander
On Fri, Aug 30, 2019 at 1:27 PM Niels Jespersen  wrote:

> Hello
>
> Postgresql 11.2 on Windows.
>
> I have a user mapping i pg_hba.conf
>
> sspi map=domain
>
> In pg_ident.conf, I have the following:
>
> domain/^(([A-Z|a-z]{3}[A|a]?)|([Xx]\d{2}))@DOMAIN$\1
>
> This maps windows logonname til a postgres username. Hower, for reasons I
> cannot explain, sometimes the username comes in all-lowercase, at other
> times it comes all-caps. This is dependant on the Windows host the client
> is connected to.
>

It is actually dependent on what the user typed into their login box when
they logged in to the machine. Yes, that's mostly insane, but that's how
those APIs in Windows work.


I do not want to create both XXX and xxx as users on Postgres. I would
> prefer to translate alle usernames to lowercase in the map.
>
> Is that possible, and if so, how?
>

No, PostgreSQL will not do that automatically for you.

What pg_ident.conf allows you to do is say that the user is allowed to log
in to the postgres user in lowercase even if the username retrieved using
sspi is not in lowercase. But the application still has to actually try to
log in with lowercase, and do so before it connects to PostgreSQL.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: "storing" a calculated value in plsql function ?

2019-08-30 Thread Luca Ferrari
On Fri, Aug 30, 2019 at 12:48 PM stan  wrote:
> In the resultant table, I have raw data, and adjusted data. The adjusted data 
> i
> all adjusted by a common factor, which is calculated in the select. 
> Presently, I
> calculate this same adjustment factor several times in the select.

Is it possible to add the computed column as an output of your query?
Or even to define a rowtype that includes such a column?


> Is there a way to reference this value, multiple times, once it is 
> calculated? Or
> would I have to create a 2nd select that calculates this adjustment factor, 
> and
> stores it in a PLSQL variable< and if I do that, can I reference this stored 
> value
> in the select?

Yes, you can reference variables as values in queries.
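In PL/pgSQL that looks roughly like this (table and column names are made up):

```sql
CREATE OR REPLACE FUNCTION adjusted_values()
RETURNS TABLE (raw_value numeric, adjusted_value numeric)
LANGUAGE plpgsql AS $$
DECLARE
    adj numeric;
BEGIN
    -- Compute the common factor once and store it in a variable...
    SELECT avg(weight) INTO adj FROM raw_data;
    -- ...then reference the variable directly inside the query.
    RETURN QUERY
        SELECT r.value, r.value * adj
        FROM raw_data r;
END;
$$;
```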

Luca




Re: "storing" a calculated value in plsql function ?

2019-08-30 Thread Pavel Stehule
pá 30. 8. 2019 v 12:48 odesílatel stan  napsal:

> I have created a function (PLSQL) that does a complex select with joins on
> various
> tables and views, and returns a table.
>
> In the resultant table, I have raw data, and adjusted data. The adjusted
> data i
> all adjusted by a common factor, which is calculated in the select.
> Presently, I
> calculate this same adjustment factor several times in the select.
>
> Is there a way to reference this value, multiple times, once it is
> calculated? Or
> would I have to create a 2nd select that calculates this adjustment
> factor, and
> stores it in a PLSQL variable< and if I do that, can I reference this
> stored value
> in the select?
>

you can use custom configuration values like global variables

https://wiki.postgresql.org/wiki/Variable_Design#PostgreSQL_User-Defined_GUCS
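A sketch of the GUC approach; the setting name myapp.adj_factor and the table are made up:

```sql
-- Store the factor as a custom setting
-- (third argument true makes it transaction-local):
SELECT set_config('myapp.adj_factor', '1.25', true);

-- ...then read it back anywhere in the same transaction:
SELECT count_value * current_setting('myapp.adj_factor')::numeric AS adjusted
FROM some_table;
```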

Regards

Pavel Stehule

>
> --
> "They that would give up essential liberty for temporary safety deserve
> neither liberty nor safety."
> -- Benjamin Franklin
>
>
>


Re: SSPI auth and mixed case usernames

2019-08-30 Thread Niels Jespersen






Hello Magnus


Thank you for your prompt reply. 


I’m not sure I understand your last statement. I want to achieve that, regardless of the case of the entered username, the user is logged into the same Postgres user (whose name is created in all
 lowercase).


In other words, Windows usernames one day entered as XYz, the next day entered as xYz, should log on to Postgres user xyz.


Niels







Fra: Magnus Hagander 
Dato: 30. august 2019 kl. 13.31.33 CEST
Til: Niels Jespersen 
Cc: pgsql-general@lists.postgresql.org 
Emne: Re: SSPI auth and mixed case usernames








On Fri, Aug 30, 2019 at 1:27 PM Niels Jespersen  wrote:


Hello

Postgresql 11.2 on Windows. 

I have a user mapping i pg_hba.conf 

sspi map=domain

In pg_ident.conf, I have the following: 

domain        /^(([A-Z|a-z]{3}[A|a]?)|([Xx]\d{2}))@DOMAIN$    \1

This maps windows logonname til a postgres username. Hower, for reasons I cannot explain, sometimes the username comes in all-lowercase, at other times it comes all-caps. This is dependant on the Windows host the client is connected to.




It is actually dependent on what the user typed into their login box when they logged in to the machine. Yes, that's mostly insane, but that's how those APIs in Windows work.





I do not want to create both XXX and xxx as users on Postgres. I would prefer to translate alle usernames to lowercase in the map.


Is that possible, and if so, how? 



No, PostgreSQL will not do that automatically for you. 


What pg_ident.conf allows you to do is say that the user is allowed to log in to the postgres user in lowercase even if the username retrieved using sspi is not in lowercase. But the application still has to actually try to log in with lowercase, and do
 so before it connects to PostgreSQL.
 

-- 


 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/










Re: Work hours?

2019-08-30 Thread Steven Lembark


> > Any thoughts as to the best way to approach this?  
> 
> Use generate_series:
> 
> https://www.postgresql.org/docs/11/functions-srf.html
> 
> to generate all the days in the month.
> 
> Loop over the days and use EXTRACT:
> 
> https://www.postgresql.org/docs/11/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT
> 
> to find the dates with a dow(The day of the week as Sunday (0) to 
> Saturday (6)) or isodow(The day of the week as Monday (1) to Sunday
> (7)) that falls in Mon-Fri and add to counter.

Assumes all weekdays are work days and that all weeks are uniform
for work. Any holiday, emergency, or outage leaves you with a skewed
result.

First workaround is table of non-work days: generate a temp table of 
the series not intersecting the non-workdays (materialized views
are nice for this). 

Then you get into issues of different people having different non-
work days, leaving your subtraction table keyed by person+date.

Frequently the non-work days are by employee class, which allows a
table of days off by employee grade + employees w/ grade => days
off by employee.

Then individual employees will have their own time off due to paid
vacation, medical or family leave, and sick days. Depending on your
number of employees, a non_working_days table w/ date + employee works,
or you get into pay grade + generic days, and employee + pay grade
for the generic days off, merged with a separate table of individual
days off. Subtract that from a temp table generated by the sequences
and you'll have a complete schedule.
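The basic shape of that approach, with a hypothetical non_working_days table:

```sql
-- Count working days in August 2019: weekdays (isodow 1-5) minus any
-- dates listed in a hypothetical non_working_days table.
SELECT count(*)
FROM generate_series('2019-08-01'::date, '2019-08-31'::date,
                     '1 day') AS d(day)
WHERE extract(isodow FROM d.day) BETWEEN 1 AND 5
  AND d.day::date NOT IN (SELECT off_date FROM non_working_days);
```

Per-employee days off just make non_working_days keyed by employee + date, as described above.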



-- 
Steven Lembark 3920 10th Ave South
Workhorse Computing   Birmingham, AL 35222
lemb...@wrkhors.com+1 888 359 3508




Re: Exclusion constraints on overlapping text arrays?

2019-08-30 Thread Tom Lane
Ken Tanzer  writes:
> Hi.  Using 9.6.14, I was setting up a table with this:
> EXCLUDE using gist (grant_number_codes with &&)
> Where grant_number_codes is a varchar[].  I get this error:
> ERROR:  data type character varying[] has no default operator class for
> access method "gist"

> I did some Googling, and it looks like you could do this for integer arrays
> with the intarray extension.  I didn't see anything I'd recognize as an
> equivalent for varchar (or text) arrays.  Is there any way to do this now?
> And if not, is there much prospect of this being implemented at some point?

I haven't heard of anyone working on it recently.

> Also, on a side note, I tried using grant_number_codes::text[] with &&, but
> that got a syntax error.  Does that mean casting isn't allowed at all in
> these constraints?

As far as that goes, you'd just need to add parentheses.
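That is, the cast expression just needs its own set of parentheses. A sketch with a hypothetical table name; note it still requires a suitable gist operator class for text[], e.g. from an extension such as smlar:

```sql
ALTER TABLE grants ADD CONSTRAINT grants_no_overlap
    EXCLUDE USING gist ((grant_number_codes::text[]) _text_sml_ops WITH &&);
```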

regards, tom lane




SQL equivalent of #include directive ?

2019-08-30 Thread stan


I thought this would be common. But a quick Google only revealed what look to be
workarounds.

I am defining a bunch of functions, and I would prefer to store them in a
separate file, which then gets "sourced" by the main DB init file.

Is there a standard way to do this?

-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin





Re: SQL equivalent of #include directive ?

2019-08-30 Thread Pavel Stehule
pá 30. 8. 2019 v 15:49 odesílatel stan  napsal:

>
> I thought this would be common. But a quick Google only revealed what look
> to be
> workarounds.
>
> I am defining a bunch  of functions, and I would prefer to store them in a
> separate file, which then gets "source" by the main DB init file.
>
> Is there a standard way to do this?
>

no, there is not

Pavel


> --
> "They that would give up essential liberty for temporary safety deserve
> neither liberty nor safety."
> -- Benjamin Franklin
>
>
>
>


Re: SSPI auth and mixed case usernames

2019-08-30 Thread Stephen Frost
Greetings,

* Niels Jespersen (n...@dst.dk) wrote:
>Hello Magnus
>Thank you for your prompt reply. 
>I’m not sure I understand your last statement. I want to achieve that
>regardless of the case of the entered username is logged into the same
>Postgres user (whose name is created in all lowercase).
>In other words, Windows usernames one day entered as XYz, the next day
>entered as xYz, should logon to Postgres user xyz.

You just have to make sure that the users tell whatever program they're
using to connect to PG (like psql, pgAdmin, whatever) that their PG
username is 'xyz'.  If they try to log in as 'XYZ' then that's gonna be
a different PG user.

If you have some other application that's being used to connect to PG
then you could do the lowercase in the app...

Thanks,

Stephen




Re: SSPI auth and mixed case usernames

2019-08-30 Thread Magnus Hagander
On Fri, Aug 30, 2019 at 3:00 PM Niels Jespersen  wrote:

> Hello Magnus
>
> Thank you for your prompt reply.
>
> I’m not sure I understand your last statement. I want to achieve that
> regardless of the case of the entered username is logged into the same
> Postgres user (whose name is created in all lowercase).
>
> In other words, Windows usernames one day entered as XYz, the next day
> entered as xYz, should logon to Postgres user xyz.
>
>
Right. The client application needs to ensure that the username passed to
PostgreSQL is lowercased before it tries to connect. The only thing
pg_ident mapping does is make sure that it actually works when the client
application does this, but it's the client that has to do it.

For example, if using psql you can't say "psql -h myserver.domain.com -d
mydatabase", you have to explicitly say "psql -h myserver.domain.com -d
mydatabase -U mylowercaseusername"

//Magnus


Re: postgres=# VACUUM FULL pg_statistic => ERROR: missing chunk number 0 for toast value .. in pg_toast_2619

2019-08-30 Thread Justin Pryzby
Moving this old thread to -hackers
https://www.postgresql.org/message-id/flat/20180519142603.GA30060%40telsasoft.com

I wanted to mention that this seems to still be an issue, now running pg11.5.

log_time   | 2019-08-30 23:20:00.118+10
user_name  | postgres
database   | ts
session_id | 5d69227e.235
session_line   | 1
command_tag| CLUSTER
session_start_time | 2019-08-30 23:19:58+10
error_severity | ERROR
sql_state_code | XX000
message| unexpected chunk number 1 (expected 0) for toast value 
2369261203 in pg_toast_2619
query  | CLUSTER pg_statistic USING 
pg_statistic_relid_att_inh_index
application_name   | psql

Note that my original report was for a "missing" chunk during VACUUM FULL, and
the current error is an "unexpected chunk" during CLUSTER.  I imagine that's a
related issue.  I haven't seen this in a while (but stopped trying to reproduce
it long ago).  A recently-deployed update to this maintenance script is
probably why it's now doing CLUSTER.

On Fri, May 25, 2018 at 08:49:50AM -0500, Justin Pryzby wrote:
> On Sat, May 19, 2018 at 02:39:26PM -0400, Tom Lane wrote:
> > Justin Pryzby  writes:
> > > [pryzbyj@database ~]$ while :; do for db in `psql postgres -Atc "SELECT 
> > > datname FROM pg_database WHERE datallowconn"`; do for t in pg_statistic 
> > > pg_attrdef pg_constraint; do echo "$db.$t..."; 
> > > PGOPTIONS=-cstatement_timeout='9s' psql $db -qc "VACUUM FULL $t"; done; 
> > > done; done
> > 
> > > ...
> > > postgres.pg_statistic...
> > > postgres.pg_attrdef...
> > > postgres.pg_constraint...
> > > template1.pg_statistic...
> > > template1.pg_attrdef...
> > > template1.pg_constraint...
> > > ts.pg_statistic...
> > > ERROR:  canceling statement due to statement timeout
> > > ts.pg_attrdef...
> > > ts.pg_constraint...
> > > postgres.pg_statistic...
> > > ERROR:  missing chunk number 0 for toast value 3372855171 in pg_toast_2619
> > 
> > Hm, so was the timeout error happening every time through on that table,
> > or just occasionally, or did you provoke it somehow?  I'm wondering how
> > your 9s timeout relates to the expected completion time.
> 
> Actually statement_timeout isn't essential for this (maybe it helps to trigger
> it more often - not sure).
> 
> 
> Could you try:
> time sh -ec 'while :; do time psql postgres -c "VACUUM FULL VERBOSE 
> pg_toast.pg_toast_2619"; psql postgres -c "VACUUM FULL VERBOSE pg_statistic"; 
> done'; date
> 
> Three servers experienced error within 30min, but one server didn't fail until
> 12h later, and a handful others still haven't failed..
> 
> Does this help at all ?
>  2018-05-24 21:57:49.98-03  | 5b075f8d.1ad1 | LOG| pryzbyj   | 
> postgres | statement: VACUUM FULL VERBOSE pg_toast.pg_toast_2619
>  2018-05-24 21:57:50.067-03 | 5b075f8d.1ad1 | INFO   | pryzbyj   | 
> postgres | vacuuming "pg_toast.pg_toast_2619"
>  2018-05-24 21:57:50.09-03  | 5b075f8d.1ad1 | INFO   | pryzbyj   | 
> postgres | "pg_toast_2619": found 0 removable, 408 nonremovable row versions 
> in 99 pages
>  2018-05-24 21:57:50.12-03  | 5b075f8e.1ada | LOG| pryzbyj   | 
> postgres | statement: VACUUM FULL VERBOSE pg_statistic
>  2018-05-24 21:57:50.129-03 | 5b075f8e.1ada | INFO   | pryzbyj   | 
> postgres | vacuuming "pg_catalog.pg_statistic"
>  2018-05-24 21:57:50.185-03 | 5b075f8e.1ada | ERROR  | pryzbyj   | 
> postgres | missing chunk number 0 for toast value 3382957233 in pg_toast_2619
> 
> Same thing; this server has autovacuum logging, although it's not clear to me
> if that's an essential component of the problem, either:
>  2018-05-24 21:16:39.856-06 | LOG   | 5b078017.7b99 | pryzbyj   | postgres | 
> statement: VACUUM FULL VERBOSE pg_toast.pg_toast_2619
>  2018-05-24 21:16:39.876-06 | LOG   | 5b078010.7968 |   |  | 
> automatic vacuum of table "postgres.pg_toast.pg_toast_2619": index scans: 1   
>  +
> |   |   |   |  | 
> pages: 0 removed, 117 r
>  2018-05-24 21:16:39.909-06 | INFO  | 5b078017.7b99 | pryzbyj   | postgres | 
> vacuuming "pg_toast.pg_toast_2619"
>  2018-05-24 21:16:39.962-06 | INFO  | 5b078017.7b99 | pryzbyj   | postgres | 
> "pg_toast_2619": found 0 removable, 492 nonremovable row versions in 117 pages
>  2018-05-24 21:16:40.025-06 | LOG   | 5b078018.7b9b | pryzbyj   | postgres | 
> statement: VACUUM FULL VERBOSE pg_statistic
>  2018-05-24 21:16:40.064-06 | INFO  | 5b078018.7b9b | pryzbyj   | postgres | 
> vacuuming "pg_catalog.pg_statistic"
>  2018-05-24 21:16:40.145-06 | ERROR | 5b078018.7b9b | pryzbyj   | postgres | 
> missing chunk number 0 for toast value 765874692 in pg_toast_2619
> 
> Or this one?
> 
> postgres=# SELECT log_time, database, user_name, session_id, 
> left(message,999) FROM postgres_log WHERE (log_time>='2018-05-24 19:56' AND 
> log_time<'2018-05-24 19:58') AND (database='po

Re: SQL equivalent of #include directive ?

2019-08-30 Thread Thomas Kellerer

stan schrieb am 30.08.2019 um 15:48:

I thought this would be common. But a quick Google only revealed what look to be
workarounds.

I am defining a bunch of functions, and I would prefer to store them in a
separate file, which then gets "source" by the main DB init file.

Is there a standard way to do this?



Define "standard".

With psql you can use the \i directive from within a SQL script.

But that does not work with other SQL clients.

It all depends on the SQL client you use to run those scripts.

I use Liquibase to manage schema setup and migration - that has a totally different 
"include" directive than psql.
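With psql, the pattern looks like this (file names are illustrative):

```sql
-- main.sql, run with:  psql -d mydb -f main.sql
-- \i reads and executes each named file as if its contents appeared here.
\i functions.sql
\i views.sql
```

Other clients (Liquibase, DBeaver, etc.) each have their own include mechanism, if any.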







Re: SSPI auth and mixed case usernames

2019-08-30 Thread Niels Jespersen







Ok, yes. It’s a server for analytics running R, so users specify their own connection string. My initial thought was to leave the username out of the
 connection string, but I now think I will specify the username in lowercase. 


Thank you, and Magnus also. 


Niels 







Fra: Stephen Frost 
Dato: 30. august 2019 kl. 16.21.39 CEST
Til: Niels Jespersen 
Cc: Magnus Hagander , pgsql-general@lists.postgresql.org 
Emne: Re: SSPI auth and mixed case usernames




Greetings,

* Niels Jespersen (n...@dst.dk) wrote:
>    Hello Magnus
>    Thank you for your prompt reply. 
>    I’m not sure I understand your last statement. I want to achieve that
>    regardless of the case of the entered username is logged into the same
>    Postgres user (whose name is created in all lowercase).
>    In other words, Windows usernames one day entered as XYz, the next day
>    entered as xYz, should logon to Postgres user xyz.

You just have to make sure that the users tell whatever program they're
using to connect to PG (like psql, pgAdmin, whatever) that their PG
username is 'xyz'.  If they try to log in as 'XYZ' then that's gonna be
a different PG user.

If you have some other application that's being used to connect to PG
then you could do the lowercase in the app...

Thanks,

Stephen








Escape parentheses in aggregate function

2019-08-30 Thread Rich Shepard

I want to copy query results to a text file and there's an aggregate
function in the SELECT expression. One of the aggregate function's
parentheses seems to end the \copy() function and I don't know how best to
write the statement. A minimal example:

\copy(select count_value, sum(count_value)
from table_name) to 'output_file.txt';

Do I write sumE'('count_valueE')', sum\(count_value\), or something else?

TIA,

Rich




Re: Escape parentheses in aggregate function

2019-08-30 Thread Tom Lane
Rich Shepard  writes:
> I want to copy query results to a text file and there's an aggregate
> function in the SELECT expression. One of the aggregate function's
> parentheses seems to end the \copy() function and I don't know how best to
> write the statement. A minimal example:

> \copy(select count_value, sum(count_value)
> from table_name) to 'output_file.txt';

I think your problem is the line break, not the parentheses.
psql knows how to count parens, but it has no concept of letting
backslash commands continue across lines.

regards, tom lane




Re: Escape parentheses in aggregate function

2019-08-30 Thread Rich Shepard

On Fri, 30 Aug 2019, Tom Lane wrote:


I think your problem is the line break, not the parentheses. psql knows
how to count parens, but it has no concept of letting backslash commands
continue across lines.


Tom,

Interesting. I've adopted separating select statements by their expressions
as it makes it easier for me to get it right, or fix it. However, I can put
it all on a single line once I know the syntax is correct and I add the
\copy() function.

Thanks,

Rich




Re: Escape parentheses in aggregate function

2019-08-30 Thread Tom Lane
Rich Shepard  writes:
> On Fri, 30 Aug 2019, Tom Lane wrote:
>> I think your problem is the line break, not the parentheses. psql knows
>> how to count parens, but it has no concept of letting backslash commands
>> continue across lines.

> Interesting. I've adopted separating select statements by their expressions
> as it makes it easier for me to get it right, or fix it. However, I can put
> it all on a single line once I know the syntax is correct and I add the
> \copy() function.

FWIW, if you're running a current release then there's a reasonable
alternative for writing multi-line COPY-from-query commands; see commit
log below.

regards, tom lane


Author: Tom Lane 
Branch: master Release: REL_12_BR [6d3ede5f1] 2019-01-26 14:15:42 -0500
Branch: REL_11_STABLE Release: REL_11_2 [2c50c9f23] 2019-01-26 14:15:42 -0500
Branch: REL_10_STABLE Release: REL_10_7 [8e97a97b3] 2019-01-26 14:15:42 -0500
Branch: REL9_6_STABLE Release: REL9_6_12 [ae4c7d5ab] 2019-01-26 14:15:42 -0500
Branch: REL9_5_STABLE Release: REL9_5_16 [cda1e27fb] 2019-01-26 14:15:42 -0500

Fix psql's "\g target" meta-command to work with COPY TO STDOUT.

Previously, \g would successfully execute the COPY command, but
the target specification if any was ignored, so that the data was
always dumped to the regular query output target.  This seems like
a clear bug, so let's not just fix it but back-patch it.

While at it, adjust the documentation for \copy to recommend
"COPY ... TO STDOUT \g foo" as a plausible alternative.

Back-patch to 9.5.  The problem exists much further back, but the
code associated with \g was refactored enough in 9.5 that we'd
need a significantly different patch for 9.4, and it doesn't
seem worth the trouble.

Daniel Vérité, reviewed by Fabien Coelho

Discussion: 
https://postgr.es/m/15dadc39-e050-4d46-956b-dcc4ed098...@manitou-mail.org
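
Applied to the question upthread, the recommended pattern might look like this (table and file names taken from the earlier example; an untested sketch):

```sql
-- Multi-line is fine here: \g terminates the query buffer the way ; would,
-- and (with the patch above) sends COPY output to the named file.
COPY (
    SELECT sum(count_value)
    FROM table_name
) TO STDOUT \g output_file.txt
```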




Re: Escape parentheses in aggregate function

2019-08-30 Thread Rich Shepard

On Fri, 30 Aug 2019, Tom Lane wrote:


FWIW, if you're running a current release then there's a reasonable
alternative for writing multi-line COPY-from-query commands; see commit
log below.


Thanks, Tom. I'm running 11.1 and plan to upgrade to 11.5 this weekend.

Regards,

Rich




Aggregate functions on groups

2019-08-30 Thread Rich Shepard

Tables hold data on fish counts by stream name, species, and (unreported)
collection dates. I'm trying to write a query that returns the total number
of each species in each stream.

The latest attempt is (lines wrapped by alpine; submitted as one line):

\copy (select f.stream_tribs, f.count_value, sum(f.count_value),
i.common_name, i.sci_name  from fish_counts as f, itis as i where
f.stream_tribs like '%Nehalem River%' group by f.stream_tribs,
i.common_name, i.sci_name, f.count_value  order by f.stream_tribs,
i.common_name, i.sci_name, f.count_value) to
'/home/rshepard/projects/oregon/mohler_sand/data/fish/fishes.dat';

The returned set starts this way:

Anderson Creek trib to Nehalem River    0   0   Black crappie    Pomoxis nigromaculatus
Anderson Creek trib to Nehalem River    3   3   Black crappie    Pomoxis nigromaculatus
Anderson Creek trib to Nehalem River    0   0   Bluegill         Lepomis macrochirus
Anderson Creek trib to Nehalem River    3   3   Bluegill         Lepomis macrochirus
Anderson Creek trib to Nehalem River    0   0   Brook trout      Salvelinus fontinalis
Anderson Creek trib to Nehalem River    3   3   Brook trout      Salvelinus fontinalis
Anderson Creek trib to Nehalem River    0   0   Brown bullhead   Ameiurus nebulosus
Anderson Creek trib to Nehalem River    3   3   Brown bullhead   Ameiurus nebulosus

What I want returned would look like this:

Anderson Creek trib to Nehalem River  Black crappie  Pomoxis nigromaculatus 3
Anderson Creek trib to Nehalem River  Bluegill   Lepomis macrochirus3
Anderson Creek trib to Nehalem River  Brook troutSalvelinus fontinalis  3
Anderson Creek trib to Nehalem River  Brown bullhead Ameiurus nebulosus 3

I've read the manual yet must not have seen the section explaining how to
apply aggregate functions to groups.

Thanks in advance,

Rich




Re: Aggregate functions on groups [RESOLVED]

2019-08-30 Thread Rich Shepard

On Fri, 30 Aug 2019, John W Higgins wrote:


You are grouping by count_value which means that you are asking the system
to return a row for each different count_value.


John,

I didn't realize this.


So if you remove the f.count_value from the select statement (not the
sum(f.count_value)) - and you remove f.count_value from the group_by and
order_by statements - you should get what you want


Aha. I thought I had to select f.count_value in order to obtain
sum(f.count_value); it's been a long time since I needed to do something
like this.


Something like

\copy (select f.stream_tribs, sum(f.count_value),
i.common_name, i.sci_name  from fish_counts as f, itis as i where
f.stream_tribs like '%Nehalem River%' group by f.stream_tribs,
i.common_name, i.sci_name  order by f.stream_tribs,
i.common_name, i.sci_name) to
'/home/rshepard/projects/oregon/mohler_sand/data/fish/fishes.dat';


Thanks very much!

Rich




Returning a table from a function, that requires multiple selects?

2019-08-30 Thread stan
I need to encapsulate what are basically 2 related function calls into a single
function. The result of each of the calls is a date type.

My current thinking is to return a 2-row table with the 2 dates in it. But I
seem to be having issues getting this to work.

Is it possible for a function to return a table with results from multiple
queries?

-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin




Re: Returning a table from a function, that requires multiple selects?

2019-08-30 Thread Guyren Howe
On Aug 30, 2019, at 13:03 , stan  wrote:

> I need to encapsulate, what are basically 2 related function calls into a 
> single
> function. The result of each of th calls is a date type.
> 
> y current thinking is to return a 2 row table with the 2 dates in it. But, I 
> seem to
> be having issues getting this to work.
> 
> Is it possible for a function to return a table with results from multiple
> queries?

You could just return a tuple VALUES(a, b). Or you could define a type to 
return if you want to get fancy.

Re: Returning a table from a function, that requires multiple selects?

2019-08-30 Thread Rob Sargent


> On Aug 30, 2019, at 2:09 PM, Guyren Howe  wrote:
> 
> On Aug 30, 2019, at 13:03 , stan mailto:st...@panix.com>> 
> wrote:
> 
>> I need to encapsulate, what are basically 2 related function calls into a 
>> single
>> function. The result of each of th calls is a date type.
>> 
>> y current thinking is to return a 2 row table with the 2 dates in it. But, I 
>> seem to
>> be having issues getting this to work.
>> 
>> Is it possible for a function to return a table with results from multiple
>> queries?
> 
> You could just return a tuple VALUES(a, b). Or you could define a type to 
> return if you want to get fancy.
Here you might want VALUES(array['heading1', a], array['heading2', b]) unless 
you're certain you know which date is which.

Re: Returning a table from a function, that requires multiple selects?

2019-08-30 Thread Rob Sargent


> On Aug 30, 2019, at 2:03 PM, stan  wrote:
> 
> I need to encapsulate, what are basically 2 related function calls into a 
> single
> function. The result of each of th calls is a date type.
> 
> y current thinking is to return a 2 row table with the 2 dates in it. But, I 
> seem to
> be having issues getting this to work.
> 
> Is it possible for a function to return a table with results from multiple
> queries?
> 
> -- 
> "They that would give up essential liberty for temporary safety deserve
> neither liberty nor safety."
>   -- Benjamin Franklin
> 
> 
If the two queries are identical in returned columns you might be able to use 
UNION:
select f1.* from first_query as f1 UNION select f2.* from second_query as f2;

You can’t do any processing of f1 or f2.
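
A two-row table can also come from a set-returning function that wraps both queries; a sketch under assumed table and column names:

```sql
-- Returns one labeled row per query, so it is clear which date is which.
CREATE FUNCTION both_dates() RETURNS TABLE (which text, d date) AS $$
    SELECT 'earliest'::text, min(order_date) FROM orders
    UNION ALL
    SELECT 'latest'::text, max(ship_date) FROM shipments;
$$ LANGUAGE sql;

SELECT * FROM both_dates();
```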



Re: Returning a table from a function, that requires multiple selects?

2019-08-30 Thread Tom Lane
Rob Sargent  writes:
>> On Aug 30, 2019, at 2:09 PM, Guyren Howe  wrote:
>> 
>> On Aug 30, 2019, at 13:03 , stan mailto:st...@panix.com>> 
>> wrote:
>>> Is it possible for a function to return a table with results from multiple
>>> queries?

>> You could just return a tuple VALUES(a, b). Or you could define a type to 
>> return if you want to get fancy.

> Here you might want VALUES(array['heading1', a], array['heading2', b]) unless 
> you're certain you know which date is which.

Yeah, that would be a good reason to return a declared composite type.
Something like

CREATE TYPE two_dates AS (start date, stop date);

CREATE FUNCTION f(...) RETURNS two_dates AS ...;

SELECT * FROM f(...);

regards, tom lane
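
Filled out a little, the composite-type approach might look like the following; the function body and the table it queries are hypothetical placeholders:

```sql
CREATE TYPE two_dates AS (start date, stop date);

-- A SQL-language function whose SELECT list matches the composite type's
-- columns is assembled into a two_dates row automatically.
CREATE FUNCTION order_window(p_order integer) RETURNS two_dates AS $$
    SELECT min(event_date)::date, max(event_date)::date
    FROM order_events
    WHERE order_id = p_order;
$$ LANGUAGE sql;

-- SELECT * expands the composite into two named columns, start and stop:
SELECT * FROM order_window(42);
```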




Re: Returning a table from a function, that requires multiple selects?

2019-08-30 Thread stan
On Fri, Aug 30, 2019 at 04:03:15PM -0400, stan wrote:
> I need to encapsulate, what are basically 2 related function calls into a 
> single
> function. The result of each of th calls is a date type.
> 
> y current thinking is to return a 2 row table with the 2 dates in it. But, I 
> seem to
> be having issues getting this to work.
> 
> Is it possible for a function to return a table with results from multiple
> queries?
> 

Got it working, so yes this can be done.

Next I have to figure out how to configure the next function to accept this 
table.
Error message says something about configuring it to accept a record.
-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin




How to get RAISE messges displayed?

2019-08-30 Thread stan
OK, I am doing enough of this to start using some debug error messages.

I put the following in a function declaration:

RAISE notice 'Called with %', $1 ;

But, when I call the function I do not see anything. I edited postgresql.conf

Like this:

client_min_messages = notice
#client_min_messages = notice   # values in order of decreasing detail:

Even though it looked like this should be the default.

restarted Postgres with:

/etc/init.d/postgresql restart

But I still do not see this, or see it in the log when I call this function.

Ubuntu 18.04 and PostgreSQL 11, if that matters.


-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin




Re: How to get RAISE messges displayed?

2019-08-30 Thread stan
On Fri, Aug 30, 2019 at 06:22:14PM -0400, stan wrote:
> OK, I am doing enough of this to start using some debug error messages.
> 
> I put the following in a function declaration:
> 
> RAISE notice 'Called with %', $1 ;
> 
> But, when I call the function I do not see anything. I edited postgresql.conf
> 
> Like this:
> 
> client_min_messages = notice
> #client_min_messages = notice   # values in order of decreasing 
> detail:
> 
> Event though it looked like this should be the default.
> 
> restarted Postgres with:
> 
> /etc/init.d/postgresql restart
> 
> But I still do not see this, or see it in the log when I call this function.
> 
> Ubuntu 18.04 a Postgresql 11 if that matters.
> 

Got it working.

Not 100% sure what I had wrong.
-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin
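
For reference, a working setup looks roughly like this. Note that RAISE belongs inside a PL/pgSQL function body, not in the declaration section (the function name here is hypothetical):

```sql
CREATE FUNCTION debug_demo(p_val integer) RETURNS integer AS $$
BEGIN
    RAISE NOTICE 'Called with %', p_val;
    RETURN p_val * 2;
END;
$$ LANGUAGE plpgsql;

-- client_min_messages can be set per session; no server restart needed:
SET client_min_messages = notice;
SELECT debug_demo(21);
-- psql prints:  NOTICE:  Called with 21
```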




Re: Exclusion constraints on overlapping text arrays?

2019-08-30 Thread Ken Tanzer
On Fri, Aug 30, 2019 at 12:59 AM Achilleas Mantzios <
ach...@matrix.gatewaynet.com> wrote:

> On 30/8/19 10:20 π.μ., Achilleas Mantzios wrote:
>
> Maybe take a look at
> https://dba.stackexchange.com/questions/185174/error-data-type-text-has-no-default-operator-class-for-access-method-gist
>
>
> So you download latest version of smlar from here :
> http://sigaev.ru/git/gitweb.cgi?p=smlar.git
> following is from commands given to FreeBSD but you get the point
>
>
Well I set this up on a test machine running 11.4, and it certainly seems
to do what it's supposed to.  Pretty cool, and thanks for pointing this out!

Is there much or any prospect of this becoming a supported extension?

And Tom--thanks for the pointer on the parentheses!

Cheers,
Ken


-- 
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://demo.agency-software.org/client/
ken.tan...@agency-software.org
(253) 245-3801

Subscribe to the mailing list to learn more about AGENCY or follow the discussion.


Re: Aggregate functions on groups

2019-08-30 Thread Morris de Oryx
Your tributaries and fish master tables make sense. If I read your code
right, you're grouping by too many columns. I flattened the data into a
survey table for this simple example:

select tributary,
       common_name,
       scientific_name,
       sum(count_value)   as fish_seen,
       count(count_value) as observations_made
  from survey
 group by 1, 2, 3;  -- GROUP BY can use positions in the select list, if you feel like typing less.


tributary                             common_name     scientific_name         fish_seen  observations_made
Anderson Creek trib to Nehalem River  Black crappie   Pomoxis nigromaculatus  3          2
Anderson Creek trib to Nehalem River  Brook trout     Salvelinus fontinalis   3          2
Anderson Creek trib to Nehalem River  Bluegill        Lepomis macrochirus     3          2
Anderson Creek trib to Nehalem River  Brown bullhead  Ameiurus nebulosus      3          2

But this is not why I'm answering. I'm responding as I wanted to make sure
that you're aware of the pg-similarity extension:

https://salsa.debian.org/postgresql/pg-similarity

This tool implements a *lot* of similarity measures for fuzzy comparisons.
Some are string-oriented algorithms (Jaro-Winkler, Soundex, Levenshtein,
etc.), and others derive from and/or apply to field population comparisons,
like the Jaccard and Dice Coefficients. There's a lot of great stuff in the
package.

On Sat, Aug 31, 2019 at 3:14 AM Rich Shepard 
wrote:

> Tables hold data on fish counts by stream name, species, and (unreported)
> collection dates. I'm trying to write a query that returns the total number
> of each species in each stream.
>
> The latest attempt is (lines wrapped by alpine; submitted as one line):
>
> \copy (select f.stream_tribs, f.count_value, sum(f.count_value),
> i.common_name, i.sci_name  from fish_counts as f, itis as i where
> f.stream_tribs like '%Nehalem River%' group by f.stream_tribs,
> i.common_name, i.sci_name, f.count_value  order by f.stream_tribs,
> i.common_name, i.sci_name, f.count_value) to
> '/home/rshepard/projects/oregon/mohler_sand/data/fish/fishes.dat';
>
> The returned set starts this way:
>
> Anderson Creek trib to Nehalem River0   0   Black crappie
>  Pomoxis nigromaculatus
> Anderson Creek trib to Nehalem River3   3   Black crappie
>  Pomoxis nigromaculatus
> Anderson Creek trib to Nehalem River0   0   Bluegill
> Lepomis macrochirus
> Anderson Creek trib to Nehalem River3   3   Bluegill
> Lepomis macrochirus
> Anderson Creek trib to Nehalem River0   0   Brook trout
>  Salvelinus fontinalis
> Anderson Creek trib to Nehalem River3   3   Brook trout
>  Salvelinus fontinalis
> Anderson Creek trib to Nehalem River0   0   Brown bullhead
> Ameiurus nebulosus
> Anderson Creek trib to Nehalem River3   3   Brown bullhead
> Ameiurus nebulosus
>
> What I want returned would look like this:
>
> Anderson Creek trib to Nehalem River  Black crappie  Pomoxis
> nigromaculatus 3
> Anderson Creek trib to Nehalem River  Bluegill   Lepomis macrochirus
>   3
> Anderson Creek trib to Nehalem River  Brook troutSalvelinus
> fontinalis  3
> Anderson Creek trib to Nehalem River  Brown bullhead Ameiurus nebulosus
>  3
>
> I've read the manual yet must have not seen the section explaining how to
> apply aggregate functions to groups.
>
> Thanks in advance,
>
> Rich
>
>
>