Re: PostgreSQL ping/pong to client
Maybe postgresql (application) layer ping pong. Enabling system wise timeouts can impact my whole setup including my web app, thats why it is hard for me to configure them. On Wed, Apr 17, 2019, 22:12 Francisco Olarte wrote: > On Wed, Apr 17, 2019 at 4:49 PM Ajay Pratap > wrote: > > Correction: I meant when my java application dies postgres should break > all the connections that were associated with that peer. > > And how is the server supposed to detect that without keepalives? TCP > is dessigned to survice for extended period of times without traffic, > I used that a lot in the dial up times. > > And what makes you think keepalives are impactful and unrealistic? I > use them a lot, they do not impact my workloads measurably. > > Francisco Olarte. > -- *Disclaimer:* The information in this communication is confidential and may be legally privileged. It is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking action in reliance of the contents of this information is strictly prohibited and may be unlawful. Drishti is neither liable for the improper, incomplete transmission of the information contained in this communication nor any delay in its receipt. The communication is not intended to operate as an electronic signature under any applicable law. Drishti assumes no responsibility for any loss or damage resulting from the use of e-mails.
Re: Multicolumn index for single-column queries?
On 18/04/2019 18:52, rihad wrote: Hi. Say there are 2 indexes: "foo_index" btree (foo_id) "multi_index" btree (foo_id, approved, expires_at) foo_id is an integer. Some queries involve all three columns in their WHERE clauses, some involve only foo_id. Would it be ok from general performance standpoint to remove foo_index and rely only on multi_index? I know that PG would have to do less work updating just one index compared to updating them both, but wouldn't searches on foo_id alone become slower? Thanks. The multi column index will require more RAM to hold it. So if there is memory contention, then there would be an increased risk of swapping, leading to slower query times. I suspect that if there is more than enough RAM, then a multi column index will be slightly slower than a single column index. However, the difference will probably be lost in the noise -- in other words, the various things happening in the background will most likely to have far more significant impact on query duration. IMHO Cheers, Gavin
Re: Multicolumn index for single-column queries?
Am 18.04.19 um 08:52 schrieb rihad: Hi. Say there are 2 indexes: "foo_index" btree (foo_id) "multi_index" btree (foo_id, approved, expires_at) foo_id is an integer. Some queries involve all three columns in their WHERE clauses, some involve only foo_id. Would it be ok from general performance standpoint to remove foo_index and rely only on multi_index? I know that PG would have to do less work updating just one index compared to updating them both, but wouldn't searches on foo_id alone become slower? it depends . it depends on the queries you are using, on your workload. a multi-column-index will be large than an index over just one column, therefore you will have more disk-io when you read from such an index. Regards, Andreas -- 2ndQuadrant - The PostgreSQL Support Company. www.2ndQuadrant.com
Re: Multicolumn index for single-column queries?
Andreas Kretschmer wrote: > Am 18.04.19 um 08:52 schrieb rihad: > > Hi. Say there are 2 indexes: > > > > "foo_index" btree (foo_id) > > > > "multi_index" btree (foo_id, approved, expires_at) > > > > > > foo_id is an integer. Some queries involve all three columns in their > > WHERE clauses, some involve only foo_id. > > Would it be ok from general performance standpoint to remove foo_index > > and rely only on multi_index? I know that > > PG would have to do less work updating just one index compared to > > updating them both, but wouldn't searches > > on foo_id alone become slower? > > it depends . > > it depends on the queries you are using, on your workload. a > multi-column-index will be large than an index over just one column, > therefore you will have more disk-io when you read from such an index. To be more explicit: if you can live with a slightly less efficient index scan and want fast data modifications, use only the second index. If you hardly ever update the table, don't mind the wasted space and want every bit of query speed (data warehouse), having both indexes might be better. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Multicolumn index for single-column queries?
On 4/18/19 2:14 AM, Andreas Kretschmer wrote: Am 18.04.19 um 08:52 schrieb rihad: Hi. Say there are 2 indexes: "foo_index" btree (foo_id) "multi_index" btree (foo_id, approved, expires_at) foo_id is an integer. Some queries involve all three columns in their WHERE clauses, some involve only foo_id. Would it be ok from general performance standpoint to remove foo_index and rely only on multi_index? I know that PG would have to do less work updating just one index compared to updating them both, but wouldn't searches on foo_id alone become slower? it depends . it depends on the queries you are using, on your workload. a multi-column-index will be large than an index over just one column, therefore you will have more disk-io when you read from such an index. But two indexes are larger than one index, and updating two indexes requires more disk IO than updating one index. (Prefix compression would obviate the need for this question. Then your multi-column index would be *much* smaller.) -- Angular momentum makes the world go 'round.
Re: Multicolumn index for single-column queries?
On 19/04/2019 01:24, Ron wrote: On 4/18/19 2:14 AM, Andreas Kretschmer wrote: Am 18.04.19 um 08:52 schrieb rihad: Hi. Say there are 2 indexes: "foo_index" btree (foo_id) "multi_index" btree (foo_id, approved, expires_at) foo_id is an integer. Some queries involve all three columns in their WHERE clauses, some involve only foo_id. Would it be ok from general performance standpoint to remove foo_index and rely only on multi_index? I know that PG would have to do less work updating just one index compared to updating them both, but wouldn't searches on foo_id alone become slower? it depends . it depends on the queries you are using, on your workload. a multi-column-index will be large than an index over just one column, therefore you will have more disk-io when you read from such an index. But two indexes are larger than one index, and updating two indexes requires more disk IO than updating one index. Agreed. A key question would be: how often is the query run, compared to the frequency Insertions, Updates, and Deletions -- wrt the table. (Prefix compression would obviate the need for this question. Then your multi-column index would be *much* smaller.) True, but a multi column index will still be bigger than single column index. [...]
Re: PostgreSQL ping/pong to client
Ajay Pratap writes: > Enabling system wise timeouts can impact my whole setup including my web > app, thats why it is hard for me to configure them. Uh ... who said anything about system-wide timeouts? The normal way to set this up is to configure the tcp_keepalives_xxx PG parameters, which will be applied to connections to the Postgres server, nothing else. regards, tom lane
Re: Multicolumn index for single-column queries?
Andreas Kretschmer writes: > Am 18.04.19 um 08:52 schrieb rihad: >> Hi. Say there are 2 indexes: >> >> "foo_index" btree (foo_id) >> >> "multi_index" btree (foo_id, approved, expires_at) >> >> >> foo_id is an integer. Some queries involve all three columns in >> their WHERE clauses, some involve only foo_id. >> Would it be ok from general performance standpoint to remove >> foo_index and rely only on multi_index? I know that >> PG would have to do less work updating just one index compared to >> updating them both, but wouldn't searches >> on foo_id alone become slower? > > it depends . > > it depends on the queries you are using, on your workload. a > multi-column-index will be large than an index over just one column, > therefore you will have more disk-io when you read from such an index. I think it also depends on the average number of rows having the same foo_id.
Re: Multicolumn index for single-column queries?
On 4/18/19 8:45 AM, Gavin Flower wrote: On 19/04/2019 01:24, Ron wrote: On 4/18/19 2:14 AM, Andreas Kretschmer wrote: [snip] (Prefix compression would obviate the need for this question. Then your multi-column index would be *much* smaller.) True, but a multi column index will still be bigger than single column index. TANSTAAFL. -- Angular momentum makes the world go 'round.
Re: Method to pass data between queries in a multi-statement transaction
On 4/17/19 5:30 PM, Souvik Bhattacherjee wrote: There are few if any situations where you need to immediately and completely pass all values from one query to another in the same transaction where the queries cannot just be combined into a single statement. Your representative example is one that is easily combined into a single statement. > What if I need the result of the join to be stored into table3 as well as the tuples that participated in the query to be deleted from table1. The following can be done without the need to transfer values from the previous query into the next: begin; insert into table3 (id, attr1, attr2) (select t1.cid, t1.empname, t2.dept from table1 t1, table2 t2 where t1.cid = t2.cid); delete from table1 where cid in (select c.cid from table1 t1, table2 t2 where t1.cid = t2.cid); Well the DELETE is not going to work as c.cid will error as undefined. commit; However note that we have to perform the join twice, which is not efficient. Now to make things worse, increase the number of tables to join while imposing the requirement of tuple deletion to apply to all or to a subset of the tables that participate in join. You might want to take a look at CTE's: https://www.postgresql.org/docs/11/queries-with.html Now, the stuff you are trying seems to indicate you are trying to do something in C, inside the engine itself, with all of this. If that is the case you may want to be more clear as to what you are attempting to do. But as far as server SQL goes the only persistence area are tables/relations - including temporary ones. I'm trying to modify the engine here. -SB On Wed, Apr 17, 2019 at 6:16 PM David G. Johnston mailto:david.g.johns...@gmail.com>> wrote: On Wed, Apr 17, 2019 at 3:04 PM Souvik Bhattacherjee mailto:kivu...@gmail.com>> wrote: Hello, I'm trying to pass some values between queries in a multi-statement transaction. For example, consider the following representative multi-statement transaction: begin; select * from table1 t1, table2 t2 where t1.cid = t2.cid; delete from table1 where cid in (values-to-be-populated-from-the-previous-query); commit; There are few if any situations where you need to immediately and completely pass all values from one query to another in the same transaction where the queries cannot just be combined into a single statement. Your representative example is one that is easily combined into a single statement. Now, the stuff you are trying seems to indicate you are trying to do something in C, inside the engine itself, with all of this. If that is the case you may want to be more clear as to what you are attempting to do. But as far as server SQL goes the only persistence area are tables/relations - including temporary ones. David J. -- Adrian Klaver adrian.kla...@aklaver.com
Re: Method to pass data between queries in a multi-statement transaction
Thanks Michel. However this only works if a is an unique attribute in the table that would help us to identify tuples that participated in the join. Consider the following join: insert into table3 (id, level, empname, salary) (select t0.cid, t0.level, t1.empname, t2.salary from table0 t0, table1 t1, table2 t2 where t0.cid = t1.cid and t1.pid = t2.pid); Now if I want to delete those tuples from table2 that satisfied the join condition, I need to execute the join again with additional attributes. Also note that based on query plan, i.e. whether table0 and table1 were joined first followed by table1 and table2, we have to execute one additional join to get the tuples in table2 that satisfied the join condition (t1.pid = t2.pid). Getting that information while the query is executed may not be difficult. There are other use cases in my application that require me to transfer the data from one query to the next within a transaction. Thus, what I'm looking for here is way to store the information and then pass that information to the next query efficiently. For example, is it possible to define a struct of my choice, private to the current transaction, that would store the data and then pass it around to the next query in the transaction without having to materialize that struct (or deal with concurrency issues as in the hash table approach mentioned earlier) . -SB On Wed, Apr 17, 2019 at 10:47 PM Michel Pelletier < pelletier.mic...@gmail.com> wrote: > On Wed, Apr 17, 2019 at 5:30 PM Souvik Bhattacherjee > wrote: > >> > What if I need the result of the join to be stored into table3 as well >> as the tuples that participated in the query to be deleted from table1. The >> following can be done without the need to transfer values from the previous >> query into the next: >> >> begin; >> insert into table3 (id, attr1, attr2) (select t1.cid, t1.empname, t2.dept >> from table1 t1, table2 t2 where t1.cid = t2.cid); >> delete from table1 where cid in (select c.cid from table1 t1, table2 t2 >> where t1.cid = t2.cid); >> commit; >> >> > > You can use INSERT...RETURNING in a WITH query: > > postgres=# create table foo (a integer); > CREATE TABLE > postgres=# create table bar (b integer); > CREATE TABLE > postgres=# insert into bar values (42); > INSERT 0 1 > postgres=# with i as (insert into foo values (42) returning a) delete from > bar where b = (select a from i); > DELETE 1 > > -Michel > > >> >> On Wed, Apr 17, 2019 at 6:16 PM David G. Johnston < >> david.g.johns...@gmail.com> wrote: >> >>> >>> On Wed, Apr 17, 2019 at 3:04 PM Souvik Bhattacherjee >>> wrote: >>> Hello, I'm trying to pass some values between queries in a multi-statement transaction. For example, consider the following representative multi-statement transaction: begin; select * from table1 t1, table2 t2 where t1.cid = t2.cid; delete from table1 where cid in (values-to-be-populated-from-the-previous-query); commit; >>> >>> There are few if any situations where you need to immediately and >>> completely pass all values from one query to another in the same >>> transaction where the queries cannot just be combined into a single >>> statement. Your representative example is one that is easily combined into >>> a single statement. >>> >>> Now, the stuff you are trying seems to indicate you are trying to do >>> something in C, inside the engine itself, with all of this. If that is the >>> case you may want to be more clear as to what you are attempting to do. >>> But as far as server SQL goes the only persistence area are >>> tables/relations - including temporary ones. >>> >>> David J. >>> >>
Re: Method to pass data between queries in a multi-statement transaction
Well the DELETE is not going to work as c.cid will error as undefined. > Yes, that's a typo. I haven't tested it out before typing; just wanted to convey the general idea. -SB On Thu, Apr 18, 2019 at 10:50 AM Adrian Klaver wrote: > On 4/17/19 5:30 PM, Souvik Bhattacherjee wrote: > > There are few if any situations where you need to immediately and > > completely pass all values from one query to another in the same > > transaction where the queries cannot just be combined into a single > > statement. Your representative example is one that is easily combined > > into a single statement. > > > > > What if I need the result of the join to be stored into table3 as > > well as the tuples that participated in the query to be deleted from > > table1. The following can be done without the need to transfer values > > from the previous query into the next: > > > > begin; > > insert into table3 (id, attr1, attr2) (select t1.cid, t1.empname, > > t2.dept from table1 t1, table2 t2 where t1.cid = t2.cid); > > delete from table1 where cid in (select c.cid from table1 t1, table2 t2 > > where t1.cid = t2.cid); > > Well the DELETE is not going to work as c.cid will error as undefined. > > > commit; > > > > However note that we have to perform the join twice, which is not > > efficient. Now to make things worse, increase the number of tables to > > join while imposing the requirement of tuple deletion to apply to all or > > to a subset of the tables that participate in join. > > You might want to take a look at CTE's: > > https://www.postgresql.org/docs/11/queries-with.html > > > > > Now, the stuff you are trying seems to indicate you are trying to do > > something in C, inside the engine itself, with all of this. If that is > > the case you may want to be more clear as to what you are attempting to > > do. But as far as server SQL goes the only persistence area are > > tables/relations - including temporary ones. > > > >> I'm trying to modify the engine here. > > > > -SB > > > > On Wed, Apr 17, 2019 at 6:16 PM David G. Johnston > > mailto:david.g.johns...@gmail.com>> wrote: > > > > > > On Wed, Apr 17, 2019 at 3:04 PM Souvik Bhattacherjee > > mailto:kivu...@gmail.com>> wrote: > > > > Hello, > > > > I'm trying to pass some values between queries in a > > multi-statement transaction. For example, consider the following > > representative multi-statement transaction: > > > > begin; > > select * from table1 t1, table2 t2 where t1.cid = t2.cid; > > delete from table1 where cid in > > (values-to-be-populated-from-the-previous-query); > > commit; > > > > > > There are few if any situations where you need to immediately and > > completely pass all values from one query to another in the same > > transaction where the queries cannot just be combined into a single > > statement. Your representative example is one that is easily > > combined into a single statement. > > > > Now, the stuff you are trying seems to indicate you are trying to do > > something in C, inside the engine itself, with all of this. If that > > is the case you may want to be more clear as to what you are > > attempting to do. But as far as server SQL goes the only > > persistence area are tables/relations - including temporary ones. > > > > David J. > > > > > -- > Adrian Klaver > adrian.kla...@aklaver.com >
Re: Method to pass data between queries in a multi-statement transaction
> > Thus, what I'm looking for here is way to store the information and then > pass that information to the next query efficiently. > For example, is it possible to define a struct of my choice, private to > the current transaction, that would store the data and then pass it around > to the next query in the transaction without having to materialize that > struct (or deal with concurrency issues as in the hash table approach > mentioned earlier) . > Perhaps I am missing something obvious, but why not use a temp table?
SQL query
Hi, I have come across a query that a developer wrote to update a few rows in table, the query did update the two desired rows but also updated the rest of the table with the column value as 'false'. Update tableA set col1 = null and col2 in (1,2); The query updated col1 to null for the rows where col2 was either 1 or 2, rest of rows were also updated for col1 to 'false'. The above was run without where clause. Could the experts throw some light on this? Regards Vikas Sharma
SQL query
On Thursday, April 18, 2019, Vikas Sharma wrote: > > The above was run without where clause. > There was no where clause thus every row has to be updated by definition...null and bool evaluates to either false or null since if bool is false the null doesn’t matter and if bool is true the result is unknown from the null. David J.
Re: SQL query
On 4/18/19 9:43 AM, Vikas Sharma wrote: Hi, I have come across a query that a developer wrote to update a few rows in table, the query did update the two desired rows but also updated the rest of the table with the column value as 'false'. Update tableA set col1 = null and col2 in (1,2); The query updated col1 to null for the rows where col2 was either 1 or 2, rest of rows were also updated for col1 to 'false'. The above was run without where clause. Could the experts throw some light on this? Hmm. What Postgres version? Assuming col1 is boolean, correct? My experimentation: create table up_test(id integer, col1 boolean, col2 integer); insert into up_test values (1, 't', 1), (2, null, 2), (3, 't', 4); update up_test set col1=null and col2 in(1,2); UPDATE 3 select * from up_test; id | col1 | col2 +--+-- 1 | |1 2 | |2 3 | f|4 (3 rows) truncate up_test ; TRUNCATE TABLE insert into up_test values (1, 't', 1), (2, null, 2), (3, 't', 4); INSERT 0 3 update up_test set col1=(null and col2 in(1,2)); UPDATE 3 test_(postgres)# select * from up_test; id | col1 | col2 +--+-- 1 | |1 2 | |2 3 | f|4 Looks to me it is seeing the and as part of an expression. Regards Vikas Sharma -- Adrian Klaver adrian.kla...@aklaver.com
Re: Method to pass data between queries in a multi-statement transaction
On Thu, Apr 18, 2019 at 9:06 AM Michael Lewis wrote: > Thus, what I'm looking for here is way to store the information and then >> pass that information to the next query efficiently. >> For example, is it possible to define a struct of my choice, private to >> the current transaction, that would store the data and then pass it around >> to the next query in the transaction without having to materialize that >> struct (or deal with concurrency issues as in the hash table approach >> mentioned earlier) . >> > > Perhaps I am missing something obvious, but why not use a temp table? > Right, or as Adrian and I pointed out, use a CTE (WITH) query, which will materialize any results you want for the query. Perhaps if you give us a working, reproducible, self contained example of what you expect we can help you better.
Re: SQL query
On 4/18/19 11:43 AM, Vikas Sharma wrote: Hi, I have come across a query that a developer wrote to update a few rows in table, the query did update the two desired rows but also updated the rest of the table with the column value as 'false'. Update tableA set col1 = null and col2 in (1,2); The query updated col1 to null for the rows where col2 was either 1 or 2, rest of rows were also updated for col1 to 'false'. The above was run without where clause. On the QA server, right? -- Angular momentum makes the world go 'round.
Re: ERROR: operator class "gin__int_ops" does not exist for access method "gin"
On 4/17/19 9:38 PM, Julie Nishimura wrote: Thank you! All works now Great. I mentioned upstream about migrating to version newer then 9.4. Given the amount of work you are putting into this, would you not be better served by going to a version that has more then ~10 months left on its community support?: https://www.postgresql.org/support/versioning/ Version Current minor Supported First Release Final Release 9.4 9.4.21Yes December 18, 2014 February 13, 2020 -- Adrian Klaver adrian.kla...@aklaver.com
Re: SQL query
Not able to produce this with PG 11.1 If col1 is any type other than boolean, the update statement fails in syntax. If col1 is boolean, then it updated it correctly. In other words update col1 = NULL and col2 in (1,2) is treated same as update col1 = NULL where col2 in (1,2) Also I checked it on DB2 which rejected the UPDATE sql as bad syntax, as expected.
Re: SQL query
Oh wait. I see that it in both cases it did update correct target rows, but the value of col1 for non matching rows is different. In the first case (and col2), the non matching rows also got updated. So yes, same behavior like yours. > Sent: Thursday, April 18, 2019 at 2:36 PM > From: "Ravi Krishna" > To: "Vikas Sharma" > Cc: pgsql-general@lists.postgresql.org > Subject: Re: SQL query > > Not able to produce this with PG 11.1 > > If col1 is any type other than boolean, the update statement fails in syntax. > If col1 is boolean, then it updated it correctly. In other words >update col1 = NULL > and col2 in (1,2) > is treated same as >update col1 = NULL > where col2 in (1,2) > > Also I checked it on DB2 which rejected the UPDATE sql as bad syntax, as > expected. > > > >
Re: SQL query
On 4/18/19 11:36 AM, Ravi Krishna wrote: Not able to produce this with PG 11.1 If col1 is any type other than boolean, the update statement fails in syntax. If col1 is boolean, then it updated it correctly. In other words update col1 = NULL and col2 in (1,2) is treated same as update col1 = NULL where col2 in (1,2) The above is not the same format as OP's query: Update tableA set col1 = null and col2 in (1,2); Also I checked it on DB2 which rejected the UPDATE sql as bad syntax, as expected. -- Adrian Klaver adrian.kla...@aklaver.com
Re: SQL query
> The above is not the same format as OP's query: > > Update tableA set col1 = null and col2 in (1,2); I did include set in the sql. I typed it wrong here.
Re: Method to pass data between queries in a multi-statement transaction
On Thu, Apr 18, 2019 at 9:03 AM Souvik Bhattacherjee wrote: > Thanks Michel. > > However this only works if a is an unique attribute in the table that > would help us to identify tuples that participated in the join. Consider > the following join: > > insert into table3 (id, level, empname, salary) > (select t0.cid, t0.level, t1.empname, t2.salary from table0 t0, table1 > t1, table2 t2 where t0.cid = t1.cid and t1.pid = t2.pid); > > Now if I want to delete those tuples from table2 that satisfied the join > condition, I need to execute the join again with additional attributes. > Or just add pid to table3... > Also note that based on query plan, i.e. whether table0 and table1 were > joined first followed by table1 and table2, we have to execute one > additional join to get the tuples in table2 that satisfied the join > condition (t1.pid = t2.pid). > ??? > Getting that information while the query is executed may not be difficult. > There are other use cases in my application that require me to transfer the > data from one query to the next within a transaction. > There may be some that benefit to some degree but its likely that you can write the application and queries in such a way to avoid a hard requirement. Thus, what I'm looking for here is way to store the information and then > pass that information to the next query efficiently. > For example, is it possible to define a struct of my choice, private to the > current transaction, that would store the data and then pass it around to > the next query in the transaction without having to materialize that struct > (or deal with concurrency issues as in the hash table approach mentioned > earlier) . > How much development and maintenance effort are you willing to spend here to gain what is likely to amount to only a bit of efficiency? Many things are possible if you are going to modify the server code but why add grief? David J.
Re: Multicolumn index for single-column queries?
On 19/04/2019 02:12, Ron wrote: On 4/18/19 8:45 AM, Gavin Flower wrote: On 19/04/2019 01:24, Ron wrote: On 4/18/19 2:14 AM, Andreas Kretschmer wrote: [snip] (Prefix compression would obviate the need for this question. Then your multi-column index would be *much* smaller.) True, but a multi column index will still be bigger than single column index. TANSTAAFL. QUOTE: [Oh, 'tanstaafl.' Means ~There ain't no such thing as a free lunch.'] From The Moon is a Harsh Mistress, by Robert Heinlein. Published 1966 Is where I first came across TANSTAAFL. However, it appears to have been used at least as early as 1949. Just adding this, as probably there are many people who don't know the acronym.
Re: Multicolumn index for single-column queries?
On 19/04/2019 01:47, Harald Fuchs wrote: Andreas Kretschmer writes: Am 18.04.19 um 08:52 schrieb rihad: Hi. Say there are 2 indexes: "foo_index" btree (foo_id) "multi_index" btree (foo_id, approved, expires_at) foo_id is an integer. Some queries involve all three columns in their WHERE clauses, some involve only foo_id. Would it be ok from general performance standpoint to remove foo_index and rely only on multi_index? I know that PG would have to do less work updating just one index compared to updating them both, but wouldn't searches on foo_id alone become slower? it depends . it depends on the queries you are using, on your workload. a multi-column-index will be large than an index over just one column, therefore you will have more disk-io when you read from such an index. I think it also depends on the average number of rows having the same foo_id. The number of rows referenced by an index entry for the multi_index will always be less than or equal to those for the matching foo_index. Also there will be fewer index entries per block for the multi_index. Which is why the I/O count will be higher; even in the best case, where there is an equal row referenced by the index entries.
Re: Method to pass data between queries in a multi-statement transaction
Or just add pid to table3... > That's an application requirement. So pid cannot be added at will to table3. How much development and maintenance effort are you willing to spend here to gain what is likely to amount to only a bit of efficiency? Many things are possible if you are going to modify the server code but why add grief? > How much development / maintenance effort do you anticipate for implementing this feature? This is something that my application would need so I'm willing to dedicate some time to it. By the way, I figured that the hash table deallocation issue could be resolved (although in an inefficient way) by serializing the data to a string and then copying that value into the hash table during insertion. However the hash table is still visible to all the transactions I suppose and as a result needs to be locked. Just wanted to let you know that I have initialized the hash table within the PostgresMain() method and the hash table is declared as an extern variable, which I anticipate to be accessed by many methods. How difficult is it to make a hash table (or any data structure) private to the current transaction so that I do not have to resort to locking? -SB On Thu, Apr 18, 2019 at 5:34 PM David G. Johnston < david.g.johns...@gmail.com> wrote: > On Thu, Apr 18, 2019 at 9:03 AM Souvik Bhattacherjee > wrote: > >> Thanks Michel. >> >> However this only works if a is an unique attribute in the table that >> would help us to identify tuples that participated in the join. Consider >> the following join: >> >> insert into table3 (id, level, empname, salary) >> (select t0.cid, t0.level, t1.empname, t2.salary from table0 t0, table1 >> t1, table2 t2 where t0.cid = t1.cid and t1.pid = t2.pid); >> >> Now if I want to delete those tuples from table2 that satisfied the join >> condition, I need to execute the join again with additional attributes. >> > > Or just add pid to table3... > >> Also note that based on query plan, i.e. whether table0 and table1 were >> joined first followed by table1 and table2, we have to execute one >> additional join to get the tuples in table2 that satisfied the join >> condition (t1.pid = t2.pid). >> > > ??? > >> Getting that information while the query is executed may not be >> difficult. There are other use cases in my application that require me to >> transfer the data from one query to the next within a transaction. >> > > There may be some that benefit to some degree but its likely that you can > write the application and queries in such a way to avoid a hard requirement. > > Thus, what I'm looking for here is way to store the information and then >> pass that information to the next query efficiently. >> > For example, is it possible to define a struct of my choice, private to >> the current transaction, that would store the data and then pass it around >> to the next query in the transaction without having to materialize that >> struct (or deal with concurrency issues as in the hash table approach >> mentioned earlier) . >> > > How much development and maintenance effort are you willing to spend here > to gain what is likely to amount to only a bit of efficiency? Many things > are possible if you are going to modify the server code but why add grief? > > David J. > >
Re: Multicolumn index for single-column queries?
On 19/04/2019 01:47, Harald Fuchs wrote: Andreas Kretschmer writes: Am 18.04.19 um 08:52 schrieb rihad: Hi. Say there are 2 indexes: "foo_index" btree (foo_id) "multi_index" btree (foo_id, approved, expires_at) foo_id is an integer. Some queries involve all three columns in their WHERE clauses, some involve only foo_id. Would it be ok from general performance standpoint to remove foo_index and rely only on multi_index? I know that PG would have to do less work updating just one index compared to updating them both, but wouldn't searches on foo_id alone become slower? it depends . it depends on the queries you are using, on your workload. a multi-column-index will be large than an index over just one column, therefore you will have more disk-io when you read from such an index. I think it also depends on the average number of rows having the same foo_id. The number of rows referenced by an index entry for the multi_index will always be less than or equal to those for the matching foo_index. Also there will be fewer index entries per block for the multi_index, which is why the I/O count will be higher even in the best case where there is an equal number of rows referenced by each index entry.
Re: Multicolumn index for single-column queries?
On 19/04/2019 14:01, Gavin Flower wrote: [...] Also there will be fewer index entries per block for the multi_index, which is why the I/O count will be higher even in the best case where there is an equal number of rows referenced by each index entry. Not sure why my system had this still in my draft folder! Sorry, for the duplication...