Re: Best way to select a random row from a derived table

2018-01-28 Thread Fabien COELHO



I am aware of 2 ways to select a random row from a table:

   1) select * from table_name order by random() limit 1;
   -- terribly inefficient

   2) select * from table_name tablesample system_rows(1) limit 1;
   -- only works on tables, not views or subqueries

Is there an option that is reasonably efficient and can be used on views
and subqueries?


My 0.02€: I'd say this is not possible. In order to choose an item from a 
set randomly, you need to consider somehow the size of the set. Solution 
(2) can be done because the size of the table is known to TABLESAMPLE. 
Solution (1) does it by actually generating the set, hence the cost, so 
that its size is implicitly known as well. I cannot see a way out of this 
conundrum for a general query whose size is not known without 
executing it. I'd like to be proven wrong, though :-)
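
The nearest thing to a workaround just restates the conundrum: materialize 
the derived table once (a PG10 CTE is materialized anyway), count it, then 
jump to a uniformly random offset - a sketch, with some_view standing in 
for an arbitrary view or subquery:

    WITH q AS (
        SELECT * FROM some_view   -- placeholder for any view/subquery
    )
    SELECT *
    FROM q
    OFFSET floor(random() * (SELECT count(*) FROM q))::int
    LIMIT 1;

This still reads the whole set once, but avoids the full sort that 
ORDER BY random() performs.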


--
Fabien.

Re: Best way to select a random row from a derived table

2018-01-28 Thread Condor

On 28-01-2018 08:39, Ryan Murphy wrote:

Hello hackers and postgressers,

I am aware of 2 ways to select a random row from a table:

1) select * from table_name order by random() limit 1;
-- terribly inefficient

2) select * from table_name tablesample system_rows(1) limit 1;
-- only works on tables, not views or subqueries

Is there an option that is reasonably efficient and can be used on
views and subqueries?

Thanks!
Ryan



I do it with:

SELECT * FROM table_name OFFSET random() * (SELECT count(*) FROM table_name) LIMIT 1;

Regards,
HC



Downsides of liberally using CREATE TEMP TABLE ... ON COMMIT DROP

2018-01-28 Thread Ryan Murphy
Hello,

I heard at a PostgreSQL talk that you should not liberally create temp
tables in the course of frequently-used functions etc, because (roughly)
you're using up some of the same resources that you use for your regular tables.

Is this true?  Is there an important reason not to have e.g. a plpgsql
function that uses a temp table?  What are the specific problems if I do
this?  Is the problem ameliorated if I add ON COMMIT DROP?

Best,
Ryan


Re: Downsides of liberally using CREATE TEMP TABLE ... ON COMMIT DROP

2018-01-28 Thread David G. Johnston
On Sunday, January 28, 2018, Ryan Murphy  wrote:

> Hello,
>
> I heard at a PostgreSQL talk that you should not liberally create temp
> tables in the course of frequently-used functions etc, because (roughly)
> you're using up some of the same resources that you use for your regular tables.
>
> Is this true?  Is there an important reason not to have e.g. a plpgsql
> function that uses a temp table?  What are the specific problems if I do
> this?  Is the problem ameliorated if I add ON COMMIT DROP?
>

I believe the main, and maybe only, concern is the bloating of the system
catalog tables since you are constantly adding and removing records.  Yes,
they will be vacuumed, but vacuuming and bloat on catalog tables slows every
single query down to some degree, since every query has to look up its
objects in those catalogs.  Though caching probably alleviates some of that.

The way most temp tables are used on commit drop likely has little impact
on this, but the specific usage pattern matters a great deal in answering
the question.
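
If you want to observe the effect, a rough sketch - the total on-disk
size of the catalogs that temp-table churn touches most is a crude
proxy for the bloat; run it before and after exercising the
temp-table-heavy code and compare:

    SELECT c.relname
          ,pg_size_pretty(pg_total_relation_size(c.oid)) AS size
    FROM pg_class c
    WHERE c.relname IN ('pg_class', 'pg_attribute', 'pg_type', 'pg_depend')
    ORDER BY c.relname;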

David J.


Re: Downsides of liberally using CREATE TEMP TABLE ... ON COMMIT DROP

2018-01-28 Thread Ryan Murphy
>
> I believe the main, and maybe only, concern is the bloating of the system
> catalog tables since you are constantly adding and removing records.  Yes,
> they will be vacuumed, but vacuuming and bloat on catalog tables slows every
> single query down to some degree, since every query has to look up its
> objects in those catalogs.  Though caching probably alleviates some of that
>

Yes, that's exactly the concern I heard, thanks for reminding me.

If I want to e.g. temporarily store a "setof records" or a "table" result
in a variable as part of a calculation in a plpgsql function, do I have any
other option than CREATE TEMPORARY TABLE?  It didn't seem to work when I
DECLAREd a variable of type "setof table_name" or "setof
table_name%rowtype", and then SELECT INTO it.
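
For what it's worth, one alternative sketch (table_name is a
placeholder): plpgsql has no set-valued variables, but an array of the
table's row type can hold a result set in memory:

    DO $$
    DECLARE
        rows table_name[];   -- array of the table's row type
        r    table_name;
    BEGIN
        -- aggregate the whole result set into the array variable
        SELECT array_agg(t) INTO rows FROM table_name t;

        FOREACH r IN ARRAY rows LOOP
            RAISE NOTICE 'row: %', r;
        END LOOP;
    END
    $$;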


Re: Downsides of liberally using CREATE TEMP TABLE ... ON COMMIT DROP

2018-01-28 Thread Andy Colson

On 01/28/2018 08:46 AM, Ryan Murphy wrote:

I believe the main, and maybe only, concern is the bloating of the system 
catalog tables since you are constantly adding and removing records.  Yes, they 
will be vacuumed, but vacuuming and bloat on catalog tables slows every single 
query down to some degree, since every query has to look up its objects in those 
catalogs.  Though caching probably alleviates some of that


Yes, that's exactly the concern I heard, thanks for reminding me.

If I want to e.g. temporarily store a "setof records" or a "table" result in a variable as part of 
a calculation in a plpgsql function, do I have any other option than CREATE TEMPORARY TABLE?  It didn't seem to work 
when I DECLAREd a variable of type "setof table_name" or "setof table_name%rowtype", and then 
SELECT INTO it.



You may not need temp tables at all.  You can use subselects, derived tables, 
and cte's:

select sum(a+b) as total
from (
   select a, b+1 as b
   from detail
) as tmpx;


This does the same thing as a temp table, with no temp table.
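
The CTE spelling of the same query, for reference:

    with tmpx as (
       select a, b+1 as b
       from detail
    )
    select sum(a+b) as total
    from tmpx;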

-Andy



Re: Downsides of liberally using CREATE TEMP TABLE ... ON COMMIT DROP

2018-01-28 Thread Rob Sargent
“On commit drop” is generally good practice for several reasons: if your 
function is called in a loop, or the create statement is in a loop, or you want 
to call your function more than once in a given session (without explicitly 
dropping your temp table).
The commit in question is the function btw.
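
A minimal sketch of the ON COMMIT DROP semantics in an explicit
transaction block (the table lives until the transaction ends):

    BEGIN;
    CREATE TEMP TABLE scratch (id int) ON COMMIT DROP;
    INSERT INTO scratch VALUES (1), (2);
    SELECT count(*) FROM scratch;   -- returns 2
    COMMIT;                         -- scratch is dropped here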

> On Jan 28, 2018, at 8:53 AM, Andy Colson  wrote:
> 
>> On 01/28/2018 08:46 AM, Ryan Murphy wrote:
>>I believe the main, and maybe only, concern is the bloating of the system 
>> catalog tables since you are constantly adding and removing records.  Yes, 
>> they will be vacuumed, but vacuuming and bloat on catalog tables slows every 
>> single query down to some degree, since every query has to look up its 
>> objects in those catalogs.  Though caching probably alleviates some of that
>> Yes, that's exactly the concern I heard, thanks for reminding me.
>> If I want to e.g. temporarily store a "setof records" or a "table" result in 
>> a variable as part of a calculation in a plpgsql function, do I have any 
>> other option than CREATE TEMPORARY TABLE?  It didn't seem to work when I 
>> DECLAREd a variable of type "setof table_name" or "setof 
>> table_name%rowtype", and then SELECT INTO it.
> 
> You may not need temp tables at all.  You can use subselects, derived tables, 
> and cte's:
> 
> select sum(a+b) as total
> from (
>   select a, b+1 as b
>   from detail
> ) as tmpx;
> 
> 
> This does the same thing as a temp table, with no temp table.
> 
> -Andy
> 



Re: PostgreSQL Restore Database Without Backup and Log

2018-01-28 Thread Adrian Klaver

On 01/26/2018 05:58 AM, fabio.silva wrote:

Hi!
I had a ransomware attack and I lost a lot of files from my server.
I just have the OID folder and the files inside it.
Is it possible to restore the database using only these data files?


I am going to say no, too much information is missing to reconstruct the 
cluster.



I don't have the data folder, I don't have the global folder, I just have
the data files from my old database.
I'm using PostgreSQL 9.4 and Windows Server 2012R2.
Thanks!








--
Adrian Klaver
adrian.kla...@aklaver.com



Re: PostgreSQL Restore Database Without Backup and Log

2018-01-28 Thread Tom Lane
Adrian Klaver  writes:
> On 01/26/2018 05:58 AM, fabio.silva wrote:
>> I had a ransomware attack and I lost a lot of files from my server.
>> I just have the OID folder and the files inside it.
>> Is it possible to restore the database using only these data files?

> I am going to say no, too much information is missing to reconstruct the 
> cluster.

There are various PG support companies that specialize in data recovery
after filesystem failures, and it's possible that one of them could help
in getting at least some data back from this situation.  It likely won't
be cheap though.  There's definitely no easy, certain-to-work answer.

regards, tom lane



pg_trgm Extension Installed but Operators and Functions do not exist

2018-01-28 Thread Igal @ Lucee.org
I am trying to run a query which was working a few weeks ago, and it 
utilizes the <<-> operator but I am getting an error: operator does not 
exist text <<-> text.


If I try instead the function word_similarity() I get a similar (no pun 
intended) error: function word_similarity(text, text) does not exist.


If I try to install the pg_trgm extension with `CREATE EXTENSION 
pg_trgm;` I get an error: extension "pg_trgm" already exists.


Running PostgreSQL 10.1, compiled by Visual C++ build 1800, 64-bit

Any ideas?  Thanks,

Igal Sapir
Lucee Core Developer
Lucee.org 



Re: pg_trgm Extension Installed but Operators and Functions do not exist

2018-01-28 Thread Tom Lane
"Igal @ Lucee.org"  writes:
> I am trying to run a query which was working a few weeks ago, and it 
> utilizes the <<-> operator but I am getting an error: operator does not 
> exist text <<-> text.

> If I try instead the function word_similarity() I get a similar (no pun 
> intended) error: function word_similarity(text, text) does not exist.

> If I try to install the pg_trgm extension with `CREATE EXTENSION 
> pg_trgm;` I get an error: extension "pg_trgm" already exists.

Seems like a likely bet is that the extension is installed in a schema
that isn't in your search_path.

regards, tom lane



Re: pg_trgm Extension Installed but Operators and Functions do not exist

2018-01-28 Thread Igal @ Lucee.org

On 1/28/2018 12:35 PM, Tom Lane wrote:

"Igal @ Lucee.org"  writes:

I am trying to run a query which was working a few weeks ago, and it
utilizes the <<-> operator but I am getting an error: operator does not
exist text <<-> text.

If I try to install the pg_trgm extension with `CREATE EXTENSION
pg_trgm;` I get an error: extension "pg_trgm" already exists.

Seems like a likely bet is that the extension is installed in a schema
that isn't in your search_path.


Thank you, Tom.  I was unaware of the fact that extensions are installed 
to specific schemas.


For future users who might face this issue, I fixed it by re-creating 
the extension in the pg_catalog schema, which is always in the 
search_path [1]:


  DROP EXTENSION pg_trgm;

  CREATE EXTENSION pg_trgm SCHEMA pg_catalog;
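
To check where an extension currently lives, the catalogs can be
queried directly (and since pg_trgm is, as far as I know, relocatable,
ALTER EXTENSION pg_trgm SET SCHEMA pg_catalog should also work without
the drop/create):

    SELECT e.extname
          ,n.nspname AS schema
    FROM pg_extension e
    JOIN pg_namespace n ON n.oid = e.extnamespace
    WHERE e.extname = 'pg_trgm';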

Thanks again,

Igal

[1] 
https://www.postgresql.org/docs/current/static/runtime-config-client.html#GUC-SEARCH-PATH






Re: Downsides of liberally using CREATE TEMP TABLE ... ON COMMIT DROP

2018-01-28 Thread David G. Johnston
On Sunday, January 28, 2018, Rob Sargent  wrote:
>
> The commit in question is the function btw.
>

That doesn't make sense - functions can't commit.

David J.


Re: FW: Setting up streaming replication problems

2018-01-28 Thread Thiemo Kellner
Me again. Hope you won't feel too bothered by me. I just summarise so far 
and am still in dire need of guidance.

Debian 9 with PostgreSQL 10. I have two nodes in the cluster, which I use 
as master and as standby. I tried to set up replication with Riggs' book, 
the official documentation, and a couple of web pages.

I am aware that there is a danger of deadlock with synchronous replication 
with only two hosts, and that there is no point in having a replica on the 
same metal as the master - but I am trying to figure out how to set it up 
this way anyway, unless replication within a single cluster does not work 
at all.

I am not sure whether to put the md5 value of the repuser password into
primary_conninfo or the plain one. I don't feel the documentation or the 
book is clear on that. I think I have tried both ways, to no avail.

I could not find a hint in the logs that the standby tried to connect to 
the master.


Find below my configs

Cheers Thiemo

== Hot standby ==

/etc/postgresql/10/main2/pg_hba.conf
host    replication    all        127.0.0.1/32   md5
host    replication    all        ::1/128        md5
local   replication    repuser                   peer
host    replication    repuser    0.0.0.1/0      md5
host    replication    repuser    ::1/0          md5

/etc/postgresql/10/main2/postgresql.conf
wal_level = replica
max_replication_slots = 12
synchronous_standby_names = 'main,main2'
hot_standby = on
log_min_messages = debug1
log_connections = on
log_statement = 'ddl'
log_replication_commands = on
lc_messages = 'C.UTF-8'

/etc/postgresql/10/main2/recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=localhost user=repuser port=5432 password=<md5 value or plain text?>'

== master ==
/etc/postgresql/10/main/pg_hba.conf
host    replication    all        127.0.0.1/32   md5
host    replication    all        ::1/128        md5
local   replication    repuser                   peer
host    replication    repuser    0.0.0.1/0      md5
host    replication    repuser    ::1/0          md5

/etc/postgresql/10/main/postgresql.conf
wal_level = replica
archive_mode = off
max_wal_senders = 12
max_replication_slots = 12
synchronous_standby_names = 'main2,main'
hot_standby = on
wal_receiver_timeout = 60s
log_min_messages = debug1
log_connections = on
log_statement = 'ddl'
log_replication_commands = on
lc_messages = 'C.UTF-8'

/etc/postgresql/10/main/recovery.conf
standby_mode = 'off'
primary_conninfo = 'host=localhost user=repuser port=5433 password=<md5 value or plain text?>'

-- Öffentlicher PGP-Schlüssel:
http://pgp.mit.edu/pks/lookup?op=get&search=0xCA167FB0E717AFFC

Re: FW: Setting up streaming replication problems

2018-01-28 Thread Martin Goodson

On 28/01/2018 22:24, Thiemo Kellner wrote:

Me again. Hope you won't feel too bothered by me. I just summarise so
far and am still in dire need of guidance.



I am not sure whether to put the md5 value of the repuser password
into primary_conninfo or the plain one. I don't feel the
documentation or the book is clear on that. I think I have tried
both ways, to no avail.


The documentation seems pretty clear on the subject, if you look at the
standby server settings documentation and look at primary_conninfo it says:

"A password needs to be provided too, if the primary demands password
authentication. It can be provided in the primary_conninfo string, or in
a separate ~/.pgpass file on the standby server (use replication as the
database name)"

That's not an md5 version of the password, that's the actual clear-text
password. That's why putting the password into the password file instead
of the recovery file (or specifying trust, if you actually do trust the
connection) might not be such a bad idea, really.
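
For reference, a ~/.pgpass line on the standby would look roughly like
this (the file must be mode 0600, and per the docs the database field
is literally "replication" for a replication connection):

    # hostname:port:database:username:password
    localhost:5432:replication:repuser:the_plain_text_password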

But are you sure the password is the issue? What messages are you seeing
in your logs on both the master and the standby? Could you post the logs
here, or at least a representative sample? What are you seeing in the
standby logs when you start it? What are you seeing in the master's logs
at the same time?

M.

--
Martin Goodson

"Have you thought up some clever plan, Doctor?"
"Yes, Jamie, I believe I have."
"What're you going to do?"
"Bung a rock at it."



How to Optimize pg_trgm Performance

2018-01-28 Thread Igal @ Lucee.org
I want to use pg_trgm for auto-suggest functionality.  I created a 
Materialized View with the information that I need, with the relevant 
columns being (keywords text, rank int).  keywords is the column from 
which I build the tri-grams, and rank is some popularity factor so that 
popular results will show up higher than less popular results given the 
same tri-gram distance.


I want to return results in the order of [distance], [distance_word], 
[rank].  The input comes from the user and is not known in advance.  My 
query is as follows:


    SELECT title
    ,id
    ,(input <-> keywords) AS distance
        ,(input <<-> keywords) AS distance_word
    ,rank
    FROM  (VALUES (cast('red pill' AS text))) consts(input)
    ,mv_autosuggest
    ORDER BY 3, 4, 5
    LIMIT 20;

This gives me pretty good results, but it takes too long and is not 
likely to scale well.


I have created two indexes but neither seem to be used:

CREATE INDEX mv_autosuggest_keywords_tgrm_gist ON staging.mv_autosuggest 
USING gist (keywords gist_trgm_ops);


CREATE INDEX mv_autosuggest_keywords_tgrm_gin ON staging.mv_autosuggest 
USING gin (keywords gin_trgm_ops);


This is the result of explain analyze:

QUERY PLAN
------------------------------------------------------------------------------------------------
Limit  (cost=356.41..356.46 rows=20 width=51) (actual time=163.132..163.135 rows=20 loops=1)
  ->  Sort  (cost=356.41..372.96 rows=6619 width=51) (actual time=163.130..163.131 rows=20 loops=1)
        Sort Key: (('red pill'::text <-> mv_autosuggest.keywords)), (('red pill'::text <<-> mv_autosuggest.keywords)), mv_autosuggest.rank
        Sort Method: top-N heapsort  Memory: 28kB
        ->  Seq Scan on mv_autosuggest  (cost=0.00..180.29 rows=6619 width=51) (actual time=0.263..161.289 rows=6619 loops=1)
Planning time: 0.139 ms
Execution time: 163.174 ms

How can I improve the performance here?
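
One idea I have not verified: since pg_trgm's GiST opclass supports
<-> as an ordering operator, ordering directly by the distance against
a constant should allow a KNN-style index scan; the remaining sort keys
can then be applied to the small limited set in an outer query. A
sketch:

    SELECT title, id, distance
    FROM (
        -- ORDER BY <indexed column> <-> <constant> can use the gist index
        SELECT title, id
              ,keywords <-> 'red pill' AS distance
              ,keywords <<-> 'red pill' AS distance_word
              ,rank
        FROM staging.mv_autosuggest
        ORDER BY keywords <-> 'red pill'
        LIMIT 100   -- overfetch, then re-rank below
    ) t
    ORDER BY distance, distance_word, rank
    LIMIT 20;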

Thank you,

Igal Sapir
Lucee Core Developer
Lucee.org 



Re: [GENERAL] pgpass file type restrictions

2018-01-28 Thread Bruce Momjian
On Thu, Oct 19, 2017 at 08:22:18AM -0400, Andrew Dunstan wrote:
> 
> 
> On 10/19/2017 02:12 AM, Tom Lane wrote:
> > Desidero  writes:
> >> I’m running into problems with the restriction on pgpass file types. When
> >> attempting to use something like an anonymous pipe for a passfile, psql
> >> throws an error stating that it only accepts plain files.
> >> ...
> >> Does anyone know why it’s set up to avoid using things like anonymous pipes
> >> (or anything but "plain files")?
> > A bit of digging in the git history says that the check was added here:
> >
> > commit 453d74b99c9ba6e5e75d214b0d7bec13553ded89
> > Author: Bruce Momjian 
> > Date:   Fri Jun 10 03:02:30 2005 +
> > 
> > Add the "PGPASSFILE" environment variable to specify to the password
> > file.
> > 
> > Andrew Dunstan
> > 
> > and poking around in the mailing list archives from that time finds
> > what seems to be the originating thread:
> >
> > https://www.postgresql.org/message-id/flat/4123BF8C.5000909%40pse-consulting.de
> >
> > There's no real discussion there of the check for plain-file-ness.
> > My first guess would have been that the idea was to guard against
> > symlink attacks; but then surely the stat call needed to have been
> > changed to lstat?  So I'm not quite sure of the reasoning.  Perhaps
> > Andrew remembers.
> 
> 
> 
> That was written 13 years ago. I'm afraid my memory isn't that good.

I am coming in late here, but the thread does say:


https://www.postgresql.org/message-id/200506100302.j5A32aj12016%40candle.pha.pa.us

Another new addition is that we now will check to see that the password
file is a regular file and not a symlink or something.  This was part of
your patch for PGPASSFILE but I extended it to ~/.pgpass too.

Seems the stat, and not lstat, usage is a bug.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: FW: Setting up streaming replication problems

2018-01-28 Thread Thiemo Kellner

Thanks for your patience.

On 01/29/18 00:11, Martin Goodson wrote:

"A password needs to be provided too, if the primary demands password
authentication. It can be provided in the primary_conninfo string, or in
a separate ~/.pgpass file on the standby server (use replication as the
database name)"


I tried plain text.


But are you sure the password is the issue? What messages are you seeing
in your logs on both the master and the standby? Could you post the logs
here, or at least a representative sample? What are you seeing in the
standby logs when you start it? What are you seeing in the master's logs
at the same time?


No, I am not.


== Master log of start gives me ==

2018-01-29 05:55:39.996 CET [1307] DEBUG:  registering background worker "logical replication launcher"
2018-01-29 05:55:39.996 CET [1307] LOG:  listening on IPv6 address "::1", port 5432
2018-01-29 05:55:39.996 CET [1307] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2018-01-29 05:55:40.027 CET [1307] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2018-01-29 05:55:40.061 CET [1307] DEBUG:  mmap(148897792) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
2018-01-29 05:55:40.153 CET [1308] LOG:  database system was shut down at 2018-01-29 05:50:37 CET

2018-01-29 05:55:40.154 CET [1308] DEBUG:  checkpoint record is at 0/1649758
2018-01-29 05:55:40.155 CET [1308] DEBUG:  redo record is at 0/1649758; shutdown TRUE
2018-01-29 05:55:40.155 CET [1308] DEBUG:  next transaction ID: 0:583; next OID: 16398
2018-01-29 05:55:40.156 CET [1308] DEBUG:  next MultiXactId: 1; next MultiXactOffset: 0
2018-01-29 05:55:40.156 CET [1308] DEBUG:  oldest unfrozen transaction ID: 548, in database 1
2018-01-29 05:55:40.156 CET [1308] DEBUG:  oldest MultiXactId: 1, in database 1
2018-01-29 05:55:40.156 CET [1308] DEBUG:  commit timestamp Xid oldest/newest: 0/0
2018-01-29 05:55:40.156 CET [1308] DEBUG:  transaction ID wrap limit is 2147484195, limited by database with OID 1
2018-01-29 05:55:40.156 CET [1308] DEBUG:  MultiXactId wrap limit is 2147483648, limited by database with OID 1

2018-01-29 05:55:40.156 CET [1308] DEBUG:  starting up replication slots
2018-01-29 05:55:40.158 CET [1308] DEBUG:  MultiXactId wrap limit is 2147483648, limited by database with OID 1
2018-01-29 05:55:40.158 CET [1308] DEBUG:  MultiXact member stop limit is now 4294914944 based on MultiXact 1

2018-01-29 05:55:40.212 CET [1312] DEBUG:  autovacuum launcher started
2018-01-29 05:55:40.213 CET [1307] DEBUG:  starting background worker process "logical replication launcher"
2018-01-29 05:55:40.216 CET [1307] LOG:  database system is ready to accept connections
2018-01-29 05:55:40.217 CET [1314] DEBUG:  logical replication launcher started
2018-01-29 05:55:40.822 CET [1315] [unknown]@[unknown] LOG:  connection received: host=[local]
2018-01-29 05:55:40.822 CET [1315] [unknown]@[unknown] LOG:  incomplete startup packet
2018-01-29 05:55:41.344 CET [1318] [unknown]@[unknown] LOG:  connection received: host=[local]
2018-01-29 05:55:41.349 CET [1318] postgres@postgres LOG:  connection authorized: user=postgres database=postgres
2018-01-29 05:55:41.900 CET [1321] [unknown]@[unknown] LOG:  connection received: host=[local]
2018-01-29 05:55:41.905 CET [1321] postgres@postgres LOG:  connection authorized: user=postgres database=postgres
2018-01-29 05:55:42.440 CET [1324] [unknown]@[unknown] LOG:  connection received: host=[local]
2018-01-29 05:55:42.447 CET [1324] postgres@postgres LOG:  connection authorized: user=postgres database=postgres


== Standby log gives me ==

2018-01-28 05:55:32.703 CET [5214] DEBUG:  autovacuum: processing database "postgres"
2018-01-29 05:55:47.724 CET [1333] DEBUG:  registering background worker "logical replication launcher"
2018-01-29 05:55:47.724 CET [1333] LOG:  listening on IPv6 address "::1", port 5433
2018-01-29 05:55:47.725 CET [1333] LOG:  listening on IPv4 address "127.0.0.1", port 5433
2018-01-29 05:55:47.759 CET [1333] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5433"
2018-01-29 05:55:47.793 CET [1333] DEBUG:  mmap(148897792) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
2018-01-29 05:55:47.887 CET [1334] LOG:  database system was shut down at 2018-01-29 05:50:37 CET

2018-01-29 05:55:47.887 CET [1334] DEBUG:  checkpoint record is at 0/1636408
2018-01-29 05:55:47.889 CET [1334] DEBUG:  redo record is at 0/1636408; shutdown TRUE
2018-01-29 05:55:47.889 CET [1334] DEBUG:  next transaction ID: 0:556; next OID: 16385
2018-01-29 05:55:47.889 CET [1334] DEBUG:  next MultiXactId: 1; next MultiXactOffset: 0
2018-01-29 05:55:47.889 CET [1334] DEBUG:  oldest unfrozen transaction ID: 548, in database 1
2018-01-29 05:55:47.889 CET [1334] DEBUG:  oldest MultiXactId: 1, in database 1
2018-01-29 05:55:47.889 CET [1334] DEBUG:  commit timestamp Xid oldest/newest: 0/0
2018-01-29 05:55:47.889 CET [1334] DEBUG:  transaction ID wrap limit is

Consider Spaces in pg_trgm for Better Similarity

2018-01-28 Thread Igal @ Lucee.org
Is there a way to consider white space in tri-grams?  That would allow 
for better matches of phrases.


For example, currently "one two three" and "three two one" would 
generate the same tri-grams ({"  o","  t"," on"," th"," tw","ee ","hre",
"ne ","one","ree","thr","two","wo "}), and the distance of "one two four" 
will be the same for both of them.  The query:


SELECT    phrase
         ,input
         ,similarity(t1.phrase, t2.input)
         ,word_similarity(t1.phrase, t2.input)
FROM      (values('one two three'),('three two one')) t1(phrase)
         ,(values('one two four')) t2(input);

Returns:

    phrase     |    input     | similarity | word_similarity
---------------+--------------+------------+-----------------
 one two three | one two four |        0.8 |     0.615384638
 three two one | one two four |        0.8 |     0.615384638

But surely "one two four" is more similar to "one two three" than to 
"three two one".


Any thoughts?

Igal Sapir
Lucee Core Developer
Lucee.org 



Re: Downsides of liberally using CREATE TEMP TABLE ... ON COMMIT DROP

2018-01-28 Thread Rob Sargent
I think the table is dropped by the end of the function.

> On Jan 28, 2018, at 2:45 PM, David G. Johnston  
> wrote:
> 
>> On Sunday, January 28, 2018, Rob Sargent  wrote:
>> The commit in question is the function btw.
> 
> That doesn't make sense - functions can't commit.
> 
> David J.