Re: kind of a bag of attributes in a DB . . .

2019-09-08 Thread Chris Travers
On Sat, Sep 7, 2019 at 5:17 PM Albretch Mueller  wrote:

> Say, you get lots of data and their corresponding metadata, which in
> some cases may be undefined or undeclared (left as an empty string).
> Think of youtube json files or the result of the "file" command.
>
> I need to be able to "instantly" search that metadata, and to get some
> metrics out of it; I think DBs are best for such jobs.
>
> I know this is not exactly a kosher way to deal with data that can't
> be represented in a nice tabular form, but I don't think the idea is
> that far off either.
>
> What is the pattern, anti-pattern or whatever relating to such design?
>
> Do you know of such implementations with such data?
>

We store the debug logs as JSONB with some indexing. It works in some
limited cases, but you need to have a good sense of the indexing
possibilities and of how the indexes actually work.
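
For illustration, a minimal sketch of that pattern; the table, column, and
key names below are hypothetical, not from Chris's actual setup:

-- Hypothetical "bag of attributes" table: fixed columns for what you
-- always have, a jsonb column for the open-ended metadata.
CREATE TABLE media_item (
    id       bigserial PRIMARY KEY,
    fetched  timestamptz NOT NULL DEFAULT now(),
    metadata jsonb NOT NULL DEFAULT '{}'
);

-- A GIN index supports containment (@>) and key-existence (?) queries.
CREATE INDEX media_item_metadata_idx ON media_item USING gin (metadata);

-- Uses the GIN index:
SELECT id FROM media_item WHERE metadata @> '{"format": "webm"}';

-- A range filter inside the jsonb does NOT use that index; for this you
-- would index an extracted expression instead:
CREATE INDEX media_item_duration_idx
    ON media_item (((metadata ->> 'duration')::int));
SELECT id FROM media_item WHERE (metadata ->> 'duration')::int > 600;

The default jsonb_ops GIN opclass covers containment and key-existence
operators; anything else (range filters, sorting) usually wants an
expression index like the last one above.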


> lbrtchx
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: How to access Postgres .pgpass file from php?

2019-09-08 Thread rob stone
Hello Howard,

On Sat, 2019-09-07 at 22:17 +, Howard Wells wrote:
> I have my php files in the web root, and the Postgres 10 logon
> credentials are in the php file.  I want to put them outside the web
> root for security, because a malicious robot could easily read the
> credentials. 
> 
> After research, I found the .pgpass file.  That looks like the ideal
> solution, but after even more research, I haven't found how to use
> that file from a php logon script. 
> 
> Here is the section from my php script:
> 
> $dsn = vsprintf('pgsql:host=%s;port=%s;dbname=%s;user=%s;password=%s', [
>     'host' => '000.00.00.00',
>     'port' => '5432',
>     'dbname' => '[dbname]',
>     'user' => '[username]',
>     'password' => '[password]',
> ]);
> 
> Currently I store the real dbname, user and password in the php.  My
> questions are:
> 
> 1. How can I access it from the .pgpass file instead?
> 
> 2. Where is .pgpass located on Apache2 / Ubuntu 18.04?
> 
> Thanks for any help with this. 
> 
> Howard
> 
> 
> 

We use pg_service.conf to hold the access credentials.
It's just pg_connect("service=sandbox"), for example, where 'sandbox' is
the tag for the database you wish to access.
On Debian it is kept under the /etc/php/7.3/apache2 path, where 7.3
represents the PHP version. It needs to be owned by the Apache user,
which defaults to 'www-data', with permissions 0600.
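
For readers who haven't met the service file before, here is a minimal
sketch of what such a pg_service.conf might contain; every value below is
a placeholder, not Robert's actual setup, and the file's location can also
be pointed to explicitly via the PGSERVICEFILE environment variable:

# pg_service.conf -- each [section] names a service you can use as
# pg_connect("service=sandbox"); the keys are ordinary libpq
# connection parameters.
[sandbox]
host=localhost
port=5432
dbname=sandbox_db
user=app_user
password=not_the_real_password

Because the credentials now live outside the web root, the PHP source no
longer needs to embed them.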

HTH,
Robert







Re: Convert a row to a nested JSON document containing all relations in PostgreSQL

2019-09-08 Thread Ali Alizadeh


> 1) What side are you talking about, the parent of the record or the
> children?

the children. only one level of depth is fine.

> 2) What procedural language are you using?

PL/pgSQL

As I understand it, I need to loop over all columns in the "NEW" record, find out 
whether the column is a foreign key using information_schema or pg_catalog, find the 
foreign key details (which column on which table it points to), then perform a dynamic 
SQL SELECT (because I presume table and column names would be strings, not SQL 
identifiers) against the target table for the target record, convert that record to 
JSON, and finally assign it to the appropriate key of the JSON object for the 
top-level row. I'm still trying to write actual working code for this, and I 
welcome any help or directions. There might also be simpler solutions to this 
problem, which I would like to know about.
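
A rough, untested sketch of that approach, assuming single-column foreign
keys, one level of nesting, and hypothetical names throughout:

CREATE OR REPLACE FUNCTION row_to_nested_json() RETURNS trigger AS $$
DECLARE
    fk     record;
    child  jsonb;
    result jsonb;
BEGIN
    result := to_jsonb(NEW);
    -- One iteration per (single-column) foreign key on the firing table.
    FOR fk IN
        SELECT kcu.column_name::text AS col,
               ccu.table_name::text  AS ref_table,
               ccu.column_name::text AS ref_col
        FROM information_schema.table_constraints tc
        JOIN information_schema.key_column_usage kcu
          ON kcu.constraint_name = tc.constraint_name
        JOIN information_schema.constraint_column_usage ccu
          ON ccu.constraint_name = tc.constraint_name
        WHERE tc.table_name = TG_TABLE_NAME
          AND tc.constraint_type = 'FOREIGN KEY'
    LOOP
        -- Fetch the referenced row and convert it to jsonb.
        EXECUTE format('SELECT to_jsonb(t) FROM %I t WHERE t.%I = ($1).%I',
                       fk.ref_table, fk.ref_col, fk.col)
        INTO child
        USING NEW;
        -- Replace the scalar FK value with the embedded child document.
        result := jsonb_set(result, ARRAY[fk.col],
                            coalesce(child, 'null'::jsonb));
    END LOOP;
    RAISE NOTICE '%', result;  -- or NOTIFY, or write to an audit table
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

Multi-column foreign keys would need key_column_usage.ordinal_position and
a more careful join; Adrian Klaver's reply further down this digest shows
the information_schema queries in more detail.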

Also see my question on SO: 
https://stackoverflow.com/questions/57830543/convert-a-row-to-a-nested-json-document-containing-all-relations-in-postgresql



Re: When is it safe to mark a function PARALLEL SAFE?

2019-09-08 Thread Tom Lane
Jim Finnerty  writes:
> According to the documentation:
> "Functions and aggregates must be marked PARALLEL UNSAFE if they write to
> the database, access sequences, change the transaction state even
> temporarily (e.g. a PL/pgSQL function which establishes an EXCEPTION block
> to catch errors), or make persistent changes to settings."

I believe the reason for the EXCEPTION-block restriction is that plpgsql
does that by establishing a subtransaction, and we don't allow
subtransactions in workers.  It seems like that's probably just an
implementation restriction that could be lifted with a little work,
much more easily than the general prohibition on writing-to-the-DB
could be.  (Obviously, the subtransaction would still be restricted
from DB writes.)  The "persistent change to settings" rule is there
not because it would fail, but because it wouldn't be persistent ---
the GUC change would only be visible inside the particular worker.

> If a LANGUAGE C function calls ereport(ERROR, ...), does that qualify as a
> potential change to the transaction state that requires it to be marked
> PARALLEL UNSAFE?

No.  It would certainly be impractical to have a rule that you can't
throw errors in workers.

> If an error is raised in one parallel worker, does this
> cause the other parallel workers to be immediately terminated?

I think not, though I didn't work on that code.  The error will
be reported to the parent backend, which will cause it to fail
the query ... but I think it just waits for the other worker
children to exit first.  That's not something to rely on of course.
Even if we don't make an attempt to cancel the other workers today
we probably will in future.  But the cancel attempt would certainly
be asynchronous, so I'm not sure how "immediate" you are worried
about it being.

> How about a C function f(x) that calls out to an external system and returns
> a text value.  If f(x) is altered on the external system, it might return a
> slightly different answer for some x.  Let's say that for some x it returns
> "one" instead of "1", and we happen to know that users don't care if it
> returns "one" or "1".  If someone were to declare f(x) to be PARALLEL SAFE,
> what's the worst that could happen?

Well, this isn't so much about whether the function is parallel safe
as whether it is marked volatile or not; as you describe it, it would
potentially give time-varying results even in a non-parallel query.
Such a function should be marked volatile to avoid strange behavior,
ie the optimizer making invalid assumptions.

AFAIK "parallel safe" and "non-volatile" are more or less independent
restrictions, though someone might correct me.  A function that writes
to the DB must be considered both volatile and parallel unsafe, but
if it doesn't do that then I think it could have any combination
of these properties.
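
To make the combinations concrete, a hedged sketch (these functions and
the hits table are invented for illustration, not taken from the thread;
note there is also a third marking, PARALLEL RESTRICTED, for functions
that must run in the parallel group leader):

-- Pure computation: no side effects, same output for the same input.
CREATE FUNCTION twice(i int) RETURNS int
    AS 'SELECT i * 2'
    LANGUAGE sql IMMUTABLE PARALLEL SAFE;

-- Volatile yet parallel safe: the result varies per call, but nothing
-- is written and no backend-local state is touched.
CREATE FUNCTION now_text() RETURNS text
    AS 'SELECT clock_timestamp()::text'
    LANGUAGE sql VOLATILE PARALLEL SAFE;

-- Writes to the database: must be both volatile and parallel unsafe.
CREATE FUNCTION log_hit(p text) RETURNS void
    AS 'INSERT INTO hits(page) VALUES (p)'
    LANGUAGE sql VOLATILE PARALLEL UNSAFE;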

regards, tom lane




Re: When is it safe to mark a function PARALLEL SAFE?

2019-09-08 Thread Peter Geoghegan
On Sun, Sep 8, 2019 at 12:27 PM Tom Lane  wrote:
> > If an error is raised in one parallel worker, does this
> > cause the other parallel workers to be immediately terminated?
>
> I think not, though I didn't work on that code.  The error will
> be reported to the parent backend, which will cause it to fail
> the query ... but I think it just waits for the other worker
> children to exit first.  That's not something to rely on of course.
> Even if we don't make an attempt to cancel the other workers today
> we probably will in future.  But the cancel attempt would certainly
> be asynchronous, so I'm not sure how "immediate" you are worried
> about it being.

If workers call CHECK_FOR_INTERRUPTS() frequently, which they should,
then it should appear to users as if raising an error in one worker
kills everything almost immediately, or immediately.  For example, if
a parallel CREATE INDEX has a worker that raises a unique violation
error, that must work in a way that at least *appears* to be very
similar to what the user would get with a serial CREATE INDEX. (The
worst that can happen is that they'll very occasionally get two unique
violation errors instead of one, or something like that.)

That said, there are theoretical failures where it could take rather a
long time for the parent/leader to get the memo -- see
WaitForParallelWorkersToAttach() and its caller (note that anything
using a gather node is subject to the same kind of failure that
WaitForParallelWorkersToAttach() handles, even though they won't call
the function themselves). These failures (e.g. fork() failure) are
generally assumed to be rare to non-existent, though. Users will
surely be upset if parallel queries cannot be cancelled almost
immediately. If it happened with any regularity, somebody would have
complained by now.

As Tom said, it's hard to give a useful answer without more context --
how do you define "immediate"?

-- 
Peter Geoghegan




Re: Convert a row to a nested JSON document containing all relations in PostgreSQL

2019-09-08 Thread Adrian Klaver

On 9/8/19 5:22 AM, Ali Alizadeh wrote:



>> 1) What side are you talking about, the parent of the record or the
>> children?
>
> the children. only one level of depth is fine.
>
>> 2) What procedural language are you using?
>
> PL/pgSQL
>
> As I understand, I need to loop over all columns in the "NEW" record,
> find out if the column is a foreign key using information_schema or
> pg_catalog, find the foreign key


Probably easier to look at whether there are FK's on a table:

select * from information_schema.table_constraints
where table_schema = 'some_schema'
  and table_name = 'some_name'
  and constraint_type = 'FOREIGN KEY';


Then use that to look up the columns (to account for multi-column FK's):

select table_name, column_name
from information_schema.constraint_column_usage
where constraint_name = 'fk_name';


> details like to which column on which table, then perform a dynamic SQL
> SELECT (because I presume table and column names would be strings, not
> SQL identifiers) over the target


Take a look at the format() example here:

https://www.postgresql.org/docs/11/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN

https://www.postgresql.org/docs/11/functions-string.html#FUNCTIONS-STRING-FORMAT
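
A minimal illustration of the format()/EXECUTE combination from those
pages (the schema, table, and column names here are hypothetical):

DO $$
DECLARE
    child jsonb;
BEGIN
    -- %I quote-protects identifiers and %L quote-protects literals, so
    -- neither can break out of the generated statement.
    EXECUTE format('SELECT to_jsonb(t) FROM %I.%I t WHERE t.%I = %L',
                   'public', 'person', 'id', 42)
    INTO child;
    RAISE NOTICE '%', child;
END
$$;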

> table for the target record, convert the record to JSON and finally
> assign it to the appropriate key of the JSON object of the top-level row.
> I'm still trying to write the actual working code for this, for which I
> welcome any help or directions. And there might be simpler solutions to
> this problem, which I would like to know about.
>
> Also see my question on SO:
> https://stackoverflow.com/questions/57830543/convert-a-row-to-a-nested-json-document-containing-all-relations-in-postgresql




--
Adrian Klaver
adrian.kla...@aklaver.com




help formulating a query

2019-09-08 Thread stan
I am having a hard time figuring out how to do this the SQL way. Doing it in a
procedural way with a foreach would be easy, but I suspect that a
properly formulated SQL query can achieve this.

I have a table that contains a series of rows. Each row has a project key,
a start date, and an end date. There are multiple rows, with different start
and end dates, for each project.

I need to return something that contains one row per project key, with the
min(start date) and max(end date) across ALL the records for that given
project.

Any suggestions?


-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin




Re: help formulating a query

2019-09-08 Thread Tom Lane
stan  writes:
> I have a table that contains a series of rows. Each row has a project key,
> a start date, and an end date. There are multiple rows, with different start
> and end dates, for each project.

> I need to return something that contains one row per project key, with the
> min(start date) and max(end date) across ALL the records for that given
> project.

Maybe I'm misunderstanding something, but isn't this just

select project_key, min(start_date), max(end_date)
from my_table
group by project_key

??
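
To see it work, a throwaway example (the table and column names are
guesses at stan's schema):

create temp table my_table (project_key text, start_date date, end_date date);
insert into my_table values
    ('p1', '2019-01-01', '2019-02-01'),
    ('p1', '2019-03-01', '2019-04-15'),
    ('p2', '2019-02-10', '2019-02-20');

select project_key, min(start_date), max(end_date)
from my_table
group by project_key;

--  project_key |    min     |    max
-- -------------+------------+------------
--  p1          | 2019-01-01 | 2019-04-15
--  p2          | 2019-02-10 | 2019-02-20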

regards, tom lane




Re: floating point output

2019-09-08 Thread Rob Sargent



> On Sep 5, 2019, at 7:32 AM, Rob Sargent  wrote:
> 
> 
> 
>> On Sep 4, 2019, at 9:14 PM, Adrian Klaver  wrote:
>> 
>>> On 9/4/19 5:23 PM, Rob Sargent wrote:
>>> I've found the description of floating point types (in the documentation), 
>>> but I'm looking for the rationale of the output format, particularly with 
>>> respect to total digits presented (variable in a single select's output) 
>>> and the dropping of a trailing zero (to some implying a loss of precision). 
>>>  Is the code my only guide here?
>> 
>> Some examples would help explain your concerns.
>> 
>> Bottom line, if you want precision use numeric.
>> 
>> 
>> 
> When at my desk I’ll give example. Actual precision is not the issue. Purely 
> cosmetics.

Below I show a floating point column from a recent query.

Here's my guess at what's happening in the formatting:
  1) scientific notation if smaller than 1.0e-04 (I'm not in the
very-much-larger-than-one world)
  2) 15 digits after the most significant 0
  3) remove trailing zeros

I may be alone in this, but I find the presentation messy, and that's unusual in
the PostgreSQL world. To me the most egregious part is the truncation of the
zeros. Pick a number of digits and stick with it. One format, too.

    1 |  8.81189e-05 |
    2 |  0.000108911 |
    3 |  0.000118812 |
    4 | 0.000128999871000129 |
    5 | 0.000132499966875008 |
    6 | 0.000132999867000133 |
    7 | 0.000136922971597714 |
    8 | 0.000139499930250035 |
    9 | 0.000158809516247166 |
   10 | 0.000170294107629758 |
   11 |  0.000287129 |
   12 | 0.000327999672000328 |
   13 | 0.000628999371000629 |
   14 |  0.000653466 |
   15 | 0.000667999332000668 |
   16 | 0.000681999318000682 |
   17 | 0.000683999316000684 |
   18 | 0.000708999291000709 |
   19 | 0.000722999277000723 |
   20 | 0.000730999269000731 |
   21 |  0.000742575 |
   22 | 0.000753999246000754 |
   23 | 0.000755999244000756 |
   24 | 0.000762999237000763 |
   25 | 0.000765999234000766 |
   26 | 0.000773999226000774 |
   27 | 0.000780999219000781 |
   28 | 0.000785999214000786 |
   29 | 0.000787999212000788 |
   30 | 0.000791999208000792 |
   31 | 0.000793999206000794 |
   32 | 0.000813999186000814 |
   33 | 0.000817999182000818 |
   34 | 0.000826999173000827 |
   35 | 0.000833999166000834 |
   36 | 0.000834999165000835 |
   37 | 0.000838999161000839 |
   38 | 0.000847999152000848 |
   39 | 0.000854999145000855 |
   40 | 0.000855999144000856 |
   41 | 0.000857999142000858 |
   42 |  0.000851486 |
   43 | 0.000864999135000865 |
   44 |  0.000861387 |
   45 | 0.000875999124000876 |
   46 | 0.000882999117000883 |
   47 | 0.000887999112000888 |
   48 | 0.000887999112000888 |
   49 | 0.000904999095000905 |
   50 | 0.000910999089000911 |
   51 | 0.000912999087000913 |
   52 | 0.000912999087000913 |
   53 | 0.000914999085000915 |
   54 | 0.000921999078000922 |
   55 | 0.000922999077000923 |
   56 | 0.000938999061000939 |
   57 |  0.000940595 |
   58 | 0.000967999032000968 |
   59 | 0.000971999028000972 |
   60 | 0.000982999017000983 |
   61 | 0.000995999004000996 |
   62 |0.001000998999001 |
   63 |0.001003998996001 |
   64 |0.001003998996001 |
   65 |  0.00101099898900101 |
   66 |  0.00101199898800101 |
   67 |  0.00101599898400102 |
   68 |  0.00101899898100102 |
   69 |  0.00102199897800102 |
   70 |  0.00102599897400103 |
   71 |  0.00103099896900103 |
   72 |  0.00104999895000105 |
   73 |  0.00106899893100107 |
   74 |  0.00107499892500108 |
   75 |  0.00107599892400108 |
   76 |  0.00108499891500109 |
   77 |  0.00108699891300109 |
   78 |  0.00108799891200109 |
   79 |   0.0011039988960011 |
   80 |  0.00110799889200111 |
   81 |  0.00111099888900111 |
   82 |  0.00111299888700111 |
   83 |  0.00111599888400112 |
   84 |  0.00111699888300112 |
   85 |  0.00111999888000112 |
   86 |  0.00111999888000112 |
   87 |  0.00112099887900112 |
   88 |  0.00116599883400117 |
   89 |  0.00117461448106578 |
   90 |   0.0011788180746529 |
   91 |  0.00118099881900118 |
   92 |   0.0011839994080003 |
   93 |  0.00118827261924794 |
   94 |  0.00118899881100119 |
   95 |   0.0012049987950012 |
   96 |  0.00120622217754733 |
   97 |  0.00120999879000121 |
   98 |  0.00121409079871902 |
   99 |  0.00121481807138018 |
  100 |  0.00122618178102479 |
  101 |  0.00122699877300123 |
  102 |  0.00122899877100123 |
  103 |  0.00123999876000124 |
  104 |  0.00125999874000126 |
  105 |  0.00127766866245761 |
  106 |  0.00128099935950032 |
  107 |  0.00129099870900129 |
  108 |  0.00129199870800129 |
  109 |  0.00129299870700129 |
  110 |  0.00129924967518758 |
  111 |   0.0013019986980013 |
  112 |   0.0013019986980013 |
  113 |  0.00130449934775033 |
  114 |  0.00133099866900133 |
  115 |  0.0013310886731 |
  116 |  0.00135249932375034 |
  117 |  0.001378851 |
  118 |  0.00142799857200143 |
  119 |  0.00143799856200144 |
  120 | 

Re: floating point output

2019-09-08 Thread Tom Lane
Rob Sargent  writes:
> Below I show a floating point column from a recent query.  

> Here's my guess at what's happening in the formatting:
>   1) scientific notation if smaller than 1.0e-04 (I'm not in the 
> very-much-larger-than-one world)
>   2) 15 digits after the most significant 0
>   3) remove trailing zeros

> I may be alone in this but I find the presentation messy and that's
> unusual in the PostgreSQL world.

In released versions of Postgres, float8out basically devolves to
sprintf with "%.15g" format, and the above is what POSIX says that
should do.

As of v12, we'll use the "Ryu" output algorithm, which adjusts
the number of displayed digits to ensure that the value will
re-load exactly by default.  I'm afraid that will probably make
your complaint about variable numbers of digits worse not better.
It's unlikely anybody will entertain an argument to undo it,
though, because (a) Ryu is faster than the old code, and (b) it's
hard to argue against ensuring that output/input preserves values.

You might consider using type "numeric" if you find truncation
of low-order zeroes undesirable.
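
A quick demonstration of the difference, using one of the values from
Rob's listing (output as produced by the pre-v12 %.15g behavior):

-- float8 goes through %.15g, so trailing zeros vanish and the digit
-- count varies with the value:
select 0.000287129000000000::float8;   -- 0.000287129
-- numeric keeps exactly the scale it was written with:
select 0.000287129000000000::numeric;  -- 0.000287129000000000

(Pre-v12, setting extra_float_digits = 3 also forces enough digits for
float8 output to round-trip, at the cost of even messier-looking values.)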

regards, tom lane