Re: Hungarian collation in English Windows server

2018-08-24 Thread Durumdara
Dear Laurenz!

> You have to install the collation on the Windows machine, not in
> PostgreSQL.

We could add languages. As I remember, my colleague told me that the
Hungarian language pack is installed on that machine.
But maybe that is a different thing.

What do we need to do for PostgreSQL to see that collation? Would you or
another member describe the steps, please?

Thank you very much!

dd


How to split an array into columns

2018-08-24 Thread a
Say I have a float8 array:

id| data
--|---
a | {1,2}
b | {2,4}


Is there a query that could make it look like this?


id| data[1] | data[2]
--|--|---
a |  1  | 2
b |  2  | 4



Since I will have around 200,000 rows, I would prefer something with
enough capacity to carry out calculations such as sum().


Thank you so much!

Re: How to split an array into columns

2018-08-24 Thread Thomas Kellerer
On 24.08.2018 at 11:01, a wrote:
> Say I have a float8 array:
> 
> id| data
> --|---
> a | {1,2}
> b | {2,4}
> 
> Is there a query that could make it look like this?
> 
> id| data[1] | data[2]
> --|--|---
> a |  1  | 2
> b |  2  | 4
> 
> Since I will have around 200,000 rows, I would prefer something with
> enough capacity to carry out calculations such as sum().

Maybe I am missing something, but: 

   select id, data[1], data[2]
   from the_table;

will work just fine. 
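
And if the goal is the sum() mentioned in the question, the same
subscript syntax works inside aggregates as well. A minimal sketch (the
table name the_table is hypothetical):

   select sum(data[1]) as sum_1,
          sum(data[2]) as sum_2
   from the_table;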




Arrays, casting and "constrained" data types

2018-08-24 Thread Thomas Kellerer
I stumbled across the following:

Consider the following (simplified) table:

  create table test
  (
    val     numeric(20,0),
    ref_val numeric(20,0)
  );

and the following very simple recursive CTE:

with recursive tree as (
  select val, array[val] as path
  from test
  union all
  select child.val, parent.path || child.val
  from test child
    join tree parent on parent.val = child.ref_val
)
select *
from tree;

The above fails with: recursive query "tree" column 2 has type numeric(20,0)[] 
in non-recursive term but type numeric[] overall

However, when casting the array in the non-recursive part, it still doesn't 
work:

with recursive tree as (
  select val, array[val]::numeric[] as path
  from test
  union all
  select child.val, parent.path || child.val
  from test child
    join tree parent on parent.val = child.ref_val
)
select *
from tree;

This fails with the same error as before. Neither does array[val::numeric] work.

However, appending the column to an empty array works: 

with recursive tree as (
  select val, array[]::numeric[] || val as path
  from test
  union all
  select child.val, parent.path || child.val
  from test child
    join tree parent on parent.val = child.ref_val
)
select *
from tree;


My question is: why isn't "array[val]::numeric[]" enough to create a numeric[] 
array in the non-recursive part? 

I have seen the same problem with "varchar(x)".

Thomas


Re: pg_sample

2018-08-24 Thread Naveen Dabas
Sir, I have obtained pg_sample.
Now I want to run pg_sample with credentials, but I'm getting this error:

Can't locate DBI.pm in @INC (@INC contains: /usr/local/lib64/perl5
/usr/local/share/perl5 /usr/lib64/perl5/vendor_perl
/usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at
./pg_sample line 192.
BEGIN failed--compilation aborted at ./pg_sample line 192.


Can you help me with this?

Thanks

On Thu, Aug 23, 2018 at 8:22 PM, Adrian Klaver 
wrote:

> On 08/23/2018 07:39 AM, Naveen Dabas wrote:
>
>> Sir, from where should I install it?
>> I tried but I didn't find a separate link for pg_sample.
>> Can you help me with this?
>>
>
>
> Something similar:
>
> https://github.com/18F/rdbms-subsetter
>
> It is Python based and can be pip installed.
>
>
>>
>> thanks
>>
>>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>



-- 
With Regards
Naveen Dabas
Ph. 9017298370

-- 
*Important Disclaimer:* Information contained in this email is for the 
recipient primarily addressed to. If you are not the primary recipient or 
are not supposed to receive this email, you are advised to kindly delete 
the email or the thread and notify of the error. The logo is a registered 
and copyrighted property of *ACTAS TECHNOLOGIES PRIVATE LIMITED*. Do not 
use it without authorization.


Re: pg_sample

2018-08-24 Thread Ravi Krishna
> 
> Sir, I have obtained pg_sample.
> Now I want to run pg_sample with credentials, but I'm getting this error:
> 
> Can't locate DBI.pm in @INC (@INC contains: /usr/local/lib64/perl5 
> /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl 
> /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at 
> ./pg_sample line 192.
> BEGIN failed--compilation aborted at ./pg_sample line 192.

As is clear from the message, you need to install Perl DBI/DBD first.




Re: pg_sample

2018-08-24 Thread Abhinav Mehta
Solution: execute this in your Linux terminal:

$ perl -MCPAN -e 'install Bundle::DBI'
$ perl -MCPAN -e 'install DBD::Pg'

> On 24-Aug-2018, at 6:13 PM, Ravi Krishna  wrote:
> 
>> 
>> Sir, I have obtained pg_sample.
>> Now I want to run pg_sample with credentials, but I'm getting this error:
>> 
>> Can't locate DBI.pm in @INC (@INC contains: /usr/local/lib64/perl5 
>> /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl 
>> /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at 
>> ./pg_sample line 192.
>> BEGIN failed--compilation aborted at ./pg_sample line 192.
> 
> As is clear from the message, you need to install Perl DBI/DBD first.
> 
> 




PG-Sample | Error | Equality operator for type json

2018-08-24 Thread Abhinav Mehta
Hi Team,

I’m running into the following issue while executing pg_sample:

Error

DBD::Pg::db do failed: ERROR:  could not identify an equality operator for type json
LINE 3:    SELECT DISTINCT t1.*
           ^ at ./pg_sample line 296.
main::__ANON__("DBD::Pg::db do failed: ERROR:  could not identify an equality"..., DBI::db=HASH(0x1704258), undef) called at ./pg_sample line 638
/pg_sample$ ./pg_sample mydb | psql sampledb


Executing Command

pg_sample mydb | psql sampledb

—
Thanks
Abhinav


Re: PG-Sample | Error | Equality operator for type json

2018-08-24 Thread Abhinav Mehta
BTW - I’ve tried figuring this out myself and understand that a UNION
workaround could help, but since the query is generated by pg_sample, I just
want to know if there is a quick workaround that doesn't involve tweaking
pg_sample itself.
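
For context, the error comes from DISTINCT needing an equality operator,
which json does not define but jsonb does. A minimal sketch outside of
pg_sample (the table t and its json column doc are hypothetical):

   -- fails: could not identify an equality operator for type json
   SELECT DISTINCT doc FROM t;

   -- works: jsonb defines equality, so DISTINCT (and UNION) are allowed
   SELECT DISTINCT doc::jsonb FROM t;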

> On 24-Aug-2018, at 6:20 PM, Abhinav Mehta  wrote:
> [...]



Re: How to split an array into columns

2018-08-24 Thread Tom Lane
Thomas Kellerer  writes:
> Maybe I am missing something, but: 
>select id, data[1], data[2]
>from the_table;
> will work just fine. 

If the arrays are of varying length, unnest() might be what the OP
is looking for.
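
For instance, a sketch that sums each element position across rows of
varying length (reusing the hypothetical table name the_table from
upthread):

   select ord, sum(elem)
   from the_table,
        unnest(data) with ordinality as u(elem, ord)
   group by ord
   order by ord;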

regards, tom lane



Re: Hungarian collation in English Windows server

2018-08-24 Thread Tom Lane
Durumdara  writes:
> Dear Laurenz!
>> You have to install the collation on the Windows machine, not in
>> PostgreSQL.

> We could add languages. As I remember, my colleague told me that the
> Hungarian language pack is installed on that machine.
> But maybe that is a different thing.
> What do we need to do for PostgreSQL to see that collation? Would you or
> another member describe the steps, please?

Did you add the language pack after PG was already installed?  If so,
you could re-initdb, or if that's inconvenient maybe you could run
pg_import_system_collations().
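
A minimal sketch of the latter, run as superuser in each database that
needs the collations (the 'hu%' pattern used for checking is an
assumption about how the Hungarian collation names appear):

   -- pick up any collations the OS provides that PG doesn't yet know about
   SELECT pg_import_system_collations('pg_catalog');

   -- verify that Hungarian collations are now visible
   SELECT collname FROM pg_collation WHERE collname ILIKE 'hu%';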

regards, tom lane



Re: How to split an array into columns

2018-08-24 Thread Paul A Jungwirth
On Fri, Aug 24, 2018 at 2:01 AM, a <372660...@qq.com> wrote:
>
> Say I have a float8 array:
>
> id| data
> --|---
> a | {1,2}
> b | {2,4}
>
> . . .
>
> Since I will have around 200,000 rows, I would prefer something with
> enough capacity to carry out calculations such as sum().

Is this something that would help you?:

https://github.com/pjungwir/aggs_for_vecs

(also on pgxn: https://pgxn.org/dist/aggs_for_vecs)

It would let you say `SELECT vec_to_sum(data)` and get `{3,6}`.

Paul



unorthodox use of PG for a customer

2018-08-24 Thread David Gauthier
Hi Everyone:

I'm going to throw this internal customer request out for ideas, even
though I think it's a bit crazy.  I'm on the brink of telling him it's
impractical and/or inadvisable.  But maybe someone has a solution.

He's writing a script/program that runs on a workstation and needs to write
data to a DB.  This process also sends work to a batch system on a server
farm external to the workstation that will create multiple, parallel
jobs/processes that also have to write to the DB. The workstation
may have many of these jobs running at the same time.  And there are 58
workstations, which all have/use locally mounted disks for this work.

At first blush, this is easy.  Just create a DB on a server and have all
those clients work with it.  But he's also adamant about having the DB on
the same server(s) that ran the script AND on the locally mounted disk.  He
said he doesn't want the overhead, dependencies and worries of anything
like an external DB with a DBA, etc.  He also wants this to be fast.

My first thought was SQLite.  Apparently, they now have some sort of
multiple, concurrent write ability.  But there's no way those batch jobs on
remote machines are going to be able to get at the locally mounted disk on
the workstation. So I dismissed that idea.  Then I thought about having 58
PG installs, one per workstation, each serving all the jobs pertaining to
that workstation.  That could work.  But 58 DB instances?  If he didn't
like the idea of one DBA, 58 can't be good.  Still, the DB would be on the
workstation which seems to be what he wants.

I can't think of anything better.  Does anyone have any ideas?

Thanks in Advance !


Re: unorthodox use of PG for a customer

2018-08-24 Thread James Keener
> he doesn't want the overhead, dependencies and worries of anything like
an external DB with a DBA, etc... . He also wants this to be fast.

So they're trading consistency concerns for ... not having a central db?
Even if your shop requires a DBA for any DB, it sounds like a really bad
deal.

Jim


On Fri, Aug 24, 2018 at 1:18 PM, David Gauthier wrote:

> [...]


Re: unorthodox use of PG for a customer

2018-08-24 Thread Adrian Klaver

On 08/24/2018 11:18 AM, David Gauthier wrote:

> [...]

So are the 58 databases (stores) on the workstations going to be working
with data independent of each other, or is the data shared/synced between
instances?









--
Adrian Klaver
adrian.kla...@aklaver.com



Re: unorthodox use of PG for a customer

2018-08-24 Thread Edson Carlos Ericksson Richter

On 24/08/2018 15:18, David Gauthier wrote:

> [...]



I'm no expert, but I've had dozens of PostgreSQL databases running mostly
without manual maintenance for years; just do the backups and you are fine.
In any case, if you need any kind of maintenance, you can program it in
your app (even backup, restore, and vacuum) - it is easy to issue
administrative commands through the available interfaces.
And if the database becomes inaccessible, no matter whether it is
centralized or remote, you will need someone physically there to fix it.
AFAIK, you don't even need the PostgreSQL installer - you can run it
embedded if you wish.
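
For instance, routine maintenance can be issued over an ordinary
connection like any other statement - a minimal sketch (scheduling it
from the app is an assumption about the setup):

   -- reclaim dead rows and refresh planner statistics in one pass
   VACUUM (ANALYZE, VERBOSE);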


Just my2c,

Edson




Re: unorthodox use of PG for a customer

2018-08-24 Thread David Gauthier
I tried to convince him of the wisdom of one central DB.  I'll try again.

>> So are the 58 databases (stores) on the workstations going to be working
>> with data independent of each other, or is the data shared/synced between
>> instances?

No, 58 workstations, each with its own DB.  There's a concept of a
"workarea" (really a dir with a lot of stuff in it) where the script runs.
He wants to tie all the runs for any one workarea together and is stuck on
the idea that there should be a separate DB per workarea.  I told him he
could just stick all the data in the same table, with a "workarea" column
to distinguish between the workareas.  He likes the idea of a separate DB
per workarea.  He just doesn't get it.


>> I'm no expert, but I've had dozens of PostgreSQL databases running
>> mostly without manual maintenance for years.

Ya, I've sort of had the same experience with PG DBs.  Like the Eveready
bunny, they just keep on running.  But these workstations are pretty
volatile, as they keep overloading and crashing them.  Of course any DB
running there would die too and have to be restarted/recovered.  So the
place for the DB is really elsewhere, on an external server that wouldn't
be subject to this volatility and crashing.  I told him about transactions
and how you could prevent partial writing of data sets.

So far, I'm not hearing of anything that looks like a solution given the
constraints he's put on this.  Don't get me wrong, he's a very smart and
sharp software engineer.  Very smart.  But for some reason, he doesn't like
the client/server DB model which would work so nicely here.  I'm just
trying to make sure I didn't miss some sort of solution, PG or not, that
would work here.

Thanks for your interest and input everyone !




On Fri, Aug 24, 2018 at 2:39 PM Edson Carlos Ericksson Richter <
rich...@simkorp.com.br> wrote:

> [...]


Re: unorthodox use of PG for a customer

2018-08-24 Thread Adrian Klaver

On 08/24/2018 12:07 PM, David Gauthier wrote:

> I tried to convince him of the wisdom of one central DB.  I'll try again.
> [...]
> He wants to tie all the runs for any one workarea together and is stuck
> on the idea that there should be a separate DB per workarea.
> [...]


Then use a SQLite db per work area: as part of the batch job, rsync the db
file to the remote server, do the work on the external server, and rsync it
back to the work area.








--
Adrian Klaver
adrian.kla...@aklaver.com



Re: unorthodox use of PG for a customer

2018-08-24 Thread Dimitri Maziuk
On 08/24/2018 02:07 PM, David Gauthier wrote:
> 
> ...  He likes the idea of a
> separate DB per workarea.  He just doesn't get it.

Well there are advantages to that.

> But for some reason, he doesn't like
> the client/server DB model which would work so nicely here.  I'm just
> trying to make sure I didn't miss some sort of solution, PG or not, that
> would work here.

What you should tell him is that he can't have it both ways. Either it's
multiple worker nodes concurrently writing to the same workstation -- in
which case he needs "a server" on the workstation to accept incoming
connections and all that overhead, with enough oomph to handle the
concurrency he expects. Or it's a beefed-up central server that everybody
writes to, and every workstation can poll it and maintain its own state
databases from there (i.e. worker nodes don't talk back to submit nodes).
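
Concretely, "a server" on the workstation means each local PG install
would have to be opened up for remote connections. A sketch of the knobs
involved (the values are assumptions, and pg_hba.conf entries for the
worker nodes would be needed as well):

   -- both settings require a server restart to take effect
   ALTER SYSTEM SET listen_addresses = '*';  -- default is 'localhost'
   ALTER SYSTEM SET max_connections = 200;   -- sized for the expected jobs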

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu





Re: unorthodox use of PG for a customer

2018-08-24 Thread Edson Carlos Ericksson Richter

On 24/08/2018 16:07, David Gauthier wrote:

> [...]


I think it's worth adding that, PG or not, if the workstation crashes you
will be in trouble with ANY database or file solution you choose - but
with PG you can minimize the risk by fine-tuning the flush to disk (both
in PG and in the OS). When correctly tuned, it works like a tank and is
hard to defeat.
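
The settings involved can be inspected from any session - a sketch
(which trade-offs are acceptable is of course workload-dependent):

   SHOW fsync;               -- keep on; crash safety depends on it
   SHOW synchronous_commit;  -- 'off' trades a small loss window for speed
   SHOW full_page_writes;    -- protects against torn pages after a crash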


Regards,

Edson.



Re: unorthodox use of PG for a customer

2018-08-24 Thread Andrew Kerber
Unless I am missing something, it sounds like you might be able to do this
with an nfs export shared to each workstation. But I am not sure if I
understood what you were describing either.

On Fri, Aug 24, 2018 at 2:22 PM Edson Carlos Ericksson Richter <
rich...@simkorp.com.br> wrote:

> [...]

Re: unorthodox use of PG for a customer

2018-08-24 Thread Dimitri Maziuk
On 08/24/2018 02:35 PM, Andrew Kerber wrote:
> Unless I am missing something, it sounds like you might be able to do this
> with an nfs export shared to each workstation.

That's no different from polling the central database though; you're
just using an nfs server and files instead of a db server and queries.

A compute cluster has to have a manager node that has all the state. The
way to do what he wants is to query that manager from each workstation
and keep the results in the local state database. Whether that is
actually feasible/doable in his particular case is another question.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu





Re: unorthodox use of PG for a customer

2018-08-24 Thread Tim Cross


David Gauthier  writes:

> Hi Everyone:
>
> I'm going to throw this internal customer request out for ideas, even
> though I think it's a bit crazy.  I'm on the brink of telling him it's
> impractical and/or inadvisable.  But maybe someone has a solution.
>
> He's writing a script/program that runs on a workstation and needs to write
> data to a DB.  This process also sends work to a batch system on a server
> farm external to the workstation that will create multiple, parallel
> jobs/processes that also have to write to the DB as well. The workstation
> may have many of these jobs running at the same time.  And there are 58
> workstation which all have/use locally mounted disks for this work.
>
> At first blush, this is easy.  Just create a DB on a server and have all
> those clients work with it.  But he's also adamant about having the DB on
> the same server(s) that ran the script AND on the locally mounted disk.  He
> said he doesn't want the overhead, dependencies and worries of anything
> like an external DB with a DBA, etc... . He also wants this to be fast.
>

I would agree the customer's proposed architecture has problems. It is
likely to be fragile and difficult to maintain. At some point, there
will likely be a need to consolidate the data in all these separate
databases, which could lead to other problems.

It sounds like there is some previous experience which has caused
problems for your customer, and they are trying to avoid a repeat by
dictating the technical solution rather than asking for one. The first
step is to spend more time talking to your customer and understanding
the underlying reasons why he is proposing those technical/architectural
constraints. Once you have full details of his requirements and the
risks as he perceives them, you will likely be able to come up with a
much more workable solution, one which both addresses his concerns and
is an architecture that is solid and maintainable. There is a good
chance the reasons will not all be purely technical.

My wild guess is that previously there have been problems with central IT
services - probably bureaucracy, poor response times or communication, or
a misalignment of expectations. I often encounter this type of problem
working with researchers who are very smart and informed in their local
area of expertise, but less so when it comes to understanding the
complexities, maintenance overheads and other activities associated with
providing reliable services (backups, upgrades, tuning etc.). The two
sides (IT and customer) frequently speak different languages even when
using the same words. It can be extremely challenging to get clear,
consistent and agreed requirements. For example, what does 'fast' mean?

The 'fast' requirement and the desire to have things run locally could
indicate a concern regarding network performance. I find performance is
often blamed on the network, but this is less often the case in modern
networks. More often than not the culprit is poor algorithm design, a
badly tuned database, a badly designed schema, or CPU and memory limits.

Pointing out the maintenance overheads and possible failure points in
his proposed architecture may help. Being able to collect real metrics
to demonstrate where bottlenecks and performance issues reside will also
help going forward - good metrics and hard numbers can often circumvent
circular arguments regarding problem causes.
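
One concrete way to get such numbers from PostgreSQL itself is the
pg_stat_statements extension - a sketch (it must first be added to
shared_preload_libraries, and column names vary slightly across versions):

   CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

   -- top statements by cumulative execution time
   SELECT query, calls, total_time
   FROM pg_stat_statements
   ORDER BY total_time DESC
   LIMIT 10;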

Also consider your internal business processes. I've seen too many good
architectural solutions become perceived as failures because the
non-technical stuff failed - poor communications, failure to align
technical maintenance with business requirements, unnecessary delays or
poor responses to enquiries, and an inability to adapt to changing
business requirements in a timely manner. This is often the most
frustrating part - you can be an excellent technical person able to
define and implement really good solutions, but if the customer is unable
to use the solution effectively, it will be seen as a technical failure.

Tim
--
Tim Cross