Random function
Hi, I am trying to generate some random data using the random() function. However, I am getting the same result over multiple rows. This is a sample of the SQL I am using:

    select (select string_agg(random()::text, ';')
              from pg_catalog.generate_series(1, 3, 1))
      from generate_series(1, 10, 1)

And I am getting something like:

    |string_agg                                                |
    +----------------------------------------------------------+
    |0.243969671428203583;0.69257879443434;0.291524752043187618|
    |0.243969671428203583;0.69257879443434;0.291524752043187618|
    |0.243969671428203583;0.69257879443434;0.291524752043187618|
    |0.243969671428203583;0.69257879443434;0.291524752043187618|
    |0.243969671428203583;0.69257879443434;0.291524752043187618|
    |0.243969671428203583;0.69257879443434;0.291524752043187618|
    |0.243969671428203583;0.69257879443434;0.291524752043187618|
    |0.243969671428203583;0.69257879443434;0.291524752043187618|
    |0.243969671428203583;0.69257879443434;0.291524752043187618|
    |0.243969671428203583;0.69257879443434;0.291524752043187618|

If this is the expected output, is there a way to always generate random numbers?
Re: Random function
Luis Roberto Weck writes:
> I am trying to generate some random data using the random() function.
> However, I am getting the same result over multiple rows. This is a
> sample of the SQL I am using:
>
>    select (select string_agg(random()::text, ';')
>              from pg_catalog.generate_series(1, 3, 1))
>      from generate_series(1, 10, 1)

The sub-select is independent of the outer select, so it's only computed once, and then you get ten copies of that result. Restructuring the query, or inserting an artificial dependency on the outer select's data, would help.

			regards, tom lane
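[Editor's note: a minimal sketch of the "artificial dependency" approach Tom describes. The `WHERE g.i = g.i` clause is an illustrative no-op that references the outer row, so the subquery becomes correlated and must be re-evaluated for each of the ten outer rows; it is not from the thread itself.]

    -- Force per-row evaluation by referencing the outer column g.i
    -- inside the subquery (the predicate is always true):
    select (select string_agg(random()::text, ';')
              from generate_series(1, 3)
             where g.i = g.i)
      from generate_series(1, 10) as g(i);

With the correlation in place, each row of the outer generate_series produces a fresh set of random() values.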
Re: Random function
How is this a performance related question?

On Tue, Mar 24, 2020 at 11:10 AM Luis Roberto Weck <[email protected]> wrote:

> However, I am getting the same result over multiple rows. This is a
> sample of the SQL I am using:
>
>    select (select string_agg(random()::text, ';')
>              from pg_catalog.generate_series(1, 3, 1))
>      from generate_series(1, 10, 1)
>
> And I am getting something like:
>
> |string_agg                                                |
> +----------------------------------------------------------+
> |0.243969671428203583;0.69257879443434;0.291524752043187618|
> [ ... same row repeated ten times ... ]
>
> If this is the expected output,

Yes, you've asked it to compute a value, assign it to a column, then generate 10 rows of that value.

> is there a way to always generate random numbers?

Don't use a scalar subquery in the main target list. One possible answer:

    select format('%s;%s;%s', random(), random(), random())
    from generate_series(1, 10)

David J.
Re: Slow planning time when public schema included (12 vs. 9.4)
On Sat, Mar 21, 2020 at 11:55 PM Tom Lane wrote:

> Anders Steinlein writes:
> > This they most definitely are not. 9.4 was running on an old box, Ubuntu
> > 12.04, while 12 is on an up-to-date Ubuntu 18.04 LTS. AFAICS, 2.15 on the
> > 9.4 box and 2.27 on the 12 box.
>
> I'm suspicious that the root issue has to do with libc differences,
> but I haven't any hard data to back that up with.
>
> Another possibility perhaps is that v12's ANALYZE is collecting a lot
> more "common" values than 9.4 did. Whether it is or not, the advice
> you already got to ratchet down the stats target would likely be
> helpful to reduce the planning time.

Yes, indeed, lowering the statistics target to the default 100 decreased the planning time a lot -- to sub-10m! Thanks for the guidance, although the excessive difference between the two boxes/libc versions is disappointing, to say the least.

Do you have any insight into how the Postgres 12 nondeterministic collation feature (with ICU) compares performance-wise in general? Although having a much lower statistics target "fixed" this, I'm concerned that joins and sorting are slower in general after having uncovered this (we haven't dug into the performance numbers yet), since email (citext) columns are PKs in a lot of our tables. Would changing our email domain using citext to instead be a domain over text using a case-insensitive collation be a better choice?

Thanks again,
-- a.
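[Editor's note: for readers unfamiliar with the alternative being asked about, this is a sketch of the PostgreSQL 12+ nondeterministic-collation setup (requires a build with ICU support); the collation and domain names are illustrative, not from the thread. Note one known limitation: nondeterministic collations do not support pattern matching (LIKE, regexes) on affected columns.]

    -- A case-insensitive ICU collation: secondary strength ignores case
    -- differences but still distinguishes accents.
    CREATE COLLATION case_insensitive (
        provider = icu,
        locale = 'und-u-ks-level2',
        deterministic = false
    );

    -- A domain over plain text using that collation, as a possible
    -- replacement for a citext-based email domain:
    CREATE DOMAIN email_ci AS text COLLATE case_insensitive;

    -- Comparisons on the domain are now case-insensitive:
    SELECT 'Foo@Example.com'::email_ci = 'foo@example.com';

Unlike citext, which lower-cases both sides on every comparison, a nondeterministic collation pushes the case-insensitive semantics into the collation itself, so equality, sorting, and b-tree indexes all use it directly.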
