Re: CSVQL? CSV SQL? tab-separated table I/O? RENAME COLUMN
On Wed, 2 May 2018 16:01:01 -0700, Adrian Klaver wrote: >On 05/02/2018 02:29 PM, Jim Michaels wrote: > >> >> - the microsoft patented CSV would be required for implementation. it >> handles special data with commas and double-quotes in them > >Huh?: >https://en.wikipedia.org/wiki/Comma-separated_values#History Disclaimer ... I haven't investigated the claim. However, I would not discount the possibility that Microsoft really has patented some variation of CSV. They absolutely did *try* to copyright the use of + and - symbols for specifying addition and subtraction operations in VisualBASIC. It's possible that they slipped something past the examiners. But more likely the use of a CSV-like format was specified to be part of a larger process. In that case the format itself might not be claimed, but rather only the *use* of the format for some specific purpose. IANAL, George
Re: CSVQL? CSV SQL? tab-separated table I/O? RENAME COLUMN
On Thu, 3 May 2018 11:02:00 -0700, Adrian Klaver wrote: >On 05/03/2018 09:47 AM, George Neuner wrote: >> >> ..., I would not discount the possibility that Microsoft really >> has patented some variation of CSV. They absolutely did *try* to >> copyright the use of + and - symbols for specifying addition and >> subtraction operations in VisualBASIC. > >Not seeing it: > >http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=0&f=S&l=50&TERM1=microsoft&FIELD1=AANM&co1=AND&TERM2=csv&FIELD2=&d=PTXT That's the patent database. Microsoft tried to get a *copyright*. I don't recall whether it was granted [I don't believe it was], and this would have been circa ~1990, so it's hard to search for in any case. Unlike the patent database, the copyright database does not contain the protected material - it only gives archival references to it. It generated quite a bit of negative press coverage at the time. The basis of Microsoft's argument was that "x + y" was a unique and protectable expression of the addition concept because it could have been done in other ways, e.g., by "add(x,y)". George
Re: CSVQL? CSV SQL? tab-separated table I/O? RENAME COLUMN
On Sun, 6 May 2018 15:26:22 -0700, Ken Tanzer wrote: >On Fri, May 4, 2018 at 1:03 PM, George Neuner wrote: > >> On Thu, 3 May 2018 11:02:00 -0700, Adrian Klaver >> wrote: >> >> >On 05/03/2018 09:47 AM, George Neuner wrote: >> >> >> >> ..., I would not discount the possibility that Microsoft really >> >> has patented some variation of CSV. They absolutely did *try* to >> >> copyright the use of + and - symbols for specifying addition and >> >> subtraction operations in VisualBASIC. >> > >> >Not seeing it: >> > >> >http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=% >> 2Fnetahtml%2FPTO%2Fsearch-bool.html&r=0&f=S&l=50&TERM1= >> microsoft&FIELD1=AANM&co1=AND&TERM2=csv&FIELD2=&d=PTXT >> >> >> That's the patent database. Microsoft tried to get a *copyright*. I >> don't recall whether it was granted [I don't believe it was], and this >> would have been circa ~1990, so it's hard to search for in any case. >> Unlike the patent database, the copyright database does not contain >> the protected material - it only gives archival references to it. >> >> It generated quite a bit of negative press coverage at the time. The >> basis of Microsoft's argument was that "x + y" was a unique and >> protectable expression of the addition concept because it could have >> been done in other ways, e.g., by "add(x,y)". >> >> >> >I don't think in general you can copyright a file format. CSV is deliberately public domain, but I'm not at all certain that it could not be copyrighted otherwise. And I _am_ certain that a more restricted derivative could be copyrighted. [more below] You can copyright damn near anything. Under the Berne convention, the simple fact of authorship conveys certain rights even if the "work" is a verbatim copy of something else. The only rules - at least in the US - are that your expression of the idea is not known to be common usage, and that it is neither a verbatim copy, nor a derivative of a previously *registered* work. There are 3 kinds of copyrights: registered, explicit, and implicit. The US recognizes registered and explicit copyrights, but protection is available only for registered copyrights. The US does not follow the Berne convention wrt implicit or explicit copyrights. Explicit copyrights carry no legal weight in the US, but they *may* influence the court in case of a dispute because they do represent a claim of the work. The US does not recognize implicit copyrights at all. Checking that something is a verbatim copy of an existing work is not that easy ... every registered work is archived in the Library of Congress (LoC), but much of the LoC *still* is not electronically searchable. Determining whether something is or is not derivative of something else is not even within the US Copyright Office's mandate ... they leave that to the courts. >You can copyright things you create, Including data formats ... >and you can try to keep secret the information about how they work. Trade secret has nothing whatsoever to do with IP law. Secrets convey no legal protection, and a secret may still be an infringement on a protected, publicly available work. It might be hard to figure out that a secret is infringing ... >People can't steal your code to create CSV files, but >you can't tell people they can't string a bunch of values together with >commas in between if they can figure out how to do so all by themselves. >Plus it's hard to see how "fair use" wouldn't protect something as short as >"x+y", or ",". "Fair Use" has no effect on whether the work is copyrightable. 
It applies only after the fact, as a possible defense against infringement. It doesn't even apply here. Fair Use is not a general exception to infringement under the law - it is in fact specifically limited to educational and historical use. Fair Use does not prescribe any minimum length "sampling" of the work. It only establishes that some maximum length "sample" may be permitted to be copied verbatim into a new work. But, even there, the allowable sample size is not fixed under the law: it is ad hoc, per work, and decided after the fact by a court in the event of a dispute. Note that some countries do not recognize the concept of Fair Use. Punctuation alone is not copyrightable by law - the "work" is required to be intelligible [for some definition - computer binaries can be copyrighted]. The issue for a legitimate work would be whether it is either in the public domain, or otherwise is a common usage.
Re: Pgagent is not reading pgpass file either in Windows or Linux.
On Tue, 29 May 2018 13:32:46 -0700, Adrian Klaver wrote: >On 05/29/2018 12:14 PM, nageswara Bandla wrote: > >> As per the link- >> (https://www.postgresql.org/docs/9.6/static/libpq-pgpass.html) I set >> PGPASSFILE environment variable to point to pgpass.conf location. Even >> then, it’s throwing same above error message. I have found out that >> pgagent is not reading pgpass.conf file when configured under >> LocalSystem account. >> >> When I change the properties of the pgagent service to run under my >> login user account. Then, it’s reading pgpass.conf file under >> %APPDATA%/postgresql/pgpass.conf. >> >> I am clueless, why pgagent is not honoring PGPASSFILE env variable. > >My guess because the LocalSystem user does not have permissions on your: > >%APPDATA%/postgresql/pgpass.conf > >file. This seems to be confirmed by it working when you run pgagent as >the login user. LocalSystem has administrator permissions to virtually everything. https://msdn.microsoft.com/en-us/library/windows/desktop/ms684190(v=vs.85).aspx It should be able to read files belonging to any user. But the LocalSystem account can see only *global* environment variables ... it can't see any user specific ones. I would check if the PGPASSFILE variable is set globally or only in the user account. I don't know anything specifically about running pgagent on Windows, so I can't say why it is giving an error if the docs say it should not. George >passfile > > Specifies the name of the file used to store passwords (see Section >33.15). Defaults to ~/.pgpass, or %APPDATA%\postgresql\pgpass.conf on >Microsoft Windows. (No error is reported if this file does not exist.) > ^^
Re: Pgagent is not reading pgpass file either in Windows or Linux.
On Thu, 31 May 2018 11:24:18 -0500, nageswara Bandla wrote: >On Wed, May 30, 2018 at 7:45 PM, George Neuner wrote: > >> LocalSystem has administrator permissions to virtually everything. >> https://msdn.microsoft.com/en-us/library/windows/desktop/ >> ms684190(v=vs.85).aspx >> >> It should be able to read files belonging to any user. >> >> But the LocalSystem account can see only *global* environment >> variables ... it can't see any user specific ones. I would check if >> the PGPASSFILE variable is set globally or only in the user account. >> >> >> I don't know anything specifically about running pgagent on Windows, >> so I can't say why it is giving an error if the docs say it should >> not. >> > > >I am setting the PGPASSFILE in system environment variables. I am not >setting it in user specific environmental variables. It just occurred to me that you said PGPASSFILE was set to %APPDATA%/postgresql/pgpass.conf The problem may be that when LocalSystem expands %APPDATA%, it is finding its own directory, which might be any of: C:\Windows\ServiceProfiles\LocalService\appdata C:\Windows\System32\config\systemprofile\AppData C:\Windows\SysWOW64\config\systemprofile\AppData depending on your Windows version, policies (if any), and whether the executable is 32 or 64 bit. I wouldn't try messing with any of these directories. Instead try setting PGPASSFILE to the full path to your file. George
Re: Pgagent is not reading pgpass file either in Windows or Linux.
On Thu, 31 May 2018 15:40:21 -0500, nageswara Bandla wrote: >On Thu, May 31, 2018 at 12:57 PM, George Neuner >wrote: > >> It just occurred to me that you said PGPASSFILE was set to >> >> %APPDATA%/postgresql/pgpass.conf >> >> >> The problem may be that when LocalSystem expands %APPDATA%, it is >> finding its own directory, which might be any of: >> >>C:\Windows\ServiceProfiles\LocalService\appdata >>C:\Windows\System32\config\systemprofile\AppData >>C:\Windows\SysWOW64\config\systemprofile\AppData >> >> depending on your Windows version, policies (if any), and whether the >> executable is 32 or 64 bit. >> >> >> I wouldn't try messing with any of these directories. Instead try >> setting PGPASSFILE to the full path to your file. >> >> >I have tried all of them, pgagent is not recognizing any of the above >locations. In fact, I have tried both options > > #1. By defining PGPASSFILE to the above locations one after the other. > #2. By copying pgpass.conf to all the three locations by creating >Roaming/postgresql directories. > >And also I have defined PGPASSFILE=C:\pgpass.conf; I think, this should be >accessible to any system account. This also not working. One more stupid question and then I'm out of ideas ... Have you rebooted after changing the environment variable? Global environment changes normally don't take effect until the user logs out/in again. LocalSystem is not an interactive user - you have to restart the system to let it see environment changes. PITA. George
Re: Code of Conduct plan
On Sun, 03 Jun 2018 17:47:58 -0400, Tom Lane wrote: >Benjamin Scherrey writes: > >> Another more specific factual question - have there been incidents within >> the active Postgresql community where behaviour by individuals who are >> participants in the community have conducted themselves in a manner that >> brought on the actual need for such a code of conduct to exist in the first >> place? > >I believe there were a couple of unfortunate incidents at conferences. >Now, conferences are generally expected to have their own CoCs and enforce >them themselves; this CoC is meant more to cover on-line interactions. >You could argue that we shouldn't create such a CoC until something bad >happens on-line; but I'd prefer to think that having a CoC might prevent >that from ever happening at all, which is surely better. Unfortunately, conduct codes generally aren't worth the paper they are written on. People who are inclined to behave badly towards others in the 1st place will do so regardless of any code or any consequences of violating the code. The only thing a conduct code really accomplishes is to make some subset of the signers feel good about themselves. Actions are more important than words. YMMV. >In any case, we went over all these sorts of arguments at excruciating >length in 2016. It's quite clear to the core team that a majority of >the community wants a CoC. I don't think any useful purpose will be >served by re-litigating that point. > > regards, tom lane I remember that thread, but I don't remember any vote being taken. And the participants in the thread were self-selected for interest in the topic, so any consensus there is not necessarily reflective of the community at large. I am completely in favor of civil discourse and behavior, but I am not in favor of unenforceable red tape. Just my 2 cents. George
Re: Pgagent is not reading pgpass file either in Windows or Linux.
On 6/4/2018 6:31 PM, nageswara Bandla wrote: >I have figured out the issue with pgAgent both in Windows and Linux. PgAgent seems to ignore pgpass.conf/.pgpass whenever it has 127.0.0.1 (127.0.0.1:5432:*:postgres:postgres) throws an error: > >DEBUG: Creating DB connection: user=postgres port=5432 hostaddr=127.0.0.1 dbname=linuxpostgresdb >WARNING: Couldn't create the primary connection (attempt 1): fe_sendauth: no password supplied > >The solution could be update .pgpass to have ( localhost:5432:*:postgres:postgres ) and then pgagent works fine without issues. I think, pgagent is not inline with libpq.dll while passing host address parameter. I have raised this concern with pgagent github where exactly they need to change the code in order for pgagent to be in line with psql program. > >https://github.com/postgres/pgagent/issues/14 Wow !!! I don't use pgpass much at all - usually I want the security of providing the password manually. Since we started with the idea that the file wasn't being read properly, I was mainly trying to figure out how Windows could be screwing that up. There is a difference between localhost and 127.0.0.1: localhost is DNS resolved - usually from the local hosts file - and thus works with either IPv4 or IPv6 addressing. But it never would have occurred to me that using one vs the other, on an otherwise properly configured system, should cause an error. Great work finding that. George
Re: pg_dump out of memory
On Tue, 3 Jul 2018 21:43:38 -0500, Andy Colson wrote: >Hi All, > >I moved a physical box to a VM, and set its memory to 1Gig. Everything >runs fine except one backup: > > >/pub/backup# pg_dump -Fc -U postgres -f wildfire.backup wildfirep > >g_dump: Dumping the contents of table "ofrrds" failed: PQgetResult() failed. >pg_dump: Error message from server: ERROR: out of memory >DETAIL: Failed on request of size 1073741823. ^^ pg_dump is trying to allocate 1GB. Obviously it can't if 1GB is all you have. >pg_dump: The command was: COPY public.ofrrds (id, updateddate, bytes) TO >stdout; > >wildfire=# \dt+ ofrrds >List of relations > Schema | Name | Type | Owner | Size | Description >++---+---+---+- > public | ofrrds | table | andy | 15 MB | > > >wildfire=# \d ofrrds > Table "public.ofrrds" >Column| Type | Modifiers >-++--- > id | character varying(100) | not null > updateddate | bigint | not null > bytes | bytea | >Indexes: > "ofrrds_pk" PRIMARY KEY, btree (id) > There must be a heck of a lot of data in that bytea column. >I'm not sure how to get this backup to run. Any hints would be appreciated. As Adrian mentioned already, you're going to have to give it more memory somehow. Either more RAM or a big swap file. George
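A quick way to confirm that the bytea column is the culprit before resizing anything - this is just a sketch, using the column names from the \d output above - is to look at the largest individual values:

  SELECT id, octet_length(bytes) AS bytes_len
  FROM public.ofrrds
  ORDER BY bytes_len DESC
  LIMIT 10;

If a handful of rows dominate, it is the COPY of those rows that is blowing past the 1GB allocation.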
Re: User documentation vs Official Docs
On Thu, 19 Jul 2018 21:02:16 -0400, Melvin Davidson wrote: >As universities DO NOT ALLOW software to be installed on shared computers, >and this is the case especially in a library, it implies the user has >their own computer. Many (most?) universities do allow students to install and run software locally under their own accounts. Of course, that doesn't help a visitor using a lab or library computer. >As libraries allow users/citizens to request books be purchased at no >cost to the user/citizen, the argument that someone cannot afford a book >is now a moot point. Libraries can't afford to purchase everything anyone might want. However, libraries lend to one another as well as to their patrons. You always can ask your library to borrow the book from some other library that does have it. Most libraries belong to one or more inter-library lending networks. [Of course, there is no telling how long it will take to get a book that way - you may wait months for something that's hard to find, or if you happen to be far from a decent sized city.] Another nice thing: often when the book needs to be ordered from another library, you can keep it checked out longer than if it came from the local collection. E.g., my local library allows (most) books to be checked out for up to 2 weeks. Specially ordered books, however, can be checked out for up to 5 weeks. These times may not be typical, but wherever I have been, the libraries have allowed for keeping specially ordered books longer. YMMV, George
Re: Question on postgresql.conf
On Mon, 30 Jul 2018 18:35:59 +, "Lu, Dan" wrote: >I am going through the PostgreSQL manual and came across a question >hoping you can help me. It appears the "postgresql.conf" file is >needed by default to start Postgres. Since we have standard with >other RDBMS to store the configuration file on a shared location for >easy comparison, I would like to use a different name for this file >like .conf. Is there some reason that postgresql.conf cannot be a link to your file? George
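Either way, once the server is up, any client session can verify which configuration file was actually loaded - link or not:

  SHOW config_file;   -- full path of the postgresql.conf the server read at startup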
Re: Question on postgresql.conf
On Tue, 31 Jul 2018 12:59:03 -0500, Dimitri Maziuk wrote: >On 07/31/2018 12:33 PM, George Neuner wrote: > >> Is there some reason that postgresql.conf cannot be a link to your >> file? > >It's six of one, half a dozen of the other. The big problem is when the >network share is unavailable at host boot and all of your databases are >belong to bitbucket. If you want to do it right, you need chef/puppet as >already suggested. I understand the problem with network shares - but the OP mentioned only a "shared location", which could be just another directory on the same device. Because the OP said also that there were other DBMS being used in the same shop, I considered that there might be a respectable SAN involved. George
Re: Web GUI for PG table ?
On Thu, 12 Sep 2019 10:56:21 -0400, David Gauthier wrote: >Many good visualization options but I need one that runs on the web AND >allows insert/update/delete records. Most browser GUI toolkits have some kind of spreadsheet-like "grid" widget that allows editing the displayed data. Some frameworks also have "database" access widgets that can connect to the DBMS and execute queries there without needing additional middleware on the server. But understand that there likely will be some significant Javascript (or compatible) programming on the frontend to make it work as you desire. I've dabbled a bit in frontend web programming and found it to be a PITA ... I much prefer the server side. I've done a fair amount of complex desktop GUI programming over 20+ years (on Macintosh and on Windows, some of it before the introduction of the GUI frameworks), and personally I find browser application development to be quite unfriendly. YMMV. If you want to go the web route and you don't already have someone that is pretty well versed in frontend web coding, you might want to hire a professional. George
Re: Cascade Trigger Not Firing
On Sat, 14 Sep 2019 10:00:18 -0500, Ron wrote: >On 9/14/19 9:54 AM, Tom Lane wrote: >[snip >> The only mention of this feature in 38.1 "Overview of Trigger Behavior" >> is >> >> UPDATE triggers*can* moreover be set to fire only if certain columns >> are mentioned in the SET clause of the UPDATE statement. >> >> which seems to me to be plenty specific enough --- it is carefully >> *not* saying that the trigger will fire if the column changes value. >> The CREATE TRIGGER man page never says that, either. > >Given that the UPDATE "*can* ... be set to fire only if certain columns are >mentioned in the SET clause of the UPDATE statement", it logically follows >that the default behavior is something else (for example, if the field value >changes for whatever reason. But the default could be "any column mentioned", not necessarily any value changed. George
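To make the distinction concrete - all names below are hypothetical - the first trigger keys off the column being *mentioned* in the SET clause, the second off its value actually *changing*:

  -- fires when colx appears in the UPDATE's SET list, changed or not
  CREATE TRIGGER trg_col_mentioned
      AFTER UPDATE OF colx ON some_table
      FOR EACH ROW
      EXECUTE FUNCTION log_change();

  -- fires only when colx's value really differs
  CREATE TRIGGER trg_col_changed
      AFTER UPDATE ON some_table
      FOR EACH ROW
      WHEN (OLD.colx IS DISTINCT FROM NEW.colx)
      EXECUTE FUNCTION log_change();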
Re: Is this a bug ?
On Wed, 23 Oct 2019 11:27:12 -0500, Ron wrote: >On 10/23/19 11:20 AM, Geoff Winkless wrote: >> On Wed, 23 Oct 2019 at 17:09, Ron wrote: >> >>> As much as I hate to say it, MSFT was right to ignore this bug in the >>> standard. >> Standards are standards for a reason. It is almost never correct to >> deliberately ignore them. If you don't like them, then revise the >> standard. >> >> Historically Microsoft ignored standards either because they >> misunderstood them or because they wanted to lock in their customers, >> not for any reasons of altruism. >> >> For what it's worth, I can see a value to having >> >> SELECT 'this is quite a long string' >> 'which I've joined together ' >> 'across multiple lines'; >> >> although the advantage of it vs using a concat operator is slim. > >There is no advantage to using it vs using a concat operator, and all >disadvantage. It WAS an advantage querying interactively on 80 character text mode screens, and when SQL was embedded (directly) into programs written in other languages. Regardless of how recent the latest standard - SQL still has many ... warts really, but I will be charitable and call them "vestiges" ... of its roots as a 1970s language. YMMV, George
Re: Composite type storage overhead
On Wed, 23 Oct 2019 21:24:58 +, Laiszner Tamás wrote: Rob Sargent wrote >> Why not use UUID type? > 1. > Although it does not enforce, but the UUID type kind of suggests a > specific interpretation of the data. Of course the documentation says > you are free to use any algorithm to generate the values, but there > are quite a few standard UUID types and we are not planning to use > any of them. The usual interpretation suggests an IP address and a timestamp, but there is nothing that /requires/ that interpretation. The value is just a large integer and you can treat it that way if you want. Postgresql doesn't check UUID data for any particular format. You can store arbitrary integer values into UUID columns - with the caveat that, as UUIDs, you can only compare them and can't (easily) do any arithmetic. > 2. > The serialization format is different than needed by the application > and, while once again this is not a hard technical barrier, that > might cause slight additional complexity and confusion. Not sure what is the issue here. Admittedly I don't know how libpq passes binary UUIDs[*] but certainly they can be sent as strings if necessary. [*] Off the top of my head, I would guess a struct or array of four 32-bit values, but truly I haven't looked. > 3. > The value is logically defined as a 128-bit integer, that is in itself > a compound value split into a few "bit groups". Extracting these > parts can be done by simple (and supposedly efficient) bitwise > operators when stored as integer, but becomes much more cumbersome > with UUID, I guess. The groupings are only for display / printing, to make it easier for humans to read. WRT mixing logic into the key (sharding, etc.), all UUIDs except type 4 can provide you with a reasonable static sharding key. And even type 4 works if you shard by time. Also, keep in mind that a UUID can be generated by a client or a key server and thus have no relationship to the DBMS server that stores it. Depending on how you choose to generate and use it, a UUID doesn't necessarily carry or reveal any exploitable information. YMMV, George
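To illustrate the "just a large integer" point - any 32 hex digits are accepted, and comparison treats the value as an opaque 128 bits (values here are made up):

  -- no version/variant checking is applied on input
  SELECT 'deadbeef-0000-0000-0000-00000000cafe'::uuid;

  -- comparison (and btree indexing) work on the raw bytes
  SELECT 'deadbeef-0000-0000-0000-00000000cafe'::uuid
       < 'deadbeef-0000-0000-0000-00000000caff'::uuid;   -- true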
Re: select view definition from pg_views feature request
On Sun, 03 Nov 2019 16:15:23 -0500, Tom Lane wrote: >Michael Shapiro writes: >> It seems that the definition of a view from pg_catalog.pg_views does not >> qualify the tables used in the view if the tables are in the current search >> path. > >> Is it possible to either have the definition always qualify all tables >> independent of the search_path (or else provide a new column that does >> that)? > >Why don't you just change the search path to empty before selecting? > > regards, tom lane I'm guessing this is fine inside a transaction, but what if you're using a client that commits by statement? In that case, wouldn't the change to the path affect the other backends? George
Re: select view definition from pg_views feature request
On Tue, 5 Nov 2019 14:29:00 +1300, David Rowley wrote: >See https://www.postgresql.org/docs/current/sql-set.html > >"SET only affects the value used by the current session." > >Also: > >"The effects of SET LOCAL last only till the end of the current transaction" > >Neither affects other sessions. Ok, so you need to "ALTER DATABASE ..." for the change to affect everyone. Thanks for the clarification. George
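For the per-session case, a minimal sketch (view name hypothetical) - with an empty search_path the deparsed definition schema-qualifies the tables it references:

  BEGIN;
  SET LOCAL search_path TO '';
  SELECT pg_get_viewdef('public.my_view'::regclass, true);
  COMMIT;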
Re: status of CURSORs after DISCONNECT
On Thu, 28 Nov 2019 07:27:15 +0100, Matthias Apitz wrote: >When an ESQL/C written process issues a > >EXEC SQL DISCONNECT [connection]; > >do the opened CURSOR(s) still survive? We run into the problem that the >father process issues DISCONNECT before forking children, the forked child >CONNECTs to the same server and database again and "thinks" it has to CLOSE the >CURSOR (perhaps a long standing bug in our application we port now from >Sybase(...) to PostgreSQL). > >Thanks > > matthias Cursors are session specific so if the PG backend exits due to the disconnect, any cursors it is managing will be closed. However, if you are using a connection pooler to keep PG backend processes alive for reuse, then disconnecting does not necessarily end the session. In this case, you need to make sure any cursors are closed because the next connection to the same backend process may know nothing about them. In any event, it's good practice always to close cursors when you are done with them - if for no reason other than to release resources they are holding. George
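At the SQL level the lifecycle looks like this (names hypothetical); the CLOSE is what releases the resources, and non-holdable cursors disappear at transaction end regardless:

  BEGIN;
  DECLARE my_cur CURSOR FOR SELECT * FROM some_table;
  FETCH 100 FROM my_cur;
  CLOSE my_cur;     -- explicit cleanup; also implied at COMMIT for non-holdable cursors
  COMMIT;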
Re: status of CURSORs after DISCONNECT
On Thu, 28 Nov 2019 09:58:50 -0500, Tom Lane wrote: >Matthias Apitz writes: >> When an ESQL/C written process issues a >> EXEC SQL DISCONNECT [connection]; >> do the opened CURSOR(s) still survive? > >No. Cursors are purely session-local objects in Postgres. >I'm a bit surprised to hear it might be different in Sybase. > > regards, tom lane > Sybase has a different notion of "session" that permits multiple connections, and certain per session objects such as temp tables can (optionally) be shared among all connections to the same session. I'm not sure whether cursors similarly can be shared. Sybase allows cursors and temp tables to be statically declared in the schema DDL. Declared cursors and temp tables do not need to be "created" explicitly - if they do not already exist, they can be instantiated simply by mention in a query. I have never tried sharing a cursor (or even using a declared cursor). Queries in Sybase are *connection* specific, so I would think it would be an error for multiple queries to try to reference the same cursor name simultaneously. But I haven't worked with Sybase for some years so I am not up to date on the current software. George
Re: Fast, stable, portable hash function producing 4-byte or 8-byte values?
On Tue, 10 Dec 2019 18:00:02 -0600, Ron wrote: >On 12/10/19 3:11 PM, Erwin Brandstetter wrote: >> I am looking for stable hash functions producing 8-byte or 4-byte hashes >> from long text values in Postgres 10 or later. >> >> There is md5(), the result of which can be cast to uuid. This reliably >> produces practically unique, stable 16-byte values. I have usecases where >> an 8-byte or even 4-byte hash would be good enough to make collisions >> reasonably unlikely. (I can recheck on the full string) - and expression >> indexes substantially smaller. I could truncate md5 and cast back and >> forth, but that seems like a lot of wasted computation. Are there >> suggestions for text hash functions that are >> - fast >> - keep collisions to a minimum >> - stable across major Postgres versions (so expression indexes don't break) >> - croptographic aspect is not needed (acceptable, but no benefit) > >What about a CRC32 function? It's fast, and an SSE4 instruction has been in >Intel CPUs for about 10 years. On long text CRC will not be as discriminating as a real cryptohash, but it may be considerably faster to compute depending on the length of the text. Given that the OP can check the actual string in the event of a collision, it may work well enough. One way to make it more discriminating is to compute a pair of CRCs using different polynomials. Unfortunately the SSE4.2 CRC32 instruction uses a fixed polynomial. You can compute it twice using different initial values, but the result won't be as good as actually using different polynomials. I used to create 4-byte hashes by concatenating two 16-bit CRCs - one using the X-modem polynomial and the other using the CCITT polynomial. Since these gave very different results, it worked quite well for my use case where all the strings were guaranteed < 16KB. YMMV, George
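For comparison, the md5-truncation approach mentioned above is cheap to express directly in SQL - a rough sketch (the bigint/int results are signed reinterpretations, so negative values are normal):

  -- 8-byte hash: first 16 hex digits of md5, reinterpreted as bigint
  SELECT ('x' || left(md5('some long text value'), 16))::bit(64)::bigint;

  -- 4-byte variant
  SELECT ('x' || left(md5('some long text value'), 8))::bit(32)::int;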
Re: Fast, stable, portable hash function producing 4-byte or 8-byte values?
On Sun, 15 Dec 2019 20:23:25 -0600, Ron wrote: >On 12/15/19 3:59 PM, George Neuner wrote: > >> On long text CRC will not be as discriminating as a real cryptohash, > >When specifying a 4 byte hash, something must be sacrificed... Obviously. But the main point is that CRC never was designed to uniquely fingerprint data - it was designed to detect corruption of the data, which is a much weaker guarantee than the cryptodigest hashes. Despite being the same length as an MD5 hash, a 128-bit CRC still might not be as discriminating ... it depends greatly on the CRC polynomial used and on the length of the input. George
Re: Changing default ../data/ directory
On Sat, 4 Jan 2020 07:20:44 -0800 (PST), Rich Shepard wrote: >As this is all new to me I want to learn how to: > >1) Define a postgresl.conf and learn where it should be located. > >2) Move all current databased in /var/lib/pgsql/11/data to /data/pgsql/11. > >3) Create all new databases in /data/pgsql/11 by default. > >I'm sure it's in the manual yet I'm unsure just where to start reading. > >Rich Since the new drive is local you can just move the data directory to its new location and link to it from the default (/var) location. No configuration changes needed. Won't help you learn necessarily, but solves the problem. YMMV, George
Re: WaitForMultipleObjects in C Extension
On Tue, 14 Jan 2020 10:35:09 +0300, İlyas Derse wrote: > Hi guys I need your experiences. I made an Extension in C. When I stop the >query on PostgreSQL,I want to control my extension's result with use cases >in C.So I think , I should use WaitForMultipleObjects. Not really understanding this. >Is there a like a SignalObjectHandle() ? By the way I'm using Windows. Not exactly. Windows does have some signaling objects like Unix/Linux, but they exist only in the filesystem and network stack, and they are signaled only by events within those subsystems. IOW, they can't be signaled manually. For general communication/synchronization purposes Windows uses an event-based notification system. See: https://docs.microsoft.com/en-us/windows/win32/sync/using-synchronization https://docs.microsoft.com/en-us/windows/win32/sync/using-event-objects George
Re: Watching for view changes
On Thu, 20 Dec 2018 12:12:14 -0800, Mitar wrote: >On Thu, Dec 20, 2018 at 8:11 AM Mike Rylander wrote: >> If, as I assume, the FROM clause of the SELECT is essentially static, >> just with varying WHERE conditions, you could just use an AFTER >> INSERT/UPDATE/DELETE trigger on each of the tables involved to send a >> NOTIFY whenever data that is exposed by the SELECT is changed. > >You mean that for every SQL query I am making to a database, I would >manually determine which tables are involved and then setup triggers >with NOTIFY that the SELECT might have changed? You can just put an update trigger on every table. You aren't forced to listen for notifications. >I am trying to see if this could be something I could abstract out >that it would be done automatically for any query. I have looked into >using EXPLAIN to get a list of tables involved in a query, but the >issue is that it look non-trivial to determine what has really changed >as a consequence of those tables changing. I would have to cache >myself what was the result of a query and then myself do a diff? Are >there any ways to do this inside PostgreSQL without having to cache >results on the client and do it there? You don't need to cache anything on the client. An update trigger on a table can notify a listening client when data is changed. The difficulty is that views are not guaranteed to be updateable. As a technical matter, you can put a trigger on a view, but it may never fire. AFAIK, update and delete triggers do work on materialized views, because they really are physical tables. Depending on your use case, materialized views may or may not be of help to you. >> You can deliver a payload, such as the table name, primary key value and, >> with a little work, even a list of fields that were modified as a JSON >> blob, and let the application do whatever needs to be done to react >> the the changes -- issue other queries, etc. > >But this would mean that I would have to know how changes on involved >tables influence query results. The trigger function itself is agnostic WRT the format of the table - the old and new row data are provided generically as records, and you can convert the record data, e.g., to JSON or XML, without knowing its format. AFAIHS, you really only need to know the table format to inspect or modify the row data. >I would like to not have to do SQL query parsing and understanding >on the client. So ideally, I would get information directly from >PostgreSQL. For me, an API where I could do AFTER UPDATE trigger on >FOR EACH ROW on a view would be perfect. In that trigger I could get >information which rows of the view changed and then use NOTIFY to >inform the client. You might want to use BEFORE UPDATE so you get both the old and new row data. YMMV, George
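A rough sketch of the per-table trigger + NOTIFY approach - all names are made up, and NOTIFY payloads are limited to roughly 8000 bytes, so send only what the client needs:

  CREATE OR REPLACE FUNCTION notify_row_change() RETURNS trigger AS $$
  BEGIN
      -- send the table name, operation, and changed row as a JSON payload
      PERFORM pg_notify('table_changes',
                        json_build_object('table', TG_TABLE_NAME,
                                          'op',    TG_OP,
                                          'row',   row_to_json(NEW))::text);
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER my_table_notify
      AFTER INSERT OR UPDATE ON my_table
      FOR EACH ROW EXECUTE FUNCTION notify_row_change();

A DELETE variant would send OLD instead of NEW, and the client simply does LISTEN table_changes; and reacts to the JSON it receives.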
Re: Watching for view changes
On Fri, 21 Dec 2018 20:49:23 -0800, Mitar wrote: >On Thu, Dec 20, 2018 at 6:28 PM George Neuner wrote: > >> The difficulty is that views are not guaranteed to be updateable. > >What you mean by that? I mean, just to be clear. I care only about >read-only views. The changes to views I care about change view because >underlying tables are updated. I do not care about UPDATE queries >against views themselves. > >So you mean here that it is not guaranteed that you can make an UPDATE >query against a view? Yes, that I know. But if underlying tables >change, view is always updated, no? I mean, this is the main purpose >of a view. :-) Terminology. UPDATE vs update. I'm certain you know this already, but to be clear: A normal view really is just a SELECT whose results reflect the current data in the underlying tables. That is, a normal view is *code*, NOT data, and its result is an anonymous *virtual* table that exists only in the transaction that applied the view. Thus the results can't be monitored for changes without employing some kind of deliberate caching. A materialized view IS exactly such a deliberate cache of results from applying a view. It is a real table that can be monitored for changes using INSERT, UPDATE and/or DELETE triggers. >> As a technical matter, you can put a trigger on a view, but it may never >> fire. > >If fires only for UPDATE queries against views themselves (and only if >INSTEAD OF trigger does not invalidate the update). But it never fires >for updates which happen because of changes to the underlying tables. Right. A trigger on a (normal) view actually is on the virtual result table - but that table doesn't exist outside of the transaction that applied the view. I'm not certain offhand, but I *believe* that - like a SELECT - view results will change on the fly during a transaction if the underlying tables change and the isolation settings allow to see it. But even if a view will update while open, to use this you'd have to hold the result set open (with a cursor) while also keeping the transaction open. Once the result set is closed, the view results are gone. With a materialized view, you must apply the view code again (call REFRESH) to see changes to the underlying tables - it doesn't happen automagically. But when refreshed, triggers on the cache table would react to changes. >I would like to know: > >a) Do people agree/think that would be a good API for my use case? >b) How hard would it be to implement/develop something like that? Is >this something PostgreSQL already knows internally and it is just a >question of exposing it? It probably would be possible to take a trigger set on a view and transitively trigger on the base tables underlying it. But to what end? The results (of a normal view) don't persist beyond the current transaction, and allowing them to do so would, AFAICS, be a violation of the SQL standard. Postgresql is not going to do that (certainly not for a new feature, only for backward compatibility). >c) Is there some better way to achieve this? > : >Yes, materialized views are too heavy for me. But having update and >delete triggers only memory-only temporary views would be perfect. Unfortunately, materialized views are the only reasonable server side solution that I can think of. >Also, materialized views have to be manually refreshed, no? So it is >not really true that they get updated automatically (and that update >triggers would run) as soon as underling tables are modified? Yes. But you can do that with triggers on the base tables. 
Or timers if "most current" information is not critical. Perhaps you can use synchronized normal and materialized views. Use the normal view for current information in transactions, and the materialized view to asynchronously update passive LISTEN clients with delta information. >> >I would like to not have to do SQL query parsing and understanding >> >on the client. That I understand. I'm a compiler and language hobbyist ... even just parsing modern SQL can be painful. >> So ideally, I would get information directly from >> >PostgreSQL. For me, an API where I could do AFTER UPDATE trigger on >> >FOR EACH ROW on a view would be perfect. In that trigger I could get >> >information which rows of the view changed and then use NOTIFY to >> >inform the client. >> >> You might want to use BEFORE UPDATE so you get both the old and new >> row data. > >Not sure how this helps. If you know specifically what columns have changed, only they would need to be communicated to a LISTEN client. Not necessary to send the whole row (which may be much larger). AFTER shows you only the new row. BEFORE shows you both the old and new rows so you can determine which columns actually changed.
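A minimal sketch of the base-table trigger idea - hypothetical names, and a plain (blocking) REFRESH for simplicity:

  CREATE MATERIALIZED VIEW my_view_cache AS
      SELECT id, payload FROM base_table WHERE active;

  CREATE OR REPLACE FUNCTION refresh_my_view_cache() RETURNS trigger AS $$
  BEGIN
      REFRESH MATERIALIZED VIEW my_view_cache;
      RETURN NULL;   -- return value of a statement-level AFTER trigger is ignored
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER base_table_refresh
      AFTER INSERT OR UPDATE OR DELETE ON base_table
      FOR EACH STATEMENT EXECUTE FUNCTION refresh_my_view_cache();

You could also pg_notify() from the same function so listening clients know a refresh has happened.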
Re: Watching for view changes
On Fri, 21 Dec 2018 23:41:16 -0800, Mitar wrote: >Hi! > >On Fri, Dec 21, 2018 at 11:10 PM George Neuner wrote: >> A materialized view IS exactly such a deliberate cache of results from >> applying a view. It is a real table that can be monitored for changes >> using INSERT, UPDATE and/or DELETE triggers. > >Caching is needed if you want to compute a difference between previous >version and new. But if you want to just know new value, then I could >imagine that (a simple implementation would) on every change to any >underlying table check if this change matches selectors of the query >and if such apply its operations/projections and produce the new >value. Yes, that could be done. But it isn't. In effect you are asking the DBMS also to be a spreadsheet: i.e. change this cell and everything that depends on it gets recomputed. A spreadsheet is an order of magnitude simpler to implement than a DBMS, but the combination would be an order of magnitude (or more) harder. Even object graph databases don't offer spreadsheet functionality, and it would be a lot easier to do there than in a table relational system. >So yes, you need caching if you want to decrease CPU use, but you >could also see it as new values being computed again and again through >query. Would such caching you are mentioning really improve >performance, I do not know, so it might be premature optimization? It may take only 3 cycles to multiply two numbers, but it can take thousands of cycles [or millions if the data is on disk] to get those two numbers into the multiplier. There always are exceptions, but the general rule is that whenever the result requires: - significant computation, - significant resources, or - significant time then you should cache the result instead of recomputing it. Joins and sorts can take a whole lot of memory (and spill onto disk if they overflow the work buffer). A fetch of a table or index not in memory is simple but takes a lot of time - as well as maybe pushing something else out (increasing the complexity of a competing query). >If we do go down the cache path, then I agree, materialized views >seems nice, but I would prefer temporary materialized views: they >should be cleaned up at the end of the session. Moreover, they should >be ideally just in memory, not really on disk. Materialized views are >currently stored to disk, no? In PG, all *named* tables are backed on disk - even temporary tables. Only anonymous tables of query results can exist entirely in memory [and even they can spill onto disk when necessary]. With enough memory you can cache all your tables in shared buffers and have enough extra that you never run out of work buffers and never overflow a work buffer. But that is the best you can achieve with PG. George
Re: Watching for view changes
On Sun, 23 Dec 2018 23:06:51 -0800, Mitar wrote: >On Fri, Dec 21, 2018 at 11:10 PM George Neuner wrote: >> A materialized view IS exactly such a deliberate cache of results from >> applying a view. It is a real table that can be monitored for changes >> using INSERT, UPDATE and/or DELETE triggers. > >Are you sure one can use triggers on a materialized view? I am getting: > >"my_materialized_view" is not a table or view > >as an error when I am trying to create a trigger on materialized view. IIRC the cache table's name is generated. I don't know the proper incantations to get it from the catalogs, but an easy way to find it is to put the materialized view into its own tablespace, then search pg_tables for objects in that space. George
Re: Watching for view changes
On Sun, 23 Dec 2018 10:10:50 -0800, Mitar wrote: >I can then wire triggers on underlying tables to REFRESH materialized >views automatically. Is there some easy way to debounce those >refreshes? If I just blindly trigger REFRESH in every row trigger, >this could do a lot of work. I would prefer to combine all changes for >example into 100 ms batches and refresh only once per 100 ms. So in >some way I would like to be able to control the level of real-time I >would need. I have found a blog post [1] about this, but it seems >pretty tricky and requires an outside cron job. For 100 ms batching >time this feels like something better done inside PostgreSQL itself. see the pg_cron extension. But cron doesn't allow sub-second scheduling. Why do you need it so frequently? PG is not a real time platform, or even an all-in-memory database. If you try to use it that way, you are guaranteed to have problems. >The last question is how do I get changes in materialized views >streamed to the client. It seems one option is a trigger on the >materialized view which uses NOTIFY to tell the client about the >change. But NOTIFY has limit on the payload size, so I cannot just >send the change to the client. If you trigger by row, the NOTIFY payload is just that one row. If the data can't fit into the 8K payload, then you need to send some kind of row id and have the client read the changes explicitly. Also remember that you may have to deal with DELETEd rows. If you can't send row data by NOTIFY, then the client HAS to cache the whole view anyway to see what's been deleted. >It seems I would need additional table >to store the change and then client would get notification, read from >that table, and remove the rows read. So in a way I would need my own >queue for changes. There's no reason to do that. The client has to interpret the view changes and incorporate them into its own local data structures. Segregating new/modified rows separately on the server side seems to me to be a waste of effort. If it makes sense, have the client collect some number of notifications and read all the indicated rows in one query. George
Re: Determine in a trigger if UPDATE query really changed anything
On Sun, 23 Dec 2018 20:21:22 -0800, Mitar wrote: >Currently I am doing: > >CREATE TRIGGER some_trigger AFTER UPDATE ON my_table REFERENCING NEW >TABLE AS new_table OLD TABLE AS old_table FOR EACH STATEMENT EXECUTE >FUNCTION my_trigger(); > >In my trigger I do: > >PERFORM * FROM ((TABLE new_table EXCEPT TABLE new_table) UNION ALL >(TABLE new_table EXCEPT TABLE old_table)) AS differences LIMIT 1; >IF FOUND THEN > ... > >But I wonder if there is an easier way. I would just like to know if >an UPDATE really changed anything. > >For DELETE I do "PERFORM * FROM old_table LIMIT 1" and for INSERT I >do "PERFORM * FROM new_table LIMIT 1" and I think this is reasonable. >Still, not sure why I have to store the whole relation just to know if >statement really changed anything. You're still thinking in terms of deltas for the whole table. Think individual rows instead. With a BY ROW trigger, the difference between the NEW and OLD variables lets you see the changes to the particular row. George
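For example, reusing the names from the quoted message, the per-row form can skip firing entirely when a row is rewritten with identical values:

  CREATE TRIGGER some_trigger
      AFTER UPDATE ON my_table
      FOR EACH ROW
      WHEN (OLD.* IS DISTINCT FROM NEW.*)
      EXECUTE FUNCTION my_trigger();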
Re: Watching for view changes
On Mon, 24 Dec 2018 01:16:53 -0800, Mitar wrote: > >I am unable to find cache table's name. :-( > Did you execute the view or simply define it? I think the table is created the 1st time the view is executed. George
Re: Varlena with recursive data structures?
On Wed, 16 Jan 2019 22:08:35 +, Sam Patterson wrote: >Hi all, > >I've recently started developing an extension for Postgres for which I'll >need to create a new variable-length base type. The type will require a >tree-like structure in order to parse sufficiently, which of course >probably means having some sort of recursive data structure, like a struct >that has members which are pointers to itself for child nodes. After doing >some research, specifically looking at how other variable-length data types >store their data, it seems almost all of them store the data in a binary >representation, using bit masks and offsets etc in order to store/access >the data whilst having an in-memory representation that's used to >manipulate the data. > >I presume the purpose for using this approach is because all the data in a >varlena type has to be contiguous, and the moment you start using pointers >this is no longer possible. So my question is, given a structure that looks >something like this, > >typedef struct Node >{ >char *data; >Node *left; >Node *right; >} Node; > >am I right in saying that I wouldn't be able to store that representation >on-disk, but instead I'd have to transform it into some binary >representation and back again when writing/reading respectively, are there >any alternatives? > >Regards, > >Karl You might want to consider using a linear tree representation that can be searched directly. See: https://en.wikipedia.org/w/index.php?title=Binary_tree&section=9#Arrays E.g., Make the tree nodes: { unsigned data, left, right } Store the nodes contiguously as an array, and represent the left/right branches using simple array indexing. Write the node data contiguously into a separate array/stream, keeping an offset [or index] to each data element in its corresponding tree node. And then write the node array and the data array/stream contiguously onto the disk. For reading/searching this structure can be used directly: read in the homogenous node array, and suck the variable length data stream into a separate buffer. If it isn't important that the tree be optimally ordered, then insert is simple: just append a new node to the array, append its data to the data stream, and update branch indexes to link the new node at the correct spot in the tree. If you want the data to be as small as possible on disk, then update or delete of a node requires repacking the data and adjusting the data offsets of the "unchanged" nodes. https://en.wikipedia.org/wiki/Mark-compact_algorithm [Obviously, you can delete without repacking by just unlinking the tree node and rewriting the node array. The "deleted" node data is left in place on disk but it becomes unreachable.] If you can afford to (temporarily) sort the tree nodes by their data offset, repacking the data can be done in the same buffer by sliding each element forward to be contiguous with the previous. If you need to take the nodes in whatever order they occur, then to pack the data you need to copy it to a new buffer. Hope this helps. George
Re: Future Non-server Windows support???
On Sun, 03 Mar 2019 08:30:48 +0100, Laurenz Albe wrote: >Bill Haught wrote: >> My main concern is that Microsoft has Enterprise versions of Windows and >> versions for everything else which makes me wonder if at some point >> Windows versions for desktop use may not have features needed by some >> database applications or differences between the versions may be enough >> to necessitate slight tweaks to code and compiling additional versions. > >Speaking as a semi-ignorant, I had the impressions that all Windows versions >are pretty similar under the hood (with occasional annoying behavior changes), >and most of the differences are on the GUI level, while the C API is pretty >much the same. > >Yours, >Laurenz Albe Server versions are optimized for running background services rather than interactive programs. I don't know all the differences offhand, but I do know the servers use different algorithms (not just settings) for scheduling, memory management, and file caching. It isn't possible to tweak a Windows desktop into a server with just settings - the server actually is running (at least some) different code. George
Re: Is it possible to sort strings in EBCDIC order in PostgreSQL server?
On Tue, 12 Dec 2017 07:18:10 -0600, John McKown wrote: >?This is a guess on my part, based on many years on an EBCDIC system. But >I'll bet that they are doing a conversion off of the EBCDIC system (maybe >Db2 on z/OS) to an ASCII system (Linux or Windows) running PostgreSQL. They >want to be able to compare the output from the existing system to the >output on the new system. [he asks naively:] If a port off the EBCDIC system really is the case, wouldn't it be easier just to write an external compare tool? Or maybe a PG function to convert UTF strings as EBCDIC so as to compare string column dumps? Either of the above seems more rational than putting a whole lot of effort into creating an EBCDIC collation. [Not that it is terribly hard to create a new collation. It's more a matter of "why bother?" when you are trying to move away from that system in the 1st place.] YMMV, George
Re: data-checksums
"On Tue, 9 Jan 2018 20:02:37 +0100, "Rakesh Kumar" wrote: >Hello Mr. Pedantic, > >> noticeable != huge. > >and noticeable != small/negligible either, at least from English >point of view. small != negligible. The word "noticable" does not imply any particular magnitude of event. It means only that can be observed. There is no canon technical usage. In layman's use, "noticable" often *does* imply that an effect is small, and that one might not see it if not looking specifically for it. Unfortunately, English is a slippery language. Perhaps technical documents should be written in Sumerian. YMMV, George
Re: SSD filesystem aligned to DBMS
On Tue, 16 Jan 2018 16:50:28 +, Michael Loftis wrote: >Alignment definitely makes a difference for writes. It can also make a >difference for random reads as well since the underlying read may not line >up to the hardware add in a read ahead (at drive or OS Level) and youre >reading far more data in the drive than the OS asks for. Best performance will be when the filesystem block size matches the SSD's writeable *data* block size. The SSD also has a separate erase sector size which is some (large) multiple of the data block size. Recall that an SSD doesn't overwrite existing data blocks. When you update a file, the updates are written out to *new* "clean" data blocks, and the file's block index is updated to reflect the new structure. The old data blocks are marked "free+dirty". They must be erased (become "free+clean") before reuse. Depending on the drive size, the SSD's erase sectors may be anywhere from 64MB..512MB in size, and so a single erase sector will hold many individually writeable data blocks. When an erase sector is cleaned, ALL the data blocks it contains are erased. If any still contain good data, they must be relocated before the erase can be done. You don't want your filesystem block to be smaller than the SSD data block, because then you are subject to *unnecessary* write amplification: the drive controller has to read/modify/write a whole data block to change any part of it. But, conversely, filesystem blocks that are larger than the SSD write block typically are not a problem because ... unless you do something really stupid [with really low level code] ... the large filesystem blocks will end up being an exact multiple of data blocks. Much of the literature re: alignment actually is related to the erase sectors rather than the data blocks and is targeted at embedded systems that are not using conventional filesystems but rather are accessing the raw SSD. You do want your partitions to start on erase sector boundaries, but that usually is trivial to do. >Stupidly a lot of this isnt published by a lot of SSD manufacturers, but >through benchmarks it shows up. Yes. The advice to match your filesystem to the data block size is not often given. >Another potential difference here with SAS vs SATA is the maximum queue >depth supported by the protocol and drive. Yes. The interface, and how it is configured, matters greatly. >SSD drives also do internal housekeeping tasks for wear leveling on writing. The biggest of which is always writing to a new location. Enterprise grade SSD's sometimes do perform erases ahead of time during idle periods, but cheap drives often wait until the free+dirty space is to be reused. >Ive seen SSD drives benchmark with 80-90MB sequential read or write, >change the alignment, and youll get 400+ on the same drive with sequential >reads (changing nothing else) > >A specific example >https://www.servethehome.com/ssd-alignment-quickly-benchmark-ssd/ I believe you have seen it, but if the read performance changed that drastically, then the controller/driver was doing something awfully stupid ... e.g., re-reading the same data block for each filesystem block it contains. YMMV. George
Re: Regex Replace with 2 conditions
On Mon, 5 Feb 2018 18:22:02 +0100, Francisco Olarte wrote: >I repeat for the last time. YOU ARE NOT USING ASCII. ASCII IS A SEVEN >BIT CODE, 0-128. "?" IS NOT IN THE ASCII CHARACTER SET. What ASCII table are you reading? The question mark symbol is #63. It lies between the numbers and the capital letter set. George
Re: Regex Replace with 2 conditions
On 2/6/2018 10:52 AM, David G. Johnston wrote: >On Tue, Feb 6, 2018 at 8:46 AM, George Neuner wrote: >>On Mon, 5 Feb 2018 18:22:02 +0100, Francisco Olarte wrote: >>>I repeat for the last time. YOU ARE NOT USING ASCII. ASCII IS A SEVEN >>>BIT CODE, 0-128. "?" IS NOT IN THE ASCII CHARACTER SET. >>What ASCII table are you reading? The question mark symbol is #63. It lies between the numbers and the capital letter set. >Your mail client mangled that sentence - the "?" you are seeing is a placeholder for the non-ASCII character "'a' with a bowl on top of it"... Thanks David. Apologies to everyone for the noise. George
Re: Regex Replace with 2 conditions
On Tue, 6 Feb 2018 17:57:33 +0100, Francisco Olarte wrote: >So, no question mark sent, I suspect your mail chain may be playing >tricks on you, or may be you are translating to 7 bits on purpose >since your mail came with the headers: > >Content-Type: text/plain; charset=us-ascii >Content-Transfer-Encoding: 7bit ><<< > >I'll suggest you fix that before participating in threads with unicode >content. Also, many programs use ? as a placeholder for something not >in its charset, so always suspect you are not seeing the right char >when you encounter one of this things. Usually I do see unicode characters correctly. My news client defaults to *sending* in ISO-8859-1 (US ASCII), but it displays incoming messages in UTF-8, and in HTML if applicable ... so I'm not sure why I'm not seeing whatever it was that you actually typed. It does keep coming through as a question mark in all the responses. I read this group through the Gmane mail->news reflector ... maybe that has something to do with it? George
Re: Connection loosing at some places - caused by firewall
On Tue, 14 Nov 2017 12:09:31 +0100, Durumdara wrote:

>*I disabled my firewall at home - the [keepalive] problem vanished!!!*

What firewall are you using?  Windows' own firewall doesn't interfere with
keepalive packets.  Most commercial SOHO firewalls won't either.

George
Re: shared_buffers 8GB maximum
On Sat, 17 Feb 2018 00:36:57 +0200, Vitaliy Garnashevich wrote:

>- I'm not a huge Linux expert, but I've heard someone saying that
>reading from the filesystem cache requires a context switch.

Yes.

>I suspect
>that such reads are slightly more expensive now after the
>Meltdown/Spectre patch in the kernel.

Not necessarily - it depends on exactly what was changed ... which
unfortunately I don't know for certain.

Any filesystem call is a kernel transition.  That's a Meltdown issue.
Meltdown can be avoided by using trampoline functions to call the (real)
kernel functions and isolating each trampoline so that no other code
immediately follows it.  This wastes some memory but there is very little
added time cost.

Spectre is about snooping within the user space of a single process - it
has nothing to do with kernel calls.  The issues with Spectre are things
like untrusted code breaking out of "sandboxes", snooping on password
handling or encryption, etc.  Fixing Spectre requires purposefully limiting
speculative execution of code and can significantly affect performance.
But the effects are situation dependent.

>Could that be a reason for increasing the value of shared_buffers?
>
>- Could shared_buffers=128GB or more on a 250 GB RAM server be a
>reasonable setting? What downsides could there be?

It depends.  8GB is pretty small for such a large server, but taking 1/2
the RAM is not necessarily the right thing either.

The size of shared buffers affects log size and the time to complete
checkpoints.  If a large(ish) percentage of your workload is writes, having
a very large shared space could be bad for performance, or bad for space on
the log device.

Another reason may be that the server is not dedicated to PG but does other
things as well.  Dramatically increasing PG's memory use may negatively
impact something else.

>PS. Some background. We had shared_buffers=8GB initially. In
>pg_stat_bgwriter we saw that dirty buffers were written to disk more
>frequently by backends than during checkpoints (buffers_clean >
>buffers_checkpoint, and buffers_backend > buffers_checkpoint). According
>to pg_buffercache extension, there was very small percentage of dirty
>pages in shared buffers. The percentage of pages with usagecount >= 3
>was also low. Some of our more frequently used tables and indexes are
>more than 10 GB in size. This all suggested that probably the bigger
>tables and indexes, whenever scanned, are constantly flushing pages from
>the shared buffers area. After increasing shared_buffers to 32GB, the
>picture started looking healthier. There were 1GB+ of dirty pages in
>shared buffers (vs 40-200MB before), 30-60% of pages with usagecount >=
>3 (vs 10-40% before), buffers_checkpoint started to grow faster than
>buffers_clean or buffers_backend. There is still not all frequently used
>data fits in shared_buffers, so we're considering to increase the
>parameter more. I wanted to have some idea about how big it could
>reasonably be.

So now you know that 32GB is better for your workload than 8GB.  But that
is not necessarily a reason to immediately go crazy with it.  Try
increasing it gradually - e.g., adding 16GB at a time - and see if the
additional shared space provides any real benefit.

>PPS. I know any possible answer might be just a recommendation, and
>parameter values should be tested for each specific case, but still
>wanted to hear your opinion. Thanks.
>
>Regards,
>Vitaliy

George
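P.S.  If it helps, the numbers you are watching can be pulled
programmatically between each step.  A rough sketch using psycopg2 - the
DSN is a placeholder, and it assumes the pg_buffercache extension is
installed in the target database:

# Track the ratios described above while growing shared_buffers in steps.
import psycopg2

conn = psycopg2.connect("dbname=mydb user=postgres")   # hypothetical DSN
cur = conn.cursor()

# Backend vs. checkpoint writes: ideally buffers_checkpoint does most work.
cur.execute("""
    SELECT buffers_checkpoint, buffers_clean, buffers_backend
    FROM pg_stat_bgwriter
""")
print(dict(zip(('checkpoint', 'clean', 'backend'), cur.fetchone())))

# Dirty pages and "hot" pages (usagecount >= 3) currently in shared_buffers.
cur.execute("""
    SELECT count(*) FILTER (WHERE isdirty)         AS dirty,
           count(*) FILTER (WHERE usagecount >= 3) AS hot,
           count(*)                                AS total
    FROM pg_buffercache
""")
dirty, hot, total = cur.fetchone()
print(f"dirty: {100*dirty/total:.1f}%   usagecount>=3: {100*hot/total:.1f}%")

cur.close()
conn.close()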
Re: Database health check/auditing
On Sat, 17 Feb 2018 07:23:32 +1100, Tim Cross wrote:

>Thomas Kellerer writes:
>
>> Plus: trimming the original content, so that not the whole email
>> thread is repeated in the quote.
>
>While I'm happy to comply, I disagree with trimming/editing the
>thread. Certainly made sense when networks were slower and MUAs were
>less sophisticated, but these days I prefer using a MUA which is able to
>fold the original content and keep all the content in the message. This
>reduces confusion, misunderstanding and incorrect attribution arising
>from poor editing and makes it easy to reference the full discussion in
>one spot rather than having to go back through messages putting it all
>back together. YMMV.
>
>Tim

The problem is that conversations rarely remain confined to a single topic
[this thread for example 8-)].

The list servers do have message size limits (independent of any email
server limits).  It isn't a problem for simple text messages, but a lot of
people now are sending MIME messages that include all the text twice (or 3
times) plus embedded HTML+CSS formatting.  On some lists, I have seen MIME
messages (mostly sent from Google Groups users) that contain - not kidding
- 5,000+ lines of hidden HTML+CSS surrounding a few lines of comment.

I agree with you that messages should stand alone wrt the subject they
address.  It's just that, too often, messages that contain a lot of quoted
material have wandered considerably and so are mostly noise.

YMMV,
George
Re: Connection loosing at some places - caused by firewall
On Mon, 19 Feb 2018 13:02:30 +0100, Durumdara wrote:

>2018-02-13 21:21 GMT+01:00 George Neuner :
>
>> On Tue, 14 Nov 2017 12:09:31 +0100, Durumdara
>> wrote:
>>
>> >*I disabled my firewall at home - the [keepalive] problem vanished!!!*
>>
>> What firewall are you using?  Windows' own firewall doesn't interfere
>> with keepalive packets.  Most commercial SOHO firewalls won't either.
>>
>Normal Windows Firewall (Defender). An ASUS test notebook works. The DELL
>isn't.
>So something is different in the FW on DELL. With manually defined (faster)
>keepalive it's working.

Keepalive is part of the TCP protocol - it's an ACK that repeats the last
used packet sequence number.  Under normal circumstances, the sequence
number is incremented (modulo rollover) for each new packet.  Keepalive is
recognized as a deliberate and particular breaking of the packet sequence.

Windows firewall has no settings pertaining to keepalive separate from the
operating system.  Once you have permitted TCP connections to the port or
program, if any keepalive packets are sent, the firewall must allow them
through.  [Of course, both sides must agree on the keepalive settings for
it to work, but that is a different issue.]

It may be that your firewall is f'd up.  The firewall does perform
"stateful inspection" of open TCP connections - essentially looking for
protocol mistakes that may indicate intrusion hacks or replayed
connections.  It could be that the SI module is losing synchronization when
the connection goes idle.  [Back in the day, there were a number of SOHO
NAT routers that had this and other problems with their stateful inspection
code.]

You can try running a system file check to see if something has gotten
corrupted:
https://support.microsoft.com/en-us/help/929833/use-the-system-file-checker-tool-to-repair-missing-or-corrupted-system

You also can try resetting the firewall rules to default from the control
panel.  Of course then you'll have to reauthorize every program that
requires a connection.

For more information on how the firewall works, see:
https://technet.microsoft.com/en-us/library/cc779199(v=ws.10).aspx

Hope this helps,
George
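P.S.  For anyone else hitting this: the "manually defined (faster)
keepalive" workaround can be set entirely on the client side through
libpq's keepalive connection parameters.  A sketch using psycopg2 - the
host/database/user values are placeholders:

# Client-side TCP keepalive tuning via libpq connection parameters.
# Sending keepalives well inside a flaky firewall's idle timeout keeps
# the connection's state-table entry alive.
import psycopg2

conn = psycopg2.connect(
    host="db.example.com",       # hypothetical server
    dbname="mydb",
    user="app",
    keepalives=1,                # turn TCP keepalives on for this connection
    keepalives_idle=30,          # seconds of idle time before the first probe
    keepalives_interval=10,      # seconds between probes
    keepalives_count=3,          # probes lost before the connection is dropped
)

with conn.cursor() as cur:
    cur.execute("SELECT 1")      # connection now survives long idle periods
    print(cur.fetchone())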