Full text search - wildcard and a stop word
Hi all,

I'm venturing into full text search in Postgres for the first time, and I'd like to be able to search by the start of a word, so I used the `:*` operator. However, this doesn't behave as I'd expect with a stop word. For example, my name is "Allan", so I often use it as a test string; it contains `all`, which is a stop word, which is how I noticed this issue. To illustrate:

=> select to_tsquery('al:*');
 to_tsquery
------------
 'al':*
(1 row)

=> select to_tsquery('all:*');
NOTICE:  text-search query contains only stop words or doesn't contain lexemes, ignored
 to_tsquery
------------

(1 row)

=> select to_tsquery('alla:*');
 to_tsquery
------------
 'alla':*
(1 row)

I understand why that is happening (the notice explains it), but given the wildcard at the end, it seems to me that it should return `'all':*` in this case. Is this by design, or could it be considered a bug?

I'm using Postgres 12.10.

Thanks,
Allan
Re: Full text search - wildcard and a stop word
Allan Jardine writes:
> => select to_tsquery('all:*');
> NOTICE: text-search query contains only stop words or doesn't contain
> lexemes, ignored
>  to_tsquery
> ------------
>
> (1 row)

> I get why that is happening - the notification basically details it, but
> the wildcard at the end seems to me that it should return `'all':*` in this
> case? Is this by design or could it be considered a bug?

It's a hard problem. If we don't normalize the presented word, we risk not matching cases that users would expect to match (because the word is going to be compared to data that probably *was* normalized).

In this particular case, you can skip the normalization by just not using to_tsquery:

n=# select 'all:*'::tsquery;
 tsquery
---------
 'all':*
(1 row)

but that might or might not be what you want in general. Perhaps the ideal behavior here would be "normalize, but don't throw away stopwords", but unfortunately our dictionary APIs don't support that.

regards, tom lane
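To make the practical difference concrete, here is a quick sketch of the two query-building approaches matched against stored data (outputs assume the default 'english' configuration):

```sql
-- Stored data is normalized ('Allan' becomes the lexeme 'allan'), and a
-- prefix query built with a plain cast still matches it, because 'all':*
-- matches any lexeme beginning with 'all':
select to_tsvector('english', 'Allan') @@ 'all:*'::tsquery;               -- true

-- to_tsquery drops the stop word, leaving an empty query that matches
-- nothing (and raises the NOTICE from the original post):
select to_tsvector('english', 'Allan') @@ to_tsquery('english', 'all:*'); -- false
```

The cast skips dictionary normalization entirely, so it only works as intended when the input is already lowercase and unstemmed; otherwise it can miss rows that a normalized query would find.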
AWS vs GCP storage
Hi,

I have a few databases with several TB-sized tables. We recently moved one of those databases from AWS EC2 to GCP. Today I ran VACUUM FREEZE on those tables, and every time I saw our application freeze up (and throw tons of errors) for a few minutes right after the message about the end of the vacuum (duration: 30182257.392 ms statement: vacuum freeze verbose ...). We never saw anything like that on AWS.

The database is 9.6. At the moment I have no details about the storage configuration on GCP. The machine was set up by another department.

Any suggestions on why that is happening and how to prevent it would be appreciated.

Thanks,
Torsten
Re: AWS vs GCP storage
On 2/22/22 12:27, Torsten Förtsch wrote:
> Hi,
>
> I have a few databases with several TB-sized tables. We recently moved one
> of those databases from AWS EC2 to GCP. Today I ran VACUUM FREEZE on those
> tables and every time I saw our application freezing up (and throwing tons
> of errors) for a few minutes right after the message about the end of
> vacuum (duration: 30182257.392 ms statement: vacuum freeze verbose ...).
> We never saw anything like that on AWS.

And the errors were?

> The database is 9.6. At the moment I have no details about the storage
> configuration on GCP. The machine was set up by another department.
>
> Any suggestions on why that is happening and how to prevent it would be
> appreciated.
>
> Thanks,
> Torsten

-- 
Adrian Klaver
adrian.kla...@aklaver.com
Re: AWS vs GCP storage
On Tue, Feb 22, 2022 at 12:27 PM Torsten Förtsch wrote:
> Hi,
>
> I have a few databases with several TB-sized tables. We recently moved one
> of those databases from AWS EC2 to GCP. Today I ran VACUUM FREEZE on those
> tables and every time I saw our application freezing up (and throwing tons
> of errors) for a few minutes right after the message about the end of
> vacuum (duration: 30182257.392 ms statement: vacuum freeze verbose ...).
> We never saw anything like that on AWS.
>
> The database is 9.6. At the moment I have no details about the storage
> configuration on GCP. The machine was set up by another department.
>
> Any suggestions on why that is happening and how to prevent it would be
> appreciated.

The first thing to check is that you're using SSD PDs, not regular PDs.

Second: GCP applies two independent disk IO throttling rules (separately for regular PDs and SSD PDs): one based on disk size, and one based on the number of vCPUs. You need more vCPUs to get the maximum disk IO, which may be counter-intuitive.

I suggest checking out the docs and benchmarking the disks using fio:
- https://cloud.google.com/compute/docs/disks/performance
- https://cloud.google.com/compute/docs/disks/benchmarking-pd-performance
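Independent of the storage tier, the cost-based vacuum delay (available in 9.6) can keep a manual VACUUM FREEZE from saturating a throttled disk in the first place. A minimal sketch; the values and table name here are illustrative assumptions, not recommendations:

```sql
-- Cost-based vacuum delay: vacuum sleeps vacuum_cost_delay ms each time it
-- has accumulated vacuum_cost_limit "cost points" of page I/O, which caps
-- its I/O rate at the price of a longer-running vacuum.
SET vacuum_cost_delay = 10;            -- ms to sleep per batch (default 0 = no throttling)
SET vacuum_cost_limit = 1000;          -- cost budget accumulated before each sleep
VACUUM (FREEZE, VERBOSE) my_big_table; -- my_big_table is a placeholder name
```

Tuning these down trades vacuum duration for steadier I/O, which may matter more on GCP if the disk's burst allowance is being exhausted near the end of the vacuum.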