Re: Meow attacks

2020-07-28 Thread Odysci
Folks, thanks for the replies. We do use VPCs in AWS and the ZK ports are only open to the Solr machines (also in the same VPC). We're using Solr 8.3 and ZK 3.5.6. We will investigate the Kerberos authentication. Thanks, Reinaldo On Tue, Jul 28, 2020 at 6:03 PM Jörn Franke wrote: > In Addition wh

Re: Searching for credit card numbers

2020-07-28 Thread Walter Underwood
If you reindex, I’ve become a big fan of adding a date field with an index timestamp. That will allow you to check whether everything has been reindexed. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Jul 28, 2020, at 2:11 PM, Jörn Franke wrot

Re: Searching for credit card numbers

2020-07-28 Thread Jörn Franke
A regex search at query time would leave room for attacks (e.g. a regex can easily be designed to block the Solr server forever). If the field is stored, you can also use a cursor to go through all entries and reindex each doc based on the field: https://lucene.apache.org/solr/
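The cursor walk suggested here could look roughly like the sketch below. It relies only on Solr's documented cursorMark behaviour (start with `cursorMark=*`, sort on the unique key, stop when `nextCursorMark` comes back unchanged). The function name `iter_all_docs`, the `fetch` callable and the default key `id` are assumptions made for illustration; `fetch` stands in for whatever HTTP client sends the parameters to the collection's select handler and returns the parsed JSON response.

```python
from typing import Callable, Dict, Iterator


def iter_all_docs(fetch: Callable[[Dict[str, str]], dict],
                  query: str = "*:*",
                  unique_key: str = "id",
                  rows: int = 500) -> Iterator[dict]:
    """Yield every document matching `query`, one cursor page at a time.

    `fetch` is a hypothetical callable: it takes the request parameters
    and returns the parsed JSON body of the Solr response.
    """
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        params = {
            "q": query,
            "rows": str(rows),
            # cursorMark requires a sort that includes the unique key
            "sort": f"{unique_key} asc",
            "cursorMark": cursor,
        }
        body = fetch(params)
        yield from body["response"]["docs"]
        next_cursor = body["nextCursorMark"]
        if next_cursor == cursor:
            # An unchanged cursor means there are no more results.
            break
        cursor = next_cursor
```

Each yielded doc can then be checked and re-sent to Solr with the extra flag field, provided all of the fields needed to rebuild the document are stored.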

Re: Meow attacks

2020-07-28 Thread Jörn Franke
In addition to what has been said before (use private networks/firewall rules) - activate Kerberos authentication so that only Solr hosts can write to ZK (the Solr client needs no write access) and use encryption where possible. Upgrade Solr to the latest version, use SSL, enable Kerberos, have cl

Re: Meow attacks

2020-07-28 Thread David Hastings
so, your zookeeper/solr servers have public facing addresses/ports? On Tue, Jul 28, 2020 at 4:41 PM Odysci wrote: > Folks, > > I suspect one of our Zookeeper installations on AWS was subject to a Meow > attack ( > > https://arstechnica.com/information-technology/2020/07/more-than-1000-database

Re: Meow attacks

2020-07-28 Thread matthew sporleder
On Tue, Jul 28, 2020 at 4:39 PM Odysci wrote: > > Folks, > > I suspect one of our Zookeeper installations on AWS was subject to a Meow > attack ( > https://arstechnica.com/information-technology/2020/07/more-than-1000-databases-have-been-nuked-by-mystery-meow-attack/ > ) > > Basically, the configu

Meow attacks

2020-07-28 Thread Odysci
Folks, I suspect one of our Zookeeper installations on AWS was subject to a Meow attack ( https://arstechnica.com/information-technology/2020/07/more-than-1000-databases-have-been-nuked-by-mystery-meow-attack/ ) Basically, the configuration for one of our collections disappeared from the Zookeepe

Re: Searching for credit card numbers

2020-07-28 Thread lstusr 5u93n4
Possible... yes. Agreed that this is the right approach. But if we already have a big index that we're searching through? Any way to "hack it"? On Tue, 28 Jul 2020 at 14:55, Walter Underwood wrote: > I’d do that at index time. Add an update request processor script that > does the regex and adds

Re: Searching for credit card numbers

2020-07-28 Thread Walter Underwood
I’d do that at index time. Add an update request processor script that does the regex and adds a field has_credit_card_number:true. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Jul 28, 2020, at 11:50 AM, lstusr 5u93n4 wrote: > > Let's say I have
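A minimal sketch of the index-time check described here, written in Python purely for illustration (a real update request processor script would run inside Solr's update chain). The candidate pattern, the Luhn checksum filter, the helper names and the default source field `text` are all assumptions and not something the thread specifies; only the flag field name `has_credit_card_number` comes from the message.

```python
import re

# Candidate: 13 to 19 digits, optionally separated by single spaces or
# hyphens, and not embedded in a longer run of digits.
CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,18}(?!\d)")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def has_credit_card_number(text: str) -> bool:
    """Return True if any digit run in the text looks like a card number."""
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_ok(digits):
            return True
    return False


def add_flag(doc: dict, field: str = "text") -> dict:
    """Return a copy of the doc with the boolean flag field added."""
    out = dict(doc)
    out["has_credit_card_number"] = has_credit_card_number(doc.get(field, ""))
    return out
```

With the flag in place, the search itself reduces to a plain filter on `has_credit_card_number:true`. The Luhn step is only there to cut down on false positives from order numbers and similar digit runs.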

Searching for credit card numbers

2020-07-28 Thread lstusr 5u93n4
Let's say I have a text field that's been indexed with the standard tokenizer, and I want to match the docs that have credit card numbers in them (this is for altruistic purposes, not nefarious ones!). What's the best way to build a search that will do this? Searching for " " see

Re: Production sizing and scaling guidelines -- Solr

2020-07-28 Thread Colvin Cowie
Maybe not the most up to date or relevant example for your usage but https://sbdevel.wordpress.com/2016/11/30/70tb-16b-docs-4-machines-1-solrcloud/ is one that sticks in my mind I definitely remember seeing a list of these sorts of blogs somewhere a long time ago... don't know where though On Tue,

Re: Production sizing and scaling guidelines -- Solr

2020-07-28 Thread Prashant Jyoti
Thanks Erick. 1> does Solr do what you want? You’re talking about reporting, and Solr is > primarily a search engine. That said, it has tons of analytics capabilities > built in. Depends on what “reporting” means in your situation. > There is a reporting UI which has various criteria the user can

Re: Production sizing and scaling guidelines -- Solr

2020-07-28 Thread Erick Erickson
Here’s a list of some sites using Solr: https://cwiki.apache.org/confluence/display/solr/PublicServers It’s not really what you’re looking for though, it doesn’t really have the details you’d like. There are two dimensions here: 1> does Solr do what you want? You’re talking about reporting, an

Production sizing and scaling guidelines -- Solr

2020-07-28 Thread Prashant Jyoti
Hi, I wanted to check if anybody has any references for tech companies' blogs detailing their Solr setup in production. I am more interested in storage and scaling guidelines. I intend to use Solr for one of my projects at work (back-end for a reporting tool) and need to convince higher management t

Re: Replication of Solr Model and feature store

2020-07-28 Thread krishan goyal
Hi Christine, I am using Solr 7.7. I am able to get it replicated now. I didn't know that the feature and model store are saved as files in the config structure. And by providing these names in the /replication handler, I can replicate them. I guess this is something that can be provided in the LTR do