Re: Folding Repeated Letters

2020-10-08 Thread Mike Drob
I was thinking about that, but there are words that are legitimately different with repeated consonants. My primary school teacher lost hair over getting us to learn the difference between desert and dessert. Maybe we need something that can borrow the boosting behaviour of fuzzy query - match the

Re: Question about solr commits

2020-10-08 Thread Erick Erickson
This is a bit confused. There will be only one timer that starts at time T when the first doc comes in. At T+15 seconds, all docs that have been received since time T will be committed. The first doc to hit Solr _after_ T+15 seconds starts a single new timer and the process repeats. Best, Erick >
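The timer behaviour described above can be sketched as a small simulation. This is only an illustration of the rule as stated in the message, not Solr's actual implementation: `commit_times` and `max_time` are hypothetical names, and treating a doc that arrives exactly at the deadline as part of the pending commit is an assumption.

```python
def commit_times(arrivals, max_time=15.0):
    """Return the times at which auto-commits would fire.

    arrivals: document arrival times in seconds (hypothetical input).
    max_time: the autoCommit interval in seconds.
    """
    commits = []
    deadline = None  # expiry time of the single running timer, if any
    for t in sorted(arrivals):
        # Only a doc arriving after the current timer has expired starts
        # a new timer; docs arriving earlier ride along with the pending
        # commit (a doc exactly at the deadline is assumed to ride along).
        if deadline is None or t > deadline:
            deadline = t + max_time
            commits.append(deadline)
    return commits


print(commit_times([0, 10]))      # one timer, one commit
print(commit_times([0, 10, 20]))  # the doc at t=20 starts a second timer
```

With updates at t=0 and t=10 and a 15-second interval, the sketch yields a single commit at t=15, since only one timer is ever running at a time.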

Re: Folding Repeated Letters

2020-10-08 Thread Andy Webb
How about something like this? { "add-field-type": [ { "name": "norepeat", "class": "solr.TextField", "analyzer": { "tokenizer": { "class": "solr.StandardTokenizerFactory" }, "filter

Folding Repeated Letters

2020-10-08 Thread Mike Drob
I'm looking for a way to transform words with repeated letters into the same token - does something like this exist out of the box? Do our stemmers support it? For example, say I would want all of these terms to return the same search results: YES YESSS YYYEEESSS YYEE[...]S I don't know how
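As a rough illustration of the folding being asked about, here is a minimal sketch that collapses runs of a repeated character with a regular expression. It is plain Python rather than a Solr analyzer, and `fold_repeats` is a hypothetical helper name. In Solr a similar pattern could presumably be applied with a pattern-replace filter, but that wiring is an assumption and not something this thread confirms.

```python
import re

# A character followed by one or more copies of itself.
_REPEATS = re.compile(r"(.)\1+")


def fold_repeats(token):
    """Lowercase the token and collapse each run of a repeated character."""
    return _REPEATS.sub(r"\1", token.lower())


for word in ["YES", "YESSS", "YYYEEESSS", "desert", "dessert"]:
    print(word, "->", fold_repeats(word))
```

Note that "desert" and "dessert" fold to the same token, which is exactly the false-match concern raised in the reply about legitimately doubled consonants.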

Re: Term too complex for spellcheck.q param

2020-10-08 Thread Andy Webb
I added the maxQueryLength option to DirectSolrSpellchecker in https://issues.apache.org/jira/browse/SOLR-14131 - that landed in 8.5.0 so should be available to you. Andy On Wed, 7 Oct 2020 at 23:53, gnandre wrote: > Is there a way to truncate spellcheck.q param value from Solr side? > > On Wed

Re: Solr endpoint on the public internet

2020-10-08 Thread Alexandre Rafalovitch
Could be a fun red/blue team exercise. Just watch out for those cryptominers that get in through Solr injection (among many other unsecured methods) and are a real pain to remove. Regards, Alex. P.s. Don't ask me how I know :-( P.p.s. Read-only docker container may still be a good layer of defenc

[ANNOUNCE] Apache Solr 8.6.3 released

2020-10-08 Thread Jason Gerlowski
The Lucene PMC is pleased to announce the release of Apache Solr 8.6.3. Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integrat

Re: Solr endpoint on the public internet

2020-10-08 Thread David Hastings
Welp. Never mind, I refer back to point #1: this is a bad idea. > On Oct 8, 2020, at 3:01 PM, Alexandre Rafalovitch wrote: > > The update handlers are now implicitly defined (3 or 4 of them). So, > it actually needs to be explicitly shadowed and overridden with other > Noop handler. And block Con

Re: Solr endpoint on the public internet

2020-10-08 Thread Alexandre Rafalovitch
The update handlers are now implicitly defined (3 or 4 of them). So, it actually needs to be explicitly shadowed and overridden with other Noop handler. And block Config API to avoid attackers creating new handlers. Regards, Alex. On Thu, 8 Oct 2020 at 14:54, David Hastings wrote: > > Well th

Re: Solr endpoint on the public internet

2020-10-08 Thread David Hastings
Well that’s why I suggested deleting the update handler :) > On Oct 8, 2020, at 2:52 PM, Walter Underwood wrote: > > Let me know where it is and I’ll delete all the documents in your collection. > It is easy, just one HTTP request. > > https://gist.github.com/nz/673027/313f70681daa985ea13ba33a

Re: Solr endpoint on the public internet

2020-10-08 Thread Walter Underwood
Let me know where it is and I’ll delete all the documents in your collection. It is easy, just one HTTP request. https://gist.github.com/nz/673027/313f70681daa985ea13ba33a385753aef951a0f3 wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Oct 8, 2020, a

Re: Solr endpoint on the public internet

2020-10-08 Thread Alexandre Rafalovitch
I think there were past discussions about people doing this, but they really, really knew what they were doing from a security perspective, not just a Solr one. You are increasing your risk factor a lot, so you need to think through this. What are you protecting and what are you exposing? Are you trying to

Re: Solr endpoint on the public internet

2020-10-08 Thread Jörn Franke
It is like opening a database to the Internet - you simply don’t do it and I don’t recommend it. If you want to do it despite the anti-pattern, use the latest Solr version and put a reverse proxy in front. Always use authentication and authorization. Only allow a minimal set of API endpoints and

Re: Solr endpoint on the public internet

2020-10-08 Thread Dave
#1. This is a HORRIBLE IDEA. #2. If I were going to do this, I would destroy the update request handler as well as the entire admin UI from the Solr instance, and set up replication from a secure Solr instance on an interval. This way no one could send an update/delete command, you could still update

Solr endpoint on the public internet

2020-10-08 Thread Marco Aurélio
Hi! We're looking into the option of setting up search with Solr without an intermediary application. This would mean our backend would index data into Solr and we would have a public Solr endpoint on the internet that would receive search requests directly. Since I couldn't find an existing solu

Re: Question about solr commits

2020-10-08 Thread Rahul Goswami
Shawn, So if the autoCommit interval is 15 seconds, and one update request arrives at t=0 and another at t=10 seconds, will there be two timers, one expiring at t=15 and another at t=25 seconds, but this would amount to ONLY ONE commit at t=15 since that one would include changes from both upda

'Exists' query not working for geospatial fields in Solr >= 8.5.0?

2020-10-08 Thread Ondra Horak
Hi, I just found Solr queries like field:* are not working anymore for fields of type SpatialRecursivePrefixTreeFieldType. It seems to work in 8.4.1, since 8.5.0 it just gives an empty result. Is this an intended behaviour, or a bug? Looking at Solr release notes I'd say it might be a consequenc

Re: Solr 8.6.2 - Admin UI Issue

2020-10-08 Thread Vinay Rajput
Thanks everyone for your replies. I definitely cleared browser cache and also tried in incognito mode to rule out this possibility. I think @Kevin got it right. This is the same issue already reported in SOLR-14549 Thanks, Vinay On Thu, Oct 8, 2

Re: Solr 8.6.2 - Admin UI Issue

2020-10-08 Thread Kevin Risden
Since the image didn't come through - it could be https://issues.apache.org/jira/browse/SOLR-14549 Definitely make sure to clear cache to ensure that JS files aren't cached, but if that doesn't fix it see if SOLR-14549 is related. Kevin Risden On Thu, Oct 8, 2020 at 9:38 AM Eric Pugh wrote: >

Re: Solr 8.6.2 - Admin UI Issue

2020-10-08 Thread Eric Pugh
I’ve seen this behavior as well when jumping between versions of Solr. Typically in the browser console I see some sort of very opaque JavaScript error. > On Oct 8, 2020, at 5:54 AM, Colvin Cowie wrote: > > Images won't be included on the mailing list. You need to put them > somewhere else and

Re: Master/Slave

2020-10-08 Thread Eric Pugh
I’ve met folks who’ve actually used the streaming expressions to move data around if you are looking for an “all Solr” approach. If you go down that route, I’d love to hear how it works. > On Oct 8, 2020, at 7:10 AM, Erick Erickson wrote: > > What Jan said. I wanted to add that the replica

Re: Master/Slave

2020-10-08 Thread Erick Erickson
What Jan said. I wanted to add that the replication API also makes use of it. A little-known fact: you can use the replication API in SolrCloud _without_ configuring anything in solrconfig.xml. You can specify the URL to pull from on the fly in the command…. Best, Erick > On Oct 8, 2020, at 2:
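To make the "specify the URL to pull from on the fly" remark concrete, here is a minimal sketch that only builds such a request URL and does not send it. The host names and core names are made up, `fetchindex_url` is a hypothetical helper, and the `command=fetchindex` / `masterUrl` parameter names are my assumption about the Solr 8.x replication handler, so check them against the reference guide for your version.

```python
from urllib.parse import urlencode


def fetchindex_url(target_core_url, source_core_url):
    """Build a replication request asking the target core to pull an index.

    target_core_url: base URL of the core that should fetch (hypothetical).
    source_core_url: base URL of the core to pull from (hypothetical).
    """
    # Parameter names are assumed for Solr 8.x; verify before relying on them.
    params = urlencode({"command": "fetchindex", "masterUrl": source_core_url})
    return f"{target_core_url}/replication?{params}"


url = fetchindex_url(
    "http://target:8983/solr/core1",
    "http://source:8983/solr/core1",
)
print(url)
```

Building the URL separately from sending it makes it easy to inspect exactly what would be requested before pointing it at a live node.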

Re: Solr 8.6.2 - Admin UI Issue

2020-10-08 Thread Colvin Cowie
Images won't be included on the mailing list. You need to put them somewhere else and link to them. With that said, if you're switching between versions, maybe your browser has the old UI cached? Try clearing the cache / viewing it in a private window and see if it's any different. On Wed, 7 Oct