Re: [ANNOUNCE] Apache Solr 8.8.1 released

2021-02-27 Thread David Smiley
The corresponding docker image has been released as well: https://hub.docker.com/_/solr (credit to Tobias Kässmann for helping) ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Tue, Feb 23, 2021 at 10:39 AM Timothy Potter wrote: > The Lucene PMC

RE: Dynamic starting or stoping of zookeepers in a cluster

2021-02-24 Thread DAVID MARTIN NIETO
One question about this: In order to have a highly available zookeeper, you must have at least three separate physical servers for ZK. Running multiple zookeepers on one physical machine gains you nothing ... because if the whole machine fails, you lose all of those zookeepers. If you have three phys

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-19 Thread David Smiley
nd that ends up being LazyField if you have that feature enabled, or possible wasted space if you don't have that enabled. So I don't think the ability to exclude fields in "fl" would obsolete enableLazyFieldLoading which I think you are implying? ~ David Smiley Apach

Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!

2021-02-18 Thread David Smiley
Congratulations Jan! ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Thu, Feb 18, 2021 at 1:56 PM Anshum Gupta wrote: > Hi everyone, > > I’d like to inform everyone that the newly formed Apache Solr PMC nominated > and elected Jan Høy

Dynamic starting or stoping of zookeepers in a cluster

2021-02-18 Thread DAVID MARTIN NIETO
Hi all, We have a Solr cluster with 4 Solr servers and 5 ZooKeepers in HA mode. We've tested whether our cluster can maintain service with only half of the cluster, in case of a disaster or similar, and we have a problem with the ZooKeeper config and its static configuration. In the start s

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-18 Thread David Smiley
ple imagine if you stored the entire input data as JSON in a _json_ field or some-such. Nowadays, I'd set large="true" on such a field, which is a much newer option. I was able to tweak my test to have only alphabetic IDs, and the test still failed. I don't see how the ID'

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-17 Thread David Smiley
ing a query that only returns the "id" field. No highlighting. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Wed, Feb 17, 2021 at 10:28 AM David Smiley wrote: > Thanks for more details. I was able to reproduce this locally! I hacke

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-17 Thread David Smiley
th it here. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Wed, Feb 17, 2021 at 6:36 AM Nussbaum, Ronen wrote: > Hello David, > > Thank you for your reply. > It was very hard but finally I discovered how to reproduce it. I thought >

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-14 Thread David Smiley
ata; maybe that can illustrate the problem? It's not clear if nested schema or nested docs are actually required in your example. If you share the JIRA issue with me, I'll chase this one down. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley

Re: SOLR upgrade

2021-02-12 Thread David Hastings
i generally will only upgrade every other release. since i started with 1.4, went to 3->5->7.X, and never EVER a .0 or an even .X release, On Fri, Feb 12, 2021 at 12:01 PM Ishan Chattopadhyaya < ichattopadhy...@gmail.com> wrote: > Just avoid 8.8.0 for the moment, until 8.8.1 is released. 8.7.x s

Re: Incorrect distance returned for indexed polygone shape

2021-01-31 Thread David Smiley
enough for what you want to do. Basically, calculate the geodist but subtract the radius field... maybe something like this (untested!): sort=sub(geodist(),radius) desc. Use LatLonPointSpatialField to store point data if you can (is appropriate), which succeeded RPT for that. ~ David Smiley
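The sort suggested above can be sketched as a set of request parameters. This is only an illustration under assumptions: the field names `location` and `radius` and the example coordinates are hypothetical, and the sort expression itself is the one the message marks as untested. `sfield` and `pt` are the standard Solr parameters that a no-argument `geodist()` reads.

```python
from urllib.parse import urlencode, parse_qs


def distance_minus_radius_params(lat, lon, point_field="location", radius_field="radius"):
    """Build query parameters that sort by (distance to the point - per-document radius).

    point_field and radius_field are hypothetical field names; replace them
    with the fields defined in your own schema.
    """
    return {
        "q": "*:*",
        "sfield": point_field,      # spatial field used by geodist()
        "pt": f"{lat},{lon}",       # reference point as "lat,lon"
        "sort": f"sub(geodist(),{radius_field}) desc",
    }


params = distance_minus_radius_params(45.15, -93.85)
query_string = urlencode(params)    # ready to append to a /select URL

if __name__ == "__main__":
    print(query_string)
```

Building the parameters as a dictionary and encoding them in one step avoids hand-escaping the parentheses, comma, and space in the sort expression.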

Re: Performance issue with Solr 8.6.1 Unified Highlighter does not occur on Solr 6.

2021-01-29 Thread David Smiley
r than the original highlighter. Just because hl.requireFieldMatch=false is the default, doesn't mean it's the _right_ choice for everyone's app :-). I tend to think Solr should flip this in 9.0 for both accuracy & performance sake. And unset hl.maxAnalyzedChars -- mostly an obsolet

Re: Performance issue with Solr 8.6.1 Unified Highlighter does not occur on Solr 6.

2021-01-28 Thread David Smiley
x2=4 tests). ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Wed, Jan 27, 2021 at 2:20 AM Kerwin wrote: > Hi, > > While upgrading to Solr 8 from 6 the Unified highlighter begins to have > performance issues going from approximately 100ms

Re: Exact and non exact highlighting

2021-01-22 Thread David Smiley
uching the Solr schema. If you are up for it, comment on that issue to let the original contributor know you want to help move this forward. Maybe they do too. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Fri, Jan 22, 2021 at 12:46 PM d

Re: Exact matching without using new fields

2021-01-19 Thread David R
We had the same requirement. Just to echo back your requirements, I understand your case to be this. Given these 2 doc titles: doc 1: "information retrieval" doc 2: "Advanced information retrieval with Solr" You want a phrase search for "information retrieval" to find both documents, but an EXA

Re: Highlighting large text fields

2021-01-12 Thread David Smiley
likely to not highlight as much as you are highlighting now, and highlighting more is your goal right now it appears. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Tue, Jan 12, 2021 at 2:45 PM Shaun Campbell wrote: > That's great Da

Re: Highlighting large text fields

2021-01-12 Thread David Smiley
On Tue, Jan 12, 2021 at 1:08 PM Shaun Campbell wrote: > Hi David > > Getting closer now. > > First of all, a bit of a mistake on my part. I have two cores set up and I > was changing the solrconfig.xml on the wrong core doh!! That's why > highlighting wasn't bei

Re: Highlighting large text fields

2021-01-12 Thread David Smiley
On Tue, Jan 12, 2021 at 9:39 AM Shaun Campbell wrote: > Hi David > > First of all I wanted to say I'm working off your book!! Third edition, > and I think it's a bit out of date now. I was just going to try following > the section on the Postings highlighter, but I

Re: Highlighting large text fields

2021-01-11 Thread David Smiley
enefit from putting offsets into the search index (and re-index) -- storeOffsetsWithPositions. That's an option on the field/fieldType in your schema; it may not be obvious reading the docs. You have to opt-in to that; Solr doesn't normally store any info in the index for highlighting. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley

RE: Apache Solr in High Availability Primary and Secondary node.

2021-01-11 Thread DAVID MARTIN NIETO
. David Martín Nieto Analista Funcional Calle Cabeza Mesada 5 28031, Madrid T: +34 667 414 432 T: +34 91 779 56 98| Ext. 3198 E-mail: dmart...@viewnext.com | Web: www.viewnext.com

RE: Apache Solr in High Availability Primary and Secondary node.

2021-01-11 Thread DAVID MARTIN NIETO
I believe Solr doesn't have this configuration; you need a load balancer with that configuration mode for that. Kind regards. From: Kaushal Shriyan Sent: Monday, January 11, 2021 11:32 To: solr-user@lucene.apache.org Subject: Apache Solr in High Availability

Re: SPLITSHARD - data loss of child documents

2020-12-19 Thread David Smiley
https://issues.apache.org/jira/browse/SOLR-11191 and I assigned it to myself just now. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Thu, Dec 17, 2020 at 9:50 AM Mike Drob wrote: > I was under the impression that split shard doesn’t work w

Re: data import handler deprecated?

2020-11-30 Thread David Smiley
the audience of news / release notes), the functionality has *moved*. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Mon, Nov 30, 2020 at 8:04 AM Eric Pugh wrote: > You don’t need to abandon DIH right now…. You can just use the Github > h

Re: Faceting: !terms vs mincount precedence

2020-11-17 Thread David Smiley
ul. I know my response isn't a direct answer to your question RE mincount... perhaps it can be made to work? ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Tue, Nov 17, 2020 at 8:21 AM Jason Gerlowski wrote: > Hey all, > > I was usi

Re: Frequent Index Replication Failure in solr.

2020-11-13 Thread David Hastings
looks like your repeater is grabbing a file that the master merged into a different file, why not lower how often you go from master->repeater, and/or dont commit so often so you can make the index faster On Fri, Nov 13, 2020 at 12:13 PM Parshant Kumar wrote: > All,please help on this > > On Tu

Re: [ANNOUNCE] Apache Solr 8.7.0 released

2020-11-09 Thread David Smiley
FYI an updated Docker image was just published a few hours ago: https://hub.docker.com/_/solr ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Wed, Nov 4, 2020 at 9:06 AM Atri Sharma wrote: > 3/11/2020, Apache Solr™ 8.7 available > > The L

RE: How to raise open file limits

2020-11-04 Thread DAVID MARTIN NIETO
And this one too: open files (-n) 1024. At least. David Martín Nieto Analista Funcional Calle Cabeza Mesada 5 28031, Madrid T: +34 667 414 432 T: +34 91 779 56 98| Ext. 3198 E-mail: dmart...@viewnext.com | Web: www.viewnext.com

RE: How to raise open file limits

2020-11-04 Thread DAVID MARTIN NIETO
Hi, You have to change the ulimit parameters (as shown by ulimit -a) in your OS config. I believe the problem that you have is in: max user processes (-u) 4096 Kind regards. David Martín Nieto Analista Funcional Calle Cabeza Mesada 5 28031, Madrid T: +34 667 414 432
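The two limits mentioned in this thread can also be read programmatically. The following is a minimal sketch, assuming a Unix-like system where Python's standard `resource` module exposes `RLIMIT_NOFILE` and `RLIMIT_NPROC`; it only reports the limits of the current process and does not change anything.

```python
import resource

# Labels mirror the ones printed by `ulimit -a`.
LIMITS = [
    ("open files (-n)", resource.RLIMIT_NOFILE),
    ("max user processes (-u)", resource.RLIMIT_NPROC),
]


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def report():
    """Return one line per limit with its soft and hard values."""
    lines = []
    for name, rlimit in LIMITS:
        soft, hard = resource.getrlimit(rlimit)
        lines.append(f"{name}: soft={fmt(soft)} hard={fmt(hard)}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Limits are per user and per process, so the numbers only reflect what Solr sees if the script runs as the same user and in the same kind of session that starts Solr.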

Re: Solr 8.6.3

2020-10-22 Thread David Smiley
cted the warning about this in 8.7, so you won't see that again. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Thu, Oct 15, 2020 at 4:13 PM Kris Gurusamy wrote: > I've just downloaded solr 8.6.3 and trying to create DIH for loading >

Re: converting string to solr.TextField

2020-10-16 Thread David Hastings
ng all the docs into an > existing index, things like changing from stored=true to > stored=false, adding new fields, deleting fields (although the > meta-data for the field is still kept around) etc. > > > On Oct 16, 2020, at 3:57 PM, David Hastings < > hastings.recurs...@

Re: converting string to solr.TextField

2020-10-16 Thread David Hastings
and we > need to be free to make important improvements with time." > > And all that aside, you have to re-index all the docs anyway or > your search results will be inconsistent. So leaving aside the > impossible task of covering all the possibilities on the fly, it’s > b

Re: converting string to solr.TextField

2020-10-16 Thread David Hastings
"If you want to keep the same field name, you need to delete all of the documents in the index, change the schema, and reindex." actually doesnt re-indexing a document just delete/replace anyways assuming the same id? On Fri, Oct 16, 2020 at 3:07 PM Alexandre Rafalovitch wrote: > Just as a side

Re: Solr endpoint on the public internet

2020-10-08 Thread David Hastings
dler. And block Config API to avoid attackers creating new > handlers. > > Regards, > Alex. > >> On Thu, 8 Oct 2020 at 14:54, David Hastings wrote: >> >> Well that’s why I suggested deleting the update handler :) >> >>>> On Oct 8, 2020, at 2:52

Re: Solr endpoint on the public internet

2020-10-08 Thread David Hastings
Well that’s why I suggested deleting the update handler :) > On Oct 8, 2020, at 2:52 PM, Walter Underwood wrote: > > Let me know where it is and I’ll delete all the documents in your collection. > It is easy, just one HTTP request. > > https://gist.github.com/nz/673027/313f70681daa985ea13ba33a

Re: Master/Slave

2020-09-30 Thread David Hastings
>whether we should expect Master/Slave replication also to be deprecated it better not ever be deprecated. it has been the most reliable mechanism for its purpose, solr cloud isnt going to replace standalone, if it does, thats when I guess I stop upgrading or move to elastic On Wed, Sep 30, 202

HEY, are you using the Analytics contrib?

2020-09-03 Thread David Smiley
Solr maintainers continue to maintain it. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley

Re: What is the Best way to block certain types of queries/ query patterns in Solr?

2020-09-03 Thread David Smiley
support arbitrary parameters you pass to Solr as-is that you don't know about in advance (i.e. use an allow-list). ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Mon, Aug 31, 2020 at 10:57 AM Mark Robinson wrote: > Hi, > I had come across

Re: Error on searches containing specific character pattern

2020-09-03 Thread David Smiley
cf1ff/lucene/core/src/java/org/apache/lucene/util/QueryBuilder.java#L653 If you can reproduce this with the "techproducts" schema, please share the complete query. If there's a problem here, I suspect the synonyms you have may be pertinent. ~ David Smiley Apache Lucene/Solr S

Re: SOLR indexing takes longer time

2020-08-18 Thread David Hastings
Another thing to mention is to make sure the indexer you build doesnt send commits until its actually done. Made that mistake with some early in house indexers. On Tue, Aug 18, 2020 at 9:38 AM Charlie Hull wrote: > 1. You could write some code to pull the items out of Mongo and dump > them to d

[CVE-2020-13941] Apache Solr information disclosure vulnerability

2020-08-14 Thread David Smiley
to trusted paths * Prevent remote connection when using Windows UNC Paths ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley

Number of times in document

2020-08-12 Thread David Hastings
Is there any way to do a query for the minimum number of times a phrase or string exists in a document? This has been a request from some users as other search services (names not to be mentioned) have such a functionality. Ive been using solr since 1.4 and i think ive tried finding this ability

Re: Multiple "df" fields

2020-08-11 Thread David Hastings
why not use a copyfield for indexing? On Tue, Aug 11, 2020 at 9:59 AM Edward Turner wrote: > Hi all, > > Is it possible to have multiple "df" fields? (We think the answer is no > because our experiments did not work when adding multiple "df" values to > solrconfig.xml -- but we just wanted to do

Re: org.apache.lucene.util.fst.FST taking up lot of Java Heap Memory

2020-08-07 Thread David Smiley
you are probably not using Solr 8.4.0 or beyond, which moved to having the FSTs off-heap -- at least the ones associated with the field indexes. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Thu, Aug 6, 2020 at 8:19 PM sanjay dutt wrote: >

Re: org.apache.lucene.util.fst.FST taking up lot of Java Heap Memory

2020-08-05 Thread David Smiley
What is the Solr field type definition for this field? And what sort of spatial data do you add here -- just points or what? ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Mon, Aug 3, 2020 at 10:09 PM sanjay dutt wrote: > Hello Solr commun

Re: bin/solr auth enable

2020-07-31 Thread David Glick
/solr auth enable -prompt true Both fail with the NPE. Thanks, David. Sent from my iPhone > On Jul 31, 2020, at 7:03 PM, Jason Gerlowski wrote: > > Hi David, > > I tried this out locally but couldn't reproduce. The command you > provided above works just fine for me.

Re: solr query returns items with spaces removed

2020-07-29 Thread David Hastings
"Oh, and returning 100K docs is an anti-pattern, if you really need that many docs consider cursorMark and/or Streaming." er, i routinely ask for 2+ million records into a single file based on a query. I mean not into a web application or anything, its meant to be processed after the fact, but so

Re: Meow attacks

2020-07-28 Thread David Hastings
so, your zookeeper/solr servers have public facing addresses/ports? On Tue, Jul 28, 2020 at 4:41 PM Odysci wrote: > Folks, > > I suspect one of our Zookeeper installations on AWS was subject to a Meow > attack ( > > https://arstechnica.com/information-technology/2020/07/more-than-1000-database

bin/solr auth enable

2020-07-24 Thread David Glick
When I issue “bin/solr auth enable -prompt true -blockUnknown true”, I get a Null Pointer Exception. I’m using the 8.5.1 release. Am I doing something wrong? Thanks. Sent from my iPhone

Re: sorting help

2020-07-15 Thread David Hastings
ercaseFilter in front of your patternreplace, > you’re removing uppercase characters. > > Best, > Erick > > > On Jul 15, 2020, at 3:06 PM, David Hastings < > hastings.recurs...@gmail.com> wrote: > > > > howdy, > > i have a field that sorts fine all ot

sorting help

2020-07-15 Thread David Hastings
howdy, i have a field that sorts fine all other content, and i cant seem to debug why it wont sort for me on this one chunk of it. "sort":"alphatitle asc", "debugQuery":"on", "_":"1594733127740"}}, "response ":{"numFound":3,"start":0,"docs":[ { "title":"Money orders", { "title":"Finance, consolidat

Re: Out of memory errors with Spatial indexing

2020-07-06 Thread David Smiley
I believe you are experiencing this bug: LUCENE-5056 <https://issues.apache.org/jira/browse/LUCENE-5056> The fix would probably be adjusting code in here org.apache.lucene.spatial.query.SpatialArgs#calcDistanceFromErrPct ~ David Smiley Apache Lucene/Solr Search Developer http://www.linked

Re: unified highlighter performance in solr 8.5.1

2020-07-04 Thread David Smiley
Here's my PR, which includes some edits to the ref guide docs where I tried to clarify these settings a little too. https://github.com/apache/lucene-solr/pull/1651 ~ David On Sat, Jul 4, 2020 at 8:44 AM Nándor Mátravölgyi wrote: > I guess that's fair. Let's have hl.fragsiz

Re: unified highlighter performance in solr 8.5.1

2020-07-03 Thread David Smiley
;true"? We agree on better documenting the perf trade-off. Thanks again for working on these settings, BTW. ~ David On Fri, Jul 3, 2020 at 1:25 PM Nándor Mátravölgyi wrote: > Since the issue seems to be affecting the highlighter differently > based on which mode it is using,

Re: Out of memory errors with Spatial indexing

2020-07-03 Thread David Smiley
class="solr.RptWithGeometrySpatialField" which internally is based off a combination of a coarse grid and storing the original vector geometry for accurate verification: The internally coarser grid will lessen the impact of that pole bug. ~ David Smiley Apache Lucene/Solr Search

Re: unified highlighter performance in solr 8.5.1

2020-07-03 Thread David Smiley
ing with '0') and additional performance benefit from that. What do you think Nandor, Michal? I'm hoping a change in settings (+ some better notes/docs on this) could slip into an 8.6, all done by myself ASAP. ~ David On Fri, Jun 19, 2020 at 2:32 PM Nándor Mátravölgyi wrote: > Hi

Re: How to determine why solr stops running?

2020-06-29 Thread David Hastings
script, you _should_ have had very clear > > evidence that that was the cause. > > > > If you were not running the killer script, the apologies for not asking > > about that > > in the first place. Java’s performance is unpredictable when OOMs happen, > > which is th

Re: How to determine why solr stops running?

2020-06-29 Thread David Hastings
> > > On Tue, Jun 16, 2020 at 1:00 PM David Hastings < > hastings.recurs...@gmail.com> > wrote: > > > me personally, around 290gb. as much as we could shove into them > > > > On Tue, Jun 16, 2020 at 12:44 PM Erick Erickson > > > wrote: > &g

Re: Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread David Cumings
ntinue this conversation here while making sure that we converge > without much bike-shedding. > > -Anshum > -- David Cumings AU: +61 498 137 841 US: +1 (929) 291-0801 UK: +44 7725 057 500 <-- Currently in the UK IN: +91 82771 96058 d...@cumings.com

Re: Master Slave Terminology

2020-06-17 Thread David Smiley
priv...@lucene.apache.org but it should have been public and expect it to spill out to the dev list today. ~ David On Wed, Jun 17, 2020 at 11:14 AM Mike Drob wrote: > Hi Jan, > > Can you link to the discussion? I searched the dev list and didn’t see > anything, is it on slack

Re: Facet Performance

2020-06-17 Thread David Smiley
I strongly recommend setting indexed=true on a field you facet on for the purposes of efficient refinement (fq=field:value). But it strictly isn't required, as you have discovered. ~ David On Wed, Jun 17, 2020 at 9:02 AM Michael Gibney wrote: > facet.method=enum works by executing

Re: Solr 7.6 optimize index size increase

2020-06-16 Thread David Hastings
I cant give you a 100% true answer but ive experienced this, and what "seemed" to happen to me was that the optimize would start, and that will drive the size up by 3 fold, and if you run out of disk space in the process the optimize will quit, since it cant optimize, and leave the live index pieces in

Re: How to determine why solr stops running?

2020-06-16 Thread David Hastings
the sum of the heap allocations across all your JVMs should be below > that percentage. See Uwe Schindler's mmapdirectiry blog... > > Shot in the dark... > > On Tue, Jun 16, 2020, 11:51 David Hastings > wrote: > > > To add to this, i generally have solr start with

Re: Unified highlighter- unable to get results - can get results with original and termvector highlighters

2020-06-16 Thread Warren, David [USA]
David – It’s fine to take this conversation back to the mailing list. Thank you very much again for your suggestions. I think you are correct. It doesn’t appear necessary to set termOffsets, and it appears that the unified highlighter is using the TERM_VECTORS offset source if I don’t

Re: How to determine why solr stops running?

2020-06-16 Thread David Hastings
To add to this, i generally have solr start with this: -Xms31000m -Xmx31000m and the only other thing that runs on them are MariaDB Galera Cluster nodes that are not in use (aside from replication) the 31gb is not an accident either, you dont want 32gb. On Tue, Jun 16, 2020 at 11:26 AM Shawn H

Re: Question about Atomic Update

2020-06-15 Thread david . davila
Hi Erick, Thank you for your answer. Unfortunately, our most important field is that text field, so we need to index it. We will have to accept that big documents take a long time to index. Best, David David Dávila Atienza AEAT - Departamento de Informática Tributaria Subdirección de

Question about Atomic Update

2020-06-15 Thread david . davila
tested with Solr 7.4 and Solr 4.10 Thanks, David

Re: using solr to extarct keywords from a long text?

2020-06-11 Thread David Zimmermann
Hi Mikhail Your suggested solution does seem to work for me. Thank you so much for the help! Best regards David For future reference in case someone else wants to do the same, here are some more details about the steps needed: - The more like this handler is not in the default solrconfig.xml

using solr to extarct keywords from a long text?

2020-06-10 Thread David Zimmermann
up as standalone and not in cloud mode. Best David

Re: Getting rid of zookeeper

2020-06-09 Thread David Hastings
Zookeeper is annoying to both set up and manage, but then again the same thing can be said about solr cloud. not certain why you would want to deal with either On Tue, Jun 9, 2020 at 3:29 PM S G wrote: > Hello, > > I recently stumbled across KIP-500: Replace ZooKeeper with a Self-Managed > Meta

Re: Script to check if solr is running

2020-06-08 Thread David Hastings
> > Why have a cold backup and then switch? > my current set up is: 1. master indexer 2. master slave on a release/commit basis 3. 3 live slave searching nodes in two different data centers the three live nodes are in front of nginx load balancing and they are mostly hot but not all of them, i f

Re: Edismax query using different strings for different fields

2020-06-07 Thread David Zimmermann
then recombining the results. But that way I don’t know if the resulting scores are comparable? Can I assume a score of 15 from the English edismax is better than a score of 13 from the German edismax? Best regards David On 5 Jun 2020, at 19:39, Erick Erickson mailto:erickerick...@gmail.com

Edismax query using different strings for different fields

2020-06-05 Thread David Zimmermann
I could use some advice on how to handle a particular cross language search with Solr. I posted it on Stackoverflow 2 months ago, but could not find a solution. I have documents in 3 languages (English, German, French). For simplicity let's assume it's just two languages (English and German). T

Re: Why Did It Match?

2020-05-29 Thread David Smiley
searching on an aggregate catch-all field. ~ David On Thu, May 28, 2020 at 6:40 PM Walter Underwood wrote: > Are you sure they will wonder? I’d try it without that and see if the > simpler UI is easier to use. Simple almost always wins the A/B test. > > You can use the highlighter t

Re: unified highlighter performance in solr 8.5.1

2020-05-27 Thread David Smiley
arger than the Reuters ones (so it appears, anyway). I had to do a bit of hacking to use the LengthGoalBreakIterator, which wasn't previously used by this framework. ~ David On Tue, May 26, 2020 at 4:42 PM Michal Hlavac wrote: > fine, I'll try to write simple test, thank

Re: unified highlighter performance in solr 8.5.1

2020-05-26 Thread David Smiley
Please create an issue. I haven't reproduced it yet but it seems unlikely to be user-error. ~ David On Mon, May 25, 2020 at 9:28 AM Michal Hlavac wrote: > Hi, > > I have field: > stored="true" indexed="false" storeOffsetsWithPositions="true"/&

Re: unified highlighter performance in solr 8.5.1

2020-05-25 Thread David Smiley
Wow that's terrible! So this problem is for SENTENCE in particular, and it's a regression in 8.5? I'll see if I can reproduce this with the Lucene benchmark module. I figure you have some meaty text, like "page" size or longer? ~ David On Mon, May 25, 2020 at 10:3

Re: highlighting a whole html document using Unified highlighter

2020-05-24 Thread David Smiley
ich is what bin/post does with HTML. Good luck! ~ David On Sun, May 24, 2020 at 10:52 AM Serkan KAZANCI wrote: > Hi David, > > I have many meta-tags in html documents like content="2019-10-15T23:59:59Z"> which matches the field descriptions in > schema file. > > A

Re: highlighting a whole html document using Unified highlighter

2020-05-24 Thread David Smiley
s a problem, and the root cause is here: LUCENE-5734 <https://issues.apache.org/jira/browse/LUCENE-5734> It's on my long TODO list but hasn't bitten me lately so I've neglected it. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley O

Re: hl.preserveMulti in Unified highlighter?

2020-05-23 Thread David Smiley
I get around to implementing hl.preserveMulti for the UH, i'll have it make this assumption likewise. ~ David On Sat, May 23, 2020 at 1:48 PM Walter Underwood wrote: > I’m a little amused that this thread has become active after almost two > months of silence. > > I thin

Re: Creating custom PassageFormatter

2020-05-22 Thread David Smiley
You've probably gotten your answer now but "no". Basically, you'd need to specify your own subclass of UnifiedSolrHighlighter in solrconfig.xml like this: Error loading class 'solr.highlight.CustomPassageFormatter'". > > Example from solrconfig.xml: > class="solr.highlight.CustomPassageFormat

Re: hl.preserveMulti in Unified highlighter?

2020-05-22 Thread David Smiley
ntirety, which is a null concatenated sequence of all the values for this field for a document. ~ David On Fri, Mar 29, 2019 at 2:02 PM Walter Underwood wrote: > We are testing 6.6.1. > > wunder > Walter Underwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my

Re: Alternate Fields for Unified Highlighter

2020-05-22 Thread David Smiley
t'd be nice if Solr had a DocTransformer to accomplish that. I know it's been awhile; I'm curious how the UH has been working for you, assuming you are using it. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Sun, Jun 2, 2019 at

Re: unified highlighter methods works unexpected

2020-05-22 Thread David Smiley
diagnose the underlying problem and possibly fix. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Thu, Apr 2, 2020 at 9:02 AM Szűcs Roland wrote: > Hi All, > > I use Solr 8.4.1 and implement suggester functionality. As part of the > suggestio

Re: Unified highlighter with storeOffsetsWithPositions and termVectors giving an exception

2020-05-22 Thread David Smiley
ow to reproduce this from what Solr ships with, like the techproducts example schema and dataset. ~ David On Sun, Jul 21, 2019 at 10:07 PM Richard Walker wrote: > On 22 Jul 2019, at 11:32 am, Richard Walker > wrote: > > I'm trying out the advice in the user guide > > (

Re: Highlighting Solr 8

2020-05-22 Thread David Smiley
What did you end up doing, Eric? Did you migrate to the Unified Highlighter? ~ David On Wed, Oct 16, 2019 at 4:36 PM Eric Allen wrote: > Thanks for the reply. > > Currently we are migrating from solr4 to solr8 under solr 4 we wrote our > own highlighter because the provided one

Re: Unified highlighter- unable to get results - can get results with original and termvector highlighters

2020-05-22 Thread David Smiley
debug=query to Solr then you'll get a parsed version of the query that would be helpful to me. ~ David On Mon, May 11, 2020 at 10:46 AM Warren, David [USA] wrote: > I am running Solr 8.4 and am attempting to use its highlighting feature. > It appears to work well when I use the origina

Re: What is the logical order of applying sorts in SOLR?

2020-05-16 Thread David Hastings
the bq parameter, heres a SO thread for it: https://stackoverflow.com/questions/45150856/how-to-know-when-to-use-solr-bq-vs-bf-and-how-to-apply-query-logic On Sat, May 16, 2020 at 6:27 PM Stephen Lewis Bianamara < stephen.bianam...@gmail.com> wrote: > Hi Paras, > > I'm not sure I follow. How wou

Re: Limiting random results set with facets.

2020-05-12 Thread David Lukowski
.mincount=1 &hl=false &fl=id, text, users &wt=json &sort=date desc Working well so far, but still not ideal. Thanks for the assist, David On Tue, May 12, 2020 at 7:31 PM Srijan wrote: > I see what you mean now. You could use two queries - first would return 100 > randomly

Re: Limiting random results set with facets.

2020-05-12 Thread David Lukowski
015, "3",3021, "2",736, "1",41, "34",41, "35",32, "72",8, "7",1]}, RESULTS I WANT: "response":{"numFound":100,"start":0,"docs":[] },

Limiting random results set with facets.

2020-05-11 Thread David Lukowski
I'm looking for a way if possible to run a query with random results, where I limit the number of results I want back, yet still have the facets accurately reflect the results I'm searching. When I run a search I use a filter query to randomize the results based on a modulo of a random seed. This

Unified highlighter- unable to get results - can get results with original and termvector highlighters

2020-05-11 Thread Warren, David [USA]
at the Solr highlighting documentation, I didn’t see any additional configuration which needs to be done to get the unified highlighter to work. I realize I have not provided a bunch of information here, but obviously can provide more if needed. Thank you, David Warren Booz | Allen | Hamilton 703-625-0311 mobile

Re: Stopwords impact on search

2020-04-24 Thread David Hastings
you should never use the stopword filter unless you have a very specific purpose On Fri, Apr 24, 2020 at 8:33 AM Steven White wrote: > Hi everyone, > > What is the impact, if any, of stopwords on my search ranking quality? > Will my ranking improve if I do not index stopwords? > > I'm trying

Re: Solr index size has increased in solr 7.7.2

2020-04-15 Thread David Hastings
i wouldnt worry about the index size until you get above a half terabyte or so. adding doc values and other features means you sacrifice things that dont matter, like size. memory and ssd's are cheap. On Wed, Apr 15, 2020 at 1:21 PM Rajdeep Sahoo wrote: > Hi all > We are migrating from solr 4.

Re: How do *you* restrict access to Solr?

2020-03-16 Thread David Hastings
master slave is the idea that you have an indexing server you do all indexing to and a search server that replicates the index, to deliver the results etc. if you keep the indexer separate you can tune it differently as well as protect the data. also means you can remove the delete/update request

Re: How do *you* restrict access to Solr?

2020-03-16 Thread David Hastings
Honestly? I know this isnt what youre going to want to hear, but security through obscurity. no one else knows what port the servers on, and its not accessible from anything outside of the internal network. if your solr install can be accessed from an external IP you have much larger issues. On

Does LTRQueryParser accept local variables?

2020-03-09 Thread David White
Hi all, Consider the following edismax query parser usage: {!edismax qf="field1 field2" v=$query} A local variable, in this case, query, is used in the parser, through use of the dollar sign operator. Does the LTR query parser have the same capability? Thanks, David White Th

Re: [SUSPICIOUS] Re: Best Practises around relevance tuning per query

2020-02-18 Thread David Hastings
I don’t think anyone is responding because it’s too focused of a use case, where you just simply have to figure out an alternative on your own. > On Feb 19, 2020, at 12:28 AM, Ashwin Ramesh wrote: > > ping on this :) > >> On Tue, Feb 18, 2020 at 11:50 AM Ashwin Ramesh wrote: >> >> Hi, >>

Re: Re-creating deleted Managed Stopwords lists results in error

2020-02-17 Thread David Hastings
ful stuff. > > Luckily for you, the patent on that has expired. :-) > > wunder > Walter Underwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > > On Feb 17, 2020, at 10:46 AM, David Hastings < > hastings.recurs...@gmail.com> wrote: >

Re: Re-creating deleted Managed Stopwords lists results in error

2020-02-17 Thread David Hastings
i use stop words for building shingles into "interesting phrases" for my machine teacher/students, so i wouldnt say theres no reason, however my use case is very specific. Otherwise yeah, theyre gone for all practical reasons/search scenarios. On Mon, Feb 17, 2020 at 1:41 PM Walter Underwood wro

Re: Syntax error while parsing Spatial Query as string

2020-02-14 Thread David Smiley
n 9.0 as it's obsolete. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Fri, Feb 14, 2020 at 6:47 AM vas aj wrote: > Hi team, > > I am using Lucene 6.6.2, Spatial4j 0.7, lucene-spatial-extras 6.6.2. I am > trying to create a Spat
