Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-19 Thread David Smiley
ighlighting. > > > Regarding the existing bug, I think there might be an additional issue > > > here because it happens only when id field contains an underscore > (didn't > > > check for other special characters). > > > Currently I have no other choice but to

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-19 Thread Gus Heck
issue > > here because it happens only when id field contains an underscore (didn't > > check for other special characters). > > Currently I have no other choice but to use enableLazyFieldLoading=false. > > I hope it wouldn't have a significant performance impact.

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-18 Thread David Smiley
formance impact. > > -Original Message- > From: David Smiley > Sent: Thursday, 18 February 2021 01:03 > To: solr-user > Subject: Re: Atomic Update (nested), Unified Highlighter and Lazy Field > Loading => Invalid Index > > I think the issue is this existing bug, but

RE: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-18 Thread Nussbaum, Ronen
but to use enableLazyFieldLoading=false. I hope it wouldn't have a significant performance impact. -Original Message- From: David Smiley Sent: Thursday, 18 February 2021 01:03 To: solr-user Subject: Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-17 Thread David Smiley
termVectors="true" termOffsets="true" termPositions="true" >> required="false" multiValued="true" /> >> Then I inserted one document with a nested child e.g. >> {id:"abc_1", utterances:{id:"abc_1-1", text_e

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-17 Thread David Smiley
sets="true" termPositions="true" required="false" > multiValued="true" /> > Then I inserted one document with a nested child e.g. > {id:"abc_1", utterances:{id:"abc_1-1", text_en:"Solr is great"}} > > To reproduce:

RE: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-17 Thread Nussbaum, Ronen
I inserted one document with a nested child e.g. {id:"abc_1", utterances:{id:"abc_1-1", text_en:"Solr is great"}} To reproduce: Do a search with surround and unified highlighter: hl.fl=text_en&hl.method=unified&hl=on&q=%7B!surround%7Dtext_en%3A4W("s
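
The request parameters in the snippet above are URL-encoded, which makes the surround query hard to read. As a minimal sketch, the visible part can be decoded with Python's standard library. Only the text that survives in the archive is used here; the remainder of the q value is cut off and is not reconstructed. The variable names are illustrative.

```python
from urllib.parse import parse_qs, unquote

# Highlighting parameters exactly as they appear in the message.
hl_params = parse_qs("hl.fl=text_en&hl.method=unified&hl=on")

# Only the visible prefix of the q parameter; the rest is truncated in the archive.
encoded_prefix = "%7B!surround%7Dtext_en%3A4W"
decoded_prefix = unquote(encoded_prefix)

print(hl_params)
print(decoded_prefix)
```

Decoding shows that the query uses the surround parser against the text_en field with a 4W proximity operator, which is the combination the thread reports as triggering the problem.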

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-14 Thread David Smiley
Hello Ronen, Can you please file a JIRA issue? Some quick searches did not turn anything up. It would be super helpful to me if you could list a series of steps with Solr out-of-the-box in 8.8 including what data to index and query. Solr already includes the "tech products" sample data; maybe t

Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-14 Thread Ronen Nussbaum
Hi All, I discovered a strange behaviour with this combination. Not only the atomic update fails, the child documents are not properly indexed, and you can't use highlights on their text fields. Currently there is no workaround other than reindex. Checked on 8.3.0, 8.6.1 and 8.8.0. 1. Configure n

Re: Performance issue with Solr 8.6.1 Unified Highlighter does not occur on Solr 6.

2021-02-01 Thread Kerwin
Hi David, Thanks for filing this issue. The classic non-weightMatcher mode works well for us right now. Yes, we are using the POSTINGS mode for most of the fields although explicitly mentioning it gives an error since not all fields are indexed with offsets. So I guess the highlighter is picking

Re: Performance issue with Solr 8.6.1 Unified Highlighter does not occur on Solr 6.

2021-01-29 Thread David Smiley
https://issues.apache.org/jira/browse/SOLR-10321 -- near the end my opinion is we should just omit the field if there is no highlight, which would address your need to do this work-around. Glob or no glob. PR welcome! It's satisfying seeing that the Unified Highlighter is so much faster

Re: Performance issue with Solr 8.6.1 Unified Highlighter does not occur on Solr 6.

2021-01-28 Thread Kerwin
On another note, since response time is in question, I have been using a customhighlighter to just override the method encodeSnippets() in the UnifiedSolrHighlighter class since solr 6 since Solr sends back blank array (ZERO_LEN_STR_ARRAY) in the response payload for fields that do not match. Here

Re: Performance issue with Solr 8.6.1 Unified Highlighter does not occur on Solr 6.

2021-01-28 Thread Kerwin
Hi David, Thanks so much for your reply. hl.weightMatches was indeed the culprit. After setting it to false, I am now getting the same sub-second response as Solr 6. I am using Solr 8.6.1 (8.6.1) Here are the tests I carried out: hl.requireFieldMatch=true&hl.weightMatches=true (2458 ms) hl.requi
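
The test matrix described above toggles two unified highlighter flags. A minimal sketch of how such parameter combinations could be generated for timing runs follows; the function name is hypothetical, only the parameter names come from the thread, and no request is actually sent.

```python
from urllib.parse import urlencode


def highlight_params(require_field_match: bool, weight_matches: bool) -> str:
    """Build the highlighting part of a Solr query string (illustrative helper)."""
    params = {
        "hl": "on",
        "hl.method": "unified",
        # Solr expects lowercase true/false for boolean parameters.
        "hl.requireFieldMatch": str(require_field_match).lower(),
        "hl.weightMatches": str(weight_matches).lower(),
    }
    return urlencode(params)


# Enumerate all four combinations, as in the tests listed in the message.
for rfm in (True, False):
    for wm in (True, False):
        print(highlight_params(rfm, wm))
```

Each printed string can be appended to an existing select request so that response times for the four combinations can be compared side by side.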

Re: Performance issue with Solr 8.6.1 Unified Highlighter does not occur on Solr 6.

2021-01-28 Thread David Smiley
tches=true is now the default. Try setting it to false. Does that help performance much? It's documented on the highlighting page of the ref guide: https://lucene.apache.org/solr/guide/8_7/highlighting.html#the-unified-highlighter You might want to try toggling hl.requireFieldMatch=true (default

Performance issue with Solr 8.6.1 Unified Highlighter does not occur on Solr 6.

2021-01-26 Thread Kerwin
Hi, While upgrading to Solr 8 from 6 the Unified highlighter begins to have performance issues going from approximately 100ms to more than 4 seconds with 76 fields in the hl.q and hl.fl parameters. So I played with different options and found that the hl.q parameter needs to have any one field

Re: unified highlighter performance in solr 8.5.1

2020-07-04 Thread David Smiley
Here's my PR, which includes some edits to the ref guide docs where I tried to clarify these settings a little too. https://github.com/apache/lucene-solr/pull/1651 ~ David On Sat, Jul 4, 2020 at 8:44 AM Nándor Mátravölgyi wrote: > I guess that's fair. Let's have hl.fragsizeIsMinimum=true as def

Re: unified highlighter performance in solr 8.5.1

2020-07-04 Thread Nándor Mátravölgyi
I guess that's fair. Let's have hl.fragsizeIsMinimum=true as default. On 7/4/20, David Smiley wrote: > I doubt that WORD mode is impacted much by hl.fragsizeIsMinimum in terms of > quality of the highlight since there are vastly more breaks to pick from. > I think that setting is more useful in S

Re: unified highlighter performance in solr 8.5.1

2020-07-03 Thread David Smiley
I doubt that WORD mode is impacted much by hl.fragsizeIsMinimum in terms of quality of the highlight since there are vastly more breaks to pick from. I think that setting is more useful in SENTENCE mode if you can stand the perf hit. If you agree, then why not just let this one default to "true"?

Re: unified highlighter performance in solr 8.5.1

2020-07-03 Thread Nándor Mátravölgyi
Since the issue seems to be affecting the highlighter differently based on which mode it is using, having different defaults for the modes could be explored. WORD may have the new defaults as it has little effect on performance and it creates nicer highlights. SENTENCE should have the defaults tha

Re: unified highlighter performance in solr 8.5.1

2020-07-03 Thread David Smiley
I think we should flip the default of hl.fragsizeIsMinimum to be 'true', thus have the behavior close to what preceded 8.5. (a) it was very recently (<= 8.4) the previous behavior and so may require less tuning for users in 8.6 henceforth (b) it's significantly faster for long text -- seems to be 2

Re: unified highlighter performance in solr 8.5.1

2020-06-19 Thread Nándor Mátravölgyi
Hi! With the provided test I've profiled the preceding() and following() calls on the base Java iterators in the different options. === default highlighter arguments === Calling the test query with SENTENCE base iterator: - from LengthGoalBreakIterator.following(): 1130 calls of baseIter.precedin

Re: Unified highlighter- unable to get results - can get results with original and termvector highlighters

2020-06-16 Thread Warren, David [USA]
David – It’s fine to take this conversation back to the mailing list. Thank you very much again for your suggestions. I think you are correct. It doesn’t appear necessary to set termOffsets, and it appears that the unified highlighter is using the TERM_VECTORS offset source if I don’t

Re: unified highlighter performance in solr 8.5.1

2020-06-08 Thread Michal Hlavac
Hi David, sorry for my late answer. I created simple test scenarios on github https://github.com/hlavki/solr-unified-highlighter-test There are 2 documents, both bigger sized. Test method: https://github.com/hlavki/solr-unified-highlighter-test/blob/master/src/test/java/com/example

Re: unified highlighter performance in solr 8.5.1

2020-05-28 Thread Nándor Mátravölgyi
Hi! I've not been able to delve into this issue deeply, but it could be useful to know that "fragsizeIsMinimum" and "fragAlignRatio" are new parameters which have behavior changing default values. Leaving those with their default values makes the comparison between 8.4 and 8.5 like apples to oran

Re: unified highlighter performance in solr 8.5.1

2020-05-27 Thread David Smiley
try setting hl.fragsizeIsMinimum=true I did some benchmarking and found that this helps quite a bit BTW I used the highlights.alg benchmark file, with some changes to make it more reflective of your scenario -- offsets in postings, and used "enwiki" (english wikipedia) docs which are larger than
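
The suggestion above amounts to adding one parameter to the slow SENTENCE-mode request. A minimal sketch of the resulting parameter set is shown below. The field name content_txt_sk_highlight appears elsewhere in this thread; using it as hl.fl is an assumption, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def sentence_highlight_params(field: str, fragsize_is_minimum: bool) -> str:
    """Build unified-highlighter parameters for SENTENCE mode (illustrative helper)."""
    return urlencode({
        "hl": "true",
        "hl.method": "unified",
        "hl.fl": field,  # assumed to be the highlighted field
        "hl.bs.type": "SENTENCE",
        "hl.fragsizeIsMinimum": str(fragsize_is_minimum).lower(),
    })


print(sentence_highlight_params("content_txt_sk_highlight", True))
```

Running the same query once with the flag set to true and once with false gives a direct way to check whether the parameter accounts for the slowdown reported in 8.5.1.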

Re: unified highlighter performance in solr 8.5.1

2020-05-26 Thread Michal Hlavac
Fine, I'll try to write a simple test, thanks. On Tuesday, 26 May 2020 17:44:52 CEST David Smiley wrote: > Please create an issue. I haven't reproduced it yet but it seems unlikely > to be user-error. > > ~ David > > > On Mon, May 25, 2020 at 9:28 AM Michal Hlavac wrote: > > > Hi, > > > > I have

Re: unified highlighter performance in solr 8.5.1

2020-05-26 Thread David Smiley
Please create an issue. I haven't reproduced it yet but it seems unlikely to be user-error. ~ David On Mon, May 25, 2020 at 9:28 AM Michal Hlavac wrote: > Hi, > > I have field: > stored="true" indexed="false" storeOffsetsWithPositions="true"/> > > and configuration: > true > unified > true >

Re: unified highlighter performance in solr 8.5.1

2020-05-25 Thread Michal Hlavac
Yes, have no problems in 8.4.1, only 8.5.1 Also yes, those are multi page pdf files. m. On Monday, 25 May 2020 19:11:31 CEST David Smiley wrote: > Wow that's terrible! > So this problem is for SENTENCE in particular, and it's a regression in > 8.5? I'll see if I can reproduce this with the Lu

Re: unified highlighter performance in solr 8.5.1

2020-05-25 Thread David Smiley
Wow that's terrible! So this problem is for SENTENCE in particular, and it's a regression in 8.5? I'll see if I can reproduce this with the Lucene benchmark module. I figure you have some meaty text, like "page" size or longer? ~ David On Mon, May 25, 2020 at 10:38 AM Michal Hlavac wrote: >

Re: unified highlighter performance in solr 8.5.1

2020-05-25 Thread Michal Hlavac
I did the same test on solr 8.4.1 and response times are the same for both hl.bs.type=SENTENCE and hl.bs.type=WORD m. On Monday, 25 May 2020 15:28:24 CEST Michal Hlavac wrote: Hi, I have field: and configuration: true unified true content_txt_sk_highlight 2 true Doing query with hl.bs.type=S

unified highlighter performance in solr 8.5.1

2020-05-25 Thread Michal Hlavac
Hi, I have field: and configuration: true unified true content_txt_sk_highlight 2 true Doing query with hl.bs.type=SENTENCE it takes around 1000 - 1300 ms which is really slow. Same query with hl.bs.type=WORD takes from 8 - 45 ms is this normal behaviour or should I create issue? thanks, m.

Re: highlighting a whole html document using Unified highlighter

2020-05-24 Thread Jörn Franke
iginal Message- > From: Jörn Franke [mailto:jornfra...@gmail.com] > Sent: Sunday, May 24, 2020 1:22 PM > To: solr-user@lucene.apache.org > Subject: Re: highlighting a whole html document using Unified highlighter > > hl.fragsize=0 > > https://lucene.apache.org/solr/guide/8_

Re: highlighting a whole html document using Unified highlighter

2020-05-24 Thread Serkan KAZANCI
g the field data coming from meta-tags and not strip the html >> tags) >> >> Then I could use solr.HTMLStripCharFilterFactory for analysis. >> >> Thank You, >> >> Serkan, >> >> >> >> >> -Original Message- >> From: Davi

Re: highlighting a whole html document using Unified highlighter

2020-05-24 Thread David Smiley
d Smiley [mailto:dsmi...@apache.org] > Sent: Sunday, May 24, 2020 5:26 PM > To: solr-user > Subject: Re: highlighting a whole html document using Unified highlighter > > Instead of stripping the HTML for the stored value, leave it be and remove > it during the analysis stage with solr.HT

RE: highlighting a whole html document using Unified highlighter

2020-05-24 Thread Serkan KAZANCI
, -Original Message- From: David Smiley [mailto:dsmi...@apache.org] Sent: Sunday, May 24, 2020 5:26 PM To: solr-user Subject: Re: highlighting a whole html document using Unified highlighter Instead of stripping the HTML for the stored value, leave it be and remove it during the analysis stage with

Re: highlighting a whole html document using Unified highlighter

2020-05-24 Thread David Smiley
e=0 > parameter, it is displayed as original html document? > > Or > > Is it possible to give a whole html document as a parameter to the Unified > highlighter so that output is also a highlighted html document? > > Or > > Do you have a better idea to highlight

RE: highlighting a whole html document using Unified highlighter

2020-05-24 Thread Serkan KAZANCI
document as a parameter to the Unified highlighter so that output is also a highlighted html document? Or Do you have a better idea to highlight the keywords of the whole html document? Thanks, Serkan -Original Message- From: Jörn Franke [mailto:jornfra...@gmail.com] Sent: Sunday

Re: highlighting a whole html document using Unified highlighter

2020-05-24 Thread Jörn Franke
eywords that are used to > find and access the document. > > > > Unified highlighter is fast, accurate and supports different languages but > only highlights passages with given parameters. > > > > How can I highlight a whole html document using Unified highlig

highlighting a whole html document using Unified highlighter

2020-05-24 Thread Serkan KAZANCI
Hi, I use solr to search over a million html documents, when a document is searched and displayed, I want to highlight the keywords that are used to find and access the document. Unified highlighter is fast, accurate and supports different languages but only highlights passages with given

Re: hl.preserveMulti in Unified highlighter?

2020-05-23 Thread David Smiley
e=WHOLE as well, then a simpler PassageFormatter > >> could basically ignore the passage starts & ends and merely mark up the > >> original content in entirety, which is a null concatenated sequence of > all > >> the values for this field for a document. >

Re: hl.preserveMulti in Unified highlighter?

2020-05-23 Thread Walter Underwood
which is a null concatenated sequence of all >> the values for this field for a document. >> >> ~ David >> >> >> On Fri, Mar 29, 2019 at 2:02 PM Walter Underwood >> wrote: >> >>> We are testing 6.6.1. >>> >>> wunder >

Re: hl.preserveMulti in Unified highlighter?

2020-05-23 Thread Anthony Groves
> wunder > > Walter Underwood > > wun...@wunderwood.org > > http://observer.wunderwood.org/ (my blog) > > > > > On Mar 29, 2019, at 11:02 AM, Walter Underwood > > wrote: > > > > > > In testing, hl.preserveMulti=true works with the uni

Re: Alternate Fields for Unified Highlighter

2020-05-22 Thread Furkan KAMACI
Hi David, Thanks for the response! I use Unified Highlighter combined with maxAnalyzedChars to accomplish my needs. I'll file an issue and PR for it! Kind Regards, Furkan KAMACI On Fri, May 22, 2020 at 11:25 PM David Smiley wrote: > Feel free to file an issue; I know it's not

Re: hl.preserveMulti in Unified highlighter?

2020-05-22 Thread David Smiley
blog) > > > On Mar 29, 2019, at 11:02 AM, Walter Underwood > wrote: > > > > In testing, hl.preserveMulti=true works with the unified highlighter. > But the documentation says that the parameter is only implemented in the > original highlighter. > > >

Re: Alternate Fields for Unified Highlighter

2020-05-22 Thread David Smiley
6:47 AM Furkan KAMACI wrote: > Hi All, > > I want to switch to Unified Highlighter due to performance reasons for my > Solr 7.6 I was using these fields > > solrQuery.addHighlightField("content_*") > .set("f.content_en.hl.alternateField", "content"

Re: unified highlighter methods works unexpected

2020-05-22 Thread David Smiley
Hi Roland, I was not able to reproduce this. I modified the tech_products sample config to change the name field to use a new field type that had a trivial edgengram config. Then I composed this query based a little on some of your parameters, and it did find highlights: http://localhost:8983/solr

Re: Unified highlighter with storeOffsetsWithPositions and termVectors giving an exception

2020-05-22 Thread David Smiley
> https://lucene.apache.org/solr/guide/8_1/highlighting.html#schema-options-and-performance-considerations > ) > > for using the unified highlighter. > > > > ... > > * "set storeOffsetsWithPositions to true" > > * "set termVectors to true but no oth

Re: Unified highlighter- unable to get results - can get results with original and termvector highlighters

2020-05-22 Thread David Smiley
l highlighter or the term > vector highlighter, but when I try to use the unified highlighter, I get no > results returned. My Google searches so far have not revealed anybody > having this same problem (perhaps user error on my part), hence why I’m > asking a question to the Solr mail

Unified highlighter- unable to get results - can get results with original and termvector highlighters

2020-05-11 Thread Warren, David [USA]
I am running Solr 8.4 and am attempting to use its highlighting feature. It appears to work well when I use the original highlighter or the term vector highlighter, but when I try to use the unified highlighter, I get no results returned. My Google searches so far have not revealed anybody

unified highlighter methods works unexpected

2020-04-02 Thread Szűcs Roland
Hi All, I use Solr 8.4.1 and implement suggester functionality. As part of the suggestions I would like to show product info so I had to implement this functionality with normal query parsers instead of suggester component. I applied an edge n-gram filter without stemming to speed up the analysis of

Unified highlighter on result of query with two required terms which matched separate fields

2019-07-25 Thread Richard Walker
Hi, I'm trying to understand what's going on with the combination of: * Solr 8.1.1 * edismax parser * qf with multiple fields specified (each of which has type text_en_splitting, some of which are multiValued) * unified highlight method * query with two terms * results where the two terms match

Re: Unified highlighter with storeOffsetsWithPositions and termVectors giving an exception

2019-07-21 Thread Richard Walker
On 22 Jul 2019, at 11:32 am, Richard Walker wrote: > I'm trying out the advice in the user guide > ( > https://lucene.apache.org/solr/guide/8_1/highlighting.html#schema-options-and-performance-considerations > ) > for using the unified highlighter. > > ... > * "

Unified highlighter with storeOffsetsWithPositions and termVectors giving an exception

2019-07-21 Thread Richard Walker
I'm trying out the advice in the user guide ( https://lucene.apache.org/solr/guide/8_1/highlighting.html#schema-options-and-performance-considerations ) for using the unified highlighter. I saw the note: "This is definitely the fastest option for highlighting wildcard queries on

Alternate Fields for Unified Highlighter

2019-06-02 Thread Furkan KAMACI
Hi All, I want to switch to Unified Highlighter due to performance reasons for my Solr 7.6 I was using these fields solrQuery.addHighlightField("content_*") .set("f.content_en.hl.alternateField", "content") .set("f.content_es.hl.alternateField", "

Re: hl.preserveMulti in Unified highlighter?

2019-03-29 Thread Walter Underwood
We are testing 6.6.1. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Mar 29, 2019, at 11:02 AM, Walter Underwood wrote: > > In testing, hl.preserveMulti=true works with the unified highlighter. But the > documentation says that the

Re: hl.preserveMulti in Unified highlighter?

2019-03-29 Thread Walter Underwood
In testing, hl.preserveMulti=true works with the unified highlighter. But the documentation says that the parameter is only implemented in the original highlighter. Is the documentation wrong? Can we trust this to keep working with unified? wunder Walter Underwood wun...@wunderwood.org http

hl.preserveMulti in Unified highlighter?

2019-03-26 Thread Walter Underwood
It looks like hl.preserveMulti is only implemented in the Original highlighter. Has anyone looked at doing this for the Unified highlighter? We need to preserve order in the highlights for a multi-valued field. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my

Unified highlighter

2018-07-12 Thread Julien Massiera
Hi Solr community, I would like some help with a strange behavior that I observe on the unified highlighter. Here is the configuration of my highlighter : on unified false <span class="em"> </span> content_fr content_en exactContent true CHARACTER html 200 51200

Re: Unified highlighter returns an error when hl.fl param has undefined fields

2017-09-06 Thread Yasufumi Mizoguchi
Hi Shawn, Thank you for your reply. > that sounds like a bug in the argument parser that needs to be fixed. I have created a JIRA about this. https://issues.apache.org/jira/browse/SOLR-11334 Thanks, Yasufumi On 2017/09/06 9:48 PM, Shawn Heisey wrote: On 9/4/2017 9:49 PM, Yasufumi Mizoguchi

Re: Unified highlighter returns an error when hl.fl param has undefined fields

2017-09-06 Thread Shawn Heisey
On 9/4/2017 9:49 PM, Yasufumi Mizoguchi wrote: > I understood what you are saying. However, at least, I think it > strange that UnifiedSolrHighlighter > returns the same error when choosing ", " as the field delimiter in > hl.fl (e.g. hl.fl=name,%20manu). > This is because UnifiedSolrHighlighter de

Re: Unified highlighter returns an error when hl.fl param has undefined fields

2017-09-04 Thread Yasufumi Mizoguchi
Hi, Shawn, (Sorry, I sent this to your private email address...) Thanks for your reply. I understood what you are saying. However, at least, I think it strange that UnifiedSolrHighlighter returns the same error when choosing ", " as the field delimiter in hl.fl (e.g. hl.fl=name,%20manu). Thi

Re: Unified highlighter returns an error when hl.fl param has undefined fields

2017-09-04 Thread Shawn Heisey
On 9/3/2017 10:31 PM, Yasufumi Mizoguchi wrote: > I am testing UnifiedHighlighter(hl.method=unified) with Solr 6.6 and > found that the highlighter returns following error when hl.fl > parameter has undefined fields. > The error occurs even if hl.fl parameter has ", "( + ) > as a field delimiter. (

Unified highlighter returns an error when hl.fl param has undefined fields

2017-09-03 Thread Yasufumi Mizoguchi
Hi, I am testing UnifiedHighlighter(hl.method=unified) with Solr 6.6 and found that the highlighter returns following error when hl.fl parameter has undefined fields. The error occurs even if hl.fl parameter has ", "( + ) as a field delimiter. (e.g. hl.fl=name, manu) Is this a bug? I think tha
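
Until the delimiter handling is settled on the server side, a client can avoid sending a comma followed by a space in hl.fl. The following is a minimal client-side sketch only; it does not describe how Solr parses the parameter, and the function name is hypothetical.

```python
def normalize_hl_fl(raw: str) -> str:
    """Strip whitespace around each field name and drop empty entries."""
    fields = [name.strip() for name in raw.split(",")]
    return ",".join(name for name in fields if name)


# The example value from the message, with a space after the comma.
print(normalize_hl_fl("name, manu"))
```

This keeps the field list in a plain comma-separated form, so the request does not depend on whether the highlighter tolerates spaces in the delimiter.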

Re: The unified highlighter html escaping. Seems rather extreme...

2017-07-20 Thread David Smiley
The escaping does appear excessive. Please file a bug to the Lucene project in Apache JIRA. On Fri, May 26, 2017 at 11:26 AM Michael Joyner wrote: > Isn't the unified html escaper rather extreme in its escaping? > > It makes it hard to deal with for simple post-processing. > > The origin

Re: The unified highlighter html escaping. Seems rather extreme...

2017-05-28 Thread Zheng Lin Edwin Yeo
Hi, I'm not so sure about the escaping, but to control how much text is returned as context around the highlighted frag, you can set the following in solrconfig.xml. 200 This will limit the fragments to consider for highlight to around 200 characters, and it will not return the whole chunk of da

The unified highlighter html escaping. Seems rather extreme...

2017-05-26 Thread Michael Joyner
Isn't the unified html escaper rather extreme in its escaping? It makes it hard to deal with for simple post-processing. The original html escaper seems to do minimal escaping, not every non-alphabetical character it can find. Also, is there a way to control how much text is returned

Unified highlighter and complexphrase

2017-03-17 Thread Bjarke Buur Mortensen
Hi list, Given the text: "Kontraktsproget vil være dansk og arbejdssproget kan være dansk, svensk, norsk og engelsk" and the query: {!complexphrase df=content_da}("sve* no*") the unified highlighter (hl.method=unified) does not return any highlights. For reference, the original