So are these really text locations, or rather sections of the document? If the latter, can you parse out the sections during indexing?
Regards,
   Alex

On Wed, Oct 16, 2019, 3:57 AM Kaminski, Adi, <adi.kamin...@verint.com> wrote:

> Hi,
> Thanks for the responses.
>
> It's a soft boundary that results from the dynamic syntax of our
> application. So it may vary between user searches: one user can search
> for "word1" in the starting 30 words, and another can search for "word2"
> in the starting 10 words. The use case is to match some terms/phrases in
> specific places in the document in order to identify scripts/specific
> word occurrences.
>
> So I guess copy field won't work here.
>
> Any other suggestions/thoughts?
> Maybe some hidden position filters at the native level to limit from the
> start/end of the document?
>
> Thanks,
> Adi
>
> -----Original Message-----
> From: Tim Casey <tca...@gmail.com>
> Sent: Tuesday, October 15, 2019 11:05 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Position search
>
> If this is about a normalized query, I would put the normalization text
> into a specific field. The reason for this is you may want to search the
> overall text during any form of expansion phase of searching for data.
> That is, maybe you want to know the context of up to the 120th word. At
> least you have both.
> Also, you may want to note which normalized fields were truncated or were
> simply too small. This would give some guidance as to the bias of the
> normalization. If 95% of the fields were not truncated, there is a chance
> you are not doing well at normalizing because you have a set of
> particularly short messages. So I would expect a small set of side fields
> remarking this. This would allow you to carry the measures along with the
> data.
>
> tim
>
> On Tue, Oct 15, 2019 at 12:19 PM Alexandre Rafalovitch <arafa...@gmail.com>
> wrote:
>
> > Is the 100 words a hard boundary or a soft one?
> >
> > If it is a hard one (always 100 words), the easiest is probably copy
> > field and in the (unstored) copy, trim off whatever you don't want to
> > search. Possibly using regular expressions. Of course, "what's a word"
> > is an important question here.
> >
> > Similarly, you could do that with Update Request Processors and
> > clone/process field even before it hits the schema. Then you could
> > store the extract for highlighting purposes.
> >
> > Regards,
> >    Alex.
> >
> > On Tue, 15 Oct 2019 at 02:25, Kaminski, Adi <adi.kamin...@verint.com>
> > wrote:
> > >
> > > Hi,
> > > What's the recommended way to search in Solr (assuming 8.2 is used)
> > > for specific terms/phrases/expressions while limiting the search from
> > > a position perspective?
> > > For example, to search only in the first/last 100 words of the
> > > document?
> > >
> > > Is there any built-in functionality for that?
> > >
> > > Thanks in advance,
> > > Adi
> > >
> > >
> > > This electronic message may contain proprietary and confidential
> > > information of Verint Systems Inc., its affiliates and/or
> > > subsidiaries. The information is intended to be for the use of the
> > > individual(s) or entity(ies) named above. If you are not the intended
> > > recipient (or authorized to receive this e-mail for the intended
> > > recipient), you may not use, copy, disclose or distribute to anyone
> > > this message or any information contained in this message. If you
> > > have received this electronic message in error, please notify us by
> > > replying to this e-mail.
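
For anyone following the thread, here is a rough Python sketch of what a position-limited match means when the window size changes per query (the "soft boundary" case). It is only an illustration of the semantics, not Solr or Lucene code: the function names are made up for this example, and it assumes a "word" is any run of non-whitespace characters, which may not agree with the positions a real Solr analyzer assigns.

```python
import re

# Assumption: a "word" is a run of non-whitespace characters. A real Solr
# analyzer may tokenize differently, which would shift the positions.
WORD = re.compile(r"\S+")


def tokenize(text):
    """Split text into lowercase words, in document order."""
    return [w.lower() for w in WORD.findall(text)]


def phrase_positions(tokens, phrase):
    """Return every start position at which the phrase occurs in tokens."""
    target = tokenize(phrase)
    if not target:
        return []
    last_start = len(tokens) - len(target)
    return [i for i in range(last_start + 1)
            if tokens[i:i + len(target)] == target]


def matches_in_window(text, phrase, first=None, last=None):
    """True if the phrase lies entirely within the first `first` words
    or entirely within the last `last` words of the text."""
    tokens = tokenize(text)
    size = len(tokenize(phrase))
    for start in phrase_positions(tokens, phrase):
        end = start + size  # exclusive end position of this occurrence
        if first is not None and end <= first:
            return True
        if last is not None and start >= len(tokens) - last:
            return True
    return False


doc = ("hello and welcome to the support line how can I help you "
       "today goodbye")
print(matches_in_window(doc, "support line", first=10))
print(matches_in_window(doc, "goodbye", last=3))
```

Because the window is an argument of each call, nothing has to be fixed at indexing time, which is the reason a trimmed copy field only fits the hard-boundary case.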