you need to identify the white space characters
>> that are causing the problem.
>>
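One way to identify the offending characters is to list every whitespace or non-printing character together with its code point. The following is a minimal Python sketch, not something taken from the thread; the function name `describe_whitespace` and the sample string are hypothetical. It uses a no-break space (U+00A0) as the hidden character only as an illustration, because Java's `\s` does not match it by default.

```python
import unicodedata


def describe_whitespace(text):
    """Return (index, code point, Unicode name) for every character that is
    whitespace or non-printing, so that hidden characters become visible."""
    found = []
    for i, ch in enumerate(text):
        if ch.isspace() or not ch.isprintable():
            # Control characters such as \n have no Unicode name, so use a default.
            name = unicodedata.name(ch, "<unnamed control character>")
            found.append((i, f"U+{ord(ch):04X}", name))
    return found


# Hypothetical sample: a no-break space hidden between two newlines.
sample = "exalted\n\u00a0\nPsalm"
for entry in describe_whitespace(sample):
    print(entry)
```

Running this on a copy of the stored field value should show whether anything other than spaces, tabs and \n sits between the line breaks.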
>> From: Zheng Lin Edwin Yeo
>> Sent: Wednesday, 13 March 2019 03:25:39
>> To: solr-user@lucene.apache.org
>> Subject: Re: RegexReplaceProcessorFactory pattern to detect multiple \
> that are causing the problem.
>
> From: Zheng Lin Edwin Yeo
> Sent: Wednesday, 13 March 2019 03:25:39
> To: solr-user@lucene.apache.org
> Subject: Re: RegexReplaceProcessorFactory pattern to detect multiple \n
>
> Hi,
>
> We have managed to reso
>>> Regards,
>>> Edwin
>>>
>>>
>>>
>>>
>>> On Thu, 7 Mar 2019 at 20:44, wrote:
>>>
>>>> Hi Edwin
>>>>
>>>>
>>>>
>>>> I can’t understand why the pattern is not working a
>>> I can’t understand why the pattern is not working and where the spaces
>>> between the <br> are coming from. It should be possible to allow for spaces
>>> between the <br> in the second match pattern, however, i.e. 2nd pattern
>>>
>>>
>>>
>>> (<br>[
>>
>>
>>
>> (<br>[ \t\x0b\f]*){3,}
>>
>>
>>
>> /Paul
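The effect of allowing whitespace between the tags can be checked outside Solr. The sketch below uses Python's `re` module rather than Solr's Java regex engine, and assumes the intended character class is `[ \t\x0b\f]` and the replacement is `<br><br>`; the function names and the sample string are hypothetical.

```python
import re

# Strict form: only matches <br> tags that follow each other directly.
STRICT = re.compile(r"(<br>){3,}")
# Tolerant form: each <br> may be followed by spaces, tabs, VT or FF.
TOLERANT = re.compile(r"(<br>[ \t\x0b\f]*){3,}")


def collapse_strict(text):
    return STRICT.sub("<br><br>", text)


def collapse_tolerant(text):
    return TOLERANT.sub("<br><br>", text)


# Hypothetical sample with spaces between some of the tags.
text = "Psalm 89:17<br> <br><br> <br>3 Choa Chu Kang Avenue 4"
print(collapse_strict(text))    # unchanged: no run of three adjacent tags
print(collapse_tolerant(text))  # Psalm 89:17<br><br>3 Choa Chu Kang Avenue 4
```

Note that the tolerant form also consumes any of those whitespace characters that follow the last tag in a run.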
>>
>>
>>
>>
>>
>>
>> From: Zheng Li
>
>
>
> From: Zheng Lin Edwin Yeo<mailto:edwinye...@gmail.com>
> Sent: Wednesday, 6 March 2019 16:28
> To: solr-user@lucene.apache.org<mailto:solr-user@lucene.apache.org>
> Subject: Re: RegexReplace
>>
> >>
> >>content
> >>(<br><br>){3,}
> >><br><br>
> >>true
> >>
> >>
> >> However, none of the \n is being removed this time round.
> >> Is the order and/or the pattern correct?
> >>
>
>>>
>>> <br>
>>>
>>>
>>>
>>> Now all line endings and preceding whitespace characters should be
>>> changed to ‘<br>’.
>>>
>>>
>>>
>>> The second pattern replacement should replace 3 or more ‘<br>’ sequences
>>
>> <br><br>
>>
>>
>>
>> Hope this approach works. Sorry for not replying earlier and best regards,
>>
>> Paul
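The two-step approach and the importance of its order can be illustrated with a small Python sketch. The exact patterns configured in Solr are not visible in this thread, so both patterns below are assumptions: step 1 turns each line ending plus the whitespace before it into one `<br>`, and step 2 collapses three or more `<br>` into two. The helper `run` and the sample string are hypothetical.

```python
import re

# Assumed step 1: a line ending and any whitespace before it becomes one <br>.
STEP1 = (re.compile(r"[ \t\x0b\f]*\r?\n"), "<br>")
# Assumed step 2: three or more <br> in a row collapse to two.
STEP2 = (re.compile(r"(<br>){3,}"), "<br><br>")


def run(text, steps):
    """Apply each (pattern, replacement) pair in the given order."""
    for pattern, replacement in steps:
        text = pattern.sub(replacement, text)
    return text


text = "Psalm 89:17 \n\n \n\n 3 Choa Chu Kang Avenue 4"
print(run(text, [STEP1, STEP2]))  # Psalm 89:17<br><br> 3 Choa Chu Kang Avenue 4
print(run(text, [STEP2, STEP1]))  # Psalm 89:17<br><br><br><br> 3 Choa Chu Kang Avenue 4
```

With the steps swapped, the collapsing pattern runs before any `<br>` exists, so all four converted line endings survive.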
>>
>>
>>
>>
>>
> and best regards,
>
> Paul
>
>
>
>
>
>
>
>
> From: Zheng Lin Edwin Yeo<mailto:edwinye...@gmail.com>
> Sent: Tuesday, 5 March 2019 03:35
> To: solr-user@lu
If the second step is executed first, then you will get the unwanted 4
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> From: Zheng
>> From: Zheng Lin Edwin Yeo<mailto:edwinye...@gmail.com>
>> Sent: Wednesday, 20 February 2019 09:29
>> To: solr-user@lucene.apache.org<mailto:solr-user@lucene.apache.org>
>> Subject: Re: RegexReplaceProcessorFactory pattern to detect multiple \n
>>
>>
>>
>> Hi Jörn
>
>
> From: Zheng Lin Edwin Yeo<mailto:edwinye...@gmail.com>
> Sent: Wednesday, 20 February 2019 09:29
> To: solr-user@lucene.apache.org<mailto:solr-user@lucene.apache.org>
> Subject: Re: RegexReplaceProcessorFactory pattern to detect multiple \n
>
>
>
> Hi Jörn ,
>
content: *Dear Sir, I am terminating
> >>>>>>>
> >>>>>>> Example 2: The sentence for which the above regex pattern is only
> >>>>>>> partially working (as you can see, instead of 2 <br>, there are 4 <br>)
> >>>>>>> *Origina
> From: Zheng Lin Edwin Yeo<mailto:edwinye...@gmail.com>
> Sent: Wednesday, 20 February 2019 08:13
> To: solr-user@lucene.apache.org<mailto:solr-user@lucene.apache.org>
> Subject: Re: RegexReplaceProcessorFactory pattern to detect multiple \n
>
>
>
> Hi,
>
> Thanks for the reply.
>
>
>>>>>>>
>>>>>>> 3 Choa Chu Kang Avenue 4
>>>>>>> *Original content:* exalted \n \n\n Psalm 89:17 \n\n \n\n 3
>>>>>>> Choa Chu Kang Avenue 4, Singapore
>>>>>>> *Index content: *exalted Psalm 89:17 3
89:17 3
> >>>>> Choa Chu Kang Avenue 4, Singapore
> >>>>>
> >>>>> Example 3: The sentence for which the above regex pattern is only
> >>>>> partially working (as you can see, instead of 2 <br>, there are 4 <br>)
> >>>>> *Ori
nal content in EML file:*
>>>>>
>>>>> http://www.concordpri.moe.edu.sg/
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> O
>>
>>>> On Tue, Dec 18, 2018 at 10:07 AM
>>>> *Original content:* http://www.concordpri.moe.edu.sg/ \n\n \n\n \n
>>>> \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n\n \n\n\n On Tue, Dec 18,
>>>> 2018 at 10:07 AM
>>>> *Index
>>
>>>> Example 2: The sentence for which the above regex pattern is only
>>>> partially working
>>>> (as you can see, instead of 2 <br>, there are 4 <br>)
>>>> *Original content:* exalted \n \n\n Psalm 89:17 \n\n \n\n 3 Choa
>>>> Chu Kang A
>>
>>> Hi Edwin
>>>
>>>
>>>
>>> 1. Sorry, the pattern was wrong; the space should precede the \n, i.e.
>>> (\s*\n){2,}
>>> 2. Perhaps in the data you have other (non-printing) characters than
>>> \n?
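Both points can be tried on the sample text quoted in this thread. The following is a minimal Python sketch, not Solr configuration: it assumes the replacement is `<br><br>`, uses `re.ASCII` so that `\s` covers the same characters as Java's default `\s`, and the function name `collapse_blank_lines` is hypothetical.

```python
import re

# re.ASCII limits \s to [ \t\n\r\f\v], the same set as Java's default \s.
PATTERN = re.compile(r"(\s*\n){2,}", re.ASCII)


def collapse_blank_lines(text, replacement="<br><br>"):
    """Replace two or more line breaks, with optional whitespace before each."""
    return PATTERN.sub(replacement, text)


original = "exalted \n \n\n Psalm 89:17 \n\n \n\n 3 Choa Chu Kang Avenue 4, Singapore"
print(collapse_blank_lines(original))
# exalted<br><br> Psalm 89:17<br><br> 3 Choa Chu Kang Avenue 4, Singapore

# Hypothetical case for point 2: a no-break space between the newlines
# is not matched by \s here, so nothing is replaced.
print(repr(collapse_blank_lines("a\n\u00a0\nb")))
```

If the pattern behaves on plain \n input like this but not on the indexed data, that points towards extra characters in the data rather than the pattern itself.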
>>
>> Perhaps in the data you have other (non-printing) characters than
>> \n?
>>
>>
>>
>>
>>
>>
>> From: Zheng Lin Edwin Yeo<mailto:edwinye...@gmail.com>
>>
> From: Zheng Lin Edwin Yeo<mailto:edwinye...@gmail.com>
> Sent: Thursday, 7 February 2019 15:23
> To: solr-user@lucene.apache.org<mailto:solr-user@lucene.apache.org>
> Subject: Re: RegexReplaceProcessorFactory pattern to detect multiple \n
>
>
>
> Hi Paul,
>
> We hav
>
> From: Zheng Lin Edwin Yeo<mailto:edwinye...@gmail.com>
> Sent: Thursday, 7 February 2019 15:10
> To: solr-user@lucene.apache.org<mailto:solr-user@lucene.apache.org>
> Subject: Re: RegexReplaceProcessorFactory pattern to detect multiple \n
>
>
>
> Hi Paul,
>
Hi Paul,
Thanks for your reply.
When I use this pattern:
content
(\n+\s*){2,}
It works for some sentences within the same content but not for others.
Please see below for one sentence where it works and another where it
works only partially:
Example