M, Jack Krupansky wrote:
>
>> Use XML then. Although you will need to escape the XML special characters as
>> I did in the pattern.
>>
>> The point is simply: Quickly and simply try to find the simple test scenario
>> that illustrates the problem.
>>
>>
ding nested HTML
formatting, while the latter strips nested HTML formatting as well.
The tokenizer will in fact strip out white space, but that happens after all
character filters have completed.
-- Jack Krupansky
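The ordering described above (character filters finish first, then the tokenizer discards white space) can be illustrated with a toy model. This is only a sketch in Python, not Solr code: the function names and the replacement string "PLACEHOLDER" are hypothetical, and only the pattern "Zahlungsverkehr" comes from the thread.

```python
import re


def pattern_replace_char_filter(text, pattern, replacement):
    """Toy stand-in for a pattern-replace char filter: rewrites the raw text."""
    return re.sub(pattern, replacement, text)


def whitespace_tokenizer(text):
    """Toy tokenizer: splits on runs of white space and drops the white space."""
    return text.split()


def analyze(text, char_filters, tokenizer):
    # Every char filter runs to completion before the tokenizer sees the text,
    # so the filters still see spaces and line ends in the input.
    for pattern, replacement in char_filters:
        text = pattern_replace_char_filter(text, pattern, replacement)
    return tokenizer(text)


raw = "abc   Zahlungsverkehr\n  def"
# "PLACEHOLDER" is a hypothetical replacement value chosen for this sketch.
tokens = analyze(raw, [(r"Zahlungsverkehr", "PLACEHOLDER")], whitespace_tokenizer)
print(tokens)
```

Because the white space is only removed in the last step, a char filter pattern can match across spaces and line breaks in the original input.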
-Original Message-
From: Andreas Owen
Sent: Tuesday, September 10,
--Original Message- From: Andreas Owen
> Sent: Monday, September 09, 2013 7:05 PM
> To: solr-user@lucene.apache.org
> Subject: Re: charfilter doesn't do anything
>
> i tried but that isn't working either, it wants a data-stream, i'll have to
> check how to post
, September 09, 2013 7:05 PM
To: solr-user@lucene.apache.org
Subject: Re: charfilter doesn't do anything
i tried but that isn't working either, it wants a data-stream, i'll have to
check how to post json instead of xml
On 10. Sep 2013, at 12:52 AM, Jack Krupansky wrote:
Did you a
@lucene.apache.org
Subject: Re: charfilter doesn't do anything
i've downloaded curl and tried it in the command prompt and power shell on my
win 2008r2 server, that's why i used my dataimporter with a single line html
file and copy/pasted the lines into schema.xml
On 9. Sep 2013, at 11:2
example? If not, please do so.
>
> -- Jack Krupansky
>
> -Original Message- From: Andreas Owen
> Sent: Monday, September 09, 2013 4:42 PM
> To: solr-user@lucene.apache.org
> Subject: Re: charfilter doesn't do anything
>
> i index html pages with a lot
data. You can just
> use the standard Solr simple post tool.
>
> -- Jack Krupansky
>
> -Original Message- From: Andreas Owen
> Sent: Monday, September 09, 2013 6:40 PM
> To: solr-user@lucene.apache.org
> Subject: Re: charfilter doesn't do anything
>
>
curl "http://localhost:8983/solr/select/?q=body:def&indent=true&wt=json";
> shows nothing (outside of body)
>
> curl "http://localhost:8983/solr/select/?q=body:body&indent=true&wt=json";
> Shows nothing, HTML tag stripped
>
> In your original
Did you in fact try my suggested example? If not, please do so.
-- Jack Krupansky
-Original Message-
From: Andreas Owen
Sent: Monday, September 09, 2013 4:42 PM
To: solr-user@lucene.apache.org
Subject: Re: charfilter doesn't do anything
i index html pages with a lot of lines an
idn't show us what your default field, df
parameter, was.
-- Jack Krupansky
-Original Message-
From: Andreas Owen
Sent: Sunday, September 08, 2013 5:21 AM
To: solr-user@lucene.apache.org
Subject: Re: charfilter doesn't do anything
yes, but that filters html and not the specific tag i want.
On 7. Sep 2013, at 7:51 PM, Erick Erickson wrote:
> Hmmm, have you looked at:
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.HTMLStripCharFilterFactory
>
> Not quite the , perhaps, but might it help?
>
>
> On Fri, Se
For the second question, there is no multiline mode - the ends of lines are
just white space characters. IOW, it is implicitly multi-line.
-- Jack Krupansky
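A small sketch of the point about line ends, using Python's `re` module rather than the `java.util.regex` engine that Solr uses (both treat a line end as white space for `\s`, and both need the dotall flag before `.` will match it). The `<!--start-->` and `<!--end-->` markers and the sample text are hypothetical, made up for this illustration.

```python
import re

# Hypothetical multi-line input; the markers are invented for this sketch.
html = "<!--start-->\nfirst line\nsecond line\n<!--end-->"

# \s matches line ends like any other white space, so no multiline mode is needed.
ws_match = re.search(r"first\s+line\s+second", html)

# Without the dotall flag, "." stops at a line end, so this finds nothing.
plain_match = re.search(r"<!--start-->(.*?)<!--end-->", html)

# With (?s), "." also matches line ends and the whole block is captured.
dotall_match = re.search(r"(?s)<!--start-->(.*?)<!--end-->", html)

print(ws_match is not None)
print(plain_match)
print(repr(dotall_match.group(1)))
```

If the dotall flag is not an option, a character class such as `[\s\S]` in place of `.` spans line ends as well.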
-Original Message-
From: Andreas Owen
Sent: Thursday, September 05, 2013 12:03 PM
To: solr-user@lucene.apache.org
Subject: charf
Hmmm, have you looked at:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.HTMLStripCharFilterFactory
Not quite the , perhaps, but might it help?
On Fri, Sep 6, 2013 at 11:33 AM, Andreas Owen wrote:
> ok i have html pages with .content i
> want.. i want to extract (
ok i have html pages with .content i
want.. i want to extract (index, store) only that
between the body-comments. i thought regexTransformer would be the best because
xpath doesn't work in tika and i can't nest an XPathEntityProcessor to use xpath.
what i have also found out is that t
On 9/6/2013 7:09 AM, Andreas Owen wrote:
> i've managed to get it working if i use the regexTransformer and string is on
> the same line in my tika entity. but when the string is multilined it isn't
> working even though i tried ?s to set the flag dotall.
>
> dataSource="dataUrl" onError="skip"
opulated.
>
> -- Jack Krupansky
>
> -Original Message- From: Andreas Owen
> Sent: Friday, September 06, 2013 4:01 AM
> To: solr-user@lucene.apache.org
> Subject: Re: charfilter doesn't do anything
>
> the input string is a normal html page with the word Zahlungs
t handler definition and a sample of your actual Solr
input (Solr XML or JSON?) so that we can see what fields are being
populated.
-- Jack Krupansky
-Original Message-
From: Andreas Owen
Sent: Friday, September 06, 2013 4:01 AM
To: solr-user@lucene.apache.org
Subject: Re: charfilter
--- From: Shawn Heisey
> Sent: Thursday, September 05, 2013 2:41 PM
> To: solr-user@lucene.apache.org
> Subject: Re: charfilter doesn't do anything
>
> On 9/5/2013 10:03 AM, Andreas Owen wrote:
>> i would like to filter / replace a word during indexing but it doesn't do
On 9/5/2013 10:03 AM, Andreas Owen wrote:
> i would like to filter / replace a word during indexing but it doesn't do
> anything and i don't get an error.
>
> in schema.xml i have the following:
>
> multiValued="true"/>
>
>
>
>
> pattern="Zahlungsverkehr" replacement="A
And show us an input string and a query that fail.
-- Jack Krupansky
-Original Message-
From: Shawn Heisey
Sent: Thursday, September 05, 2013 2:41 PM
To: solr-user@lucene.apache.org
Subject: Re: charfilter doesn't do anything
On 9/5/2013 10:03 AM, Andreas Owen wrote:
i would li