Re: charfilter doesn't do anything

2013-09-11 Thread Andreas Owen
M, Jack Krupansky wrote: > >> Use XML then. Although you will need to escape the XML special characters as >> I did in the pattern. >> >> The point is simply: Quickly and simply try to find the simple test scenario >> that illustrates the problem. >> >>

Re: charfilter doesn't do anything

2013-09-10 Thread Jack Krupansky
ding nested HTML formatting, while the latter strips nested HTML formatting as well. The tokenizer will in fact strip out white space, but that happens after all character filters have completed. -- Jack Krupansky -Original Message- From: Andreas Owen Sent: Tuesday, September 10,
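To illustrate the ordering described above, here is a toy Python model (not Solr code): every character filter rewrites the raw text first, and only then does the tokenizer split on white space. The names `strip_tags`, `replace_word` and `analyze`, and the replacement value `REPLACED`, are made up for this sketch.

```python
import re

def strip_tags(text):
    # Crude stand-in for an HTML-stripping char filter: each tag becomes a space.
    return re.sub(r"<[^>]+>", " ", text)

def replace_word(text):
    # Stand-in for a pattern-replace char filter; "REPLACED" is a placeholder value.
    return re.sub(r"Zahlungsverkehr", "REPLACED", text)

def analyze(text, char_filters):
    # All char filters run on the raw character stream, in order.
    for char_filter in char_filters:
        text = char_filter(text)
    # Only afterwards does the whitespace tokenizer split the text,
    # which is where line breaks and other white space disappear.
    return text.split()

tokens = analyze("<p>Der\nZahlungsverkehr</p>\n<b>heute</b>", [strip_tags, replace_word])
print(tokens)
```

The line breaks in the input are still present while the two filters run; they only vanish in the final `split()` step.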

Re: charfilter doesn't do anything

2013-09-10 Thread Andreas Owen
--Original Message- From: Andreas Owen > Sent: Monday, September 09, 2013 7:05 PM > To: solr-user@lucene.apache.org > Subject: Re: charfilter doesn't do anything > > i tried but that isn't working either, it wants a data-stream, i'll have to > check how to post

Re: charfilter doesn't do anything

2013-09-09 Thread Jack Krupansky
, September 09, 2013 7:05 PM To: solr-user@lucene.apache.org Subject: Re: charfilter doesn't do anything i tried but that isn't working either, it wants a data-stream, i'll have to check how to post json instead of xml On 10. Sep 2013, at 12:52 AM, Jack Krupansky wrote: Did you a

Re: charfilter doesn't do anything

2013-09-09 Thread Jack Krupansky
@lucene.apache.org Subject: Re: charfilter doesn't do anything i've downloaded curl and tried it in the command prompt and power shell on my win 2008r2 server, that's why i used my dataimporter with a single line html file and copy/pasted the lines into schema.xml On 9. Sep 2013, at 11:2

Re: charfilter doesn't do anything

2013-09-09 Thread Andreas Owen
example? If not, please do so. > > -- Jack Krupansky > > -Original Message- From: Andreas Owen > Sent: Monday, September 09, 2013 4:42 PM > To: solr-user@lucene.apache.org > Subject: Re: charfilter doesn't do anything > > i index html pages with a lot

Re: charfilter doesn't do anything

2013-09-09 Thread Andreas Owen
data. You can just > use the standard Solr simple post tool. > > -- Jack Krupansky > > -Original Message- From: Andreas Owen > Sent: Monday, September 09, 2013 6:40 PM > To: solr-user@lucene.apache.org > Subject: Re: charfilter doesn't do anything > >

Re: charfilter doesn't do anything

2013-09-09 Thread Andreas Owen
curl "http://localhost:8983/solr/select/?q=body:def&indent=true&wt=json" > shows nothing (outside of body) > > curl "http://localhost:8983/solr/select/?q=body:body&indent=true&wt=json" > Shows nothing, HTML tag stripped > > In your original

Re: charfilter doesn't do anything

2013-09-09 Thread Jack Krupansky
Did you in fact try my suggested example? If not, please do so. -- Jack Krupansky -Original Message- From: Andreas Owen Sent: Monday, September 09, 2013 4:42 PM To: solr-user@lucene.apache.org Subject: Re: charfilter doesn't do anything i index html pages with a lot of lines an

Re: charfilter doesn't do anything

2013-09-08 Thread Jack Krupansky
idn't show us what your default field, df parameter, was. -- Jack Krupansky -Original Message- From: Andreas Owen Sent: Sunday, September 08, 2013 5:21 AM To: solr-user@lucene.apache.org Subject: Re: charfilter doesn't do anything yes but that filters html and not the specific

Re: charfilter doesn't do anything

2013-09-08 Thread Andreas Owen
yes but that filters html and not the specific tag i want. On 7. Sep 2013, at 7:51 PM, Erick Erickson wrote: > Hmmm, have you looked at: > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.HTMLStripCharFilterFactory > > Not quite the , perhaps, but might it help? > > > On Fri, Se

Re: charfilter doesn't do anything

2013-09-07 Thread Jack Krupansky
For the second question, there is no multiline mode - the ends of lines are just white space characters. IOW, it is implicitly multi-line. -- Jack Krupansky -Original Message- From: Andreas Owen Sent: Thursday, September 05, 2013 12:03 PM To: solr-user@lucene.apache.org Subject: charf
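A small sketch of what "implicitly multi-line" means in practice. It uses Python's `re` rather than the Java regex engine that Solr uses, but the two points shown (a literal pattern ignores surrounding line breaks, and `\s` matches a newline) hold in both. The sample text and the replacement value `REPLACED` are invented for the example.

```python
import re

page = "<p>erste Zeile</p>\n<p>Der Zahlungsverkehr\nläuft</p>\n"

# A literal pattern matches wherever it occurs; the line breaks around it
# are just characters in the stream and do not get in the way.
replaced = re.sub(r"Zahlungsverkehr", "REPLACED", page)

# \s matches a newline like any other white space, so a pattern can span lines.
spans_lines = re.search(r"Zahlungsverkehr\s+läuft", page) is not None

print(replaced)
print(spans_lines)
```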

Re: charfilter doesn't do anything

2013-09-07 Thread Erick Erickson
Hmmm, have you looked at: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.HTMLStripCharFilterFactory Not quite the , perhaps, but might it help? On Fri, Sep 6, 2013 at 11:33 AM, Andreas Owen wrote: > ok i have html pages with .content i > want.. i want to extract (

Re: charfilter doesn't do anything

2013-09-06 Thread Andreas Owen
ok i have html pages with .content i want.. i want to extract (index, store) only that between the body-comments. i thought regexTransformer would be the best because xpath doesn't work in tika and i can't nest an XPathEntityProcessor to use xpath. what i have also found out is that t

Re: charfilter doesn't do anything

2013-09-06 Thread Shawn Heisey
On 9/6/2013 7:09 AM, Andreas Owen wrote: > i've managed to get it working if i use the regexTransformer and string is on > the same line in my tika entity. but when the string is multilined it isn't > working even though i tried ?s to set the flag dotall. > > dataSource="dataUrl" onError="skip"
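For reference, this is what the dotall flag changes, sketched with Python's `re` (Java regex accepts the same `(?s)` inline flag). The `<!--body-->` and `<!--/body-->` markers are invented for the sketch, since the real comment markers are not shown in these excerpts, and the snippet says nothing about how the DIH entity itself should be configured.

```python
import re

html = "<html>\n<!--body-->\nline one\nline two\n<!--/body-->\n</html>"

# Without dotall, "." stops at a newline, so a multi-line body is not matched.
no_flag = re.search(r"<!--body-->(.*?)<!--/body-->", html)

# With the inline (?s) flag, "." also matches newlines and the body is captured.
with_flag = re.search(r"(?s)<!--body-->(.*?)<!--/body-->", html)
body = with_flag.group(1).strip()

print(no_flag)
print(body)
```

If the flag appears to have no effect, one thing worth checking is whether the regex ever sees the whole text at once or only one line at a time.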

Re: charfilter doesn't do anything

2013-09-06 Thread Andreas Owen
opulated. > > -- Jack Krupansky > > -Original Message- From: Andreas Owen > Sent: Friday, September 06, 2013 4:01 AM > To: solr-user@lucene.apache.org > Subject: Re: charfilter doesn't do anything > > the input string is a normal html page with the word Zahlungs

Re: charfilter doesn't do anything

2013-09-06 Thread Jack Krupansky
t handler definition and a sample of your actual Solr input (Solr XML or JSON?) so that we can see what fields are being populated. -- Jack Krupansky -Original Message- From: Andreas Owen Sent: Friday, September 06, 2013 4:01 AM To: solr-user@lucene.apache.org Subject: Re: charfilter

Re: charfilter doesn't do anything

2013-09-06 Thread Andreas Owen
--- From: Shawn Heisey > Sent: Thursday, September 05, 2013 2:41 PM > To: solr-user@lucene.apache.org > Subject: Re: charfilter doesn't do anything > > On 9/5/2013 10:03 AM, Andreas Owen wrote: >> i would like to filter / replace a word during indexing but it doesn't do

Re: charfilter doesn't do anything

2013-09-05 Thread Shawn Heisey
On 9/5/2013 10:03 AM, Andreas Owen wrote: > i would like to filter / replace a word during indexing but it doesn't do > anything and i don't get an error. > > in schema.xml i have the following: > > multiValued="true"/> > > > > > pattern="Zahlungsverkehr" replacement="A

Re: charfilter doesn't do anything

2013-09-05 Thread Jack Krupansky
And show us an input string and a query that fail. -- Jack Krupansky -Original Message- From: Shawn Heisey Sent: Thursday, September 05, 2013 2:41 PM To: solr-user@lucene.apache.org Subject: Re: charfilter doesn't do anything On 9/5/2013 10:03 AM, Andreas Owen wrote: i would li