Is there any chance that you changed your schema since you indexed the
data? If so, re-index the data.
If a "*" query finds nothing, that implies that the default field is empty.
Are you sure the "df" parameter is set to the field containing your data?
Show us your request handler definition and a sample of your actual Solr
input (Solr XML or JSON?) so that we can see what fields are being
populated.
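For reference, a "df" default in solrconfig.xml usually looks something like the sketch below. The handler name "/select" and the field name text_html are assumptions taken from the query URL and schema quoted in this thread, not from your actual config, so compare against what you really have.

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <!-- df: the field searched when the query does not name one.
         text_html is assumed here; use the field that holds your data. -->
    <str name="df">text_html</str>
  </lst>
</requestHandler>
```

If "df" points at a field that is never populated, a bare q=* has nothing to match.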
-- Jack Krupansky
-----Original Message-----
From: Andreas Owen
Sent: Friday, September 06, 2013 4:01 AM
To: solr-user@lucene.apache.org
Subject: Re: charfilter doesn't do anything
The input string is a normal HTML page with the word Zahlungsverkehr in it,
and my query is ...solr/collection1/select?q=*
On 5. Sep 2013, at 9:57 PM, Jack Krupansky wrote:
And show us an input string and a query that fail.
-- Jack Krupansky
-----Original Message-----
From: Shawn Heisey
Sent: Thursday, September 05, 2013 2:41 PM
To: solr-user@lucene.apache.org
Subject: Re: charfilter doesn't do anything
On 9/5/2013 10:03 AM, Andreas Owen wrote:
I would like to filter / replace a word during indexing, but it doesn't do
anything and I don't get an error.
In schema.xml I have the following:
<field name="text_html" type="text_cutHtml" indexed="true" stored="true"
       multiValued="true"/>
<fieldType name="text_cutHtml" class="solr.TextField">
  <analyzer>
    <!-- <tokenizer class="solr.StandardTokenizerFactory"/> -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="Zahlungsverkehr" replacement="ASDFGHJK" />
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
My second question: where can I specify that the expression is multiline,
the way I can add /m at the end of a pattern in JavaScript?
I don't know about your second question. I don't know if that will be
possible, but I'll leave that to someone who's more expert than I.
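One thing that may be worth trying: as far as I know, the pattern is compiled as a Java regular expression, and Java regex accepts embedded flags at the start of the pattern. The sketch below assumes that holds for your Solr version. The ^ and $ anchors are only there to illustrate what the flag changes and are not part of your original pattern.

```xml
<!-- (?m) is Java's embedded MULTILINE flag: ^ and $ then match at the
     start and end of each line rather than of the whole input.
     (?s) would instead let "." match line breaks as well. -->
<charFilter class="solr.PatternReplaceCharFilterFactory"
            pattern="(?m)^Zahlungsverkehr$" replacement="ASDFGHJK"/>
```

The analysis tab should show whether the flag has the effect you want on a multi-line sample.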
As for the first question, here's what I have. Did you reindex? That
will be required.
http://wiki.apache.org/solr/HowToReindex
Assuming that you did reindex, are you trying to search for ASDFGHJK in
a field that contains more than just "Zahlungsverkehr"? The keyword
tokenizer might not do what you expect - it tokenizes the entire input
string as a single token, which means that you won't be able to search
for single words in a multi-word field without wildcards (for example
*ASDFGHJK*), which are pretty slow.
Note that both the pattern and replacement are case sensitive. This is
how regex works. You haven't used a lowercase filter, which means that
you won't be able to search for asdfghjk.
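A minimal sketch of a field type that would address both points, assuming you want word-level, case-insensitive matching. It reuses your field type name and char filter; the choice of tokenizer and the added lowercase filter are my suggestions, not something your schema already has.

```xml
<fieldType name="text_cutHtml" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Char filters run on the raw text, before the tokenizer. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="Zahlungsverkehr" replacement="ASDFGHJK"/>
    <!-- Splits the text into individual word tokens instead of keeping
         the whole field value as a single token. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercases every token, so the replacement is indexed as asdfghjk. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because there is a single analyzer, it applies at both index and query time, so a search for either ASDFGHJK or asdfghjk should match once the data has been reindexed.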
Use the analysis tab in the UI on your core to see what Solr does to
your field text.
Thanks,
Shawn