You are not using a string type but a TextField, and in your analysis chain
the StandardTokenizer strips the number sign (#). You can check this on the
"Analysis" page of the Solr admin interface.
You can either use a string type for searches like C#, C++ and the like, or
map the characters to something textual *before* tokenizing. My solution goes
something like this:
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-chars.txt"/>
where mapping-chars.txt contains:
*****************
# ########
# Specials
# ########
# C+ => Cplus
# C++ => Cplusplus
"\u0043\u002B" => "Cplus"
"\u0063\u002B" => "Cplus"
"\u0043\u002B\u002B" => "Cplusplus"
"\u0063\u002B\u002B" => "Cplusplus"
# C#, C♯ => Csharp
"\u0043\u0023" => "Csharp"
"\u0063\u0023" => "Csharp"
"\u0043\u266f" => "Csharp"
"\u0063\u266f" => "Csharp"
# F#, F♯ => Fsharp
"\u0046\u0023" => "Fsharp"
"\u0066\u0023" => "Fsharp"
"\u0046\u266f" => "Fsharp"
"\u0066\u266f" => "Fsharp"
# J#, J♯ => Jsharp
"\u004A\u0023" => "Jsharp"
"\u006A\u0023" => "Jsharp"
"\u004A\u266f" => "Jsharp"
"\u006A\u266f" => "Jsharp"
# ♭ => b
"\u266d" => "b"
# @ => at
"\u0040" => "at"
*******************************
Then use any tokenizer you like.
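For reference, here is a minimal sketch of how the char filter might be wired
into a field type. The type name text_mapped is made up for illustration, and
the filters are simply copied from the text_general definition quoted below
(minus the commented-out part). The important points are that the charFilter
comes before the tokenizer, and that it appears in both the index and the
query analyzer so that "C#" is mapped the same way on both sides.

```xml
<!-- Sketch only: "text_mapped" is a hypothetical name, not a stock Solr type. -->
<fieldType name="text_mapped" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Char filters see the raw text, so C# is already Csharp
         by the time the tokenizer runs. -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-chars.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- Same mapping at query time, otherwise a query for c# loses its # again. -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-chars.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- The existing techskill field, pointed at the new type. -->
<field name="techskill" type="text_mapped" indexed="true" stored="true"/>
```

Documents that are already indexed were tokenized without the mapping, so you
need to reindex after changing the field type. The "Analysis" page should then
show c# ending up as the single token csharp.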
-------- Original Message --------
Subject: Re: Is there any special meaning for # symbol in solr.
From: veena rani <veenara...@gmail.com>
To: solr-user@lucene.apache.org
CC: te <t...@statsbiblioteket.dk>
Date: 04.09.2012 09:49
This is the field type I'm using for techskill:
<field name="techskill" type="text_general" indexed="true"
stored="true" />
<fieldType name="text_general" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true" />
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory"
synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
On Tue, Sep 4, 2012 at 1:16 PM, veena rani <veenara...@gmail.com> wrote:
No, # is not a stop word.
On Tue, Sep 4, 2012 at 12:59 PM, 李赟 <liyun2...@corp.netease.com> wrote:
Is "#" in your stop words list ?
2012-09-04
Li Yun
Software Engineer @ Netease
Mail: liyun2...@corp.netease.com
MSN: rockiee...@gmail.com
From: veena rani
Sent: 2012-09-04 12:57:26
To: solr-user; te
CC:
Subject: Re: Is there any special meaning for # symbol in solr.
If I use this link,
http://localhost:8080/solr/select?&q=(techskill%3Ac%23)
Solr displays the results for techskill:c.
But I want it to display only the results for techskill:c#.
On Mon, Sep 3, 2012 at 7:23 PM, Toke Eskildsen <t...@statsbiblioteket.dk>
wrote:
On Mon, 2012-09-03 at 13:39 +0200, veena rani wrote:
I have an issue with the # symbol in Solr.
I'm trying to search for a string that ends with #, e.g. c#, and it throws
an error like:
org.apache.lucene.queryparser.classic.ParseException: Cannot
parse '(techskill:c': Encountered "<EOF>" at line 1, column 12.
Solr only received '(techskill:c', which has unbalanced parentheses.
My guess is that you do not perform a URL-encode of '#' and that you
were sending something like
http://localhost:8080/solr/select?&q=(techskill:c#)
when you should have been sending
http://localhost:8080/solr/select?&q=(techskill%3Ac%23)
--
Regards,
Veena.
Banglore.