Hi Robert,
Thanks for the reply.
As you suggested, I used "textgen", but I am still not able to search Hindi text.
I might be missing some important configuration.
The following is my schema.xml configuration:
<fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<fields>
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <field name="cat_name" type="textgen" indexed="true" stored="true"/>
  <field name="title" type="textgen" indexed="true" stored="true"/>
  <field name="summary" type="textgen" indexed="true" stored="true"/>
  <field name="textgen" type="textgen" indexed="true" stored="false"
         multiValued="true"/>
</fields>

<uniqueKey>id</uniqueKey>
<defaultSearchField>textgen</defaultSearchField>
<solrQueryParser defaultOperator="OR"/>

<copyField source="title" dest="textgen"/>
<copyField source="cat_name" dest="textgen"/>
<copyField source="summary" dest="textgen"/>
The summary field contains the Hindi keywords.
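To make the setup concrete, here is a rough sketch of what one such document would look like in Solr's XML update format. The id and all field values are made up for illustration, and the real indexing goes through SolrJ rather than posted XML, so this shows only the shape of the data.

```xml
<!-- Illustrative only: hypothetical id and field values. -->
<add>
  <doc>
    <field name="id">1</field>
    <field name="cat_name">news</field>
    <field name="title">Sample title</field>
    <!-- Hindi text in the summary field; copyField also sends it to "textgen". -->
    <field name="summary">हिंदी समाचार</field>
  </doc>
</add>
```

With the schema above, a query against the default field "textgen" for one of the whitespace-separated Hindi words in the summary should, as far as I understand, match this document.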
Please help.
thanks
with regards
Ranveer K Kumar
On Thu, Jan 21, 2010 at 11:25 PM, Robert Muir <[email protected]> wrote:
> hello, take a look at field type "textgen" (a general unstemmed text field)
>
> the whitespacetokenizer + worddelimiterfilter used by this type will
> work correctly for hindi tokenization and punctuation.
>
> On Thu, Jan 21, 2010 at 10:55 AM, Ranveer kumar
> <[email protected]> wrote:
> > Hi all,
> >
> > I am very new to Solr.
> > I downloaded the latest release, 1.4, and installed it. For indexing and
> > searching I am using the SolrJ API.
> > My question is: "How do I enable Solr to search Hindi-language text?"
> > Please help me.
> >
> > thanks
> > with regards
> > Ranveer K Kumar
> >
>
>
>
> --
> Robert Muir
> [email protected]
>