I think the best way to get non-stemmed top terms is to index the field using a 
fieldType that does not employ any stem filter. For example:

<fieldType name="non_stemmed_text" class="solr.TextField">
      <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
</fieldType>

By using copyField you can store two (or more) versions of a field: stemmed and 
non-stemmed.

Just add a new field:
<field name="text" type="non_stemmed_text" indexed="true" stored="true" /> 

And a copy field:
<copyField source="your_original_field" dest="text" /> 
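
Note that copyField copies the raw input text before any analysis, so each 
field applies its own analyzer. As a rough sketch only, the stemmed side might 
look something like this. The type name "stemmed_text" and its filter chain 
are my assumptions for illustration; your existing definition will likely 
differ:

<!-- Hypothetical stemmed type; stands in for whatever your field uses today -->
<fieldType name="stemmed_text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Porter stemming happens here, so this field's top terms are stems -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Stemmed original, still used for searching -->
<field name="your_original_field" type="stemmed_text" indexed="true" stored="true" />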

After re-indexing, the Schema Browser (Field: text) will give you the top terms 
in non-stemmed form.

> Is it possible to retrieve the original words once solr
> (Porter algorithm)
> stems them?
> I need to index a bunch of data, store it in solr, and get
> back a list of
> most frequent terms out of solr. and i want to see the
> non-stemmed version
> of this data.
> 
> so basically, i want to enhance this:
> http://localhost:8983/solr/admin/schema.jsp to see the
> "top terms" in
> non-stemmed form.
> 
> thanks,
> thushara


      
