Re: Using SOLR

Erick Erickson Tue, 09 Mar 2010 17:17:56 -0800

Well, the LukeRequestHandler lets you peek at the
index, see:
http://wiki.apache.org/solr/LukeRequestHandler

Warning: it'll take a bit for the output to make lots of sense.
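
If it helps, here's a rough sketch of building a request URL for that handler. The host, port and /solr path are assumptions about a stock example install, so adjust them to yours. The fl and numTerms parameters are described on the wiki page above, and luke_url is just a helper name I made up.

```python
from urllib.parse import urlencode


def luke_url(base_url, field=None, num_terms=10):
    """Build a LukeRequestHandler URL (hypothetical helper, not part of Solr).

    base_url is assumed to look like http://localhost:8983/solr
    """
    params = []
    if field is not None:
        # fl limits the report to the named field
        params.append(("fl", field))
    # numTerms is how many top terms to list per field
    params.append(("numTerms", num_terms))
    return base_url.rstrip("/") + "/admin/luke?" + urlencode(params)


# Assumed local example install; "taxonomy" is the field from the question.
url = luke_url("http://localhost:8983/solr", field="taxonomy", num_terms=20)
print(url)
```

Paste the printed URL into a browser and look at the top terms listed for the field.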

The handler is based on Luke, a standalone tool (google
Lucene Luke). You can get a copy, point it at your index
and have at it.

One bit of warning though. It'll be easy to confuse
what you stored (which is just a raw copy of
your input) with what you indexed (which is
what's searched on). If you're looking at either tool
and what you see looks suspiciously like
your raw data, look further to see if you can find
the terms...
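
To make the stored/indexed distinction concrete, here's a toy sketch. This is NOT Solr's analysis code, just a stand-in that splits on whitespace, lowercases and applies a keep-word list. The sample text and keep words are invented for illustration.

```python
def toy_analyze(text, keep_words):
    """Toy stand-in for an index-time analyzer chain (not Solr's implementation):
    whitespace tokenize, lowercase, keep only the listed words."""
    terms = []
    for token in text.split():
        term = token.lower()
        if term in keep_words:
            terms.append(term)
    return terms


raw = "Fresh Water Fish and fresh water plants"  # invented sample input
keep = {"water", "fish"}                         # invented keep-word list

stored = raw                       # the stored value is a verbatim copy of the input
indexed = toy_analyze(raw, keep)   # the indexed terms are what a search matches

print("stored: ", stored)
print("indexed:", indexed)
```

The stored value comes back looking exactly like what you sent in, while the indexed side is only the handful of terms that survived analysis.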

To answer your question about searching, it all depends
(tm). What do you mean by Taxonomy? Different
people use that term...er...differently. Some example
inputs and how searching should behave in your
problem space would be very helpful.
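
In the meantime, the simplest thing to try is a plain fielded query against your taxonomy field. A rough sketch, assuming a stock local install with the standard /select handler; field_query_url is a made-up helper and "fish" is just a placeholder term:

```python
from urllib.parse import urlencode


def field_query_url(base_url, field, term, rows=10):
    """Build a fielded query URL of the form q=field:term (hypothetical helper)."""
    params = [("q", f"{field}:{term}"), ("rows", rows)]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Assumed local example install; field name taken from the question.
query = field_query_url("http://localhost:8983/solr", "taxonomy", "fish")
print(query)
```

Whether that does what you want depends on how searching should behave for your taxonomy, hence the questions above.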

HTH
Erick

On Tue, Mar 9, 2010 at 7:53 PM, CP Hennessy <cp.henne...@openapp.ie> wrote:

> Hi,
>  I'm trying to figure out if SOLR is the component I need and if so that
> I'm asking the right questions :)
>
> I need to index a large set of multilingual documents against a project
> specific taxonomy.
>
> From what I've read SOLR should be perfect for this.
>
> However I'm not sure that my approach is correct. I've been able to run the
> example solr setup and index the given documents.
>
> Now I want to add my taxonomy (in English first), and this is where I'm
> stumbling (or not understanding the documentation).
>
> To do this I understand that I need to define a field to store the result of
> the taxonomy analysis. I also need to define the analysis steps used to
> generate the values for this field (lowercase, synonyms, stemming, etc).
>
> In the file solr/conf/schema.xml in the <types> I've added :
>
>    <fieldType name="Taxonomy" class="solr.TextField" indexed="True">
>      <analyzer type="index">
>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>        <filter class="solr.SynonymFilterFactory" synonyms="ontology-synonyms.txt" ignoreCase="true" expand="true"/>
>        <filter class="solr.SnowballPorterFilterFactory" language="English"/>
>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>        <filter class="solr.KeepWordFilterFactory" words="keepwords.txt" ignoreCase="true"/>
>      </analyzer>
>    </fieldType>
>
> and
>
>   <field name="taxonomy" type="Taxonomy" indexed="true" stored="true" required="true" multiValued="true"/>
>
> I am able to test my fieldType thru the /solr/admin/analysis.jsp page and it
> seems to be doing what I expect.
>
> When I now add a test document containing several words from the keepwords.txt
> file the result seems to indicate that it was processed correctly.
>
> How can I get the details of what has been indexed for my file?
>
>
> Also I do not know how to perform a search based on the taxonomy ?
>
> Any pointers would be greatly appreciated.
>
> Thanks in advance,
> CPH
>
