Hi,

On 7/17/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On 7/17/07, Doğacan Güney <[EMAIL PROTECTED]> wrote:
> > Hi all,
> >
> > Is there a way to pass arguments to analyzers per document? Let's say
> > that I have a field "foo" which is tokenized by WhitespaceTokenizer
> > and then filtered by MyCustomStemmingFilter. MyCustomStemmingFilter
> > can stem more than one language but (obviously) it needs to know the
> > language of the document it is working on. So what I need is to
> > specify the language per document (actually per field).
> >
> > Here is an example:
> > <doc>
> >    <field name="....
> >     .....
> >     <field name="foo" lang="en">My spam egg bars baz.</field>
> > </doc>
> >
> > Is something like this possible with Solr?
>
> You can pass extra args to a factory in the field-type definition, but
> that means you would need a separate field-type per language.

Thanks for the answer.
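
If I understand the suggestion correctly, the schema would look roughly
like the sketch below. The factory class name, its "lang" argument, and
the field names foo_en/foo_tr are hypothetical names of mine, not
anything that exists in Solr today; only the general pattern of passing
attributes to a factory comes from your description.

```xml
<!-- Sketch only: com.example.MyCustomStemmingFilterFactory and its
     "lang" attribute are assumed names for a custom filter factory. -->
<fieldtype name="text_en" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- extra attributes on the filter element are handed to the factory -->
    <filter class="com.example.MyCustomStemmingFilterFactory" lang="en"/>
  </analyzer>
</fieldtype>

<fieldtype name="text_tr" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="com.example.MyCustomStemmingFilterFactory" lang="tr"/>
  </analyzer>
</fieldtype>

<!-- one field per language, each bound to its own field type -->
<field name="foo_en" type="text_en" indexed="true" stored="true"/>
<field name="foo_tr" type="text_tr" indexed="true" stored="true"/>
```

So the language ends up encoded in the field name rather than given
per document, and the client has to pick foo_en or foo_tr when indexing.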

Your suggestion would work for this particular use case, but IMHO
there are other use cases that could benefit from per-document
parameters (for example, one may process the whole document and add
parameters for each field based on document-level analysis). Also,
again IMHO, per-field parameters are more flexible.

Would this be a useful feature for Solr? I would actually like to work
on it if others consider it a useful add-on. It seems simple to
accomplish and it would probably be a good introduction to Solr
internals.


-Yonik



--
Doğacan Güney
