: >Is it possible to modify MoreLikeThis to use the schema.xml-defined
: >analyzer? That's the way the highlighting code currently works (it
: >picks the index-time analyzer).
:
: I looked at that briefly (passing the analyzer to use down to
: MoreLikeThis), but for my fields it's a lot more than just what
: analyzer is used, given all of the filters that are also in play.
that confuses me ... when dealing with the "plugin" level of things (ie:
writing java code) it's easy to access an IndexSchema instance, and from
there to get a SolrAnalyzer that already knows about all of the fields and
what token filters to use on each -- you could even access the "index"
analyzer instead of the "query" analyzer if you wanted for any field at
query time ... so if the MLT class allows some way of setting the Analyzer
to use, that should work fine.
what other problems did you run into when you looked into this, Ken?
No other problems - just not knowing that it was possible to set up a
SolrAnalyzer so easily :)
If that's the case, then it seems like a minor tweak to call
MoreLikeThis.setAnalyzer
(http://krugle.com/kse/files/svn/svn.apache.org/lucene/java/trunk/contrib/queries/src/java/org/apache/lucene/search/similar/MoreLikeThis.java)
with the SolrAnalyzer.
Though I don't understand Mark's comment for the setAnalyzer() method
- he says that it's not required when using the like(docNum) method
call, but from what I can tell the analyzer (either the default
StandardAnalyzer or whatever gets set explicitly) will still get used
in that case, if there's no term vector.
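To make that fallback concrete, here is a minimal sketch of the behavior
described above. It does not use the Lucene classes: Analyzer, Field and
termFrequencies are hypothetical stand-ins invented for illustration, and
the real MoreLikeThis code may differ in detail. The point is only that
the analyzer is skipped when a term vector exists and is still consulted
when one does not.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MltFallbackSketch {
    /** Hypothetical stand-in for an analyzer: turns field text into tokens. */
    interface Analyzer {
        List<String> tokenize(String text);
    }

    /** Hypothetical stand-in for one field of a stored document. */
    static final class Field {
        final String storedText;
        final Map<String, Integer> termVector; // null when no term vector was stored

        Field(String storedText, Map<String, Integer> termVector) {
            this.storedText = storedText;
            this.termVector = termVector;
        }
    }

    /**
     * Returns term frequencies for a field. A stored term vector is used
     * directly; otherwise the stored text is re-analyzed, so whichever
     * analyzer was configured matters in the no-term-vector case.
     */
    static Map<String, Integer> termFrequencies(Field field, Analyzer analyzer) {
        if (field.termVector != null) {
            return field.termVector; // fast path: analyzer is never called
        }
        Map<String, Integer> freqs = new HashMap<>();
        for (String token : analyzer.tokenize(field.storedText)) {
            freqs.merge(token, 1, Integer::sum); // slow path: count re-analyzed tokens
        }
        return freqs;
    }
}
```

If this reading is right, passing the schema's analyzer in place of the
default would only change results for fields without term vectors.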
: Also the performance really stunk when I didn't use stored term vectors.
well .. i'd still rather be able to say "using termVectors to make MLT
faster" than: "if you don't use termVectors MLT doesn't work at all"
Agreed.
-- Ken
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"