Assuming you mean significant in the traditional IR sense, I would start with MoreLikeThis. See http://wiki.apache.org/solr/MoreLikeThisHandler, in particular the mlt.interestingTerms option.
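
To make the mlt.interestingTerms suggestion concrete, here is a rough sketch of building such a request for a single document id. It rests on several assumptions that are not from this thread: that a MoreLikeThisHandler is registered at /mlt on a local Solr instance, that the unique key field is called "id", and that "title" and "body" are the fields to mine for terms. The helper name build_mlt_url is made up for illustration.

```python
from urllib.parse import urlencode


def build_mlt_url(doc_id, fields, base_url="http://localhost:8983/solr/mlt"):
    """Build a MoreLikeThisHandler URL that asks for interesting terms.

    base_url, the "id" key field, and the field names passed in are
    assumptions for illustration; adjust them to your own schema.
    """
    params = {
        "q": f"id:{doc_id}",                 # select the source document by id
        "mlt.fl": ",".join(fields),          # fields to extract terms from
        "mlt.interestingTerms": "details",   # return the terms with their boosts
        "rows": 0,                           # skip the similar documents themselves
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_mlt_url("123", ["title", "body"]))
```

Fetching that URL should return an interestingTerms section in the response, provided the handler is actually configured that way in solrconfig.xml.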

As for phrases, that is a bit harder. You could try playing around with token-based n-grams (called Shingles) and MoreLikeThis together, for starters, I think.
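
For anyone unfamiliar with shingles, this is a small plain-Python illustration of what token-based n-grams look like. It is not Solr's or Lucene's implementation, only a sketch of the idea; the function name and the whitespace tokenization are assumptions for the example.

```python
def shingles(tokens, min_size=2, max_size=3):
    """Return word-level n-grams (shingles) of min_size..max_size tokens.

    Results are ordered by start position, then by shingle length.
    """
    result = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            end = start + size
            if end > len(tokens):
                break  # no room left for a shingle of this size
            result.append(" ".join(tokens[start:end]))
    return result


if __name__ == "__main__":
    # Naive whitespace tokenization, purely for illustration.
    text = "significant words or phrases"
    print(shingles(text.split()))
```

If a field is indexed with shingles like these, each phrase becomes a term in its own right, so MoreLikeThis could in principle surface phrases as interesting terms.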

If you have some other notion of "significant" in relation to language in general, then you've got quite a bit more work to do, most of which is well beyond the scope of Solr (although it could plug in to Solr nicely).

HTH,
Grant


On Aug 14, 2008, at 2:43 PM, Jack Tuhman wrote:

Hmm, I am new to the world of search.

I am looking for something that will give me a list of significant words or
phrases extracted from a document stored in Solr.
Jack


On Fri, Aug 8, 2008 at 9:33 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:

See https://issues.apache.org/jira/browse/SOLR-651. I've got some of this
coded up and hope to have a patch soon.

Or, do you mean, is there a way to get the terms the MLT uses to generate
the new query?


On Aug 5, 2008, at 8:41 PM, Jack Tuhman wrote:

Hi all,

Is there a way to get key terms from an item? If each item has an id, can I
pass that ID to a search and get back the key terms, like you can with the
mlt filter?

Does this make sense?

Jack





