Assuming you mean significant in the traditional IR sense, I would
start with the MoreLikeThis handler, in particular its
mlt.interestingTerms option. See
http://wiki.apache.org/solr/MoreLikeThisHandler
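To make that concrete, here is a rough sketch of building such a request for one stored document. The host, the /mlt handler path (it has to be registered in solrconfig.xml), the "id" unique key and the field names are all assumptions for illustration; only the mlt.* parameter names come from the wiki page.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, doc_id, fields, interesting_terms="details"):
    """Build a MoreLikeThisHandler request URL for a single stored document."""
    params = {
        # Assumes the schema's unique key field is called "id".
        "q": f"id:{doc_id}",
        # Fields MoreLikeThis should mine for terms (hypothetical names).
        "mlt.fl": ",".join(fields),
        # "list" returns the terms, "details" also returns their boost.
        "mlt.interestingTerms": interesting_terms,
        # We only want the terms here, not the similar documents.
        "rows": 0,
    }
    # "/mlt" is an assumed handler name, not something Solr guarantees.
    return f"{base_url}/mlt?{urlencode(params)}"


url = build_mlt_url("http://localhost:8983/solr", "doc42", ["title", "body"])
print(url)
```

Fetching that URL should give back an interestingTerms section in the response, which is probably the closest thing to "key terms for an ID" that Solr offers out of the box.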
As for phrases, that is a bit harder. For starters, I think you could
try combining token-based n-grams (called shingles) with MoreLikeThis.
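In case shingles are new to you, here is a small illustration of what they are. This is plain Python, not Solr code; in Solr the shingling would happen in the field's analysis chain (e.g. with solr.ShingleFilterFactory), and the function name and size parameters below are just made up for the example.

```python
def shingles(tokens, min_size=2, max_size=3):
    """Return word n-grams (shingles) of min_size..max_size tokens, in order."""
    out = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            # Skip windows that would run past the end of the token list.
            if start + size <= len(tokens):
                out.append(" ".join(tokens[start:start + size]))
    return out


tokens = "please divide this sentence".split()
print(shingles(tokens, 2, 2))
# prints ['please divide', 'divide this', 'this sentence']
```

If a field is indexed as shingles like these, each shingle is a term in its own right, so MoreLikeThis can pick multi-word terms as "interesting" the same way it picks single words.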
If you have some other notion of "significant" in relation to language
in general, then you've got quite a bit more work to do, most of which
is well beyond the scope of Solr (although it could plug in to Solr
nicely).
HTH,
Grant
On Aug 14, 2008, at 2:43 PM, Jack Tuhman wrote:
Hmm, I am new to the world of search.
I am looking for something that will give me a list of significant
words or phrases extracted from a document stored in Solr.
Jack
On Fri, Aug 8, 2008 at 9:33 AM, Grant Ingersoll
<[EMAIL PROTECTED]> wrote:
See https://issues.apache.org/jira/browse/SOLR-651. I've got some of
this coded up and hope to have a patch soon.
Or, do you mean, is there a way to get the terms the MLT uses to
generate the new query?
On Aug 5, 2008, at 8:41 PM, Jack Tuhman wrote:
Hi all,
Is there a way to get key terms from an item? If each item has an ID,
can I pass that ID to a search and get back the key terms, like you
can with the mlt filter?
Does this make sense?
Jack