Hello,

This time I'm trying to duplicate Luke's functionality of knowing which
terms occur in a search result/document (without parsing it again). Is there
a SolrJ API to do that?
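
For concreteness, here is a toy sketch of what I mean by getting a document's terms back out of an inverted index: walk every term's postings and keep the terms whose postings mention the doc id. It is plain Java with no Lucene or Solr classes; the class and method names (UninvertSketch, buildIndex, termsOf) are made up for illustration, and this is not Luke's or Solr's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class UninvertSketch {

    // Toy inverted index: term -> (docId -> term frequency in that doc).
    // Tokenization is a naive lowercase whitespace split (an assumption).
    static Map<String, Map<Integer, Integer>> buildIndex(Map<Integer, String> docs) {
        Map<String, Map<Integer, Integer>> index = new TreeMap<>();
        for (Map.Entry<Integer, String> doc : docs.entrySet()) {
            for (String token : doc.getValue().toLowerCase().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                index.computeIfAbsent(token, t -> new TreeMap<>())
                     .merge(doc.getKey(), 1, Integer::sum);
            }
        }
        return index;
    }

    // "Un-invert" for one document: scan all terms and keep those whose
    // postings contain docId, together with the term frequency.
    static Map<String, Integer> termsOf(Map<String, Map<Integer, Integer>> index, int docId) {
        Map<String, Integer> terms = new TreeMap<>();
        for (Map.Entry<String, Map<Integer, Integer>> entry : index.entrySet()) {
            Integer tf = entry.getValue().get(docId);
            if (tf != null) {
                terms.put(entry.getKey(), tf);
            }
        }
        return terms;
    }

    // Word count of a document = sum of tf over its terms.
    static int wordCount(Map<String, Integer> terms) {
        int sum = 0;
        for (int tf : terms.values()) {
            sum += tf;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<Integer, String> docs = new HashMap<>();
        docs.put(1, "to be or not to be");
        docs.put(2, "be quick");
        Map<String, Integer> terms = termsOf(buildIndex(docs), 1);
        System.out.println(terms);            // {be=2, not=1, or=1, to=2}
        System.out.println(wordCount(terms)); // 6
    }
}
```

The scan touches every term in the index, so on a real index it is expensive; what I'm hoping for is an API that hands back the per-document terms without my doing this by hand.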

P.S. I've also posted the question on
SO <http://stackoverflow.com/q/7219111/300248>.

On Wed, Jul 6, 2011 at 11:09 AM, Gabriele Kahlout
<gabri...@mysimpatico.com> wrote:

> From your patch I see TermFreqVector, which provides the information I
> want.
>
> I also found FieldInvertState.getLength(), which seems to be exactly what I
> want. I'm after the word count (the sum of tf over every term in the doc).
> I'm just not sure whether FieldInvertState.getLength() returns the word
> count or only the number of distinct terms (not weighted by each term's
> frequency). It seems to return the word count, but I've not tested it
> sufficiently.
>
>
> On Wed, Jul 6, 2011 at 1:39 AM, Trey Grainger 
> <the.apache.t...@gmail.com> wrote:
>
>> Gabriele,
>>
>> I created a patch that does this about a year ago.  See
>> https://issues.apache.org/jira/browse/SOLR-1837.  It was written for Solr
>> 1.4 and is based upon the Document Reconstructor in Luke.  The patch adds a
>> link on the main Solr admin page to a docinspector page, which will
>> reconstruct the document given a uniqueid (required).  Keep in mind that
>> you're only looking at what's "in" the index for non-stored fields, not the
>> original text.
>>
>> If you have any issues using this on the most recent release, let me know
>> and I'd be happy to create a new patch for Solr 3.3.  One of these days I'll
>> remove the JSP dependency and this may eventually make it into trunk.
>>
>> Thanks,
>>
>> -Trey Grainger
>> Search Technology Development Team Lead, Careerbuilder.com
>> Site Architect, Celiaccess.com
>>
>>
>> On Tue, Jul 5, 2011 at 3:59 PM, Gabriele Kahlout
>> <gabri...@mysimpatico.com> wrote:
>>
>> > Hello,
>> >
>> > With an inverted index the term is the key, and the documents are the
>> > values. Is it nevertheless possible, given a document id, to get the
>> > terms indexed for that document?
>> >
>> > --
>> > Regards,
>> > K. Gabriele
>> >
>> > --- unchanged since 20/9/10 ---
>> > P.S. If the subject contains "[LON]" or the addressee acknowledges the
>> > receipt within 48 hours then I don't resend the email.
>> > subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧
>> > time(x) < Now + 48h) ⇒ ¬resend(I, this).
>> >
>> > If an email is sent by a sender that is not a trusted contact or the email
>> > does not contain a valid code then the email is not received. A valid code
>> > starts with a hyphen and ends with "X".
>> > ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
>> > L(-[a-z]+[0-9]X)).
>> >
>>
>
>
>
> --
> Regards,
> K. Gabriele


-- 
Regards,
K. Gabriele