Hello,

This time I'm trying to duplicate Luke's ability to show which terms occur in a search result/document (without parsing it again). Is there a SolrJ API for that?
P.S. I've also posted the question on SO <http://stackoverflow.com/q/7219111/300248>.

On Wed, Jul 6, 2011 at 11:09 AM, Gabriele Kahlout <gabri...@mysimpatico.com> wrote:

> From your patch I see TermFreqVector, which provides the information I
> want.
>
> I also found FieldInvertState.getLength(), which seems to be exactly what I
> want. I'm after the word count (the sum of tf for every term in the doc).
> I'm just not sure whether FieldInvertState.getLength() returns the word
> count or only the number of distinct terms (not multiplied by the frequency
> of each term). It seems to return the word count, but I've not tested it
> sufficiently.
>
> On Wed, Jul 6, 2011 at 1:39 AM, Trey Grainger
> <the.apache.t...@gmail.com> wrote:
>
>> Gabriele,
>>
>> I created a patch that does this about a year ago. See
>> https://issues.apache.org/jira/browse/SOLR-1837. It was written for Solr
>> 1.4 and is based upon the Document Reconstructor in Luke. The patch adds a
>> link on the main Solr admin page to a docinspector page, which will
>> reconstruct the document given a uniqueid (required). Keep in mind that
>> you're only looking at what's "in" the index for non-stored fields, not
>> the original text.
>>
>> If you have any issues using this on the most recent release, let me know
>> and I'd be happy to create a new patch for Solr 3.3. One of these days
>> I'll remove the JSP dependency and this may eventually make it into trunk.
>>
>> Thanks,
>>
>> -Trey Grainger
>> Search Technology Development Team Lead, Careerbuilder.com
>> Site Architect, Celiaccess.com
>>
>> On Tue, Jul 5, 2011 at 3:59 PM, Gabriele Kahlout
>> <gabri...@mysimpatico.com> wrote:
>>
>> > Hello,
>> >
>> > With an inverted index the term is the key, and the documents are the
>> > values. Is it still possible, however, given a document id, to get the
>> > terms indexed for that document?
>> >
>> > --
>> > Regards,
>> > K. Gabriele

--
Regards,
K. Gabriele

--- unchanged since 20/9/10 ---
P.S. If the subject contains "[LON]" or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧
time(x) < Now + 48h) ⇒ ¬resend(I, this).

If an email is sent by a sender that is not a trusted contact or the email
does not contain a valid code then the email is not received. A valid code
starts with a hyphen and ends with "X".
∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
L(-[a-z]+[0-9]X)).
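
To make the two quantities discussed in this thread concrete, here is a minimal sketch in plain Java. It does not use the Lucene or Solr APIs: the toy index, the class name DocTermsSketch and all method names are hypothetical, chosen only for illustration. It shows how the terms of one document can be recovered from a term-keyed index by scanning the posting lists, and how the word count (sum of tf) differs from the number of distinct terms. It says nothing about what FieldInvertState.getLength() actually returns.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: a toy in-memory index, not Lucene's data structures.
class DocTermsSketch {

    /** Builds a toy inverted index: term -> (docId -> term frequency). */
    static Map<String, Map<Integer, Integer>> buildIndex(Map<Integer, List<String>> docs) {
        Map<String, Map<Integer, Integer>> index = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> doc : docs.entrySet()) {
            for (String token : doc.getValue()) {
                index.computeIfAbsent(token, t -> new TreeMap<>())
                     .merge(doc.getKey(), 1, Integer::sum);
            }
        }
        return index;
    }

    /** Scans every posting list and keeps the terms whose postings contain docId. */
    static Map<String, Integer> termsForDoc(Map<String, Map<Integer, Integer>> index, int docId) {
        Map<String, Integer> termFreqs = new TreeMap<>();
        for (Map.Entry<String, Map<Integer, Integer>> posting : index.entrySet()) {
            Integer tf = posting.getValue().get(docId);
            if (tf != null) {
                termFreqs.put(posting.getKey(), tf);
            }
        }
        return termFreqs;
    }

    /** Number of distinct terms in the document. */
    static int distinctTermCount(Map<String, Integer> termFreqs) {
        return termFreqs.size();
    }

    /** Word count: the sum of tf over every term in the document. */
    static int wordCount(Map<String, Integer> termFreqs) {
        int sum = 0;
        for (int tf : termFreqs.values()) {
            sum += tf;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> docs = Map.of(
                1, List.of("to", "be", "or", "not", "to", "be"),
                2, List.of("to", "index", "a", "document"));
        Map<String, Integer> terms = termsForDoc(buildIndex(docs), 1);
        System.out.println(terms);                    // {be=2, not=1, or=1, to=2}
        System.out.println(distinctTermCount(terms)); // 4
        System.out.println(wordCount(terms));         // 6
    }
}
```

The scan in termsForDoc touches every term in the index, which is why a per-document structure such as a stored term vector is the cheaper route when it is available. As Trey notes, what comes back is the indexed (analyzed) form of the terms, not the original text.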