Re: Luke response format explained

2008-01-09 Thread Robert Young
On Jan 8, 2008 8:13 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> Perhaps consider using a copyField to copy the relevant values into
> another field - then you can get the top tokens across all these fields
> with luke.
That sounds like the best solution, thanks. Also means I'd be able to have i
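To make the copyField suggestion concrete, here is a rough, stdlib-only Java sketch of the idea. It does not call Solr or Lucene. The class `CopyFieldSketch`, the method `topTokens`, and the sample documents are all hypothetical, and "top" is assumed here to mean "found in the most documents".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of "copy several fields into one, then rank its tokens".
final class CopyFieldSketch {

    // Returns the n tokens found in the most documents, looking only at the
    // listed source fields. Ties are broken alphabetically.
    static List<String> topTokens(List<Map<String, List<String>>> docs,
                                  List<String> sourceFields, int n) {
        Map<String, Integer> docFreq = new HashMap<>();
        for (Map<String, List<String>> doc : docs) {
            // The per-document set plays the role of the copyField target:
            // a token counts once per document, whichever field it came from.
            Set<String> merged = new HashSet<>();
            for (String field : sourceFields) {
                merged.addAll(doc.getOrDefault(field, List.of()));
            }
            for (String token : merged) {
                docFreq.merge(token, 1, Integer::sum);
            }
        }
        List<String> tokens = new ArrayList<>(docFreq.keySet());
        tokens.sort((a, b) -> {
            int byFreq = Integer.compare(docFreq.get(b), docFreq.get(a));
            return byFreq != 0 ? byFreq : a.compareTo(b);
        });
        return tokens.subList(0, Math.min(n, tokens.size()));
    }

    public static void main(String[] args) {
        List<Map<String, List<String>>> docs = List.of(
            Map.of("title", List.of("solr"), "body", List.of("solr", "index")),
            Map.of("title", List.of("luke"), "body", List.of("index")),
            Map.of("title", List.of("solr"), "body", List.of("query")));
        System.out.println(topTokens(docs, List.of("title", "body"), 2));
    }
}
```

In a real index the merging would happen at index time through the schema, and the ranking would come from the Luke handler rather than from code like this.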

Re: Luke response format explained

2008-01-08 Thread Ryan McKinley
Robert Young wrote:
> Thanks, that is very helpful. So, is there a way to find out the total
> number of distinct tokens, regardless of which field they're associated
> with? And to find which are most popular?
Nothing standard does that... the semantics of what it would mean get a little weird -

Re: Luke response format explained

2008-01-08 Thread Yonik Seeley
On Jan 8, 2008 3:07 PM, Robert Young <[EMAIL PROTECTED]> wrote:
> Thanks, that is very helpful. So, is there a way to find out the
> total number of distinct tokens, regardless of which field they're
> associated with?
No. A term in Lucene consists of a field and value, so the same word in diff
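A small illustration of the point that a term is a field plus a value, written as a stdlib-only Java sketch rather than against the Lucene API. The class `DistinctValuesSketch` and its sample terms are hypothetical. Each term is modeled as a `{field, value}` array, and dropping the field is what collapses the same word from two fields into one token.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model: a term is a {field, value} pair.
final class DistinctValuesSketch {

    // Keeps only the value of each term, so the same word seen in two
    // different fields is counted once.
    static Set<String> distinctValues(List<String[]> terms) {
        Set<String> values = new TreeSet<>();
        for (String[] term : terms) {
            values.add(term[1]);
        }
        return values;
    }

    public static void main(String[] args) {
        List<String[]> terms = List.of(
            new String[] {"title", "solr"},
            new String[] {"body", "solr"},   // same word, different field
            new String[] {"body", "index"});
        System.out.println("terms = " + terms.size());
        System.out.println("distinct values = " + distinctValues(terms).size());
    }
}
```

With the sample data there are three terms but only two distinct values, which is the gap between what the index counts and what the question asks for.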

Re: Luke response format explained

2008-01-08 Thread Robert Young
Thanks, that is very helpful. So, is there a way to find out the total number of distinct tokens, regardless of which field they're associated with? And to find which are most popular?
Cheers
Rob
On Jan 8, 2008 5:04 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> numTerms counts the unique terms

Re: Luke response format explained

2008-01-08 Thread Ryan McKinley
numTerms counts the unique terms (field:value pair) in the index. The source is:
    TermEnum te = reader.terms();
    int numTerms = 0;
    while (te.next()) {
      numTerms++;
    }
    indexInfo.add("numTerms", numTerms);
"distinct" is a similar calculation, but fo
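The counting loop can be mimicked without Lucene on the classpath. The stdlib-only Java sketch below models the term dictionary as a map from field to its set of values. The class `NumTermsSketch`, its methods, and the sample postings are hypothetical, and reading "distinct" as a per-field count is an assumption, since the message is cut off at that point.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of a term dictionary. No Lucene classes are used.
final class NumTermsSketch {

    // Groups indexed values by field, dropping repeated {field, value} pairs.
    static Map<String, Set<String>> termsByField(List<String[]> postings) {
        Map<String, Set<String>> byField = new LinkedHashMap<>();
        for (String[] p : postings) {
            byField.computeIfAbsent(p[0], f -> new TreeSet<>()).add(p[1]);
        }
        return byField;
    }

    // Index-wide count: every unique (field, value) pair is one term.
    static int numTerms(Map<String, Set<String>> byField) {
        int n = 0;
        for (Set<String> values : byField.values()) {
            n += values.size();
        }
        return n;
    }

    public static void main(String[] args) {
        List<String[]> postings = List.of(
            new String[] {"title", "solr"},
            new String[] {"title", "luke"},
            new String[] {"body", "solr"},
            new String[] {"body", "solr"},   // repeat occurrence, not a new term
            new String[] {"body", "index"});
        Map<String, Set<String>> byField = termsByField(postings);
        System.out.println("numTerms = " + numTerms(byField));
        // Assumed meaning of "distinct": unique values within one field.
        byField.forEach((field, values) ->
            System.out.println(field + " distinct = " + values.size()));
    }
}
```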