Re: Replacing payloads for per-document-per-keyword scores

2012-06-01 Thread Chris Hostetter
: > Hoss guessed that we could override Term Frequency with PreAnalyzedField[1]
: > for the per-keyword scores, since keywords (tags) always have a Term
: > Frequency of 1 and the TF calculation is very fast. However it turns out
: > that you can't[2] specify TF in the PreAnalyzedField.

Yeah ... s

Re: Replacing payloads for per-document-per-keyword scores

2012-05-15 Thread Mikhail Khludnev
Hello Neil, if "manipulating tf" is a possible approach, why not extend KeywordTokenizer to work in the following manner: "3|wheel" -> {wheel,wheel,wheel}? It would allow you to supply your per-term-per-doc boosts as prefixes on the field values and multiply them internally during indexing. The
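
To make the "3|wheel" -> {wheel,wheel,wheel} idea concrete, here is a minimal sketch of just the expansion step in plain Python. It is not Lucene code and does not touch KeywordTokenizer; the function name `expand_boosted_keyword`, the `separator` parameter, and the fallback of emitting an unprefixed or malformed value once are all assumptions made for illustration. A real tokenizer would emit the repeated tokens into the token stream so that the term frequency for that document reflects the prefix.

```python
from typing import List


def expand_boosted_keyword(value: str, separator: str = "|") -> List[str]:
    """Expand "N|term" into N copies of term (hypothetical helper).

    Assumption: a value without the separator, or with a non-integer or
    negative prefix, is treated as a plain keyword and emitted once.
    """
    prefix, found, term = value.partition(separator)
    if not found:
        return [value]
    try:
        count = int(prefix)
    except ValueError:
        return [value]
    if count < 0:
        return [value]
    # Repeating the token is what would raise its term frequency at index time.
    return [term] * count


if __name__ == "__main__":
    print(expand_boosted_keyword("3|wheel"))
    print(expand_boosted_keyword("wheel"))
```

Note that this only supports non-negative integer weights, and that a weight of 0 yields no tokens at all; fractional scores would have to be scaled to integers first.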

Replacing payloads for per-document-per-keyword scores

2012-05-15 Thread Neil Hooey
Hello Hoss and the list, We are currently using Lucene payloads to store per-document-per-keyword scores for our dataset. Our dataset consists of photos with keywords assigned (only once each) to them. The index is about 90 GB, running on 24-core machines with dedicated 10k SAS drives, and 16/32 G