: > Hoss guessed that we could override Term Frequency with PreAnalyzedField[1]
: > for the per-keyword scores, since keywords (tags) always have a Term
: > Frequency of 1 and the TF calculation is very fast. However it turns out
: > that you can't[2] specify TF in the PreAnalyzedField.
Yeah ... s
Hello Neil,
If "manipulating tf" is a possible approach, why not extend
KeywordTokenizer to make it work in the following manner:
"3|wheel" -> {wheel,wheel,wheel}
That would let you supply your per-term-per-doc boosts as prefixes on the
field values and have the terms multiplied internally during indexing.
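
To make the "3|wheel" idea concrete, here is a minimal sketch of just the
expansion step in plain Python. It is not Lucene or Solr code: the function
name expand_weighted_keyword and the "|" separator argument are assumptions
made for illustration, and a real implementation would live in a custom
tokenizer or token filter on the Java side.

```python
def expand_weighted_keyword(value, sep="|"):
    """Expand a "count|term" field value into `count` copies of `term`.

    Hypothetical helper for illustration only; it mimics what a customized
    tokenizer would emit for one keyword value.
    """
    prefix, found, term = value.partition(sep)
    if not found or not (prefix.isascii() and prefix.isdigit()):
        # No usable count prefix: treat the whole value as a single keyword.
        return [value]
    # Emit the term once per unit of weight, so its term frequency in the
    # document equals the prefix.
    return [term] * int(prefix)


tokens = expand_weighted_keyword("3|wheel")
print(tokens)
```

One thing to keep in mind with this approach: under Lucene's classic TF-IDF
similarity the tf factor is the square root of the term frequency, so
repeating a term n times does not scale its score contribution by n.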
The
Hello Hoss and the list,
We are currently using Lucene payloads to store per-document-per-keyword
scores for our dataset. Our dataset consists of photos with keywords
assigned (only once each) to them. The index is about 90 GB, running on
24-core machines with dedicated 10k SAS drives, and 16/32 G