Mikhail - I can imagine a filter that strips out everything but numbers
and then indexes those with a (separate) numeric (trie) field. But I
don't believe you can do phrase or other proximity queries across
multiple fields. As long as an or-query is good enough, I think this
problem is not too hard? But if you need proximity it becomes more
complicated.

Once in the distant past we coded a numeric range query
using a complicated set of wildcard queries that could handle large
numbers efficiently - this search index (Verity) had no range
capability, so we had to mock it up using text. The way this worked was
something along these lines:
1) transform all the numbers into a fixed-width binary encoding (e.g. 8 = 0b00001000)
2) write queries by encoding the range as a set of bitmasks represented
by wildcard queries:
[8 TO 20] becomes (0b00001??? 0b000100?? 0b00010100), i.e. 8-15, 16-19 and 20
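
A minimal Python sketch of steps 1) and 2), not the original Verity code. The
function names (encode, range_to_wildcards), the 8-bit width, and the use of
"?" as a single-character wildcard are assumptions made for illustration. The
range is split greedily into aligned power-of-two blocks, and each block
becomes one pattern whose free low bits are wildcards.

```python
from fnmatch import fnmatchcase


def encode(n, bits=8):
    """Step 1: fixed-width binary term for n, e.g. 8 -> '0b00001000'."""
    return "0b" + format(n, "b").zfill(bits)


def range_to_wildcards(lo, hi, bits=8):
    """Step 2: cover [lo, hi] with patterns; '?' matches exactly one bit."""
    if not 0 <= lo <= hi < 2 ** bits:
        raise ValueError("need 0 <= lo <= hi < 2**bits")
    patterns = []
    while lo <= hi:
        free = 0  # number of low bits left as wildcards in this block
        while free < bits:
            size = 1 << (free + 1)
            # Grow the block only while lo is aligned to it and it fits in range.
            if lo % size != 0 or lo + size - 1 > hi:
                break
            free += 1
        prefix = format(lo >> free, "b").zfill(bits - free) if free < bits else ""
        patterns.append("0b" + prefix + "?" * free)
        lo += 1 << free
    return patterns


patterns = range_to_wildcards(8, 20)
print(patterns)

# Sanity check: a term matches some pattern exactly when it lies in [8, 20].
matched = [n for n in range(256)
           if any(fnmatchcase(encode(n), p) for p in patterns)]
print(matched == list(range(8, 21)))
```

With a width of b bits, this decomposition needs at most about 2*b patterns
per range, which is why the query stays small even for large numbers.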
I know you said you cannot use [0-9]* terms, but you will not see
terrible term explosion with this. What's your concern there?
-Mike
On 12/02/2014 02:59 PM, Mikhail Khludnev wrote:
Hello Searchers,
Does anyone remember any examples of indexing numbers inside plain text?
E.g. if I have the text "foo and 10 bars", I want to find it with a query like
foo [8 TO 20] bars.
Question no. 1 is whether to put the trie terms into a separate field, or
whether they can reside in the same text field. Note that enumerating [0-9]*
terms in a MultiTermQuery is not an option for me; I definitely need the trie
field magic!
Perhaps you can point me to a blog post or a chapter, anything that would make me happy.
Thanks a lot!