Ummm, 400k documents is _tiny_ by Solr/Lucene standards. I've seen 150M
docs fit in 16G on Solr. I put 11M docs on my laptop....

So I would _strongly_ advise that you don't worry about space at all as a
first approach and freely copy as many fields as you need to support your
use-case. Only after you've proved that this is untenable would I recommend
you develop custom code. You'll be in production much faster that way ;)

Of course this is irrelevant if each doc is "War and Peace", but....
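
A rough sketch of what the copyField + edge-ngram setup could look like
(untested, and the field/type names here are just placeholders for your
schema):

  <fieldType name="suggest_text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- edge n-grams match token prefixes, e.g. "wor" -> "world";
           switch to solr.NGramFilterFactory if you need matches inside a token -->
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

  <field name="title_suggest" type="suggest_text" indexed="true" stored="false"
         multiValued="true"/>
  <copyField source="title" dest="title_suggest"/>

With that in place, your "filter by an additional field" requirement is just
a normal fq on the suggest query, something like
q=title_suggest:wor&fq=category:books&rows=10 (where "category" stands in for
whatever field you filter on).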

Best,
Erick


On Thu, Jul 31, 2014 at 3:29 PM, Juan Pablo Albuja <jpalb...@dustland.com>
wrote:

> Good afternoon, I would really appreciate it if someone in the community
> could help me with the following issue:
>
> I need to implement a Solr autosuggest that supports:
>
> 1. Autosuggestions over multivalued fields
>
> 2. Case-insensitive matching
>
> 3. Infix matching: for example, with the value "Hello World" indexed, the
> suggestion should be returned when the user types "wor"
>
> 4. Filtering by an additional field
>
> I was using the terms component because it covers points 1 to 3, but point
> 4 is not possible with it. I also looked at faceted searches and
> NGram/EdgeNGram filters, but the problem with those approaches is that I
> would need to copy fields over to tokenize them or apply grams to them, and
> I don't want to do that because I have more than 6 fields that need
> autosuggest. My index is big (more than 400k documents) and I don't want to
> increase its size.
> I also tried extending the terms component to add an additional filter, but
> it uses a TermsEnum, which iterates over the terms of a specific field, and
> I couldn't figure out how to filter it in a really efficient way.
> Do you have any ideas on how I can satisfy these requirements efficiently?
> An approach that doesn't use the terms component at all would also work for
> me.
>
> Thanks
>
>
>
>
> Juan Pablo Albuja
> Senior Developer
>
>
>
