I'm at the point in my Solr deployment where I want to start using it
for autosuggest, but I've run into a snag. Because the fields that I
want to use for autosuggest are tokenized, I can only get single terms
out of them. I would like to have it find common phrases that are between
two and five words long, so that if someone starts typing "ang" their
autosuggest list will include "Angelina Jolie" as well as possibly "Brad
Pitt and Angelina Jolie."

My index is already quite large, so I do not want to add shingles. I
tried to use the clustering component, but that only gives halfway
decent results if you make the "rows" parameter absolutely huge, which
makes things run very slowly. Also, it only works against
stored fields, so I can only run it against the field where we retrieve
captions, not the full description. It's impractical to get results
based on an entire index, much less all seven shards.

I'm OK with offline analysis to generate a list of suggestions, and I'm
also OK with doing that analysis against the MySQL data source rather
than Solr. I just need some pointers about what software and/or
techniques I can use to generate a good list, and then some idea of how
to configure Solr to use that list. Can anyone help?
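
To make the offline idea concrete, here is a rough sketch of the kind of
analysis I have in mind. It is only an illustration under my own
assumptions, not a tested approach: the function names (phrase_counts,
suggest), the min_count threshold, and the sample captions are all made
up, and the text is assumed to have already been pulled out of MySQL as
plain strings. It counts every run of two to five words and returns the
phrases in which some word starts with what the user has typed. It does
not cover loading the resulting list into Solr.

```python
import re
from collections import Counter

def phrase_counts(texts, min_words=2, max_words=5):
    """Count every run of min_words..max_words consecutive words."""
    counts = Counter()
    for text in texts:
        # Crude tokenizer (an assumption): lowercase letters, digits, apostrophes.
        words = re.findall(r"[a-z0-9']+", text.lower())
        for n in range(min_words, max_words + 1):
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    return counts

def suggest(counts, prefix, min_count=2, limit=10):
    """Return phrases seen at least min_count times in which some word
    starts with prefix, most frequent first (ties broken alphabetically)."""
    prefix = prefix.lower()
    hits = [
        (phrase, count)
        for phrase, count in counts.items()
        if count >= min_count
        and any(word.startswith(prefix) for word in phrase.split())
    ]
    hits.sort(key=lambda item: (-item[1], item[0]))
    return [phrase for phrase, _ in hits[:limit]]

# Hypothetical sample data standing in for rows read from MySQL.
captions = [
    "Angelina Jolie at the premiere",
    "Brad Pitt and Angelina Jolie arrive",
    "Brad Pitt and Angelina Jolie leave",
]
counts = phrase_counts(captions)
suggestions = suggest(counts, "ang")
```

On real data the counter would get very large, so some pruning of rare
phrases (and probably of phrases that start or end with a stopword)
would be needed before the list is useful.
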
Thanks,
Shawn