At my company, I've been spending some time figuring out the best approach for inline prefix autocompletion. Most autocompletion support is based solely on prefix matching, because it can jump quickly to a given term within a field and break out of the enumeration loop as soon as the prefix no longer matches (really nice and freaking quick).
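To make the early exit concrete, here is a minimal sketch of prefix enumeration over a sorted term list. It uses a plain `TreeSet` as a stand-in for the sorted terms of a field; the class and method names (`PrefixScan`, `scan`) are made up for illustration and are not Lucene or Solr APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration only: a TreeSet stands in for a field's sorted terms.
class PrefixScan {

    /** Collects every term that starts with the prefix, stopping at the first term that does not. */
    static List<String> scan(SortedSet<String> sortedTerms, String prefix) {
        List<String> matches = new ArrayList<>();
        // tailSet skips straight to the first term >= prefix (the "jump").
        for (String term : sortedTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order guarantees no later term can match
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        SortedSet<String> terms =
            new TreeSet<>(Arrays.asList("hewitt", "jennifer", "jenny", "love"));
        System.out.println(scan(terms, "jen")); // prints [jennifer, jenny]
    }
}
```

Because the terms are sorted, the loop never looks at "love" beyond the single comparison that ends it, which is why plain prefix matching stays fast on large fields.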
Doing inline prefixing means you lose this functionality and have to check each word specifically.

Some approaches I investigated:

- TermsComponent - test each term for inline prefixes
  - For large corpora, this is slow, really really slow...
- TST - currently only supports prefixing as well, but each node could point back to its documents, and you could then perform document intersection.
  - Probably the quickest solution, but rebuilding this tree every time a commit happens could get really ugly on memory
- RAMDirectory - again a memory hog.

A quick and not so bad solution:

- New poly field type that splits a term value into prefix-able terms.
- CopyField of dynamic type __s (string) to this fieldtype
- Ex. Jennifer Love Hewitt -> Jennifer Love Hewitt, Love Hewitt, Hewitt

How it looks indexed:

  jennifer love hewitt<DELIM>Jennifer Love Hewitt
  love hewitt<DELIM>Jennifer Love Hewitt
  hewitt<DELIM>Jennifer Love Hewitt

As a user is typing name values, we prefix match on the term they typed and then return whatever is after the delimiter. I also lowercased the indexed part, so I could get case insensitivity.

Note: The only case that I'm not currently supporting is out-of-order prefixing (e.g. user types Hewitt Jennifer). This can still be accomplished using this approach: you would index each poly term split separately and maintain a map while your prefix algorithm is running.
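The indexing and lookup steps described above can be sketched outside Solr as follows. This is only an illustration under assumptions: a `TreeSet` stands in for the indexed terms, and the names `InlinePrefixSketch`, `expand`, and `suggest` are hypothetical. The delimiter value matches the one used in the field type from this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-alone sketch; a TreeSet plays the role of the indexed terms.
class InlinePrefixSketch {
    static final char DELIMITER = '\u00ff';

    /** Expands a display value into one indexed term per word-level suffix. */
    static List<String> expand(String displayValue) {
        String[] words = displayValue.trim().toLowerCase().split("\\s+");
        List<String> terms = new ArrayList<>();
        for (int start = 0; start < words.length; start++) {
            String suffix = String.join(" ", Arrays.copyOfRange(words, start, words.length));
            // Lowercased suffix for matching, original value after the delimiter for display.
            terms.add(suffix + DELIMITER + displayValue);
        }
        return terms;
    }

    /** Prefix-matches the typed text and returns whatever follows the delimiter. */
    static Set<String> suggest(TreeSet<String> index, String typed) {
        String prefix = typed.toLowerCase();
        Set<String> suggestions = new LinkedHashSet<>();
        for (String term : index.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // early exit, same as ordinary prefix matching
            }
            suggestions.add(term.substring(term.lastIndexOf(DELIMITER) + 1));
        }
        return suggestions;
    }

    public static void main(String[] args) {
        TreeSet<String> index = new TreeSet<>();
        index.addAll(expand("Jennifer Love Hewitt"));
        index.addAll(expand("Courtney Love"));
        System.out.println(suggest(index, "love"));
    }
}
```

Typing "love" matches both the "love hewitt" suffix and the "love" suffix, so both names come back even though neither starts with that word. The cost is one extra indexed term per word in each value.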
Thanks,
Matt

import org.apache.lucene.document.Fieldable;
import org.apache.solr.schema.SchemaField;
import org.apache.solr.schema.StrField;

public class AutocompleteStrField extends StrField {

  // Separates the lowercased, prefix-able part from the original display value.
  private static final char DELIMITER = '\u00ff';

  @Override
  public boolean isPolyField() {
    return true;
  }

  /**
   * Given a {@link org.apache.solr.schema.SchemaField}, create one or more
   * {@link org.apache.lucene.document.Fieldable} instances
   * @param field the {@link org.apache.solr.schema.SchemaField}
   * @param externalVal The value to add to the field
   * @param boost The boost to apply
   * @return An array of {@link org.apache.lucene.document.Fieldable}
   *
   * @see #createField(SchemaField, String, float)
   * @see #isPolyField()
   */
  @Override
  public Fieldable[] createFields(SchemaField field, String externalVal, float boost) {
    // Split on runs of whitespace so repeated spaces do not produce empty words.
    String[] words = externalVal.toLowerCase().trim().split("\\s+");
    Fieldable[] f = new Fieldable[words.length];
    String value = "";
    // Walk from the last word to the first, building "hewitt", "love hewitt", ...
    for (int i = words.length - 1, count = 0; i >= 0; i--, count++) {
      // Join without a trailing space so the indexed term is "hewitt<DELIM>...".
      value = value.isEmpty() ? words[i] : words[i] + " " + value;
      f[count] = createField(field, value + DELIMITER + externalVal, boost);
    }
    return f;
  }

  /** Given an indexed term, return the human readable representation */
  @Override
  public String indexedToReadable(String indexedForm) {
    return indexedForm.substring(indexedForm.lastIndexOf(DELIMITER) + 1);
  }
}

--
View this message in context: http://lucene.472066.n3.nabble.com/Contribution-Multiword-Inline-Prefix-Autocomplete-Idea-tp2965854p2965854.html
Sent from the Solr - User mailing list archive at Nabble.com.