You should clean up your data before sending it to Solr. Theoretically, you could develop a custom update processor to do that cleanup within Solr, but it probably wouldn't be worth the extra effort.
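
As a rough sketch of what that client-side cleanup could look like, here is a small Python function that splits comma-joined values into separate entries before the document is sent to Solr. The function name split_company_names is made up for illustration, and the code assumes that a comma always separates two companies, which would wrongly split a name such as "Foo, Inc.", so adjust the rule to your actual data.

```python
def split_company_names(values):
    """Split comma-joined entries into individual, trimmed names.

    Assumption: a comma always separates two company names.
    Empty pieces and exact duplicates are dropped; order is preserved.
    """
    cleaned = []
    seen = set()
    for value in values:
        for part in value.split(","):
            name = part.strip()
            if name and name not in seen:
                seen.add(name)
                cleaned.append(name)
    return cleaned


# Sample document taken from the original question.
doc = {"worked_company_name": ["Dell", "Microsoft,Facebook"]}
doc["worked_company_name"] = split_company_names(doc["worked_company_name"])
print(doc)  # {'worked_company_name': ['Dell', 'Microsoft', 'Facebook']}
```

The cleaned document can then be posted to Solr as usual, with each company as its own value in the multivalued field.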

Once you have decided on the clean input format, you can work out the details of the Solr schema.

Actually, the first question is what schema your applications will expect to see. Is the field simply a multivalued string field, or do they need keyword search on it? Decide how the app will consume the data, then design the rough schema, then decide what the clean data will look like, then tune the schema for any nuances. For example, maybe you want both a multivalued list of strings (e.g., for a formatted display) and a multivalued list of keyword text values. Or maybe you want just a simple keyword text field that holds the whole list as one value.

In any case, start with the app usage requirements.

-- Jack Krupansky

-----Original Message-----
From: anurag.jain
Sent: Thursday, March 21, 2013 10:10 AM
To: solr-user@lucene.apache.org
Subject: CommaSplit and query is free text search

I have a field named worked_company_name.

In the JSON input I am sending values like:

{
"worked_company_name":["Dell","Microsoft,Facebook"]
}

-> The data is very messy: a single value may contain commas, etc.


<field name="worked_company_name" type="comaSplitwithsearch" indexed="true" stored="true"/>


So can you please tell me how the field type comaSplitwithsearch should be defined?


Thanks

--
View this message in context: http://lucene.472066.n3.nabble.com/CommaSplit-and-query-is-free-text-search-tp4049734.html
Sent from the Solr - User mailing list archive at Nabble.com.
