You should clean up your data before sending it to Solr. Theoretically, you
could develop a custom update processor to do that cleanup within Solr, but
it probably wouldn't be worth the extra effort.
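As a rough sketch of that client-side cleanup: this assumes the only problem is that some values hold several comma-separated names. The function name split_company_names is hypothetical, and real data may need more rules than this.

```python
def split_company_names(values):
    """Split comma-joined entries; drop blanks and duplicates, keeping order."""
    cleaned = []
    seen = set()
    for value in values:
        # One raw value may hold several names, e.g. "Microsoft,Facebook"
        for name in value.split(","):
            name = name.strip()
            if name and name not in seen:
                seen.add(name)
                cleaned.append(name)
    return cleaned


raw = {"worked_company_name": ["Dell", "Microsoft,Facebook"]}
doc = {"worked_company_name": split_company_names(raw["worked_company_name"])}
print(doc)  # {'worked_company_name': ['Dell', 'Microsoft', 'Facebook']}
```

The cleaned document can then be sent to Solr as usual, with one company name per value.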
Once you have decided what the clean input format is, then you can decide
what the details of the Solr schema should be.
Actually, the first question is what schema your applications will be
expecting to see. I mean, is it simply a multivalued string field, or do
they want to do keyword search? Decide how the app will consume the data,
then design the rough schema, then decide what the clean data will look
like, then tune the schema for any nuances. For example, maybe you want both
a multivalued list of strings (e.g., for a formatted display) and a
multivalued list of keyword text values. Or, maybe you want just a simple
keyword text field for the whole list as one value.
In any case, start with the app usage requirements.
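To illustrate the "both" option, here is a minimal sketch of a client building one document that carries the list for display and a single searchable text value. The field name worked_company_name_text and the helper build_solr_doc are assumptions for illustration only, and the schema would need a matching text field. A copyField in the schema is another way to get a searchable copy without changing the client.

```python
import json


def build_solr_doc(company_names):
    """Return a document with a display list and one combined text value."""
    return {
        # Multivalued string field: exact values, e.g. for formatted display
        "worked_company_name": list(company_names),
        # Hypothetical text field: the whole list as one value for keyword search
        "worked_company_name_text": " ".join(company_names),
    }


doc = build_solr_doc(["Dell", "Microsoft", "Facebook"])
print(json.dumps([doc], indent=2))
```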
-- Jack Krupansky
-----Original Message-----
From: anurag.jain
Sent: Thursday, March 21, 2013 10:10 AM
To: solr-user@lucene.apache.org
Subject: CommaSplit and query is free text search
I have a field named worked_company_name.
In my JSON input I am giving values like
{
"worked_company_name":["Dell","Microsoft,Facebook"]
}
-> The data is very dirty, meaning a single value may contain commas, etc.
<field name="worked_company_name" type="comaSplitwithsearch" indexed="true"
stored="true"/>
So can you please tell me how the field type comaSplitwithsearch should be
defined?
Thanks