Thanks, Erick, for the great explanation. The issue with my data is as follows. I have a few rows in my books table:

cqlsh:nandan> select * from books;

 id                                   | author   | date | isbn     | solr_query | title
--------------------------------------+----------+------+----------+------------+-----------
 3910b29d-c957-4312-9b8b-738b1d0e25d0 | Chandan  | 2015 | 1asd33s  | null       | Solr
 d7534021-80c2-4315-8027-84f04bf92f53 | 现在有货 | 2015 | 现在有货 | null       | Solr
 780b5163-ca6b-40bf-a523-af2c075ef7df | 在有货   | 2015 | 在有货   | null       | Solr
 e6229268-d0fd-485b-ad89-bbde73a07ed6 | 货       | 2015 | 现有货   | null       | Solr
 76461e7e-6c31-4a4b-8a36-0df5ce746d50 | Nandan   | 2017 | 11111    | null       | Datastax
 9a9c66c2-cd34-460e-a301-6d8e7eb14e55 | Kundan   | 2016 | 12ws     | null       | Cassandra
 7e87dc3a-5e4e-4653-84cc-3d83239708d4 | 现有货   | 2015 | 现有货   | null       | Solr
 6971976e-2528-4956-94a8-345deefe5796 | 现货     | 2015 | 现货     | null       | Solr

When I select from the table by author:

cqlsh:nandan> SELECT * from books where solr_query = 'author:现有货';

 id                                   | author   | date | isbn     | solr_query | title
--------------------------------------+----------+------+----------+------------+-------
 d7534021-80c2-4315-8027-84f04bf92f53 | 现在有货 | 2015 | 现在有货 | null       | Solr
 7e87dc3a-5e4e-4653-84cc-3d83239708d4 | 现有货   | 2015 | 现有货   | null       | Solr
 6971976e-2528-4956-94a8-345deefe5796 | 现货     | 2015 | 现货     | null       | Solr
 780b5163-ca6b-40bf-a523-af2c075ef7df | 在有货   | 2015 | 在有货   | null       | Solr

This should return one row, but I am getting other records as well. When I try to retrieve it with wildcards instead, I get 0 rows:

cqlsh:nandan> SELECT * from books where solr_query = 'author:*现有货*';

 id | author | date | isbn | solr_query | title
----+--------+------+------+------------+-------

(0 rows)

cqlsh:nandan> SELECT * from books where solr_query = 'author:*现有货';

 id | author | date | isbn | solr_query | title
----+--------+------+------+------------+-------

(0 rows)

cqlsh:nandan> SELECT * from books where solr_query = 'author:现有货*';

 id | author | date | isbn | solr_query | title
----+--------+------+------+------------+-------

(0 rows)

In some cases I get the correct data, but in other cases I get the wrong data. Please check.

Thanks,
Nandan

On Thu, Jun 15, 2017 at 11:47 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> Back up a bit and tell us why you want to use StrField, because what
> you're trying to do is somewhat confused.
>
> First of all, StrFields are totally unanalyzed. So defining an
> <analyzer> as part of a StrField type definition is totally
> unsupported. I'm a bit surprised that Solr even starts up.
>
> Second, you can't search a StrField unless you search the whole thing
> exactly. That is, if your title field is "My dog has fleas", there
> are only a few ways to match anything in that field
>
> 1> search "My dog has fleas" exactly. Even "my dog has fleas" wouldn't
> match because of the capitalization. "My dog has fleas." would also
> fail because of the period. StrField types are intended for data that
> should be invariant and not tokenized.
>
> 2> prefix search as "My dog*"
>
> 3> pre-and-postfix as "*dog*"
>
> <2> is actually reasonable if you have more than, say, 3 or 4 "real"
> characters before the wildcard.
>
> <3> performs very poorly at any kind of scale.
>
> A search for "dog" would not match. A search for "fleas" wouldn't
> match. You see where this is going.
>
> If those restrictions are OK, just use the already-defined "string" type.
>
> As for the English/Chinese that's actually kind of a tough one.
> Splitting Chinese up into searchable tokens is nothing like breaking
> English up. There are examples in the managed-schema file that have
> field definitions for Chinese, but I know of no way to have a single
> field type share the two different analysis chains. One solution
> people have used is to have a title_ch and title_en field and search
> both. Or search one or the other preferentially if the input is in one
> language or the other.
>
> I strongly advise you use the admin UI>>analysis page to understand
> the effects of tokenization, it's the heart of searching.
>
> Best,
> Erick
>
> On Wed, Jun 14, 2017 at 6:23 PM, @Nandan@
> <nandanpriyadarshi...@gmail.com> wrote:
> > Hi,
> >
> > I am using Apache Solr to do advanced searching with my Big Data.
> >
> > When I create a Solr core, text fields by default come out with the
> > TextField data type and class.
> >
> > Can you please tell me how to change TextField to StrField? My table
> > contains records in English as well as Chinese.
> >
> > <?xml version="1.0" encoding="UTF-8" standalone="no"?>
> > <schema name="autoSolrSchema" version="1.5">
> >   <types>
> >     <fieldType class="org.apache.solr.schema.StrField" name="StrField">
> >       <analyzer>
> >         <tokenizer class="solr.StandardTokenizerFactory"/>
> >         <filter class="solr.LowerCaseFilterFactory"/>
> >       </analyzer>
> >     </fieldType>
> >     <fieldType class="org.apache.solr.schema.UUIDField" name="UUIDField"/>
> >     <fieldType class="org.apache.solr.schema.TrieIntField" name="TrieIntField"/>
> >   </types>
> >   <fields>
> >     <field indexed="true" multiValued="false" name="title" stored="true" type="StrField"/>
> >     <field indexed="true" multiValued="false" name="isbn" stored="true" type="StrField"/>
> >     <field indexed="true" multiValued="false" name="publisher" stored="true" type="StrField"/>
> >     <field indexed="true" multiValued="false" name="author" stored="true" type="StrField"/>
> >     <field docValues="true" indexed="true" multiValued="false" name="id" stored="true" type="UUIDField"/>
> >     <field docValues="true" indexed="true" multiValued="false" name="date" stored="true" type="TrieIntField"/>
> >   </fields>
> >
> > Please guide me to a correct StrField definition.
> >
> > Thanks.
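
For reference, here is a rough sketch of what Erick's two suggestions might look like in the schema: a StrField type with no <analyzer>, plus separate English and Chinese text fields. This is an illustration under assumptions, not a tested configuration. The type names TextEn and TextCjk and the field names title_en and title_ch are hypothetical (only title_en and title_ch are even hinted at in the thread), the Chinese chain is modeled on the stock text_cjk example that ships with Solr, and in a DataStax setup the new fields would presumably need matching table columns, which is not verified here. Only the fragments that change are shown.

```xml
<!-- Sketch only. TextEn, TextCjk, title_en and title_ch are hypothetical names. -->
<types>
  <!-- StrField is unanalyzed, so it carries no <analyzer> element. -->
  <fieldType class="org.apache.solr.schema.StrField" name="StrField"/>

  <!-- English text: split on word boundaries, then lowercase. -->
  <fieldType class="org.apache.solr.schema.TextField" name="TextEn">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

  <!-- Chinese text: chain modeled on Solr's stock text_cjk example (bigrams). -->
  <fieldType class="org.apache.solr.schema.TextField" name="TextCjk">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.CJKWidthFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.CJKBigramFilterFactory"/>
    </analyzer>
  </fieldType>
</types>

<fields>
  <!-- Exact-match data such as an ISBN suits StrField. -->
  <field indexed="true" multiValued="false" name="isbn" stored="true" type="StrField"/>
  <!-- One field per language, searched together or preferentially. -->
  <field indexed="true" multiValued="false" name="title_en" stored="true" type="TextEn"/>
  <field indexed="true" multiValued="false" name="title_ch" stored="true" type="TextCjk"/>
</fields>
```

A query would then name one field or both, for example title_ch:现有货 OR title_en:solr. Whether that produces the matches you expect is best checked on the admin UI analysis page Erick mentions.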