hi guys,

thank you very much for the help, and sorry for the late reply.

1. "commit" didn't help. after the commit, the "numFound" for the "*:*" query is still the same.
2. the "id" field in every doc is generated by solr using UUID. i have no idea how to check whether there is a duplicate, but i assume there shouldn't be one, unless solr cloud has some known bug when using UUID in a distributed environment.

the environment is solr cloud on 3 linux boxes, with zookeeper 3.4.6 + solr 5.2.1 and oracle JDK 1.7.80.

any ideas?

thank you very much.

At 2016-04-05 12:09:14, "John Bickerstaff" <j...@johnbickerstaff.com> wrote:
>Both of us implied it, but to be completely clear - if you have a duplicate
>ID in your data set, SOLR will throw away previous documents with that ID
>and index the new one. That's fine if your duplicates really are
>duplicates - it's not OK if there's a problem in the data set and the
>duplicates ID's are on documents that are actually unique.
>
>On Mon, Apr 4, 2016 at 9:51 PM, John Bickerstaff <j...@johnbickerstaff.com>
>wrote:
>
>> Sweet - that's a good point - I ran into that too - I had not run the
>> commit for the last "batch" (I was using SolrJ) and so numbers didn't match
>> until I did.
>>
>> On Mon, Apr 4, 2016 at 9:50 PM, Binoy Dalal <binoydala...@gmail.com>
>> wrote:
>>
>>> 1) Are you sure you don't have duplicates?
>>> 2) All of your records might have been indexed but a new searcher may not
>>> have opened on the updated index yet. Try issuing a commit and see if that
>>> works.
>>>
>>> On Tue, 5 Apr 2016, 08:56 cqlangyi, <cqlan...@163.com> wrote:
>>>
>>> > hi there,
>>> >
>>> > i have an solr 5.2.1, when i do data import, after the job is done, it's
>>> > shown 165,191 rows processed successfully.
>>> >
>>> > but when i query with *:*, the "numFound" shown only 163,349 docs in index.
>>> >
>>> > when i tried to do it again, it's shown 165,191 rows processed
>>> > successfully. but the *:* query result now is 162,390.
>>> >
>>> > no errors in any log,
>>> >
>>> > any idea?
>>> >
>>> > thank you very much!
>>> >
>>> > cq
>>> >
>>> > At 2016-04-05 09:19:48, "Chris Hostetter" <hossman_luc...@fucit.org>
>>> > wrote:
>>> > >
>>> > >: I am not sure how to use "Sort By Function" for Case.
>>> > >:
>>> > >: |10#40|14#19|33#17|27#6|15#6|19#5|7#2|6#1|29#1|5#1|30#1|28#1|12#0|20#0|
>>> > >:
>>> > >: Can you tell how to fetch 40 when input is 10.
>>> > >
>>> > >Something like...
>>> > >
>>> > >if(termfreq(f,10),40,if(termfreq(f,14),19,if(termfreq(f,33),17,....)))))))))))
>>> > >
>>> > >But i suspect there may be a much better way to achieve your ultimate goal
>>> > >if you tell us what it is. what do these fields represent? what makes
>>> > >these numeric values significant? do you know which values are significant
>>> > >when indexing, or do they vary for every query?
>>> > >
>>> > >https://people.apache.org/~hossman/#xyproblem
>>> > >XY Problem
>>> > >
>>> > >Your question appears to be an "XY Problem" ... that is: you are dealing
>>> > >with "X", you are assuming "Y" will help you, and you are asking about "Y"
>>> > >without giving more details about the "X" so that we can understand the
>>> > >full issue. Perhaps the best solution doesn't involve "Y" at all?
>>> > >See Also: http://www.perlmonks.org/index.pl?node_id=542341
>>> > >
>>> > >-Hoss
>>> > >http://www.lucidworks.com/
>>> >
>>> --
>>> Regards,
>>> Binoy Dalal
>>>
>>
>>
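
A minimal sketch of the duplicate-id check raised in this thread, assuming the "id" values have first been exported from the index into a plain list (for example from a query that returns only the "id" field). The function name find_duplicate_ids and the sample ids are hypothetical and only illustrate the counting step; this is not a Solr API.

```python
from collections import Counter


def find_duplicate_ids(ids):
    """Return a dict mapping each id that occurs more than once to its count."""
    counts = Counter(ids)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}


# Hypothetical sample ids, standing in for an export of the "id" field.
sample_ids = ["a1", "b2", "a1", "c3", "b2", "a1"]
duplicates = find_duplicate_ids(sample_ids)
print(duplicates)
```

An empty result would suggest that duplicate ids are not what is causing the gap between rows processed and "numFound", at least for the ids that were exported.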