hi guys,

thank you very much for the help. sorry for the late reply.


1. a "commit" didn't help.
    after the commit, the 'numFound' of the "*:*" query is still the same.
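
for reference, the check in point 1 can be sketched roughly as below. this is only a minimal sketch using Solr's standard HTTP endpoints (/update?commit=true and /select); the base URL and the collection name "mycollection" are placeholders, not the real setup, and the helper names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def commit_url(base_url, collection):
    """URL that asks Solr to issue a hard commit on the collection."""
    return f"{base_url}/{collection}/update?" + urlencode({"commit": "true"})


def count_url(base_url, collection):
    """URL for a match-all query that returns no rows, only the count."""
    params = {"q": "*:*", "rows": 0, "wt": "json"}
    return f"{base_url}/{collection}/select?" + urlencode(params)


def parse_num_found(body):
    """Extract numFound from a JSON select response body."""
    return json.loads(body)["response"]["numFound"]


def commit_and_count(base_url, collection):
    """Commit, then return numFound for *:* (needs a reachable Solr)."""
    urlopen(commit_url(base_url, collection)).read()
    with urlopen(count_url(base_url, collection)) as resp:
        return parse_num_found(resp.read().decode("utf-8"))


# Hypothetical usage, assuming Solr listens on localhost:8983:
# print(commit_and_count("http://localhost:8983/solr", "mycollection"))
```

if the count printed here differs from the number of rows the import reports, the commit itself is not the missing piece.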


2. the "id" field in every doc is generated by solr using UUID. i have no
    idea how to check whether there is a duplicate, but i'm assuming
    there shouldn't be, unless solr cloud has some known bug when
    using UUID in a distributed environment.


the environment is

solr cloud with:
3 linux boxes, running zookeeper 3.4.6 + solr 5.2.1, oracle JDK 1.7.80
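
one check that might narrow this down in a cloud setup is to ask each replica for its own local count (Solr's distrib=false parameter turns off the distributed fan-out) and compare replicas of the same shard. a rough sketch, assuming hypothetical core URLs and shard names; the counts in the usage comment are invented for illustration, not measured.

```python
from urllib.parse import urlencode


def local_count_url(core_url):
    """Match-all count query answered by this core only (no fan-out)."""
    params = {"q": "*:*", "rows": 0, "wt": "json", "distrib": "false"}
    return core_url + "/select?" + urlencode(params)


def shards_out_of_sync(counts_by_shard):
    """Return the shards whose replicas report different local counts.

    counts_by_shard maps shard name -> {replica name: numFound}.
    """
    return sorted(
        shard
        for shard, counts in counts_by_shard.items()
        if len(set(counts.values())) > 1
    )


# Hypothetical usage with made-up numbers:
# shards_out_of_sync({
#     "shard1": {"box1": 100, "box2": 100},
#     "shard2": {"box1": 50, "box3": 48},
# })
```

if replicas of one shard disagree, the total for "*:*" can change depending on which replica answers, which would fit a numFound that moves between runs.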


any ideas?


thank you very much.






At 2016-04-05 12:09:14, "John Bickerstaff" <j...@johnbickerstaff.com> wrote:
>Both of us implied it, but to be completely clear - if you have a duplicate
>ID in your data set, SOLR will throw away previous documents with that ID
>and index the new one.  That's fine if your duplicates really are
>duplicates - it's not OK if there's a problem in the data set and the
>duplicate IDs are on documents that are actually unique.
>
>On Mon, Apr 4, 2016 at 9:51 PM, John Bickerstaff <j...@johnbickerstaff.com>
>wrote:
>
>> Sweet - that's a good point - I ran into that too - I had not run the
>> commit for the last "batch" (I was using SolrJ) and so numbers didn't match
>> until I did.
>>
>> On Mon, Apr 4, 2016 at 9:50 PM, Binoy Dalal <binoydala...@gmail.com>
>> wrote:
>>
>>> 1) Are you sure you don't have duplicates?
>>> 2) All of your records might have been indexed but a new searcher may not
>>> have opened on the updated index yet. Try issuing a commit and see if that
>>> works.
>>>
>>> On Tue, 5 Apr 2016, 08:56 cqlangyi, <cqlan...@163.com> wrote:
>>>
>>> > hi there,
>>> >
>>> >
>>> > i have solr 5.2.1. when i do a data import, after the job is done,
>>> > it shows 165,191 rows processed successfully.
>>> >
>>> >
>>> > but when i query with *:*, "numFound" shows only 163,349 docs in the
>>> > index.
>>> >
>>> >
>>> > when i tried it again, it again showed 165,191 rows processed
>>> > successfully, but the *:* query result is now 162,390.
>>> >
>>> >
>>> > no errors in any log,
>>> >
>>> >
>>> > any idea?
>>> >
>>> >
>>> > thank you very much!
>>> >
>>> >
>>> > cq
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > At 2016-04-05 09:19:48, "Chris Hostetter" <hossman_luc...@fucit.org>
>>> > wrote:
>>> > >
>>> > >: I am not sure how to use "Sort By Function" for Case.
>>> > >:
>>> > >:
>>> |10#40|14#19|33#17|27#6|15#6|19#5|7#2|6#1|29#1|5#1|30#1|28#1|12#0|20#0|
>>> > >:
>>> > >: Can you tell how to fetch 40 when input is 10.
>>> > >
>>> > >Something like...
>>> > >
>>> >
>>> >
>>> >if(termfreq(f,10),40,if(termfreq(f,14),19,if(termfreq(f,33),17,....)))))))))))
>>> > >
>>> > >But i suspect there may be a much better way to achieve your ultimate
>>> goal
>>> > >if you tell us what it is.  what do these fields represent? what makes
>>> > >these numeric values significant? do you know which values are
>>> significant
>>> > >when indexing, or do they vary for every query?
>>> > >
>>> > >https://people.apache.org/~hossman/#xyproblem
>>> > >XY Problem
>>> > >
>>> > >Your question appears to be an "XY Problem" ... that is: you are
>>> dealing
>>> > >with "X", you are assuming "Y" will help you, and you are asking about
>>> "Y"
>>> > >without giving more details about the "X" so that we can understand the
>>> > >full issue.  Perhaps the best solution doesn't involve "Y" at all?
>>> > >See Also: http://www.perlmonks.org/index.pl?node_id=542341
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >-Hoss
>>> > >http://www.lucidworks.com/
>>> >
>>> --
>>> Regards,
>>> Binoy Dalal
>>>
>>
>>
