This is using solr.TrieDateField; it is the field type "date" from the example schema in Solr 4.6.1:

    <fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0" />

After further testing I was only able to reproduce this in a sharded and replicated environment (numShards=3, replicationFactor=2). I think I have narrowed down the issue, and at this point it may be expected behavior.

I took a query like

    q=create_date:[2014-05-19T00:00:00Z TO 2014-05-19T23:59:59Z]&sort=create_date DESC&start=0&rows=10000

which should return all the documents for yesterday sorted by create date, then added distrib=false and ran it against shard1_replica1 and shard1_replica2. I diffed the two result sets, and the diff showed 5 places where two consecutive rows in one replica were reversed in the other replica. In all 5 cases the flip-flopped rows had exactly the same create_date value, which happened to have only minute-level precision. As an example:

    shard1_replica1:
    ...
    docX, 2014-05-19T20:15:00Z
    docY, 2014-05-19T20:15:00Z
    ...

    shard1_replica2:
    ...
    docY, 2014-05-19T20:15:00Z
    docX, 2014-05-19T20:15:00Z
    ...

So I think that when I was paging through the results, if the query for page N was handled by replica1 and the query for page N+1 by replica2, and the page boundary happened to fall where the reversed rows were, this would produce the behavior I was seeing: the last row of the previous page was also the first row of the next page.

I guess the obvious solution is to ensure the date field is always more granular than minutes, or to add another field to the sort order to consistently break ties.

On Mon, May 19, 2014 at 4:19 PM, Chris Hostetter <hossman_luc...@fucit.org> wrote:

>
> : Using Solr 4.6.1 and in my schema I have a date field storing the time a
> : document was added to Solr.
>
> what *exactly* does your schema look like? are you using "solr.DateField"
> or "solr.TrieDateField" ? what field options do you have specified?
>
> : I have a utility program which:
> : - queries for all of the documents in the previous day sorted by create date
> : - pages through the results keeping track of the unique document ids
> : - compare the total number of unique doc ids to the numFound to see if
> : they match
>
> what *exactly* do your queries look like? show us some examples please
> (URL & results). Are you using distributed searching across multiple
> nodes, or a single node? do you have concurrent updates going on during
> your test?
>
> : It is not consistent between tests, the number of occurrences changes and
> : the locations of the occurrences can change as well. The larger the result
> : set, and smaller the page size, the more frequent the occurrences are.
>
> if you bring up a test instance of Solr using your current configs, can
> you reproduce (even occasionally) with some synthetic data you can share
> with us? If so please provide your full configs & sample data (ie: create
> a Jira & attach all the necessary files in a ZIP)
>
>
> -Hoss
> http://www.lucidworks.com/
>
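
To illustrate the tie-breaking idea from my reply above, here is a minimal sketch in plain Python (not Solr code). It simulates two replicas that hold the same documents but return tied create_date values in a different internal order, then pages through the results with alternating replicas. The document ids, the internal orderings, and the helper names are all made up for the illustration. In Solr terms, and assuming the uniqueKey field is called `id`, the second sorter corresponds to something like sort=create_date desc,id asc.

```python
# Each doc is (id, create_date). ISO-8601 UTC strings in one fixed format
# compare lexicographically in chronological order.
# Hypothetical data: same docs on both replicas, but docX/docY are stored
# in a different internal order and share the same create_date.
REPLICA1 = [
    ("docA", "2014-05-19T20:16:00Z"),
    ("docX", "2014-05-19T20:15:00Z"),
    ("docY", "2014-05-19T20:15:00Z"),
    ("docB", "2014-05-19T20:14:00Z"),
]
REPLICA2 = [
    ("docA", "2014-05-19T20:16:00Z"),
    ("docY", "2014-05-19T20:15:00Z"),
    ("docX", "2014-05-19T20:15:00Z"),
    ("docB", "2014-05-19T20:14:00Z"),
]


def sort_by_date_only(docs):
    # Stable sort: ties keep each replica's internal order,
    # so the two replicas can disagree on tied rows.
    return sorted(docs, key=lambda d: d[1], reverse=True)


def sort_with_tiebreak(docs):
    # Two stable passes: id ascending first, then date descending.
    # Ties on date now fall back to id, which is the same on every replica.
    by_id = sorted(docs, key=lambda d: d[0])
    return sorted(by_id, key=lambda d: d[1], reverse=True)


def paged_ids(replicas, sorter, rows):
    # Fetch consecutive pages, serving each page from a different replica
    # in turn, and collect the document ids in the order they were seen.
    total = len(replicas[0])
    ids = []
    for page_no, start in enumerate(range(0, total, rows)):
        replica = replicas[page_no % len(replicas)]
        page = sorter(replica)[start:start + rows]
        ids.extend(doc_id for doc_id, _ in page)
    return ids


if __name__ == "__main__":
    print(paged_ids([REPLICA1, REPLICA2], sort_by_date_only, rows=2))
    print(paged_ids([REPLICA1, REPLICA2], sort_with_tiebreak, rows=2))
```

With the date-only sort, the page boundary falls between the two tied rows, so docX is seen twice and docY is never seen. With the tie-breaker, both replicas agree on the order and every id appears exactly once.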