Thanks, Erick. We've added the '_version_' field and we'll see tomorrow whether that makes a difference. We have also downloaded RC1 and will try that next week.

Regards,

David Q

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: 05 October 2012 15:40
To: solr-user@lucene.apache.org
Subject: Re: SOLR 4.0 Beta documents being duplicated

How are you indexing? There was a problem with indexing from SolrJ if you indexed documents in batches, server.add(doclist), that's fixed in 4.0 RC#. The work-around is to add docs singly, server.add(doc).

Second thing. Bad Things Happen if you don't have a _version_ field in your schema.xml. Solr 4.0 RC# isn't happy on startup if this field is missing...

Personally, I think you'd be better off using one of the release candidates. Robert cut one here:
http://people.apache.org/~rmuir/staging_area/lucene-solr-4.0RC1-rev1391144/solr/

There will be an RC2 sometime, a couple of problems have been found, but using RC1 should minimize any update to the official 4.0 plus have a lot of improvements over BETA...

Best
Erick

On Fri, Oct 5, 2012 at 10:25 AM, David Quarterman <da...@corexe.com> wrote:
> Hi,
>
> We've been using V4.x of SOLR since last November without too much
> trouble. Our MySQL database is refreshed daily and a full import is
> run automatically after the refresh and generally produces around
> 86,000 products, obviously on unique doc_id's.
>
> So, we upgraded to 4.0 Beta a few days ago, with only mild difficulty,
> reindexed and all was fine. Except after the next data refresh and
> full-import, we had duplicate products appearing on different unique
> doc_ids. Not all products are being duplicated, just random ones.
> We've just deleted the data directory and reindexed and the product
> count has dropped from 116,711 to 86,543. There'll be another
> refresh/import early tomorrow morning and I fear we'll have more duplicates.
>
> The call to the import now contains clean=true, commit=true and
> optimize=true but it seems to make no difference.
>
> Anyone have any ideas?
>
> Regards,
>
> David Q
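
A minimal sketch of the work-around mentioned in the thread, adding documents one at a time instead of as a single batch. It deliberately does not use SolrJ itself: DocSink is a hypothetical stand-in for the server.add(doc) call, and the doc_id and name values are made-up sample data, so treat it as an illustration of the loop rather than working indexing code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SingleDocIndexing {

    /** Hypothetical stand-in for the single-document call, server.add(doc). */
    interface DocSink {
        void add(Map<String, Object> doc) throws Exception;
    }

    /**
     * Sends each document in its own add call rather than passing the
     * whole list in one batched call. Returns the number of calls made.
     */
    static int addOneByOne(DocSink sink, List<Map<String, Object>> docs) throws Exception {
        int sent = 0;
        for (Map<String, Object> doc : docs) {
            sink.add(doc); // one call per document
            sent++;
        }
        return sent;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample documents keyed on a unique doc_id.
        List<Map<String, Object>> docs = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("doc_id", i);
            doc.put("name", "product " + i);
            docs.add(doc);
        }

        // The sink here just records what it receives; a real one would
        // forward each document to the Solr server.
        List<Map<String, Object>> received = new ArrayList<>();
        int sent = addOneByOne(received::add, docs);
        System.out.println(sent + " documents sent in " + received.size() + " separate calls");
    }
}
```

In a real client the body of DocSink.add would be the SolrJ call itself, followed by a single commit once the loop has finished.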