Hi,

It seems you are trying to reindex from one server to another.

Be aware that it could be easier for you to simply copy the whole index folder 
over to your 4.6.1 server and start Solr, as it will be able to read your 3.x 
index. That is, unless you also want to make major upgrades to your schema or 
update processors, in which case you'll need a re-index anyway.

If you believe you really need a re-index, then please try to batch index 
without triggering commits every few seconds - that is really heavy on the 
system and completely unnecessary. You won't get the benefit of soft commits if 
you're not running SolrCloud, so there is no need to configure autoSoftCommit.

I would change your <autoCommit> to maxDocs=10000 and maxTime=120000 (every 
2 min).
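In solrconfig.xml that would look something like the snippet below. Setting 
openSearcher=false is an extra suggestion of mine for bulk loads (it avoids 
reopening a searcher on every hard commit); keep it true if you want documents 
to become visible as they are committed.

```xml
<!-- Hard commit every 10000 docs or every 2 minutes, whichever comes first -->
<autoCommit>
  <maxDocs>10000</maxDocs>
  <maxTime>120000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<!-- And simply remove the <autoSoftCommit> section while bulk indexing -->
```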
Further, please index without the 1s commitWithin, i.e. instead of
>                server.add(iDoc, 1000);
use
>                server.add(iDoc);

This will give the server room to breathe instead of constantly committing and 
reopening searchers.

Finally, it's probably not a good idea to use recursion here. You really don't 
need it, and it fills up your stack. You can instead refactor the method into a 
loop that does the whole indexing. As a hint, it is generally better to ask for 
ALL documents in one go and stream through to the end than to issue new queries 
with ever-increasing offsets - high offsets/start values can be time consuming, 
especially with multiple shards. If you increase the timeout enough you should 
be able to retrieve all documents in one request!
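To illustrate the loop structure, here is a self-contained sketch of the 
control flow only: fetchPage is a stub standing in for server.query(...) with 
setStart/setRows, and the counter stands in for server.add(iDoc). The names 
are illustrative, not SolrJ API. (Note also that the code you posted never 
calls setStart or setRows on the query, so every iteration gets Solr's default 
of 10 rows back - which would explain why only 10 of each 100 appear to be 
indexed.)

```java
import java.util.ArrayList;
import java.util.List;

public class PagingLoop {
    // Pretend index of 23 documents; in real code total comes from getNumFound()
    static final int TOTAL = 23;

    // Stand-in for server.query(...) with query.setStart(start)/setRows(rows)
    static List<Integer> fetchPage(int start, int rows) {
        List<Integer> page = new ArrayList<>();
        for (int i = start; i < Math.min(start + rows, TOTAL); i++) {
            page.add(i);
        }
        return page;
    }

    // Iterative replacement for the recursive processDocs
    static int reindexAll(int rows) {
        int start = 0;
        int indexed = 0;
        while (true) {
            List<Integer> page = fetchPage(start, rows);
            if (page.isEmpty()) {
                break;              // no more documents: done
            }
            indexed += page.size(); // server.add(iDoc) per doc would go here
            start += rows;          // advance the offset instead of recursing
        }
        return indexed;
    }

    public static void main(String[] args) {
        System.out.println("Indexed " + reindexAll(10) + "/" + TOTAL);
    }
}
```

The loop carries the same state (start, rows) that the recursive call passed 
along, so the stack stays flat no matter how many batches there are; a single 
server.commit() after the loop then makes everything visible at once.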

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

24. mars 2014 kl. 22:36 skrev Ravi Solr <ravis...@gmail.com>:

> Hello,
>        We are trying to reindex as part of our move from 3.6.2 to 4.6.1
> and have faced various issues reindexing 1.5 million docs. We don't use
> SolrCloud, it's still a Master/Slave config. For testing this I am using a
> single test server, reading from it and putting back into the same index.
> 
> We send docs in batches of 100 but only 10/100 are getting indexed, is this
> related to the maxBufferedAddsPerServer setting that is hard coded ?? Also
> I tried to play with autocommit and softcommit settings but in vain.
> 
>    <autoCommit>
>       <maxDocs>5</maxDocs>
>       <maxTime>5000</maxTime>
>       <openSearcher>true</openSearcher>
>    </autoCommit>
> 
>    <autoSoftCommit>
>        <maxTime>1000</maxTime>
>    </autoSoftCommit>
> 
> I use these on the test system just to check if docs are being indexed, but
> even with a batch of 5 my solrj client code runs faster than indexing
> causing some docs to not get indexed. The function doing the indexing is a
> recursive method call (shown below) which fails after some time with a stack
> overflow (I did not have this issue in 3.6.2 with the same code)
> 
>    private static void processDocs(HttpSolrServer server, Integer start,
> Integer rows) throws Exception {
>        SolrQuery query = new SolrQuery();
>        query.setQuery("*:*");
>        query.addFilterQuery("-allfields:[* TO *]");
>        QueryResponse resp = server.query(query);
>        SolrDocumentList list =  resp.getResults();
>        Long total = list.getNumFound();
> 
>        if(list != null && !list.isEmpty()) {
>            for(SolrDocument doc : list) {
>                SolrInputDocument iDoc =
> ClientUtils.toSolrInputDocument(doc);
>                //To index full doc again
>                iDoc.removeField("_version_");
>                server.add(iDoc, 1000);
>            }
> 
>            System.out.println("Indexed " + (start+rows) + "/" + total);
>            if (total >= (start + rows)) {
>                processDocs(server, (start + rows), rows);
>            }
>        }
>    }
> 
> I also tried turning on the updateLog but that was filling up so fast to
> the point where it is useless.
> 
> How do we do bulk updates in a Solr 4.x environment? Is there any setting
> that I am missing?
> 
> Thanks
> 
> Ravi Kiran Bhaskar
> Technical Architect
> The Washington Post
