I am also seeing the following in the log. Is it really committing? Now I
am totally confused about how Solr 4.x indexes. My relevant update config
is shown below:

  <updateHandler class="solr.DirectUpdateHandler2">
    <maxPendingDeletes>1</maxPendingDeletes>
    <autoCommit>
       <maxDocs>100</maxDocs>
       <maxTime>120000</maxTime>
       <openSearcher>false</openSearcher>
    </autoCommit>
  </updateHandler>
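Could it be the openSearcher=false? If I read the reference guide right, a
hard commit with openSearcher=false flushes the index to disk but keeps
serving the old searcher, so queries would not see the new docs. For this
visibility testing the autoCommit block would presumably need to be
(untested sketch, same values otherwise):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
     <maxDocs>100</maxDocs>
     <maxTime>120000</maxTime>
     <openSearcher>true</openSearcher>
  </autoCommit>
</updateHandler>
```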

[#|2014-03-25T13:44:03.765-0400|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=86;_ThreadName=commitScheduler-6-thread-1;|820509
[commitScheduler-6-thread-1] INFO  org.apache.solr.update.UpdateHandler  -
start
commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
|#]

[#|2014-03-25T13:44:03.766-0400|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=83;_ThreadName=http-thread-pool-8080(4);|820510
[http-thread-pool-8080(4)] INFO
org.apache.solr.update.processor.LogUpdateProcessor  - [sitesearchcore]
webapp=/solr-admin path=/update params={wt=javabin&version=2}
{add=[09f693e6-9a6f-11e3-9900-dd917233cf9c]} 0 13
|#]

[#|2014-03-25T13:44:03.898-0400|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=86;_ThreadName=commitScheduler-6-thread-1;|820642
[commitScheduler-6-thread-1] INFO  org.apache.solr.core.SolrCore  -
SolrDeletionPolicy.onCommit: commits: num=3

commit{dir=/data/solr/core/sitesearch-data/index,segFN=segments_9y68,generation=464192}

commit{dir=/data/solr/core/sitesearch-data/index,segFN=segments_9yjf,generation=464667}

commit{dir=/data/solr/core/sitesearch-data/index,segFN=segments_9yjg,generation=464668}
|#]

[#|2014-03-25T13:44:03.898-0400|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=86;_ThreadName=commitScheduler-6-thread-1;|820642
[commitScheduler-6-thread-1] INFO  org.apache.solr.core.SolrCore  - newest
commit generation = 464668
|#]

[#|2014-03-25T13:44:03.908-0400|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=86;_ThreadName=commitScheduler-6-thread-1;|820652
[commitScheduler-6-thread-1] INFO
org.apache.solr.search.SolrIndexSearcher  - Opening
Searcher@1e2ca86e[sitesearchcore]
realtime
|#]

[#|2014-03-25T13:44:03.909-0400|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=86;_ThreadName=commitScheduler-6-thread-1;|820653
[commitScheduler-6-thread-1] INFO  org.apache.solr.update.UpdateHandler  -
end_commit_flush
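Reading the sequence above (start commit, SolrDeletionPolicy.onCommit with a
new generation, Opening Searcher ... realtime, end_commit_flush), it looks
like the hard commit itself is going through, and only a realtime searcher is
opened because of openSearcher=false. If that is right, posting a plain
commit message to the /update handler shown in the log (a commit opens a new
searcher by default) should make the committed docs show up in queries:

```xml
<commit waitSearcher="true"/>
```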


Thanks

Ravi Kiran Bhaskar


On Tue, Mar 25, 2014 at 1:10 PM, Ravi Solr <ravis...@gmail.com> wrote:

> Thank you very much for responding, Mr. Høydahl. I removed the recursion,
> which eliminated the stack overflow exception. However, I am still
> encountering my main problem of docs not getting indexed in Solr 4.x, as
> mentioned in my original email. The reason I am reindexing is that
> EnglishPorterFilterFactory has been removed in Solr 4.x, and I also wanted
> to add another copyField of all field values into the destination "allfields".
>
> As per your suggestion I removed softCommit and set autoCommit to
> maxDocs=100 and maxTime=120000. I was printing out each indexing call, and
> you can clearly see that it still indexes only about 10 docs at a time
> (test code and results shown below). My code finished fully, and just for
> good measure I committed manually after 10 minutes; still, when I query, I
> see only 13513 docs indexed.
>
> There must be something else I am missing.
>
> <response>
>      <lst name="responseHeader">
>       <int name="status">0</int>
>       <int name="QTime">1</int>
>       <lst name="params">
>            <str name="q">allfields:[* TO *]</str>
>             <str name="wt">xml</str>
>             <str name="rows">0</str>
>       </lst>
>       </lst>
>       <result name="response" numFound="13513" start="0"/></response>
>
> TEST INDEXER CODE
>  -------------------------------
>         Long total = null;
>         Integer start = 0;
>         Integer rows = 100;
>         while(total == null || total >= (start+rows)) {
>
>             SolrQuery query = new SolrQuery();
>             query.setQuery("*:*");
>             query.setSort("displaydatetime", ORDER.desc);
>
>             query.addFilterQuery("-allfields:[* TO *]");
>             QueryResponse resp = server.query(query);
>             SolrDocumentList list =  resp.getResults();
>             total = list.getNumFound();
>
>             if(list != null && !list.isEmpty()) {
>                 for(SolrDocument doc : list) {
>                     SolrInputDocument iDoc = ClientUtils.toSolrInputDocument(doc);
>                     //To index full doc again
>                     iDoc.removeField("_version_");
>                     server.add(iDoc);
>                 }
>
>                 System.out.println("Indexed " + (start+rows) + "/" + total);
>                 start = (start+rows);
>             }
>         }
>
>        System.out.println("COMPLETELY DONE");
>
> System.out output
> -------------------------
> Indexed 1252100/1256575
> Indexed 1252200/1256575
> Indexed 1252300/1256575
> Indexed 1252400/1256575
> Indexed 1252500/1256575
> Indexed 1252600/1256575
> Indexed 1252700/1256575
> Indexed 1252800/1256575
> Indexed 1252900/1256575
> Indexed 1253000/1256575
> Indexed 1253100/1256566
> Indexed 1253200/1256566
> Indexed 1253300/1256566
> Indexed 1253400/1256566
> Indexed 1253500/1256566
> Indexed 1253600/1256566
> Indexed 1253700/1256566
> Indexed 1253800/1256566
> Indexed 1253900/1256566
> Indexed 1254000/1256566
> Indexed 1254100/1256566
> Indexed 1254200/1256566
> Indexed 1254300/1256566
> Indexed 1254400/1256566
> Indexed 1254500/1256566
> Indexed 1254600/1256566
> Indexed 1254700/1256566
> Indexed 1254800/1256566
> Indexed 1254900/1256566
> Indexed 1255000/1256566
> Indexed 1255100/1256566
> Indexed 1255200/1256566
> Indexed 1255300/1256566
> Indexed 1255400/1256566
> Indexed 1255500/1256566
> Indexed 1255600/1256566
> Indexed 1255700/1256557
> Indexed 1255800/1256557
> Indexed 1255900/1256557
> Indexed 1256000/1256557
> Indexed 1256100/1256557
> Indexed 1256200/1256557
> Indexed 1256300/1256557
> Indexed 1256400/1256557
> Indexed 1256500/1256557
> COMPLETELY DONE
>
>
> Thanks,
> Ravi Kiran Bhaskar
>
>
>
> On Tue, Mar 25, 2014 at 7:13 AM, Jan Høydahl <jan....@cominvent.com> wrote:
>
>> Hi,
>>
>> It seems you are trying to reindex from one server to the other.
>>
>> Be aware that it could be easier for you to simply copy the whole index
>> folder over to your 4.6.1 server and start Solr as it will be able to read
>> your 3.x index. This is unless you also want to do major upgrades of your
>> schema or update processors so that you'll need a re-index anyway.
>>
>> If you believe you really need a re-index, then please try to batch index
>> without triggering commits every few seconds - this is really heavy on the
>> system and completely unnecessary. You won't get the benefit of SoftCommit
>> if you're not running SolrCloud, so no need to configure that.
>>
>> I would change your <autoCommit> into maxDocs=10000 and maxTime=120000
>> (every 2min).
>> Further please index without 1s commitWithin, i.e. instead of
>> >                server.add(iDoc, 1000);
>> use
>> >                server.add(iDoc);
>>
>> This will make sure the server gets room to breathe and is not constantly
>> generating new indices.
>>
>> Finally, it's probably not a good idea to use recursion here. You really
>> don't need it, and it fills up your stack. You can instead refactor the
>> method to do the whole indexing in a loop. A hint: it is generally better
>> to ask for ALL documents in one go and stream to the end, rather than
>> increasing offsets with new queries all the time, because high
>> offsets/start values can be time consuming, especially with multiple
>> shards. If you increase the timeout enough, you should be able to retrieve
>> all documents in one go!
>>
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>>
>> 24. mars 2014 kl. 22:36 skrev Ravi Solr <ravis...@gmail.com>:
>>
>> > Hello,
>> >        We are trying to reindex as part of our move from 3.6.2 to 4.6.1
>> > and have faced various issues reindexing 1.5 million docs. We don't use
>> > SolrCloud; it's still a Master/Slave config. For testing this I am using
>> > a single test server, reading from it and putting docs back into the
>> > same index.
>> >
>> > We send docs in batches of 100, but only 10/100 are getting indexed. Is
>> > this related to the hard-coded maxBufferedAddsPerServer setting? I also
>> > tried to play with the autoCommit and softCommit settings, but in vain.
>> >
>> >    <autoCommit>
>> >       <maxDocs>5</maxDocs>
>> >       <maxTime>5000</maxTime>
>> >       <openSearcher>true</openSearcher>
>> >    </autoCommit>
>> >
>> >    <autoSoftCommit>
>> >        <maxTime>1000</maxTime>
>> >    </autoSoftCommit>
>> >
>> > I use these on the test system just to check whether docs are being
>> > indexed, but even with a batch of 5 my SolrJ client code runs faster
>> > than the indexing, causing some docs not to get indexed. The indexing
>> > function is a recursive method (shown below) that fails after some time
>> > with a stack overflow (I did not have this issue with 3.6.2 with the
>> > same code):
>> >
>> >    private static void processDocs(HttpSolrServer server, Integer start,
>> > Integer rows) throws Exception {
>> >        SolrQuery query = new SolrQuery();
>> >        query.setQuery("*:*");
>> >        query.addFilterQuery("-allfields:[* TO *]");
>> >        QueryResponse resp = server.query(query);
>> >        SolrDocumentList list =  resp.getResults();
>> >        Long total = list.getNumFound();
>> >
>> >        if(list != null && !list.isEmpty()) {
>> >            for(SolrDocument doc : list) {
>> >                SolrInputDocument iDoc = ClientUtils.toSolrInputDocument(doc);
>> >                //To index full doc again
>> >                iDoc.removeField("_version_");
>> >                server.add(iDoc, 1000);
>> >            }
>> >
>> >            System.out.println("Indexed " + (start+rows) + "/" + total);
>> >            if (total >= (start + rows)) {
>> >                processDocs(server, (start + rows), rows);
>> >            }
>> >        }
>> >    }
>> >
>> > I also tried turning on the updateLog, but it filled up so fast that it
>> > was useless.
>> >
>> > How do we do bulk updates in a Solr 4.x environment? Is there a setting
>> > that I am missing?
>> >
>> > Thanks
>> >
>> > Ravi Kiran Bhaskar
>> > Technical Architect
>> > The Washington Post
>>
>>
>
