Hello all,

I’m running a large streaming expression and feeding its result into an update() 
expression:

 update(targetCollection, ...long running stream here..., 

I’ve sent the exact same query multiple times: sometimes it works and indexes 
some results before throwing an exception; other times it fails with an 
exception after about 2 minutes.

The response looks like this:
"EXCEPTION":"java.util.concurrent.ExecutionException: java.io.IOException: 
params distrib=false&numWorkers=4.... and my long stream expression

Server log (short):
[c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall 
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout 
expired: 120000/120000 ms
o.a.s.s.HttpSolrCall null:java.io.IOException: 
java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms

I tried increasing the Jetty idle timeout on the node that hosts my target 
collection to about an hour, but it had no effect.
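For reference, this is roughly how I raised it (assuming the stock jetty-http.xml, which reads the idle timeout from the solr.jetty.http.idleTimeout system property; the one-hour value is just what I tried):

```shell
# In solr.in.sh on the target node: override Jetty's default 120000 ms
# idle timeout (picked up by jetty-http.xml via solr.jetty.http.idleTimeout).
SOLR_OPTS="$SOLR_OPTS -Dsolr.jetty.http.idleTimeout=3600000"
```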


Server logs (long)
ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1] 
o.a.s.s.HttpSolrCall null:java.io.IOException: 
java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
solr-01    |    at 
org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
solr-01    |    at 
org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
solr-01    |    at 
org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
solr-01    |    at 
org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
solr-01    |    at 
java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
solr-01    |    at 
java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
solr-01    |    at 
java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
solr-01    |    at 
java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
solr-01    |    at 
java.base/java.io.OutputStreamWriter.write(OutputStreamWriter.java:211)
solr-01    |    at 
org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
solr-01    |    at 
org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
solr-01    |    at 
org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
solr-01    |    at 
org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
solr-01    |    at 
org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
solr-01    |    at 
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
solr-01    |    at 
org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
solr-01    |    at 
org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
solr-01    |    at 
org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
solr-01    |    at 
org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
solr-01    |    at 
org.apache.solr.handler.export.ExportWriter.lambda$addDocsToItemWriter$4(ExportWriter.java:263)
--
solr-01    |    at 
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
solr-01    |    at 
org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
solr-01    |    at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
solr-01    |    at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
solr-01    |    at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
solr-01    |    at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
solr-01    |    at 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
solr-01    |    at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
solr-01    |    at 
org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
solr-01    |    at java.base/java.lang.Thread.run(Thread.java:834)
solr-01    | Caused by: java.util.concurrent.TimeoutException: Idle timeout 
expired: 120000/120000 ms
solr-01    |    at 
org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:171)
solr-01    |    at 
org.eclipse.jetty.io.IdleTimeout.idleCheck(IdleTimeout.java:113)
solr-01    |    at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
solr-01    |    at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
solr-01    |    at 
java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
solr-01    |    at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
solr-01    |    at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
solr-01    |    ... 1 more


My expression, in case it helps. To summarize: it finds the document ids that 
exist in sourceCollection but not in the target collection (DNM), joins the 
result with itself to duplicate some fields (I couldn’t find another way to 
copy one field’s value into two fields), and then sends the result to update(). 
The source collection has about 300M documents, a 24GB heap, and 2 shards with 
2 replicas each.

update(
    DNM,
    batchSize=1000,
    parallel(
        WorkerCollection,
        leftOuterJoin(
            fetch(
                sourceCollection,
                complement(
                    search(
                        sourceCollection,
                        q="*:*",
                        qt="/export",
                        fq="...some filters...",
                        sort="id_str asc",
                        fl="id_str",
                        partitionKeys="id_str"
                    ),
                    search(
                        DNM,
                        q="*:*",
                        qt="/export",
                        sort="id_str asc",
                        fl="id_str",
                        partitionKeys="id_str"
                    ),
                    on="id_str"
                ),
                fl="...my many fields...",
                on="id_str",
                batchSize="1000"
            ),
            select(
                fetch(
                    sourceCollection,
                    complement(
                        search(
                            sourceCollection,
                            q="*:*",
                            qt="/export",
                            fq="...some other filters...",
                            sort="id_str asc",
                            fl="id_str",
                            partitionKeys="id_str"
                        ),
                        search(
                            DNM,
                            q="*:*",
                            qt="/export",
                            sort="id_str asc",
                            fl="id_str",
                            partitionKeys="id_str"
                        ),
                        on="id_str"
                    ),
                    fl="...some other fields...",
                    on="id_str",
                    batchSize="1000"
                ),
                id_str, ..some other fields as...
            ),
            on="id_str"
        ),
        workers="4", sort="id_str asc"
    )
)
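In case the intent isn’t obvious from the expression, here is the set logic it performs, sketched in plain Python with made-up ids and placeholder field names (fieldA/copyA are stand-ins for my real fields):

```python
# complement(): keep ids present in the source stream but absent from the target.
source_ids = ["a", "b", "c", "d"]   # ids matching the fq filters in sourceCollection
target_ids = ["b", "d"]             # ids already indexed in DNM
missing = [i for i in source_ids if i not in set(target_ids)]

# fetch() enriches each id with stored fields; modeled here as dicts.
left = [{"id_str": i, "fieldA": i.upper()} for i in missing]
right = [{"id_str": i, "copyA": i.upper()} for i in missing]

# leftOuterJoin(left, right, on="id_str"): merge right-hand fields into each
# left-hand row, keeping left rows even when no right-hand match exists.
right_by_id = {r["id_str"]: r for r in right}
joined = [{**row, **right_by_id.get(row["id_str"], {})} for row in left]

print(missing)  # ['a', 'c']
print(joined)   # [{'id_str': 'a', 'fieldA': 'A', 'copyA': 'A'},
                #  {'id_str': 'c', 'fieldA': 'C', 'copyA': 'C'}]
```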
