On Wed, 2019-05-15 at 21:37 -0400, Rahul Goswami wrote:
> fq={!graph from=from_field to=to_field returnRoot=false}
> 
> Executing _only_ the graph filter query takes about 64.5 seconds. The
> total number of documents from this filter query is a little over 1
> million.

I tried building an index in Solr 7.6 with 4M simple records with every
4th record having a from_field and a to_field, each containing a random
number from 0-65535 as a String.
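
For anyone who wants to reproduce the setup, test data of that shape could be generated roughly like this. It is only a sketch: the "id" column, the CSV output, and the names generate_records and write_csv are my assumptions for illustration, not the script that built the index above.

```python
import csv
import random


def generate_records(total, every=4, max_value=65535, seed=None):
    """Yield (id, from_field, to_field) rows.

    Only every `every`-th row carries graph fields; the other rows
    leave them empty. The id scheme is an assumption.
    """
    rng = random.Random(seed)
    for i in range(total):
        if i % every == 0:
            # Random node numbers in 0..max_value (inclusive), stored as strings.
            yield (str(i),
                   str(rng.randint(0, max_value)),
                   str(rng.randint(0, max_value)))
        else:
            yield (str(i), "", "")


def write_csv(rows, out):
    """Write a header line followed by the rows to the file-like `out`."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "from_field", "to_field"])
    writer.writerows(rows)


# Hypothetical usage for the 4M-record case:
#   import sys
#   write_csv(generate_records(4_000_000), sys.stdout)
```

How the resulting file is loaded into Solr is left out here on purpose.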


Asking for the first 10 results:

time curl -s 'http://localhost:8983/solr/gettingstarted/select?rows=10&q={!graph+from=from_field+to=to_field+returnRoot=true}+from_field:*' | jq .response.numFound
1000000

real    0m0.018s
user    0m0.011s
sys     0m0.005s


Asking for 1M results (ignoring that export or streaming should be used
for exports of that size):

time curl -s 'http://localhost:8983/solr/gettingstarted/select?rows=1000000&q={!graph+from=from_field+to=to_field+returnRoot=true}+from_field:*' | jq .response.numFound
1000000

real    0m10.101s
user    0m3.344s
sys     0m0.419s

> Is this performance expected out of graph query ?

As the sanity check above shows, there is a huge difference between
evaluating a graph query (any query really) and asking for 1M results
to be returned. With that in mind, what do you set rows to? 
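
One rough way to separate the two costs is to run the same query with rows=0 and with the rows value you actually use, then compare Solr's reported QTime with the wall-clock time. The sketch below assumes the same local "gettingstarted" collection and field names as my test above; select_url and timed_query are hypothetical helper names, and your host, collection and query will differ.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed values, taken from the test setup in this mail.
BASE = "http://localhost:8983/solr/gettingstarted/select"
GRAPH_Q = "{!graph from=from_field to=to_field returnRoot=true} from_field:*"


def select_url(rows, base=BASE, q=GRAPH_Q):
    """Build a select URL; urlencode escapes the braces and spaces in q."""
    return base + "?" + urllib.parse.urlencode({"q": q, "rows": rows})


def timed_query(rows):
    """Return (numFound, QTime in ms, wall-clock seconds) for one request.

    Needs a running Solr at BASE.
    """
    start = time.perf_counter()
    with urllib.request.urlopen(select_url(rows)) as resp:
        body = json.load(resp)
    elapsed = time.perf_counter() - start
    return (body["response"]["numFound"],
            body["responseHeader"]["QTime"],
            elapsed)


# Hypothetical usage:
#   for rows in (0, 10, 1000000):
#       print(rows, timed_query(rows))
```

If the rows=0 request is fast and only the large rows value is slow, the time is going into fetching and returning documents rather than into the graph traversal itself.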


- Toke Eskildsen, Royal Danish Library

