Thanks a lot for your help, Joel. Just wondering, why does "export" have such limitations? It uses the same query handler as "select", doesn't it?
2014-12-31 10:28 GMT+08:00 Joel Bernstein <joels...@gmail.com>:

> For the initial release only JSON output format is supported with the
> /export feature. Also there is no built-in distributed support yet. Both of
> these features are likely to follow in future releases.
>
> For the initial release you'll need a client that can handle the JSON
> format and distributed logic. The Heliosearch project includes a client
> called CloudSolrStream that you can use for this purpose. Here are two
> links to get started with CloudSolrStream:
>
> https://github.com/Heliosearch/heliosearch/blob/helio_4_10/solr/solrj/src/java/org/apache/solr/client/solrj/streaming/CloudSolrStream.java
> http://heliosearch.org/streaming-aggregation-for-solrcloud/
>
> Joel Bernstein
> Search Engineer at Heliosearch
>
> On Mon, Dec 29, 2014 at 2:20 AM, Sandy Ding <sandy.ding...@gmail.com>
> wrote:
>
> > Hi, Joel
> >
> > Thanks for your reply.
> > It seems that the weird export results are because I removed the
> > "<str name>xsort</str>" invariant of the export request handler in the
> > default solrconfig.xml to get csv-format output.
> > I don't quite understand the meaning of "xsort", but I removed it
> > because I always get a json response (as you said) with the xsort
> > invariant.
> > Is there a way to get csv output using export?
> > And also, can I get full results from all shards? (I tried to set
> > "distrib=true" but got "SyntaxError:xport RankQuery is required for
> > xsort: rq={!xport}", and I do have rq={!xport} in the export invariants)
> >
> > 2014-12-27 3:21 GMT+08:00 Joel Bernstein <joels...@gmail.com>:
> >
> > > Hi Sandy,
> > >
> > > I pulled Solr 4.10.3 to see if I could recreate the issue you are
> > > seeing with export and I wasn't able to recreate the bug you are
> > > seeing. For example the following query:
> > >
> > > http://localhost:8983/solr/collection1/export?q=join_i:[500000 TO 500010]&wt=json&indent=true&sort=join_i+asc&fl=join_i,ShopId_i
> > >
> > > Brings back the following result:
> > >
> > > {"responseHeader": {"status": 0}, "response":{"numFound":11,
> > > "docs":[{"join_i":500000,"ShopId_i":578917},{"join_i":500001,"ShopId_i":294217},{"join_i":500002,"ShopId_i":199805},{"join_i":500003,"ShopId_i":633461},{"join_i":500004,"ShopId_i":472995},{"join_i":500005,"ShopId_i":672122},{"join_i":500006,"ShopId_i":394637},{"join_i":500007,"ShopId_i":446443},{"join_i":500008,"ShopId_i":697329},{"join_i":500009,"ShopId_i":166988},{"join_i":500010,"ShopId_i":191261}]}}
> > >
> > > Notice the join_i values are all within the correct range.
> > >
> > > If you can post the export handler configuration we should be able to
> > > see the issue.
> > >
> > > Joel Bernstein
> > > Search Engineer at Heliosearch
> > >
> > > On Fri, Dec 26, 2014 at 1:50 PM, Joel Bernstein <joels...@gmail.com>
> > > wrote:
> > >
> > > > Hi Sandy,
> > > >
> > > > The export handler should only return documents in JSON format. The
> > > > results in your second example are in XML format, so something looks
> > > > to be wrong in the configuration. Can you post what your solrconfig
> > > > looks like?
> > > >
> > > > Joel
> > > >
> > > > Joel Bernstein
> > > > Search Engineer at Heliosearch
> > > >
> > > > On Fri, Dec 26, 2014 at 12:43 PM, Erick Erickson
> > > > <erickerick...@gmail.com> wrote:
> > > >
> > > >> I think you missed a very important part of Jack's reply:
> > > >>
> > > >> bq: I notice that you don't have distrib=false on your select, which
> > > >> would make your select be from all nodes, while export would only be
> > > >> docs from the specific node you sent the request to.
> > > >>
> > > >> And from the Reference Guide on export
> > > >>
> > > >> bq: The initial release treats all queries as non-distributed
> > > >> requests. So the client is responsible for making the calls to each
> > > >> Solr instance and merging the results.
> > > >>
> > > >> So the export statement you're sending is _only_ exporting the
> > > >> results from the shard on 8983 and completely ignoring the other (6?)
> > > >> shards, whereas the query you're sending is getting the results from
> > > >> all the shards.
> > > >>
> > > >> As Jack said, add &distrib=false to the query, send it to the same
> > > >> shard you send the export command to and the results should match.
> > > >>
> > > >> Also, be sure your configuration for the /select handler doesn't
> > > >> have any additional default parameters that might alter the results,
> > > >> but I doubt that's really a problem here.
> > > >>
> > > >> Best,
> > > >> Erick
> > > >>
> > > >> On Fri, Dec 26, 2014 at 7:02 AM, Ahmet Arslan
> > > >> <iori...@yahoo.com.invalid> wrote:
> > > >> > Hi,
> > > >> >
> > > >> > Do you have any custom solr components deployed? Maybe a custom
> > > >> > response writer?
> > > >> >
> > > >> > Ahmet
> > > >> >
> > > >> > On Friday, December 26, 2014 3:26 PM, Sandy Ding
> > > >> > <sandy.ding...@gmail.com> wrote:
> > > >> > Hi, Ahmet,
> > > >> >
> > > >> > I use libuuid for unique id and I guess there shouldn't be
> > > >> > duplicate ids.
> > > >> > Also, the results are not just incomplete, they are screwed.
> > > >> >
> > > >> > 2014-12-26 20:19 GMT+08:00 Ahmet Arslan <iori...@yahoo.com.invalid>:
> > > >> >
> > > >> >> Hi,
> > > >> >>
> > > >> >> Two different things:
> > > >> >>
> > > >> >> If you have a unique key defined, documents with the same id
> > > >> >> override each other within a single shard.
> > > >> >>
> > > >> >> Plus, uniqueIDs are expected to be unique across shards.
> > > >> >>
> > > >> >> Ahmet
> > > >> >>
> > > >> >> On Friday, December 26, 2014 11:00 AM, Sandy Ding
> > > >> >> <sandy.ding...@gmail.com> wrote:
> > > >> >> Hi, all
> > > >> >>
> > > >> >> I've recently set up a solr cluster and found that "export"
> > > >> >> returns different results from "select".
> > > >> >> And I confirmed that the "export" results are wrong by manually
> > > >> >> querying the results.
> > > >> >> Even simple queries as follows will get different results:
> > > >> >>
> > > >> >> curl "http://localhost:8983/solr/pa_info/select?q=*:*&fl=id&sort=id+desc":
> > > >> >>
> > > >> >> <response><lst name="responseHeader"><int name="status">0</int>
> > > >> >> <int name="QTime">11</int><lst name="params"><str name="sort">id
> > > >> >> desc</str><str name="fl">id</str><str name="q">*:*</str></lst></lst>
> > > >> >> <result name="response" *numFound="1197"* start="0"><doc>...</doc></result>
> > > >> >>
> > > >> >> curl "http://localhost:8983/solr/pa_info/export?q=*:*&fl=id&sort=id+desc":
> > > >> >>
> > > >> >> {*"numFound":172*, "docs":[..]
> > > >> >>
> > > >> >> Don't have a clue why this happens! Can anyone help?
> > > >> >>
> > > >> >> Best,
> > > >> >> Sandy
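
For anyone landing on this thread later, here is a minimal sketch of the client-side merge that Erick and the Reference Guide describe above: since /export treats every request as non-distributed, the client calls each Solr instance itself and merges the per-shard results, which each arrive already sorted. The function names (export_url, fetch_docs, merge_sorted), the core URLs you would pass in, and the handling of two response shapes are assumptions made for illustration; none of this is part of Solr or SolrJ. CloudSolrStream, which Joel mentions, is the ready-made client for this job.

```python
import heapq
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def export_url(core_url, q, sort, fl):
    """Build an /export URL for one core, e.g. http://localhost:8983/solr/pa_info.

    core_url is a placeholder; use the address of each shard's core.
    """
    params = urlencode({"q": q, "sort": sort, "fl": fl})
    return core_url.rstrip("/") + "/export?" + params


def fetch_docs(url):
    """Fetch one core's export response and return its list of docs."""
    with urlopen(url) as resp:
        body = json.load(resp)
    # Assumption: the docs sit either under "response" (as in Joel's example)
    # or at the top level (as in Sandy's output); accept both shapes.
    return body.get("response", body)["docs"]


def merge_sorted(per_shard_docs, field, descending=False):
    """Merge doc lists that are each already sorted on `field`.

    heapq.merge keeps the overall order without re-sorting everything,
    provided every input list uses the same sort direction.
    """
    return list(
        heapq.merge(*per_shard_docs, key=lambda d: d[field], reverse=descending)
    )
```

Typical use would be to call fetch_docs(export_url(core, "*:*", "id desc", "id")) once per shard core and pass the resulting lists to merge_sorted(lists, "id", descending=True). The merge compares values with Python's own ordering, which is only assumed to match Solr's sort order for the field type in question.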