Joel - can you please elaborate a bit on how this compares with Hoss' approach? Complementary?
Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Tue, Dec 17, 2013 at 6:45 PM, Joel Bernstein <joels...@gmail.com> wrote:

> SOLR-5244 is also working in this direction. This focuses on efficient
> binary extract of entire search results.
>
>
> On Tue, Dec 17, 2013 at 2:33 PM, Otis Gospodnetic <
> otis.gospodne...@gmail.com> wrote:
>
> > Hoss is working on it. Search for deep paging or cursor in JIRA.
> >
> > Otis
> > Solr & ElasticSearch Support
> > http://sematext.com/
> > On Dec 17, 2013 12:30 PM, "Petersen, Robert" <
> > robert.peter...@mail.rakuten.com> wrote:
> >
> > > Hi solr users,
> > >
> > > We have a new use case where we need to make a pile of data available
> > > as XML to a client, and I was thinking we could easily put all this
> > > data into a solr collection and the client could just do a star search
> > > and page through all the results to obtain the data we need to give
> > > them. Then I remembered we currently don't allow deep paging in our
> > > current search indexes, as performance declines the deeper you go. Is
> > > this still the case?
> > >
> > > If so, is there another approach to make all the data in a collection
> > > easily available for retrieval? The only thing I can think of is to
> > > query our DB for all the unique IDs of all the documents in the
> > > collection and then pull the documents out in small groups with
> > > successive queries like 'UniqueIdField:(id1 OR id2 OR ... OR idn)'
> > > 'UniqueIdField:(idn+1 OR idn+2 OR ... etc)', which doesn't seem like a
> > > very good approach because the DB might have been updated with new
> > > data which hasn't been indexed yet, and so all the ids might not be in
> > > there (which may or may not matter, I suppose).
> > >
> > > Then I was thinking we could have a field with an incrementing numeric
> > > value which could be used to perform range queries as a substitute for
> > > paging through everything, i.e. queries like
> > > 'IncrementalField:[1 TO 100]' 'IncrementalField:[101 TO 200]', but this
> > > would be difficult to maintain as we update the index, unless we
> > > reindex the entire collection every time we update any docs at all.
> > >
> > > Is this perhaps not a good use case for solr? Should I use something
> > > else, or is there another approach that would work here to allow a
> > > client to pull groups of docs in a collection through the REST API
> > > until the client has gotten them all?
> > >
> > > Thanks
> > > Robi
> > >
> >
>
>
> --
> Joel Bernstein
> Search Engineer at Heliosearch
>
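
For reference, the two workarounds described in Robi's quoted message (batched ID lookups and fixed-size range windows) can be sketched as plain query-string generators. This is only an illustration under assumptions: the function names and batch sizes are hypothetical, the field names are the placeholders from the original message, and the IDs are assumed to need no query escaping. It does not address the maintenance problem Robi mentions for the incrementing field.

```python
from typing import Iterator, Sequence


def range_queries(field: str, max_value: int, window: int) -> Iterator[str]:
    """Yield range queries covering 1..max_value in fixed-size windows.

    Hypothetical helper: assumes `field` holds a dense, incrementing integer.
    """
    start = 1
    while start <= max_value:
        end = min(start + window - 1, max_value)
        yield f"{field}:[{start} TO {end}]"
        start = end + 1


def id_batch_queries(field: str, ids: Sequence[str], batch_size: int) -> Iterator[str]:
    """Yield OR-queries over successive batches of unique IDs.

    Hypothetical helper: IDs are inserted verbatim, with no escaping.
    """
    for i in range(0, len(ids), batch_size):
        batch = ids[i:i + batch_size]
        yield f"{field}:({' OR '.join(batch)})"


if __name__ == "__main__":
    for q in range_queries("IncrementalField", 250, 100):
        print(q)
    for q in id_batch_queries("UniqueIdField", ["id1", "id2", "id3"], 2):
        print(q)
```

Each generated string would be sent as the `q` parameter of an ordinary search request; the client keeps requesting until the generator is exhausted.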