We need to optimize our current indexing implementation. Right now we index per batch: each batch makes one query call to the database that returns multiple result sets, and the application is responsible for assembling each document from those result sets.
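For illustration, this is roughly what that per-batch assembly looks like. The stored procedure, column names, and Solr field names below are hypothetical stand-ins, not our actual schema:

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class BatchIndexer {

        private final Connection db;
        private final SolrClient solr;

        public BatchIndexer(Connection db, SolrClient solr) {
            this.db = db;
            this.solr = solr;
        }

        public void indexBatch(long firstId, long lastId) throws Exception {
            Map<Long, SolrInputDocument> docsById = new HashMap<>();

            // One call per batch that returns several result sets (here a
            // made-up stored procedure). Every new business requirement
            // means wiring yet another result set into this method.
            try (CallableStatement st =
                     db.prepareCall("{call fetch_index_batch(?, ?)}")) {
                st.setLong(1, firstId);
                st.setLong(2, lastId);
                st.execute();

                // First result set: the parent rows.
                try (ResultSet parents = st.getResultSet()) {
                    while (parents.next()) {
                        SolrInputDocument doc = new SolrInputDocument();
                        doc.setField("id", parents.getLong("id"));
                        doc.setField("title", parents.getString("title"));
                        docsById.put(parents.getLong("id"), doc);
                    }
                }

                // Remaining result sets: child rows stitched onto the
                // matching parent as nested documents.
                while (st.getMoreResults()) {
                    try (ResultSet children = st.getResultSet()) {
                        while (children.next()) {
                            SolrInputDocument child = new SolrInputDocument();
                            child.setField("id", children.getString("child_id"));
                            child.setField("value", children.getString("value"));
                            docsById.get(children.getLong("parent_id"))
                                    .addChildDocument(child);
                        }
                    }
                }
            }

            // The whole batch goes to Solr in one request, nested docs and all.
            solr.add(new ArrayList<>(docsById.values()));
        }
    }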
This becomes a problem whenever a new business requirement forces us to add yet another result set to the indexing query. It also sometimes slows down the database, because the ideal batch size is hard to estimate when the data in some result sets grows unexpectedly. On top of that, the documents being indexed are too big because of the nested documents (a parent document can have up to a thousand child documents), which results in a heap error in SolrJ.

So, for those who have ideas or experience with this: what can we do to make this scalable and stable?
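To show why a fixed batch size is hard to get right: a batch of N parents can be tiny or enormous depending on how many children each parent carries. One direction we have been sketching (not a settled design; the class, threshold, and method names are made up) is to flush to Solr by accumulated child count, a crude proxy for request size, rather than by a fixed number of parents:

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class SizeAwareFlusher {

        // Rough cap on nested docs per update request; would need tuning.
        private static final int MAX_CHILDREN_PER_FLUSH = 5_000;

        private final SolrClient solr;
        private final List<SolrInputDocument> buffer = new ArrayList<>();
        private int bufferedChildren = 0;

        public SizeAwareFlusher(SolrClient solr) {
            this.solr = solr;
        }

        // Queue a fully assembled parent; flush once the buffer gets heavy,
        // so one child-rich parent can't blow up a whole fixed-size batch.
        public void add(SolrInputDocument parent) throws Exception {
            buffer.add(parent);
            List<SolrInputDocument> kids = parent.getChildDocuments(); // null if none
            bufferedChildren += (kids == null) ? 0 : kids.size();
            if (bufferedChildren >= MAX_CHILDREN_PER_FLUSH) {
                flush();
            }
        }

        // Send whatever is buffered as one bounded update request.
        public void flush() throws Exception {
            if (!buffer.isEmpty()) {
                solr.add(buffer);
                buffer.clear();
                bufferedChildren = 0;
            }
        }
    }

Would something along these lines help with the heap errors, or is there a better-established pattern for this?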