Thanks Erick. For the record, we are using Solr 1.4.1 and SolrJ.

On 31 October 2010 01:54, Erick Erickson <erickerick...@gmail.com> wrote:

> What version of Solr are you using?
>
> About committing: I'd just let the Solr defaults handle that. You configure
> this in the autoCommit section of solrconfig.xml. I'm pretty sure it gets
> triggered even if you're indexing through SolrJ.
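>
> For reference, the autoCommit block in a 1.4.x solrconfig.xml looks
> something like this (the thresholds are illustrative, not recommendations):
>
>     <updateHandler class="solr.DirectUpdateHandler2">
>       <autoCommit>
>         <maxDocs>10000</maxDocs> <!-- commit after this many pending docs -->
>         <maxTime>60000</maxTime> <!-- or after this many milliseconds -->
>       </autoCommit>
>     </updateHandler>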
>
> That said, it's probably wise to issue an explicit commit after all your
> data is indexed, just to flush any documents added since the last
> autocommit.
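>
> With SolrJ that final commit is a one-liner (assuming "server" is your
> SolrServer instance, e.g. a CommonsHttpSolrServer in 1.4.x):
>
>     server.commit(); // flushes anything added since the last autocommit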
>
> Optimize should not be issued until you're completely done indexing, if at
> all. If you're not deleting (or updating) documents, don't bother to
> optimize unless the number of files in your index directory gets really
> large. Recent Solr code largely removes the need to optimize unless you
> delete documents, but I confess I don't know which revision "recent"
> refers to; perhaps only trunk...
>
> HTH
> Erick
>
> On Thu, Oct 28, 2010 at 9:56 AM, Savvas-Andreas Moysidis <
> savvas.andreas.moysi...@googlemail.com> wrote:
>
> > Hello,
> >
> > We currently index our data through a SQL DIH setup, but because our
> > model (and therefore our SQL query) is becoming complex, we need to
> > index our data programmatically. As we didn't have to deal with
> > commit/optimise before, we are now wondering whether there is an optimal
> > approach. Is there a batch size after which we should fire a commit, or
> > should we execute a single commit after indexing all of our data? What
> > about optimise?
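> >
> > For context, the kind of SolrJ loop we have in mind is roughly the
> > following (simplified; the record type, field names, and the batch size
> > of 1000 are placeholders, not recommendations):
> >
> >     // inside a method declared "throws Exception"; classes come from
> >     // org.apache.solr.client.solrj and org.apache.solr.common
> >     SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
> >     List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
> >     for (Record r : loadRecords()) { // hypothetical data source
> >         SolrInputDocument doc = new SolrInputDocument();
> >         doc.addField("id", r.getId());
> >         doc.addField("title", r.getTitle());
> >         batch.add(doc);
> >         if (batch.size() >= 1000) { // send in batches to keep memory bounded
> >             server.add(batch);      // just adds; committing is the open question
> >             batch.clear();
> >         }
> >     }
> >     if (!batch.isEmpty()) {
> >         server.add(batch); // flush the last partial batch
> >     }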
> >
> > Our document corpus is over 4 million documents, and through DIH the
> > resulting index is around 1.5GB.
> >
> > We have searched previous posts but couldn't find a definitive answer.
> > Any input would be much appreciated!
> >
> > Regards,
> > -- Savvas
> >
>
