8 million documents in two hours (7,200 seconds) works out to about 1,100 documents per second. That is a pretty fast indexing rate. It may be hard to go much faster than that.

wunder

On Jun 10, 2013, at 7:12 AM, Shawn Heisey wrote:

> On 6/10/2013 2:32 AM, Sebastian Steinfeld wrote:
>> Hi Shawn,
>>
>> thank you for your answer.
>>
>> I am using Oracle. This is the configuration I am using:
>> ---------
>> <dataSource
>>   name="local"
>>   driver="oracle.jdbc.driver.OracleDriver"
>>   url="jdbc:oracle:thin:@localhost:1521:XE"
>>   user="****"
>>   password="****"
>>   batchSize="20000"
>> />
>> ------------
>>
>> There are 12GB free memory on the server I hope this is enough.
>> I will test the import with 4GB vm memory.
>
> I don't know how to ensure streaming results with Oracle. It is likely
> that someone here does, though. The default for most JDBC drivers is to
> buffer the entire SQL result.
>
>> Do you know if the "autocommit" inside solrconfig.xml configuration works
>> when using the DIH with the url:
>> /dataimport?command=full-import&clean=true&commit=true
>>
>> I read, that "commit=true" will only make one commit in the end of the
>> import and so "autocommit" won't work.
>
> The autoCommit settings always work, but exactly what that means will
> depend on what you want from autoCommit. The autoCommit settings that
> are in the example config will result in a hard commit every fifteen
> seconds, but that commit will NOT open a new searcher, so the added
> documents will not be visible in search results. This is IMHO the best
> way to go, although I would probably increase the interval to a minute
> or five minutes. You *DO* want these hard commits happening if you're
> on Solr 4.x, to control the size of the updateLog.
>
> If you want the index changes to become visible on a regular basis, then
> uncomment and use the autoSoftCommit settings. This defaults to once a
> second, which I would probably increase, although that's up to you. A
> soft commit opens a new searcher, so index changes become visible.
>
> Thanks,
> Shawn
>

--
Walter Underwood
wun...@wunderwood.org
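
For reference, the commit settings Shawn describes could look roughly like this in the updateHandler section of solrconfig.xml on Solr 4.x. This is only a sketch: the one-minute hard commit interval follows his suggestion to raise the example's fifteen seconds, and the 30-second soft commit interval is an assumed value for illustration, not something from this thread. Tune both to your own indexing and visibility needs.

```xml
<updateHandler class="solr.DirectUpdateHandler2">

  <!-- Hard commit: flushes to disk and keeps the updateLog from growing
       without bound. openSearcher=false means the new documents are NOT
       yet visible to searches. 60000 ms = one minute (assumed value). -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: opens a new searcher so index changes become visible.
       Leave this out if visibility only matters at the end of the import.
       30000 ms = 30 seconds (assumed value). -->
  <autoSoftCommit>
    <maxTime>30000</maxTime>
  </autoSoftCommit>

</updateHandler>
```

With settings like these, a DIH full-import run with commit=true still issues its own commit at the end of the import, while the autoCommit interval handles the hard commits during the run.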