On 10/26/2011 6:16 PM, Michael Sokolov wrote:
> Have you checked to see when you are committing? Is the pattern the
> same in both instances? If you are committing after each delete
> request in Java, but not in Perl, that could slow things down.
The commit happens separately, not during the process. The Java logs I
pasted did not include the other things that happen afterwards, or the
commit, which can take another 10-30 seconds.
Here's the outer-level code that does the full update cycle. It does
deletes, reinserts (documents that have been changed), and inserts (new
content), then a commit. The innermost commit method (passed down from
the code below through a couple of object levels) spits log messages of
its own, and indicates that no commits are happening until after
everything is done.
    /**
     * Do all the updates.
     *
     * @throws IdxException
     */
    public synchronized void updateIndex(boolean fullUpdate,
            boolean useBuildCore) throws IdxException
    {
        refreshFlags();
        if (fullUpdate)
        {
            _fullDelete = true;
            _fullReinsert = true;
        }
        if (_dailyOptimizeStarted)
        {
            LOG.info(_lp
                    + "Skipping delete and reinsert"
                    + " - optimization underway.");
        }
        else
        {
            doDelete(_fullDelete, useBuildCore);
            doReinsert(_fullReinsert, useBuildCore);
            turnOffFullUpdate();
        }
        doInsert(useBuildCore);
        // Single commit, after the deletes, reinserts, and inserts.
        doCommit(useBuildCore);
    }
Thanks to multithreading the delete requests, I now have the full delete
down to 10-15 seconds instead of a minute or more. This is now an
acceptable time, but I am completely mystified as to why the Perl code
can do it without multithreading just as fast, and often faster. The
Java code is long-running, and the Perl code is started by cron. If you
look back to the first message on the thread, you'll see commit messages
in the Perl log, but those commits are done with the wait options set to
false. That's an extra step the Java code isn't doing - and the Perl code
is STILL faster.
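
For anyone following along, here is a minimal sketch of the general
pattern described above: fan the delete requests out over a thread pool,
wait for all of them to finish, and leave the commit to the caller. It is
not the actual code from my application. ParallelDeleteSketch,
DeleteSender, and deleteAll are hypothetical names made up for
illustration, and the sender is a stand-in for whatever really issues one
delete request to the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class ParallelDeleteSketch {

    /** Hypothetical stand-in for whatever sends one delete request. */
    interface DeleteSender {
        void sendDelete(String query) throws Exception;
    }

    /**
     * Sends every delete query on a fixed-size thread pool and blocks
     * until all of them have finished. Returns the number of requests
     * that completed. No commit is issued here; the caller is expected
     * to commit once, after everything else is done.
     */
    static int deleteAll(List<String> queries, DeleteSender sender,
            int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (String q : queries) {
                tasks.add(() -> {
                    sender.sendDelete(q);
                    return null;
                });
            }
            int done = 0;
            // invokeAll returns only after every task has completed.
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get(); // surfaces a failed request as ExecutionException
                done++;
            }
            return done;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted during deletes", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Delete request failed",
                    e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Fake sender that only counts calls instead of talking to an index.
        AtomicInteger sent = new AtomicInteger();
        int done = deleteAll(Arrays.asList("id:1", "id:2", "id:3"),
                q -> sent.incrementAndGet(), 2);
        System.out.println(done + " delete requests completed");
    }
}
```

The point of keeping the commit out of deleteAll is the same as in the
updateIndex method above: nothing commits until the deletes, reinserts,
and inserts have all been sent.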
Thanks,
Shawn