Oh OK, phew.  I misunderstood your answer too!

So it seems like fsync with ZFS can be very slow?

Mike

Uwe Klosa wrote:

Oh, you meant index files. I misunderstood your question. Sorry, now
that I read it again I see what you meant. There are only 136 index
files, so no problem there.

Uwe

On Sat, Oct 4, 2008 at 1:59 PM, Michael McCandless <[EMAIL PROTECTED]> wrote:


Yikes!  That's way too many files.  Have you changed mergeFactor?  Or
implemented a custom DeletionPolicy or MergePolicy?

Or... does anyone know of something else in Solr's configuration that could
lead to such an insane number of files?

Mike


Uwe Klosa wrote:

There are around 35,000 files in the index. When I started indexing 5
weeks ago with only 2,000 documents I did not have this issue. I saw it
for the first time with around 10,000 documents.

Before that I had been using the same instance on a Linux machine with
up to 17,000 documents and I didn't see this issue at all. The original
plan has always been to use Solr on Linux, but I'm still waiting for the
new server.

Uwe

On Sat, Oct 4, 2008 at 12:06 PM, Michael McCandless <[EMAIL PROTECTED]> wrote:


Hmm, OK, that seems like a possible explanation then. Still, it's spooky
that it's taking 5 minutes. How many files are in the index at the time
you call commit?
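One quick way to answer that question is to count the files in the index
directory just before the commit. This is only a sketch: the function name
`count_index_files` and the idea of grouping by file extension are
illustrative assumptions, not part of Solr or Lucene.

```python
import os
from collections import Counter

def count_index_files(index_dir):
    """Count regular files directly inside index_dir.

    Returns (total, per_extension), where per_extension is a Counter
    keyed by file extension ("(none)" for names without one).
    """
    per_extension = Counter()
    for entry in os.scandir(index_dir):
        if entry.is_file():
            ext = os.path.splitext(entry.name)[1] or "(none)"
            per_extension[ext] += 1
    return sum(per_extension.values()), per_extension
```

A total in the tens of thousands, rather than the low hundreds, would point
at the number of files being fsync'd as the likely cost.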

I wonder, if you were to simply pause for, say, 30 seconds before
issuing the commit, whether you'd then see the commit go faster? On
Windows at least, such a silly trick does seem to improve performance, I
think because it allows the OS to move the bytes from its write cache
onto stable storage "on its own schedule", whereas when we commit we are
demanding the OS move the bytes on our [arbitrary] schedule.
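The pause experiment can be tried outside Solr with a small timing
sketch. Everything here is an assumption made for illustration: the
function name `write_then_fsync`, the `pause_seconds` parameter, and the
use of a single scratch file instead of a real index.

```python
import os
import time

def write_then_fsync(path, data, pause_seconds=0.0):
    """Write data to path, optionally pause, then fsync.

    Returns the seconds spent inside os.fsync() only, so the pause
    itself is not counted.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()  # hand the bytes from Python's buffer to the OS cache
        if pause_seconds > 0:
            time.sleep(pause_seconds)  # let the OS write back on its own
        start = time.perf_counter()
        os.fsync(f.fileno())  # block until the OS reports the data stable
        return time.perf_counter() - start
```

Comparing `write_then_fsync(p, data)` against
`write_then_fsync(p, data, pause_seconds=30)` on the filesystem in
question would show whether the fsync itself gets cheaper after a pause.
Results will vary with the OS, the filesystem, and the amount of data.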

I really wish OSs would add an API that would just block and return once
the file has made it to stable storage (letting the OS sync on its own
optimal schedule), rather than demanding the file be fsync'd
immediately.

I really haven't explored the performance of fsync on different
filesystems. I think I've read that ReiserFS may have issues, though it
could have been addressed by now. I *believe* ext3 is OK (at least, it
didn't show the strange "sleep to get better performance" issue above,
in my limited testing).

Mike


Uwe Klosa wrote:

Thanks Mike


The use of fsync() might be the answer to my problem. For lack of other
options, I have installed Solr in a zone on Solaris with ZFS, which slows
down when many fsync() calls are made. This will be fixed in an upcoming
release of Solaris, but I will move the Solr instances to another server
with a different file system as soon as possible. Would using a file
system other than ext3 boost performance?

Uwe

On Fri, Oct 3, 2008 at 8:28 PM, Michael McCandless <[EMAIL PROTECTED]> wrote:


Yonik Seeley wrote:

On Fri, Oct 3, 2008 at 1:56 PM, Uwe Klosa <[EMAIL PROTECTED]> wrote:


I have a big problem with one of my Solr instances. A commit can take up
to 5 minutes. This time does not depend on the number of documents that
are updated. The difference between 1 and 100 updated documents is only
a few seconds.


Since Solr's commit logic really hasn't changed, I wonder if this
could be Lucene-related somehow.


Lucene's commit logic has changed: we now fsync() each file in the index
to ensure all bytes are on stable storage, before returning.

But I can't imagine that taking 5 minutes, unless there are somehow a
great many files added to the index?
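To get a feel for how the per-file cost adds up, here is a rough sketch
that fsyncs every file in a directory and times the whole pass. It is
not Lucene's implementation (Lucene is written in Java); the function
name `fsync_all` is hypothetical, and files that are already on disk will
sync quickly, so this mainly measures the fixed overhead per fsync call
on a given filesystem.

```python
import os
import time

def fsync_all(index_dir):
    """fsync each regular file directly inside index_dir.

    Returns (file_count, elapsed_seconds) for the whole pass.
    """
    count = 0
    start = time.perf_counter()
    for entry in os.scandir(index_dir):
        if not entry.is_file():
            continue
        # Opened read-write because some platforms refuse to fsync
        # a descriptor that was opened read-only.
        fd = os.open(entry.path, os.O_RDWR)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)
        count += 1
    return count, time.perf_counter() - start
```

Dividing the elapsed time by the file count gives an approximate cost per
fsync; multiplied by tens of thousands of files, even a few milliseconds
each would add up to minutes.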

Uwe, what filesystem are you using?

Yonik, when Solr commits what does it actually do?

Mike





