On Sat, May 30, 2015, at 09:51 AM, Gili Nachum wrote:
> Hi, What would be an optimal FS block size to use?
> 
> Using Solr 4.7.2, I have a RAID-5 of SSD drives currently configured
> with a 128KB block size.
> Can I expect better indexing/query time performance with a smaller block
> size (say 8K)?
> Considering my documents are almost always smaller than 8K.
> I assume all stored fields would fit into one block which is good, but
> what will Lucene prefer for reading a long posting list and other data
> structures?
> 
> Any rules of thumb, or has anyone experimented with this?

I'm gonna start this response with the observation that I don't know
anything about the topic you are asking about.

So, with that out of the way, a Lucene index is "write once". That is,
when you do a commit, all of the documents making up that commit are
written to disk as a set of files, which together make a segment. Those
files are not modified afterwards.

Therefore, it isn't the size of a document that matters, more the number
and size of documents making up a single commit. There's a lot more to
it too, e.g. whether fields are stored, how they are analysed, etc.

You could do a simple experiment. Write a little app that pushes docs to
Solr and commits, then look at the file sizes on disk. Then repeat with
more documents and see what impact that has on the file sizes. I suspect
you can answer your question relatively easily that way.
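
A rough sketch of that experiment in Python, in case it helps. The
update URL, the index directory and the "text_t" field are assumptions
on my part (they match the stock Solr 4 example setup, which may not be
yours), and the helper names are made up for illustration.

```python
import json
import os
import urllib.request


def make_docs(count, body_bytes=4096, start_id=0):
    """Build `count` synthetic docs, each with an ASCII body of `body_bytes` bytes."""
    word = "lorem "
    body = (word * (body_bytes // len(word) + 1))[:body_bytes]
    # "text_t" is an assumed dynamic field name; adjust to your schema.
    return [{"id": str(start_id + i), "text_t": body} for i in range(count)]


def post_and_commit(update_url, docs):
    """POST the docs as JSON to a Solr update handler, committing in the same request."""
    req = urllib.request.Request(
        update_url + "?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def index_file_sizes(index_dir):
    """Return {file name: size in bytes} for each regular file in the index directory."""
    sizes = {}
    for name in sorted(os.listdir(index_dir)):
        path = os.path.join(index_dir, name)
        if os.path.isfile(path):
            sizes[name] = os.path.getsize(path)
    return sizes
```

To use it, call post_and_commit() with something like
"http://localhost:8983/solr/collection1/update" (an assumed URL) and
make_docs(1000), print index_file_sizes() for your core's data/index
directory, then repeat with a larger count and a different start_id and
compare how the files grow.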

Upayavira
