On Sat, May 30, 2015, at 09:51 AM, Gili Nachum wrote:
> Hi, What would be an optimal FS block size to use?
>
> Using Solr 4.7.2, I have a RAID-5 of SSD drives currently configured
> with a 128KB block size.
> Can I expect better indexing/query time performance with a smaller block
> size (say 8K)?
> Considering my documents are almost always smaller than 8K.
> I assume all stored fields would fit into one block, which is good, but
> what will Lucene prefer for reading a long posting list and other data
> structures?
>
> Any rules of thumb, or anyone that has experimented with this?

I'm gonna start this response with the observation that I don't know anything about the topic you are asking about.

So, with that out of the way: a Lucene index is "write once". When you do a commit, all documents making up that commit are written to disk as a new set of files, making a segment, and those files are not modified afterwards. Therefore, it isn't the size of a single document that matters, more the number and size of the documents making up a single commit. There's a lot more to it too, e.g. whether fields are stored, how they are analysed, etc.

You could do a simple experiment. Write a little app that pushes docs to Solr and commits, then look at the file sizes on disk. Then repeat with more documents and see what impact that has on the file sizes. I suspect you can answer your question relatively easily.

Upayavira
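
A rough sketch of the experiment described above, in Python, in case it helps. Everything specific here is an assumption rather than something from this thread: the update URL, the index directory path, and the field names are placeholders for whatever your install uses, and allocated_bytes is a deliberate simplification that only rounds each file up to whole blocks (it ignores filesystem metadata and RAID striping).

```python
import json
import os
import urllib.request
from collections import defaultdict


def post_docs(solr_update_url, docs):
    """POST a batch of documents as JSON and commit in the same request."""
    req = urllib.request.Request(
        solr_update_url + "?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def sizes_by_extension(index_dir):
    """Sum file sizes in index_dir, grouped by file extension."""
    totals = defaultdict(int)
    for name in os.listdir(index_dir):
        path = os.path.join(index_dir, name)
        if os.path.isfile(path):
            # Files without an extension (e.g. segments_N) are keyed by name.
            ext = os.path.splitext(name)[1] or name
            totals[ext] += os.path.getsize(path)
    return dict(totals)


def allocated_bytes(size, block_size):
    """Bytes a file of `size` occupies when rounded up to whole blocks."""
    return -(-size // block_size) * block_size


if __name__ == "__main__":
    # Hypothetical locations; adjust to your own setup.
    url = "http://localhost:8983/solr/collection1/update"
    index_dir = "/var/solr/collection1/data/index"

    # Hypothetical schema: an id field and a text field.
    docs = [{"id": str(i), "text": "sample text %d" % i} for i in range(1000)]
    post_docs(url, docs)

    for ext, size in sorted(sizes_by_extension(index_dir).items()):
        print(ext, size, allocated_bytes(size, 8 * 1024),
              allocated_bytes(size, 128 * 1024))
```

Run it once, note the numbers, then raise the document count and run it again to see how the per-extension totals grow and how much rounding overhead each block size would add.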