Re: PostingsFormat block size

Mikhail Khludnev Tue, 27 Jan 2015 08:24:06 -0800

Hm. Those are not the blocks I'm familiar with. Regarding the performance
impact of bigger ID blocks: if you have <uniqueKey>ID</uniqueKey> and send
updates for existing docs, each update has to look up the ID term to find the
doc it replaces. IDs are also used in some of the distributed search stages,
I suppose. That's about it.
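
To put rough numbers on the block sizes discussed in the quoted messages
below, here is a back-of-envelope sketch, not how Lucene actually lays out
the terms dictionary. It assumes the BlockTreeTermsWriter defaults are 25
(min) and 48 (max), which should be checked against the Lucene version in
use, and it ignores the fact that real blocks are formed by shared term
prefixes. The class and method names are made up for illustration.

```java
public class BlockSizeEstimate {
    // Assumed BlockTreeTermsWriter defaults; verify against your Lucene version.
    static final int DEFAULT_MIN_BLOCK_SIZE = 25;
    static final int DEFAULT_MAX_BLOCK_SIZE = 48;

    /** Fewest blocks possible: every block filled up to maxBlockSize entries. */
    static long minBlocks(long uniqueTerms, int maxBlockSize) {
        return (uniqueTerms + maxBlockSize - 1) / maxBlockSize; // ceiling division
    }

    /** Most blocks possible: every block holds only minBlockSize entries. */
    static long maxBlocks(long uniqueTerms, int minBlockSize) {
        return Math.max(1, uniqueTerms / minBlockSize);
    }

    public static void main(String[] args) {
        long uniqueIds = 10_000_000L; // hypothetical number of unique id terms
        int factor = 8;               // the "x8" multiplier from the thread

        System.out.println("default: between "
                + minBlocks(uniqueIds, DEFAULT_MAX_BLOCK_SIZE) + " and "
                + maxBlocks(uniqueIds, DEFAULT_MIN_BLOCK_SIZE) + " blocks");
        System.out.println("x" + factor + ": between "
                + minBlocks(uniqueIds, DEFAULT_MAX_BLOCK_SIZE * factor) + " and "
                + maxBlocks(uniqueIds, DEFAULT_MIN_BLOCK_SIZE * factor) + " blocks");
        // Worst case, one id lookup scans a whole block once the index points at it.
        System.out.println("entries scanned per lookup (worst case): "
                + DEFAULT_MAX_BLOCK_SIZE + " vs " + DEFAULT_MAX_BLOCK_SIZE * factor);
    }
}
```

Under this simplified model, the x8 format needs about eight times fewer
blocks for the index (.tip) to point at, and each id lookup may scan up to
eight times more entries inside the block it lands in.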


On Tue, Jan 27, 2015 at 4:33 PM, Trym Møller <t...@sigmat.dk> wrote:

> Hi
>
> Thanks for your clarifying questions.
>
> In the constructor of the Lucene41PostingsFormat class the minimum and
> maximum block sizes are provided. These sizes are used when creating the
> BlockTreeTermsWriter (responsible for writing the .tim and .tip files of
> the Lucene index). It is the block sizes of the BlockTreeTermsWriter I refer
> to.
>
> I'm not quite sure I understand your second question, sorry.
> I have not tried whether the PulsingPostingsFormat helps lower the Solr
> JVM memory usage, but I can see that the same BlockTreeTermsWriter, with
> its block sizes, is used by the PulsingPostingsFormat.
> Should I expect something else from the PulsingPostingsFormat with regard
> to memory usage or to searching (if I have changed the block sizes of the
> BlockTreeTermsWriter)?
>
> Best regards Trym
>
>
> On 27-01-2015 14:00, Mikhail Khludnev wrote:
>
>> Hello Trym,
>>
>> Can you clarify which blockSize you mean? And a second question, just to
>> avoid unnecessary explanation: do you know what Pulsing is?
>>
>> On Tue, Jan 27, 2015 at 2:28 PM, Trym Møller <t...@sigmat.dk> wrote:
>>
>>> Hi
>>>
>>> I have successfully created a really cool Lucene41x8PostingsFormat class
>>> (a copy of the Lucene41PostingsFormat class modified to use 8 times the
>>> default block size) and registered the format as required. In the
>>> schema.xml I have created a string field type with this postings format,
>>> and lastly I'm using this field type for my id field. This all works
>>> great, and as a consequence the .tip files of the Lucene index (segments)
>>> are considerably smaller and the same goes for the Solr JVM memory usage
>>> (which was the end goal).
>>>
>>> Now I need to find the consequences (besides the disk and memory usage)
>>> of this change to the id field. I would expect id searches to be slower.
>>> But when will Solr/Lucene do id searches? I myself have no user scenarios
>>> where my documents are searched by the id value.
>>>
>>> Thanks for any comments.
>>>
>>> Best regards Trym
>>>
>>>
>>>
>>
>


-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mkhlud...@griddynamics.com>
