Hi all,

  In my code, I'd like to keep a subset of around 100k docs out of my
14M docs.

 In your opinion, what is the best option in terms of speed and memory usage?

 My first guess is that BitDocSet should be the fastest for lookup, but
takes about one bit per document in memory (~ 14M bits, roughly 1.75 MB),
 whereas HashDocSet takes just ~ 100k * sizeof(int), but is a bit slower to look up.
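
 As a rough sanity check on these numbers, here is a small Java sketch. It
assumes the bit-set representation costs one bit per document (as
java.util.BitSet does) and that the hash representation is an int[] table
padded to a power of two at a 0.75 load factor. Both are assumptions made for
illustration; the class DocSetSketch and its methods are hypothetical and are
not Solr APIs.

```java
import java.util.BitSet;

public class DocSetSketch {
    // Bytes used by a bitmap holding one bit per document,
    // rounded up to whole 64-bit words.
    static long bitmapBytes(int maxDoc) {
        long words = (maxDoc + 63L) / 64L;
        return words * 8L;
    }

    // Bytes used by an int[] hash table whose capacity is the smallest
    // power of two >= size / loadFactor (an assumed sizing policy,
    // not necessarily what HashDocSet does).
    static long hashTableBytes(int size, double loadFactor) {
        int needed = (int) Math.ceil(size / loadFactor);
        int capacity = 1;
        while (capacity < needed) {
            capacity <<= 1;
        }
        return capacity * 4L;
    }

    // Marks every step-th doc id in [0, maxDoc) as a stand-in for the kept subset.
    static BitSet everyNth(int maxDoc, int step) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc = 0; doc < maxDoc; doc += step) {
            bits.set(doc);
        }
        return bits;
    }

    public static void main(String[] args) {
        int maxDoc = 14_000_000;
        int subset = 100_000;
        System.out.println("bitmap bytes:     " + bitmapBytes(maxDoc));
        System.out.println("hash table bytes: " + hashTableBytes(subset, 0.75));
        System.out.println("raw int[] bytes:  " + subset * 4L);

        // Membership test on the bitmap is a single bit lookup.
        BitSet kept = everyNth(maxDoc, 140);
        System.out.println("docs kept:        " + kept.cardinality());
        System.out.println("doc 280 kept?     " + kept.get(280));
    }
}
```

 Under those assumptions the bitmap comes to about 1.75 MB, the padded hash
table to about 1 MB, and the bare 100k ints to 400 KB, so the memory gap may
be smaller than it first looks.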

 The documentation of HashDocSet says "It can be a better choice if there are
few docs in the set". What does 'few' mean in this context?

 Cheers!

 Jerome.


-- 
Jerome Eteve.

Chat with me live at http://www.eteve.net
