BitDocSet does not take ~ 14M * sizeof(int) in memory. It stores one bit per document, so it takes at most 14M/8 bytes, i.e. about 1.75 MB.
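
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. It does not use Solr's own classes: the class and method names are made up for illustration, java.util.BitSet stands in for a bit-per-document set, and the HashDocSet figure assumes exactly one 4-byte int per stored doc, ignoring any spare slots a real hash table keeps.

```java
import java.util.BitSet;

// Hypothetical helper for illustration only; not part of Solr.
public class DocSetMemoryEstimate {

    // One bit per document ID in the index, rounded up to whole bytes.
    public static long bitSetBytes(long maxDoc) {
        return (maxDoc + 7) / 8;
    }

    // Assumption: one 4-byte int per document in the subset,
    // with no allowance for empty hash-table slots.
    public static long intPerDocBytes(long setSize) {
        return setSize * Integer.BYTES;
    }

    public static void main(String[] args) {
        long maxDoc = 14_000_000L;  // total docs in the index
        long setSize = 100_000L;    // docs kept in the subset

        System.out.println("bit-per-doc set: " + bitSetBytes(maxDoc) + " bytes");
        System.out.println("int-per-doc set: " + intPerDocBytes(setSize) + " bytes");

        // A plain java.util.BitSet sized for the whole index behaves the same way:
        // lookup is a single bit test, and storage is allocated in 64-bit words.
        BitSet bits = new BitSet((int) maxDoc);
        bits.set(42);
        System.out.println("doc 42 in set: " + bits.get(42));
        System.out.println("BitSet storage: " + (bits.size() / 8) + " bytes");
    }
}
```

Under these assumptions the bit set costs 1,750,000 bytes regardless of how many docs are in the subset, while the int-per-doc estimate grows with the subset size (400,000 bytes for 100k docs).
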
On Tue, Oct 28, 2008 at 6:06 PM, Jérôme Etévé <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> In my code, I'd like to keep a subset of my 14M docs which is around
> 100k large.
>
> What is, according to you, the best option in terms of speed and memory usage?
>
> Some basic thoughts tell me the BitDocSet should be the fastest for
> lookup, but takes ~ 14M * sizeof(int) in memory, whereas
> the HashDocSet takes just ~ 100k * sizeof(int), but is a bit slower for lookup.
>
> The doc of HashDocSet says "It can be a better choice if there are few
> docs in the set". What does 'few' mean in this context?
>
> Cheers!
>
> Jerome.
>
>
> --
> Jerome Eteve.
>
> Chat with me live at http://www.eteve.net
>
> [EMAIL PROTECTED]
>

--
--Noble Paul