: 2. I understand this architecture of LazyFields, but i did not understand
: why multiple LazyFields should be created for the multivalued field. You
: can't load a part of them. If you request the field, you will get ALL of
: its values. so 100 (or more) placeholders are not necessary in this case.
: Moreover, why should Solr KNOW how much values are in that unloaded field?

It's been a while since i looked at it closely, but i believe the crux of 
the reasoning has to do with the way the lucene Document API is 
structured -- each document consists of a list of IndexableField objects, 
each of which contains a field name and a single field value -- there is 
no single object representing a fieldname with all of its individual 
values hanging off of it. So the lucene Documents produced by 
LazyDocument have to register a LazyField instance as a placeholder for 
each of those IndexableField instances, so that if/when the Document API 
is used to access them, they can be used to fetch the corresponding value.

there just isn't really any other way that the LazyDocument class can 
modify the Document object to know about the lazy fields.
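
To make the "flat list" point concrete, here is a minimal sketch in plain Java. SketchField, SketchDocument and FlatFieldsDemo are made-up stand-ins for illustration only, not the real Lucene classes, and the sketch only assumes the structure described above: one entry per (name, value) pair, with nothing grouping the values of a multivalued field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one (name, value) entry; NOT a Lucene class.
final class SketchField {
    final String name;
    final String value;

    SketchField(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

// Hypothetical stand-in for a document: a flat list of entries, with no
// per-name container holding all values of a multivalued field.
final class SketchDocument {
    private final List<SketchField> fields = new ArrayList<>();

    void add(SketchField f) {
        fields.add(f);
    }

    List<SketchField> getFields() {
        return fields;
    }

    // Collecting the values of one field means scanning the flat list.
    List<String> getValues(String name) {
        List<String> out = new ArrayList<>();
        for (SketchField f : fields) {
            if (f.name.equals(name)) {
                out.add(f.value);
            }
        }
        return out;
    }
}

class FlatFieldsDemo {
    static SketchDocument build() {
        SketchDocument doc = new SketchDocument();
        doc.add(new SketchField("id", "42"));
        // A "multivalued" field is just the same name added several times.
        doc.add(new SketchField("tag", "a"));
        doc.add(new SketchField("tag", "b"));
        doc.add(new SketchField("tag", "c"));
        return doc;
    }

    public static void main(String[] args) {
        SketchDocument doc = build();
        System.out.println(doc.getFields().size());
        System.out.println(doc.getValues("tag"));
    }
}
```

With a structure like this, a field with 100 values occupies 100 slots in the list, so a lazy version needs 100 placeholders to keep the list the same shape.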

But as i mentioned before: these LazyField objects are *tiny*.  Unless a 
subsequent request that reuses the doc from the cache asks to fetch the 
underlying value, having 100+K of them in RAM shouldn't amount to much. 
 (And if the underlying field values are requested, then the amount 
of space the LazyField objects take up should be a fairly insignificant 
amount more than the underlying values themselves -- if the underlying 
values are small enough that it's noticeable overhead, you probably don't 
want to bother using it at all, even if you frequently don't need those 
values).

FWIW: if/when you ask for one LazyField's real value, it goes ahead and 
populates the values of all the other LazyFields with the same name (so 
no redundant work is done when iterating over all the values of a field in 
the typical flow)
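
A rough sketch of that behavior in plain Java, under stated assumptions: SketchFieldLoader, SketchLazyField and LazyFieldDemo are hypothetical names invented for illustration, and this is not how the real LazyDocument code is written. It only shows the idea of tiny placeholders that share one loader, where the first access fills in every sibling with the same name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical loader shared by every placeholder of one field name.
final class SketchFieldLoader {
    private final String[] stored; // stands in for the stored values on disk
    private final List<SketchLazyField> placeholders = new ArrayList<>();
    int loadCount = 0;             // how many times the "disk" was read

    SketchFieldLoader(String... stored) {
        this.stored = stored;
        // One placeholder per stored value, so the document keeps its shape.
        for (int i = 0; i < stored.length; i++) {
            placeholders.add(new SketchLazyField(this));
        }
    }

    List<SketchLazyField> placeholders() {
        return placeholders;
    }

    // A single read populates every placeholder with the same name.
    void loadAll() {
        loadCount++;
        for (int i = 0; i < stored.length; i++) {
            placeholders.get(i).value = stored[i];
        }
    }
}

// Hypothetical placeholder: just a loader reference and a value that
// stays null until something asks for it.
final class SketchLazyField {
    private final SketchFieldLoader loader;
    String value;

    SketchLazyField(SketchFieldLoader loader) {
        this.loader = loader;
    }

    String stringValue() {
        if (value == null) {
            loader.loadAll(); // first access loads all siblings too
        }
        return value;
    }
}

class LazyFieldDemo {
    public static void main(String[] args) {
        SketchFieldLoader loader = new SketchFieldLoader("a", "b", "c");
        for (SketchLazyField f : loader.placeholders()) {
            System.out.println(f.stringValue());
        }
        System.out.println("loads: " + loader.loadCount);
    }
}
```

Iterating over all three placeholders in this sketch triggers only one load, and until something calls stringValue() each placeholder holds nothing but a reference and a null.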

: What do you think about temporary disabling documentCache, for a specific
: query?

I don't see anything wrong with the idea conceptually, but I'm not sure 
how feasible that would be, and I don't have any suggestions as to 
how/where to implement it.

I still think you should really consider dialing your documentCache size 
way, way down and test the performance -- even with your multiple 
concurrent requests asking for rows=2000 i suspect you won't see any 
painful increases in response time, and it will most certainly help your 
OOM problems.


-Hoss
