: 2. I understand this architecture of LazyFields, but i did not understand
: why multiple LazyFields should be created for the multivalued field. You
: can't load a part of them. If you request the field, you will get ALL of
: its values. so 100 (or more) placeholders are not necessary in this case.
: Moreover, why should Solr KNOW how much values are in that unloaded field?

It's been a while since I looked at it closely, but I believe the crux of the reasoning has to do with the way the Lucene Document API is structured. Each document consists of a list of IndexableField objects, each of which contains a field name and a field value. There is no single object representing a field name with all of its individual values hanging off of it, so the Lucene Documents produced by LazyDocument have to register a LazyField instance as a placeholder for each of those IndexableField instances, so that if/when the Document API is used to access them, they can be used to fetch the corresponding value. There just isn't really any other way for the LazyDocument class to make the Document object know about the lazy fields.

But as I mentioned before: these LazyField objects are *tiny*. Unless a subsequent request that reuses the doc from the cache asks to fetch the underlying values, having 100+K of them in RAM shouldn't amount to much. (And if the underlying field values are requested, then the space the placeholders take up should be a fairly insignificant amount more than the underlying values themselves. If the underlying values are small enough that it's noticeable overhead, you probably don't want to bother using it at all, even if you frequently don't need those values.)

FWIW: if/when you ask for one LazyField's real value, it goes ahead and populates the values of all the other LazyFields with the same name (so no redundant work is done when iterating over all the values of a field in the typical flow).

: What do you think about temporary disabling documentCache, for a specific
: query?

I don't see anything wrong with the idea conceptually, but I'm not sure how feasible it would be, and I don't have any suggestions for how/where to implement it.

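Coming back to the LazyField placeholders for a moment: here is a rough sketch of the idea, in case it helps to see it in code. This is NOT the actual Lucene LazyDocument source. Every name in it (LazyDocSketch, LazyDoc, Field, addPlaceholder, loadAll, loadCount) is made up for illustration, and a plain Map stands in for the stored fields on disk. It only shows the two points above: the document is a flat list of (name, value) entries, so each value needs its own placeholder, and the first access to any one placeholder fills in all of its same-named siblings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LazyDocSketch {

    /** Flat (name, value) entry, loosely analogous to one IndexableField. */
    interface Field {
        String name();
        String stringValue();
    }

    /** Hypothetical lazy document: one placeholder per stored value. */
    static class LazyDoc {
        // Stands in for the stored fields that would really live in the index.
        private final Map<String, List<String>> stored;
        private final Map<String, List<LazyField>> placeholders = new HashMap<>();
        int loadCount = 0; // how many times we had to go back to "disk"

        LazyDoc(Map<String, List<String>> stored) {
            this.stored = stored;
        }

        /** Registers the next placeholder for fieldName; nothing is loaded yet. */
        Field addPlaceholder(String fieldName) {
            List<LazyField> siblings =
                placeholders.computeIfAbsent(fieldName, k -> new ArrayList<>());
            LazyField f = new LazyField(fieldName, siblings.size());
            siblings.add(f);
            return f;
        }

        /** Loads every value of fieldName once and fills all sibling placeholders. */
        private void loadAll(String fieldName) {
            loadCount++;
            List<String> values = stored.get(fieldName);
            for (LazyField f : placeholders.get(fieldName)) {
                f.value = values.get(f.position);
            }
        }

        /** Tiny placeholder: a name, a position, and a value slot. */
        class LazyField implements Field {
            private final String name;
            private final int position;
            private String value; // null until the field is first accessed

            LazyField(String name, int position) {
                this.name = name;
                this.position = position;
            }

            public String name() {
                return name;
            }

            public String stringValue() {
                if (value == null) {
                    loadAll(name); // populates this placeholder and its siblings
                }
                return value;
            }
        }
    }

    public static void main(String[] args) {
        LazyDoc doc = new LazyDoc(Map.of("tag", List.of("a", "b", "c")));

        // The document itself is just a flat list of fields, one per value.
        List<Field> flat = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            flat.add(doc.addPlaceholder("tag"));
        }

        System.out.println(flat.get(1).stringValue()); // prints b
        for (Field f : flat) {
            f.stringValue(); // already populated, no further loads
        }
        System.out.println(doc.loadCount); // prints 1
    }
}
```

Until stringValue() is called, each placeholder holds only a name, a position, and a null value slot, which is why a large number of them sitting in a cached document costs little.
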
I still think you should really consider dialing your documentCache size way, way down and testing the performance. Even with your multiple concurrent requests asking for rows=2000, I suspect you won't see any painful increases in response time, and it will most certainly help your OOM problems.

-Hoss