Hi Dennis,

I would not expect the index growth to be quite linear as the number of shapes grows, but it may nonetheless be significant. Indexing non-point shapes currently indexes more term data than it ideally should; that is tracked as LUCENE-4942. I need to find the time/priority to do it, probably within the next couple of months.
In the meantime, you could perhaps loosen distErrPct on the field type definition; what you can live with depends on your requirements. The default is 0.025 (2.5% of the shape's approximate radius); maybe you'll be satisfied with a precision of 10% or more? Tweaking this number trades off precision for index size, and it can make a big difference.

~ David

On 11/6/13 8:20 AM, "Dennis Reichelt" <dennis.reich...@askvisual.de> wrote:

>Hi,
>
>we are testing Solr and indexing a huge number of files. We integrated a
>Spatial4j field which is only used to index rectangles, so we removed
>the JTS dependency. However, we had some problems with this. At first
>Solr seemed to get a GC OutOfMemory error, which seems to be fixed with
>more memory for the server (at least I hope so ;)), and the second is
>that the index grows kinda big...
>
>It takes around 150 MB when indexing rects for 30k documents, which is a
>factor of 8 more than we would want. And it's normally only one rect per
>document. Though the functions we get through this are pretty cool, this
>could be a huge drawback because scaling seems to be linear and we
>target around 500k documents. Are we doing something wrong or do we have
>to live with that?
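
For reference, the distErrPct setting mentioned in the reply is an attribute on the spatial field type in schema.xml. The fragment below is only a sketch: the field type name location_rpt and the maxDistErr and units values are assumptions borrowed from the stock Solr 4 example schema, not from Dennis's configuration, and 0.10 is just the "10%" figure suggested above. Documents that are already indexed would most likely need to be reindexed before the smaller index size shows up.

```xml
<!-- Hypothetical field type; only distErrPct differs from the Solr 4 example schema.
     distErrPct="0.10" allows an error of up to 10% of the shape's approximate radius
     (the default is 0.025), trading precision for a smaller index. -->
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
           geo="true" distErrPct="0.10" maxDistErr="0.000009" units="degrees" />
```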