xavi jmlucjav <jmluc...@gmail.com> wrote:
> The reason for such a large number of fields:
> - users create dynamically 'classes' of documents, say one user creates 10
> classes on average
> - for each 'class', the fields are created like this: "unique_id_"+fieldname
> - there are potentially hundreds of thousands of users.

Switch to a scheme where you control the names of fields outside of Solr, but 
share the fields internally:

User 1 has 10 custom classes: u1_a, u1_b, u1_c, ... u1_j
Internally they are mapped to class1, class2, class3, ... class10

User 2 uses 2 classes: u2_horses, u2_elephants
Internally they are mapped to class1, class2

When User 2 queries field u2_horses, you rewrite the query to use class1 
instead.
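
A minimal sketch of that mapping layer, in Python. It assumes the mapping
lives in your application rather than in Solr; FieldMapper, internal_name
and rewrite_query are made-up names, not Solr APIs, and the regex-based
rewrite is deliberately naive (it only handles plain field:value clauses).

```python
import re


class FieldMapper:
    """Maps per-user class names onto a shared pool of internal field names.

    Hypothetical helper for illustration; nothing here is part of Solr.
    """

    def __init__(self):
        # user id -> {external class name -> internal field name}
        self._by_user = {}

    def internal_name(self, user, external):
        """Return the shared field for this user's class, allocating the
        next free classN slot the first time the class is seen."""
        fields = self._by_user.setdefault(user, {})
        if external not in fields:
            fields[external] = "class%d" % (len(fields) + 1)
        return fields[external]

    def rewrite_query(self, user, query):
        """Replace known external field names in 'field:value' clauses.
        Unknown names are left untouched."""
        fields = self._by_user.get(user, {})

        def swap(match):
            name = match.group(1)
            return fields.get(name, name) + ":"

        return re.sub(r"\b(\w+):", swap, query)


mapper = FieldMapper()
for name in ("u2_horses", "u2_elephants"):
    mapper.internal_name("user2", name)

print(mapper.rewrite_query("user2", "u2_horses:arabian"))
```

Since class1 is now shared by many users, the rewritten query would also
need a filter on whatever field identifies the owning user, otherwise it
matches other users' documents too.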

> There is faceting in each user's fields.
> So this will result in >1M fields, very sparsely populated.

If you are faceting on all of them and you are not using DocValues, this 
will explode your memory requirements with vanilla Solr: UnInverted faceting 
maintains a separate map from all document IDs to field values (ordinals for 
Strings) for _all_ the facet fields. Even if you only had 10 million documents 
and even if your 1 million facet fields all had just 1 value, represented by 1 
bit, it would still require 10M * 1M * 1 bits in memory, which is 10 terabits, 
or about 1.25 terabytes, of RAM.
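
The estimate can be checked with a few lines of Python. This is only the
back-of-the-envelope model from the paragraph above (one entry per document
per facet field), not a measurement of what Solr actually allocates;
uninverted_bits is a made-up name.

```python
def uninverted_bits(docs, facet_fields, bits_per_entry=1):
    """Lower bound: one entry per document for every facet field."""
    return docs * facet_fields * bits_per_entry


bits = uninverted_bits(docs=10_000_000, facet_fields=1_000_000)
terabits = bits / 1e12
terabytes = bits / 8 / 1e12  # 8 bits per byte, decimal terabytes

print(terabits, "terabits")
print(terabytes, "terabytes")
```

With the shared class1..class10 scheme, the same model gives 10M * 10 bits,
i.e. 12.5 megabytes.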

- Toke Eskildsen
