Hi, Chris.
Usually I'm concerned when values (enums) leak into field names. It's
better to control the set of fields.
I'd rather go with separate multi-valued fields, for faster searching
and faceting, plus a field of concatenated values:
"locations" : [ "denver", "chicago", "washington" ],
"roles" : [ "admin", "staff", "basic" ],
"location_roles" : [ "denver_admin", "chicago_staff",
"washington_basic" ]
so:
> users who are admins in denver?
q=location_roles:denver_admin
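
To make the flattening concrete, here is a minimal Python sketch of how such a document could be built on the client side before indexing. The helper names (build_user_doc, location_role_query) and the "_" separator are my assumptions for illustration; the separator only works if it can't appear ambiguously inside a location or role name.

```python
def build_user_doc(username, roles_by_location, sep="_"):
    """Flatten a {location: role} mapping into separate multi-valued
    fields plus one field of concatenated location/role values."""
    locations = sorted(roles_by_location)
    return {
        "username": username,
        "locations": locations,
        "roles": sorted(set(roles_by_location.values())),
        # One concatenated token per (location, role) pair.
        "location_roles": [
            f"{loc}{sep}{roles_by_location[loc]}" for loc in locations
        ],
    }


def location_role_query(location, role, sep="_"):
    """Build the q= value matching users with `role` in `location`."""
    return f"location_roles:{location}{sep}{role}"


doc = build_user_doc(
    "chris",
    {"denver": "admin", "chicago": "staff", "washington": "basic"},
)
print(doc["location_roles"])
print(location_role_query("denver", "admin"))
```

The last line prints the same query string as above, location_roles:denver_admin.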
The next level of complexity is dependent facets, i.e. counting role
facets within denver.
In that case Chris shouldn't contribute "staff" and "basic" as facet
values, yet facet.field on roles counts them anyway.
You can still count such dependent facets with these concatenations via
tricky post-processing.
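
A rough sketch of that post-processing, assuming you facet on location_roles and have already turned the facet response into a plain value-to-count dict. The function name and the counts below are made up for illustration, not real results. Solr's facet.prefix parameter may let you do the filtering step on the server side instead.

```python
def dependent_role_counts(location_role_counts, location, sep="_"):
    """Keep only facet values for one location and strip the
    location prefix, leaving role -> count."""
    prefix = location + sep
    return {
        value[len(prefix):]: count
        for value, count in location_role_counts.items()
        if value.startswith(prefix)
    }


# Hypothetical facet counts on the location_roles field.
counts = {
    "denver_admin": 3,
    "denver_staff": 5,
    "chicago_staff": 7,
    "washington_basic": 2,
}
denver_roles = dependent_role_counts(counts, "denver")
print(denver_roles)
```

Only the roles actually held in denver survive, so a user who is admin in denver no longer adds to "staff" or "basic" there.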
FWIW, the bulletproof solution for this is indexing roles as the user's
subdocuments (nested documents). It's really performant, but deadly
complex.
https://solr.apache.org/guide/solr/latest/indexing-guide/indexing-nested-documents.html
https://solr.apache.org/guide/solr/latest/query-guide/json-query-dsl.html#faceting-nested-documents
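
For a rough idea of the nested shape, here is a Python sketch that builds a user with one child document per location. The "type", "location" and "role" field names and the id scheme are my assumptions, not something your schema already has; the children are attached under the _childDocuments_ key, and the query uses the block join parent parser. Please check the guides above for the schema requirements before relying on this.

```python
import json


def build_nested_user(username, roles_by_location):
    """Build a parent user document with one child per location."""
    children = [
        {
            "id": f"{username}-{loc}",   # hypothetical id scheme
            "type": "location_role",     # hypothetical discriminator field
            "location": loc,
            "role": role,
        }
        for loc, role in sorted(roles_by_location.items())
    ]
    return {
        "id": username,
        "type": "user",
        "username": username,
        "_childDocuments_": children,
    }


def users_with_role_query(location, role):
    """Return parent users having a child matching both location and role."""
    return f'{{!parent which="type:user"}}+location:{location} +role:{role}'


nested = build_nested_user(
    "chris",
    {"denver": "admin", "chicago": "staff", "washington": "basic"},
)
print(json.dumps(nested, indent=2))
print(users_with_role_query("denver", "admin"))
```

Because location and role live in the same child document, the query can't accidentally match denver from one pair and admin from another.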
On Thu, Jun 2, 2022 at 1:04 AM Christopher Schultz <
[email protected]> wrote:
> All,
>
> Since Solr / Lucene can't define arbitrary fields in documents, I wonder
> what the recommended technique is for storing structured-information in
> a document?
>
> I'd like to store information about an entity and is specific to related
> entities (not stored in the index). This isn't my actual use-case, but
> let's take an example of a user who has different privileges in
> different locations.
>
> If Solr / Lucene allowed me to do so, I might model it like this:
>
> users : [
> {
> "username": "chris",
> "locations" : [ "denver", "chicago", "washington" ],
> "location_denver_role" : "admin",
> "location_chicago_role" : "staff",
> "location_washington_role" : "basic"
> },
> { ... }
> ]
>
> Since I can't have a field called "location_denver_role",
> "location_chicago_role", etc. (well, I can, but I have a huge number of
> locations to deal with and defining a field for each seems stupid), I
> was thinking of maybe something like this:
>
> users : [
> {
> "username": "chris",
> "locations" : [ "denver", "chicago", "washington" ],
> "location_roles : [ { "denver" : "admin", "chicago" : "staff",
> "washington" : "basic" ]
> },
> { ... }
> ]
>
> So now I have a single field called "location_roles" but it's got
> "structure" inside of it. I could obviously search the other fields
> directly in Solr and then filter-out the records I want manually, but
> what's the best way to structure the index so that I can tell Solr I
> only care about users who are admins in denver?
>
> Lest you think I can invert the index and use:
>
> "admin" : [ "denver" ],
> "staff" : [ "chicago" ],
> "basic" : [ "washington" ]
>
> ... I can't. The "role" is just a proxy for a bunch of user metadata
> that may need to grow over time, including a large range of possible
> values, so I can't just invert the index.
>
> Thanks,
> -chris
>
--
Sincerely yours
Mikhail Khludnev