Hi, I have a schema with a descendent_path field type, configured as shown in the PathHierarchyTokenizerFactory docs:

<fieldType name="descendent_path" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory" />
  </analyzer>
</fieldType>

Using the example in the docs:

*For example, in the configuration below a query for Books/NonFic will match documents indexed with values like Books/NonFic, Books/NonFic/Law, Books/NonFic/Science/Physics, etc. But it will not match documents indexed with values like Books, or Books/Fic.*

This works great and solves our primary use case. However, we have a secondary use case where we need to get only the documents that sit exactly one level below a given path. For example, let's say I wanted all of the categories directly under Books/NonFic/, like Books/NonFic/Science, Books/NonFic/Art, Books/NonFic/Math, etc. I can query for Books/NonFic, but this returns all deeper descendant records too.

One solution is to query for:

category:Books/NonFic/* -category:Books/NonFic/*/*

which seems to work, but feels a little clunky.

The other solution I can think of is to add a separate field to each document at index time, something like parentCategory, which would be non-tokenized and indexed (not stored) and would hold Books/NonFic for each of the Books/NonFic/[Science, Art, Math] documents. However, with this solution I'm duplicating information and increasing my index size. This is not the worst thing, I know, but the category field is already by far the largest contributor to the index size, and doubling the information there will have a noticeable impact on the disk footprint.

So my question: with a projected index size in the billions of documents, would you take either one of those two approaches? Or a third that I haven't thought of?

Thanks,
Kyle
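
P.S. To make the second option concrete, this is roughly what I have in mind for the schema. It is only a sketch: it assumes the stock "string" field type (solr.StrField) is defined in the schema, and parentCategory is just a placeholder name.

```xml
<!-- Hypothetical extra field: holds the parent path as a single untokenized term. -->
<!-- Assumes a "string" fieldType backed by solr.StrField exists in the schema. -->
<field name="parentCategory" type="string" indexed="true" stored="false" />
```

Each of the Books/NonFic/Science, Books/NonFic/Art and Books/NonFic/Math documents would then be indexed with parentCategory set to Books/NonFic, and the single-level lookup would become an exact term match such as parentCategory:"Books/NonFic".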