Solr index size affected by duplication

2018-11-18 Thread sagandhi
Hi,

This is a sample doc -

parent
shirt

child
Red
XL
6

Red
XL
6


The parent doc represents an item/object, and the nested docs contain
extended properties of that object.
While searching, the nested docs are filtered out so that the result count
is correct. This required duplicating the nested doc fields in the parent doc.

This duplication of fields has resulted in a huge Solr index, and I am
planning to get rid of the duplicates and use block join for the nested doc
fields.
This has caused another serious problem: if the value I am searching for is
present in a nested doc, no results are found, because nested docs are
filtered out as a rule. (This used to work before because, even though the
nested doc is filtered out, the parent doc is still returned.)

I have come up with 2 approaches to solve this.
1. Include a global field while indexing:
For each field in a nested doc, add the corresponding value to the global
field in the parent doc.

parent

child
Red
XL
6

Red
XL
6


2. Use a new copy field:
The nested doc fields follow a naming pattern that no other fields use, so I
can easily create another copy field that contains only the nested doc
fields.
Then, while querying, I use block join on this copy field along with the
existing global field, like so:

global:(red) OR {!parent which=doc_type:parent}c_global:(red)
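
For reference, a small Python sketch that assembles the query above for an
arbitrary search term. The function name and its parameters are hypothetical;
the field names `global` and `c_global` and the `doc_type:parent` filter are
taken from the query itself. The term is inserted as-is, so a real client
would need to escape Solr query syntax characters first.

```python
from urllib.parse import urlencode


def combined_query(term, parent_filter="doc_type:parent",
                   parent_field="global", child_field="c_global"):
    """Build 'parent field OR block join on the child copy field' for one term."""
    parent_clause = f"{parent_field}:({term})"
    # Doubled braces emit the literal { and } of the local-params block.
    child_clause = f"{{!parent which={parent_filter}}}{child_field}:({term})"
    return f"{parent_clause} OR {child_clause}"


params = {"q": combined_query("red")}
print(params["q"])        # the raw query string
print(urlencode(params))  # URL-encoded form for a request to the select handler
```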

Add this in schema:


3. I came across another approach/hack accidentally.
I had modified the existing schema to remove the duplicate parent fields, but
the data I used for reindexing still contained them.
As a result, the global field contains values from both the parent and the
nested fields, while the indexed doc itself skips the duplicate parent fields
because the schema no longer defines them.
I was able to search for nested doc field values, and the total index size
was smaller than with either of the two approaches above.

Can someone please suggest which is the better option and why?

Thanks!
Soham



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Using solrconfig for json facet sorting

2019-04-11 Thread sagandhi
Hi,

Is it possible to configure sorting on json.facet in solrconfig.xml just
like for traditional facets?

Thanks,
Soham





Re: Block Join Faceting issue

2018-07-24 Thread sagandhi
Hi Mikhail,

Thank you for suggesting json.facet. I tried it and it works great: I am
able to make a single query instead of two. Now I am planning to get rid of
the duplicate child fields in the parent docs. However, I ran into problems
while forming negative queries with block join.

Here's what I would like to query: get me the parent docs whose children do
not have a particular field.
I tried these, but neither worked:

q=*:*&fq={!parent which="doc_type:parent"}*-*child_color:*
q=*:*&fq={!parent which="doc_type:parent" v=$qq}&qq=(!child_color:*)

Currently I have duplicate entries of the child fields in the parent docs, so
I am able to do this:
&fq=!parent_color:*
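
To make the intended semantics concrete, here is a plain-Python sketch (not
Solr code) of the result I expect. It assumes the same meaning as the
workaround above: a parent matches only if none of its children has the
field. The ids and the `child_size` field are hypothetical.

```python
def parents_without_child_field(blocks, field):
    """Keep the parents for which no child document carries `field`."""
    return [
        parent
        for parent, children in blocks
        if not any(field in child for child in children)
    ]


# Each block is (parent doc, list of its child docs).
blocks = [
    ({"id": "p1", "doc_type": "parent"}, [{"id": "c1", "child_color": "Red"}]),
    ({"id": "p2", "doc_type": "parent"}, [{"id": "c2", "child_size": "XL"}]),
    ({"id": "p3", "doc_type": "parent"}, []),
]
print([p["id"] for p in parents_without_child_field(blocks, "child_color")])
# prints ['p2', 'p3']
```

Note that a parent with no children at all (p3) is also expected to match.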

Is there a way to form this query using block join? 

Thanks,
Soham



