Hello,

I posted this question in another thread and I think it got lost, so I am 
starting a new thread about it. I have built a small POC for a customization 
and am hoping to get some validation, in case what I have built is a really 
bad implementation. I have been doing a lot of digging over the last 2 weeks, 
and SOLR out of the box does not offer exactly what I need, so I decided to 
go custom (extended). 

I work in ecommerce. When a shopper types a search into the website, we want 
to execute the search against the available SKUs (e.g. small red shirt, dark 
34x34 jeans) and then perform a rollup (or join) to those SKUs’ parent 
products (shirt, jeans). We don’t want to return the item documents at all, 
but we do want the facets to represent the child document set, not the final 
parent document set. For example, if you’re viewing a “shirt” category, we 
don’t want to show the “Green” facet if none of the shirts you’re viewing are 
available in green, or if no green items matched the original search (maybe 
the shopper searched "blue shirts").

It is also important to note we are on SOLR 4.3.0 and don’t have parent/child 
support yet; upgrading to 4.10.2 could be tricky since our SOLR instance is 
embedded within another piece of software.

I can’t use grouping for this, which was suggested to me in the past, because 
grouping will throw off our pagination, and I also wouldn’t be able to use 
grouping for other things. For example, when viewing a “shirt” category, we 
want to group the final product set by another element such as a shirt “type” 
or sub category (t-shirt, sweater, “tank tops”, etc.).

To support our needs, I have built a small POC over the weekend (and thanks to 
everyone for putting up with all my random emails as I spent time learning 
SOLR/Lucene and getting my head wrapped around the internals). The POC involves 
a custom query parser, query, search component and document transformer. I 
uploaded all my code to GitHub this morning 
(https://github.com/damos/SolrRollupQuery/blob/master/src/com/vast/solr/rollup/). 
It still needs tuning and is definitely not finished. I have mostly based the 
code on existing pieces such as the join query, the dismax query parser, and 
some of the 4.10.2 parent/child code.

I want my customer to be able to build a SOLR request like the following:
select?q={!rollup from=parent to=id}name:(*Shirt*)&cfq=color:green&cfq=size:small&facet=true&facet.field=color&facet.field=size&fl=field1,field2,field3,[ru fl=childField1,childField2]child_*

Where:
q= a query against item documents that rolls up into parent documents
cfq= (child filter query) a query that filters the child document set 
before the rollup happens. This leaves the fq parameter free to filter the 
parent document set after the rollup.
[ru]= a document transformer that copies fields from the child documents 
into the parent document's dynamic field child_*
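As a rough illustration of how a client might assemble that request, here is 
a minimal sketch in plain Java. The class and method names 
(RollupRequestSketch, buildQueryString) are made up for this example and are 
not part of the POC; only the q, cfq, facet, facet.field and fl parameters 
come from the request above. It URL-encodes each value, since the local 
params braces and the [ru ...] transformer brackets are not safe to send raw.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RollupRequestSketch {

    // URL-encodes one name=value pair as UTF-8.
    public static String param(String name, String value) {
        try {
            return URLEncoder.encode(name, "UTF-8") + "=" + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Builds the query string for a rollup request (hypothetical helper).
    public static String buildQueryString(String from, String to, String childQuery,
                                          List<String> childFilters,
                                          List<String> facetFields, String fl) {
        List<String> parts = new ArrayList<String>();
        // q: the child query, wrapped in the rollup parser's local params
        parts.add(param("q", "{!rollup from=" + from + " to=" + to + "}" + childQuery));
        // cfq: filters applied to the children before the rollup
        for (String cfq : childFilters) {
            parts.add(param("cfq", cfq));
        }
        parts.add(param("facet", "true"));
        for (String field : facetFields) {
            parts.add(param("facet.field", field));
        }
        // fl: parent fields plus the [ru ...] transformer for child fields
        parts.add(param("fl", fl));
        return "select?" + String.join("&", parts);
    }

    public static void main(String[] args) {
        System.out.println(buildQueryString(
                "parent", "id", "name:(*Shirt*)",
                Arrays.asList("color:green", "size:small"),
                Arrays.asList("color", "size"),
                "field1,field2,field3,[ru fl=childField1,childField2]child_*"));
    }
}
```
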


The implementation includes 4 main components:

1) New “rollup” Query Parser: Parses the incoming request, builds the rollup 
query and puts the query into the request context.
2) New “rollup” Query: I modelled this after the code in JoinUtil. The 
constructor executes the primary query with a custom collector that collects 
the term scores and also collects the entire child docset. The query exposes 
the child docset through an accessor method.
3) An extended QueryComponent that checks the request context for the rollup 
query. If it exists, the component overrides rb.getResults().docSet with the 
child docset so the facets are built off the children, not the parents. (This 
part feels very clumsy, but I have reasons for not completely overriding 
QueryComponent.process().)
4) A custom doc transformer that adds child fields to a parent dynamic field. 
I DON’T want to return fields for all children, only the ones that matched 
the main query, so this also needs the rollup query's child docset.
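To make the intended semantics concrete, here is a small in-memory sketch in 
plain Java. It is not the Lucene/Solr implementation from the POC: there is 
no collector, scoring or docset here, a substring match stands in for the 
name:(*term*) query, and the class name, helper names and sample SKUs are all 
invented for illustration. It only shows the order of operations described 
above: match children, apply cfq, roll up to parents, and count facets over 
the surviving children rather than over the parents.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RollupSketch {

    /** Outcome of a rollup: matching children per parent key, plus child-based facets. */
    public static final class Rollup {
        public final Map<String, List<Map<String, String>>> childrenByParent =
                new LinkedHashMap<String, List<Map<String, String>>>();
        public final Map<String, Integer> facetCounts = new TreeMap<String, Integer>();
    }

    public static Rollup rollup(List<Map<String, String>> children, String nameTerm,
                                Map<String, String> childFilters,
                                String fromField, String facetField) {
        Rollup result = new Rollup();
        for (Map<String, String> child : children) {
            // 1) Main query against the child documents (substring match as a stand-in).
            String name = child.get("name");
            if (name == null || !name.contains(nameTerm)) continue;

            // 2) cfq: exact-match filters on the children, applied before the rollup.
            boolean keep = true;
            for (Map.Entry<String, String> filter : childFilters.entrySet()) {
                if (!filter.getValue().equals(child.get(filter.getKey()))) {
                    keep = false;
                    break;
                }
            }
            if (!keep) continue;

            // 3) Roll up: group the surviving child under its parent key.
            String parentKey = child.get(fromField);
            if (parentKey == null) continue;
            List<Map<String, String>> siblings = result.childrenByParent.get(parentKey);
            if (siblings == null) {
                siblings = new ArrayList<Map<String, String>>();
                result.childrenByParent.put(parentKey, siblings);
            }
            siblings.add(child); // only matching children are visible to a transformer

            // 4) Facets are counted over the child set, not over the parents.
            String facetValue = child.get(facetField);
            if (facetValue != null) {
                Integer count = result.facetCounts.get(facetValue);
                result.facetCounts.put(facetValue, count == null ? 1 : count + 1);
            }
        }
        return result;
    }

    // Builds one made-up child (SKU) document.
    public static Map<String, String> sku(String parentSku, String sku, String name,
                                          String color, String size) {
        Map<String, String> doc = new LinkedHashMap<String, String>();
        doc.put("parentSku", parentSku);
        doc.put("sku", sku);
        doc.put("name", name);
        doc.put("color", color);
        doc.put("size", size);
        return doc;
    }

    public static void main(String[] args) {
        List<Map<String, String>> children = new ArrayList<Map<String, String>>();
        children.add(sku("shirt-0001", "shirt-0001-01", "Small Awesome Shirt", "blue", "small"));
        children.add(sku("shirt-0001", "shirt-0001-02", "Large Awesome Shirt", "green", "large"));
        children.add(sku("jeans-0001", "jeans-0001-01", "Small Awesome Jeans", "blue", "small"));

        Rollup r = rollup(children, "Awesome",
                Collections.singletonMap("size", "small"), "parentSku", "color");
        System.out.println("parents: " + r.childrenByParent.keySet());
        System.out.println("color facet: " + r.facetCounts);
    }
}
```

With the sample data, the green SKU is removed by the size filter before the 
rollup, so "green" never reaches the color facet even though its parent 
shirt is still returned.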


Sorry to send such a long email. I want to contribute this to the discussion 
because I am pretty sure we are not the first people in ecommerce to have a 
very specific use case like this. If you have taken the time to read this, 
thank you very much; it is very much appreciated.

Cheers!

Darin


PS:

The following query:
http://localhost:8983/solr/testcore/select?q={!rollup%20from=parentSku%20to=sku}name:(*Awesome*)&facet=true&facet.mincount=1&facet.field=size&fl=id,sku,name,[ru%20fl=name,sku]child_*&cfq=size:small

Returns the following results:
<result name="response" numFound="2" start="0">
<doc>
<str name="id">0001</str>
<str name="sku">shirt-0001</str>
<str name="name">Awesome Shirt</str>
<int name="child_count">1</int>
<str name="child_name">Small Awesome Shirt</str>
<str name="child_sku">shirt-0001-01</str>
</doc>
<doc>
<str name="id">0002</str>
<str name="sku">jeans-0001</str>
<str name="name">Awesome Jeans</str>
<int name="child_count">1</int>
<str name="child_name">Small Awesome Jeans</str>
<str name="child_sku">jeans-0001-01</str>
</doc>
</result>



