I missed a field (ClientId) in the ContractItem index:

*ContractItem*

ContractItemId - string
ItemId - string
ClientId - string
ContractCode - string (facet and filter on this)
Priority - integer (order by priority descending)
Active - boolean (filter on this)

2) It appears that I cannot use fromIndex=Contracts because that index is very large and has to be sharded. As I understand it, SolrCloud join does not support multiple shards.

4) The Item index contains approximately 2 million items. For ContractItem there are about 10,000 clients with about 1.5 million records each, so the total is close to 15 billion ContractItem records. Several updates are made to Item during the day. Sometimes clients will make large changes to ContractItem.

Any thoughts/suggestions?

On Thu, Oct 1, 2015 at 6:09 AM, Mikhail Khludnev <mkhlud...@griddynamics.com> wrote:

> 1. I'd say it's a challenge.
> 2. Can't you do the opposite: filter active contracts, join them back to
> items, and then facet?
> q=(Description:colgate OR Categories:colgate OR
> Sellers:colgate)&fq={!join from=ItemId to=ItemId
> fromIndex=Contracts}Active:true&facet.field=SellersString
> 3. Note: there is a {!terms} QParser (which makes leg-shooting easier).
> 4. What number of documents do you operate on? What is the update frequency?
> Is there a chance to keep both types in a single index?
>
> On Thu, Oct 1, 2015 at 5:58 AM, Troy Edwards <tedwards415...@gmail.com>
> wrote:
>
> > I am working with the following indices
> >
> > *Item*
> >
> > ItemId - string
> > Description - text (query on this)
> > Categories - Multivalued text (query on this)
> > Sellers - Multivalued text (query on this)
> > SellersString - Multivalued string (Need to facet and filter on this)
> >
> > *ContractItem*
> >
> > ContractItemId - string
> > ItemId - string
> > ContractCode - string (facet and filter on this)
> > Priority - integer (order by priority descending)
> > Active - boolean (filter on this)
> >
> > Say someone is searching for colgate
> >
> > I am doing two queries:
> >
> > First query: q = {!join from=ItemId to=ItemId
> > fromIndex=Item}(Description:colgate OR Categories:colgate OR
> > Sellers:colgate)&facet.field=ContractCode
> >
> > From the first query I get all the ItemIds and do a second query on Item
> > index using q=ItemId:(Id1 Id2 Id3) and generate facet on SellersString
> >
> > I have to do some custom coding to retain Priority (so that I can sort on
> > it)
> >
> > Following are the issues I am running into:
> >
> > 1) Since there are a lot of Items and ContractItems, the number of Ids
> > becomes large and I had to increase maxBooleanClauses (possible performance
> > degradation?)
> >
> > 2) Since I have to return a lot of items from first query, the data size
> > becomes very large (again a performance concern)
> >
> > 3) When a filter is applied on the second query, I have to adjust the facet
> > results of the first query
> >
> > 4) Overall this seems complex
> >
> > Is it possible to do just one query and apply filters (if any) and get
> > results along with facets?
> >
> > Any suggestions on simplifying this and improving performance?
> >
> > Thanks in advance
> >
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> <http://www.griddynamics.com>
> <mkhlud...@griddynamics.com>
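
For reference, here is a minimal sketch of how the single-request form from point 2 of the quoted reply might be assembled. It only builds the request parameters and sends nothing. The helper name build_item_query is hypothetical, the search term is assumed to be a single plain token with no Solr special characters, and facet=true is added on the assumption that faceting is not already enabled in the request handler defaults. It does not address the sharding limitation on fromIndex discussed above.

```python
from urllib.parse import urlencode


def build_item_query(term, from_index="Contracts"):
    """Return Solr request parameters for one query against the Item index.

    Hypothetical helper: it only assembles parameters.
    `term` is assumed to be a plain token that needs no escaping.
    """
    # Main query: match the term in the three searchable Item fields.
    q = "(Description:{0} OR Categories:{0} OR Sellers:{0})".format(term)
    # Filter: keep only items that have an active ContractItem record.
    # Local params are closed with "}", not ")".
    fq = "{{!join from=ItemId to=ItemId fromIndex={0}}}Active:true".format(from_index)
    return [
        ("q", q),
        ("fq", fq),
        ("facet", "true"),  # assumption: faceting is not enabled by default
        ("facet.field", "SellersString"),
    ]


params = build_item_query("colgate")
query_string = urlencode(params)  # URL-encoded form, ready to append to a select URL
```

Keeping the join in fq rather than q leaves the main query free for relevance scoring on the Item fields, which is the shape suggested in the quoted reply.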