[ https://issues.apache.org/jira/browse/SOLR-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris M. Hostetter updated SOLR-15048:
--------------------------------------
    Attachment: SOLR-15048.patch
        Status: Open  (was: Open)

So, if we ignore nulls, the way collapse deals with "boosted" docs is that any boosted doc is explicitly collected (even if there are multiple boosted docs from the same group), and all other docs in the same group(s) as boosted docs are _not_ collected.

ie: {{elevateIds=1,5}} causes docs 1 & 5 to be boosted & returned, even if they would not normally be group heads, and any other docs in the same group(s) as either of those docs will not be returned (even if they would normally be the group head).

I initially thought (from the code I looked at & some quick testing) that the "intent" when dealing with "null group id" docs was that boosting these docs would still cause all other "null group id" docs to behave as they would w/o boosting. ie: nullPolicy=collapse would still result in a "null" group using the "best" group head that wasn't already boosted; nullPolicy=expand would still result in all docs w/null group id being returned as if they were their own group.

In practice though, there is no rhyme or reason to how the current code behaves. Some group head selectors cause nullPolicy=collapse to completely eliminate the null group if any docs in it are boosted (like a normal group); others cause the group to still be returned (like I initially thought). But when nullPolicy=expand, some group head selector code causes (non-boosted) null group docs to actually be returned in the wrong order according to the top level sort (evidently due to a logic error, but I'm not sure how, since they still get returned with the correct score?).
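As a rough illustration of the non-null behavior described at the start of this comment, here is a toy Python model. It is not Solr code: the function name, the `(id, group, score)` tuples, and the sample data are all hypothetical, and it assumes that boosted docs are returned in the order they were elevated, ahead of the remaining group heads sorted by score.

```python
from typing import Dict, List, Tuple

# Hypothetical toy doc: (doc_id, group, score). Null groups are not modeled.
Doc = Tuple[int, str, float]


def collapse_with_boosts(docs: List[Doc], elevate_ids: List[int]) -> List[int]:
    """Return the ids that survive collapsing, boosted docs first."""
    by_id = {doc[0]: doc for doc in docs}

    # Every boosted doc is collected, even several from the same group.
    boosted = [by_id[i] for i in elevate_ids if i in by_id]
    boosted_groups = {doc[1] for doc in boosted}

    # For groups without any boosted doc, keep the best scoring doc as head.
    heads: Dict[str, Doc] = {}
    for doc in docs:
        _, group, score = doc
        if group in boosted_groups:
            continue  # non-boosted docs in a boosted group are not collected
        if group not in heads or score > heads[group][2]:
            heads[group] = doc

    # Assumption: boosted docs come first (in elevation order), then the
    # remaining group heads ordered by descending score.
    ranked_heads = sorted(heads.values(), key=lambda d: d[2], reverse=True)
    return [d[0] for d in boosted] + [d[0] for d in ranked_heads]


DOCS: List[Doc] = [
    (1, "A", 1.0), (2, "A", 9.0), (5, "A", 2.0),
    (3, "B", 5.0), (4, "B", 7.0),
    (6, "C", 3.0),
]

print(collapse_with_boosts(DOCS, [1, 5]))  # [1, 5, 4, 6]
print(collapse_with_boosts(DOCS, []))      # [2, 4, 6]
```

With `elevate_ids=[1, 5]`, doc 2 is dropped even though it would normally be the head of group A, which matches the {{elevateIds=1,5}} description above.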
*TL;DR: some of the nullPolicy+boosting logic is flat out broken; the logic that isn't broken is internally inconsistent.*

----

I would like to try and just ignore all the existing (lack of) "logic" and instead "fix" nullPolicy+boosting such that:

* boosting null docs (continues) to work even when nullPolicy=ignore
* boosting w/nullPolicy=collapse is fixed to treat the "null group" exactly the same as a "real" group: if any docs in it are boosted, the group is not returned
* boosting w/nullPolicy=expand should be fixed to ensure all "null" docs are returned (as if they were each in their own group) in the correct order, w/correct scores, but any "boosted" null docs should come first (just like boosted regular docs)

...but before I can do that, I really need to get to the bottom of how/why nullPolicy=expand is sometimes collecting _non_-boosted "null" docs out of their expected order ... because I can't even figure out how that's possible given the way the DelegatingCollector API works (let alone how the collapse code is – erroneously – making it happen).

[~jbernste] can you please take a look at testSmallWTF in the latest patch and help me understand how the ValueSource and SortSpec related "collapseStrategy" impls are moving "doc id=4" out of its normal position in the results (in spite of its score)?

> collapse + query elevation behaves inconsistently w/ 'null group' docs depending on group head selector
> ------------------------------------------------------------------------------------------------------
>
> Key: SOLR-15048
> URL: https://issues.apache.org/jira/browse/SOLR-15048
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Chris M. Hostetter
> Assignee: Chris M. Hostetter
> Priority: Major
> Attachments: SOLR-15048.patch, SOLR-15048.patch
>
>
> While working on SOLR-15047, I realized I wasn't really clear on what the
> _expected_ semantics were supposed to be when "boosting" docs that had null
> values in the collapse field.
>
> I expanded on my test from that jira to demonstrate the logic I (thought) I
> understood from the Ord based collector - but then discovered that depending
> on the group head selector used (ie: OrdScoreCollector vs
> OrdFieldValueCollector+OrdIntStrategy vs
> OrdFieldValueCollector+OrdValueSourceStrategy, etc...) you get different
> behavior - not just in what group head is selected, but even when the
> behavior should be functionally equivalent, you can get different sets of
> groups. (even for simple string field collapsing, independent of the bugs in
> numeric field collapsing).
>
> I have not dug into WTF is happening here, but I'll attach my WIP test

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org