A rant about field collapsing
I am implementing search within our application using Solr. About 2 months ago we needed to group results by a certain field. After some searching I came across the JIRA issue in progress for this, field collapsing: https://issues.apache.org/jira/browse/SOLR-236

It was scheduled for the next Solr release and had a full set of proper JIRA subtasks and patch files of almost complete implementations attached. So as you can imagine I was happy to apply this patch, build it into our application and wait for the next release, when it would be part of the main trunk.

Now imagine my surprise when we came to upgrade and found that field collapsing has suddenly been thrown away in favour of a totally different grouping implementation: https://issues.apache.org/jira/browse/SOLR-2524

How was it decided that this would be used instead? It was not made very clear that LUCENE-1421 was in progress, which would effectively make the field collapsing work irrelevant by fixing the problem in Lucene rather than primarily in Solr.

This has cost me days of work merging our custom changes into the new implementation. I guess it is my own fault for basing our custom changes on an unresolved enhancement, but as SOLR-236 had been 3-4 years in progress and SOLR-2524 did not exist at the time, it seemed pretty safe to assume that the same problem was not being fixed in 2 totally different ways!

--
View this message in context: http://lucene.472066.n3.nabble.com/A-rant-about-field-collapsing-tp3222798p3222798.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: A rant about field collapsing
OK, thank you very much for clearing that up a little. I think another reason I was confused was that the wiki page for grouping was based on the original field collapsing plan at the time, which led me to the JIRA and hence the patch files. Rant over!

Perhaps you can help to clarify whether the current grouping changes work with SolrJ? In QueryResponse.setResponse() there is a loop which builds up the results object, but at present it has no check for "grouped" in the NamedList, so the SolrJ client gets no results back when searching with grouping parameters. I assume I can add this check to my local working copy and all will be well?
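To make the missing check concrete, here is a rough sketch of the kind of handling meant above. It is not SolrJ code: plain java.util maps stand in for the NamedList, and the names GroupedResponseSketch, collectGroupedDocs and sampleResponse are invented for illustration. It assumes the grouped response layout from the result grouping documentation (grouped, then the group field, then groups, each with a groupValue and a doclist containing docs).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain maps stand in for SolrJ's NamedList.
class GroupedResponseSketch {

    // Collects every document found under the "grouped" section of a response.
    // Returns an empty list when the response has no "grouped" entry.
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> collectGroupedDocs(Map<String, Object> response) {
        List<Map<String, Object>> docs = new ArrayList<>();
        Object grouped = response.get("grouped");
        if (!(grouped instanceof Map)) {
            return docs; // ungrouped response: nothing to collect here
        }
        // One entry per group field (or query) that was requested.
        for (Object perField : ((Map<String, Object>) grouped).values()) {
            Map<String, Object> command = (Map<String, Object>) perField;
            List<Map<String, Object>> groups =
                    (List<Map<String, Object>>) command.get("groups");
            if (groups == null) {
                continue;
            }
            for (Map<String, Object> group : groups) {
                Map<String, Object> docList = (Map<String, Object>) group.get("doclist");
                if (docList == null || docList.get("docs") == null) {
                    continue;
                }
                docs.addAll((List<Map<String, Object>>) docList.get("docs"));
            }
        }
        return docs;
    }

    static Map<String, Object> doc(String id) {
        Map<String, Object> d = new LinkedHashMap<>();
        d.put("id", id);
        return d;
    }

    static Map<String, Object> group(String value, List<Map<String, Object>> docs) {
        Map<String, Object> docList = new LinkedHashMap<>();
        docList.put("numFound", docs.size());
        docList.put("start", 0);
        docList.put("docs", docs);
        Map<String, Object> g = new LinkedHashMap<>();
        g.put("groupValue", value);
        g.put("doclist", docList);
        return g;
    }

    // Builds a small fake response grouped on a made-up "category" field.
    static Map<String, Object> sampleResponse() {
        List<Map<String, Object>> groups = new ArrayList<>();
        groups.add(group("books", Arrays.asList(doc("1"), doc("2"))));
        groups.add(group("music", Arrays.asList(doc("3"))));
        Map<String, Object> command = new LinkedHashMap<>();
        command.put("matches", 3);
        command.put("groups", groups);
        Map<String, Object> grouped = new LinkedHashMap<>();
        grouped.put("category", command);
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("grouped", grouped);
        return response;
    }

    public static void main(String[] args) {
        System.out.println(collectGroupedDocs(sampleResponse()).size());
    }
}
```

A real change to QueryResponse would read the same keys from the NamedList instead of from maps; whether flattening the groups into one list is the right result shape is a separate design question.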
Re: A rant about field collapsing
Many thanks. I took your changes from the following commits: SOLR-2642, SOLR-2637 and SOLR-2523.

I have gone without the group.main option, as in hindsight it is quite useful to use the GroupCommand and Group objects with the results. In particular, group.ngroups has optimized our code where we used to perform a secondary query to get the total number of groups (as we were using collapse.threshold).

With a bit of tweaking all my unit tests have gone green, except one: the sharding tests. We use grouping and sharding together, and I get a null pointer at QueryComponent.mergeIds(QueryComponent.java:492). It looks like this is because it is trying to use the "response". Very similar to https://issues.apache.org/jira/browse/SOLR-2270
Re: A rant about field collapsing
Sorry, I should have read the manual: "Distributed search support for result grouping has not yet been implemented."

I wonder if this is planned for any time soon? https://issues.apache.org/jira/browse/SOLR-2066 looks like it was based more on field collapsing than on grouping?
Re: A rant about field collapsing
Actually, I retract my last comment. The patch on SOLR-2066 looks like it could work after all. It gets further, but then dies in the HighlightComponent.
Invalid Date String for highlighting any date field match
I must be missing something. It appears to me that with Solr 3.2 and 3.3, if you highlight on a date field (e.g. by searching on *:*), the application blows up with:

ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: Invalid Date String:'1306406051000'
    at org.apache.solr.schema.DateField.parseMath(DateField.java:165)
    at org.apache.solr.analysis.TrieTokenizer.reset(TrieTokenizerFactory.java:106)
    at org.apache.solr.analysis.TrieTokenizer.<init>(TrieTokenizerFactory.java:76)
    at org.apache.solr.analysis.TrieTokenizerFactory.create(TrieTokenizerFactory.java:51)
    at org.apache.solr.analysis.TrieTokenizerFactory.create(TrieTokenizerFactory.java:41)
    at org.apache.solr.analysis.TokenizerChain.getStream(TokenizerChain.java:68)
    at org.apache.solr.analysis.SolrAnalyzer.reusableTokenStream(SolrAnalyzer.java:75)
    at org.apache.solr.schema.IndexSchema$SolrIndexAnalyzer.reusableTokenStream(IndexSchema.java:385)
    at org.apache.solr.highlight.DefaultSolrHighlighter.createAnalyzerTStream(DefaultSolrHighlighter.java:550)

I am using SolrJ beans to save Date objects to a schema type of 'date' or 'tdate'; it makes no difference which.

From what I can see this code will never work, as the DefaultSolrHighlighter passes the date as a millisecond string all the way down to the TrieTokenizer, which calls DateField.parseMath(), and this immediately rejects anything that is not formatted as a date string.
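To illustrate the mismatch, here is a small stand-alone sketch using only java.time, with no Solr classes. It takes the millisecond string from the exception and prints the same instant in the ISO-8601 UTC form that Solr date fields use (for example 1995-12-31T23:59:59Z). The names SolrDateSketch, looksLikeEpochMillis and millisToIsoDate are made up for illustration, and this is not a fix for the highlighter, only a way to see what the two representations look like side by side.

```java
import java.time.Instant;

// Hypothetical helper, not part of Solr: shows the two date representations.
class SolrDateSketch {

    // True when the value is a bare (optionally negative) integer, i.e. it
    // looks like epoch milliseconds rather than a formatted date string.
    static boolean looksLikeEpochMillis(String value) {
        return value.matches("-?\\d+");
    }

    // Converts an epoch-millisecond string to an ISO-8601 UTC timestamp.
    static String millisToIsoDate(String millis) {
        return Instant.ofEpochMilli(Long.parseLong(millis)).toString();
    }

    public static void main(String[] args) {
        String raw = "1306406051000"; // value taken from the exception message
        if (looksLikeEpochMillis(raw)) {
            System.out.println(millisToIsoDate(raw)); // prints 2011-05-26T10:34:11Z
        }
    }
}
```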