[jira] [Commented] (LUCENE-9450) Taxonomy index should use DocValues not StoredFields
    [ https://issues.apache.org/jira/browse/LUCENE-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17382822#comment-17382822 ]

Mayya Sharipova commented on LUCENE-9450:
-----------------------------------------

[~gworah] That's indeed a concern. The workaround would be to add a binary doc values field in version 8.x, force merge to a single segment, so that a FieldInfo for $full_path$ contains doc values as well, and then upgrade to 9.0. We don't do data structures consistency checks for older indices on individual docs, just on a segment level. Do you think it is a viable workaround?

> Taxonomy index should use DocValues not StoredFields
> ----------------------------------------------------
>
>                 Key: LUCENE-9450
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9450
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>    Affects Versions: 8.5.2
>            Reporter: Gautam Worah
>            Priority: Minor
>              Labels: performance
>             Fix For: main (9.0)
>
>         Attachments: LUCENE-9450-localrun.py-v1, wip_taxonomy_patch
>
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The taxonomy index that maps binning labels to ordinals was created before
> Lucene added BinaryDocValues.
> I've attached a WIP patch (does not pass tests currently)
> Issue suggested by [~mikemccand]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] mocobeta commented on pull request #215: LUCENE-10028: Add git pre-commit hook that runs precommit task.
mocobeta commented on pull request #215:
URL: https://github.com/apache/lucene/pull/215#issuecomment-882058333

It surely won't fit everyone's (especially git experts') use cases, and it's a local setup anyway; devs should be able to set up a `pre-commit` or `pre-push` hook without the help of Gradle if they'd like. I threw this in because I would like to make sure all linters are run locally before pushing changes to the remote repo (I almost always forget to do that), though I don't intend to force or recommend that others do so.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[GitHub] [lucene] dsmiley opened a new pull request #216: Introduce DocTermVectors in lieu of Fields.
dsmiley opened a new pull request #216:
URL: https://github.com/apache/lucene/pull/216

https://issues.apache.org/jira/browse/LUCENE-10018

Let's not use the Fields class anymore for TermVectors. In this PR, we introduce a new class "DocTermVectors" in its stead.
[GitHub] [lucene] dsmiley commented on pull request #216: Introduce DocTermVectors in lieu of Fields.
dsmiley commented on pull request #216:
URL: https://github.com/apache/lucene/pull/216#issuecomment-882247189

In the first commit of this PR, I introduce "DocTermVectors" subclassing Fields. Another commit can inline Fields. What do we think of the name?
[jira] [Resolved] (LUCENE-9935) Bulk merges for stored fields when index sorting is enabled
    [ https://issues.apache.org/jira/browse/LUCENE-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand resolved LUCENE-9935.
----------------------------------
    Resolution: Fixed

It looks like this issue has been fully merged, so marking fixed.

> Bulk merges for stored fields when index sorting is enabled
> -----------------------------------------------------------
>
>                 Key: LUCENE-9935
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9935
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Nhat Nguyen
>            Priority: Minor
>             Fix For: 9.0, 8.10
>
>          Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Today stored fields disable bulk merges entirely when index sorting is
> enabled. However when sorting by low-cardinality fields or when the index
> sort is correlated with the order in which documents get indexed, we could
> likely still have efficient bulk merges.
> For instance, if you are merging two segments that are sorted on a field that
> can only take 2 values, one could bulk merge the first half of the first
> segment, then the first half of the second segment, then the second half of
> the first segment, and finally the second half of the second segment.
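
The two-valued example in the LUCENE-9935 description can be illustrated with a small standalone sketch. This is not Lucene code and does not reflect how the merged change is implemented: `plan_bulk_copies` is a hypothetical helper, each segment is modelled as a plain list of already-sorted sort-key values, and ties between segments are assumed to go to the lower segment index. It only shows that a merge of sorted segments decomposes into contiguous runs of documents, each of which could in principle be copied in bulk.

```python
from typing import List, Optional, Tuple

# (segment index, first doc, end doc exclusive); a hypothetical representation.
Run = Tuple[int, int, int]


def plan_bulk_copies(segments: List[List[int]]) -> List[Run]:
    """Plan a merge of sorted segments as contiguous runs of documents.

    Each segment is a list of sort-key values in index-sort order.
    Assumption: on equal keys, the lower segment index goes first.
    """
    cursors = [0] * len(segments)
    runs: List[Run] = []
    while True:
        # Pick the segment whose next document has the smallest sort key.
        best: Optional[int] = None
        for i, seg in enumerate(segments):
            if cursors[i] >= len(seg):
                continue
            if best is None or seg[cursors[i]] < segments[best][cursors[best]]:
                best = i
        if best is None:
            break  # every segment is exhausted

        # Take all consecutive documents in that segment sharing this key.
        seg = segments[best]
        start = cursors[best]
        end = start
        while end < len(seg) and seg[end] == seg[start]:
            end += 1
        cursors[best] = end

        # Coalesce with the previous run when it is adjacent in the same segment.
        if runs and runs[-1][0] == best and runs[-1][2] == start:
            runs[-1] = (best, runs[-1][1], end)
        else:
            runs.append((best, start, end))
    return runs


# Two segments sorted on a field that can only take 2 values.
print(plan_bulk_copies([[0, 0, 1, 1], [0, 0, 1, 1]]))
# prints [(0, 0, 2), (1, 0, 2), (0, 2, 4), (1, 2, 4)]
```

The four runs match the order given in the description: first half of the first segment, first half of the second, second half of the first, second half of the second. With a high-cardinality sort key that interleaves heavily across segments, the same plan would degrade into many one-document runs, which is consistent with the description limiting the idea to low-cardinality or index-order-correlated sorts.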