[GitHub] [lucene] jpountz commented on issue #11770: Optimization for time series data
jpountz commented on issue #11770: URL: https://github.com/apache/lucene/issues/11770#issuecomment-1247796727 > it seems that the core idea in this paper is similar to IndexSortSortedNumericDocValuesRangeQuery This is my understanding as well, though it says it uses the BKD tree to figure out the range of doc IDs, not doc values, which seems to be the idea that is proposed at https://github.com/apache/lucene/pull/687 (which I just realized I had completely forgotten about :grimacing:). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] jpountz merged pull request #1068: LUCENE-10674: Update subiterators when BitSetConjDISI exhausts
jpountz merged PR #1068: URL: https://github.com/apache/lucene/pull/1068
[GitHub] [lucene] shaie opened a new pull request, #11775: Minor refactoring and cleanup to taxonomy index code
shaie opened a new pull request, #11775: URL: https://github.com/apache/lucene/pull/11775
### Description
Aside from some cleanups (typos, improving comments), this PR addresses a few issues:
1. `DirTaxoWriter.nextID` is declared `volatile`, however `nextID++` is not a thread-safe operation. Switched to `AtomicInteger`.
2. The `DirTaxoReader` protected constructor couldn't really be extended since `TaxonomyIndexArrays` is package-private and isn't exported by the module. Therefore I think it's safe to change the constructor to package-private too.
3. Changed the [Double-Check Lock](https://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html) pattern implementation to assign the `volatile` field to a local variable, so that we do a volatile read only once if the reference isn't null.
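To make items 1 and 3 concrete, here is a minimal sketch of both patterns. It is not the code from the PR: the class `TaxoIdSketch`, its fields, and the placeholder array are hypothetical stand-ins chosen only to show the shape of the change.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the patterns described above; not the actual Lucene classes. */
final class TaxoIdSketch {
  // Instead of "volatile int nextID; nextID++" (a separate read and write that can
  // interleave across threads), AtomicInteger performs one atomic read-modify-write.
  private final AtomicInteger nextId = new AtomicInteger(0);

  // Lazily initialized value, published through a volatile field.
  private volatile int[] parents;

  /** Returns a unique id, even under concurrent callers. */
  int allocateId() {
    return nextId.getAndIncrement();
  }

  /** Double-checked locking with a local copy of the volatile field. */
  int[] parents() {
    int[] local = parents; // the only volatile read on the fast path
    if (local == null) {
      synchronized (this) {
        local = parents; // re-check under the lock
        if (local == null) {
          local = new int[] {-1}; // placeholder for the real initialization work
          parents = local; // volatile write publishes the fully built array
        }
      }
    }
    return local;
  }
}
```

Without the local variable, the fast path reads the volatile field twice (once for the null check, once for the return); the local copy reduces that to one read.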
[GitHub] [lucene] dweiss commented on pull request #11774: GH-11172: remove WindowsDirectory and native subproject.
dweiss commented on PR #11774: URL: https://github.com/apache/lucene/pull/11774#issuecomment-1247974260 Ah. missed the bull's eye, didn't I.
[GitHub] [lucene] rmuir commented on issue #11772: remove WindowsDirectory
rmuir commented on issue #11772: URL: https://github.com/apache/lucene/issues/11772#issuecomment-1248043771 +1
[GitHub] [lucene] uschindler commented on issue #11772: remove WindowsDirectory
uschindler commented on issue #11772: URL: https://github.com/apache/lucene/issues/11772#issuecomment-1248061894 Let's remove it. Actually the whole code is not tested at all. The removed Testcase extends LuceneTestCase and not BaseDirectoryTestcase. The only thing it does is to instantiate a Directory and an IndexOutput (!!) that is not even triggering custom code.
[GitHub] [lucene] dweiss merged pull request #11774: GH-11172: remove WindowsDirectory and native subproject.
dweiss merged PR #11774: URL: https://github.com/apache/lucene/pull/11774
[GitHub] [lucene] dweiss commented on issue #11772: remove WindowsDirectory
dweiss commented on issue #11772: URL: https://github.com/apache/lucene/issues/11772#issuecomment-1248177490 Applied on 9x and main.
[GitHub] [lucene] dweiss closed issue #11772: remove WindowsDirectory
dweiss closed issue #11772: remove WindowsDirectory URL: https://github.com/apache/lucene/issues/11772
[GitHub] [lucene] uschindler commented on pull request #11774: GH-11172: remove WindowsDirectory and native subproject.
uschindler commented on PR #11774: URL: https://github.com/apache/lucene/pull/11774#issuecomment-1248218763 Thanks. I was just wondering, why this strange title of PR with "GH-"? I would just put issue number in usual # notation. This does not highlight at all.
[GitHub] [lucene] llermaly opened a new issue, #11776: Non self intersecting polygons can't be indexed
llermaly opened a new issue, #11776: URL: https://github.com/apache/lucene/issues/11776 ### Description The following polygons are valid, but considered self intersecting by Lucene : ``` POLYGON ((8.8970989818779 54.4134906575883, 8.90042774485873 54.4146874897743, 8.90594809529893 54.4171621281855, 8.91004327482905 54.4202335124536, 8.9093605425 54.421660818, 8.923427357 54.429292006, 8.892597461 54.412569037, 8.8870137444 54.412826828, 8.8802484759 54.411739775, 8.8704911837 54.40738926, 8.8572578773 54.406537888, 8.832464316 54.410877071, 8.83028999859022 54.410779813, 8.8301056348 54.409738029, 8.83542087096422 54.4081201758963, 8.8434158599249 54.4059310703591, 8.8498879933749 54.4038371457592, 8.85426620240666 54.4029805394939, 8.85731191163137 54.4032660762063, 8.86483100908713 54.4043130389967, 8.87230838608615 54.4060266046257, 8.88148723601366 54.4091671397265, 8.88577026584612 54.4101189229804, 8.89195686439317 54.4116417778086, 8.8970989818779 54.4134906575883)) POLYGON ((7.89437024403906 47.5862590252318, 7.89312177803361 47.5869704801294, 7.89281574806746 47.5870946189537, 7.89525097569983 47.5857177586665, 7.89806367361792 47.5841274339808, 7.90068804512661 47.5825559862467, 7.89998956367071 47.5830121477752, 7.89515885167079 47.585809621127, 7.89437024403906 47.5862590252318)) POLYGON ((11.077430168 54.298432536, 11.0827805841396 54.2829539912519, 11.0830386027471 54.2818703102967, 11.0832788032709 54.2797565892852, 11.083278269 54.279818029, 11.0830099985918 54.282292149, 11.077430168 54.298432536)) ``` @craigtaverner made some research and noted the following : _all have a feature in common, a very narrow constriction, causing one side of the polygon to touch (almost) the other side. This is likely the source of the issue, and in-line with your theory regarding numerical errors._ _I've just written tests for these polygons in the latest version of lucene, and all three fail triangulation (a necessary step for indexing the polygons). 
So this is, I believe, the lucene tessellator (triangulator) code not being able to handle these points being so close to the other edge of the polygon._ _I think it is the indexing that is failing, since triangulation is only needed for indexing, so until you import it to the index, it is fine. I was reading the code in this area, and found an interesting comment about it not including some special floating point tricks to be more accurate. So it could be that we just have to implement those tricks. I saw a link to this page which might cover those needs. https://www.cs.cmu.edu/~quake/robust.html_ ### Version and environment details _No response_
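To illustrate the kind of numerical issue suspected here, the sketch below evaluates the standard 2D orientation test (the sign of the signed area of a triangle) twice: once in plain `double` arithmetic and once exactly. It is not Lucene's `Tessellator` code, and the class and method names are hypothetical. `BigDecimal` stands in for the adaptive-precision predicates described on the linked page; it is exact but much slower, so treat it as a diagnostic aid rather than a proposed fix.

```java
import java.math.BigDecimal;

/** Illustrative only; these helpers are hypothetical and not part of Lucene. */
final class OrientationSketch {
  /**
   * Sign of the signed area of triangle (a, b, c) in double arithmetic:
   * +1 counter-clockwise, -1 clockwise, 0 collinear. The subtractions and
   * products round, so the sign can be wrong for nearly collinear points.
   */
  static int orientDouble(double ax, double ay, double bx, double by, double cx, double cy) {
    double det = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
    return (int) Math.signum(det);
  }

  /**
   * Same determinant evaluated exactly. new BigDecimal(double) converts the
   * binary value without rounding, and subtract/multiply are exact.
   */
  static int orientExact(double ax, double ay, double bx, double by, double cx, double cy) {
    BigDecimal abx = new BigDecimal(bx).subtract(new BigDecimal(ax));
    BigDecimal aby = new BigDecimal(by).subtract(new BigDecimal(ay));
    BigDecimal acx = new BigDecimal(cx).subtract(new BigDecimal(ax));
    BigDecimal acy = new BigDecimal(cy).subtract(new BigDecimal(ay));
    return abx.multiply(acy).subtract(aby.multiply(acx)).signum();
  }
}
```

A cheap diagnostic is to evaluate both methods on vertex triples around the narrow constriction and flag any triple where the two signs differ.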
[GitHub] [lucene] jpountz commented on issue #11765: Query optimizer statistics
jpountz commented on issue #11765: URL: https://github.com/apache/lucene/issues/11765#issuecomment-1248281846 Lucene has a `QueryProfilerIndexSearcher` that allows capturing some of this information for a given search, but it adds a lot of overhead. The way that Lucene interleaves evaluation of queries doesn't allow tracking statistics in a cheap way.
[GitHub] [lucene] jpountz closed issue #11765: Query optimizer statistics
jpountz closed issue #11765: Query optimizer statistics URL: https://github.com/apache/lucene/issues/11765
[GitHub] [lucene] jpountz commented on issue #11761: Expand TieredMergePolicy deletePctAllowed limits
jpountz commented on issue #11761: URL: https://github.com/apache/lucene/issues/11761#issuecomment-1248288919 Historically this was not configurable and Lucene would allow up to 50% deleted documents. When we introduced an option, we made sure to introduce a lower bound on the value because a value of zero would essentially require Lucene to rewrite every segment that has a deletion after every update operation, which is certainly undesirable. Allowing users to go from 50% to 20% felt like a significant improvement already. We could discuss lowering the limit if we feel like this could lead to merging patterns that are still acceptable. E.g. I used `BaseMergePolicyTestCase#doTestSimulateUpdates` in the past to get a sense of how this option would influence write amplification.
[GitHub] [lucene] LuXugang commented on issue #11770: Optimization for time series data
LuXugang commented on issue #11770: URL: https://github.com/apache/lucene/issues/11770#issuecomment-1248298615 > Could you tell me which lucene's files should I read, so I could implement that algorithm? I think you could first read `IndexSortSortedNumericDocValuesRangeQuery`, then you would understand more about that paper. I would also be more than happy to learn from each other about Lucene with [WeChat](https://www.amazingkoala.com.cn/Lucene/2018/1204/22.html).
[GitHub] [lucene] mdmarshmallow commented on issue #11761: Expand TieredMergePolicy deletePctAllowed limits
mdmarshmallow commented on issue #11761: URL: https://github.com/apache/lucene/issues/11761#issuecomment-1248348122 Hi, thanks for the response! Your explanation of 0% not being allowed makes complete sense. For some context though, using our own forked version of `TieredMergePolicy`, we have tested with down to 2% allowable deletion and still see that behavior is desirable for us (more specifically much lower index sizes than at 20% deletes). Maybe if we want to maintain those limits, we could create a `setDeletesPctAllowedUnsafe()` or something like that which has no limits on it (or at least drops the lower bound to being > 0 instead of > 20)?
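For a rough sense of why the percentages discussed here matter for index size, the arithmetic can be sketched as below. This is a back-of-the-envelope model, not Lucene code. It assumes the limit is a percentage of all documents (live plus deleted) and that every document takes the same space. The class and method names are hypothetical.

```java
/** Back-of-the-envelope model only; not part of Lucene. */
final class DeletesOverheadSketch {
  /**
   * Worst-case ratio of total (live + deleted) documents to live documents
   * when at most pct percent of all documents may be deleted.
   */
  static double worstCaseSizeRatio(double pct) {
    if (pct < 0 || pct >= 100) {
      throw new IllegalArgumentException("pct must be in [0, 100), got " + pct);
    }
    // total = live + deleted and deleted <= pct/100 * total,
    // so total / live <= 100 / (100 - pct).
    return 100.0 / (100.0 - pct);
  }
}
```

Under these assumptions the worst case is 2.0x the live data at 50%, 1.25x at 20%, and roughly 1.02x at 2%. The model says nothing about the extra merging needed to stay under a lower limit, which is the write-amplification concern raised above.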
[GitHub] [lucene] llermaly opened a new issue, #11777: Unusually slow indexing polygons
llermaly opened a new issue, #11777: URL: https://github.com/apache/lucene/issues/11777 ### Description Some polygons are taking a lot of time to index (13MB, 15 minutes), while some much larger ones (50MB+) take just a couple of minutes. Attached are two of these polygons. [FE-2456.txt](https://github.com/apache/lucene/files/9577391/FE-2456.txt) [ORG-24132378.txt](https://github.com/apache/lucene/files/9577398/ORG-24132378.txt)
[GitHub] [lucene] llermaly commented on issue #11767: Does the method #cureLocalIntersections in the Tessellator make any sense?
llermaly commented on issue #11767: URL: https://github.com/apache/lucene/issues/11767#issuecomment-1248367379 Hi @iverase, it would be nice if you could go to https://github.com/apache/lucene/issues/11777 and test with those polygons as well. We are having Elastic Cloud timeouts because of the time it takes to triangulate.
[GitHub] [lucene] nknize commented on issue #11767: Does the method #cureLocalIntersections in the Tessellator make any sense?
nknize commented on issue #11767: URL: https://github.com/apache/lucene/issues/11767#issuecomment-1248382179
> My proposal is to remove the method completely or at least not call this method if the Tessellator has been called with the flag `checkSelfIntersections` set to true.
>
> @nknize introduced this method on the first version of the Tessellator, he might have more background about the need of this method. what do you think?

I was actually looking into this before turning my attention to the shape doc values and I was just getting ready to come back to it. The method was originally introduced to postpone self intersection removal unless absolutely necessary (e.g., tessellation failed). Essentially it was a lazy cleaning approach. I believe, though this needs thorough evaluation, some of the improvements made to [filter points](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/geo/Tessellator.java#L1326) rendered much of this logic obsolete. My last test (random and explicit) never actually exercised this logic, even with self intersections. Furthermore, the follow up SPLIT logic was not exercised either so I was exploring the possibility of removing both of these phases.

> if you could go to https://github.com/apache/lucene/issues/11777 and test with those polygons as well. We are having Elastic Cloud timeouts because of the time it takes to triangulate.

These adversarial cases are important to capture in our tests. Our randomized polygon generator doesn't inject any self intersections so we really have a gap in testing the logic coverage.
[GitHub] [lucene] patelprateek commented on issue #11765: Query optimizer statistics
patelprateek commented on issue #11765: URL: https://github.com/apache/lucene/issues/11765#issuecomment-1248413173 @jpountz: I read that after a query runs, Lucene uses a filter cache where it encodes the posting list using compressed bitmaps (roaring). Is there any API to retrieve these compressed bitmaps rather than iterating over the actual document IDs? My use case is that some filters can have a large number of hits (>10 million), and in such scenarios the compressed bitmaps could help downstream logic. Any recommendations or pointers for other approaches? For a query, is it possible to have a quick dry run to get the estimated number of documents it will return?
[GitHub] [lucene] llermaly commented on issue #11767: Does the method #cureLocalIntersections in the Tessellator make any sense?
llermaly commented on issue #11767: URL: https://github.com/apache/lucene/issues/11767#issuecomment-1248415566 Here I have some valid polygons being rejected as self-intersecting, in case they are useful for you to test: https://github.com/apache/lucene/issues/11776
[GitHub] [lucene] dweiss commented on pull request #11774: GH-11172: remove WindowsDirectory and native subproject.
dweiss commented on PR #11774: URL: https://github.com/apache/lucene/pull/11774#issuecomment-1248464122 This is an alternative notation for issue numbers that GitHub actually understands; see commit links, for example. I tend to prefer it to hash+number because hash+number is treated as a commented line if you edit any previous commit (rebase interactive, amend, etc.)... There are ways to work around it, but it requires some manual tweaks - I prefer to just use GH-xxx. https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/autolinked-references-and-urls#issues-and-pull-requests
[GitHub] [lucene] danmuzi commented on pull request #11774: GH-11172: remove WindowsDirectory and native subproject.
danmuzi commented on PR #11774: URL: https://github.com/apache/lucene/pull/11774#issuecomment-1248479475 I think the issue number for this patch is wrong again. It needs to be changed from #11172 to #11772.
[GitHub] [lucene] danmuzi opened a new issue, #11778: add detailed part-of-speech tag for particle and ending on Nori
danmuzi opened a new issue, #11778: URL: https://github.com/apache/lucene/issues/11778 ### Description There are several tag types for **particle**(조사) and **ending**(어미) in mecab-ko-dic. (https://docs.google.com/spreadsheets/d/1-9blXKjtjeKZqsf4NzHeYJCrr49-nXeRF6D80udfcwY) But Nori only tags **J(particle), E(ending)** for them. When using a Korean morpheme analyzer, detailed part-of-speech information is often required (e.g., for debugging a misanalysis). There are also cases where a user wants to remove a specific POS tag. For this, Lucene currently supports `KoreanPartOfSpeechStopFilter`. With the current structure, however, it is impossible to remove a specific tag for particle and ending (e.g., I only want to remove the "Sentence-closing ending" POS tag). To solve that, detailed POS tag information for particle and ending is needed.
[GitHub] [lucene] danmuzi opened a new pull request, #11779: GITHUB#11778: Add detailed part-of-speech tag for particle and ending on Nori
danmuzi opened a new pull request, #11779: URL: https://github.com/apache/lucene/pull/11779 add detailed part-of-speech tag for particle and ending on nori. The part-of-speech name was set based on the **Korean-English Learners' Dictionary** of [National Institute of the Korean Language](https://korean.go.kr/front_eng/main.do). (https://krdict.korean.go.kr/eng/) Closes #11778
[GitHub] [lucene] janhoy commented on issue #10269: Lucene web site broken links [LUCENE-9229]
janhoy commented on issue #10269: URL: https://github.com/apache/lucene/issues/10269#issuecomment-1248570742 I'll close this old issue. Anyone discovering any new broken links on the site can fix those in new PRs.
[GitHub] [lucene] janhoy closed issue #10269: Lucene web site broken links [LUCENE-9229]
janhoy closed issue #10269: Lucene web site broken links [LUCENE-9229] URL: https://github.com/apache/lucene/issues/10269
[GitHub] [lucene] janhoy commented on pull request #591: LUCENE-10365 Wizard changes contributed from Solr
janhoy commented on PR #591: URL: https://github.com/apache/lucene/pull/591#issuecomment-1248760591 @msokolov This has been hanging for a while, and I'll now merge it into main and then to branch_9x. Just thought I'd alert you as 9.4.0 RM, although I don't anticipate any issues with the ongoing release, as these are mostly bug fixes and improvements related to release signing (ability to sign with gradle plugin instead of GPG). I'll let you make the call whether you merge it into branch_9_4 for use with this release.
[GitHub] [lucene] iverase commented on issue #11767: Does the method #cureLocalIntersections in the Tessellator make any sense?
iverase commented on issue #11767: URL: https://github.com/apache/lucene/issues/11767#issuecomment-1248975086
> The method was originally introduced to postpone self intersection removal

I don't understand this. We are claiming in the javadocs that polygons should not be self-intersecting and we do not introduce self-intersections in our code, so why would we want to remove them?
```
 * Requires valid polygons:
 *
 * No self intersections
 * Holes may only touch at one vertex
 * Polygon must have an area (e.g., no "line" boxes)
 * sensitive to overflow (e.g, subatomic values such as E-200 can cause unexpected
 * behavior)
```
Looking at the original code that the tessellator is inspired by, the method was introduced to handle some OSM polygons that contain self-intersections and are hence not valid: https://github.com/mapbox/earcut/issues/8 As we claim we only support valid polygons, I think it is safe to remove the method entirely. @llermaly I found this issue by looking into one of your polygons so we should expect nice performance improvements.
[GitHub] [lucene] nknize commented on issue #11767: Does the method #cureLocalIntersections in the Tessellator make any sense?
nknize commented on issue #11767: URL: https://github.com/apache/lucene/issues/11767#issuecomment-124898
> I don't understand this. We re claiming in the java docs that polygons should not be self-intersecting and we do not introduce self-intersections in our code, why we want to remove them?

Because real life Geo data doesn't care what our javadocs say. Small self intersections are a reality that rears its head every now and then in real data, and the performance hit to "best effort" detect and clean in the tessellator's cure phase at index time was worth more than directing users to a third party cleaning utility before indexing. Our blind polygon class does nothing to enforce those javadocs, so maybe before completely removing it we might consider flagging that phase as an optional validation step disabled by default? (Think `ignore_malformed`)