nknize commented on PR #12287: URL: https://github.com/apache/lucene/pull/12287#issuecomment-1883224420
> There are existing tests which should fails. https://github.com/apache/lucene/blob/main/lucene/core/src/test/org/apache/lucene/document/TestShapeDocValues.java#L67 > However because we are catching exception and try until we get the correct polygon, it never fails. That try-catch is intentional. The evil, esoteric / invalid test geometry generation is also intentional so we can maximize our test coverage across the tessellator and Shape indexing. This is why we have [Tessellator logic](https://github.com/apache/lucene/blob/4d916a754be608e7ffb4e5626e4f99b21b8513f1/lucene/core/src/java/org/apache/lucene/geo/Tessellator.java#L1293) to filter colinear and coplanar points so we can best effort index shapes that may not be well formed or may not have been pre-validated or cleaned. While we expect well formed data, that's not the real world. Users (especially geo users) often throw dirty data at the API. And to pre-validate geometries is an expensive operation. Not all users want to pay that price. This is why we left much of that diligence out of Lucene and in the hands of the integrating API. For example, see Elasticsearch [`ignore_malformed` parameter](https://github.com/elastic/elasticsearch/pull/24654). For the explicit shape doc value boundi ng box test referenced, we expect a valid polygon bounding box for test purposes, but (as just pointed out) that reused polygon generator could create a malformed one and an invalid bounding box. Rather than write a new polygon generator I opted to reuse the existing one and just retry until a valid polygon is created. What you could do is just write a new polygon generator that always creates a valid well-formed polygon. I'd be concerned, though, about new contributors always using that to create perfect data, which is not real world. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org