benwtrent commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1307208338

   > We have to start building up tests for these cases because this seems like 
deja vu as far as int overflows in this area.
   
   I am right there with ya @rmuir. 100% feels like "whack-a-mole".
   
   > Looks like we need more vectors but they can have less dimensions to 
trigger this one? 
   
   Yeah, we can probably trigger this overflow by using `16268815` `byte` 
vectors with only a few dimensions. Something as small as 2 dimensions could work.
   
   One issue with HNSW is that completely random vectors can make indexing 
run dog-slow. Maybe using only a few dimensions could alleviate this.
   
   > but its still pretty slow so I'm gonna leave it running.
   
   Thanks for digging into writing up a test. I am thinking about how to test it. 
I was initially thinking about moving all these numeric calculations outside of 
the I/O path, but that pattern is not prevalent in Lucene. Also, "unit tests" 
of this type, which wouldn't ACTUALLY use large amounts of data, still won't 
solve our lack of coverage in larger test scenarios.
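   
   For illustration only, here is a rough sketch of what pulling the offset arithmetic out of the I/O path could look like. The class name, method name, and parameters are hypothetical and not taken from the Lucene code base; the point is only that a pure function can be checked against huge ordinals without writing any data.

```java
// Hypothetical helper, not part of Lucene: offset arithmetic kept free of I/O
// so it can be exercised with very large inputs in a plain unit test.
final class VectorOffsets {
  private VectorOffsets() {}

  /**
   * Byte offset of the vector with ordinal {@code ord}, assuming fixed-size
   * vectors stored back to back starting at {@code baseOffset}.
   * All arithmetic is done in 64 bits and throws ArithmeticException on overflow.
   */
  static long vectorOffset(long baseOffset, int ord, int dimension, int bytesPerElement) {
    long vectorBytes = Math.multiplyExact((long) dimension, (long) bytesPerElement);
    return Math.addExact(baseOffset, Math.multiplyExact((long) ord, vectorBytes));
  }
}
```

   A test can then pass `Integer.MAX_VALUE` as the ordinal directly. As noted above, this only covers the arithmetic, not the behavior of a real large index.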
   
   ---------
   
   As an aside @rmuir, all the scenarios fixed here had a Java auto-cast 
warning. IntelliJ's analyzer reports over 100 instances of integer 
multiplication being implicitly cast to long. It may be prudent to dig through 
these warnings and see if they need fixing. What do you think?
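   
   For anyone unfamiliar with this class of warning, a minimal sketch of the pattern follows. The names and values are made up for illustration and do not come from the code in this PR.

```java
// Hypothetical demo of "int multiplication implicitly cast to long".
final class IntOverflowDemo {
  private IntOverflowDemo() {}

  // Flagged pattern: int * int is evaluated in 32 bits and wraps around;
  // only the already-wrapped result is widened to long.
  static long offsetBuggy(int ord, int bytesPerVector) {
    return ord * bytesPerVector;
  }

  // Fix: widen one operand first so the multiplication happens in 64 bits.
  static long offsetFixed(int ord, int bytesPerVector) {
    return (long) ord * bytesPerVector;
  }

  public static void main(String[] args) {
    int ord = 20_000_000;      // illustrative vector ordinal
    int bytesPerVector = 128;  // illustrative per-vector size
    System.out.println(offsetBuggy(ord, bytesPerVector)); // negative: wrapped past Integer.MAX_VALUE
    System.out.println(offsetFixed(ord, bytesPerVector)); // 2560000000
  }
}
```

   Not every flagged site is necessarily a bug: if both operands are known to be small, the 32-bit product cannot wrap. That is why each warning would need to be looked at individually.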


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

