gibrown commented on issue #11507: URL: https://github.com/apache/lucene/issues/11507#issuecomment-1495137559
I'll preface this by saying that I am also skeptical that going beyond 1024 dimensions makes sense for most use cases, and that scaling is a concern. However, amid the current excitement about trying OpenAI embeddings, the first choice of system to store and use those embeddings was Elasticsearch. People then ran into the 1024 limit, so various folks are now looking at alternatives, largely because of this limit. The use cases tend to be Q/A, summarization, and recommendation systems for WordPress and Tumblr. People have built multiple proof-of-concept systems (typically on top of various TypeScript, JavaScript, or Python libs) that use the OpenAI embeddings directly and give quite impressive results. Even though I am pretty certain that reducing the dimensions will be a better idea for many of these, the ability to build and prototype on higher dimensions would be extremely useful.