Hello,

There are a few features I would like to see in SOLR going forward, and I am interested in finding out what other folks think about them so that we can arrive at a priority list. I believe there are many features that Google and FAST have that SOLR and Lucene will want to implement in future releases.

1. Machine-learning-based suggest feature (https://issues.apache.org/jira/browse/LUCENE-626), which is similar to what Google does in its suggest implementation. The fuzzy-based spellchecker is OK, but it would be better to incorporate user behavior.

2. Realtime updates (https://issues.apache.org/jira/browse/LUCENE-1313), along with the work being planned for IndexWriter.

3. Realtime untokenized field updates (https://issues.apache.org/jira/browse/LUCENE-1292).

4. BM25 scoring.

5. Integration with an open source SQL database such as H2. Under the hood, SOLR would store data in a relational database to allow for joins and similar operations. It would need to be combined with realtime updates. H2 has Lucene integration, but it takes the usual index-everything-at-once, non-incremental approach. The new system would simply index each new row as it is added to a table. The SOLR schema could allow certain fields to be stored in an SQL database.

6. SOLR schema allowing for multiple indexes without using multicore. The indexes could be defined like SQL tables in the schema.xml file.

7. Crowd-by feature a la GBase (http://code.google.com/apis/base/attrs-queries.html#crowding), which is similar to Field Collapsing. I think it is advantageous from a performance perspective to fetch more results than needed and then filter the result set down, rather than sorting the result set first.

8. Improved relevance based on user clicks on individual results for individual queries. This can be thought of as similar to what Digg does, and I'm sure Google does something similar. It is a feature that would be of value to almost any SOLR implementation.

9. Integration of LocalSolr into the standard SOLR distribution. Location is something many sites use these days, and it is standard in GBase and most likely in other products like FAST.

10. Distributed search and updates using object serialization, which could use https://issues.apache.org/jira/browse/LUCENE-1336. This allows span queries, custom payload queries, custom similarities, and custom analyzers without compiling and deploying a new SOLR war file to individual servers.

Cheers,
Jason