Here's the Jira issue for the distributed search problem: https://issues.apache.org/jira/browse/SOLR-1632
I tried applying this patch but get the same error that is posted in the discussion section for that issue. I will be glad to help too on this one. (For anyone following along, I've put a few rough sketches of the setups being discussed at the bottom of this message.)

On Sat, Oct 23, 2010 at 2:35 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Ah, I should have read more carefully...
>
> I remember this being discussed on the dev list, and I thought there might
> be a Jira attached, but I sure can't find it.
>
> If you're willing to work on it, you might hop over to the solr dev list
> and start a discussion, maybe ask for a place to start. I'm sure some of
> the devs have thought about this...
>
> If nobody on the dev list says "There's already a JIRA on it", then you
> should open one. The Jira issues are generally preferred when you start
> getting into design, because the comments are preserved for the next
> person who tries the idea or makes changes, etc.
>
> Best
> Erick
>
> On Wed, Oct 20, 2010 at 9:52 PM, Ben Boggess <ben.bogg...@gmail.com> wrote:
>
> > Thanks Erick. The problem with multiple cores is that the documents are
> > scored independently in each core. I would like to be able to search
> > across both cores and have the scores 'normalized' in a way that's
> > similar to what Lucene's MultiSearcher would do. As far as I understand,
> > multiple cores would likely result in seriously skewed scores in my case
> > since the documents are not distributed evenly or randomly. I could have
> > one core/index with 20 million docs and another with 200.
> >
> > I've poked around in the code and this feature doesn't seem to exist. I
> > would be happy with finding a decent place to try to add it. I'm not
> > sure if there is a clean place for it.
> >
> > Ben
> >
> > On Oct 20, 2010, at 8:36 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> > > It seems to me that multiple cores are along the lines you need: a
> > > single instance of Solr that can search across multiple sub-indexes
> > > that do not necessarily share schemas, and are independently
> > > maintainable...
> > >
> > > This might be a good place to start: http://wiki.apache.org/solr/CoreAdmin
> > >
> > > HTH
> > > Erick
> > >
> > > On Wed, Oct 20, 2010 at 3:23 PM, ben boggess <ben.bogg...@gmail.com> wrote:
> > >
> > > > We are trying to convert a Lucene-based search solution to a
> > > > Solr/Lucene-based solution. The problem we have is that we currently
> > > > have our data split into many indexes, and Solr expects things to be
> > > > in a single index unless you're sharding. In addition to this, our
> > > > indexes wouldn't work well using the distributed search functionality
> > > > in Solr because the documents are not evenly or randomly distributed.
> > > > We are currently using Lucene's MultiSearcher to search over subsets
> > > > of these indexes.
> > > >
> > > > I know this has been brought up a number of times in previous posts,
> > > > and the typical response is that the best thing to do is to convert
> > > > everything into a single index. One of the major reasons for having
> > > > the indexes split up the way we do is that different types of data
> > > > need to be indexed at different intervals. You may need one index to
> > > > be updated every 20 minutes while another is only updated every week.
> > > > If we move to a single index, then we will constantly be warming and
> > > > replacing searchers for the entire dataset, which will essentially
> > > > render the searcher caches useless. If we were able to have multiple
> > > > indexes, they would each have a searcher, and updates would be
> > > > isolated to a subset of the data.
> > > >
> > > > The other problem is that we will likely need to shard this large
> > > > single index, and there isn't a clean way to shard randomly and
> > > > evenly across the whole of the data. We would, however, like to shard
> > > > a single data type. If we could use multiple indexes, we would likely
> > > > also be sharding only a small subset of them.
> > > >
> > > > Thanks in advance,
> > > >
> > > > Ben
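To make the score-normalization point a bit more concrete, here is roughly what the MultiSearcher approach described above looks like on the plain Lucene side. This is only a sketch against the Lucene 2.9/3.0-era API; the index paths, field name and query are made-up placeholders:

    import java.io.File;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.*;
    import org.apache.lucene.store.FSDirectory;

    public class MultiSearcherDemo {
        public static void main(String[] args) throws Exception {
            // One read-only searcher per physical index; the indexes can be
            // wildly different in size (e.g. 20M docs vs. 200 docs).
            IndexSearcher big   = new IndexSearcher(FSDirectory.open(new File("/indexes/big")), true);
            IndexSearcher small = new IndexSearcher(FSDirectory.open(new File("/indexes/small")), true);

            // MultiSearcher gathers docFreq across all sub-searchers before
            // scoring, so idf (and therefore scores) are computed against the
            // combined collection rather than per index.
            MultiSearcher searcher = new MultiSearcher(new Searchable[] { big, small });
            TopDocs hits = searcher.search(new TermQuery(new Term("body", "solr")), 10);
            System.out.println("total hits: " + hits.totalHits);

            searcher.close();
        }
    }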
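On the Solr side, the multi-core setup Erick points to (http://wiki.apache.org/solr/CoreAdmin) would look something like the sketch below: one Solr instance with one core per update cadence. The core names and directories are made up:

    <!-- solr.xml (Solr 1.4-style), placed in the solr home directory -->
    <solr persistent="true">
      <cores adminPath="/admin/cores">
        <!-- updated every ~20 minutes -->
        <core name="frequent" instanceDir="frequent" />
        <!-- updated weekly -->
        <core name="weekly"   instanceDir="weekly" />
      </cores>
    </solr>

Each core keeps its own searcher and caches, so committing to the frequently updated core doesn't throw away the warmed searcher for the weekly one. What's missing, as Ben says, is a way to query across the cores with scores normalized over the combined collection.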
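As far as I know, the built-in way to query across those cores today is distributed search with the shards parameter, along these lines (host and port are placeholders):

    http://localhost:8983/solr/frequent/select?q=solr&shards=localhost:8983/solr/frequent,localhost:8983/solr/weekly

Each shard scores its hits using its own local term statistics and the coordinator merges the results by raw score, which is exactly the skew described above when one index has 20 million docs and another 200, and is what SOLR-1632 (distributed IDF) is meant to address.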