We have found that 200-250 MB per Lucene index is where efficiency
drops off and Lucene gets slow. You will have to use a sharding
approach: many small indexes, each holding a different set of
documents. Solr has a tool for running queries across many shards,
called Distributed Search.
http://wiki.apa
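As a rough illustration of the Distributed Search idea, a query is sent to one Solr instance with a `shards` parameter listing the other instances to fan out to. The sketch below only builds such a URL; the host names, the core path, and the helper name `distributed_query_url` are made up for the example and are not from this thread.

```python
from urllib.parse import urlencode


def distributed_query_url(coordinator, shards, query, rows=10):
    """Build a select URL that asks one Solr node to query several shards.

    coordinator: base URL of the node receiving the request (assumed layout).
    shards: list of "host:port/path" entries, written without "http://".
    """
    params = {
        "q": query,
        "rows": rows,
        # Solr expects the shard list as one comma-separated value.
        "shards": ",".join(shards),
    }
    return coordinator.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical hosts, for illustration only.
url = distributed_query_url(
    "http://localhost:8983/solr",
    ["shard1.example.com:8983/solr", "shard2.example.com:8983/solr"],
    "title:lucene",
)
print(url)
```

Each shard still has to be indexed separately with its own slice of the documents; the `shards` parameter only affects how the query is executed.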
We will be ingesting gigabytes of new data per day, but have a lot of legacy
data (petabytes) that will also need to be indexed. We will probably index
many fields per record (ave. 50/record) and hope to add facets in the near
future.
If this solution gives us the speed and facet capabilities we
Liz,
I've built terabyte-scale (1-2 TB) test Lucene indexes, but have not
reached the petabyte level, so I am not sure. Certainly there is
overhead in the HTTP and XML marshaling/de-marshaling, which may
or may not be a critical factor for you.
Could you give more information with respect to
We do have synonyms.txt in our config directory. The config directory is a
copy of the example directory. We will probably also run into this problem
with stopwords.xml.
We don't understand how to make it look in the correct directory. We
thought it got the correct directory out of the solrconf
I was worried that it wouldn't scale. We are going to be indexing petabytes
of data. Does the httpserver solution scale?
Thanks
Liz Sommers
lizswo...@gmail.com
On Tue, Aug 24, 2010 at 12:23 PM, Thomas Joiner
wrote:
> Is there any reason you aren't using http://wiki.apache.org/solr/Solrj to
>
Hello!
The exception thrown by Solr says that you do not have a synonyms.txt
file either in the classpath or in the Solr core config directory. Check
your schema.xml file for a filter named SynonymFilterFactory. That filter
reads its synonym definitions from the synonyms.txt file. If you don't need
the synonyms filter
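One way to check this before starting Solr is to scan schema.xml for filters that name an external file and see whether that file is present in the config directory. This is only a sketch: the helper name `missing_filter_files` is invented here, it covers just the `synonyms` attribute of SynonymFilterFactory and the `words` attribute of StopFilterFactory, and it does not look on the classpath, which Solr also searches.

```python
import os
import xml.etree.ElementTree as ET

# Filter class suffix -> attribute that names the resource file.
FILE_ATTRIBUTES = {
    "SynonymFilterFactory": "synonyms",
    "StopFilterFactory": "words",
}


def missing_filter_files(conf_dir):
    """Return file names referenced in schema.xml but absent from conf_dir."""
    schema_path = os.path.join(conf_dir, "schema.xml")
    missing = []
    for node in ET.parse(schema_path).getroot().iter("filter"):
        class_name = node.get("class", "")
        for suffix, attribute in FILE_ATTRIBUTES.items():
            if not class_name.endswith(suffix):
                continue
            file_name = node.get(attribute)
            if not file_name:
                continue
            # Only the config directory is checked, not the classpath.
            if not os.path.isfile(os.path.join(conf_dir, file_name)):
                if file_name not in missing:
                    missing.append(file_name)
    return missing
```

Calling `missing_filter_files` with the path of the core's conf directory returns an empty list when every referenced file is found there.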
Is there any reason you aren't using http://wiki.apache.org/solr/Solrj to
interact with Solr?
On Tue, Aug 24, 2010 at 11:12 AM, Liz Sommers wrote:
> I am very new to the Solr/Lucene world. I am using Solr 1.4.0 and cannot
> move to 1.4.1.
>
> I have to index about 50 fields for each document, t