> We are beginners to Apache SOLR, We need following clarifications from you.
>
> 1.      In SOLRCloud, How can we install more than one Shared on Single PC? 

You typically have one installation of Solr (one node) on each server. A single 
node can host several shards, so you just add a collection with multiple shards, 
specifying how many you want when creating the collection, e.g.

bin/solr create -c mycoll -shards 4

Although possible, it is normally not advised to install multiple instances of 
Solr on the same server.
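
The same collection can also be created over HTTP through the Collections API. 
Here is a minimal sketch in Python that only builds the request URL; the host, 
port and replicationFactor value are assumptions for illustration, not 
something from your setup.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor=1):
    """Build a Collections API CREATE URL. No request is sent here."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,  # assumed default of 1
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Assumed local node on the default port; adjust to your environment.
url = create_collection_url("http://localhost:8983/solr", "mycoll", 4)
print(url)
```

Opening that URL (browser, curl, or urllib) should have the same effect as the 
bin/solr command above, with all 4 shards living on whatever nodes you have.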

> 2.      How many maximum number of shared can be added under on SOLRCloud?

There is no limit. You should find a good number based on the number of 
documents, the size of your data, the number of servers in your cluster, 
available RAM and disk size and the required performance.

In practice you will guess the initial #shards and then benchmark a few 
different settings before you decide.
Note that you can also adjust the number of shards later through the 
CREATESHARD / SPLITSHARD APIs (CREATESHARD only applies to collections using 
the implicit router), so even if you start out with few shards you can grow 
later.
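
As a rough illustration of growing later, splitting an existing shard is a 
single Collections API call. The sketch below only builds the URL; the host and 
the names "mycoll" and "shard1" are assumed examples.

```python
from urllib.parse import urlencode


def split_shard_url(base_url, collection, shard):
    """Build a Collections API SPLITSHARD URL. No request is sent here."""
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Assumed names: collection "mycoll" with a shard called "shard1".
split_url = split_shard_url("http://localhost:8983/solr", "mycoll", "shard1")
print(split_url)
```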

> 3.      In my application there is no need of ACID properties, other than
> this can I use SOLR as a Complete Database?

You COULD, but Solr is not intended to be your primary data store. You should 
always design your system so that you can re-index all content from some source 
(does not need to be a database) when needed. There are several situations that 
call for a complete re-index, e.g. schema or analysis changes, major version 
upgrades or a corrupted index.

> 4.      In Which OS we can feel the better performance, Windows Server OS /
> Linux?

I'd say Linux if you can. If you HAVE to, then you could also run on Windows :-)

> 5.      If a SOLR Core contains 2 Billion indexes, what is the recommended
> RAM size and Java heap space for better performance? 

It depends. It is not likely that you will ever put 2bn docs in one single 
core. Lucene caps a single index at roughly 2.1 billion documents anyway, and 
normally you would have sharded long before that number.
The amount of physical RAM and the amount of Java heap to allocate to Solr must 
be calculated and decided on a per case basis.
You could also benchmark this - test if a larger RAM size improves performance 
due to caching. Depending on your bottlenecks, adding more RAM may be a way to 
scale further before needing to add more servers.

With these amounts of data, it sounds like you should consult a Solr expert to 
dive deep into your exact use case and architect the optimal setup.

> 6.      I have 20 fields per document, how many maximum number of documents
> can be inserted / retrieved in a single request?

No limit. But there are practical limits.
For indexing (update), try various batch sizes and find which gives the best 
performance for you. It is just as important to do inserts (updates) over many 
parallel connections as in large batches.
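
To make the batching plus parallel connections idea concrete, here is a minimal 
sketch using only the Python standard library. The batch size, worker count and 
update URL are assumptions you would tune by benchmarking, and the helper names 
are made up for this example.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen


def batches(docs, size):
    """Yield consecutive slices of docs with at most `size` items each."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def post_batch(update_url, batch):
    """POST one batch as a JSON array to Solr's update handler."""
    req = Request(
        update_url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return resp.status


def index_parallel(docs, batch_size, workers, send):
    """Push every batch through `send` using `workers` threads.

    Results come back in batch order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, batches(docs, batch_size)))
```

Usage would look something like 
index_parallel(docs, 500, 4, lambda b: post_batch("http://localhost:8983/solr/mycoll/update", b)), 
followed by a commit. Try a few combinations of batch_size and workers and 
measure throughput.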

For searching, why would you want to know a maximum? Normally the use case for 
search is to get the TOP N docs, not as many as possible.
If you need to retrieve thousands of results, you should have a look at the 
/export handler and/or streaming expressions.
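
For reference, an /export request needs a sort and an fl parameter, and the 
fields involved must have docValues enabled. A small sketch that builds such a 
request URL follows; the collection name and the "price" field are assumed 
examples.

```python
from urllib.parse import urlencode


def export_url(base_url, collection, query, sort, fields):
    """Build an /export request URL. No request is sent here."""
    params = {"q": query, "sort": sort, "fl": ",".join(fields)}
    return f"{base_url}/{collection}/export?{urlencode(params)}"


# Assumed collection "mycoll" with docValues on "id" and "price".
exp_url = export_url("http://localhost:8983/solr", "mycoll",
                     "*:*", "id asc", ["id", "price"])
print(exp_url)
```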

> 7.       If I have Billions of indexes, If the "start" parameter is 10th
> Million index and "end" parameter is  start+100th index, for this case any
> performance issue will be raised ?

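Don't do it!
With start=10,000,000 every shard has to collect and sort more than 10 million 
docs just to return 100 of them, and it gets worse the deeper you page.
This is a warning sign that you are using Solr the wrong way.

If you need to scroll through all docs in the index, have a look at streaming 
expressions or cursorMark instead!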
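
A cursorMark loop is roughly the following: start with cursorMark=*, sort on 
something that includes the uniqueKey field, and keep passing back the 
nextCursorMark from each response until it stops changing. This sketch assumes 
the uniqueKey is called "id"; the function names and the select URL are made up 
for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def solr_fetch(select_url):
    """Return a function that runs one /select request and parses the JSON."""
    def fetch(params):
        query = urlencode({**params, "wt": "json"})
        with urlopen(f"{select_url}?{query}") as resp:
            return json.load(resp)
    return fetch


def iterate_with_cursor(fetch_page, rows=100):
    """Yield every doc, following nextCursorMark until it stops changing."""
    cursor = "*"
    while True:
        page = fetch_page({
            "q": "*:*",
            "sort": "id asc",  # assumes "id" is the uniqueKey field
            "rows": rows,
            "cursorMark": cursor,
        })
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:
            break  # same cursor returned twice: nothing left to read
        cursor = next_cursor
```

You would call it like 
for doc in iterate_with_cursor(solr_fetch("http://localhost:8983/solr/mycoll/select")): ... 
and each request costs about the same no matter how deep you are.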

> 8.      Which .net client is best for SOLR?

The only one I'm aware of is SolrNet. There may be others. None of them are 
supported by the Solr project.

> 9.      Is there any limitation for single field, I mean about the size for
> blob data?

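I think there is some default cutoff for very large values. One limit I do know 
of: a single indexed term (e.g. an untokenized string field) cannot exceed 
32766 bytes.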

Why would you want to put very large blobs into documents?
This is a warning flag that you may be using the search index the wrong way. 
Consider storing large blobs outside of the search index and referencing them 
from the docs.
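
One possible way to do that, sketched with the Python standard library: write 
the blob to a directory under its SHA-256 hash and index only the reference. 
The field names blob_ref and blob_size and the directory layout are purely 
hypothetical.

```python
import hashlib
from pathlib import Path


def store_blob(blob_dir, data):
    """Write data to blob_dir under its SHA-256 hex digest; return the digest."""
    directory = Path(blob_dir)
    directory.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(data).hexdigest()
    (directory / digest).write_bytes(data)
    return digest


def make_doc(doc_id, title, blob_dir, data):
    """Build a small Solr doc that points at the blob instead of embedding it."""
    return {
        "id": doc_id,
        "title": title,
        "blob_ref": store_blob(blob_dir, data),  # hypothetical field name
        "blob_size": len(data),                  # hypothetical field name
    }
```

The doc stays small and searchable, and the application fetches the blob by 
blob_ref only when a user actually opens that result.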


In general, it would help a lot if you start by telling us WHAT you intend to 
use Solr for, what you are trying to achieve, what performance 
goals/requirements you have etc, instead of a lot of very specific max/min 
questions. Hard limits are rare, and where they exist, it is usually not a good 
idea to approach them :)

Jan
