Re: 2.1billion+ document

Ali, Saqib Fri, 05 Jul 2013 21:16:22 -0700

Thanks Jason! That was very helpful.

I read on the Solr wiki that:
"Documents must have a unique key and the unique key must be stored
(stored="true" in schema.xml)"


What is this unique key? Is it just an id field that we define in schema.xml,
with a value that is unique to each document? We have something as follows:
        <field name="id" type="long" indexed="true" stored="true"/>

Will this suffice?



Thanks.
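
A note on the question above: a field definition by itself does not mark a field as the unique key; schema.xml also names the key field in a `<uniqueKey>` element. The following is a minimal sketch of a check for both conditions quoted from the wiki. The surrounding schema is a hypothetical, trimmed-down example; only the `id` field line comes from the message, and `unique_key_is_stored` is an illustrative helper, not part of Solr.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down schema.xml. Only the "id" field line is taken
# from the message above; the rest is assumed for illustration.
SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="long" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
</schema>
"""


def unique_key_is_stored(schema_xml):
    """Return True if the schema names a uniqueKey whose field is stored."""
    root = ET.fromstring(schema_xml.strip())
    key = root.findtext("uniqueKey")
    if key is None:
        # No <uniqueKey> element: the schema does not declare a unique key.
        return False
    key = key.strip()
    for field in root.iter("field"):
        if field.get("name") == key:
            return field.get("stored") == "true"
    # <uniqueKey> points at a field that is not defined.
    return False


print(unique_key_is_stored(SCHEMA))
```

Under these assumptions, the field definition in the message would suffice once `<uniqueKey>id</uniqueKey>` is also present in the schema.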

On Fri, Jul 5, 2013 at 7:45 PM, Jason Hellman <
jhell...@innoventsolutions.com> wrote:

> Saqib:
>
> At the simplest level:
>
> 1)  Source the machine
> 2)  Install Java
> 3)  Install a servlet container of your choice
> 4)  Copy your Solr WAR and conf directories as desired (probably a rough
> mirror of your current single server)
> 5)  Start it up and start sending data there
> 6)  Query both by simply adding:
>  shards=host1/solr/collection,host2/solr/collection
> 7)  Profit
>
> Or, in shorthand:
>
> 1)  Install new Solr instance and start indexing data there
> 2)  Add the shards parameter to your queries with both (or more) servers
> 3)  …
> 4)  Profit
>
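
As an illustration of the `shards` parameter in the steps above, here is a minimal sketch that builds a distributed query URL. The host names and collection path follow the placeholders in the message; the port `8983`, the `/select` handler, and the helper name `distributed_query_url` are assumptions for this example and may differ in a real deployment.

```python
from urllib.parse import urlencode


def distributed_query_url(entry_point, shard_list, q):
    """Build a query URL that asks one Solr instance to search all shards.

    entry_point: base URL of the instance that receives the request (assumed).
    shard_list:  shard addresses without the http:// scheme.
    """
    params = {"q": q, "shards": ",".join(shard_list)}
    # Keep '/', ',', ':' and '*' unescaped so the URL stays readable.
    return entry_point + "/select?" + urlencode(params, safe="/,:*")


url = distributed_query_url(
    "http://host1:8983/solr/collection",
    ["host1:8983/solr/collection", "host2:8983/solr/collection"],
    "*:*",
)
print(url)
```

Any instance can act as the entry point; it forwards the query to every address listed in `shards` and merges the results.
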
> Now…we usually want to be concerned about how to manage the data so that
> we don't send duplicates.  Without SolrCloud it is our responsibility to
> direct traffic for updates and deletes.  We also like to think a bit more
> about how to take advantage of our lovely parallelism to improve indexing
> or query speed.  We should also consider strategies to isolate domain data
> to single shards so as to allow isolated queries against dedicated data
> models.
>
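
One way to avoid sending duplicates without SolrCloud is to derive the target shard from the document id on the client side, so that adds, updates, and deletes for the same id always reach the same server. The sketch below uses plain modulo hashing; it is an illustrative assumption, not SolrCloud's routing, and `shard_for` and the shard addresses are hypothetical.

```python
import zlib

# Hypothetical shard addresses, following the placeholders used in the thread.
SHARDS = ["host1:8983/solr/collection", "host2:8983/solr/collection"]


def shard_for(doc_id, shards):
    """Pick the shard that owns doc_id.

    zlib.crc32 is stable across runs and machines, unlike Python's built-in
    hash() for strings, so every client computes the same owner.
    """
    index = zlib.crc32(str(doc_id).encode("utf-8")) % len(shards)
    return shards[index]


# The same id always maps to the same shard.
print(shard_for(1234567890, SHARDS) == shard_for(1234567890, SHARDS))
```

Note that with modulo hashing, adding a shard changes the mapping for most ids, so existing documents would need to be reindexed or the new shard would need a different assignment rule.
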
> But if you just want the basics, it really is as easy as described above.
>
> Jason
>
>
> On Jul 5, 2013, at 7:36 PM, "Ali, Saqib" <docbook....@gmail.com> wrote:
>
> > Hello Otis,
> >
> > I was thinking more in terms of Solr DistributedSearch rather than
> > SolrCloud. I was hoping to add another Solr instance when the time comes.
> > This is a low-use application, but with a lot of data. Uptime and query
> > speed are not important. However, we would like to be able to index more
> > than 2.1 billion documents when the time comes.
> >
> > Any advice will be highly appreciated.
> >
> >
> > Thanks!!! :)
> > Saqib
> >
> >
> > On Fri, Jul 5, 2013 at 6:23 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> It's a broad question, but it starts with getting a few servers,
> >> putting Solr 4.3.1 on them (soon 4.4), setting up Zookeeper, creating a
> >> Solr Collection (index) with N shards and M replicas, and reindexing
> >> your old data to this new cluster, which you can expand with new nodes
> >> over time.  If you have specific questions...
> >>
> >> Otis
> >> --
> >> Solr & ElasticSearch Support -- http://sematext.com/
> >> Performance Monitoring -- http://sematext.com/spm
> >>
> >>
> >>
> >> On Fri, Jul 5, 2013 at 8:42 PM, Ali, Saqib <docbook....@gmail.com> wrote:
> >>> Question regarding the 2.1 billion+ document limit.
> >>>
> >>> I understand that a single instance of Solr has a limit of 2.1 billion
> >>> documents.
> >>>
> >>> We currently have a single Solr server. If we reach the 2.1 billion
> >>> document limit, what is involved in moving to Solr DistributedSearch?
> >>>
> >>> Thanks! :)
> >>
>
>
