Hope I'll succeed :) Anyway, the solr-user community surprised me in a good way.
Thanks again.

_ _

Batalova Kseniya

NP. It's something of a step when moving to SolrCloud to "let go" of the details you've had to (painfully) pay attention to, but worth it. The price is, of course, learning to do things a new way ;)...

Best,
Erick

On Thu, Jun 4, 2015 at 10:04 AM, Ксения Баталова <batalova...@gmail.com> wrote:
> Erick,
>
> Thank you so much. It became a bit clearer.
>
> It was decided to upgrade Solr to 5.2 and use SolrCloud in our next release.
>
> I think I'll write here about it again :)
>
> _ _
>
> Batalova Kseniya
>
>
> I have to ask then why you're not using SolrCloud with multiple shards? It
> seems to me that that gives you the indexing throughput you need (be sure to
> use CloudSolrServer from your client). At 300M complex documents, you
> pretty much certainly will need to shard anyway so in some sense you're
> re-inventing the wheel here.
>
> You can host multiple shards on the same machine, and these _are_ separate
> Solr cores under the covers so your problem with atomic updates disappears.
>
> Although I would consider upgrading to Solr 4.10.3 or even 5.2 (which is being
> voted on even now and should be out in a week or so barring problems).
>
> Best,
> Erick
>
> On Wed, Jun 3, 2015 at 11:04 AM, Ксения Баталова <batalova...@gmail.com>
> wrote:
>> Jack,
>>
>> The decision to use several cores was made to increase indexing and
>> searching performance (experimentally).
>>
>> In my project the index is about 300-500 million documents (each document
>> has a rather complex structure) and it may grow larger.
>>
>> So, while indexing, the documents are added to different cores by
>> a number of threads.
>>
>> In other words, each thread collects the necessary information for a list of
>> documents and generates a create-documents query to a specific core.
>>
>> At this moment it doesn't matter (and it can't be found out) which
>> document will be in which core.
>>
>> And now it is necessary to update (atomic update) this index.
>>
>> Something like this..
>>
>> _ _
>>
>> Batalova Kseniya
>>
>>
>> Explain a little about why you have separate cores, and how you decide
>> which core a new document should reside in. Your scenario still seems a bit
>> odd, so help us understand.
>>
>>
>> -- Jack Krupansky
>>
>> On Wed, Jun 3, 2015 at 3:15 AM, Ксения Баталова <batalova...@gmail.com>
>> wrote:
>>
>>> Hi!
>>>
>>> Thanks for your quick reply.
>>>
>>> The problem is that my whole index consists of several parts (several cores),
>>>
>>> and while updating I don't know in advance in which part the updated id is
>>> lying (in which core the document with the specified id is lying).
>>>
>>> For example, I have two cores (*Core1* and *Core2*) and I want to
>>> update the document with id *Id1* and I don't know where this document
>>> is lying.
>>>
>>> So, I have to do two select-queries to my cores to know where it is.
>>>
>>> And then generate an update-query to the necessary core.
>>>
>>> What am I doing wrong?
>>>
>>> I remind you that I'm using SOLR 4.4.0.
>>>
>>> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>>> Best regards,
>>> Batalova Kseniya
>>>
>>>
>>> What exactly is the problem? And why do you care about cores, per se -
>>> other than to send the update to the core/collection you are trying to
>>> update? You should specify the core/collection name in the URL.
>>>
>>> You should also be using the Solr reference guide rather than the (old)
>>> wiki:
>>>
>>> https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
>>>
>>>
>>> -- Jack Krupansky
>>>
>>> On Tue, Jun 2, 2015 at 10:15 AM, Ксения Баталова <batalova...@gmail.com>
>>> wrote:
>>>
>>> > Hi!
>>> >
>>> > I'm using *SOLR 4.4.0* for searching in my project.
>>> > Now I am facing a problem of atomic updates in multiple cores.
>>> > From wiki:
>>> >
>>> > curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
>>> > [
>>> >  {
>>> >   "id"        : "TestDoc1",
>>> >   "title"     : {"set":"test1"},
>>> >   "revision"  : {"inc":3},
>>> >   "publisher" : {"add":"TestPublisher"}
>>> >  },
>>> >  {
>>> >   "id"        : "TestDoc2",
>>> >   "publisher" : {"add":"TestPublisher"}
>>> >  }
>>> > ]'
>>> >
>>> > As far as I understand, this means that the document, for example, with id
>>> > *TestDoc1*, will be searched for updating *only in one core*.
>>> > And if there is no document with id *TestDoc1*, the document will be
>>> > created.
>>> > Can I somehow specify the *list of cores* for searching and then
>>> > updating the necessary document with a specific id?
>>> >
>>> > It's something like the *shards* parameter in a *select* query.
>>> > From wiki:
>>> >
>>> > #now do a distributed search across both servers with your browser or curl
>>> > curl 'http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=ipod+solr'
>>> >
>>> > Or is it planned in the future?
>>> >
>>> > Thanks in advance.
>>> >
>>> > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>>> >
>>> > Best regards,
>>> > Batalova Kseniya
>>> >
>>>
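
For readers who land on this thread while still running standalone cores, the workaround described above (ask each core whether it holds the id, then send the atomic update only to the core that does) might look roughly like the sketch below. It is only an illustration, not something posted in the thread: the function names, the `fetch` hook and the core names are hypothetical, and it assumes the unique key field is called `id`, that each core is reachable at `<base_url>/<core>`, and that the update handler accepts JSON as in the wiki example quoted above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def http_json(url, body=None):
    """GET `url`, or POST `body` as JSON when a body is given; return parsed JSON."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json"} if data is not None else {}
    with urlopen(Request(url, data=data, headers=headers)) as response:
        return json.loads(response.read().decode("utf-8"))


def find_core(base_url, cores, doc_id, fetch=http_json):
    """Return the first core whose index contains `doc_id`, or None.

    Assumption: the unique key field is named "id".
    """
    # Escape backslashes and quotes so the id can sit inside a quoted phrase.
    escaped = doc_id.replace("\\", "\\\\").replace('"', '\\"')
    query = urlencode({"q": 'id:"%s"' % escaped, "rows": 0, "wt": "json"})
    for core in cores:
        reply = fetch("%s/%s/select?%s" % (base_url, core, query))
        if reply["response"]["numFound"] > 0:
            return core
    return None


def atomic_update(base_url, cores, doc_id, changes, fetch=http_json):
    """Send an atomic update for `doc_id` to the core that holds it.

    `changes` maps field names to modifiers, e.g. {"title": {"set": "test1"}}.
    Returns the name of the core that received the update.
    """
    core = find_core(base_url, cores, doc_id, fetch)
    if core is None:
        raise LookupError("no core contains a document with id %r" % doc_id)
    doc = dict(changes, id=doc_id)
    fetch("%s/%s/update?commit=true" % (base_url, core), [doc])
    return core


# Hypothetical usage against two local cores:
# atomic_update("http://localhost:8983/solr", ["core1", "core2"],
#               "TestDoc1", {"title": {"set": "test1"}, "revision": {"inc": 3}})
```

Note that this costs one extra query per core for every update and the lookup and the update are two separate requests, so it is a stopgap at best; the advice given in the thread (SolrCloud with multiple shards, so that routing by id is handled for you) remains the cleaner route.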