Re: deleting large amount data from solr cloud

2014-04-17 Thread Vinay Pothnis
Thanks Erick! On 17 April 2014 08:35, Erick Erickson wrote: > bq: Will it get split at any point later? > > "Split" is a little ambiguous here. Will it be copied into two or more > segments? No. Will it disappear? Possibly. Eventually this segment > will be merged if you add enough documents to

Re: deleting large amount data from solr cloud

2014-04-17 Thread Erick Erickson
bq: Will it get split at any point later? "Split" is a little ambiguous here. Will it be copied into two or more segments? No. Will it disappear? Possibly. Eventually this segment will be merged if you add enough documents to the system. Consider this scenario: you add 1M docs to your system and i

Re: deleting large amount data from solr cloud

2014-04-17 Thread Vinay Pothnis
Thanks a lot Shalin! On 16 April 2014 21:26, Shalin Shekhar Mangar wrote: > You can specify maxSegments parameter e.g. maxSegments=5 while optimizing. > > > On Thu, Apr 17, 2014 at 6:46 AM, Vinay Pothnis wrote: > > > Hello, > > > > Couple of follow up questions: > > > > * When the optimize comm

Re: deleting large amount data from solr cloud

2014-04-16 Thread Shalin Shekhar Mangar
You can specify maxSegments parameter e.g. maxSegments=5 while optimizing. On Thu, Apr 17, 2014 at 6:46 AM, Vinay Pothnis wrote: > Hello, > > Couple of follow up questions: > > * When the optimize command is run, looks like it creates one big segment > (forceMerge = 1). Will it get split at any
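
For illustration, the maxSegments suggestion above can be expressed as parameters on Solr's update handler. This is a minimal sketch: the host and port are placeholders, the collection name is borrowed from elsewhere in the thread, and the helper name `optimize_url` is hypothetical. It only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode


def optimize_url(base_url, collection, max_segments=5):
    """Build an update URL asking Solr to optimize down to at most max_segments segments."""
    if max_segments < 1:
        raise ValueError("max_segments must be at least 1")
    # optimize=true triggers the merge; maxSegments caps the resulting segment count.
    params = urlencode({"optimize": "true", "maxSegments": max_segments})
    return f"{base_url.rstrip('/')}/{collection}/update?{params}"


# Placeholder host and port (assumption); collection name as used in the thread.
print(optimize_url("http://localhost:8983/solr", "coll-name1", max_segments=5))
```

Leaving the value above 1 lets the merge reclaim deleted documents without collapsing the whole index into a single large segment.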

Re: deleting large amount data from solr cloud

2014-04-16 Thread Vinay Pothnis
Hello, A couple of follow-up questions: * When the optimize command is run, it looks like it creates one big segment (forceMerge = 1). Will it get split at any point later? Or will that big segment remain? * Is there any way to maintain the number of segments - but still merge to reclaim the deleted d

Re: deleting large amount data from solr cloud

2014-04-16 Thread Vinay Pothnis
Thank you Erick! Yes - I am using the expunge deletes option. Thanks for the note on disk space for the optimize command. I should have enough space for that. What about the heap space requirement? I hope it can do the optimize with the memory that is allocated to it. Thanks Vinay On 16 April 2

Re: deleting large amount data from solr cloud

2014-04-16 Thread Erick Erickson
The optimize should, indeed, reduce the index size. Be aware that it may consume 2x the disk space. You may also try expungedeletes, see here: https://wiki.apache.org/solr/UpdateXmlMessages Best, Erick On Wed, Apr 16, 2014 at 12:47 AM, Vinay Pothnis wrote: > Another update: > > I removed the rep
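
As a rough sketch of the two points above: a commit with expungeDeletes can be requested through update-handler parameters, and the disk-space warning can be turned into a simple pre-check. The host, port, and both helper names are assumptions for illustration, and "2x" is read here as needing room for one extra full copy of the index. Nothing is sent to a server.

```python
from urllib.parse import urlencode


def expunge_deletes_url(base_url, collection):
    """Build an update URL that commits and asks Solr to merge away deleted documents."""
    params = urlencode({"commit": "true", "expungeDeletes": "true"})
    return f"{base_url.rstrip('/')}/{collection}/update?{params}"


def has_room_for_optimize(index_bytes, free_bytes, factor=2.0):
    """Return True if free space covers the extra space an optimize may need.

    Assumption: peak usage is factor * index size, so the extra space
    required on top of the existing index is (factor - 1) * index size.
    """
    return free_bytes >= index_bytes * (factor - 1.0)


print(expunge_deletes_url("http://localhost:8983/solr", "coll-name1"))
print(has_room_for_optimize(index_bytes=360 * 1024**3, free_bytes=400 * 1024**3))
```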

Re: deleting large amount data from solr cloud

2014-04-15 Thread Vinay Pothnis
Another update: I removed the replicas - to avoid the replication doing a full copy. I am able to delete sizeable chunks of data. But the overall index size remains the same even after the deletes. It does not seem to go down. I understand that Solr would do this in the background - but I don't seem to

Re: deleting large amount data from solr cloud

2014-04-14 Thread Vinay Pothnis
Some update: I removed the auto warm configurations for the various caches and reduced the cache sizes. I then issued a call to delete a day's worth of data (800K documents). There was no out of memory this time - but some of the nodes went into recovery mode. Was able to catch some logs this tim

Re: deleting large amount data from solr cloud

2014-04-14 Thread Vinay Pothnis
Yes, that is our approach. We did try deleting a day's worth of data at a time, and that resulted in OOM as well. Thanks Vinay On 14 April 2014 00:27, Furkan KAMACI wrote: > Hi; > > I mean you can divide the range (i.e. one week at each delete instead of > one month) and try to check whether y

Re: deleting large amount data from solr cloud

2014-04-14 Thread Furkan KAMACI
Hi; I mean you can divide the range (i.e. one week at each delete instead of one month) and try to check whether you still get an OOM or not. Thanks; Furkan KAMACI 2014-04-14 7:09 GMT+03:00 Vinay Pothnis : > Aman, > Yes - Will do! > > Furkan, > How do you mean by 'bulk delete'? > > -Thanks > V
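
The idea of dividing the range can be sketched as follows. This is only an illustration: it assumes the date field holds integer values, uses the field name `date_param` from elsewhere in the thread, and the helper names are hypothetical. Each generated query would then be sent as its own delete request, so a failure can be tied to one slice.

```python
def split_range(start, end, step):
    """Split the inclusive range [start, end] into consecutive inclusive
    sub-ranges that each cover at most `step` values."""
    if step <= 0:
        raise ValueError("step must be positive")
    chunks = []
    lo = start
    while lo <= end:
        hi = min(lo + step - 1, end)
        chunks.append((lo, hi))
        lo = hi + 1
    return chunks


def delete_queries(field, start, end, step):
    """Build one Solr range query per sub-range; [a TO b] is inclusive on both ends."""
    return [f"{field}:[{lo} TO {hi}]" for lo, hi in split_range(start, end, step)]


for query in delete_queries("date_param", 0, 9, 4):
    print(query)
```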

Re: deleting large amount data from solr cloud

2014-04-13 Thread Vinay Pothnis
Aman, Yes - Will do! Furkan, How do you mean by 'bulk delete'? -Thanks Vinay On 12 April 2014 14:49, Furkan KAMACI wrote: > Hi; > > Do you get any problems when you index your data? On the other hand > deleting as bulks and reducing the size of documents may help you not to > hit OOM. > > Tha

Re: deleting large amount data from solr cloud

2014-04-12 Thread Furkan KAMACI
Hi; Do you get any problems when you index your data? On the other hand deleting as bulks and reducing the size of documents may help you not to hit OOM. Thanks; Furkan KAMACI 2014-04-12 8:22 GMT+03:00 Aman Tandon : > Vinay please share your experience after trying this solution. > > > On Sat,

Re: deleting large amount data from solr cloud

2014-04-11 Thread Aman Tandon
Vinay please share your experience after trying this solution. On Sat, Apr 12, 2014 at 4:12 AM, Vinay Pothnis wrote: > The query is something like this: > > > *curl -H 'Content-Type: text/xml' --data 'param1:(val1 OR > val2) AND -param2:(val3 OR val4) AND date_param:[138395520 TO > 13851648

Re: deleting large amount data from solr cloud

2014-04-11 Thread Vinay Pothnis
The query is something like this: curl -H 'Content-Type: text/xml' --data '<delete><query>param1:(val1 OR val2) AND -param2:(val3 OR val4) AND date_param:[138395520 TO 138516480]</query></delete>' 'http://host:port/solr/coll-name1/update?commit=true' Trying to restrict the number of documents deleted via the date par
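
A delete-by-query like the one above is posted to the update handler as an XML message of the form `<delete><query>...</query></delete>`. The sketch below builds such a request with the Python standard library without sending it. The host and port are placeholders and the helper names are hypothetical; the query is escaped so that characters such as `<` or `&` do not break the XML.

```python
import urllib.request
from xml.sax.saxutils import escape


def delete_by_query_body(query):
    """Wrap a Solr query in the XML delete message, escaping &, < and >."""
    return f"<delete><query>{escape(query)}</query></delete>"


def build_delete_request(update_url, query):
    """Prepare (but do not send) a POST request carrying the delete message."""
    body = delete_by_query_body(query).encode("utf-8")
    return urllib.request.Request(
        update_url,
        data=body,
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


# Placeholder URL (assumption); pass the request to urllib.request.urlopen to send it.
req = build_delete_request(
    "http://localhost:8983/solr/coll-name1/update?commit=true",
    "date_param:[138395520 TO 138516480]",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```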

Re: deleting large amount data from solr cloud

2014-04-11 Thread Shawn Heisey
On 4/10/2014 7:25 PM, Vinay Pothnis wrote: When we tried to delete the data through a query - say 1 day/month's worth of data. But after deleting just 1 month's worth of data, the master node is going out of memory - heap space. Wondering is there any way to incrementally delete the data without

Re: deleting large amount data from solr cloud

2014-04-11 Thread Vinay Pothnis
Tried to increase the memory to 24G but that wasn't enough either. Agree that the index has now grown too much and that we should have monitored this and taken action much earlier. The search operations seem to run ok with 16G - mainly because the bulk of the data that we are trying to delete is not getting sear

Re: deleting large amount data from solr cloud

2014-04-11 Thread Erick Erickson
Using 16G for a 360G index is probably pushing things. A lot. I'm actually a bit surprised that the problem only occurs when you delete docs. The simplest thing would be to increase the JVM memory. You should be looking at your index to see how big it is; be sure to subtract out the *.fdt and *
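
One way to do the subtraction described above is to sum the index files while skipping those that match the excluded patterns. This is a sketch under assumptions: the helper names are hypothetical, and the default exclusion covers only `.fdt`, because the second pattern is cut off in this preview; pass the full set of suffixes when it is known.

```python
from pathlib import Path


def file_sizes(index_dir):
    """Map each regular file name in index_dir to its size in bytes."""
    return {p.name: p.stat().st_size for p in Path(index_dir).iterdir() if p.is_file()}


def size_without_suffixes(sizes, excluded_suffixes=(".fdt",)):
    """Total size in bytes of all files whose names do not end with an excluded suffix."""
    excluded = tuple(excluded_suffixes)
    return sum(size for name, size in sizes.items() if not name.endswith(excluded))


# Illustrative numbers only, not measurements from the thread.
example = {"_0.fdt": 100, "_0.tim": 50, "_0.doc": 25}
print(size_without_suffixes(example))
```

In practice, `size_without_suffixes(file_sizes(path_to_index_dir))` would give the figure to compare against the JVM heap and available OS memory.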

Re: deleting large amount data from solr cloud

2014-04-11 Thread Vinay Pothnis
Sorry - yes, I meant to say leader. Each JVM has 16G of memory. On 10 April 2014 20:54, Erick Erickson wrote: > First, there is no "master" node, just leaders and replicas. But that's a > nit. > > No real clue why you would be going out of memory. Deleting a > document, even by query should jus

Re: deleting large amount data from solr cloud

2014-04-10 Thread Erick Erickson
First, there is no "master" node, just leaders and replicas. But that's a nit. No real clue why you would be going out of memory. Deleting a document, even by query, should just mark the docs as deleted, a pretty low-cost operation. How much memory are you giving the JVM? Best, Erick On Thu, Apr