Block until replication finishes

2014-03-27 Thread Fermin Silva
Hi,

we are moving to native replication with SOLR 3.5.1.
Because we want to control the replication from another program (a cron
job), we decided to curl the slave to issue a fetchIndex command.

The problem we have is that the curl returns immediately, while the
replication continues in the background.
We need to know when the replication is done, and then resume the cron job.

Is there a way to block on the replication call until it's done, similar to
waitSearcher=true when committing?
If not, what other possibilities do we have?

Just in case, here is the solrconfig part in the slave (we pass masterUrl
in the curl url)



  

  


Many thanks in advance

-- 
Fermin Silva


Re: Block until replication finishes

2014-03-28 Thread Fermin Silva
Hi,

that's what I'm trying. However, I'm really cautious when it comes to a
while (somethingIsTrue) { doSomething; sleep; }

Is that safe? What if the slave hangs, the network is slow or fails, etc.?

Thanks
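
One way to address the concerns above is to bound the loop: give every HTTP
call its own timeout, treat errors as "not done yet", and give up after an
overall deadline. What follows is a minimal Java sketch under those
assumptions, not a tested tool. The class name, the timeout values and the
regular expression are illustrative; in particular it assumes the "details"
response carries the flag as an element with name="isReplicating" whose text
is true or false.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReplicationPoller {
    // Assumed shape of the flag in the "details" response (not verified
    // against a particular Solr version).
    private static final Pattern FLAG =
        Pattern.compile("name=\"isReplicating\"\\s*>\\s*(true|false)\\s*<");

    /** True only if the response explicitly reports isReplicating=false. */
    static boolean isFinished(String detailsXml) {
        Matcher m = FLAG.matcher(detailsXml);
        return m.find() && "false".equals(m.group(1));
    }

    /** Fetches a URL with hard connect/read timeouts so a hung slave cannot block forever. */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(5_000);
        conn.setReadTimeout(10_000);
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    /**
     * Polls the source until it reports completion or the deadline passes.
     * Returns true when finished, false on timeout or interruption.
     */
    static boolean waitUntilFinished(Callable<String> source, long timeoutMs, long sleepMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            try {
                if (isFinished(source.call())) {
                    return true;
                }
            } catch (Exception e) {
                // Slave down or network failure: keep retrying until the deadline.
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        String url = "http://localhost:8983/solr/core_name/replication?command=details";
        boolean done = waitUntilFinished(() -> fetch(url), 30 * 60_000L, 5_000L);
        System.exit(done ? 0 : 1); // non-zero tells the cron job not to continue
    }
}
```

A cron wrapper could treat a non-zero exit code as "replication did not finish
in time" and alert instead of continuing. Depending on timing, the first poll
might run before the slave has started fetching, so checking the document
count afterwards, as suggested in the reply quoted below, is still worthwhile.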


On Thu, Mar 27, 2014 at 1:40 PM, Chris W  wrote:

> Hi
>
>  You can use the "details" command to check the status of replication.
> http://localhost:8983/solr/core_name/replication?command=details
>
> The command returns XML output; look for the "isReplicating" field in
> the output. Keep running the command in a loop until the flag becomes
> false. That's when you know it's done. I would also recommend checking
> the number of docs in the output at source and destination after the
> replication, to be sure.
>
>
> HTH
>
>
>
>
>
>
>
> --
> Best
> --
> C
>



-- 
Fermin Silva
Speed & Scalability Team


Re: Block until replication finishes

2014-04-01 Thread Fermin Silva
The ReplicationHandler class is not the most exemplary code to be looking at.
However, I found the line that could be changed:

new Thread() {
  @Override
  public void run() {
    doFetch(paramsCopy);
  }
}.start();
rsp.add(STATUS, OK_STATUS);

It should be really simple to join on that thread depending on a REST
parameter.
I would change that code myself (which I did in my custom SOLR
installation) but I guess the fix should go into SOLR 4.x and not 3.x.
Sorry, but I have no clue how to contribute code. I will look into it,
but if someone can point me in the right direction it would be nice.
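
To make the idea concrete, here is a small self-contained Java sketch of
"start the fetch on a thread, then join only if the request asks for it". It
is not the actual ReplicationHandler code: the class, the method, the status
strings and the "wait" parameter name are all placeholders chosen for
illustration.

```java
import java.util.Map;

final class FetchCommandSketch {
    /**
     * Starts the fetch on a background thread. If the (illustrative) "wait"
     * request parameter is "true", blocks until the fetch has finished;
     * otherwise returns immediately, as the snippet above does.
     */
    static String handleFetch(Map<String, String> params, Runnable doFetch) {
        Thread puller = new Thread(doFetch, "index-fetch");
        puller.start();
        if (Boolean.parseBoolean(params.getOrDefault("wait", "false"))) {
            try {
                puller.join(); // hold the HTTP request open until the fetch is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return "interrupted";
            }
        }
        return "OK";
    }

    public static void main(String[] args) {
        Runnable slowFetch = () -> {
            try {
                Thread.sleep(500); // stands in for the real index download
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        // Returns only after slowFetch has completed.
        System.out.println(handleFetch(Map.of("wait", "true"), slowFetch));
    }
}
```

Keeping the default at "false" preserves the existing fire-and-forget
behaviour for callers that do not pass the parameter.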

Thanks


On Sat, Mar 29, 2014 at 9:49 AM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:

> Hello,
> We did this for our fork. If you are not happy with "RESTful polling", or
> think that a synchronous replication handler might be useful, please
> raise a JIRA.
>



-- 
Fermin Silva
Speed & Scalability Team


Re: Block until replication finishes

2014-04-01 Thread Fermin Silva
When trying to add the fix to the trunk version, I found that this was
already implemented.
There is a parameter '*wait*' that does exactly that.

if (solrParams.getBool(WAIT, false)) {
  puller.join();
}

So the only way to do this in SOLR 3.x is to create a plugin with
a new replication handler (which I did) or to recompile SOLR.
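
For a Solr build whose ReplicationHandler honours that parameter, the calling
side could look roughly like the Java sketch below. It assumes the usual
command=fetchindex request with masterUrl passed in the URL, as described at
the start of this thread; the host names, core name and timeout values are
made up for the example, and the read timeout is there so that the "blocking"
call still cannot hang forever.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class BlockingFetch {
    /** Builds the fetch URL with wait=true and a URL-encoded masterUrl. */
    static String buildUrl(String slaveCoreUrl, String masterUrl) {
        return slaveCoreUrl + "/replication?command=fetchindex&wait=true&masterUrl="
            + URLEncoder.encode(masterUrl, StandardCharsets.UTF_8);
    }

    /** Issues the request and returns the HTTP status once the slave answers. */
    static int fetchAndWait(String url, int maxWaitMs) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(5_000);
        conn.setReadTimeout(maxWaitMs); // upper bound on how long we block
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical hosts and core name.
        String url = buildUrl("http://localhost:8983/solr/core_name",
                              "http://master:8983/solr/core_name/replication");
        int status = fetchAndWait(url, 30 * 60_000);
        System.exit(status == 200 ? 0 : 1);
    }
}
```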





-- 
Fermin Silva
Speed & Scalability Team


Re: Mongo DB Users

2014-09-16 Thread Fermin Silva
Remove

On Tue, Sep 16, 2014 at 2:19 PM, Xavier Morera 
wrote:

> I think what some people are actually saying is "burn in hell Aaron Susan
> for using a solr apache dl for marketing purposes"?
>
> On Tue, Sep 16, 2014 at 8:31 AM, Suman Ghosh 
> wrote:
>
> > Remove
> >
>
>
>
> --
> *Xavier Morera*
> email: xav...@familiamorera.com
> CR: +(506) 8849 8866
> US: +1 (305) 600 4919
> skype: xmorera
>


Re: SOLR 4.x vs 3.x parsedquery differences

2013-09-06 Thread Fermin Silva
Whether or not we like the behaviour we are getting in 3.x, I'm required to
keep everything working as close as possible to how it worked before.

I have no idea why this is happening, but setting
autoGeneratePhraseQueries to true solved the issue: now I get exactly the
same number of items in both queries!

I wouldn't bother checking why that was so, since we'll be moving away from
the older version, which is the one showing the inconsistency.

But thanks a million.

If you have a Stack Overflow account, I can mark yours as the answer here:
http://stackoverflow.com/questions/18661996/solr-4-x-vs-3-x-parsedquery-differences

Cheers
On Sep 6, 2013 4:15 PM, "Chris Hostetter"  wrote:

>
> : Our schema is identical except the version.
> : In 3.x it's 1.1 and in 4.x it's 1.5.
>
> That's kind of a significant difference to leave out -- independent of the
> question you are asking about here, it's going to make quite a few
> differences in how things are being parsed, and what defaults are.
>
> If i'm understanding correctly: you like the behavior you are getting from
> Solr 3.x where phrases are generated automatically for you.
>
> what i can't understand, is how/why phrases are being generated
> automatically for you if you have that 'autoGeneratePhraseQueries="false"'
> on your fieldType in your 3x schema ... that makes no sense to me.
>
> if you didn't have "autoGeneratePhraseQueries" specified at all, then the
> 'version="1.1"' would explain it (up to version=1.3, the default for
> autoGeneratePhraseQueries was true, but in version=1.4 and above, it
> defaults to false)  but with an explicit
> 'autoGeneratePhraseQueries="false"' i can't explain why 3x works the way
> you say it works for you.
>
> Bottom line: if you *want* the auto generated phrase query behavior
> in 4.x, you should just set 'autoGeneratePhraseQueries="true"' on your
> fieldType.
>
>
>
>
> -Hoss
>


Re: SOLR 4.x vs 3.x parsedquery differences

2013-09-06 Thread Fermin Silva
Hi,

Our schema is identical except the version.
In 3.x it's 1.1 and in 4.x it's 1.5.

Also, in solrconfig.xml we set no luceneMatchVersion for 3.x (so I believe
it is using 2_4), and in 4.x we set it to 4_4.

Thanks
On Sep 6, 2013 3:34 PM, "Chris Hostetter"  wrote:

>
> : I'm migrating from 3.x to 4.x and I'm running some queries to verify that
> : everything works like before. I've found however that the query "galaxy s3"
> : is giving far fewer results. In 3.x numFound=1628, in 4.x numFound=70.
>
> is your entire schema 100% identical in both cases?
> what is the luceneMatchVersion set to in your solrconfig.xml?
>
>
> By the looks of your debug output, it appears that you are using
> autoGeneratePhraseQueries="true" in 3x, but have it set to false in 4x --
> but the fieldType you posted here shows it set to false
>
> :  : positionIncrementGap="100" autoGeneratePhraseQueries="false">
>
> ...i haven't tried to reproduce your specific situation, but that
> configuration doesn't smell right compared with what you are showing for
> the 3x output...
>
> : SOLR 3.x
> :
> : +(title_search_pt:galaxy
> : title_search_pt:galax) +MultiPhraseQuery(title_search_pt:"(sii s3 s)
> : 3")
> :
> : SOLR 4.x
> :
> : +((title_search_pt:galaxy
> : title_search_pt:galax)/no_coord) +(+title_search_pt:sii
> : +title_search_pt:s3 +title_search_pt:s +title_search_pt:3)
>
>
> -Hoss
>


SOLR 4 stopwords and token positions

2013-09-09 Thread Fermin Silva
Hi Everyone,

I'm migrating from SOLR 3.x to 4.x and I'm required to keep the results as
close as possible to what they were before.
So I'm running some tests and have found some differences.

My query is: *title_search_pt:(geladeira/refrigerador)*
And the parsed query becomes: *MultiPhraseQuery(title_search_pt:"(refriger
geladeir) (refriger geladeir)")*

This is identical in both instances (3.x and 4.x) so that's not the problem.

My document is:
*balcão refrigerado e geladeira frigorifica*

Which, after analysis, becomes:
*balca refriger geladeir frigorif*

That is also identical in both versions, *except for the token positions*.
Notice how 'e' disappears, because it is a stopword.

In SOLR 3.x the positions are: 1, 2, *3*, 4
In SOLR 4.x the positions are: 1, 2, *4*, 5

Could that be the problem?
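
To illustrate why the positions could matter, here is a toy Java model of an
exact phrase match (no slop) over token positions. It is a simplification for
illustration only, not Lucene's MultiPhraseQuery implementation: the class
and method names are invented, and the two maps simply encode the positions
listed above.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PhrasePositionSketch {
    /**
     * True if there is a start position p such that, for every i, the token
     * at position p + i is one of the alternatives allowed at phrase slot i.
     */
    static boolean matches(Map<Integer, String> doc, List<Set<String>> phrase) {
        for (int start : doc.keySet()) {
            boolean ok = true;
            for (int i = 0; i < phrase.size() && ok; i++) {
                String term = doc.get(start + i);
                // A hole left by a removed stopword yields null and breaks the phrase.
                ok = term != null && phrase.get(i).contains(term);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "(refriger geladeir) (refriger geladeir)": two adjacent slots.
        List<Set<String>> query = List.of(
            Set.of("refriger", "geladeir"),
            Set.of("refriger", "geladeir"));

        // Positions as reported for 3.x: the stopword leaves no gap.
        Map<Integer, String> solr3 =
            Map.of(1, "balca", 2, "refriger", 3, "geladeir", 4, "frigorif");
        // Positions as reported for 4.x: the stopword leaves a hole at 3.
        Map<Integer, String> solr4 =
            Map.of(1, "balca", 2, "refriger", 4, "geladeir", 5, "frigorif");

        System.out.println(matches(solr3, query)); // prints true
        System.out.println(matches(solr4, query)); // prints false
    }
}
```

Under this simplified model the document matches with the 3.x positions and
stops matching with the 4.x ones, which is consistent with the position gap
being at least part of the explanation.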

I posted a related question earlier ("phrase queries on punctuation").
I believe that behaviour, combined with the token position issue, is
causing the discrepancies.

I couldn't find any documentation or changelog entry about token positions
with stopwords; hell, I can barely google SOLR 4-specific things.
Can this be solved?

I wish I could solve the original StackOverflow question (prevent phrase
query generation with punctuation), but I could live with fixing the token
position thing at least (remember that if things work as before, then I am
able to upgrade to 4.x).

Thank you in advance

PS: just in case I'm adding the schema (version="1.5") part: