Re: Export Index Data.

2010-11-20 Thread Ahmet Arslan
> Is it possible to export a set of documents indexed in one
> Solr server in order to
> synchronize them with another Solr server?

Replication? http://wiki.apache.org/solr/SolrReplication


  


Re: Issue with relevancy

2010-11-20 Thread Ahmet Arslan
> I am getting the results below, but the score of the first doc
> is higher
> than that of the second doc, even though the prod_n has only the
> word "Computers".
> I want to push the first doc down to second place.

You can use Jan's magic solution -that uses map function- for that.
http://search-lucene.com/m/nK6t9j1fuc2/



  


Re: String field with lower case filter

2010-11-20 Thread sivaprasad

Thank you, it is working perfectly.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/String-field-with-lower-case-filter-tp1930941p1935283.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Problem with synonyms

2010-11-20 Thread sivaprasad

Even after expanding the synonyms, I am unable to get the same results.

Is there any other method to achieve this?
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Problem-with-synonyms-tp1905051p1935419.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to Transmit and Append Indexes

2010-11-20 Thread Alex Baranau
Make sure you are not going to "reinvent the wheel" here ;). A lot has
already been done around the problem of distributed search engines.
This thread might be useful for you: http://search-hadoop.com/m/ARlbS1MiTNY

Alex Baranau

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase

On Fri, Nov 19, 2010 at 5:52 PM, Bing Li  wrote:

> Hi, all,
>
> I am working on a distributed search system. Right now I have only one server.
> It has to crawl pages from the Web, generate indexes locally and respond to
> users' queries. I think this is too much work for it to run smoothly.
>
> I plan to use at least two servers. The jobs of crawling pages and generating
> indexes are done by one of them. After that, the newly available indexes
> should be transmitted to another one, which is responsible for responding to
> users' queries. From the users' point of view, this system must be fast.
> However, I don't know how I can get the additional indexes which I can
> transmit. After transmission, how do I append them to the old indexes? Does
> the appending block searching?
>
> Thanks so much for your help!
>
> Bing Li
>


Re: Problem with synonyms

2010-11-20 Thread Robert Muir
On Tue, Nov 16, 2010 at 1:16 AM, sivaprasad  wrote:
> Query1:hdtv
>
> MultiPhraseQuery(searchtext:"high definit (televis
> tv tvs)")
>
> and the number of results returned is ZERO.
>
> Query2:High Definition Television
>
> The parsed query is given below.
> +searchtext:high +searchtext:definit
> +(searchtext:televis searchtext:tv searchtext:tvs)
>
> And the number of results is 1.
>

Please see 
http://mail-archives.apache.org/mod_mbox/lucene-dev/201011.mbox/%3caanlktimatgvplph_mgfbsughdoedc8tc2brrwxhid...@mail.gmail.com%3e
which explains the problem, which is "autophrase" generation by the queryparser.

you will need to either use the workaround, or upgrade to an
unreleased version and manually turn off this *very bad* default.


Re: Problem with synonyms

2010-11-20 Thread Ahmet Arslan
What happens when you use the synonym filter at index time only, with 
expand="true" and this synonym_index.txt?

I use only the comma operator:

hdtv, High Definition Television, High Definition TV, High Definition
Televisions, High Definition TVs

Also, placing the synonym filter after the stem filter may be useful in your 
case. The Porter stemmer can then handle the televisions-to-television
transformation.
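
To illustrate the difference between the comma-only rule above and the
explicit "=>" rule quoted below, here is a toy Python sketch. It is not
Solr's implementation: parse_rule is a hypothetical helper, phrases are only
lower-cased, and no tokenization, stemming or position handling is modelled.
It assumes the documented behaviour that an explicit mapping replaces the
left-hand side, while a comma-only line with expand="true" makes every listed
phrase equivalent.

```python
def parse_rule(line, expand=True):
    """Return {input phrase: [output phrases]} for one synonyms.txt-style line.

    Toy model only: phrases are stripped and lower-cased, nothing else.
    """
    if "=>" in line:
        # Explicit mapping: each left-hand phrase is replaced by the right-hand list.
        lhs, rhs = line.split("=>", 1)
        inputs = [p.strip().lower() for p in lhs.split(",") if p.strip()]
        outputs = [p.strip().lower() for p in rhs.split(",") if p.strip()]
        return {phrase: outputs for phrase in inputs}
    phrases = [p.strip().lower() for p in line.split(",") if p.strip()]
    if expand:
        # Equivalent synonyms: every phrase expands to the whole group.
        return {phrase: phrases for phrase in phrases}
    # expand=False: every phrase is reduced to the first one in the list.
    return {phrase: [phrases[0]] for phrase in phrases}


explicit = parse_rule("hdtv => High Definition Television, High Definition TV")
equivalent = parse_rule("hdtv, High Definition Television, High Definition TV")
```

In this toy model only "hdtv" is a key of explicit, so the longer phrases are
never mapped back to "hdtv", whereas every phrase in equivalent expands to the
full group.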

--- On Tue, 11/16/10, sivaprasad  wrote:

> From: sivaprasad 
> Subject: Re: Problem with synonyms
> To: solr-user@lucene.apache.org
> Date: Tuesday, November 16, 2010, 8:16 AM
> 
> I did changes to the schema file as shown below.
> 
> 
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
>   
>         <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.EnglishPorterFilterFactory"
> protected="protwords.txt"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   
> 
> And i have an entry in the synonym.txt file as shown
> below.
> 
> hdtv => High Definition Television, High Definition
> TV,High Definition
> Televisions,High Definition TVs
> 
> Now i submitted the query with debugQuery=on .
> 
> Query1:hdtv
> 
> The parsed query is given below.
> 
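> <str name="rawquerystring">hdtv</str>
> <str name="querystring">hdtv</str>
> <str name="parsedquery">MultiPhraseQuery(searchtext:"high definit (televis tv tvs)")</str>
> <str name="parsedquery_toString">searchtext:"high definit (televis tv tvs)"</str>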
> 
> and the number of results returned is ZERO.
> 
> Query2:High Definition Television
> 
> The parsed query is given below.
> <str name="rawquerystring">High Definition Television</str>
> <str name="querystring">High Definition Television</str>
> <str name="parsedquery">+searchtext:high +searchtext:definit +(searchtext:televis searchtext:tv searchtext:tvs)</str>
> <str name="parsedquery_toString">+searchtext:high +searchtext:definit +(searchtext:televis searchtext:tv searchtext:tvs)</str>
> 
> And the number of results is 1.
> 
> Why am I getting results like this even after expanding
> the synonyms?
> -- 
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Problem-with-synonyms-tp1905051p1909369.html
> Sent from the Solr - User mailing list archive at
> Nabble.com.
> 





Re: Master/Slave High CPU Usage

2010-11-20 Thread Ofer Fort
Another question on that configuration: when the "master" commits, how does
the "slave" know that the index has changed? Does it check the index and
find out that it has a newer version?
Thanks again for the help,
Ofer



On 19 Nov 2010, at 05:30, Lance Norskog  wrote:

If they are on the same server, you do not need to replicate.

If you only do queries, the query server can use the same index
directory as the master. Works quite well. Both have to have the same
LockPolicy in solrconfig.xml. For security reasons, I would run the
query server as a different user who has read-only access to the
index; that way it cannot touch the index.

On Wed, Nov 17, 2010 at 11:28 PM, Ofer Fort  wrote:

anybody?


On Wed, Nov 17, 2010 at 12:09 PM, Ofer Fort  wrote:


Hi, I'm working with Erez,

we experienced this again, and this time the slave index folder didn't
contain the index.XXX folder, only one index folder.

if we shutdown the slave, the CPU on the master was normal, as soon as we
started the slave again, the CPU went up to 100% again.

thanks for any help

ofer


On Wed, Nov 17, 2010 at 11:15 AM, Erez Zarum  wrote:


Hi all,

We've been seeing this for the second time already.

I have a solr (1.4.1) master and a slave. both are located on the same
machine (16GB RAM, 4GB allocated to the slave and 3GB to the master)

All our updates are going towards the master, and all the queries are
towards the slave.

Once in a while the slave gets an OutOfMemoryError. This is not the big problem
(I have about 100M documents).

The problem is that from that moment the CPU of the slave AND the master is
almost 100%.

If i shutdown the slave, the CPU of the master drops.

If i start the slave again, the CPU is 100% again.

I have the replication set on commit and startup.

I see that the data folder contains three index folders: index,
index.XXXYYY and index.XXXYYY.ZZZ


The only way I was able to get past it (it has worked twice already) is to
shut down the two servers, copy the entire index from the master to the
slave, and start them again.

From that moment on, they continue to work and replicate with very
reasonable CPU usage.


Our guess is that it failed to replicate due to the OOM and since then tries
to do a full replication again and again?

but why is the CPU of the master so high?






-- 
Lance Norskog
goks...@gmail.com


Re: Master/Slave High CPU Usage

2010-11-20 Thread Erick Erickson
The slave polls. See: http://wiki.apache.org/solr/SolrReplication
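
As a rough illustration of what polling amounts to, the Python sketch below
builds the URL a poller could request and decides whether a fetch is needed.
It assumes the replication handler is mounted at /replication and answers
command=indexversion, as described on the wiki page above; the host, port and
both helper names are hypothetical, and the comparison is a simplification of
what the slave really does.

```python
from urllib.parse import urlencode


def indexversion_url(master_base_url):
    """Build the URL a poller would request to learn the master's index version."""
    params = urlencode({"command": "indexversion", "wt": "json"})
    return master_base_url.rstrip("/") + "/replication?" + params


def should_fetch(master_version, slave_version):
    """Simplified rule: pull a new index whenever the two versions differ."""
    return master_version != slave_version


# Hypothetical master location; nothing is sent over the network here.
url = indexversion_url("http://localhost:8983/solr/")
```

A real slave repeats this check every pollInterval and then downloads only the
index files it is missing.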

Best
Erick

On Sat, Nov 20, 2010 at 1:13 PM, Ofer Fort  wrote:

> Another question on that configuration: when the "master" commits, how does
> the "slave" know that the index has changed? Does it check the index and
> find out that it has a newer version?
> Thanks again for the help,
> Ofer


Empty value/string matching

2010-11-20 Thread Viswa S

Folks,

I am trying to query for documents that have no value present in a field. I
have used the following constructs, and none of them seems to work on the
Solr dev tip (as of 09/22) or the 1.4 builds:

1. (*:* AND -FieldName[* TO *]) - returns no documents; parsedquery was
   "+MatchAllDocsQuery(*:*) -FieldName:[* TO *]"
2. -FieldName:[* TO *] - returns no documents; parsedquery was
   "-FieldName:[* TO *]"
3. FieldName:"" - returns no documents; parsedquery was empty ()

The field is of type string, and I am using the LuceneQParser. I have also
tried "FieldName:[* TO *]" to see whether the documents with no terms are
ignored; that didn't seem to be the case, as the result set was everything.

Any help would be appreciated.

-Viswa
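
For reference, the usual way to ask for documents with no indexed value in a
field is to match everything and subtract the documents that have any term in
that field. The Python sketch below only builds such a request URL; the base
URL and both helper names are hypothetical, and an empty string or a single
space that was indexed as a value would still count as a value for this
query.

```python
from urllib.parse import urlencode


def missing_field_query(field):
    """Match all documents, then exclude those with any term in `field`."""
    return "*:* -{0}:[* TO *]".format(field)


def select_url(base_url, query, rows=10):
    """Build a /select URL with debugQuery enabled so parsedquery can be inspected."""
    params = {"q": query, "rows": rows, "debugQuery": "on"}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical Solr location; the URL is built but not requested.
url = select_url("http://localhost:8983/solr", missing_field_query("FieldName"))
```

Note that the field name is followed by a colon here, which construct 1 in
the message above leaves out.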

Re: Empty value/string matching

2010-11-20 Thread Erick Erickson
Are you absolutely sure your documents really don't have any values for
"FieldName"? Because your results are perfectly correct if every doc has a
value for "FieldName".

Or are you saying there is no such field as "FieldName"?

Best
Erick

On Sat, Nov 20, 2010 at 3:12 PM, Viswa S  wrote:

>
> Folks,
>
> I am trying to query for documents that have no value present in a field. I
> have used the following constructs, and none of them seems to work on the
> Solr dev tip (as of 09/22) or the 1.4 builds:
>
> 1. (*:* AND -FieldName[* TO *]) - returns no documents; parsedquery was
>    "+MatchAllDocsQuery(*:*) -FieldName:[* TO *]"
> 2. -FieldName:[* TO *] - returns no documents; parsedquery was
>    "-FieldName:[* TO *]"
> 3. FieldName:"" - returns no documents; parsedquery was empty ()
>
> The field is of type string, and I am using the LuceneQParser. I have also
> tried "FieldName:[* TO *]" to see whether the documents with no terms are
> ignored; that didn't seem to be the case, as the result set was everything.
>
> Any help would be appreciated.
>
> -Viswa
>


RE: Empty value/string matching

2010-11-20 Thread Viswa S

Yes I do have a couple of documents with no values and one with an empty 
string. Find below the output of a facet on the fieldName.
Thanks
Viswa


1
> Date: Sat, 20 Nov 2010 15:29:06 -0500
> Subject: Re: Empty value/string matching
> From: erickerick...@gmail.com
> To: solr-user@lucene.apache.org
> 
> Are you absolutely sure your documents really don't have any values for
> "FieldName"? Because your results are perfectly correct if every doc has a
> value for "FieldName".
> 
> Or are you saying there no such field as "FieldName"?
> 
> Best
> Erick
  

Re: Master/Slave High CPU Usage

2010-11-20 Thread Ofer Fort
thanks Erick,
but my question was regarding the configuration Lance suggested, a
configuration where I have two servers set up as a logical master and slave,
not as true replication. Since both are running on the same machine, I just
have one doing only updates and the other doing only queries, but both are using
the same index files.

Ofer


On Sat, Nov 20, 2010 at 8:52 PM, Erick Erickson wrote:

> The slave polls. See: http://wiki.apache.org/solr/SolrReplication
>
> Best
> Erick


Re: Empty value/string matching

2010-11-20 Thread Erick Erickson
I don't think that's correct. The documents wouldn't be showing
up in the facets if they had no value for the field. So I think you're
being misled by the printout from the faceting. Perhaps you
have unprintable characters in there or some such. Certainly the
name:" " is actually a value, admittedly just a space. As for the
other, I suspect something similar.

What results do you get back when you just search for
FieldName:[* TO *]? I'm betting you get all the docs back,
but I've been very wrong before.

Best
Erick

On Sat, Nov 20, 2010 at 5:02 PM, Viswa S  wrote:

>
> Yes I do have a couple of documents with no values and one with an empty
> string. Find below the output of a facet on the fieldName.
> Thanks
> Viswa
>
>
> 22 name="GDOGPRODY.424">221


RE: Empty value/string matching

2010-11-20 Thread Viswa S

Erick,
Thanks for the quick response. The output I showed is from a test instance I 
created to simulate this issue. I intentionally tried to create documents with 
no values by creating xml nodes with "", while 
having values in the other fields of the document. 
Are you saying that there is no way to have a field with no value? That seems 
to make more sense for text fields than for string fields.
You are right about the fieldName:[* TO *] results: it basically returned all 
the documents, including the couple of documents in question. 
-Viswa
> Date: Sat, 20 Nov 2010 17:20:53 -0500
> Subject: Re: Empty value/string matching
> From: erickerick...@gmail.com
> To: solr-user@lucene.apache.org
> 
> I don't think that's correct. The documents wouldn't be showing
> up in the facets if they had no value for the field. So I think you're
> being misled by the printout from the faceting. Perhaps you
> have unprintable characters in there or some such. Certainly the
> name:" " is actually a value, admittedly just a space. As for the
> other, I suspect something similar.
> 
> What results do you get back when you just search for
> FieldName:[* TO *]? I'm betting you get all the docs back,
> but I've been very wrong before.
> 
> Best
> Erick
  

Re: Master/Slave High CPU Usage

2010-11-20 Thread Lance Norskog
Ah! If the program doing the indexing has manual commits, the program
could send a commit to the slave. If the indexer uses automatic
commits, there is a trick: you can add a program as a postCommit event
in solrconfig.xml. This can just be a shell script or a curl command
that sends a commit to the slave Solr.

Be sure to make all of the wait options false to this command; you
don't want the master to block while the slave loads up the new index.
Or, to control the maximum load on your server, you might actually
want to make the master wait while the slave loads up.
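
A minimal Python sketch of such a commit request follows. It assumes the
1.4-era XML update format, in which a commit message accepts waitFlush and
waitSearcher attributes; the slave URL and the helper name are hypothetical,
and the request is only built here, not sent.

```python
from urllib.request import Request


def build_commit_request(slave_base_url, wait_flush=False, wait_searcher=False):
    """Build a POST to /update that asks the slave to reopen its searcher.

    With both wait options false the caller should not block while the
    slave loads the new index.
    """
    body = '<commit waitFlush="{0}" waitSearcher="{1}"/>'.format(
        str(wait_flush).lower(), str(wait_searcher).lower()
    )
    return Request(
        slave_base_url.rstrip("/") + "/update",
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


# Hypothetical slave location; urllib.request.urlopen(request) would send it.
request = build_commit_request("http://localhost:8984/solr")
```

Passing wait_searcher=True instead corresponds to the second option above,
where the master side waits until the slave has loaded the new index.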

Lance

On Sat, Nov 20, 2010 at 2:13 PM, Ofer Fort  wrote:
> thanks Erick,
> but my question was regarding the configuration Lance suggested, a
> configuration where I have two servers set up as a logical master and slave,
> not as true replication. Since both are running on the same machine, I just
> have one doing only updates and the other doing only queries, but both are using
> the same index files.
>
> Ofer



-- 
Lance Norskog
goks...@gmail.com


Re: Empty value/string matching

2010-11-20 Thread Lance Norskog
If a string field has a value with " ", that has to be searched for.
fieldName:" " should work.
If there is a 0-length value in a string field, that might be found
with fieldName:"" but I have no experience with 0-length values. I
don't know if this adds a value to the field or not:
""

One way to find out is to make that field required in the schema. If
no value goes in, you'll get an error.

The facet output should list " " and "".


On Sat, Nov 20, 2010 at 2:38 PM, Viswa S  wrote:
>
> Erick,
> Thanks for the quick response. The output I showed is from a test instance I 
> created to simulate this issue. I intentionally tried to create documents 
> with no values by creating xml nodes with "", 
> while having values in the other fields of the document.
> Are you saying that there is no way to have a field with no value? That seems 
> to make more sense for text fields than for string fields.
> You are right about the fieldName:[* TO *] results: it basically returned all 
> the documents, including the couple of documents in question.
> -Viswa



-- 
Lance Norskog
goks...@gmail.com


Re: Master/Slave High CPU Usage

2010-11-20 Thread Ofer Fort
OK,
so to make sure i understand, even though the "slave" doesn't do any
indexing, i will call commit and it will do nothing to the index itself, but
will reload it?
thanks

On Sun, Nov 21, 2010 at 8:26 AM, Lance Norskog  wrote:

> Ah! If the program doing the indexing has manual commits, the program
> could send a commit to the slave. If the indexer uses automatic
> commits, there is a trick: you can add a program as a postCommit event
> in solrconfig.xml. This can just be a shell script or a curl command
> that sends a commit to the slave Solr.
>
> Be sure to make all of the wait options false to this command; you
> don't want the master to block while the slave loads up the new index.
> Or, to control the maximum load on your server, you might actually
> want to make the master wait while the slave loads up
>
> Lance