multicore replication slave

2010-10-11 Thread Christopher Bottaro
Hello,

I can't get my multicore slave to replicate from the master.

The master is set up properly, and the following URLs return an "OK"
status with the message "No command", as expected:
http://solr.mydomain.com:8983/solr/core1/replication
http://solr.mydomain.com:8983/solr/core2/replication
http://solr.mydomain.com:8983/solr/core3/replication

The following pastie shows how my slave is set up:
http://pastie.org/1214209

But it's not working (i.e. I see no replication attempts in the slave's log).

Any ideas?

Thanks for the help.


Re: multicore replication slave

2010-10-12 Thread Christopher Bottaro
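Answered my own question.  Instead of naming each core in the
replication handler, you use a variable:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://solr.mydomain.com:8983/solr/${solr.core.name}/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>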

That will get all of your cores replicating.
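
To illustrate what the variable buys you, here is a minimal Python sketch of the substitution. It is not Solr's implementation; the expand_master_url helper is made up for this example, and it only assumes that each core resolves ${solr.core.name} to its own name.

```python
# Illustrative only: mimics how one shared masterUrl template could resolve
# to a different URL in each core. Solr performs the real substitution itself.
TEMPLATE = "http://solr.mydomain.com:8983/solr/${solr.core.name}/replication"


def expand_master_url(template: str, core_name: str) -> str:
    """Replace the ${solr.core.name} placeholder with a concrete core name."""
    return template.replace("${solr.core.name}", core_name)


if __name__ == "__main__":
    # Core names taken from the master URLs listed in the original question.
    for core in ("core1", "core2", "core3"):
        print(expand_master_url(TEMPLATE, core))
```

Each core ends up polling the replication URL of the master core with the same name, so a single handler definition can be shared by every core.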

-- C



stopwords not working in multicore setup

2011-03-24 Thread Christopher Bottaro
Hello,

I'm running a Solr server with 5 cores.  Three are for English content and
two are for German content.  The default stopwords setup works fine for the
English cores, but the German stopwords aren't working.

The German stopwords file is stopwords-de.txt and resides in the same
directory as stopwords.txt.  The German cores use a different schema (named
schema.page.de.xml) which has the following text field definition:
http://pastie.org/1711866

The stopwords-de.txt file looks like this:  http://pastie.org/1711869

The query I'm doing is this:  q => "title:für"

And it's returning documents with für in the title.  Title is a text field
which should use the stopwords-de.txt, as seen in the aforementioned pastie.

Any ideas?  Thanks for the help.
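
One thing that can be ruled out quickly is the encoding of the stopwords file itself. The Python sketch below is only a rough check under stated assumptions: the helper names are hypothetical, the sample bytes stand in for the real stopwords-de.txt, and the file format is assumed to be one word per line with '#' comment lines.

```python
def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if the bytes are valid UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def load_stopwords(raw: bytes, encoding: str = "utf-8") -> set:
    """Parse one stopword per line, skipping blank lines and '#' comments."""
    text = raw.decode(encoding, errors="replace")
    return {
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }


if __name__ == "__main__":
    # Stand-ins for the file contents; in practice read the file in binary mode.
    saved_as_latin1 = "denn\nfür\n".encode("latin-1")
    saved_as_utf8 = "denn\nfür\n".encode("utf-8")

    print(decodes_as_utf8(saved_as_latin1))
    print("für" in load_stopwords(saved_as_latin1))
    print("für" in load_stopwords(saved_as_utf8))
```

If the file was saved as Latin-1 but is read as UTF-8, the word with the umlaut never ends up in the stopword set, while plain ASCII words still do.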


Re: stopwords not working in multicore setup

2011-03-25 Thread Christopher Bottaro
Ahh, thank you for the hints Martin... German stopwords without Umlaut work
correctly.

So I'm trying to figure out where the UTF-8 chars are getting messed up.
Using the Solr admin web UI, I did a search for title:für and the XML (or
JSON) output in the browser shows the query with the proper encoding, but
the Solr logs show this:

INFO: [page_30d_de] webapp=/solr path=/select
params={explainOther=&fl=*,score&indent=on&start=0&q=title:f?r&hl.fl=&qt=standard&wt=xml&fq=&version=2.2&rows=10}
hits=76 status=0 QTime=2

Notice the title:f?r.  How do I fix that?  I'm using Jetty btw...

Thanks for the help.
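
For reference, here is a small Python sketch of how a charset mismatch on the request URL could mangle the term. It is only an illustration: the "?" in the log may just as well come from the encoding of the log output, and nothing here is specific to Jetty or Solr.

```python
from urllib.parse import quote, unquote

term = "für"

# The same character percent-encodes differently depending on the charset.
as_utf8 = quote(term, encoding="utf-8")      # two bytes for the umlaut
as_latin1 = quote(term, encoding="latin-1")  # one byte for the umlaut

# Decoding with the wrong charset no longer yields the original term,
# so it would not match the "für" entry in a stopword list.
utf8_read_as_latin1 = unquote(as_utf8, encoding="latin-1")
latin1_read_as_utf8 = unquote(as_latin1, encoding="utf-8", errors="replace")

if __name__ == "__main__":
    print(as_utf8, as_latin1)
    print(utf8_read_as_latin1 == term, latin1_read_as_utf8 == term)
```

If the client and the servlet container disagree about the URL charset, the term that reaches the analyzer is not the one that was typed, which would explain why only the stopwords with umlauts fail.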

On Fri, Mar 25, 2011 at 3:05 AM, Martin Rödig  wrote:

> I have some questions about your config:
>
> Is the stopwords-de.txt in the same directory as the schema.xml?
> Is the title field of type text?
> Do you have the same problem with German stopwords without an umlaut
> (ü, ö, ä), such as the word "denn"?
>
> One problem could be that the stopwords-de.txt is not saved as UTF-8, so
> the filter cannot read the umlaut ü in the file.
>
>
> Kind regards,
> M.Sc. Dipl.-Inf. (FH) Martin Rödig


Boost a document score via query using MoreLikeThisHandler

2010-03-01 Thread Christopher Bottaro
Hello,

Is it possible to boost a document's score based on something like
fq=site(com.google*)?  In other words, I want to boost the score of
documents whose "site" field starts with "com.google".

I'm using the MoreLikeThisHandler.

Thanks for the help,
-- Christopher


Re: Boost a document score via query using MoreLikeThisHandler

2010-03-01 Thread Christopher Bottaro

Ok, I think I need to do this with BoostQParserPlugin and nested
queries, but I can't quite figure it out.

So this works...
q={!boost b=log(popularity)}(title:barack OR title:obama)

But instead of boosting by popularity, I want to boost by site:
q={!boost b=query({ !query q='site:*.yahoo.com' })}(title:barack OR title:obama)

That doesn't work.  This is the exception I get:
org.apache.lucene.queryParser.ParseException: Expected identifier at
pos 18 str='{!boost b=query({ !query q='site:*.yahoo.com'
})}(title:barack OR title:obama)'

Any tips?  Thanks.
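
One variation that might be worth trying is to move the nested query into its own request parameter and dereference it, using the query($qq, 0.1) form that shows up in a later message on this list. The Python sketch below only builds the request parameters; the helper name is hypothetical, the exact site value is swapped for the one from that later message, and none of this has been verified against the MoreLikeThisHandler.

```python
from urllib.parse import urlencode, parse_qs


def build_boost_params(main_query: str, boost_query: str, default: float = 0.1) -> dict:
    """Build q/qq parameters where the boost is the score of a separate query.

    Assumption: the server accepts query($qq,default) inside {!boost b=...}.
    """
    return {
        "q": "{!boost b=query($qq," + str(default) + ")}" + main_query,
        "qq": boost_query,
    }


if __name__ == "__main__":
    params = build_boost_params("(title:barack OR title:obama)", "site:news.yahoo.com")
    # urlencode percent-escapes the braces, quotes and spaces for the URL.
    print(urlencode(params))
```

Keeping the site query in its own parameter avoids nesting quotes and local params inside the function call, which is where the parser gave up in the attempt above.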


How to see the query generated by MoreLikeThisHandler?

2010-03-03 Thread Christopher Bottaro
Hello,

Is there a way to see exactly what query is generated by the
MoreLikeThisHandler?  If I send debugQuery=true then I see in the
response a key called "parsedquery" but it doesn't seem quite right.

What I mean by that is when I make the MoreLikeThis query, I set
"mlt.fl" to "title,content" but the query shown in "parsedquery" does
not query on "title" at all... only on "content".  Furthermore, the
query looks something like this "content:word1 content:word2
content:word3" but if I copy and paste that into a standard query,
nothing comes back because the default term operator is AND.

If I change that query to "content:word1 OR content:word2 OR
content:word3", I get results but they are not the same as what the
MLT query returns.

Is there a way to see the generated query without actually running it?
As of now, I'm making an MLT query with rows=0, but I think it's still
running the query because it takes a nontrivial amount of time and it
also shows "numFound" in the response.

Thanks for the help,
-- Christopher
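
To make the AND versus OR point concrete, here is a toy Python sketch with made-up documents. It only models which documents match; it says nothing about how the MLT handler scores or weights its terms, which may be why the OR results still differ.

```python
# Hypothetical documents, each reduced to the set of terms in "content".
DOCS = {
    1: {"word1", "word2"},
    2: {"word3"},
    3: {"other"},
}


def search(terms, operator):
    """Return ids of documents matching all terms (AND) or any term (OR)."""
    combine = all if operator == "AND" else any
    return sorted(
        doc_id
        for doc_id, words in DOCS.items()
        if combine(term in words for term in terms)
    )


if __name__ == "__main__":
    terms = ["word1", "word2", "word3"]
    print(search(terms, "AND"))  # no document holds all three terms
    print(search(terms, "OR"))   # any single term is enough
```

With AND as the default operator, "content:word1 content:word2 content:word3" requires every term in one document, so pasting the parsed query into a standard search can come back empty.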


DisMaxRequestHandler questions about bf and bq

2010-03-03 Thread Christopher Bottaro
Hello,

I have a couple of questions regarding the bf and bq params to the
DisMaxRequestHandler.

1)  Can I specify them more than once?  Ex:
bf=log(popularity)&bf=log(comment_count)

2)  When using bq, how can I specify what score to use for documents
not returned by the query?  In other words, how do I mimic this
behavior using bq:
bf=query($qq, 0.1)&qq=site:news.yahoo.com


Thanks for the help!
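
For question 1, this is a minimal Python sketch of how a client can send the same parameter twice. The q value is a placeholder, and the sketch only shows the client side; whether the handler applies every bf occurrence is exactly what is being asked.

```python
from urllib.parse import urlencode, parse_qs

# A list of pairs (rather than a dict) lets the same key appear twice.
params = [
    ("q", "example"),  # placeholder query, not from the original message
    ("bf", "log(popularity)"),
    ("bf", "log(comment_count)"),
]

query_string = urlencode(params)

if __name__ == "__main__":
    print(query_string)
    # Parsing it back shows both bf values survive the round trip.
    print(parse_qs(query_string)["bf"])
```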