from:"peter360"

searching from command line?

2008-03-13 Thread peter360


Hi,

Is anyone aware of a command line tool that builds and searches a solr index
without running solr as a servlet?

My plan is to do the following: build and validate an index on a single
indexer machine, then push the index to a few search machines.  It seems to
me that there is no need to run a servlet container on the indexer box if a
tool as mentioned above exists.

Can someone point me in the right direction?
Thanks!
Peter
-- 
View this message in context: 
http://www.nabble.com/searching-from-command-line--tp16040889p16040889.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: searching from command line?

2008-03-14 Thread peter360


Great!  I'll take a look at Luke.  Thanks.
- peter


jonbaer wrote:
> 
> You can do this via Luke
> 
> http://www.getopt.org/luke/
> 
> -snip-
> Command-line argument parsing. Now you can open an index on startup,  
> and optionally execute a script
> Scripting plugin, which allows you to interactively experiment with  
> Luke and Lucene indexes. This plugin also can run scripts from Luke  
> command-line.
> -snip-
> 
> - Jon
> 
> On Mar 13, 2008, at 7:12 PM, peter360 wrote:
> 
>>
>> Hi,
>>
>> Is anyone aware of a command line tool that builds and searches a  
>> solr index
>> without running solr as a servlet?
>>
>> My plan is to do the following: build and validate an index on a  
>> single
>> indexer machine, then push the index to a few search machines.  It  
>> seems to
>> me that there is no need to run a servlet container on the indexer  
>> box if a
>> tool as mentioned above exists.
>>
>> Can someone point me to the right direction?
>> Thanks!
>> Peter
>>
> 
> 
> 


Re: searching from command line?

2008-03-14 Thread peter360


Thanks for your suggestion; I pretty much agree.  Part of the reason I am
looking for a command line tool, which I didn't mention in my original
question, is quick diagnosis.  I could point it at a different index just by
changing one command line parameter, without having to modify the
solrconfig.xml file, restart a servlet container, and then go to a browser
to issue a query.

Thanks to all who replied.  The Solr community is impressive.
- peter



wunder wrote:
> 
> Golly, let me think. I can use the out-of-the-box, tested Solr
> stuff for syncing indexes or I can invent some command line kludge
> that does the same thing, except I will need to write it and test
> it myself. Which one is easier?
> 
> Seriously, the existing Solr index distribution is great stuff.
> I strongly recommend that you try it and ask the list about
> any problems you run into. I'm quite impressed with Solr. It is
> working very well here at Netflix. We have 2M queries/day on
> the front end and 6M qpd at the search farm (five servers).
> 
> wunder
> 



capping term frequency?

2008-04-11 Thread peter360


Hi,
How do I cap the term frequency when computing relevancy scores in solr?

The problem is if a keyword repeats many times in the same document, I don't
want it to hijack the relevancy score.  Can I tell solr to cap the term
frequency at a certain threshold?

thanks.

Re: capping term frequency?

2008-04-14 Thread peter360


Thanks, that worked!


Otis Gospodnetic wrote:
> 
> Hi,
> 
> Probably by writing your own Similarity (Lucene codebase) and implementing
> the following method with capping:
> 
>   /** Implemented as sqrt(freq). */
>   public float tf(float freq) {
>     return (float)Math.sqrt(freq);
>   }
> 
> Then put that custom Similarity in a jar in Solr's lib and specify your
> Similarity FQCN in the <similarity> element at the bottom of schema.xml
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> ----- Original Message -----
> From: peter360 <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Friday, April 11, 2008 2:16:53 PM
> Subject: capping term frequency?
> 
> 
> Hi,
> How do I cap the term frequency when computing relevancy scores in solr?
> 
> The problem is if a keyword repeats many times in the same document, I
> don't
> want it to hijack the relevancy score.  Can I tell solr to cap the term
> frequency at a certain threshold?
> 
> thanks.
> 
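
A minimal sketch of the capping Otis describes, for anyone finding this thread later. The method he quotes is the default tf(); a capped variant clamps freq before taking the square root. The class name CappedTf and the threshold of 9 are assumptions made for illustration, not something from the thread. In a real setup this logic would go into an override of tf(float) in a subclass of Lucene's DefaultSimilarity; it is written standalone here so it runs without Lucene on the classpath.

```java
/**
 * Illustrative only: a term-frequency function capped at a threshold.
 * In Lucene this logic would live in an override of Similarity.tf(float).
 * The class name and the cap value are assumptions, not from the thread.
 */
final class CappedTf {
    /** Hypothetical threshold: occurrences beyond this add nothing to the score. */
    static final float MAX_FREQ = 9.0f;

    /** sqrt(freq) as in the default, but with freq clamped to MAX_FREQ first. */
    static float tf(float freq) {
        return (float) Math.sqrt(Math.min(freq, MAX_FREQ));
    }

    public static void main(String[] args) {
        // Print the capped value for a few frequencies, including one far above the cap.
        for (float freq : new float[] {1f, 4f, 9f, 100f}) {
            System.out.println("freq=" + freq + " tf=" + tf(freq));
        }
    }
}
```

With this shape, a document that repeats a keyword 100 times gets the same tf contribution as one that repeats it 9 times, which is the "no hijacking" behaviour asked about. The right threshold depends on the corpus and would need tuning.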


Re: Interleaved results from different sources

2008-04-15 Thread peter360


How do you get the top N/2 results from each source?  What if you have more
than 2 sources?


Mike Klaas wrote:
> 
> By far the easiest way is to get the top N/2 results from each source  
> and interleave on the client side.
> 
> regards,
> -Mike
> 
> 
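
One way the client-side interleaving Mike suggests might generalize to more than two sources is a round-robin merge: query each of the k sources separately for roughly N/k results (each source returns its own ranked list), then take rank 1 from every source, then rank 2, and so on until N results are collected. The sketch below assumes the per-source result lists have already been fetched; the class and method names are made up for illustration, and it does no de-duplication across sources.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative client-side merge; names are hypothetical, not a Solr API. */
final class Interleaver {
    /** Round-robin merge: rank 0 from every source, then rank 1, ... up to n results. */
    static <T> List<T> interleave(List<List<T>> sources, int n) {
        List<T> merged = new ArrayList<>();
        int longest = 0;
        for (List<T> source : sources) {
            longest = Math.max(longest, source.size());
        }
        for (int rank = 0; rank < longest && merged.size() < n; rank++) {
            for (List<T> source : sources) {
                // Sources that have run out of results are simply skipped.
                if (rank < source.size() && merged.size() < n) {
                    merged.add(source.get(rank));
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> sources = Arrays.asList(
                Arrays.asList("a1", "a2", "a3"),
                Arrays.asList("b1"),
                Arrays.asList("c1", "c2"));
        System.out.println(interleave(sources, 5));
    }
}
```

If some sources may return fewer hits than their share, asking each source for more than N/k results lets the others fill the gap.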


dismax handler and WordDelimiterFilterFactory

2008-05-21 Thread peter360


Hi,

Let's say I have an index with two fields, f1 and f2, and queries to both
are analyzed using WhitespaceTokenizerFactory and
WordDelimiterFilterFactory.  I use the dismax handler for queries and
observed the following anomaly.

Suppose I have a document with f1="american" and f2="idol".  Then a search
"q=american+idol&qt=dismax&qf=f1+f2" matches.  However, the search
"q=american-idol&qt=dismax&qf=f1+f2" does not, even though the analyzer
(WordDelimiterFilterFactory) turns "american-idol" into "american idol".

On closer inspection, the dismax handler parses the first query as something
like 
+(f1:american f2:american) +(f1:idol f2:idol)
while parsing the second as something like
f1:"american idol" f2:"american idol"

I feel this is an anomaly because, from an end user's point of view,
american-idol should be treated the same as american idol.  How do I achieve
this?  One possible solution is to index f1 and f2 as one field, but I want
to be able to give them separate boosts, such as "qf=f1^2+f2".  Any ideas?
Do people feel this is a bug in the dismax handler?
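
To make the difference between the two parses concrete, here is a small simulation of the matching logic only. It is a simplified boolean model written for illustration (no scoring, no real Lucene queries), and the class and method names are assumptions. The first parse needs each term to appear in some field; the second needs the whole phrase to appear inside a single field, which fails when "american" and "idol" live in different fields.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of the two parsed queries; not the actual dismax code. */
final class DismaxParseDemo {
    /** Models +(f1:t f2:t) for each term t: every term must occur in at least one field. */
    static boolean matchesPerTerm(Map<String, List<String>> doc, List<String> terms) {
        for (String term : terms) {
            boolean found = false;
            for (List<String> tokens : doc.values()) {
                if (tokens.contains(term)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    /** Models f1:"a b" f2:"a b": some single field must contain the terms adjacent and in order. */
    static boolean matchesPhrase(Map<String, List<String>> doc, List<String> terms) {
        for (List<String> tokens : doc.values()) {
            if (Collections.indexOfSubList(tokens, terms) >= 0) {
                return true;
            }
        }
        return false;
    }

    /** The document from the post: f1="american", f2="idol". */
    static Map<String, List<String>> exampleDoc() {
        Map<String, List<String>> doc = new HashMap<>();
        doc.put("f1", Arrays.asList("american"));
        doc.put("f2", Arrays.asList("idol"));
        return doc;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("american", "idol");
        System.out.println("q=american+idol matches: " + matchesPerTerm(exampleDoc(), terms));
        System.out.println("q=american-idol matches: " + matchesPhrase(exampleDoc(), terms));
    }
}
```

In this model the per-term form matches the example document and the per-field phrase form does not, which mirrors the behaviour described above.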

