searching from command line?
Hi,

Is anyone aware of a command line tool that builds and searches a Solr index without running Solr as a servlet?

My plan is to do the following: build and validate an index on a single indexer machine, then push the index to a few search machines. It seems to me that there is no need to run a servlet container on the indexer box if such a tool exists.

Can someone point me in the right direction?

Thanks!
Peter
--
View this message in context: http://www.nabble.com/searching-from-command-line--tp16040889p16040889.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: searching from command line?
Great! I'll take a look at Luke. Thanks.

- peter

jonbaer wrote:
> You can do this via Luke
>
> http://www.getopt.org/luke/
>
> -snip-
> Command-line argument parsing. Now you can open an index on startup,
> and optionally execute a script
> Scripting plugin, which allows you to interactively experiment with
> Luke and Lucene indexes. This plugin also can run scripts from Luke
> command-line.
> -snip-
>
> - Jon
>
> On Mar 13, 2008, at 7:12 PM, peter360 wrote:
>
>> Hi,
>>
>> Is anyone aware of a command line tool that builds and searches a
>> solr index without running solr as a servlet?
>>
>> My plan is to do the following: build and validate an index on a
>> single indexer machine, then push the index to a few search machines.
>> It seems to me that there is no need to run a servlet container on
>> the indexer box if a tool as mentioned above exists.
>>
>> Can someone point me to the right direction?
>> Thanks!
>> Peter
Re: searching from command line?
Thanks for your suggestion, and I pretty much agree. Part of the reason I am looking for a command line tool, which I didn't mention in my original question, is to use it for quick diagnosis. I could point it to a different index just by changing one of the command line parameters, without having to modify the solrconfig.xml file, restart a servlet container, and then go to a browser to issue a query.

Thanks to all who replied. The Solr community is impressive.

- peter

> Golly, let me think. I can use the out-of-the-box, tested Solr stuff
> for syncing indexes or I can invent some command line kludge that
> does the same thing, except I will need to write it and test it
> myself. Which one is easier?
>
> Seriously, the existing Solr index distribution is great stuff. I
> strongly recommend that you try it and ask the list about any
> problems you run into.
>
> I'm quite impressed with Solr. It is working very well here at
> Netflix. We have 2M queries/day on the front end and 6M qpd at the
> search farm (five servers).
>
> wunder
capping term frequency?
Hi,

How do I cap the term frequency when computing relevancy scores in Solr?

The problem is that if a keyword repeats many times in the same document, I don't want it to hijack the relevancy score. Can I tell Solr to cap the term frequency at a certain threshold?

Thanks.
Re: capping term frequency?
Thanks, that worked!

Otis Gospodnetic wrote:
> Hi,
>
> Probably by writing your own Similarity (Lucene codebase) and
> implementing the following method with capping:
>
>   /** Implemented as sqrt(freq). */
>   public float tf(float freq) {
>     return (float)Math.sqrt(freq);
>   }
>
> Then put that custom Similarity in a jar in Solr's lib and specify
> your Similarity FQCN at the bottom of solrconfig.xml
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message -----
> From: peter360 <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Friday, April 11, 2008 2:16:53 PM
> Subject: capping term frequency?
>
> Hi,
> How do I cap the term frequency when computing relevancy scores in
> solr?
>
> The problem is if a keyword repeats many times in the same document,
> I don't want it to hijack the relevancy score. Can I tell solr to cap
> the term frequency at a certain threshold?
>
> thanks.
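For readers wondering what "with capping" might look like, here is a minimal sketch, not the code Peter actually deployed. It keeps the sqrt(freq) curve from the quoted method but clamps the raw frequency first. The class name `CappedTfSketch` and the cap value `MAX_FREQ` are hypothetical. In a real setup the method body would live in a custom Similarity subclass as Otis describes; that superclass is left out here so the sketch compiles without Lucene on the classpath.

```java
class CappedTfSketch {
    /** Hypothetical cap; choose a value that suits your data. */
    static final float MAX_FREQ = 5.0f;

    /**
     * Same sqrt(freq) curve as the quoted default, but the raw frequency
     * is clamped to MAX_FREQ first, so repeats beyond the cap add nothing.
     */
    static float tf(float freq) {
        return (float) Math.sqrt(Math.min(freq, MAX_FREQ));
    }
}
```

With this shape, a term occurring 100 times in a document contributes the same tf as one occurring 5 times, while frequencies below the cap are scored as before.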
Re: Interleaved results from different sources
How do you get the top N/2 results from each source? What if you have more than 2 sources?

Mike Klaas wrote:
> By far the easiest way is to get the top N/2 results from each source
> and interleave on the client side.
>
> regards,
> -Mike
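One possible reading of "interleave on the client side" that also covers more than two sources is a round-robin merge. The sketch below is an assumption about what Mike meant, not something confirmed in the thread, and the class and method names are hypothetical. Each inner list is taken to be one source's results, already sorted best-first, for example from one query per source.

```java
import java.util.ArrayList;
import java.util.List;

class InterleaveSketch {
    /**
     * Round-robin merge: take the 1st hit of every source, then the 2nd
     * of every source, and so on, stopping once n results are collected
     * or every source is exhausted.
     */
    static <T> List<T> interleave(List<List<T>> sources, int n) {
        List<T> merged = new ArrayList<>();
        boolean tookAny = true;
        for (int rank = 0; merged.size() < n && tookAny; rank++) {
            tookAny = false;
            for (List<T> source : sources) {
                if (rank < source.size() && merged.size() < n) {
                    merged.add(source.get(rank));
                    tookAny = true;
                }
            }
        }
        return merged;
    }
}
```

With k sources, fetching roughly N/k results from each (rounded up) is enough to fill N slots when every source has that many hits. Fetching a few extra per source lets the merge backfill when one source comes up short.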
top document in faceted query?
The faceted query returns a list of values with associated doc counts. Is it possible to also get the top document id associated with each value? Basically I want a "representative" for each facet group. What is the best way to achieve this?

To be more precise, suppose the query "q=x&facet=true&facet.field=f" yields the following result:

    ...
    <lst name="f">
      <int name="a">100</int>
      <int name="b">90</int>
      <int name="c">90</int>
      ...
    </lst>
    ...

I could then get the top document for each value by issuing a sequence of queries

    q=x&fq=f:a&rows=1
    q=x&fq=f:b&rows=1
    q=x&fq=f:c&rows=1
    ...

Is there a way to do this in one query?

Thanks.
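The query-per-value approach described in the message can be sketched as below. It does not answer the question of doing this in a single request. It only shows how a client might generate the follow-up queries from the facet values it got back, using Solr's `rows` parameter to limit each one to a single hit. The class and method names are hypothetical, and the sketch assumes the query and facet values are simple tokens that need no URL encoding or escaping.

```java
import java.util.ArrayList;
import java.util.List;

class FacetTopDocQueries {
    /**
     * Builds one "top hit" query string per facet value: the original
     * query q, filtered to a single value of the facet field, one row.
     * Assumes q and the values need no URL encoding.
     */
    static List<String> topDocQueries(String q, String field, List<String> values) {
        List<String> queries = new ArrayList<>();
        for (String value : values) {
            queries.add("q=" + q + "&fq=" + field + ":" + value + "&rows=1");
        }
        return queries;
    }
}
```

The cost is one extra request per facet value, which is exactly the overhead the question is trying to avoid.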
dismax handler and WordDelimiterFilterFactory
Hi,

Let's say I have an index with two fields, f1 and f2, and queries to both are analyzed using WhitespaceTokenizerFactory and WordDelimiterFilterFactory. I use the dismax handler for queries and observed the following anomaly.

Suppose I have a document with f1="american" and f2="idol". Then the search "q=american+idol&qt=dismax&qf=f1+f2" matches. However, the search "q=american-idol&qt=dismax&qf=f1+f2" does not, even though the analyzer (WordDelimiterFilterFactory) turns "american-idol" into "american idol".

Upon closer look, the dismax handler is parsing the first query as something like

    +(f1:american f2:american) +(f1:idol f2:idol)

while parsing the second as something like

    f1:"american idol" f2:"american idol"

I feel this is an anomaly because, from the end user's point of view, "american-idol" should be treated the same as "american idol". How do I achieve this?

One possible solution is to index f1 and f2 as one field, but I want to be able to give separate boosts to them, such as "qf=f1^2+f2".

Any ideas? Do people feel this is a bug in the dismax handler?
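To see why the two parsed forms behave differently on this document, the toy model below may help. It is not Solr's or Lucene's matching code, only a simplified illustration with hypothetical names. A document is modelled as a map from field name to its list of tokens, and the two methods mimic the two query shapes shown in the message.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

class DismaxParseSketch {
    /** Per-term form: every term must appear in at least one field. */
    static boolean matchesPerTerm(Map<String, List<String>> doc, List<String> terms) {
        for (String term : terms) {
            boolean found = false;
            for (List<String> tokens : doc.values()) {
                if (tokens.contains(term)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    /** Phrase form: one single field must contain all terms consecutively. */
    static boolean matchesPhrase(Map<String, List<String>> doc, List<String> terms) {
        for (List<String> tokens : doc.values()) {
            if (Collections.indexOfSubList(tokens, terms) >= 0) {
                return true;
            }
        }
        return false;
    }
}
```

For a document with f1 = [american] and f2 = [idol], the per-term check succeeds because each term is found in some field, while the phrase check fails because no single field holds both words side by side. That is the mismatch described above.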