Need "OR" in DisMax Query
Hi There,

Maybe I'm missing something, but I can't seem to get the dismax request handler to perform an OR query. It appears that OR is removed by the stop words. I'd like to do something like "qt=dismax&q=red+OR+green" and get all green and all red results.

Thanks,
David
Re: Need "OR" in DisMax Query
So, I removed the stop word OR from the stopwords file and get the same result.

Using the standard query handler syntax like this "fq=((tags:red)+OR+(tags:green))" I get 421,000 results. Using dismax "q=red+OR+green" I get 29,000 results. The debug output from parsedquery_toString shows this:

+(((tags:red)~0.01 (tags:green)~0.01)~2)

It feels like the dismax handler is not handling the "OR" properly. I also tried "q=red+|+green" and got the same 29,000 results.

Thanks,
David

On Mon, Oct 5, 2009 at 3:02 PM, Christian Zambrano wrote:
> David,
>
> If your schema includes fields with analyzers that use the StopFilterFactory
> and the dismax QueryHandler is set-up to search within those fields, then
> you are correct.
>
>
> On 10/05/2009 01:36 PM, David Giffin wrote:
>>
>> Hi There,
>>
>> Maybe I'm missing something, but I can't seem to get the dismax
>> request handler to perform an OR query. It appears that OR is removed
>> by the stop words. I'd like to do something like
>> "qt=dismax&q=red+OR+green" and get all green and all red results.
>>
>> Thanks,
>> David
>>
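For later readers: dismax does not treat OR (or |) as a boolean operator, so it ends up as just another term. The trailing ~2 in the parsed query is the mm (minimum-should-match) requirement forcing both term clauses to match, which is consistent with the smaller result count. A rough sketch of loosening that requirement instead of using OR (parameter values here are illustrative, not from the original thread):

    qt=dismax&q=red+green&mm=1

With mm=1 the parsed query should end in ~1 rather than ~2, i.e. only one of the two clauses has to match, which gives OR-like behaviour.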
facet.query and fq
Hi There,

Is there a way to get facet.query= to ignore the fq= param? We want to do a query like this:

select?fl=*&start=0&q=cool&fq=in_stock:true&facet=true&facet.query=in_stock:false&qt=dismax

To understand the count of items not in stock, when someone has filtered items that are in stock. Or is there a way to combine two queries into one?

Thanks,
David
Re: facet.query and fq
Thanks, that was just what I was looking for!

On Tue, Oct 27, 2009 at 1:27 PM, Jérôme Etévé wrote:
> Hi,
>
> you need to 'tag' your filter and then exclude it from the faceting.
>
> An example here:
> http://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters
>
> J.
>
> 2009/10/27 David Giffin :
>> Hi There,
>>
>> Is there a way to get facet.query= to ignore the fq= param? We want to
>> do a query like this:
>>
>> select?fl=*&start=0&q=cool&fq=in_stock:true&facet=true&facet.query=in_stock:false&qt=dismax
>>
>> To understand the count of items not in stock, when someone has
>> filtered items that are in stock. Or is there a way to combine two
>> queries into one?
>>
>> Thanks,
>> David
>>
>
> --
> Jerome Eteve.
> http://www.eteve.net
> jer...@eteve.net
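For anyone landing on this thread later, here is a rough sketch of the tag/exclude syntax from that wiki page applied to the query in the original question (line breaks added for readability; the tag name "stock" is arbitrary):

    select?fl=*&start=0&q=cool
      &fq={!tag=stock}in_stock:true
      &facet=true
      &facet.query={!ex=stock}in_stock:false
      &qt=dismax

The fq still filters the result set to in-stock items, but the facet.query ignores that particular filter and can count the out-of-stock items.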
Solr Replication Performance
Hi There,

I have been building a Solr environment that indexes roughly 3 million products. The current index is roughly 9 gig in size. We have bumped into some performance issues with Solr's replication: during snapshot installation on a Solr slave, query times take longer and in some cases time out. Here are some of the details:

Every 3 minutes approximately 2000 updates are committed to the master Solr index and a snapshot is taken. There are 4 Solr slaves (2 way quad cores / 32gig ram / 15k scsi) which poll every minute to look for a new snapshot and install it. During the install of the snapshot on the slaves I'm seeing two things: 1. a disk i/o hit, and 2. a jump in cpu load on the Java/Jetty/Solr process. I know the i/o is related to the transfer of the snapshot to the local box. I believe the cpu load is related to cache warming, which takes roughly 10-30 seconds to complete.

Currently for cache warming I have the following settings: 2 50 200 1024 true false

I have thought about turning off the cache warming completely and looking at the search performance. I would love to hear any ideas or experiences that people have had in tuning Solr replication.

Thanks,
David
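For context, cache warming is configured in the <query> section of solrconfig.xml. A minimal sketch of that section follows, with placeholder values; how the numbers listed above map onto these particular attributes is an assumption, not something confirmed by the original message:

    <!-- illustrative solrconfig.xml snippet; values are placeholders -->
    <query>
      <maxWarmingSearchers>2</maxWarmingSearchers>
      <filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="50"/>
      <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="200"/>
      <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
      <queryResultWindowSize>1024</queryResultWindowSize>
      <useColdSearcher>false</useColdSearcher>
    </query>

Lowering the autowarmCount values (or setting them to 0) is the usual first lever when warming after each snapshot install is eating cpu.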
New Searcher / Commit / Cache Warming Time
Hi All,

I have been trying to reduce the cpu load and the time it takes to put a new snapshot in place on our slave servers. I have tried tweaking many of the system memory, JVM and cache size settings used by Solr.

When running a commit from the command line I'm seeing roughly 16 seconds before the commit completes. This is a ~7gig index with no pending changes, nothing else running, no load:

INFO: {commit=} 0 15771
Jan 15, 2009 11:29:35 PM org.apache.solr.core.SolrCore execute
INFO: [listings] webapp=/solr path=/update params={} status=0 QTime=15771

So I started disabling things in solrconfig.xml. With everything disabled, commit times went down:

INFO: {commit=} 0 103
Jan 15, 2009 11:35:22 PM org.apache.solr.core.SolrCore execute
INFO: [listings] webapp=/solr path=/update params={} status=0 QTime=103

So I started adding things back in, and found that adding the newSearcher listener section was causing the slow down. When I comment that section out, commit times go down and the cpu spikes go away. So I tried putting the newSearcher section back in with no queries to run, same thing... times jump up:

INFO: {commit=} 0 16306
Jan 15, 2009 11:49:32 PM org.apache.solr.core.SolrCore execute
INFO: [listings] webapp=/solr path=/update params={} status=0 QTime=16306

Do you know what would be causing "newSearcher" to create such delays and cpu spikes? Is there any reason not to disable the "newSearcher" section?

Thanks,
David
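For context, the "newSearcher" section being discussed is the event listener in solrconfig.xml that fires warming queries whenever a new searcher is opened (for example after a commit or a snapshot install). A minimal sketch of that section, with a placeholder query rather than anything from the real config:

    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <!-- placeholder warming query; a real config would list common searches/sorts -->
        <lst> <str name="q">solr</str> <str name="start">0</str> <str name="rows">10</str> </lst>
      </arr>
    </listener>

Because the listener runs synchronously as part of opening the new searcher, any work it triggers shows up as commit latency and cpu on the slave.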
Custom Sorting Based on Relevancy
Hi There,

I'm working on a sorting issue. Our site currently sorts by creation date descending, so users list similar products multiple times to show up at the top of the results. When sorting based on score, we want to move items by the same user with the same title down in the search results. It would be best if the first item stayed in place based on score, and each additional item is moved down (rows * repeated user/title). Is custom sorting the best way, or is there something else I'm not thinking about? At the moment I'm looking at doing roughly the opposite of the QueryElevationComponent.

Thanks,
David
Token filter on multivalue field
Hi There,

I'm working on a unique token filter, to eliminate duplicates on a multivalued field. My filter works properly for a single value field. It seems that a new TokenFilter is created for each value in the multivalued field. I need to maintain an array of used tokens across all of the values in the multivalued field. Is there a good way to do this? Here is my current code:

import java.io.IOException;
import java.util.ArrayList;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;

public class UniqueTokenFilter extends TokenFilter {

    // term texts already emitted; only lives as long as this filter instance
    private ArrayList<String> words;

    public UniqueTokenFilter(TokenStream input) {
        super(input);
        this.words = new ArrayList<String>();
    }

    @Override
    public final Token next(Token in) throws IOException {
        // skip any token whose term text has already been seen
        for (Token token = input.next(in); token != null; token = input.next(in)) {
            if (!words.contains(token.term())) {
                words.add(token.term());
                return token;
            }
        }
        return null;
    }
}

Thanks,
David
Re: Token filter on multivalue field
I'm doing a combination of update processor and token filter. The token filter is necessary to reduce the duplicates after stemming has occurred.

David

2009/6/4 Noble Paul നോബിള് नोब्ळ् :
> isn't better to use an UpdateProcessor for this?
>
> On Thu, Jun 4, 2009 at 1:52 AM, Otis Gospodnetic wrote:
>>
>> Hello,
>>
>> It's ugly, but the first thing that came to mind was ThreadLocal.
>>
>> Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>
>>
>> ----- Original Message -----
>>> From: David Giffin
>>> To: solr-user@lucene.apache.org
>>> Sent: Wednesday, June 3, 2009 1:57:42 PM
>>> Subject: Token filter on multivalue field
>>>
>>> Hi There,
>>>
>>> I'm working on a unique token filter, to eliminate duplicates on a
>>> multivalue field. My filter works properly for a single value field.
>>> It seems that a new TokenFilter is created for each value in the
>>> multivalue field. I need to maintain an array of used tokens across
>>> all of the values in the multivalue field. Is there a good way to do
>>> this? Here is my current code:
>>>
>>> public class UniqueTokenFilter extends TokenFilter {
>>>
>>>     private ArrayList words;
>>>
>>>     public UniqueTokenFilter(TokenStream input) {
>>>         super(input);
>>>         this.words = new ArrayList();
>>>     }
>>>
>>>     @Override
>>>     public final Token next(Token in) throws IOException {
>>>         for (Token token=input.next(in); token!=null; token=input.next()) {
>>>             if ( !words.contains(token.term()) ) {
>>>                 words.add(token.term());
>>>                 return token;
>>>             }
>>>         }
>>>         return null;
>>>     }
>>> }
>>>
>>> Thanks,
>>> David
>>
>>
>
> --
> -
> Noble Paul | Principal Engineer| AOL | http://aol.com
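For anyone curious what the update-processor half of that combination might look like, here is a rough sketch that drops exact duplicate values from a multivalued field before analysis; the class and field names are made up for illustration, this is not the actual code from the thread:

    import java.io.IOException;
    import java.util.Collection;
    import java.util.LinkedHashSet;

    import org.apache.solr.common.SolrInputDocument;
    import org.apache.solr.update.AddUpdateCommand;
    import org.apache.solr.update.processor.UpdateRequestProcessor;

    public class DedupeFieldValuesProcessor extends UpdateRequestProcessor {

        private final String fieldName;

        public DedupeFieldValuesProcessor(String fieldName, UpdateRequestProcessor next) {
            super(next);
            this.fieldName = fieldName;
        }

        @Override
        public void processAdd(AddUpdateCommand cmd) throws IOException {
            SolrInputDocument doc = cmd.getSolrInputDocument();
            Collection<Object> values = doc.getFieldValues(fieldName);
            if (values != null) {
                // LinkedHashSet removes exact duplicates while preserving order
                doc.setField(fieldName, new LinkedHashSet<Object>(values));
            }
            super.processAdd(cmd);
        }
    }

In practice the processor would be created by an UpdateRequestProcessorFactory and wired into an update processor chain in solrconfig.xml; the token filter then only has to catch the duplicates that appear after stemming within a single value.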