Re: use a solr-built index with lucene?
This looks like an interesting avenue for a smooth transition from Lucene to Solr. Thanks for any more hints you find around. (E.g., maybe it is not too hard to pre-generate a schema.xml from an actual index for the field types?) paul

On 09-Apr-10, at 02:32, Erik Hatcher wrote: Yes... gotta jive with schema.xml though. Erik

On Apr 8, 2010, at 7:18 PM, Tommy Chheng wrote: If I build an index with Solr, is it possible to use the index folder with Lucene? -- Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng http://tommy.chheng.com
Re: use a solr-built index with lucene?
I was thinking of the reverse case: from Solr to Lucene. Lucene doesn't use a schema.xml. Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng http://tommy.chheng.com

On 4/9/10 12:15 AM, Paul Libbrecht wrote: > This looks like an interesting avenue for a smooth transition from Lucene to Solr. [...]
Replication process on Master/Slave slowing down slave read/search performance
Hi guys, I have noticed that the master/slave replication process slows down slave read/search performance while replication is in progress. Please help. Cheers
Re: Replication process on Master/Slave slowing down slave read/search performance
Hi Marcin, This is because when you do the replication, all the caches are rebuilt since the index has changed, so search performance decreases. You can change your architecture to a multicore one to reduce the impact of the replication: use two cores, one to do the replication and the other to serve searches, and when the replication is done, swap the cores so warm caches are available all the time. Regards Marco Martínez Bautista http://www.paradigmatecnologico.com Avenida de Europa, 26. Ática 5. 3ª Planta 28224 Pozuelo de Alarcón Tel.: 91 352 59 42

2010/4/9 Marcin > Hi guys, > I have noticed that the master/slave replication process slows down slave read/search performance while replication is in progress. > Please help. Cheers
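For reference, a minimal SolrJ sketch of the swap step Marco describes; the core names and URL are illustrative, not from the thread:

    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.request.CoreAdminRequest;
    import org.apache.solr.common.params.CoreAdminParams.CoreAdminAction;

    public class SwapCores {
      public static void main(String[] args) throws Exception {
        // CoreAdmin requests go to the container root, not to a specific core
        CommonsHttpSolrServer admin = new CommonsHttpSolrServer("http://localhost:8983/solr");

        CoreAdminRequest swap = new CoreAdminRequest();
        swap.setAction(CoreAdminAction.SWAP);
        swap.setCoreName("staging");    // the core that just finished replicating
        swap.setOtherCoreName("live");  // the core currently serving searches
        swap.process(admin);            // afterwards, "live" points at the fresh index
      }
    }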
Solr giving 500's
Hi, I was seeing this error from Solr this morning:

SEVERE errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change <abortOnConfigurationError>false</abortOnConfigurationError> in solrconfig.xml.
java.lang.RuntimeException: java.io.FileNotFoundException: no segments* file found in org.apache.lucene.store.FSDirectory@/opt/solr/solr/data/index: files:
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:433)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:216)
    at org.apache.solr.core.SolrCore.getSolrCore(SolrCore.java:177)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
    at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
    ...

at http://127.0.0.1:8983/solr in /var/www/aspire/releases/20100407094800/vendor/plugins/acts_as_solr/lib/acts_as_solr.rb:49

I have been playing around with getting logging configured, but it doesn't seem to output anything to the log file. I used http://wiki.apache.org/solr/LoggingInDefaultJettySetup as a guide.

I noticed there was nothing in the index folder (I am not sure why at this moment), but I had a copy of the index from yesterday afternoon, so I copied that into its place. Now I am seeing the following errors:

RuntimeError (Solr exception 500: the same "no segments* file found in org.apache.lucene.store.FSDirectory@/opt/solr/solr/data/index" error, this time followed by a long listing of the index files that are present: _ratp.frq, _of8k.fdt, _rb4a.fnm, ... (many more), including a segments.gen but no segments_N file) at http://127.0.0.1:8983/solr in /var/www/webapp/releases/20100407094800/vendor/plugins/acts_as_solr/lib/acts_as_solr.rb:49:in `execute' while performing search {:query=>"(visible_to_candidates_b:(true) AND site_id_t:(68)) AND (type_s:Vacancy OR type_s:VacancyLite);opening_at_d desc", :operator=>nil, :rows=>20, :start=>0, :field_list=>["pk_i", "score"]})

As you can see, this uses a standalone Solr installation with the Ruby acts_as_solr plugin to interact with the index. I am not really sure what to do apart from reindexing, which could take a long time. Any suggestions? If anyone has any ideas on the logging, too, that would be great!

Thanks, Will
Re: Faceting on a multi-valued field by index
Though if you added a prefix to all your root ids, say an id of the form "root<id>", then you could use facet.prefix=root. Erik

On Apr 8, 2010, at 10:24 PM, Lance Norskog wrote: Nope! Lucene is committed to maintaining the order of values added to a field, but does not have this feature. On Thu, Apr 8, 2010 at 6:44 PM, Blargy wrote: Is there any way to facet on a multi-valued field at a particular index? For example, I have a field category_ids which is multi-valued, containing category ids. The first value in that field is always the root category, and I would like to be able to facet on just that first value. Is this possible without explicitly creating a separate field? Thanks -- Lance Norskog goks...@gmail.com
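A hedged SolrJ sketch of that approach; the field name category_ids comes from the question, while the URL and the "root" prefix convention are illustrative:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class RootCategoryFacets {
      public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");

        SolrQuery q = new SolrQuery("*:*");
        q.setFacet(true);
        q.addFacetField("category_ids");
        q.set("facet.prefix", "root");  // only count values indexed with the "root" prefix
        q.setFacetMinCount(1);

        QueryResponse rsp = solr.query(q);
        System.out.println(rsp.getFacetField("category_ids").getValues());
      }
    }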
Re: use a solr-built index with lucene?
Oh, sorry, I got the direction backwards in my initial reply.

Yes, of course you can use an index from Solr with Lucene directly. It's just a Lucene index. Just make sure you use the same version of Lucene (pull the JARs from solr.war, I'd say). For example, you can open a "Solr index" with Luke.

If you're using a Lucene app against a live Solr index, be careful with locking (in short, the default lock setting in solrconfig.xml isn't set up for sharing the index between two processes).

Erik

On Apr 9, 2010, at 3:18 AM, Tommy Chheng wrote: > I was thinking of the reverse case: from Solr to Lucene. Lucene doesn't use a schema.xml. [...]
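A minimal sketch of reading a Solr-built index with the Lucene API of that era; the index path is illustrative:

    import java.io.File;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.store.FSDirectory;

    public class ReadSolrIndex {
      public static void main(String[] args) throws Exception {
        // Open the data/index directory that Solr wrote, read-only
        IndexReader reader = IndexReader.open(
            FSDirectory.open(new File("/opt/solr/solr/data/index")), true);
        try {
          System.out.println("docs: " + reader.numDocs());
          System.out.println("fields: " + reader.getFieldNames(IndexReader.FieldOption.ALL));
        } finally {
          reader.close();
        }
      }
    }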
Re: Solr giving 500's
Looks like you're missing one of the index files: the segments_N file. It points to all the other index files.

-Yonik Apache Lucene Eurocon 2010 18-21 May 2010 | Prague

On Fri, Apr 9, 2010 at 6:20 AM, william pink wrote:
> Hi, I was seeing this error from Solr this morning: "no segments* file found in org.apache.lucene.store.FSDirectory@/opt/solr/solr/data/index" [...]
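If you have a backup that does contain a segments_N file, Lucene's CheckIndex tool can tell you whether the rest of the files are consistent. A hedged sketch (the path is illustrative; note that CheckIndex itself needs the segments file to open the index, so it won't help until one is restored):

    import java.io.File;
    import org.apache.lucene.index.CheckIndex;
    import org.apache.lucene.store.FSDirectory;

    public class VerifyIndex {
      public static void main(String[] args) throws Exception {
        CheckIndex checker = new CheckIndex(
            FSDirectory.open(new File("/opt/solr/solr/data/index")));
        CheckIndex.Status status = checker.checkIndex();  // read-only diagnostic pass
        System.out.println(status.clean ? "index is clean" : "index has problems");
      }
    }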
Re: Minimum Should Match the other way round
Hoss, before I run into more misunderstandings, I want to come back to the topic first. I will have a look at some classes later, to find out whether some other ideas which are not directly related to this topic (like the multiword synonyms at query time) will work or not. I'm sorry for being off-topic.

Chris Hostetter-3 wrote: > where the analyzer matters is in creating that numeric field at index time ... hence my suggestion of having an analyzer chain that exactly matches the field you are interested in, but ending with a TokenCountingFilter -- it can take care of creating the "numeric-ish" (padded) field value when the docs are indexed.

Okay, as I have understood it, you mean something like this: this fieldType should "store" (or let's say index) the number of tokens as something like "005" for 5 tokens, right? My problem is that I don't know how to query this field. I know what you mean with appending the query with "Add +titleLen:[* TO MAX_LEN]", but I don't know how to retrieve the MAX_LEN information for a specific query, since in some cases it depends on which analyzer chain is used for the tokenLen field. For example: I think it makes sense to use a WordDelimiterFilter at the end of my TokenFilter chain. If my document is something like "The secrets of the iPhone 3G", then I want to index it as "The secrets of the iPhone 3 G" (3G is going to be indexed as two tokens). This means that the document length is increased by one token. However, maybe I misunderstood your point "- Pick MAX_LEN Based On Number Of Query Clauses From Super", since I thought that the number of query clauses depends on the number of whitespaces in my query. If I am wrong, and it depends on the result of my analyzer chain, there is no problem. But I am not sure if this is the case or not. Thank you for help. - Mitch
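There is no stock TokenCountingFilter in Solr or Lucene; the name in Hoss's suggestion is a filter you would write yourself. A minimal sketch of what it could look like on the Lucene 2.9 TokenStream API, consuming the whole chain and emitting one zero-padded count token:

    import java.io.IOException;
    import org.apache.lucene.analysis.TokenFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.tokenattributes.TermAttribute;

    public final class TokenCountingFilter extends TokenFilter {
      private final TermAttribute termAtt = addAttribute(TermAttribute.class);
      private boolean done = false;

      public TokenCountingFilter(TokenStream input) {
        super(input);
      }

      public boolean incrementToken() throws IOException {
        if (done) return false;
        int count = 0;
        while (input.incrementToken()) {
          count++;                  // count every token the upstream chain produces
        }
        clearAttributes();
        termAtt.setTermBuffer(String.format("%03d", count));  // e.g. "005" for 5 tokens
        done = true;
        return true;                // emit exactly one padded-count token
      }

      public void reset() throws IOException {
        super.reset();
        done = false;
      }
    }

Note that this counts tokens after the whole analysis chain has run, which is exactly why the WordDelimiterFilter example above matters: "3G" split into two tokens would be counted as two.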
refreshing synonyms.txt - or other configs
I am wondering how config files like synonyms.txt or stopwords.txt can be refreshed without restarting Solr, and maybe also how changes in solrconfig.xml or schema.xml can be picked up. I can use a multicore setup; I just tested it with a multicore setup with one core (core0), where I can call /solr/admin/cores?action=RELOAD&core=core0 and changes in synonyms.txt become active. I also understand that this should work in a master/slave setup, where config files under /conf are replicated (at least when doing a commit or optimize on an index). But what about a standard setup? Is there a way to do this? We have not yet decided how we will run our production servers. At the moment we're developing an enterprise search for our intranet... markus
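The RELOAD call is just an HTTP request, so it can be scripted; a minimal Java sketch matching the URL above (host and core name illustrative):

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class ReloadCore {
      public static void main(String[] args) throws Exception {
        // Hits the CoreAdmin handler; core0 must be defined in solr.xml
        URL url = new URL("http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        System.out.println("RELOAD returned HTTP " + conn.getResponseCode());
        InputStream body = conn.getInputStream();
        body.close();
        conn.disconnect();
      }
    }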
RE: index corruption / deployment strategy
Thanks Erik, I forwarded your thoughts to management and put in a good word for Lucid Imagination. Regards, Kallin Nagelberg

-----Original Message----- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: Thursday, April 08, 2010 2:18 PM To: solr-user@lucene.apache.org Subject: Re: index corruption / deployment strategy

Kallin, It's a very rare report, and practically impossible I'm told, to corrupt the index these days thanks to Lucene's improvements over the last several releases (ignoring hardware malfunctions). A single index is the best way to go, in my opinion - though at your scale you're probably looking at sharding it and using distributed search. So you'll have multiple physical indexes, one for each shard, and a single virtual index in the eyes of your searching clients. Backups, of course, are sensible, and Solr's replication capabilities can help here by requesting them periodically. You'll be using replication anyway to scale to your query volume. As for hardware scaling considerations, there are variables to consider like how faceting, sorting, and querying speed behave across a single large index versus shards. I'm guessing you'll be best with at least two shards, though possibly more considering these variables. Erik @ Lucid Imagination

p.s. have your higher-ups give us a call if they'd like to discuss their concerns and consider commercial support for your mission-critical, big-scale use of Solr :)

On Apr 8, 2010, at 1:33 PM, Nagelberg, Kallin wrote: > I've been doing work evaluating Solr for use on a high-traffic website for some time and things are looking positive. I have some concerns from my higher-ups that I need to address. I have suggested that we use a single index in order to keep things simple, but there are suggestions to split our documents amongst different indexes. > > The primary motivation for this split is a worry about potential index corruption. I.e., if we only have one index and it becomes corrupt, what do we do? I never considered this to be an issue since we would have backups etc., but I think they have had issues with other search technology in the past where one big index resulted in frequent and difficult-to-recover-from corruption. Do you think this is a concern with Solr? If so, what would you suggest to mitigate the risk? > > My second question involves general deployment strategy. We will expect about 50 million documents, each on average a few paragraphs, and our website receives maybe 10 million hits a day. Can anyone provide an idea of # of servers, clustering/replication setup etc. that might be appropriate for this scenario? I'm interested to hear what others' experience is with similar situations. > > Thanks, > -Kallin Nagelberg
Re: Replication process on Master/Slave slowing down slave read/search performance
You don't need multi-core. Solr already does this automatically. It creates a new Searcher and auto-warms the cache.

But it will still be slow. If you use auto-warming, it uses most of one CPU, which slows down queries during warming. Also, warming isn't perfect, so queries will be slower after switching to the new Searcher. If you don't use warming, the cold cache will make queries slower.

There is no way to get around this. Solr throws away all the caches after replication, so there is a performance hit. In the system I ran, it took a few minutes to recover, so I staggered the replications 10 minutes apart across the search farm.

wunder

On Apr 9, 2010, at 3:00 AM, Marco Martinez wrote: > Hi Marcin, > This is because when you do the replication, all the caches are rebuilt since the index has changed, so search performance decreases. You can change your architecture to a multicore one to reduce the impact of the replication. [...]
RE: solr.WordDelimiterFilterFactory problem with hyphenated terms?
I've given it a try, and it definitely seems to have improved the situation. However, there is still one weird case that's clearly related to term positions. If I do this search, it fails: title:"love customs in eighteenthcentury spain" ...but if I do this search, it succeeds: title:"love customs in in eighteenthcentury spain" (note the duplicate "in"). - Demian > -Original Message- > From: Erick Erickson [mailto:erickerick...@gmail.com] > Sent: Thursday, April 08, 2010 11:20 AM > To: solr-user@lucene.apache.org > Subject: Re: solr.WordDelimiterFilterFactory problem with hyphenated > terms? > > I'm not all that familiar with the underlying issues, but of the two > I'd > pick moving the WordDelimiterFactory rather than setting increments = > "false". > > But that's at least partly a guess > > Best > Erick > > On Thu, Apr 8, 2010 at 11:00 AM, Demian Katz > wrote: > > > Thanks for looking into this -- I appreciate the help (and feel a > little > > better that there seems to be a bug at work here and not just my > total > > incomprehension). > > > > Sorry for any confusion over the UnicodeNormalizationFactory -- > that's > > actually a plug-in from the SolrMarc project ( > > http://code.google.com/p/solrmarc/) that slipped into my example. > Also, > > as you guessed, my default operator is indeed set to "AND." > > > > It sounds to me that, of your two proposed work-arounds, moving the > > StopFilterFactory after WordDelimiterFactory is the least disruptive. > I'm > > guessing that disabling position increments across the board might > have > > implications for other types of phrase searches, while filtering > stopwords > > later in the chain should be more functionally equivalent, if > slightly less > > efficient (potentially more terms to examine). Would you agree with > this > > assessment? If not, what possible negative side effects am I > forgetting > > about? > > > > thanks, > > Demian > > > > > -Original Message- > > > From: Erick Erickson [mailto:erickerick...@gmail.com] > > > Sent: Wednesday, April 07, 2010 10:04 PM > > > To: solr-user@lucene.apache.org > > > Subject: Re: solr.WordDelimiterFilterFactory problem with > hyphenated > > > terms? > > > > > > Well, for a quick trial using trunk, I had to remove the > > > UnicodeNormalizationFactory, is that yours? > > > > > > But with that removed, I get the results you do, ASSUMING that > you've > > > set > > > your default operator to AND in schema.xml... > > > > > > Believe it or not, it all changes and all your queries return a hit > if > > > you > > > do one of two things (I did this in both index and query when > testing > > > 'cause > > > I'm lazy): > > > 1> move the inclusion of the StopFilterFactory after > > > WordDelimiterFactory > > > or > > > 2> for StopFilterFactory, set enablePositionIncrements="false" > > > > > > I think either of these might work in your situation... > > > > > > On doing some more investigation, it appears that if a hyphenated > word > > > is > > > immediately after a stopword AND the above is true (stop factory > > > included > > > before WordDelimiterFactory and enablePositionIncrements="true"), > then > > > the > > > search fails. I indexed this title: > > > > > > Love-customs in eighteenth-century Spain for nineteenth-century > > > > > > Searching in solr/admin/form.jsp for: > > > title:(nineteenth-century) > > > > > > fails. But if I remove the "for" from the title, the above query > works. > > > Searching for > > > title:(love-customs) > > > always works. 
> > > > > > Finally, (and it's *really* time to go to sleep now), just setting > > > enablePositionIncrements="false" in the "index" portion of the > schema > > > also > > > causes things to work. > > > > > > Developer folks: > > > I didn't see anything in a quick look in SOLR or Lucene JIRAs, > should I > > > refine this a bit (really, sleepy time is near) and add a JIRA? > > > > > > Best > > > Erick > > > > > > On Wed, Apr 7, 2010 at 10:29 AM, Demian Katz > > > wrote: > > > > > > > Hello. It has been a few weeks, and I haven't gotten any > responses. > > > > Perhaps my question is too complicated -- maybe a better > approach is > > > to try > > > > to gain enough knowledge to answer it myself. My gut feeling is > > > still that > > > > it's something to do with the way term positions are getting > handled > > > by the > > > > WordDelimiterFilterFactory, but I don't have a good understanding > of > > > how > > > > term positions are calculated or factored into searching. Can > anyone > > > > recommend some good reading to familiarize myself with these > concepts > > > in > > > > better detail? > > > > > > > > thanks, > > > > Demian > > > > > > > > From: Demian Katz > > > > Sent: Tuesday, March 16, 2010 9:47 AM > > > > To: solr-user@lucene.apache.org > > > > Subject: solr.WordDelimiterFilterFactory problem with hyphenated > > > terms? > > > > > > > > This is my first post on this list -- apologies if this has been > > > dis
Re: solr.WordDelimiterFilterFactory problem with hyphenated terms?
But this behavior is correct, as you have position increments enabled. If you want the second query (which has 2 gaps) to match, you need to either use slop, or disable these increments altogether.

On Fri, Apr 9, 2010 at 11:44 AM, Demian Katz wrote: > I've given it a try, and it definitely seems to have improved the situation. However, there is still one weird case that's clearly related to term positions. [...]
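A hedged illustration of the slop option, assuming a SolrJ client named solr; the slop value of 2 is illustrative and would need tuning to the actual position gaps:

    // Phrase slop (~2) lets the phrase match despite the position gap
    // left behind where the stopword was removed
    SolrQuery q = new SolrQuery("title:\"love customs in eighteenthcentury spain\"~2");
    QueryResponse rsp = solr.query(q);
    System.out.println(rsp.getResults().getNumFound() + " hits");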
Re: "json.nl=arrarr" does not work with "facet.date"
Apologies for the second post. I noticed that json.nl=arrarr does work with facet.field but not with facet.date. Is there a separate parameter required for facet.date to make it display as an array?

Any help is much appreciated, Will

{
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "facet.date.start":"NOW/YEAR-5YEARS",
      "facet":"true",
      "indent":"yes",
      "facet.limit":"5",
      "facet.date":"date",
      "json.nl":"arrarr",
      "wt":"json",
      "rows":"0",
      "q":"*:*",
      "facet.field":"date",
      "facet.date.gap":"+1YEAR",
      "facet.date.end":"NOW"}},
  "response":{"numFound":1265,"start":0,"docs":[]},
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "date":[
        ["2010-01-19T00:00:00Z",63],
        ["2010-01-20T00:00:00Z",61],
        ["2010-01-29T00:00:00Z",60],
        ["2010-01-25T00:00:00Z",56],
        ["2010-01-21T00:00:00Z",55]]},
    "facet_dates":{
      "date":{
        "2005-01-01T00:00:00Z":0,
        "2006-01-01T00:00:00Z":0,
        "2007-01-01T00:00:00Z":0,
        "2008-01-01T00:00:00Z":0,
        "2009-01-01T00:00:00Z":2,
        "2010-01-01T00:00:00Z":1263,
        "gap":"+1YEAR",
        "end":"2011-01-01T00:00:00Z"}}}}
Re: "json.nl=arrarr" does not work with "facet.date"
On Fri, Apr 9, 2010 at 1:04 PM, fabritw wrote: > Apologies for the second post. I noticed that json.nl=arrarr does work with facet.field but not with facet.date.

Hmmm, this is because date faceting uses a SimpleOrderedMap instead of a NamedList (implying that access-like-a-map is more important than the order of the elements). If order is more important here, then it should have been a NamedList.

-Yonik Apache Lucene Eurocon 2010 18-21 May 2010 | Prague

> Is there a separate parameter required for facet.date to make it display as an array? > Any help is much appreciated, Will [quoted JSON output trimmed; see the previous message]
Re: "json.nl=arrarr" does not work with "facet.date"
Yonik Seeley-2-2 wrote: > If order is more important here, then it should have been a NamedList.

Hi Yonik, thanks for your quick reply! Unfortunately I cannot use the NamedList, as I need to use the date-field parameters in my query as well. I am trying to compile a list of facets displaying each year and a corresponding count of matches (i.e. "2010-01-01T00:00:00Z":1263). I need to parse this list with JavaScript, so I would like to set the output to an array if possible. - Will
Re: Minimum Should Match the other way round
I have searched for a tutorial on Lucene, instead of Solr itself, and I've found something on lucenetutorials.com:

String querystr = args.length > 0 ? args[0] : "lucene";
// the "title" arg specifies the default field to use
// when no field is explicitly specified in the query.
Query q = new QueryParser(Version.LUCENE_CURRENT, "title", analyzer).parse(querystr);

If I am right, then I can call getClauses() or clauses() on the BooleanQuery object for my target field, and get the number of clauses from the returned result. Does this number already reflect the number of clauses (or what I really mean: tokens) after the analyzer has worked on them? It would be really nice to be certain of that. Kind regards - Mitch
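A hedged sketch of inspecting the parsed query, with the caveat spelled out in the comments: a BooleanQuery clause is not necessarily one token, because a single whitespace-separated term that analyzes to several tokens (like "3G") is usually folded into one phrase clause:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.util.Version;

    public class CountClauses {
      public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        Query q = new QueryParser(Version.LUCENE_CURRENT, "title", analyzer)
            .parse("secrets of the iPhone 3G");

        if (q instanceof BooleanQuery) {
          BooleanClause[] clauses = ((BooleanQuery) q).getClauses();
          // The clause count reflects the parser's output after analysis ran,
          // but it is NOT a reliable token count: stopwords may be dropped and
          // multi-token terms collapse into a single phrase clause.
          System.out.println(clauses.length + " clauses: " + q);
        } else {
          System.out.println("single clause: " + q);
        }
      }
    }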
Questions about Solr
Hi, I would like to know the answer to the following:

- How am I able to use wildcard searches with Solr? E.g., querying "Ado" with a result that would retrieve something like "Adolescent".
- Phrase searches with stop words completely ruin the query and find no results. How can I query something like "To be or not to be" with stop words enabled?
- I use synonyms for certain keywords. However, when I search for a specific phrase which does contain synonyms, results with the synonyms rank higher than the ones that have the exact term. How can that be fixed?

Thanks, Noel
Re: Questions about Solr
If the user query is not going to have wildcards, then use NGrams. I talk about the black art of ngrams in my book; there are multiple ways of configuring it. If the query will have wildcards, Solr comes with a sample schema with a field type named "text_rev" (I think that's what it's named) which supports wildcard searches such as "ado*". You could add the wildcard if it's not there. I've done this sort of thing with various boosting to get exact matches scored higher. For doing wildcards in a query string against NGram indexes, you'll have to wait till I am granted permission by my employer to open-source this (~2 months). ~ David Smiley Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/

On Apr 9, 2010, at 2:42 PM, wrote: > - How am I able to use wildcard searches with Solr? E.g., querying "Ado" with a result that would retrieve something like "Adolescent".
Re: StreamingUpdateSolrServer hangs
Stephen, were you running stock Solr 1.4, or did you apply any of the SolrJ patches? I'm trying to figure out if anyone still has any problems, or if this was fixed with SOLR-1711: * SOLR-1711: SolrJ - StreamingUpdateSolrServer had a race condition that could halt the streaming of documents. (Attila Babo via yonik) Also note that people may want this patch if dealing with i18n: * SOLR-1595: StreamingUpdateSolrServer used the platform default character set when streaming updates, rather than using UTF-8 as the HTTP headers indicated, leading to an encoding mismatch. (hossman, yonik) -Yonik Apache Lucene Eurocon 2010 18-21 May 2010 | Prague On Fri, Feb 5, 2010 at 3:20 PM, Stephen Meyer wrote: > I am trying to use the StreamingUpdateSolrServer to index a bunch of > bibliographic data and it is hanging up every time I run it. Sometimes it > hangs after about 100k records (after about 2 minutes), sometimes after 4M > records (after about 80 minutes) and all different intervals in between. It > appears to be the same issue described here: > > https://issues.apache.org/jira/browse/SOLR-1543 > > The thread dump (included below) seems to indicate that a lock isn't being > released because somewhere in the thread chain after adding a > SolrInputDocument. > > Is there some kind of Solr equivalent to closing a session like you do in an > ORM like Hibernate? > > Thanks, > -Steve > -- > Stephen Meyer > Library Application Developer > UW-Madison Libraries > 312F Memorial Library > 728 State St. > Madison, WI 53706 > > sme...@library.wisc.edu > 608-265-2844 (ph) > > > "Just don't let the human factor fail to be a factor at all." > - Andrew Bird, "Tables and Chairs" > > Full thread dump Java HotSpot(TM) Client VM (1.5.0_22-147 mixed mode): > > "pool-1-thread-6" prio=5 tid=0x00d26d50 nid=0x1043c00 in Object.wait() > [0xb0e0d000..0xb0e0dd90] > at java.lang.Object.wait(Native Method) > - waiting on <0x0bbe29f8> (a > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool) > at > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:518) > - locked <0x0bbe29f8> (a > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool) > at > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416) > at > org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153) > at > org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) > at > org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) > at > org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner.run(StreamingUpdateSolrServer.java:153) > at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676) > at java.lang.Thread.run(Thread.java:613) > > "pool-1-thread-5" prio=5 tid=0x00d11530 nid=0x1042e00 in Object.wait() > [0xb0d8c000..0xb0d8cd90] > at java.lang.Object.wait(Native Method) > - waiting on <0x0bbe29f8> (a > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool) > at > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:518) > - locked <0x0bbe29f8> (a > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool) > at > 
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416) > at > org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153) > at > org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) > at > org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) > at > org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner.run(StreamingUpdateSolrServer.java:153) > at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676) > at java.lang.Thread.run(Thread.java:613) > > "MultiThreadedHttpConnectionManager cleanup" daemon prio=5 tid=0x00d13630 > nid=0x10fba00 in Object.wait() [0xb0d0b000..0xb0d0bd90] > at java.lang.Object.wait(Native Method) > - waiting on <0x0bbb0270> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:120) > - locked <0x0bbb0270> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:136) > at > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ReferenceQueueThread.run(
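For anyone comparing notes, a minimal sketch of the usage pattern under discussion (URL, queue size, and thread count are illustrative):

    import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class StreamIndex {
      public static void main(String[] args) throws Exception {
        // Queue up to 20 docs and stream them with 4 background threads
        StreamingUpdateSolrServer server =
            new StreamingUpdateSolrServer("http://localhost:8983/solr", 20, 4);

        for (int i = 0; i < 1000; i++) {
          SolrInputDocument doc = new SolrInputDocument();
          doc.addField("id", "rec-" + i);
          doc.addField("title_t", "record " + i);
          server.add(doc);  // returns quickly; docs stream in the background
        }

        server.blockUntilFinished();  // drain the queue before committing
        server.commit();
      }
    }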
Re: use a solr-built index with lucene?
Are the Trie types in Lucene 2.9.2? If not, be sure to use the old int (or sint?) types in your schema.

On Fri, Apr 9, 2010 at 4:12 AM, Erik Hatcher wrote: > Oh, sorry, I got the direction backwards in my initial reply. > Yes, of course you can use an index from Solr with Lucene directly. It's just a Lucene index. Just make sure you use the same version of Lucene (pull the JARs from solr.war, I'd say). [...]

-- Lance Norskog goks...@gmail.com
Re: including external files in config by corename
On 4/8/2010 1:15 PM, Chris Hostetter wrote: > ...i suspect you want something like... where handlers.xml looks like...

The xpointer you mentioned above didn't work. I finally found something that did, though:

<xi:include href="/index/solr/config/requestHandlers.xml#xpointer(/*/node())" />

I wouldn't have found this without your help. A thousand thanks. Shawn
OOM while indexing with Tika
There is a low-level memory "leak" (really an unfortunate retention) in Lucene which can cause OOMs when using the Tika tools on large files like PDF. A patch will be in the trunk sometime soon. http://markmail.org/thread/lhr7wodw4ctsekik https://issues.apache.org/jira/browse/LUCENE-2387 -- Lance Norskog goks...@gmail.com
Solr date "NOW" - format?
I've been trying to work out how Solr thinks about dates internally so I can boost newer documents. My post_date field is stored as seconds since the epoch, so I think the following is probably what I want. I used 3.17 instead of the 3.16 in all the examples because my own math suggests that's a more accurate number:

recip(ms(NOW,product(post_date,1000)),3.17e-11,1,1)

Reading the Solr 1.4 book, I am not very clear on how to configure qf, bf, and pf in the dismax requestHandler, specifically in regard to using the function above in conjunction with the field-based boosts that I want to try. Is there a place I can go to find some better examples, and find out what all the other fields in the example config do, such as mm?

Thanks, Shawn
Re: Solr date "NOW" - format?
The example function seems to round time to years, so you're boosting by year? Your dates are stored as UTC 64-bit longs counting the number of milliseconds since Jan 1, 1970. That's it. They're in milliseconds whether you supplied them that way or not. So I think the example is what you want.

Function queries are notoriously slow. Another way to boost by year is with range queries:

[NOW-6MONTHS TO NOW]^5.0
[NOW-1YEARS TO NOW-6MONTHS]^3.0
[NOW-2YEARS TO NOW-1YEARS]^2.0
[* TO NOW-2YEARS]^1.0

Notice that you get to have a non-linear curve when you select the ranges by hand.

On Fri, Apr 9, 2010 at 4:32 PM, Shawn Heisey wrote: > I've been trying to work out how Solr thinks about dates internally so I can boost newer documents. [...]

-- Lance Norskog goks...@gmail.com
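A hedged sketch of wiring either variant into a dismax request with SolrJ; the field names are illustrative, and this assumes post_date is a Solr date field (for a seconds-since-epoch long field, the product(post_date,1000) form from the question would be kept):

    import org.apache.solr.client.solrj.SolrQuery;

    public class RecencyBoost {
      public static void main(String[] args) {
        SolrQuery q = new SolrQuery("ipod");
        q.set("defType", "dismax");
        q.set("qf", "title^2.0 body");  // fields searched, with per-field boosts
        q.set("pf", "title^4.0");       // extra boost when the terms occur as a phrase
        q.set("mm", "2<-1 5<80%");      // min-should-match: how many clauses must match
        // Function boost that decays with age (3.16e-11 ~ 1 / ms-per-year):
        q.set("bf", "recip(ms(NOW,post_date),3.16e-11,1,1)");
        // Or the hand-tuned range buckets instead:
        // q.set("bq", "post_date:[NOW-6MONTHS TO NOW]^5.0 post_date:[* TO NOW-2YEARS]^1.0");
        System.out.println(q);
      }
    }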
Benchmarking Solr
I am about to deploy Solr into our production environment and I would like to do some benchmarking to determine how many slaves I will need to set up. Currently the only way I know how to benchmark is to use Apache Benchmark (ab), but I would like to be able to send random requests to Solr, not just one request over and over. I have a sample data set of 5000 user-entered queries and I would like to use them all as random queries in the benchmark. Is this possible? FYI, our current index is ~1.5 GB with ~5M documents, and we will be using faceting quite extensively. Our average volume is ~2M requests per day. We will be running RHEL with about 8-12 GB of RAM. Any idea how many slaves might be required to handle our load? Thanks
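In the absence of an ab option for rotating queries, one approach is a small SolrJ driver that replays random lines from the query sample; a minimal sketch, assuming a queries.txt file with one query per line:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    public class RandomQueryBench {
      public static void main(String[] args) throws Exception {
        List<String> queries = new ArrayList<String>();
        BufferedReader in = new BufferedReader(new FileReader("queries.txt"));
        for (String line; (line = in.readLine()) != null; ) {
          queries.add(line.trim());
        }
        in.close();

        CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        Random rnd = new Random();
        int n = 5000;
        long start = System.currentTimeMillis();
        for (int i = 0; i < n; i++) {
          solr.query(new SolrQuery(queries.get(rnd.nextInt(queries.size()))));
        }
        long ms = System.currentTimeMillis() - start;
        System.out.println(n + " queries in " + ms + " ms (" + (1000.0 * n / ms) + " qps)");
      }
    }

A single-threaded loop understates what concurrent load does to the caches, so running several copies in parallel (or wrapping the loop in a thread pool) gets closer to production behavior.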