Re: Collapse with multiple fields
I haven't had time to actually ask this on the list myself, but seeing this, I just had to reply. I was wondering this myself. Thijs On 23-10-2009 5:50, R. Tan wrote: Hi, Is it possible to collapse the results from multiple fields? Rih
Re: multicore query via solrJ
As no answer is given, I assume it's not possible. It will be great to code a method like this query(SolrServer, List) El 20 de octubre de 2009 11:21, Licinio Fernández Maurelo < licinio.fernan...@gmail.com> escribió: > Hi there, > is there any way to perform a multi-core query using solrj? > > P.S.: > > I know about this syntax: > http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q= > but i'm looking for a more fancy way to do this using solrj (something like > shards(query) ) > > thx > > > > -- > Lici > -- Lici
Re: multicore query via solrJ
u guessed it right . Solrj cannot query on multiple cores 2009/10/23 Licinio Fernández Maurelo : > As no answer is given, I assume it's not possible. It will be great to code > a method like this > > query(SolrServer, List) > > > > El 20 de octubre de 2009 11:21, Licinio Fernández Maurelo < > licinio.fernan...@gmail.com> escribió: > >> Hi there, >> is there any way to perform a multi-core query using solrj? >> >> P.S.: >> >> I know about this syntax: >> http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q= >> but i'm looking for a more fancy way to do this using solrj (something like >> shards(query) ) >> >> thx >> >> >> >> -- >> Lici >> > > > > -- > Lici > -- - Noble Paul | Principal Engineer| AOL | http://aol.com
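For what it's worth, the shards syntax quoted in this thread can still be sent through SolrJ as an ordinary request parameter, even though there is no dedicated multi-core method. A minimal sketch, assuming the two cores from the URL above (the class name is only for illustration):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class ShardsQueryExample {
    public static void main(String[] args) throws Exception {
        // The client points at one core; the shards parameter fans the query out.
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8983/solr/core0");

        SolrQuery query = new SolrQuery("*:*");
        // Comma-separated host:port/core entries, exactly as in the URL syntax above.
        query.set("shards", "localhost:8983/solr/core0,localhost:8983/solr/core1");

        QueryResponse response = server.query(query);
        System.out.println("numFound: " + response.getResults().getNumFound());
    }
}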
Classloading issues with solr 1.4 and tomcat
Hi there, I'm having trouble getting the latest solr from svn (I'm using trunk from Oct., 22nd, but it didn't work with an earlier revision either) to run in tomcat. I've checked it out, built and ran the tests - all fine. I run the example conf with jetty using the start.jar - all fine Now I copy the example/solr dir to someplace else, copy the war in dist to some webapp dir, configure a webapp in tomcat accoding to http://wiki.apache.org/solr/SolrTomcat, where I set solr/home via JNDI to the directory just created by copying example/solr. I then check solrconfig.xml and make sure solr.data.dir is pointing to the correct location and that the configs are pointing to valid locations When I then start tomcat solr fails and I get the following error: INFO: Solr home set to '/path/to/my/solr-home/' 23.10.2009 10:17:34 org.apache.solr.core.SolrResourceLoader createClassLoader INFO: Reusing parent classloader 23.10.2009 10:17:34 org.apache.solr.servlet.SolrDispatchFilter init SCHWERWIEGEND: Could not start SOLR. Check solr/home property org.apache.solr.common.SolrException: Error loading class 'solr.FastLRUCache' at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:273) at org.apache.solr.search.CacheConfig.getConfig(CacheConfig.java:90) at org.apache.solr.search.CacheConfig.getConfig(CacheConfig.java:73) at org.apache.solr.core.SolrConfig.(SolrConfig.java:128) at org.apache.solr.core.SolrConfig.(SolrConfig.java:70) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69) at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275) at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397) at org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:108) at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3800) at org.apache.catalina.core.StandardContext.start(StandardContext.java:4450) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791) at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771) at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526) at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:630) at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:556) at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:491) at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1206) at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:314) at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053) at org.apache.catalina.core.StandardHost.start(StandardHost.java:722) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045) at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) at org.apache.catalina.core.StandardService.start(StandardService.java:516) at org.apache.catalina.core.StandardServer.start(StandardServer.java:710) at org.apache.catalina.startup.Catalina.start(Catalina.java:583) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at 
java.lang.reflect.Method.invoke(Method.java:597) at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:288) at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:413) Caused by: java.lang.ClassNotFoundException: solr.FastLRUCache at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1387) at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1233) at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:399) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:257) ... 33 more if I uncomment the FastLRU section in solrconfig.xml solr fails to start as well this time with this error: INFO: Solr home set to '/path/to/my/solr-home/' 23.10.2009 10:23:50 org.apache.solr.core.SolrResourceLoader createClassLoader INFO: Reusing parent classloader 23.10.2009 10:23:50 org.apache.solr.core.SolrConfig INFO: Loaded SolrConfig: solrconfig.xml 23.10.2009 10:23:50 org.apache.solr.core.SolrCore INFO: Opening new SolrCore at /path/to/my/solr-home/, dataDir=/path/to/my/solr-home/data/ 23.10.2009 10:23:50 org.apache.solr.schema.IndexSchema readSchema INFO: Reading Solr Schema 23.10.2009 10:23:50 org.apache.solr.schema.IndexSchema readSchema INFO: Schema name=example 23.10.2009 10:23:50 org.apache.solr.util.plugin.AbstractPluginLoader load INFO: created string: org.apache.solr.schema.StrField 23.
SolrJ and Json
Hi , I have following problem: Using CommonsHttpSolrServer (javabin format) I do a query with wt=json and get following response (by using qresponse = solr.query(params); and then qresponse.toString(); {responseHeader={status=0,QTime=16,params={indent=on,start=0,q=mmm,qt=dismax,wt=[javabin, javabin],hl=on,rows=10,version=[1, 1]}},response={numFound=0,start=0,docs=[]},highlighting={}} Now this does not seems to be JSON format (or is it ) ? Should the equal sign not be a ':' and the values surrounded with double quotes ? The problem is that I want to pass the qresponse to a Javascript variable so the client javascript code can then inspect the JSON response and do whatever is needed. What I did was: var str = "<%=qresponse.toString()%>"; but I can't seem to correctly read the str variable as a JSON object and parse it (on the client side). Any ideas or code snippets to show the correct way ? Regards, St. -- View this message in context: http://www.nabble.com/SolrJ-and-Json-tp26022705p26022705.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Collapse with multiple fields
No, this is actually not supported at the moment. If you really need to collapse on two different fields, you can concatenate the two fields together into another field while indexing and then collapse on that field. Martijn 2009/10/23 Thijs : > I haven't had time to actually ask this on the list my self but seeing this, > I just had to reply. I was wondering this myself. > > Thijs > > On 23-10-2009 5:50, R. Tan wrote: >> >> Hi, >> Is it possible to collapse the results from multiple fields? >> >> Rih >> >
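A minimal sketch of the concatenation approach at index time with SolrJ; the field names (brand, category, brand_category) are invented for illustration:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class CollapseKeyIndexer {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "prod-1");
        doc.addField("brand", "acme");
        doc.addField("category", "widgets");
        // Single combined key; collapse on brand_category at query time.
        doc.addField("brand_category", "acme" + "_" + "widgets");

        server.add(doc);
        server.commit();
    }
}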
Re: Constant Score Queries and Function Queries
On Oct 22, 2009, at 9:44 PM, Chris Hostetter wrote: : > Why wouldn't you just query the function directly and leave out the *:* ? : : *:* was just a quick example, I might have other constant score queries, but I : guess I probably could do a filter query plus the function query, too. I guess i don't udnerstand what your point was ... you mentioned that using a function query with *:* didn't produce scores that equal to the function output, but that's just the nature of how a BooleanQuery works, it aggregates the clauses. if you *want* the scores to be a factor of both clauses, then use a booleanQuery, if you have a clause that you don't want factoring into the query, use "fq" Fair enough, I guess I was just kind of expecting a constant score query + a function query to result in a score of whatever the function query is. This is a common trick to sort by a function, but it's easy enough to just ^0 the non function clause. -Grant
Re: SolrJ and Json
CommonsHttpSolrServer will overwrite the wt param depending on the responseParser set. There are only two response parsers: javabin and xml. The qresponse.toString() is actually a String representation of a NamedList object; it has nothing to do with JSON. On Fri, Oct 23, 2009 at 2:11 PM, SGE0 wrote: > > Hi , > > I have following problem: > Using CommonsHttpSolrServer (javabin format) I do a query with wt=json and > get following response (by using qresponse = solr.query(params); and then > qresponse.toString(); > > {responseHeader={status=0,QTime=16,params={indent=on,start=0,q=mmm,qt=dismax,wt=[javabin, > javabin],hl=on,rows=10,version=[1, > 1]}},response={numFound=0,start=0,docs=[]},highlighting={}} > > Now this does not seems to be JSON format (or is it ) ? > > Should the equal sign not be a ':' and the values surrounded with double > quotes ? > > The problem is that I want to pass the qresponse to a Javascript variable so > the client javascript code can then inspect the JSON response and do > whatever is needed. > > What I did was: > > var str = "<%=qresponse.toString()%>"; > > but I can't seem to correctly read the str variable as a JSON object and > parse it (on the client side). > > Any ideas or code snippets to show the correct way ? > > Regards, > > St. > > > > > > -- > View this message in context: > http://www.nabble.com/SolrJ-and-Json-tp26022705p26022705.html > Sent from the Solr - User mailing list archive at Nabble.com. > > -- - Noble Paul | Principal Engineer| AOL | http://aol.com
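For reference, the parser in use can be switched on the client; a minimal sketch, with the server URL as a placeholder:

import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.impl.XMLResponseParser;

public class XmlParserExample {
    public static void main(String[] args) throws Exception {
        // Switch from the default javabin parser to the XML parser.
        // SolrJ then sends wt=xml itself, but the result is still parsed
        // into a NamedList/QueryResponse rather than returned as raw XML text.
        CommonsHttpSolrServer solr =
            new CommonsHttpSolrServer("http://127.0.0.1:8080/solr");
        solr.setParser(new XMLResponseParser());
    }
}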
Re: SolrJ and Json
Hi, thx for the fast response. So, is there a way to convert the response (javabin) to JSON ? Regards, S. Noble Paul നോബിള് नोब्ळ्-2 wrote: > > CommonsHttpSolrServer will overwrite the wt param depending on the > responseParser set.There are only two response parsers. javabin and > xml. > > The qresponse.toString() actually is a String reperesentation of a > namedList object . it has nothing to do with JSON > > On Fri, Oct 23, 2009 at 2:11 PM, SGE0 wrote: >> >> Hi , >> >> I have following problem: >> Using CommonsHttpSolrServer (javabin format) I do a query with wt=json >> and >> get following response (by using qresponse = solr.query(params); and >> then >> qresponse.toString(); >> >> {responseHeader={status=0,QTime=16,params={indent=on,start=0,q=mmm,qt=dismax,wt=[javabin, >> javabin],hl=on,rows=10,version=[1, >> 1]}},response={numFound=0,start=0,docs=[]},highlighting={}} >> >> Now this does not seems to be JSON format (or is it ) ? >> >> Should the equal sign not be a ':' and the values surrounded with double >> quotes ? >> >> The problem is that I want to pass the qresponse to a Javascript variable >> so >> the client javascript code can then inspect the JSON response and do >> whatever is needed. >> >> What I did was: >> >> var str = "<%=qresponse.toString()%>"; >> >> but I can't seem to correctly read the str variable as a JSON object and >> parse it (on the client side). >> >> Any ideas or code snippets to show the correct way ? >> >> Regards, >> >> St. >> >> >> >> >> >> -- >> View this message in context: >> http://www.nabble.com/SolrJ-and-Json-tp26022705p26022705.html >> Sent from the Solr - User mailing list archive at Nabble.com. >> >> > > > > -- > - > Noble Paul | Principal Engineer| AOL | http://aol.com > > -- View this message in context: http://www.nabble.com/SolrJ-and-Json-tp26022705p26025339.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SolrJ and Json
Why don't you directly hit Solr with wt=json? That will give you the output as JSON On Fri, Oct 23, 2009 at 5:53 PM, SGE0 wrote: > > Hi, > > thx for the fast response. > > So, is there a way to convert the response (javabin) to JSON ? > > Regards, > > S. > > > > > > > Noble Paul നോബിള് नोब्ळ्-2 wrote: >> >> CommonsHttpSolrServer will overwrite the wt param depending on the >> responseParser set.There are only two response parsers. javabin and >> xml. >> >> The qresponse.toString() actually is a String reperesentation of a >> namedList object . it has nothing to do with JSON >> >> On Fri, Oct 23, 2009 at 2:11 PM, SGE0 wrote: >>> >>> Hi , >>> >>> I have following problem: >>> Using CommonsHttpSolrServer (javabin format) I do a query with wt=json >>> and >>> get following response (by using qresponse = solr.query(params); and >>> then >>> qresponse.toString(); >>> >>> {responseHeader={status=0,QTime=16,params={indent=on,start=0,q=mmm,qt=dismax,wt=[javabin, >>> javabin],hl=on,rows=10,version=[1, >>> 1]}},response={numFound=0,start=0,docs=[]},highlighting={}} >>> >>> Now this does not seems to be JSON format (or is it ) ? >>> >>> Should the equal sign not be a ':' and the values surrounded with double >>> quotes ? >>> >>> The problem is that I want to pass the qresponse to a Javascript variable >>> so >>> the client javascript code can then inspect the JSON response and do >>> whatever is needed. >>> >>> What I did was: >>> >>> var str = "<%=qresponse.toString()%>"; >>> >>> but I can't seem to correctly read the str variable as a JSON object and >>> parse it (on the client side). >>> >>> Any ideas or code snippets to show the correct way ? >>> >>> Regards, >>> >>> St. >>> >>> >>> >>> >>> >>> -- >>> View this message in context: >>> http://www.nabble.com/SolrJ-and-Json-tp26022705p26022705.html >>> Sent from the Solr - User mailing list archive at Nabble.com. >>> >>> >> >> >> >> -- >> - >> Noble Paul | Principal Engineer| AOL | http://aol.com >> >> > > -- > View this message in context: > http://www.nabble.com/SolrJ-and-Json-tp26022705p26025339.html > Sent from the Solr - User mailing list archive at Nabble.com. > > -- - Noble Paul | Principal Engineer| AOL | http://aol.com
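Since SolrJ only parses javabin and xml, getting JSON for the browser usually means bypassing SolrJ and requesting wt=json over plain HTTP, as suggested above. A rough sketch; the host, port and query parameters are taken from the earlier messages and are assumptions:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLEncoder;

public class RawJsonQuery {
    public static void main(String[] args) throws Exception {
        String q = URLEncoder.encode("mmm", "UTF-8");
        URL url = new URL("http://127.0.0.1:8080/solr/select?q=" + q
                + "&qt=dismax&hl=on&rows=10&wt=json");

        // Read the response body; with wt=json this is real JSON text
        // that can be handed to client-side JavaScript as-is.
        BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), "UTF-8"));
        StringBuilder json = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            json.append(line);
        }
        in.close();

        System.out.println(json);
    }
}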
Re:Classloading issues with solr 1.4 and tomcat
Forget my mail about that; there was an old 1.3 webapp interfering with the new webapp and I didn't immediately realise that. Sorry for the noise. Jörg
CAS client configuration with MT4-PHP.
Hi, We have CAS server of spring integrated and it is running in apache. We have application in MovableType4 - PHP. Is it possible to configure the MT4 authentication module to redirect to external CAS server when the application recieves login request? It would be helpful if there is any document available for this. Thanks in advance.
QTime always a multiple of 50ms ?
Hi all, I'm using Solr trunk from 2009-10-12 and I noticed that the QTime result is always a multiple of roughly 50ms, regardless of the used handler. For instance, for the update handler, I get : INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=0 INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=104 INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=52 ... Is this a known issue ? Cheers! J. -- Jerome Eteve. http://www.eteve.net jer...@eteve.net
help with how to search using spaces in the query for string fields...
I'm having a problem with figuring out how to search for things that have spaces (just a single space character) in them. For example, I have a field named "FileName" and it is of type string. I've indexed a couple of documents that have field FileName equal to "File 10 10AM" and another that has FileName "File 11 11AM". In my search query, I'm trying "Filename:(File 1*)" and I'd like it to return both documents. It returns none. If I search for "Filename:(File*)" I get both of them and everything else. I've tried lots of different ways to form the query, but the only thing that returns any documents is the "FileName:(File*)" form. Anything else with an actual space in it fails. This has got to be another simple thing that I'm missing, but I haven't figured it out yet nor stumbled upon the correct search query. Help please! The above scenario is an example, but I am using the string field type. -Dan -- Dan A. Dickey | Senior Software Engineer Savvis 10900 Hampshire Ave. S., Bloomington, MN 55438 Office: 952.852.4803 | Fax: 952.852.4951 E-mail: dan.dic...@savvis.net
number of Solr indexes per Tomcat instance
Hi, Currently we're running 10 Solr indexes inside a single Tomcat6 instance. In the near future we would like to add another 30-40 indexes to every Tomcat instance we host. What are the factors we have to take into account when planning for such deployments? Obviously we do know the sizes of the indexes but for example how much memory does Solr need to be allocated given that each index is treated as a webapp in Tomcat. Also, do you know if Tomcat has got a limit in number of apps that can be deployed (maybe I should ask this questions in a Tomcat forum). Thanks E -- View this message in context: http://www.nabble.com/number-of-Solr-indexes-per-Tomcat-instance-tp26027238p26027238.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: number of Solr indexes per Tomcat instance
Are you using one single solr instance with multicore or multiple solr instances with one index each? Erik_l wrote: > > Hi, > > Currently we're running 10 Solr indexes inside a single Tomcat6 instance. > In the near future we would like to add another 30-40 indexes to every > Tomcat instance we host. What are the factors we have to take into account > when planning for such deployments? Obviously we do know the sizes of the > indexes but for example how much memory does Solr need to be allocated > given that each index is treated as a webapp in Tomcat. Also, do you know > if Tomcat has got a limit in number of apps that can be deployed (maybe I > should ask this questions in a Tomcat forum). > > Thanks > E > -- View this message in context: http://www.nabble.com/number-of-Solr-indexes-per-Tomcat-instance-tp26027238p26027304.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: CAS client configuration with MT4-PHP.
Is it a query related to Solr ? On Fri, Oct 23, 2009 at 6:46 PM, Radha C. wrote: > Hi, > > We have CAS server of spring integrated and it is running in apache. We have > application in MovableType4 - PHP. > Is it possible to configure the MT4 authentication module to redirect to > external CAS server when the application recieves login request? > It would be helpful if there is any document available for this. > > Thanks in advance. > -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: help with how to search using spaces in the query for string fields...
On Friday 23 October 2009 09:36:02 am AHMET ARSLAN wrote: > > --- On Fri, 10/23/09, Dan A. Dickey wrote: > > > From: Dan A. Dickey > > Subject: help with how to search using spaces in the query for string > > fields... > > To: solr-user@lucene.apache.org > > Date: Friday, October 23, 2009, 5:12 PM > > I'm having a problem with figuring > > out how to search for things > > that have spaces (just a single space character) in them. > > For example, I have a field named "FileName" and it is of > > type string. > > I've indexed a couple of documents, that have field > > FileName > > equal to "File 10 10AM" and another that has FileName "File > > 11 11AM". > > In my search query, I'm trying "Filename:(File 1*)" and I'd > > like to it > > return both documents. It return none. > > If I search for "Filename:(File*)" I get both of them and > > everything else. > > I've tried lots of different ways to form the query, but > > the only thing > > that returns any documents is the "FileName:(File*)" > > form. Anything > > else with an actual space in it fails. > > This has got to be another simple thing that I'm missing, > > but haven't > > figured it out yet nor stumbled upon the correct search > > query. > > Help please! The above scenario is an example, but I > > am using > > the string field type. > > -Dan > > You need to escape space. Try this one : Filename:(File\ 1*) Ahmet, When I first saw your suggestion - I thought - I tried that. It didn't work. I went and did it again - it works beautifully! Evidently, I *did not* try it. You're a genius! Thank you again! -Dan -- Dan A. Dickey | Senior Software Engineer Savvis 10900 Hampshire Ave. S., Bloomington, MN 55438 Office: 952.852.4803 | Fax: 952.852.4951 E-mail: dan.dic...@savvis.net
NGram query failing
I have a requirement to be able to find hits within words in a free-form id field. The field can have any type of alphanumeric data - it's as likely it will be something like "123456" as it is to be "SUN-123-ABC". I thought of using NGrams to accomplish the task, but I'm having a problem. I set up a field like this After indexing a field like this, the analysis page indicates my queries should work. If I give it a sample field value of "ABC-123456-SUN" and a query value of "45" it shows hits in several places, which is what I expected. However, when I actually query the field with something like "45" I get no hits back. Looking at the debugQuery output, it looks like it's taking my analyzed query text and putting it into a phrase query. So, for a query of "45" it turns into a phrase query of :"4 5 45" which then doesn't hit on anything in my index. What am I missing to make this work? - Charlie
Re: help with how to search using spaces in the query for string fields...
--- On Fri, 10/23/09, Dan A. Dickey wrote: > From: Dan A. Dickey > Subject: help with how to search using spaces in the query for string > fields... > To: solr-user@lucene.apache.org > Date: Friday, October 23, 2009, 5:12 PM > I'm having a problem with figuring > out how to search for things > that have spaces (just a single space character) in them. > For example, I have a field named "FileName" and it is of > type string. > I've indexed a couple of documents, that have field > FileName > equal to "File 10 10AM" and another that has FileName "File > 11 11AM". > In my search query, I'm trying "Filename:(File 1*)" and I'd > like to it > return both documents. It return none. > If I search for "Filename:(File*)" I get both of them and > everything else. > I've tried lots of different ways to form the query, but > the only thing > that returns any documents is the "FileName:(File*)" > form. Anything > else with an actual space in it fails. > This has got to be another simple thing that I'm missing, > but haven't > figured it out yet nor stumbled upon the correct search > query. > Help please! The above scenario is an example, but I > am using > the string field type. > -Dan You need to escape space. Try this one : Filename:(File\ 1*)
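The same escaping applies when the query is built in code; a small sketch with SolrJ, using the field and values from this thread:

import org.apache.solr.client.solrj.SolrQuery;

public class EscapedSpaceQuery {
    public static void main(String[] args) {
        // The backslash keeps the space inside the prefix term instead of
        // letting the query parser split "File 1*" into two clauses.
        SolrQuery query = new SolrQuery("FileName:(File\\ 1*)");
        System.out.println(query.getQuery()); // FileName:(File\ 1*)
    }
}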
Re: number of Solr indexes per Tomcat instance
We're not using multicore. Today, one Tomcat instance host a number of indexes in form of 10 Solr indexes (10 individual war files). Marc Sturlese wrote: > > Are you using one single solr instance with multicore or multiple solr > instances with one index each? > > Erik_l wrote: >> >> Hi, >> >> Currently we're running 10 Solr indexes inside a single Tomcat6 instance. >> In the near future we would like to add another 30-40 indexes to every >> Tomcat instance we host. What are the factors we have to take into >> account when planning for such deployments? Obviously we do know the >> sizes of the indexes but for example how much memory does Solr need to be >> allocated given that each index is treated as a webapp in Tomcat. Also, >> do you know if Tomcat has got a limit in number of apps that can be >> deployed (maybe I should ask this questions in a Tomcat forum). >> >> Thanks >> E >> > > -- View this message in context: http://www.nabble.com/number-of-Solr-indexes-per-Tomcat-instance-tp26027238p26028083.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: number of Solr indexes per Tomcat instance
Probably multicore would give you better performance... I think the most important factors to take into account are the size of the index and the traffic you have to handle. With enough RAM you can hold 40 cores in a single Solr instance (or even more), but depending on the traffic you have to handle you will suffer from slow response times. Erik_l wrote: > > We're not using multicore. Today, one Tomcat instance host a number of > indexes in form of 10 Solr indexes (10 individual war files). > > > Marc Sturlese wrote: >> >> Are you using one single solr instance with multicore or multiple solr >> instances with one index each? >> >> Erik_l wrote: >>> >>> Hi, >>> >>> Currently we're running 10 Solr indexes inside a single Tomcat6 >>> instance. In the near future we would like to add another 30-40 indexes >>> to every Tomcat instance we host. What are the factors we have to take >>> into account when planning for such deployments? Obviously we do know >>> the sizes of the indexes but for example how much memory does Solr need >>> to be allocated given that each index is treated as a webapp in Tomcat. >>> Also, do you know if Tomcat has got a limit in number of apps that can >>> be deployed (maybe I should ask this questions in a Tomcat forum). >>> >>> Thanks >>> E >>> >> >> > > -- View this message in context: http://www.nabble.com/number-of-Solr-indexes-per-Tomcat-instance-tp26027238p26028437.html Sent from the Solr - User mailing list archive at Nabble.com.
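If the ten war files were consolidated into one multicore instance, a Solr 1.3/1.4-style solr.xml would look roughly like the sketch below; the core names and instanceDir values are placeholders:

<!-- solr.xml in the shared solr home; one <core> entry per index instead of one webapp per index -->
<solr persistent="true" sharedLib="lib">
  <cores adminPath="/admin/cores">
    <core name="index01" instanceDir="index01" />
    <core name="index02" instanceDir="index02" />
    <!-- ... more cores ... -->
  </cores>
</solr>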
Re: QTime always a multiple of 50ms ?
Jérôme Etévé wrote: Hi all, I'm using Solr trunk from 2009-10-12 and I noticed that the QTime result is always a multiple of roughly 50ms, regardless of the used handler. For instance, for the update handler, I get : INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=0 INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=104 INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=52 ... Is this a known issue ? It may be an issue with System.currentTimeMillis() resolution on some platforms (e.g. Windows)? -- Best regards, Andrzej Bialecki <>< ___. ___ ___ ___ _ _ __ [__ || __|__/|__||\/| Information Retrieval, Semantic Web ___|||__|| \| || | Embedded Unix, System Integration http://www.sigram.com Contact: info at sigram dot com
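A quick way to check that hypothesis on a given JVM and OS is to measure how often System.currentTimeMillis() actually ticks; a rough, self-contained sketch:

public class ClockGranularity {
    public static void main(String[] args) {
        // Count how many distinct values currentTimeMillis() takes in one second.
        long start = System.currentTimeMillis();
        long last = start;
        int ticks = 0;
        while (System.currentTimeMillis() - start < 1000) {
            long now = System.currentTimeMillis();
            if (now != last) {
                ticks++;
                last = now;
            }
        }
        System.out.println("~" + (1000.0 / Math.max(ticks, 1)) + " ms per tick");
    }
}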
Result missing from query, but match shows in Field Analysis tool
Hi, I have a field in my index called related_ids, indexed and stored, with the following field type: Several records in my index contain the token 1cuk in the related_ids field, but only *some* of them are returned when I query on this. e.g. if I send a query like this: http://localhost:8080/solr/select/?q=id:2.40.50+AND+related_ids:1cuk&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids I get a single hit for the record with id:2.40.50 . But if I try this, on a different record with id:2.40 : http://localhost:8080/solr/select/?q=id:2.40+AND+related_ids:1cuk&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids I get no hits. However, if I just query for id:2.40 ... http://localhost:8080/solr/select/?q=id:2.40&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids I can clearly see the token "1cuk" in the related_ids field. Not only that, but if I copy and paste record 2.40's related_ids field into the Field Analysis tool in the admin interface, and search on "1cuk", the term 1cuk is visible in the index analyzer's term list, and highlighted! So Field Analysis thinks that I *should* be getting a hit for this term. Can anyone suggest how I'd go about diagnosing this? I'm kind of hitting a brick wall here. If it makes any difference, related_ids for the culprit record 2.40 is large-ish but not enormous (31000 terms). Also I've tried stopping and restarting Solr in case it was some weird caching thing. Thanks in advance, Andrew. -- View this message in context: http://www.nabble.com/Result-missing-from-query%2C-but-match-shows-in-Field-Analysis-tool-tp26029040p26029040.html Sent from the Solr - User mailing list archive at Nabble.com.
Solrj client API and response in XML format (Solr 1.4)
Hi All, After a day of searching I'm quite confused. I use the solrj client as follows: CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://127.0.0.1:8080/apache-solr-1.4-dev/test"); solr.setRequestWriter(new BinaryRequestWriter()); ModifiableSolrParams params = new ModifiableSolrParams(); params.set("qt", "dismax"); params.set("indent", "on"); params.set("version", "2.2"); params.set("q", "test"); params.set("start", "0"); params.set("rows", "10"); params.set("wt", "xml"); params.set("hl", "on"); QueryResponse response = solr.query(params); How can I get the query result (response) in XML format out of it? I know it sounds stupid but I can't seem to manage that. What do I need to do with the response object to get the response in XML format? I already understood I can't get the result in JSON so my idea was to go from XML to JSON. Thx for your answer already! S. System.out.println("response = " + response); SolrDocumentList sdl = response.getResults(); -- View this message in context: http://www.nabble.com/Solrj-client-API-and-response-in-XML-format-%28Solr-1.4%29-tp26029197p26029197.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: number of Solr indexes per Tomcat instance
I ran into trouble running several cores (either as Solr multi-core or as separate web apps) in a single JVM because the Java garbage collector would freeze all cores during a collection. This may not be an issue if you're not dealing with large amounts of memory. My solution is to run each web app in its own JVM and Tomcat instance. -- View this message in context: http://www.nabble.com/number-of-Solr-indexes-per-Tomcat-instance-tp26027238p26029243.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Result missing from query, but match shows in Field Analysis tool
I'm really reaching here, but lucene only indexes the first 10,000 terms by default (you can up the limit). Is there a chancethat you're hitting that limit? That 1cuk is past the 10,000th term in record 2.40? For this to be possible, I have to assume that the FieldAnalysis tool ignores this limit FWIW Erick On Fri, Oct 23, 2009 at 12:01 PM, Andrew Clegg wrote: > > Hi, > > I have a field in my index called related_ids, indexed and stored, with the > following field type: > > > positionIncrementGap="100"> > > pattern="\W*\s+\W*" /> > > > > > Several records in my index contain the token 1cuk in the related_ids > field, > but only *some* of them are returned when I query on this. e.g. if I send a > query like this: > > > http://localhost:8080/solr/select/?q=id:2.40.50+AND+related_ids:1cuk&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids > > I get a single hit for the record with id:2.40.50 . But if I try this, on a > different record with id:2.40 : > > > http://localhost:8080/solr/select/?q=id:2.40+AND+related_ids:1cuk&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids > > I get no hits. However, if I just query for id:2.40 ... > > > http://localhost:8080/solr/select/?q=id:2.40&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids > > I can clearly see the token "1cuk" in the related_ids field. > > Not only that, but if I copy and paste record 2.40's related_ids field into > the Field Analysis tool in the admin interface, and search on "1cuk", the > term 1cuk is visible in the index analyzer's term list, and highlighted! So > Field Analysis thinks that I *should* be getting a hit for this term. > > Can anyone suggest how I'd go about diagnosing this? I'm kind of hitting a > brick wall here. > > If it makes any difference, related_ids for the culprit record 2.40 is > large-ish but not enormous (31000 terms). Also I've tried stopping and > restarting Solr in case it was some weird caching thing. > > Thanks in advance, > > Andrew. > > -- > View this message in context: > http://www.nabble.com/Result-missing-from-query%2C-but-match-shows-in-Field-Analysis-tool-tp26029040p26029040.html > Sent from the Solr - User mailing list archive at Nabble.com. > >
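The limit mentioned above is maxFieldLength in solrconfig.xml; raising it looks roughly like this (the value 100000 is just an example):

<!-- solrconfig.xml: raise the per-field term limit from the 10000 default.
     The same setting also appears under <indexDefaults>. -->
<mainIndex>
  <maxFieldLength>100000</maxFieldLength>
</mainIndex>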
Re: Result missing from query, but match shows in Field Analysis tool
That's probably it! It is quite near the end of the field. I'll try upping it and re-indexing. Thanks :-) Erick Erickson wrote: > > I'm really reaching here, but lucene only indexes the first 10,000 terms > by > default (you can up the limit). Is there a chancethat you're hitting that > limit? That 1cuk is past the 10,000th term > in record 2.40? > > For this to be possible, I have to assume that the FieldAnalysis > tool ignores this limit > > FWIW > Erick > > On Fri, Oct 23, 2009 at 12:01 PM, Andrew Clegg > wrote: > >> >> Hi, >> >> I have a field in my index called related_ids, indexed and stored, with >> the >> following field type: >> >> >>> positionIncrementGap="100"> >> >>> pattern="\W*\s+\W*" /> >> >> >> >> >> Several records in my index contain the token 1cuk in the related_ids >> field, >> but only *some* of them are returned when I query on this. e.g. if I send >> a >> query like this: >> >> >> http://localhost:8080/solr/select/?q=id:2.40.50+AND+related_ids:1cuk&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids >> >> I get a single hit for the record with id:2.40.50 . But if I try this, on >> a >> different record with id:2.40 : >> >> >> http://localhost:8080/solr/select/?q=id:2.40+AND+related_ids:1cuk&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids >> >> I get no hits. However, if I just query for id:2.40 ... >> >> >> http://localhost:8080/solr/select/?q=id:2.40&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids >> >> I can clearly see the token "1cuk" in the related_ids field. >> >> Not only that, but if I copy and paste record 2.40's related_ids field >> into >> the Field Analysis tool in the admin interface, and search on "1cuk", the >> term 1cuk is visible in the index analyzer's term list, and highlighted! >> So >> Field Analysis thinks that I *should* be getting a hit for this term. >> >> Can anyone suggest how I'd go about diagnosing this? I'm kind of hitting >> a >> brick wall here. >> >> If it makes any difference, related_ids for the culprit record 2.40 is >> large-ish but not enormous (31000 terms). Also I've tried stopping and >> restarting Solr in case it was some weird caching thing. >> >> Thanks in advance, >> >> Andrew. >> >> -- >> View this message in context: >> http://www.nabble.com/Result-missing-from-query%2C-but-match-shows-in-Field-Analysis-tool-tp26029040p26029040.html >> Sent from the Solr - User mailing list archive at Nabble.com. >> >> > > -- View this message in context: http://www.nabble.com/Result-missing-from-query%2C-but-match-shows-in-Field-Analysis-tool-tp26029040p26029417.html Sent from the Solr - User mailing list archive at Nabble.com.
Issues adding document to EmbeddedSolrServer
Hi everybody, I just started playing with Solr and think of it as a quite useful tool! I'm using Solrj (Solr 1.3) in combination with an EmbeddedSolrServer. I managed to get the server running and implemented a method (following the Solrj Wiki) to create a document and add it to the server's index. The method looks like the following: public void fillIndex() { SolrInputDocument doc1 = new SolrInputDocument(); doc1.addField("id", "id1", 1.0f); doc1.addField("name", "doc1", 1.0f); doc1.addField("price", 10); try { server.add(doc1); } catch (Exception e) { e.printStackTrace(); } try { server.commit(true, true); } catch (Exception e) { e.printStackTrace(); } } My problem now is, that the method server.add() never finishes which leads the the whole fillIndex() method to crash. It's like it throws an exception, which is not catched and the server.commit() is never executed. I already used the maxTime configuration in the solrconfig.xml to commit new documents automatically. This looks like the following: 1 1000 This works. But I want the explicit commit to work, as this looks like the way it should be done! In addition, this would give me better control over adding new stuff. I assume this problem won't be the big challenge for an expert. :) Any hints are appreciated!! Thanks in advance. Egon -- Jetzt kostenlos herunterladen: Internet Explorer 8 und Mozilla Firefox 3.5 - sicherer, schneller und einfacher! http://portal.gmx.net/de/go/atbrowser
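For reference, the usual Solr 1.3 wiring of an EmbeddedSolrServer from the SolrJ wiki is sketched below; the solr home path is a placeholder. Comparing it against the initialization actually in use can help rule out setup differences:

import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.CoreContainer;

public class EmbeddedSetup {
    public static void main(String[] args) throws Exception {
        // Solr home must contain conf/solrconfig.xml and conf/schema.xml.
        System.setProperty("solr.solr.home", "/path/to/solr/home");
        CoreContainer.Initializer initializer = new CoreContainer.Initializer();
        CoreContainer container = initializer.initialize();

        // "" selects the default core in a single-core setup.
        EmbeddedSolrServer server = new EmbeddedSolrServer(container, "");

        // server.add(...) and server.commit() would go here, as in fillIndex() above.

        container.shutdown();
    }
}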
Re: number of Solr indexes per Tomcat instance
That's a really good point. I didn't think about the GCs. Obviously we don't want to have all the indexes hanging if full GC occur. Wee running a +8GB heap so GCs are very important to us. Thanks Erik wojtekpia wrote: > > I ran into trouble running several cores (either as Solr multi-core or as > separate web apps) in a single JVM because the Java garbage collector > would freeze all cores during a collection. This may not be an issue if > you're not dealing with large amounts of memory. My solution is to run > each web app in its own JVM and Tomcat instance. > > -- View this message in context: http://www.nabble.com/number-of-Solr-indexes-per-Tomcat-instance-tp26027238p26029654.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Collapse with multiple fields
Clever, I think that would work in some cases. On Fri, Oct 23, 2009 at 5:22 PM, Martijn v Groningen < martijn.is.h...@gmail.com> wrote: > No this actually not supported at the moment. If you really need to > collapse on two different field you can concatenate the two fields > together in another field while indexing and then collapse on that > field. > > Martijn > > 2009/10/23 Thijs : > > I haven't had time to actually ask this on the list my self but seeing > this, > > I just had to reply. I was wondering this myself. > > > > Thijs > > > > On 23-10-2009 5:50, R. Tan wrote: > >> > >> Hi, > >> Is it possible to collapse the results from multiple fields? > >> > >> Rih > >> > > >
keep index in production and snapshots in separate phisical disks
Is there any way to make snapinstaller install the index in spanpshot20091023124543 (for example) from another disk? I am asking this because I would like not to optimize the index in the master (if I do that it takes a long time to send it via rsync if it is so big). This way I would just have to send the new segments. In the slave I would have 2 phisical disks. Snappuller would send the snapshot to a disk (here the index would not be optimized). Snapinstaller would install the snapshot in the other disk, optimize it and open the newIndexReader. The optimization should be done in the disk wich contains the "not in production index" to not affect the search request speed. Any idea what should I hack to reach this goal in case it is possible? -- View this message in context: http://www.nabble.com/keep-index-in-production-and-snapshots-in-separate-phisical-disks-tp26029666p26029666.html Sent from the Solr - User mailing list archive at Nabble.com.
Is optimized?
Folks: If I issue two requests with no intervening changes to the index, will the second optimize request be smart enough to not do anything? Thanks, Bill
Too many open files
Hi, I am getting too many open files error. Usually I test on a server that has 4GB RAM and assigned 1GB for tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this server and has following setting for SolrConfig.xml true 1024 100 2147483647 1 In my case 200,000 documents is of 1024MB size and in this testing, I am indexing total of million documents. We have high setting because we are expected to index about 10+ million records in production. It works fine in this server. When I deploy same solr configuration on a server with 32GB RAM, I get "too many open files" error. The ulimit -n is 1024 for this server. Any idea? Is this because 2nd server has 32GB RAM? Is 1024 open files limit too low? Also I don't find any documentation for . I checked Solr 'Solr 1.4 Enterprise Search Server' book, wiki, etc. I am using Solr 1.3. Is it good idea to use ramBufferSizeMB? Vs maxBufferedDocs? What does ramBufferSizeMB mean? My understanding is that when documents added to index which are initially stored in memory reaches size 1024MB(ramBufferSizeMB), it flushes data to disk. Or is it when total memory used(by tomcat, etc) reaches 1024, it flushed data to disk? Thanks, Sharmila
Re: QTime always a multiple of 50ms ?
2009/10/23 Andrzej Bialecki : > Jérôme Etévé wrote: >> >> Hi all, >> >> I'm using Solr trunk from 2009-10-12 and I noticed that the QTime >> result is always a multiple of roughly 50ms, regardless of the used >> handler. >> >> For instance, for the update handler, I get : >> >> INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=0 >> INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=104 >> INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=52 >> ... >> >> Is this a known issue ? > > It may be an issue with System.currentTimeMillis() resolution on some > platforms (e.g. Windows)? I don't know, I'm using linux 2.6.22 and a jvm 1.6.0 -- Jerome Eteve. http://www.eteve.net jer...@eteve.net
field collapsing bug (java.lang.ArrayIndexOutOfBoundsException)
seems to happen when sort on anything besides strictly score, even score desc, num desc triggers it, using latest nightly and 10/14 patch Problem accessing /solr/core1/select. Reason: 4731592 java.lang.ArrayIndexOutOfBoundsException: 4731592 at org.apache.lucene.search.FieldComparator$StringOrdValComparator.copy(FieldComparator.java:660) at org.apache.solr.search.NonAdjacentDocumentCollapser$DocumentComparator.compare(NonAdjacentDocumentCollapser.java:235) at org.apache.solr.search.NonAdjacentDocumentCollapser$DocumentPriorityQueue.lessThan(NonAdjacentDocumentCollapser.java:173) at org.apache.lucene.util.PriorityQueue.insertWithOverflow(PriorityQueue.java:158) at org.apache.solr.search.NonAdjacentDocumentCollapser.doCollapsing(NonAdjacentDocumentCollapser.java:95) at org.apache.solr.search.AbstractDocumentCollapser.collapse(AbstractDocumentCollapser.java:208) at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:98) at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:66) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1148) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:387) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:539) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:520)
Where the new replication pulls the files?
Hi all, I'm wondering where a slave pulls the files from the master on replication. Is it directly to the index/ directory or is it somewhere else before it's completed and gets copied to index? Cheers! Jerome. -- Jerome Eteve. http://www.eteve.net jer...@eteve.net
RE: Too many open files
Make it 10: 10 -Fuad > -Original Message- > From: Ranganathan, Sharmila [mailto:sranganat...@library.rochester.edu] > Sent: October-23-09 1:08 PM > To: solr-user@lucene.apache.org > Subject: Too many open files > > Hi, > > I am getting too many open files error. > > Usually I test on a server that has 4GB RAM and assigned 1GB for > tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this > server and has following setting for SolrConfig.xml > > > > true > > 1024 > > 100 > > 2147483647 > > 1 > > > > In my case 200,000 documents is of 1024MB size and in this testing, I am > indexing total of million documents. We have high setting because we are > expected to index about 10+ million records in production. It works fine > in this server. > > > > When I deploy same solr configuration on a server with 32GB RAM, I get > "too many open files" error. The ulimit -n is 1024 for this server. Any > idea? Is this because 2nd server has 32GB RAM? Is 1024 open files limit > too low? Also I don't find any documentation for . > I checked Solr 'Solr 1.4 Enterprise Search Server' book, wiki, etc. I am > using Solr 1.3. > > > > Is it good idea to use ramBufferSizeMB? Vs maxBufferedDocs? What does > ramBufferSizeMB mean? My understanding is that when documents added to > index which are initially stored in memory reaches size > 1024MB(ramBufferSizeMB), it flushes data to disk. Or is it when total > memory used(by tomcat, etc) reaches 1024, it flushed data to disk? > > > > Thanks, > > Sharmila > > > > > > > > > > > > > >
RE: Too many open files
> 1024 Ok, it will lower frequency of Buffer flush to disk (buffer flush happens when it reaches capacity, due commit, etc.); it will improve performance. It is internal buffer used by Lucene. It is not total memory of Tomcat... > 100 It will deal with 100 Segments, and each segment will consist on number of files (equal to number of fields) - you may have 20 fields, 2000 files... For many such applications, set ulimit to 65536. You never know how many files you will need (including log files of Tomcat, class files, config files, image/css/html files, etc...) Even with 10 Lucene segments (mergeFactor), 10 files each, (100 files) Lucene may need much more during commit/optimize... -Fuad > -Original Message- > From: Ranganathan, Sharmila [mailto:sranganat...@library.rochester.edu] > Sent: October-23-09 1:08 PM > To: solr-user@lucene.apache.org > Subject: Too many open files > > Hi, > > I am getting too many open files error. > > Usually I test on a server that has 4GB RAM and assigned 1GB for > tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this > server and has following setting for SolrConfig.xml > > > > true > > 1024 > > 100 > > 2147483647 > > 1 > > > > In my case 200,000 documents is of 1024MB size and in this testing, I am > indexing total of million documents. We have high setting because we are > expected to index about 10+ million records in production. It works fine > in this server. > > > > When I deploy same solr configuration on a server with 32GB RAM, I get > "too many open files" error. The ulimit -n is 1024 for this server. Any > idea? Is this because 2nd server has 32GB RAM? Is 1024 open files limit > too low? Also I don't find any documentation for . > I checked Solr 'Solr 1.4 Enterprise Search Server' book, wiki, etc. I am > using Solr 1.3. > > > > Is it good idea to use ramBufferSizeMB? Vs maxBufferedDocs? What does > ramBufferSizeMB mean? My understanding is that when documents added to > index which are initially stored in memory reaches size > 1024MB(ramBufferSizeMB), it flushes data to disk. Or is it when total > memory used(by tomcat, etc) reaches 1024, it flushed data to disk? > > > > Thanks, > > Sharmila > > > > > > > > > > > > > >
RE: Too many open files
I was partially wrong; this is what Mike McCandless (Lucene-in-Action, 2nd edition) explained at Manning forum: mergeFactor of 1000 means you will have up to 1000 segments at each level. A level 0 segment means it was flushed directly by IndexWriter. After you have 1000 such segments, they are merged into a single level 1 segment. Once you have 1000 level 1 segments, they are merged into a single level 2 segment, etc. So, depending on how many docs you add to your index, you'll could have 1000s of segments w/ mergeFactor=1000. http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0 So, in case of mergeFactor=100 you may have (theoretically) 1000 segments, 10-20 files each (depending on schema)... mergeFactor=10 is default setting... ramBufferSizeMB=1024 means that you need at least double Java heap, but you have -Xmx1024m... -Fuad > > I am getting too many open files error. > > Usually I test on a server that has 4GB RAM and assigned 1GB for > tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this > server and has following setting for SolrConfig.xml > > > > true > > 1024 > > 100 > > 2147483647 > > 1 >
Re: Too many open files
I wouldn't use a RAM buffer of a gig - 32-100 is generally a good number. Fuad Efendi wrote: > I was partially wrong; this is what Mike McCandless (Lucene-in-Action, 2nd > edition) explained at Manning forum: > > mergeFactor of 1000 means you will have up to 1000 segments at each level. > A level 0 segment means it was flushed directly by IndexWriter. > After you have 1000 such segments, they are merged into a single level 1 > segment. > Once you have 1000 level 1 segments, they are merged into a single level 2 > segment, etc. > So, depending on how many docs you add to your index, you'll could have > 1000s of segments w/ mergeFactor=1000. > > http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0 > > > So, in case of mergeFactor=100 you may have (theoretically) 1000 segments, > 10-20 files each (depending on schema)... > > > mergeFactor=10 is default setting... ramBufferSizeMB=1024 means that you > need at least double Java heap, but you have -Xmx1024m... > > > -Fuad > > > >> I am getting too many open files error. >> >> Usually I test on a server that has 4GB RAM and assigned 1GB for >> tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this >> server and has following setting for SolrConfig.xml >> >> >> >> true >> >> 1024 >> >> 100 >> >> 2147483647 >> >> 1 >> >> > > > -- - Mark http://www.lucidimagination.com
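Pulling the suggestions in this thread together, a hedged solrconfig.xml sketch might look like the following; the values are examples, not recommendations for any particular index:

<!-- solrconfig.xml indexDefaults: example values along the lines discussed above -->
<indexDefaults>
  <useCompoundFile>true</useCompoundFile> <!-- fewer files per segment, fewer open file handles -->
  <ramBufferSizeMB>64</ramBufferSizeMB>   <!-- 32-100 is the range suggested above -->
  <mergeFactor>10</mergeFactor>           <!-- the default; far fewer segments than 100 -->
</indexDefaults>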
Re: Environment Timezone being considered when using SolrJ
Hi Hoss, Thanks for the clarification. I've a wrote a Unit Test in order to simulate the date processing. A high level detail of this problem is that it occurs only when used the JavaBin custom format (&wt=javabin), in this case the dates get back set with environment UTC offset coordinates. On Thu, Oct 22, 2009 at 11:41 PM, Chris Hostetter wrote: > > : When using SolrJ I've realized document dates are being modified > according > : to the environment UTC timezone. The timezone is being set in the inner > : class ISO8601CanonicalDateFormat of DateField class. > > The dates aren't "modified" based on UTC, they are formated in UTC before > being written to the Lucene index so that no matter what the current > locale is the index format is consistent. > yes, dates are consistent at index. > > : I've read some posts where people say Solr should be most locale and > culture > : agnostic. So, what's the purpose for that timezone processing before > > The use of UTC is specificly to be agnostic of where the server is > running. Any client, any where in the world, using any TimeZone can query > any solr server, running in any JVM, and know that the dates it gets back > are formated in UTC. > > : Code to simulate issue: > > I don't actaully see any "issue" being simulated in this code, can you > elaborate on how exactly it's behaving in a way that's inconsistent with > your expectaitons? (making it a JUNit TestCase that uses assserts to fail > where you are getting data you don't expect is pretty must the universal > way to describe a bug) > import static org.junit.Assert.assertEquals; import java.text.ParseException; import java.text.SimpleDateFormat; import java.util.Date; import org.apache.lucene.document.Field; import org.apache.lucene.document.Field.Index; import org.apache.lucene.document.Field.Store; import org.apache.solr.schema.DateField; import org.junit.Test; public class DateFieldTest { @Test public void shouldReturnSameDateValueWhenDateFieldIsUsedToParseDates() throws ParseException { //Given String originalDateString = "2010-10-10T10:10:10Z"; //When Field field = new Field("field",originalDateString,Store.NO,Index.ANALYZED); DateField dateField = new DateField(); SimpleDateFormat dateFormat = new SimpleDateFormat("-MM-dd'T'HH:mm:ss'Z'"); Date originalDateObject = dateFormat.parse(originalDateString); Date parsedDate = dateField.toObject(field); //Then assertEquals(originalDateObject, parsedDate); /* TO MAKE TEST PASS * Solr 1.3 * * Comment line 271 at org.apache.solr.schema.DateField * this.setTimeZone(CANONICAL_TZ); */ } } > > My guess would be that you are getting confused by the fact that > Date.toString() uses the default locale of your JVM to generate a string, > which is why the data getting printed out doesn't match the hardcoded > value in your code... > > : System.out.println(dateField.toObject(field)); > > but if you take any Date object you want, print it's toString(), index it, > and then take that indexed string representation convert it back into a > Date (using dateField.toOBject()) you should originalDate.equals(newDate). > I was expecting this behaviour and I get it when performnig an HTTP query and the XMLResponseWriter is used. But the same does not occur when used the BinaryResponseWriter. > > > > -Hoss > > Thanks! Michel
RE: NGram query failing
Well, I fixed my own problem in the end. For the record, this is the schema I ended up going with: I could have left it a trigram but went with a bigram because with this setup, I can get queries to properly hit as long as the min/max gram size is met. In other words, for any queries two or more characters long, this works for me. Less than two characters and it fails. I don't know exactly why that is, but I'll take it anyway! - Charlie -Original Message- From: Charlie Jackson [mailto:charlie.jack...@cision.com] Sent: Friday, October 23, 2009 10:00 AM To: solr-user@lucene.apache.org Subject: NGram query failing I have a requirement to be able to find hits within words in a free-form id field. The field can have any type of alphanumeric data - it's as likely it will be something like "123456" as it is to be "SUN-123-ABC". I thought of using NGrams to accomplish the task, but I'm having a problem. I set up a field like this After indexing a field like this, the analysis page indicates my queries should work. If I give it a sample field value of "ABC-123456-SUN" and a query value of "45" it shows hits in several places, which is what I expected. However, when I actually query the field with something like "45" I get no hits back. Looking at the debugQuery output, it looks like it's taking my analyzed query text and putting it into a phrase query. So, for a query of "45" it turns into a phrase query of :"4 5 45" which then doesn't hit on anything in my index. What am I missing to make this work? - Charlie
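The schema referred to above did not survive the list archive. For illustration only (not the poster's actual configuration), a field type along the described lines, with n-grams at index time and plain keyword analysis at query time so a query like "45" stays a single term, might look like:

<!-- Illustrative only; field type name and gram sizes are assumptions -->
<fieldType name="ngram_id" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Index every 2- to 15-character substring of the id -->
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <!-- No n-gramming at query time, so "45" is searched as one token
         and no phrase query is built -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>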
New Technical White Papers on Apache Lucene 2.9 and Solr 1.4 from Lucid Imagination
Hi - FYI, Lucid's just put out a two white papers, one on Apache Lucene 2.9 and one on Apache Solr 1.4: - "What's New in Lucene 2.9" covers range of performance improvements and new features (per segment indexing, trierange numeric analysis, and more), along with recommendations for upgrading your Lucene application to the 2.9 release. Download (reg required) at http://www.lucidimagination.com/whitepaper/whats-new-in-lucene-2-9?sc=AP. - “What’s New in Solr 1.4.” also covers its performance and feature improvements (such as improved Data Import Handler, java-based replication, rich document acquisition and more) . Download (reg required) at http://www.lucidimagination.com/whitepaper/whats-new-in-solr-1-4?sc=AP Tom www.lucidimagination.com
Re: multicore query via solrJ
Hi Lici, You may want to try the following snippet
---
SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("wt", "json");           // Can be json, standard, ...
params.set("rows", RowsToFetch);    // Total # of rows to fetch
params.set("start", StartingRow);   // Starting record
params.set("shards", "localhost:8983/solr,localhost:8984/solr,localhost:8985/solr"); // Shard URLs
. . .
params.set("q", queryStr.toString()); // User query
QueryResponse response = solr.query(params);
SolrDocumentList docs = response.getResults();
---
Thanks, sS --- On Fri, 10/23/09, Licinio Fernández Maurelo wrote: > From: Licinio Fernández Maurelo > Subject: Re: multicore query via solrJ > To: solr-user@lucene.apache.org > Date: Friday, October 23, 2009, 7:30 AM > As no answer is given, I assume it's > not possible. It will be great to code > a method like this > > query(SolrServer, List) > > > > On 20 October 2009 at 11:21, Licinio Fernández Maurelo > < > licinio.fernan...@gmail.com> > wrote: > > > Hi there, > > is there any way to perform a multi-core query using > solrj? > > > > P.S.: > > > > I know about this syntax: > > http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q= > > but i'm looking for a more fancy way to do this using > solrj (something like > > shards(query) ) > > > > thx > > > > > > > > -- > > Lici > > > > > > -- > Lici >
RE: Too many open files
The reason for having a big RAM buffer is to lower the frequency of IndexWriter flushes and, subsequently, the frequency of index merge events; merging a few larger files also takes less time... especially if the RAM buffer is smart enough (and big enough) to deal with 100 concurrent updates of an existing document without flushing 100 document versions to disk 100 times. I posted a related thread here; I had a 1:5 Update:Merge timing (5 minutes merging, 1 minute updating) with the default SOLR settings (32Mb buffer). I increased the buffer to 8Gb on the Master, and it triggered a significant indexing performance boost... -Fuad http://www.linkedin.com/in/liferay > -Original Message- > From: Mark Miller [mailto:markrmil...@gmail.com] > Sent: October-23-09 3:03 PM > To: solr-user@lucene.apache.org > Subject: Re: Too many open files > > I wouldn't use a RAM buffer of a gig - 32-100 is generally a good number. > > Fuad Efendi wrote: > > I was partially wrong; this is what Mike McCandless (Lucene-in-Action, 2nd > > edition) explained at Manning forum: > > > > mergeFactor of 1000 means you will have up to 1000 segments at each level. > > A level 0 segment means it was flushed directly by IndexWriter. > > After you have 1000 such segments, they are merged into a single level 1 > > segment. > > Once you have 1000 level 1 segments, they are merged into a single level 2 > > segment, etc. > > So, depending on how many docs you add to your index, you could have > > 1000s of segments w/ mergeFactor=1000. > > > > http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0 > > > > > > So, in case of mergeFactor=100 you may have (theoretically) 1000 segments, > > 10-20 files each (depending on schema)... > > > > > > mergeFactor=10 is the default setting... ramBufferSizeMB=1024 means that you > > need at least double the Java heap, but you have -Xmx1024m... > > > > > > -Fuad > > > > > > > >> I am getting too many open files error. > >> > >> Usually I test on a server that has 4GB RAM and assigned 1GB for > >> tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this > >> server and has following setting for SolrConfig.xml > >> > >> > >> > >> true > >> > >> 1024 > >> > >> 100 > >> > >> 2147483647 > >> > >> 1 > >> > >> > > > > > > > > > -- > - Mark > > http://www.lucidimagination.com > >
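For readers wondering where these knobs live: in Solr 1.4 they sit in the <indexDefaults> (and <mainIndex>) section of solrconfig.xml. A sketch with illustrative values only (roughly the shipped defaults plus a buffer in the range recommended in this thread, not the configuration from the message above):

<indexDefaults>
  <useCompoundFile>false</useCompoundFile>
  <!-- flush the in-memory buffer to a new segment once it reaches this size -->
  <ramBufferSizeMB>64</ramBufferSizeMB>
  <!-- number of segments per level before a merge is triggered -->
  <mergeFactor>10</mergeFactor>
  <maxFieldLength>10000</maxFieldLength>
</indexDefaults>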
Re: Too many open files
8 GB is much larger than is well supported. Its diminishing returns over 40-100 and mostly a waste of RAM. Too high and things can break. It should be well below 2 GB at most, but I'd still recommend 40-100. Fuad Efendi wrote: > Reason of having big RAM buffer is lowering frequency of IndexWriter flushes > and (subsequently) lowering frequency of index merge events, and > (subsequently) merging of a few larger files takes less time... especially > if RAM Buffer is intelligent enough (and big enough) to deal with 100 > concurrent updates of existing document without 100-times flushing to disk > of 100 document versions. > > I posted here thread related; I had 1:5 timing for Update:Merge (5 minutes > merge, and 1 minute update) with default SOLR settings (32Mb buffer). I > increased buffer to 8Gb on Master, and it triggered significant indexing > performance boost... > > -Fuad > http://www.linkedin.com/in/liferay > > > >> -Original Message- >> From: Mark Miller [mailto:markrmil...@gmail.com] >> Sent: October-23-09 3:03 PM >> To: solr-user@lucene.apache.org >> Subject: Re: Too many open files >> >> I wouldn't use a RAM buffer of a gig - 32-100 is generally a good number. >> >> Fuad Efendi wrote: >> >>> I was partially wrong; this is what Mike McCandless (Lucene-in-Action, >>> > 2nd > >>> edition) explained at Manning forum: >>> >>> mergeFactor of 1000 means you will have up to 1000 segments at each >>> > level. > >>> A level 0 segment means it was flushed directly by IndexWriter. >>> After you have 1000 such segments, they are merged into a single level 1 >>> segment. >>> Once you have 1000 level 1 segments, they are merged into a single level >>> > 2 > >>> segment, etc. >>> So, depending on how many docs you add to your index, you'll could have >>> 1000s of segments w/ mergeFactor=1000. >>> >>> http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0 >>> >>> >>> So, in case of mergeFactor=100 you may have (theoretically) 1000 >>> > segments, > >>> 10-20 files each (depending on schema)... >>> >>> >>> mergeFactor=10 is default setting... ramBufferSizeMB=1024 means that you >>> need at least double Java heap, but you have -Xmx1024m... >>> >>> >>> -Fuad >>> >>> >>> >>> I am getting too many open files error. Usually I test on a server that has 4GB RAM and assigned 1GB for tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this server and has following setting for SolrConfig.xml true 1024 100 2147483647 1 >>> >>> >> -- >> - Mark >> >> http://www.lucidimagination.com >> >> >> > > > > -- - Mark http://www.lucidimagination.com
Re: Too many open files
Here is an example using the Lucene benchmark package. Indexing 64,000 wikipedia docs (sorry for the formatting): [java] > Report sum by Prefix (MAddDocs) and Round (4 about 32 out of 256058) [java] Operation round mrg flush runCnt recsPerRunrec/s elapsedSecavgUsedMemavgTotalMem [java] MAddDocs_8000 0 10 32.00MB8 800037.401,711.22 124,612,472182,689,792 [java] MAddDocs_8000 - 1 10 80.00MB - - 8 - - - 8000 - - 39.91 - 1,603.76 - 266,716,128 - 469,925,888 [java] MAddDocs_8000 2 10 120.00MB8 800040.741,571.02 348,059,488548,233,216 [java] MAddDocs_8000 - 3 10 512.00MB - - 8 - - - 8000 - - 38.25 - 1,673.05 - 746,087,808 - 926,089,216 After about 32-40, you don't gain much, and it starts decreasing once you start getting to high. 8GB is a terrible recommendation. Also, from the javadoc in IndexWriter: * NOTE: because IndexWriter uses * ints when managing its internal storage, * the absolute maximum value for this setting is somewhat * less than 2048 MB. The precise limit depends on * various factors, such as how large your documents are, * how many fields have norms, etc., so it's best to set * this value comfortably under 2048. Mark Miller wrote: > 8 GB is much larger than is well supported. Its diminishing returns over > 40-100 and mostly a waste of RAM. Too high and things can break. It > should be well below 2 GB at most, but I'd still recommend 40-100. > > Fuad Efendi wrote: > >> Reason of having big RAM buffer is lowering frequency of IndexWriter flushes >> and (subsequently) lowering frequency of index merge events, and >> (subsequently) merging of a few larger files takes less time... especially >> if RAM Buffer is intelligent enough (and big enough) to deal with 100 >> concurrent updates of existing document without 100-times flushing to disk >> of 100 document versions. >> >> I posted here thread related; I had 1:5 timing for Update:Merge (5 minutes >> merge, and 1 minute update) with default SOLR settings (32Mb buffer). I >> increased buffer to 8Gb on Master, and it triggered significant indexing >> performance boost... >> >> -Fuad >> http://www.linkedin.com/in/liferay >> >> >> >> >>> -Original Message- >>> From: Mark Miller [mailto:markrmil...@gmail.com] >>> Sent: October-23-09 3:03 PM >>> To: solr-user@lucene.apache.org >>> Subject: Re: Too many open files >>> >>> I wouldn't use a RAM buffer of a gig - 32-100 is generally a good number. >>> >>> Fuad Efendi wrote: >>> >>> I was partially wrong; this is what Mike McCandless (Lucene-in-Action, >> 2nd >> >> edition) explained at Manning forum: mergeFactor of 1000 means you will have up to 1000 segments at each >> level. >> >> A level 0 segment means it was flushed directly by IndexWriter. After you have 1000 such segments, they are merged into a single level 1 segment. Once you have 1000 level 1 segments, they are merged into a single level >> 2 >> >> segment, etc. So, depending on how many docs you add to your index, you'll could have 1000s of segments w/ mergeFactor=1000. http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0 So, in case of mergeFactor=100 you may have (theoretically) 1000 >> segments, >> >> 10-20 files each (depending on schema)... mergeFactor=10 is default setting... ramBufferSizeMB=1024 means that you need at least double Java heap, but you have -Xmx1024m... -Fuad > I am getting too many open files error. 
> > Usually I test on a server that has 4GB RAM and assigned 1GB for > tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this > server and has following setting for SolrConfig.xml > > > > true > > 1024 > > 100 > > 2147483647 > > 1 > > > > >>> -- >>> - Mark >>> >>> http://www.lucidimagination.com >>> >>> >>> >>> >> >> >> > > > -- - Mark http://www.lucidimagination.com
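To make the setting concrete at the Lucene level: Solr's ramBufferSizeMB is handed down to the underlying IndexWriter. A minimal sketch against the Lucene 2.9 API, with the index path and buffer value purely illustrative:

import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class RamBufferDemo {
  public static void main(String[] args) throws Exception {
    IndexWriter writer = new IndexWriter(
        FSDirectory.open(new File("/tmp/demo-index")),   // illustrative path
        new StandardAnalyzer(Version.LUCENE_29),
        IndexWriter.MaxFieldLength.UNLIMITED);

    // Per the javadoc quoted above, keep this comfortably under 2048 MB;
    // 32-100 MB is the range recommended earlier in this thread.
    writer.setRAMBufferSizeMB(64);

    Document doc = new Document();
    doc.add(new Field("id", "1", Field.Store.YES, Field.Index.NOT_ANALYZED));
    writer.addDocument(doc);
    writer.close();
  }
}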
Re: Too many open files
Hmm - came out worse than it looked. Here is a better attempt: MergeFactor: 10 BUF DOCS/S 32 37.40 80 39.91 120 40.74 512 38.25 Mark Miller wrote: > Here is an example using the Lucene benchmark package. Indexing 64,000 > wikipedia docs (sorry for the formatting): > > [java] > Report sum by Prefix (MAddDocs) and Round (4 > about 32 out of 256058) > [java] Operation round mrg flush runCnt > recsPerRunrec/s elapsedSecavgUsedMemavgTotalMem > [java] MAddDocs_8000 0 10 32.00MB8 > 800037.401,711.22 124,612,472182,689,792 > [java] MAddDocs_8000 - 1 10 80.00MB - - 8 - - - 8000 - > - 39.91 - 1,603.76 - 266,716,128 - 469,925,888 > [java] MAddDocs_8000 2 10 120.00MB8 > 800040.741,571.02 348,059,488548,233,216 > [java] MAddDocs_8000 - 3 10 512.00MB - - 8 - - - 8000 - > - 38.25 - 1,673.05 - 746,087,808 - 926,089,216 > > After about 32-40, you don't gain much, and it starts decreasing once > you start getting to high. 8GB is a terrible recommendation. > > Also, from the javadoc in IndexWriter: > >* NOTE: because IndexWriter uses >* ints when managing its internal storage, >* the absolute maximum value for this setting is somewhat >* less than 2048 MB. The precise limit depends on >* various factors, such as how large your documents are, >* how many fields have norms, etc., so it's best to set >* this value comfortably under 2048. > > Mark Miller wrote: > >> 8 GB is much larger than is well supported. Its diminishing returns over >> 40-100 and mostly a waste of RAM. Too high and things can break. It >> should be well below 2 GB at most, but I'd still recommend 40-100. >> >> Fuad Efendi wrote: >> >> >>> Reason of having big RAM buffer is lowering frequency of IndexWriter flushes >>> and (subsequently) lowering frequency of index merge events, and >>> (subsequently) merging of a few larger files takes less time... especially >>> if RAM Buffer is intelligent enough (and big enough) to deal with 100 >>> concurrent updates of existing document without 100-times flushing to disk >>> of 100 document versions. >>> >>> I posted here thread related; I had 1:5 timing for Update:Merge (5 minutes >>> merge, and 1 minute update) with default SOLR settings (32Mb buffer). I >>> increased buffer to 8Gb on Master, and it triggered significant indexing >>> performance boost... >>> >>> -Fuad >>> http://www.linkedin.com/in/liferay >>> >>> >>> >>> >>> -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: October-23-09 3:03 PM To: solr-user@lucene.apache.org Subject: Re: Too many open files I wouldn't use a RAM buffer of a gig - 32-100 is generally a good number. Fuad Efendi wrote: > I was partially wrong; this is what Mike McCandless (Lucene-in-Action, > > > >>> 2nd >>> >>> >>> > edition) explained at Manning forum: > > mergeFactor of 1000 means you will have up to 1000 segments at each > > > >>> level. >>> >>> >>> > A level 0 segment means it was flushed directly by IndexWriter. > After you have 1000 such segments, they are merged into a single level 1 > segment. > Once you have 1000 level 1 segments, they are merged into a single level > > > >>> 2 >>> >>> >>> > segment, etc. > So, depending on how many docs you add to your index, you'll could have > 1000s of segments w/ mergeFactor=1000. > > http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0 > > > So, in case of mergeFactor=100 you may have (theoretically) 1000 > > > >>> segments, >>> >>> >>> > 10-20 files each (depending on schema)... > > > mergeFactor=10 is default setting... 
ramBufferSizeMB=1024 means that you > need at least double Java heap, but you have -Xmx1024m... > > > -Fuad > > > > > > >> I am getting too many open files error. >> >> Usually I test on a server that has 4GB RAM and assigned 1GB for >> tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this >> server and has following setting for SolrConfig.xml >> >> >> >> true >> >> 1024 >> >> 100 >> >> 2147483647 >> >> 1 >> >> >> >> >> > > > -- - Mark http://www.lucidimagination.com >>> >>> >>> >> >> > > > -- -
"exceeded limit of maxWarmingSearchers=2" when posting data
I'm trying to stress-test Solr (nightly build of 2009-10-12) using JMeter. I set up JMeter to post pod_other.xml, then hd.xml, then commit.xml, which only contains the line "<commit/>", 100 times. The Solr instance runs on a multi-core system. Solr didn't complain when the number of test threads was 1, 2, 3 or 4. But when I increased the number of test threads to 8, I saw this error on the console: SEVERE: org.apache.solr.common.SolrException: Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later. What does this mean? Why does Solr try to warm up searchers when I'm posting documents, not searching? Do I need to set maxWarmingSearchers to a value greater than the number of CPU cores? Thanks. T. "Kuro" Kurosaka
Re: "exceeded limit of maxWarmingSearchers=2" when posting data
2009/10/23 Teruhiko Kurosaka : > I'm trying to stress-test solr (nightly build of 2009-10-12) using JMeter. > I set up JMeter to post pod_other.xml, then hd.xml, then commit.xml that only > has a line "", 100 times. > Solr instance runs on a multi-core system. > > Solr didn't complian when the number of test threads is 1, 2, 3 or 4. > > But when I increased the thnumber of test threads to 8, I saw this error > on the console: > > SEVERE: org.apache.solr.common.SolrException: Error opening new searcher. > exceeded limit of maxWarmingSearchers=2, try again later. > > > What does this mean? > > Why Solr tries to make warm up searchers when I'm posting documents, not > searching? A commit flushes index changes to disk and opens a new index searcher. The maxWarmingSearchers limit is just a protection mechanism. > Do I need to set this maxWarmingSearchers to greater than the number of CPU > cores? No, that's unrelated. Don't commit so often. The error is also not a fatal one - the commit fails, but you won't lose data - you just won't see it until a commit succeeds in opening a new searcher. -Yonik http://www.lucidimagination.com
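Following Yonik's advice, the usual SolrJ-side fix is to batch documents and commit once at the end (or rely on autoCommit in solrconfig.xml) rather than committing after every post. A minimal sketch, with the URL and document fields made up for illustration:

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BatchedIndexing {
  public static void main(String[] args) throws Exception {
    SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");

    List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
    for (int i = 0; i < 1000; i++) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc-" + i);
      doc.addField("name", "stress test document " + i);
      batch.add(doc);
    }

    // One add for the whole batch and one commit at the end, instead of a
    // commit after every document - frequent commits under concurrent load
    // are what trigger the maxWarmingSearchers=2 error.
    solr.add(batch);
    solr.commit();
  }
}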
Solrj Javabin and JSON
Hi, has anyone written a JavaBin to JSON converter and is willing to share it? In our servlet we use a CommonsHttpSolrServer instance to execute a query. The problem is that it returns the JavaBin format, and we need to send the result back to the browser in JSON format. And no, the browser is not allowed to query Lucene directly with the wt=json format. Regards, S. -- View this message in context: http://www.nabble.com/Solrj-Javabin-and-JSON-tp26036551p26036551.html Sent from the Solr - User mailing list archive at Nabble.com.
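In case it helps, one low-tech approach is to keep the fast javabin transport between the servlet and Solr and serialize the SolrDocumentList to JSON yourself. A rough sketch only (field handling is simplified; real code would need proper string escaping and date/multi-value handling):

import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

public class SolrJsonUtil {

  /** Very small javabin-result-to-JSON converter; illustrative only. */
  public static String toJson(SolrDocumentList docs) {
    StringBuilder json = new StringBuilder();
    json.append("{\"numFound\":").append(docs.getNumFound()).append(",\"docs\":[");
    for (int i = 0; i < docs.size(); i++) {
      SolrDocument doc = docs.get(i);
      if (i > 0) json.append(',');
      json.append('{');
      boolean first = true;
      for (String field : doc.getFieldNames()) {
        if (!first) json.append(',');
        first = false;
        Object value = doc.getFieldValue(field);
        json.append('"').append(field).append("\":");
        if (value instanceof Number || value instanceof Boolean) {
          json.append(value);
        } else {
          // NOTE: real code must escape quotes, backslashes and control chars
          json.append('"').append(String.valueOf(value)).append('"');
        }
      }
      json.append('}');
    }
    json.append("]}");
    return json.toString();
  }
}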