Re: Huge load and long response times during search
Hi,

Otis Gospodnetic writes:
> Tom,
> It looks like the machine might simply be running too many things.
> If the load is around 1 when Solr is not running, and this is a dual-core server, it shows its already relatively busy (cca 50% idle).

The server is running PostgreSQL and Apache/PHP as well, but without Solr the server condition is more than good (load usually less than 1; sometimes, even during rush hours, we observed a 1m load avg of 0.68).

It is double dual core, so load 1 means 25%, am I right (4 cores)?

> Your caches are not small, so I am guessing you either have to have a relatively big heap, or your heap is not large enough and it's the GC that's causing high CPU load.

Java starts with -Xmx3584m. Should that be fine for such cache settings? By the way, I'm wondering if we need such caches. I checked query frequency for the last 10 days (~7 unique users): the most frequent phrase appears ~150 times, and only 11 queries occur more than 100 times. I did not count whether a user used the same query but went to the next page.

Is it worth keeping quite a big cache in this case?

> If you are seeing Solr causing lots of IO, that's a sign the box doesn't have enough memory for all those servers running comfortably on it.

We do have some free memory to use. The server has 8G RAM and mostly uses up to 6G; I haven't seen swap used yet. I would try to give more RAM to Java and use a smaller cache to see if it would work.

Tom
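P.S. For reference, these caches are configured in solrconfig.xml; shrinking them would look something like this (the sizes below are only an illustration, not a recommendation):

<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="32"/>
<documentCache class="solr.LRUCache" size="512" initialSize="512"/>

Given how rarely our queries repeat, even sizes this small might be enough.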
Re: Index time boosts, payloads, and long query strings
Thanks Erick! After reading your answer, and re-reading the Solr wiki, I realized my folly. I used to think that index-time boosts when applied on a per-field basis are equivalent to query time boosts to that field. To ensure that my new understanding is correct , I'll state it in my words. Index time boosts will determine boost for a *document* if it is counted as a hit. Query time boosts give you control on boosting the occurrence of a query in a specific field. Please correct me if I'm wrong (again) :-) Girish Redekar http://girishredekar.net On Sun, Nov 22, 2009 at 8:25 PM, Erick Erickson wrote: > I still think they are apples and oranges. If you boost *all* titles, > you're effectively boosting none of them. Index time boosting > expresses "this document's title is more important than other > document titles." What I think you're after is "titles are more > important than other parts of the document. > > For this latter, you're talking query-time boosting. Boosting only > really makes sense if there are multiple clauses, something > like title:important OR body:unimportant. If this is true, speed > is irrelevant, you need correct behavior. > > Not that I think you'd notice either way. Modern computers > can do a LOT of FLOPS/sec. Here's an experiment: time > some queries (but beware of timing the very first ones, see > the Wiki) with boosts and without boosts. I doubt you'll see > enough difference to matter (but please do report back if you > do, it'll further my education ). > > But, depending on your index structure, you may get this > anyway. Generally, matches on shorter fields weigh more > in the score calculations than on longer fields. If you have > fields like title and body and you are querying on title:term OR > body:term, documents with term in the title will tend toward > higher scores. > > But before putting too much effort into this, do you have any > evidence that the default behavior is unsatisfactory? Because > unless and until you do, I think this is a distraction ... > > Best > Erick > > On Sun, Nov 22, 2009 at 8:37 AM, Girish Redekar > wrote: > > > Hi Erick - > > > > Maybe I mis-wrote. > > > > My question is: would "title:any_query^4.0" be faster/slower than > applying > > index time boost to the field title. Basically, if I take *every* user > > query > > and search for it in title with boost (say, 4.0) - is it different than > > saying field title has boost 4.0? > > > > Cheers, > > Girish Redekar > > http://girishredekar.net > > > > > > On Sun, Nov 22, 2009 at 2:02 AM, Erick Erickson > >wrote: > > > > > I'll take a whack at index .vs. query boosting. They are expressing > very > > > different concepts. Let's claim we're interested in boosting the title > > > field > > > > > > Index time boosting is expressing "this document's title is X more > > > important > > > > > > than a normal document title". It doesn't matter *what* the title is, > > > any query that matches on anything in this document's title will give > > this > > > document a boost. I might use this to give preferential treatment to > all > > > encyclopedia entries or something. > > > > > > Query time boosting, like "title:solr^4.0" expresses "Any document with > > > solr > > > in > > > it's title is more important than documents without solr in the title". > > > This > > > really > > > only makes sense if you have other clauses that might cause a document > > > *without* > > > solr the title to match.. > > > > > > Since they are doing different things, efficiency isn't really > relevant. 
> > > > > > HTH > > > Erick > > > > > > > > > On Sat, Nov 21, 2009 at 2:13 AM, Girish Redekar > > > wrote: > > > > > > > Hi , > > > > > > > > I'm relatively new to Solr/Lucene, and am using Solr (and not lucene > > > > directly) primarily because I can use it without writing java code > > (rest > > > of > > > > my project is python coded). > > > > > > > > My application has the following requirements: > > > > (a) ability to search over multiple fields, each with different > weight > > > > (b) If possible, I'd like to have the ability to add extra/diminished > > > > weights to particular tokens within a field > > > > (c) My query strings have large lengths (50-100 words) > > > > (d) My index is 500K+ documents > > > > > > > > 1) The way to (a) is field boosting (right?). My question is: Is all > > > field > > > > boosting done at query time? Even if I give index time boosts to > > fields? > > > Is > > > > there a performance advantage in boosting fields at index time vs at > > > using > > > > something like fieldname:querystring^boost. > > > > 2) From what I've read, it seems that I can do (b) using payloads. > > > However, > > > > as this link ( > > > > > > > > > > > > > > http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/ > > > > ) > > > > suggests, I will have to write a payload aware Query Parser. Wanted > to > > > > confirm if this is indeed the case - or is there a out-of-box way to
RE: schema-based Index-time field boosting
Yeah, like I said, I was mistaken about setting field boost in schema.xml - doesn't mean it's a bad idea though. At any rate, from your penultimate sentence I reckon at least one of us is still confused about field boosting, feel free to reply if you think it's me ;) Ian. -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: 21 November 2009 01:54 To: solr-user@lucene.apache.org Subject: RE: schema-based Index-time field boosting : The field boost attribute was put there by me back in the 1.3 days, when : I somehow gained the mistaken impression that it was supposed to work! : Of course, despite a lot of searching I haven't been able to find : anything to back up my position ;) solr has never supported anything like a "boost" parameter on fields in schema.xml : Of course, by now I am convinced that this might be a really good : feature - I might get the chance to look into it in the near future - : can anyone think of reasons why this might not work in practice? field boosting only makes sense if it's only applied to some of the documents in the index, if every document has an index time boost on fieldX, then that boost is meaningless. are you looking for query time boosting on fields? like what dismax provides with the "qf" param? -Hoss
Re: Huge load and long response times during search
Tom, AFAIK Lucene performance is very much dependent on file system cache size, in case of large index. So if you see lots of IO, this probably means that your system doesn't have enough memory to hold large file system cache, suitable for your index size. In this case you don't need to give more memory to java processes, but instead you need to free as much memory as you can for the OS. On Mon, Nov 23, 2009 at 11:53 AM, Tomasz Kępski wrote: > Hi, > > Otis Gospodnetic pisze: > > Tom, >> >> It looks like the machine might simply be running too many things. >> > > If the load is around 1 when Solr is not running, and this is a dual-core > server, it shows its already relatively busy (cca 50% idle). > > The server is running the Postgresql and Apache/PHP as well, but without > solr the server condition is more than good (load usually less than 1, > sometimes , even dring rush hours we observed 1m load avg 0,68). > > It is double dual core so load 1 means 25% am I right (4 cores)? > > > Your caches are not small, so I am guessing you either have to have a >> relatively big heap, or your heap is not large enough and it's the GC that's >> causing high CPU load. >> > > The java starts with Xmx3584m. Should that be fine for such cache settings? > By the way I'm wondering if we need such caches. I did check query frequency > for last 10 days (~7 unique users) and most frequent phrase appears ~150 > times, and only 11 queries exists more than 100 times. I did not count if > user used the same query but goes to next page. > > Is this worthy to keep quite big cache in this cas? > > > If you are seeing Solr causing lots of IO, that's a sign the box doesn't >> have enough memory for all those servers running comfortably on it. >> > > We do have some free memory to use. Server has 8G RAM and mostly uses up to > 6G, I haven't seen the swap used yet. I would try to give more RAM for java > and use smaller cache to see if it would work. > > Tom > > > -- Andrew Klochkov Senior Software Engineer, Grid Dynamics
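P.S. To make that concrete (the heap size here is just an illustration): on an 8G box like Tom's, that could mean starting Solr with something like "java -Xmx2048m -jar start.jar" instead of -Xmx3584m, and letting the OS use the freed memory as file system cache for the index files.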
Output all, from one field
Hello, I'm looking for a way to output all content from one field. Like name: "NAME:*" and Solr gives me all names, or "color:*" and I get all colors. Can I do this, or is it impossible? Jörg
Re: Function queries question
Thanks for getting back to me. I've added inline responses below. 2009/11/20 Grant Ingersoll > > On Nov 20, 2009, at 3:15 AM, Oliver Beattie wrote: > > > Hi all, > > > > I'm a relative newcomer to Solr, and I'm trying to use it in a project > > of mine. I need to do a function query (I believe) to filter the > > results so they are within a certain distance of a point. For this, I > > understand I should use something like sqedist or hsin, and from the > > documentation on the FunctionQuery page, I believe that the function > > is executed on every "row" (or "record", not sure what the proper term > > for this is). So, my question is threefold really; are those functions > > the ones I should be using to perform a search where distance is one > > of the criteria (there are others), > > Short answer: yes. Long answer: I just committed those functions this week. > I believe they are good, but feedback is encouraged. I'll be sure to let you know if I find anything report-worthy :) They're definitely super-useful for people doing similar things to me though, so great work :) > > and if so, does Solr execute the > > query on every row (and again, if so, is there any way of preventing > > this [like subqueries, though I know they're not supported])? > > You can use the frange capability to filter first. See > http://www.lucidimagination.com/blog/tag/frange/ Thanks for the link. I'll definitely do that. Does Solr execute the function on every row in the database on every query otherwise? > > Here's an example from a soon to be published article I'm writing: > http://localhost:8983/solr/select/?q=*:*&fq={!frange l=0 u=400}hsin(0.57, > -1.3, lat_rad, lon_rad, 3963.205) > > This should filter out all documents that are beyond 400 miles in distance > from that point on a sphere (specified in radians, see also the rads() method) > > > > > > > Sorry if this is a little confusing… any help would be greatly appreciated > > :) > > No worries, a lot of this spatial stuff is still being ironed out. See > https://issues.apache.org/jira/browse/SOLR-773 for the issue that is tracking > all of the related issues. The pieces are starting to come together and I'm > pretty excited about it b/c not only will it bring native spatial support to > Solr, it will also give Solr some exciting new general capabilities (sort by > function, pseudo-fields, facet by function, etc.)
Boost document base on field length
Hi, I would like to boost documents with longer descriptions, to move down documents with a zero-length description. I'm wondering if there is a possibility to boost a document based on the field length while searching, or is the only way to store the field length as an int in a separate field while indexing? Tom
Re: Index time boosts, payloads, and long query strings
Yep On Mon, Nov 23, 2009 at 4:13 AM, Girish Redekar wrote: > Thanks Erick! > > After reading your answer, and re-reading the Solr wiki, I realized my > folly. I used to think that index-time boosts when applied on a per-field > basis are equivalent to query time boosts to that field. > > To ensure that my new understanding is correct , I'll state it in my words. > Index time boosts will determine boost for a *document* if it is counted as > a hit. Query time boosts give you control on boosting the occurrence of a > query in a specific field. > > Please correct me if I'm wrong (again) :-) > > Girish Redekar > http://girishredekar.net > > > On Sun, Nov 22, 2009 at 8:25 PM, Erick Erickson >wrote: > > > I still think they are apples and oranges. If you boost *all* titles, > > you're effectively boosting none of them. Index time boosting > > expresses "this document's title is more important than other > > document titles." What I think you're after is "titles are more > > important than other parts of the document. > > > > For this latter, you're talking query-time boosting. Boosting only > > really makes sense if there are multiple clauses, something > > like title:important OR body:unimportant. If this is true, speed > > is irrelevant, you need correct behavior. > > > > Not that I think you'd notice either way. Modern computers > > can do a LOT of FLOPS/sec. Here's an experiment: time > > some queries (but beware of timing the very first ones, see > > the Wiki) with boosts and without boosts. I doubt you'll see > > enough difference to matter (but please do report back if you > > do, it'll further my education ). > > > > But, depending on your index structure, you may get this > > anyway. Generally, matches on shorter fields weigh more > > in the score calculations than on longer fields. If you have > > fields like title and body and you are querying on title:term OR > > body:term, documents with term in the title will tend toward > > higher scores. > > > > But before putting too much effort into this, do you have any > > evidence that the default behavior is unsatisfactory? Because > > unless and until you do, I think this is a distraction ... > > > > Best > > Erick > > > > On Sun, Nov 22, 2009 at 8:37 AM, Girish Redekar > > wrote: > > > > > Hi Erick - > > > > > > Maybe I mis-wrote. > > > > > > My question is: would "title:any_query^4.0" be faster/slower than > > applying > > > index time boost to the field title. Basically, if I take *every* user > > > query > > > and search for it in title with boost (say, 4.0) - is it different than > > > saying field title has boost 4.0? > > > > > > Cheers, > > > Girish Redekar > > > http://girishredekar.net > > > > > > > > > On Sun, Nov 22, 2009 at 2:02 AM, Erick Erickson < > erickerick...@gmail.com > > > >wrote: > > > > > > > I'll take a whack at index .vs. query boosting. They are expressing > > very > > > > different concepts. Let's claim we're interested in boosting the > title > > > > field > > > > > > > > Index time boosting is expressing "this document's title is X more > > > > important > > > > > > > > than a normal document title". It doesn't matter *what* the title is, > > > > any query that matches on anything in this document's title will give > > > this > > > > document a boost. I might use this to give preferential treatment to > > all > > > > encyclopedia entries or something. 
> > > > > > > > Query time boosting, like "title:solr^4.0" expresses "Any document > with > > > > solr > > > > in > > > > it's title is more important than documents without solr in the > title". > > > > This > > > > really > > > > only makes sense if you have other clauses that might cause a > document > > > > *without* > > > > solr the title to match.. > > > > > > > > Since they are doing different things, efficiency isn't really > > relevant. > > > > > > > > HTH > > > > Erick > > > > > > > > > > > > On Sat, Nov 21, 2009 at 2:13 AM, Girish Redekar > > > > wrote: > > > > > > > > > Hi , > > > > > > > > > > I'm relatively new to Solr/Lucene, and am using Solr (and not > lucene > > > > > directly) primarily because I can use it without writing java code > > > (rest > > > > of > > > > > my project is python coded). > > > > > > > > > > My application has the following requirements: > > > > > (a) ability to search over multiple fields, each with different > > weight > > > > > (b) If possible, I'd like to have the ability to add > extra/diminished > > > > > weights to particular tokens within a field > > > > > (c) My query strings have large lengths (50-100 words) > > > > > (d) My index is 500K+ documents > > > > > > > > > > 1) The way to (a) is field boosting (right?). My question is: Is > all > > > > field > > > > > boosting done at query time? Even if I give index time boosts to > > > fields? > > > > Is > > > > > there a performance advantage in boosting fields at index time vs > at > > > > using > > > > > something like fieldname:querystring^boost. > > > > > 2) From
ExtractingRequestHandler commitWithin
Any chance of getting the ExtractingRequestHandler to use the commitWithin parameter?
Re: Boost document base on field length
On Nov 23, 2009, at 8:01 AM, Tomasz Kępski wrote: > Hi, > > I would like to boost documents with longer descriptions to move down > documents with 0 length description, > I'm wondering if there is possibility to boost document basing on the field > length while searching or the only way is to store field length as an int in > a separate field while indexing? Override the default Similarity (see the end of the schema.xml file) with your own Similarity implementation and then in that class override the lengthNorm() method.
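A minimal sketch of such a Similarity (the class name, the "description" field name, and the particular length-to-norm mapping are all just assumptions to illustrate the idea, not a recommendation):

import org.apache.lucene.search.DefaultSimilarity;

public class DescriptionLengthSimilarity extends DefaultSimilarity {
  @Override
  public float lengthNorm(String fieldName, int numTokens) {
    if ("description".equals(fieldName)) {
      // Reward longer descriptions instead of penalizing them, and give
      // zero-length descriptions the smallest norm so they sink to the bottom.
      if (numTokens == 0) return 0.1f;
      return Math.min(1.0f, 0.1f + numTokens / 100.0f);
    }
    return super.lengthNorm(fieldName, numTokens);
  }
}

You'd register it at the end of schema.xml with <similarity class="com.example.DescriptionLengthSimilarity"/> and then reindex, since norms are computed at index time (and encoded in a single byte, so small differences get quantized away).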
Re: Function queries question
On Nov 23, 2009, at 6:54 AM, Oliver Beattie wrote: > Thanks for getting back to me. I've added inline responses below. > > 2009/11/20 Grant Ingersoll >> >> On Nov 20, 2009, at 3:15 AM, Oliver Beattie wrote: >> >>> Hi all, >>> >>> I'm a relative newcomer to Solr, and I'm trying to use it in a project >>> of mine. I need to do a function query (I believe) to filter the >>> results so they are within a certain distance of a point. For this, I >>> understand I should use something like sqedist or hsin, and from the >>> documentation on the FunctionQuery page, I believe that the function >>> is executed on every "row" (or "record", not sure what the proper term >>> for this is). So, my question is threefold really; are those functions >>> the ones I should be using to perform a search where distance is one >>> of the criteria (there are others), >> >> Short answer: yes. Long answer: I just committed those functions this >> week. I believe they are good, but feedback is encouraged. > > I'll be sure to let you know if I find anything report-worthy :) > They're definitely super-useful for people doing similar things to I > though, so great work :) > >>> and if so, does Solr execute the >>> query on every row (and again, if so, is there any way of preventing >>> this [like subqueries, though I know they're not supported])? >> >> You can use the frange capability to filter first. See >> http://www.lucidimagination.com/blog/tag/frange/ > > Thanks for the link. I'll definitely do that. Does Solr execute the > function on every row in the database on every query otherwise? If the query is unrestricted by other clauses or by filters, yes it will execute over all docs in the index. > >> >> Here's an example from a soon to be published article I'm writing: >> http://localhost:8983/solr/select/?q=*:*&fq={!frange l=0 >> u=400}hsin(0.57, -1.3, lat_rad, lon_rad, 3963.205) >> >> This should filter out all documents that are beyond 400 miles in distance >> from that point on a sphere (specified in radians, see also the rads() >> method) >> >> >> >>> >>> Sorry if this is a little confusing… any help would be greatly appreciated >>> :) Which part? The hsin() part calculates the distance between the point 0.57, -1.3 and the values in the fields lat_rad, lon_rad and is using 3963.205 as the radius of the sphere (which is the approx. radius of the Earth in miles). The frange stuff then filters such that it only accepts docs that have a value for hsin between 0 and 400. -Grant
[N to M] range search out of sum of field. howto search this?
Hi folks,
I've got documents like:

user:1 num:5
user:1 num:8
user:5 num:7
user:5 num:1

I'd like to get each user whose sum of num is in the range 5 to 10. In this case it should return user 5, as 7+1=8 is within the range. User 1 will not match because its sum of num is 5+8=13, hence outside the range 5 to 10.

Thanks
Re: [N to M] range search out of sum of field. howto search this?
See frange: http://www.lucidimagination.com/blog/2009/07/06/ranges-over-functions-in-solr-14/ fq={!frange l=5 u=10}sum(user,num) -Yonik http://www.lucidimagination.com On Mon, Nov 23, 2009 at 8:49 AM, Julian Davchev wrote: > Hi folks, > I got documents like > user:1 num:5 > user:1 num: 8 > user:5 num:7 > user:5 num:1 > > > > I'd like to get per user that maches sum of num range 5 to 10 > In this case it should return user 5 as 7+1=8 and is within range. > User 1 will be false cause sum of num is 5+8=13 hence outside range 5 to 10 > > Thanks >
Re: access denied to solr home lib dir
Check. I even verified that the tomcat user could create the directory (i.e. "sudo -u tomcat6 mkdir /opt/solr/steve/lib"). Still solr complains. On Sun, Nov 22, 2009 at 10:03 PM, Yonik Seeley wrote: > Maybe ensuring that the full parent path (all parent directories) have > "rx" permissions? > > -Yonik > http://www.lucidimagination.com > > On Sun, Nov 22, 2009 at 2:59 PM, Charles Moad wrote: >> I have been trying to get a new solr install setup on Ubuntu 9.10 >> using tomcat6. I have tried the solr 1.4 release and the latest svn >> for good measure. No matter what, I am running into the following >> permission error. I removed all the lib includes from solrconfig.xml. >> I have created the "/opt/solr/steve/lib" directory and all permissions >> are good. This directory is optional, but I just cannot get past >> this. I've installed solr 1.3 many times without running into this on >> redhat boxes. >> >> Thanks, >> Charlie >> >> Nov 22, 2009 2:48:53 PM org.apache.catalina.core.StandardContext filterStart >> SEVERE: Exception starting filter SolrRequestFilter >> org.apache.solr.common.SolrException: >> java.security.AccessControlException: access denied >> (java.io.FilePermission /opt/solr/steve/./lib read) >> at >> org.apache.solr.servlet.SolrDispatchFilter.<init>(SolrDispatchFilter.java:68) >> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native >> Method) >> at >> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) >> at >> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) >> at java.lang.reflect.Constructor.newInstance(Constructor.java:513) >> at java.lang.Class.newInstance0(Class.java:355) >> at java.lang.Class.newInstance(Class.java:308) >> at >> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:255) >> at >> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397) >> at >> org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:108) >> at >> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3800) >> at >> org.apache.catalina.core.StandardContext.start(StandardContext.java:4450) >> at >> org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791) >> at >> org.apache.catalina.core.ContainerBase.access$000(ContainerBase.java:123) >> at >> org.apache.catalina.core.ContainerBase$PrivilegedAddChild.run(ContainerBase.java:145) >> at java.security.AccessController.doPrivileged(Native Method) >> at >> org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:769) >> at >> org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526) >> at >> org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:630) >> at >> org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:556) >> at >> org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:491) >> at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1206) >> at >> org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:314) >> at >> org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119) >> at >> org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053) >> at org.apache.catalina.core.StandardHost.start(StandardHost.java:722) >> at >> org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045) >> at >> org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) >> at 
>> org.apache.catalina.core.StandardService.start(StandardService.java:516) >> at >> org.apache.catalina.core.StandardServer.start(StandardServer.java:710) >> at org.apache.catalina.startup.Catalina.start(Catalina.java:583) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) >> at java.lang.reflect.Method.invoke(Method.java:597) >> at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:288) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) >> at java.lang.reflect.Method.invoke(Method.java:597) >> at >> org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:177) >> Caused by: java.security.AccessControlException: access denied >> (java.io.FilePermission /opt/solr/steve/./lib read) >> at >> java.security.AccessControlContex
Re: "query" function query; what's it for?
Thanks Yonik. That blog post was very interesting, but it only has about a sentence or two on the query() function, and it points the user to the same link I have here at the wiki for examples. But those examples (really just one example) don't explain the point -- e.g., when/why would I use this? On Nov 22, 2009, at 11:11 PM, Yonik Seeley wrote: > On Sun, Nov 22, 2009 at 11:06 PM, David Smiley @MITRE.org > wrote: >> It's not clear to me what purpose the "query" function query solves. I've >> read the description: >> http://wiki.apache.org/solr/FunctionQuery#query but it doesn't really >> explain the point of it. I'm sure it has to do with subtleties in how >> scoring is done. Can someone please present a use-case? > > See the "Pure Nested Query" section here: > http://www.lucidimagination.com/blog/2009/03/31/nested-queries-in-solr/ > > -Yonik > http://www.lucidimagination.com
Re: "query" function query; what's it for?
On Mon, Nov 23, 2009 at 9:36 AM, Smiley, David W. wrote: > Thanks Yonik. That blog post was very interesting but it only has about a > sentence or two on the query() function Ah, I had thought that you meant the "query" QParser. The query function just allows you to use any other query type inside a function query. So you could add or multiply the scores of two dismax queries together, or whatever. As far as practical usecases... I've used it for selectively boosting one query based on the results of another query. I'm sure others will find other uses for it. -Yonik http://www.lucidimagination.com
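P.S. For example, something like this (the field and query values are invented): q={!func}product(popularity, query({!dismax qf=text v='solr rocks'})) would score each document by its popularity field multiplied by its dismax score for "solr rocks".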
Re: Control DIH from PHP
Thankyou 2009/11/21 Lance Norskog > Nice! I didn't notice that before. Very useful. > > 2009/11/19 Noble Paul നോബിള് नोब्ळ् : > > you can pass the uniqueId as a param and use it in a sql query > > > http://wiki.apache.org/solr/DataImportHandler#Accessing_request_parameters > . > > --Noble > > > > On Thu, Nov 19, 2009 at 3:53 PM, Pablo Ferrari > wrote: > >> Most specificly, I'm looking to update only one document using it's > Unique > >> ID: I dont want the DIH to lookup the whole database because I already > know > >> the Unique ID that has changed. > >> > >> Pablo > >> > >> 2009/11/19 Pablo Ferrari > >> > >>> > >>> > >>> Hello! > >>> > >>> After been working in Solr documents updates using direct php code > (using > >>> SolrClient class) I want to use the DIH (Data Import Handler) to update > my > >>> documents. > >>> > >>> Any one knows how can I send commands to the DIH from php? Any idea or > >>> tutorial will be of great help because I'm not finding anything useful > so > >>> far. > >>> > >>> Thank you for you time! > >>> > >>> Pablo > >>> Tinkerlabs > >>> > >> > > > > > > > > -- > > - > > Noble Paul | Principal Engineer| AOL | http://aol.com > > > > > > -- > Lance Norskog > goks...@gmail.com >
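P.S. For the archives, the pattern Noble describes ended up looking roughly like this (the "docid" parameter and the table/column names are placeholders): in data-config.xml you reference the request parameter in the entity's query, e.g. query="SELECT * FROM documents WHERE id='${dataimporter.request.docid}'", and then from PHP you just issue a plain HTTP request such as http://localhost:8983/solr/dataimport?command=full-import&clean=false&commit=true&docid=123 (clean=false matters; otherwise the full-import first deletes everything already in the index).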
Very busy search screen
I have a client who wants to search on almost every attribute of an object (nearly 15 attributes) on the search screen. The search screen looks very crazy/busy. I was wondering if there are better ways to address these requirements and build intelligent categorized/configurable searches, including allowing the user to choose whether they want to AND or OR attributes, etc.? Any pointers would be appreciated. thanks,
Re: solr artifacts / apache maven repository
On Mon, Nov 23, 2009 at 10:31 PM, TCK wrote: > Hi, > > I'd like to pull in the solr war from a public repository. I'm able to find > the individual jars at http://repo2.maven.org/maven2/org/apache/solr/ but > it > seems like the war artifact isn't published. Is there a reason for this or > is it published elsewhere ? > > The war is not published as a maven artifact. Why would you need the war in maven? -- Regards, Shalin Shekhar Mangar.
Re: solr artifacts / apache maven repository
I currently put the war into my own Nexus repository. I use it to build a war-overlay with the solr war to include my plugins & customizations (due to classloading issues with Spring and the external plugin solution). On Mon, Nov 23, 2009 at 12:08 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > On Mon, Nov 23, 2009 at 10:31 PM, TCK wrote: > > > Hi, > > > > I'd like to pull in the solr war from a public repository. I'm able to > find > > the individual jars at http://repo2.maven.org/maven2/org/apache/solr/but > > it > > seems like the war artifact isn't published. Is there a reason for this > or > > is it published elsewhere ? > > > > > The war is not published as a maven artifact. Why would you need the war in > maven? > > -- > Regards, > Shalin Shekhar Mangar. > -- Stephen Duncan Jr www.stephenduncanjr.com
Re: Very busy search screen
On Mon, Nov 23, 2009 at 10:36 PM, javaxmlsoapdev wrote: > > I have a client who wants to search on almost every attribute of an object > (nearly 15 attributes) on the search screen. The search screen looks very > crazy/busy. I was wondering if there are better ways to address these > requirements and build intelligent categorized/configurable searches? > including allowing user to choose if they want to AND or OR attributes etc? > Any pointers would be appreciated. > > You can go with a simple text box search on a catch-all field with facets for drilling down. That's how most of us do it. If your client really wants complete control you'd have to educate them on Solr's query syntax (or perhaps create a simpler query syntax), but I wouldn't suggest going that way. -- Regards, Shalin Shekhar Mangar.
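P.S. To make that concrete (the field names are only an example), the catch-all field is usually a copyField in schema.xml:

<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="*" dest="text"/>

and then you query with something like q=user+input&facet=true&facet.field=category&facet.field=status, rendering the facet counts as clickable filters that become fq=category:foo parameters on the next request.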
error with multicore CREATE action
Hey there, I am using Solr 1.4 out of the box and am trying to create a core at runtime using the CREATE action. I am getting this error when executing: http://localhost:8983/solr/admin/cores?action=CREATE&name=x&instanceDir=x&persist=true&config=solrconfig.xml&schema=schema.xml&dataDir=data Nov 23, 2009 6:18:44 PM org.apache.solr.core.SolrResourceLoader INFO: Solr home set to 'solr/x/' Nov 23, 2009 6:18:44 PM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Error executing default implementation of CREATE at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:250) at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:111) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:298) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:174) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139) at org.mortbay.jetty.Server.handle(Server.java:285) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226) at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442) Caused by: java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in classpath or 'solr/x/conf/', cwd=/home/smack/Desktop/apache-solr-1.4.0/example at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:260) at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:228) at org.apache.solr.core.Config.<init>(Config.java:101) at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:130) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:405) at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:245) ... 21 more I don't know if I am missing something. Should I manually create the folders and the schema and solrconfig files?
Re: solr artifacts / apache maven repository
On Mon, Nov 23, 2009 at 10:41 PM, Stephen Duncan Jr < stephen.dun...@gmail.com> wrote: > I currently put the war into my own Nexus repository. I use it to build a > war-overlay with the solr war to include my plugins & customizations (due > to > classloading issues with Spring and the external plugin solution). > > I see. If people find it generally useful, we could publish the war too. Patches welcome :) -- Regards, Shalin Shekhar Mangar.
Re: error with multicore CREATE action
On Mon, Nov 23, 2009 at 10:47 PM, Marc Sturlese wrote: > > Hey there, > I am using Solr 1.4 out of the box and am trying to create a core at > runtime > using the CREATE action. > I am getting this error when executing: > > http://localhost:8983/solr/admin/cores?action=CREATE&name=x&instanceDir=x&persist=true&config=solrconfig.xml&schema=schema.xml&dataDir=data > > Nov 23, 2009 6:18:44 PM org.apache.solr.core.SolrResourceLoader > INFO: Solr home set to 'solr/x/' > Nov 23, 2009 6:18:44 PM org.apache.solr.common.SolrException log > SEVERE: org.apache.solr.common.SolrException: Error executing default > implementation of CREATE >at > > org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:250) >at > > > org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442) > Caused by: java.lang.RuntimeException: Can't find resource 'solrconfig.xml' > in classpath or 'solr/x/conf/', > cwd=/home/smack/Desktop/apache-solr-1.4.0/example >at > > org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:260) >at > > org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:228) >at org.apache.solr.core.Config.(Config.java:101) >at org.apache.solr.core.SolrConfig.(SolrConfig.java:130) >at org.apache.solr.core.CoreContainer.create(CoreContainer.java:405) >at > > org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:245) >... 21 more > > I don't know if I am missing something. Should I create manually de folders > and schema and solconfig files? > > The instance directory and the configuration files should exist before you can create a core. The core CREATE command just creates a Solr core instance in memory after reading the configuration from disk. -- Regards, Shalin Shekhar Mangar.
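P.S. In other words, something like this first (paths assume the example distribution layout):

mkdir -p solr/x/conf
cp solr/conf/solrconfig.xml solr/conf/schema.xml solr/x/conf/

and then the CREATE URL from your mail should work, since the loader will find solr/x/conf/solrconfig.xml.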
Re: solr artifacts / apache maven repository
Thanks, yes that's what I do as well. I'd like to pull in the standard solr war distribution from a public repository and then explode it and put my own overlays. Shalin, any pointers to where I would look to go about making a patch to the artifact publishing process? Thanks, TCK On Mon, Nov 23, 2009 at 12:23 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > On Mon, Nov 23, 2009 at 10:41 PM, Stephen Duncan Jr < > stephen.dun...@gmail.com> wrote: > > > I currently put the war into my own Nexus repository. I use it to build > a > > war-overlay with the solr war to include my plugins & customizations (due > > to > > classloading issues with Spring and the external plugin solution). > > > > > I see. If people find it generally useful, we could publish the war too. > Patches welcome :) > > -- > Regards, > Shalin Shekhar Mangar. >
Re: solr artifacts / apache maven repository
On Mon, Nov 23, 2009 at 11:04 PM, TCK wrote: > Thanks, yes that's what I do as well. I'd like to pull in the standard solr > war distribution from a public repository and then explode it and put my > own > overlays. > > Shalin, any pointers to where I would look to go about making a patch to > the > artifact publishing process? > > That'd be great. See http://wiki.apache.org/solr/HowToContribute -- Regards, Shalin Shekhar Mangar.
Re: Output all, from one field
On Mon, Nov 23, 2009 at 4:29 PM, Jörg Agatz wrote: > Hallo, > > I search for a way, to output all content from one field.. > > Like name: > > "NAME:*" > > And Solr gifs me all Names > > or "color:*" > > and i become all colors > > can io do this? or is this Impossible? > > Do you want to return just one field from all documents? If yes, you can: 1. Query with q=*:*&fl=name 2. Use TermsComponent - http://wiki.apache.org/solr/TermsComponent -- Regards, Shalin Shekhar Mangar.
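P.S. For option 2, assuming the /terms handler from the example solrconfig.xml is enabled, a request like http://localhost:8983/solr/terms?terms.fl=name&terms.limit=-1 returns every indexed term in the name field (terms.limit=-1 means no limit).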
Re: error with multicore CREATE action
Are there any use cases for CREATE where the instance directory *doesn't* yet exist? I ask because I've noticed that Solr will sometimes create an instance directory for me with the CREATE command. In particular, if I run something like http://solrhost/solr/admin/cores?action=CREATE&name=newcore&instanceDir=d:\dir_that_does_not_exist\&config=C:\dir_that_does_exist\solrconfig.xml&schema=C:\dir_that_does_exist\schema.xml then Solr will create d:\dir_that_does_not_exist and d:\dir_that_does_not_exist\data for me (but not d:\dir_that_does_not_exist\conf). Maybe this has to do with some peculiarity in my solrconfig.xml? (There I've commented out the dataDir element because I prefer the default behavior to what you get with "${solr.data.dir:./solr/data}".) 2009/11/23 Shalin Shekhar Mangar : > The instance directory and the configuration files should exist before you > can create a core. The core CREATE command just creates a Solr core instance > in memory after reading the configuration from disk.
Re: NPE when trying to view a specific document via Luke
: I think thats the case - I'm not seeing the problem - though I didn't : follow your steps exactly, because I also set the data dir. yeah ... I went back and tested again and verified that's what was happening. There is a bug with Luke when viewing "binary" based fields introduced in Solr 1.4 (like the new Trie fields) which Yonik has fixed in SOLR-1563, but I can't trigger any similar problems when using an existing 1.3 schema and/or index. -Hoss
RE: schema-based Index-time field boosting
: Yeah, like I said, I was mistaken about setting field boost in : schema.xml - doesn't mean it's a bad idea though. At any rate, from : your penultimate sentence I reckon at least one of us is still confused : about field boosting, feel free to reply if you think it's me ;) Yeah ... I think it's you. Like I said... : field boosting only makes sense if it's only applied to some of the : documents in the index, if every document has an index time boost on : fieldX, then that boost is meaningless. ...if there were a way to boost fields at index time configured in the schema.xml, then every doc would get that boost on its instances of those fields, but the only purpose of index time boosting is to indicate that one document is more significant than another doc -- if every doc gets the same boost, it becomes a no-op. (think about the math -- field boosts become multipliers in the fieldNorm -- if every doc gets the same multiplier, then there is no net effect) -Hoss
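P.S. Spelled out: norm(doc, field) = docBoost * fieldBoost * lengthNorm(field), and the score contribution of a term matching in that field is proportional to this norm. If fieldBoost is 4.0 on the title of *every* document, then every title score is multiplied by the same constant 4.0 and the ranking is exactly what it would have been without the boost.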
Re: Factory cannot be cast
: previously I was using a NGramFilterFactory for the completion on my website : but the EdgeNGramTokenizerFactory seems to be more pertinent. : : I defined my own field type but when I start solr I got the error log : : : GRAVE: java.lang.ClassCastException: : org.apache.solr.analysis.EdgeNGramTokenizerFactory cannot be cast to : org.apache.solr.analysis.TokenFilterFactory You can't use a TokenizerFactory as a TokenFilterFactory --- they do very different things. A Tokenizer is responsible for converting a stream of characters into a stream of Tokens, while a TokenFilter is responsible for processing an existing stream of Tokens and producing a (modified) stream of Tokens. -Hoss
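P.S. For an autocomplete field the filter version would look something like this (the gram sizes and analyzer chain are just an illustration):

<fieldType name="autocomplete" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Alternatively, EdgeNGramTokenizerFactory can go in the <tokenizer> slot itself; the cast error comes from putting it in a <filter> slot.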
Re: Fwd: solr index-time boost... help required please
: Now I am trying *index-time *boosting to improve response time. So i created : an algorithm where I do the following:- : 1. sort the records i get from database on approval_dt asc and increase the : boost value of the element for approval_dt by 0.1 as i encounter : higer approval_dt records. If there is no approval_dt for a record, not : boost value for it. I made omitnorms=false in schema.xml for approval_dt : field. Now when I apply the same query nothing special happens ie I dont : even see the latest dates first. index time boosting of a field just affects the fieldNorms for the specific field you apply the boost to -- if you don't search on that field (with a score based query type), the boost doesn't affect things. So if you applied an index time boost to some field named "approval_dt" then that boost isn't going to matter unless you query against the approval_dt field -- but if you use something like a range query, the boost still won't matter because range queries don't affect the score. More than likely what you want to do is use a *document* boost instead of a field boost .. that way the boost factor gets applied to any field you have that includes the norms, so no matter what field you query on the boost will get applied. : 2. If we boost a doc or field in the xml should we again use the bf : parameter with a function to put the boost into effect while querying when : trying index-time boost also? index time boosts and query boosts are completely orthogonal; you can use both together, but they don't require (or know about) each other at all : 3. Also can you frame a query for me to see the latest approval_dt coming : first using the index-time boost approach. not with the setup you've described ... date based queries really won't ever look at the norms for the date field (unless you did a term query for a very specific date value) : 4. Does bf function play any role in solrconfig.xml when we plan to use : index-time boost. My understanding is bf is used only for query-time boost. you are correct. : 5. Is it necessary to use bq in case of index time boost. same answer as #2. -Hoss
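P.S. A document boost goes on the <doc> element of the XML you post to /update, for example (the boost value is just an illustration):

<add>
  <doc boost="2.5">
    <field name="id">1234</field>
    <field name="approval_dt">2009-11-01T00:00:00Z</field>
    ...
  </doc>
</add>

with newer approval dates getting larger boost values under the scheme you describe.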
Re: Index time boosting troubles
: I had working index time boosting on documents like so: : : Everything was great until I made some changes that I thought where no : related to the doc boost but after that my doc boosting appears to be : missing. : : I'm having a tough time debugging this and didn't have the sense to version : control this so I would have something to revert to (lesson learned). : : In schema.xml I have ...I don't really understand your question. What does that one fieldtype have to do with your specific issue? If you post your whole schema, and some examples of the types of docs you are indexing and the queries you are trying, then people can probably help you see how/when/why your index time boosts come into play, but a single fieldtype from your schema without any context doesn't give us much to go on. -Hoss
Re: Where is upgrading documentation?
: Subject: Re: Where is upgrading documentation? : : CHANGES.txt contains information, but no instructions. Hmmm, I see what you mean. We typically have a nice boilerplate set of instructions, and somehow those got removed from the 1.4 CHANGES. In a nutshell, the instructions are the same as for 1.3... IMPORTANT UPGRADE NOTE: In a master/slave configuration, all searchers/slaves should be upgraded before the master! If the master were to be updated first, the older searchers would not be able to read the new index format. ... Older Apache Solr installations can be upgraded by replacing the relevant war file with the new version. No changes to configuration files should be needed. This version of Solr contains a new version of Lucene implementing an updated index format. This version of Solr/Lucene can still read and update indexes in the older formats, and will convert them to the new format on the first index change. Be sure to backup your index before upgrading in case you need to downgrade. : http://wiki.apache.org/solr/Solr1.4 -Hoss
Re: how to get the autocomplete feature in solr 1.4?
: how to get the autocomplete/autosuggest feature in the solr1.4.plz give me : the code also... there is no magical "one size fits all" solution for autocomplete in Solr. If you look at the archives there have been lots of discussions about different ways to get autocomplete functionality, using things like the TermsComponent, or the LukeRequestHandler, and there are lots of examples of using the SolrJS javascript functionality to populate an autocomplete box -- but you'll have to figure out what solution works best for your goals. -Hoss
Re: Disable coord
: Thanks for your reply. Nested boolean queries is a valid concern. I also : realized that isCoordDisabled needs to be considered in : BooleanQuery.hashCode so that a query with coord=false will have a different : cache key in Solr. Hmmm... you're right, BooleanQuery.hashCode doesn't consider disableCoord. that's a nasty bug... https://issues.apache.org/jira/browse/LUCENE-2092 -Hoss
Spellcheck: java.lang.RuntimeException: java.io.IOException: read past EOF
Hello, Solr 1.3 reported the following error when our app tried to query it: java.lang.RuntimeException: java.io.IOException: read past EOF at org.apache.solr.spelling.IndexBasedSpellChecker.build(IndexBasedSpellChecker.java:91) at org.apache.solr.handler.component.SpellCheckComponent.prepare(SpellCheckComponent.java:108) . I noticed that there were about 54 segments.* files under the spellcheck directory. The way I resolved the problem was by going into the spellcheck directory and deleting all the files in it. I then issued a curl command to rebuild the spellcheck index (I also did a full-import and reload of the main index, to be safe.) When this error occurred, our solrconfig.xml had spellcheck.build set to true. This was a configuration error on our part. I was wondering if the spellcheck index being re-built for each query could have caused the above exception to occur. Kindly clarify. Thanks, Ranjit.
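P.S. For reference, the rebuild curl was along these lines (the handler and dictionary names depend on your solrconfig.xml): curl "http://localhost:8983/solr/select?q=*:*&spellcheck=true&spellcheck.build=true&spellcheck.dictionary=default"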
help with dataimport delta query
Hi,

I have Solr all working nicely, except I'm trying to get deltas to work on my data import handler.

Here is a simplification of my data import config. I have a table called "Book" which has categories; I'm doing subqueries for the category info and calling a javascript helper. This all works perfectly for the regular query. I added these lines for the delta stuff:

deltaImportQuery="SELECT f.id,f.title FROM Book f f.id='${dataimporter.delta.job_jobs_id}'"
deltaQuery="SELECT id FROM `Book` WHERE fm.inMyList=1 AND lastModifiedDate > '${dataimporter.last_index_time}'"

Basically I'm trying to select rows whose lastModifiedDate is newer than the last index (or delta index). I run:

http://localhost:8983/solr/dataimport?command=delta-import

And it says in the logs:

Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DataImporter doDeltaImport
INFO: Starting Delta Import
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
INFO: Read dataimport.properties
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Starting delta collection.
Nov 23, 2009 2:33:02 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={command=delta-import} status=0 QTime=0
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Running ModifiedRowKey() for Entity: category
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed ModifiedRowKey for Entity: category rows obtained : 0
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed DeletedRowKey for Entity: category rows obtained : 0
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed parentDeltaQuery for Entity: category
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Running ModifiedRowKey() for Entity: item
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed ModifiedRowKey for Entity: item rows obtained : 0
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed DeletedRowKey for Entity: item rows obtained : 0
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed parentDeltaQuery for Entity: item
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Delta Import completed successfully
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder execute
INFO: Time taken = 0:0:0.21

But the browser says no documents were added/modified (even though one record in the db is a match).

Is there a way to turn on debugging so I can see the queries the DIH is sending to the db? Any other ideas of what I could be doing wrong?

thanks
Joel

<entity name="item" ... deltaQuery="SELECT id FROM `Book` WHERE fm.inMyList=1 AND lastModifiedDate > '${dataimporter.last_index_time}'">
  <entity name="category" transformer="script:SplitAndPrettyCategory"
          query="select fc.bookId, group_concat(cr.name) as categoryName, ... from BookCat fc where fc.bookId = '${item.id}' AND ... group by fc.bookId">
Re: Oddness with Phrase Query
: ?q="Here there be dragons" : &qt=dismax : &qf=title ... : +DisjunctionMaxQuery((title:"here dragon")~0.01) () ...the quotes cause the entire string to be passed to the analyzer for the title field and the resulting Tokens are used to construct a phrase query. : ?q=Here there be dragons : &qt=dismax : &qf=title ... : +((DisjunctionMaxQuery((title:here)~0.01) : DisjunctionMaxQuery((title:dragon)~0.01))~2) () ...the lack of quotes just results in two term queries, that must be anywhere in the string. : It looks like it might be related to ... : http://issues.apache.org/jira/browse/SOLR-879 : : Although I added enablePositionIncrements="true" to : : : : in to the for in the : schema which didn't fix it - I presume this means that I have to reindex : everything (although the StopFilterFactory in : already had it). ...hmm, you shouldn't have to reindex everything. arey ou sure you restarted solr after making the enablePositionIncrements="true" change to the query analyzer? what do the offsets look like when you go to analysis.jsp and past in that sentence? the other thing to consider: you can increase the slop value on that phrase query (to allow looser matching) using the "qs" param (query slop) ... that could help in this situation (stop words getting striped out of hte query) as well as other situations (ie: what if the user just types "here be dragons" -- with or without stop words) -Hoss
Re: Huge load and long response times during search
In addition to some of the other comments mentioned about IO, this caught my eye... : I'm using SOLR(1.4) to search among about 3,500,000 documents. After the : server kernel was updated to 64bit system has started to suffer. ...if the *only* thing that was upgraded was switching the kernel from 32bit to 64bit, then perhaps you are getting bit by Java now using 64 bit pointers instead of 32 bit pointers, causing a lot more RAM to be eaten up by the pointers? It's not something I've done a lot of testing on, but I've heard other people claim that it can cause some serious problems if you don't actually need 64bit pointers for accessing huge heaps. ...that said, you should really double check what exactly changed when your server was upgraded ... perhaps the upgrade included a new filesystem type, or changes to RAID settings, or even hardware changes ... if your problems started when an upgrade took place, then looking into what exactly changed during the upgrade should be your first step. -Hoss
Re: how to get the autocomplete feature in solr 1.4?
Chris Hostetter wrote: : how to get the autocomplete/autosuggest feature in the solr1.4.plz give me : the code also... there is no magical "one size fits all" solution for autocomplete in Solr. If you look at the archives there have been lots of discussions about different ways to get autocomplete functionality, using things like the TermsComponent, or the LukeRequestHandler, and there are lots of examples of using the SolrJS javascript functionality to populate an autocomplete box -- but you'll have to figure out what solution works best for your goals. Also, take a look at SOLR-1316; there are patches there that implement such a component using prefix trees. -- Best regards, Andrzej Bialecki <>< ___. ___ ___ ___ _ _ __ [__ || __|__/|__||\/| Information Retrieval, Semantic Web ___|||__|| \| || | Embedded Unix, System Integration http://www.sigram.com Contact: info at sigram dot com
Re: Embedded solr with third party libraries
: distribution. When we run test cases our schema.xml has a definition for lucid
: kstem and it throws a ClassNotFound Exception.
: We declared the dependency for the two jars lucid-kstem.jar and
: lucid-solr-kstem.jar but still it throws an error.

explain what you mean by "declared the dependency" ?

: C:\DOCUME~1\username\LOCALS~1\Temp\solr-all\0.8194571792905493\solr\conf\schema.xml
:
: Now in order for the jar to be loaded should i copy the two jars to the solr/lib
: directory. is that the default location embedded solr looks into for some
: default jars.

assuming "C:\DOCUME~1\username\LOCALS~1\Temp\solr-all\0.8194571792905493\solr" is your solr home dir, then yes you can copy your jars into "C:\DOCUME~1\username\LOCALS~1\Temp\solr-all\0.8194571792905493\solr\lib" and that should work ... or starting in Solr 1.4 you can use the <lib> directive to specify a jar anywhere on disk. see the example solrconfig.xml for the syntax.

-Hoss
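For reference, the Solr 1.4 <lib> directive in solrconfig.xml looks along these lines (the path below is illustrative):

<config>
  <lib dir="/path/to/your/plugin/jars" />
  ...
</config>

Every jar in the named directory is added to Solr's plugin classloader, so lucid-kstem.jar and lucid-solr-kstem.jar could live outside the solr home entirely.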
Re: How to use DataImportHandler with ExtractingRequestHandler?
Anyone any idea?

javaxmlsoapdev wrote:
>
> did you extend DIH to do this work? can you share code samples? I have a
> similar requirement where I need to index database records, and each record
> has a column with a document path, so I need to create another index for
> documents (we allow users to search both indexes separately) in parallel
> with reading some metadata of the documents from the database as well. I have
> all sorts of different document formats to index. I am on solr 1.4.0. Any
> pointers would be appreciated.
>
> Thanks,
Re: Announcing the Apache Solr extension in PHP - 0.9.0
Thanks Israel, exactly what I was looking for, but how would one get a pre-compiled dll for windows? using PHP 5.3 VS9 TS.

On Mon, Oct 5, 2009 at 7:03 AM, Israel Ekpo wrote:
> Fellow Apache Solr users,
>
> I have been working on a PHP extension for Apache Solr in C for quite
> some time now.
>
> I just finished testing it and I have completed the initial user level
> documentation of the API.
>
> Version 0.9.0-beta has just been released.
>
> It already has built-in readiness for Solr 1.4.
>
> If you are using Solr 1.3 or later in PHP, I would appreciate it if you could
> check it out and give me some feedback.
>
> It is very easy to install on UNIX systems. I am still working on the build
> for windows. It should be available for Windows soon.
>
> http://solr.israelekpo.com/manual/en/solr.installation.php
>
> A quick list of some of the features of the API:
> - Built in serialization of Solr Parameter objects.
> - Reuse of HTTP connections across repeated requests.
> - Ability to obtain input documents for possible resubmission from query
> responses.
> - Simplified interface to access server response data (SolrObject)
> - Ability to connect to Solr server instances secured behind HTTP
> Authentication and proxy servers
>
> The following components are also supported
> - Facets
> - MoreLikeThis
> - TermsComponent
> - Stats
> - Highlighting
>
> Solr PECL Extension Homepage
> http://pecl.php.net/package/solr
>
> Some examples are available here
> http://solr.israelekpo.com/manual/en/solr.examples.php
>
> Interim Documentation Page until refresh of official PHP documentation
> http://solr.israelekpo.com/manual/en/book.solr.php
>
> The C source is available here
> http://svn.php.net/viewvc/pecl/solr/
>
> --
> "Good Enough" is not good enough.
> To give anything less than your best is to sacrifice the gift.
> Quality First. Measure Twice. Cut Once.
RE: UTF-8 Character Set not specified on OutputStreamWriter in StreamingUpdateSolrServer
: Specifying the file.encoding did work, although I don't think it is a
: suitable workaround for my use case. Any idea what my next step is to
: have a bug opened?

no, you shouldn't *have* to specify -Dfile.encoding=UTF8; Shalin was just asking you to try that to verify that really was the extent of the problem.

I created a bug to track this...

https://issues.apache.org/jira/browse/SOLR-1595

-Hoss
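For anyone hitting this before the fix lands: the underlying trap is the one-argument OutputStreamWriter constructor, which silently uses the platform default encoding. A generic illustration of the fix (not the actual patch to StreamingUpdateSolrServer):

import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

class Utf8Writers {
    // Wrong: new OutputStreamWriter(out) -- the encoding then depends on
    // -Dfile.encoding / the OS locale, not on what the XML header declares.
    // Right: name the charset explicitly so the update XML really is UTF-8.
    static Writer utf8Writer(OutputStream out) {
        return new OutputStreamWriter(out, Charset.forName("UTF-8"));
    }
}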
Re: creating Lucene document from an external XML file.
: If I understand you correctly, you really want to be constructing
: SolrInputDocuments (not Lucene's Documents) and indexing those with
: SolrJ. I don't think there is anything in the API that can read in an

I read your question differently than Otis did. My understanding is that you already have code that builds up files in the "<add><doc>...</doc></add>" update message syntax solr expects, but you want to modify those documents (w/o changing your existing code).

one possibility to think about is that instead of modifying the documents before sending them to Solr, you could write an UpdateProcessor that runs directly in Solr and gets access to those Documents after Solr has already parsed that XML (or even if the documents come from someplace else, like DIH, or a CSV file) and then make your changes.

If Otis and i have *both* misunderstood your question, please clarify.

-Hoss
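To make the UpdateProcessor suggestion concrete, here is a rough, untested sketch against the Solr 1.4 APIs; the class name and the field it adds are invented for illustration:

import java.io.IOException;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;
import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

public class StampDocProcessorFactory extends UpdateRequestProcessorFactory {
  @Override
  public UpdateRequestProcessor getInstance(SolrQueryRequest req,
      SolrQueryResponse rsp, UpdateRequestProcessor next) {
    return new UpdateRequestProcessor(next) {
      @Override
      public void processAdd(AddUpdateCommand cmd) throws IOException {
        // the document as parsed from the <add><doc> XML (or DIH, CSV, ...)
        SolrInputDocument doc = cmd.solrDoc;
        doc.setField("indexed_at", new java.util.Date().toString());
        super.processAdd(cmd); // hand the modified doc down the chain
      }
    };
  }
}

It would be wired in via an updateRequestProcessorChain in solrconfig.xml whose last entries are solr.LogUpdateProcessorFactory and solr.RunUpdateProcessorFactory.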
Complex multi-value boosting
Guys --

What schema would you use for 500K docs, each with 0-30 different category ids, each category carrying its own weight and completely overriding the default scoring?

For example, these documents:

A: 1:0.21, 2:0.41, 3:0.15 ...
B: 1:0.18, 2:0.65, 4:0.98 ...
C: 6:0.75 ...
D: 2:0.14 ...

When searching "1" I'd like document A to appear first (it has 0.21), and when searching "1 || 2" I'd like document B to appear first (it has an aggregate score of 0.83 vs. 0.62). Currently I do this with full-text search after artificially repeating each category id in proportion to its weight (i.e. "1" would appear 21 times in a text field) - is there a better way?

Best,

-- Michael
RE: Multi word synonym problem
: The response is not searching for Michael Jackson. Instead it is
: searching for (text:Micheal and text:Jackson). To monitor the parsed
: query, i turned on debugQuery, but in the present case, the parsed query
: string was searching Micheal and Jackson separately.

using index time synonyms isn't going to have any effect on how your query is parsed. the Lucene/Solr query parsers use whitespace as "markup" and will still analyze each of the "words" in your input separately and build up a boolean query containing each of your words individually (the only way to change that is to use quotes to force "phrase query" behavior where everything in quotes is analyzed as one chunk, or pick a different query parser like the "field" parser)

...but none of that changes the point of *why* you can/should use index time synonyms for situations like this. the point of doing that is that at index time the alternate versions of the multi-word sequences can all be expanded and all variants are put in the index ... so it doesn't matter if you use a phrase query, or term queries, all of the synonyms are in the indexed document.

-Hoss
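Concretely, an index-time setup for this case might look like the following sketch. In synonyms.txt:

Micheal Jackson, Michael Jackson

and in the index analyzer only (not the query analyzer) of the relevant field type in schema.xml:

<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
          ignoreCase="true" expand="true"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>

With expand="true" both spellings end up in the index, so either a phrase query or two term queries will match. Note that any change to synonyms.txt requires reindexing the affected documents.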
Re: Announcing the Apache Solr extension in PHP - 0.9.0
Hi Mike,

Thanks to Pierre, the Windows version of the extension is available here, compiled from trunk r291135:

http://downloads.php.net/pierre/

I am planning to have 0.9.8 compiled for windows as soon as it is out, sometime later this week. The 1.0 release should be out sometime before mid December, after the API is finalized and tested.

You can always check the project home page for news about upcoming releases
http://pecl.php.net/package/solr

The documentation is available here
http://www.php.net/manual/en/book.solr.php

Cheers

On Mon, Nov 23, 2009 at 3:28 PM, Michael Lugassy wrote:
> Thanks Israel, exactly what I was looking for, but how would one get a
> pre-compiled dll for windows? using PHP 5.3 VS9 TS.

--
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
Re: Question about the message "Indexing failed. Rolled back all changes."
This is definitely a bug. Please open a JIRA issue for this.

On Sat, Nov 21, 2009 at 10:53 AM, Bertie Shen wrote:
> Hey,
>
> I figured out why we always see "Indexing failed. Rolled back all changes."
> It is because we need a dataimport.properties file at conf/, into which
> indexing will write the last indexing time. Without that file, SolrWriter.java
> will throw an exception and Solr will produce this misleading "Indexing
> failed. Rolled back all changes." output, although indexing actually
> completed successfully.
>
> I think we need to improve this functionality, or at least the documentation.
>
> There is one more thing that we need to pay attention to, i.e. we need to
> make dataimport.properties writable by other users, otherwise
> last_index_time will not be written and the error message may still be
> there.
>
> On Fri, Nov 13, 2009 at 9:35 AM, yountod wrote:
>
>> The process initially completes with:
>>
>> 2009-11-13 09:40:46
>> Indexing completed. Added/Updated: 20 documents. Deleted 0 documents.
>>
>> ...but then it fails with:
>>
>> 2009-11-13 09:40:46
>> Indexing failed. Rolled back all changes.
>> 2009-11-13 09:41:10
>> 2009-11-13 09:41:10
>> 2009-11-13 09:41:10
>>
>> I think it may have something to do with this, which I found by using the
>> DataImport.jsp:
>>
>> (Thread.java:636) Caused by: java.sql.SQLException: Illegal value for
>> setFetchSize(). at
>> com.mysql.jdbc.Statement.setFetchSize(Statement.java:1864) at
>> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:242)
>> ... 28 more
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Question-about-the-message-%22Indexing-failed.-Rolled-back-all--changes.%22-tp26242714p26340360.html
>> Sent from the Solr - User mailing list archive at Nabble.com.

--
Lance Norskog
goks...@gmail.com
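Until that is fixed, a workaround (paths are illustrative; adjust to your solr home and the user your servlet container runs as) is simply to pre-create the file with write permission for that user:

touch /path/to/solr/conf/dataimport.properties
chown tomcat6 /path/to/solr/conf/dataimport.properties

DIH only needs to be able to read and rewrite that one file to record last_index_time.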
Re: Output all, from one field
: Do you want to return just one field from all documents? If yes, you can:
:
: 1. Query with q=*:*&fl=name
: 2. Use TermsComponent - http://wiki.apache.org/solr/TermsComponent

note that those are very different creatures ... #1 gives you all of the stored values for every document. #2 gives you all of the indexed terms (several of which may have come from a single indexed value)

-Hoss
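For instance, against the example server (the field name is illustrative):

http://localhost:8983/solr/select?q=*:*&fl=name&rows=100
http://localhost:8983/solr/terms?terms.fl=name&terms.limit=-1

The first returns the name field as it was stored, one value per document; the second walks the index and returns each distinct analyzed term, so a stored value like "Apache Solr" may come back as the two terms "apache" and "solr". terms.limit=-1 asks TermsComponent for all terms rather than the default 10.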
Re: Boost document based on field length
: > I would like to boost documents with longer descriptions, to move down documents with 0-length descriptions.
: > I'm wondering if it is possible to boost a document based on the field length while searching, or is the only way to store the field length as an int in a separate field while indexing?
:
: Override the default Similarity (see the end of the schema.xml file)
: with your own Similarity implementation and then in that class override
: the lengthNorm() method.

I think I'm reading the question differently than Grant -- his suggestion applies when you are searching in the description field, and don't want documents with shorter descriptions to score higher when the same terms match the same number of times (the default behavior of lengthNorm).

my understanding is that you want documents that don't have a description to score lower than documents that do -- and you might be querying against completely different fields (description might not even be indexed).

in that case there is no easy way to achieve this with just the description field ... the easy thing to do is to index a boolean "has_description" field and then incorporate that into your query (or as the input to a function query)

-Hoss
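A sketch of that approach (field names and boost values are illustrative): index a boolean alongside each document,

<field name="has_description">true</field>

and then, e.g. with dismax, nudge documents that have one via a boost query:

q=laptop&defType=dismax&qf=title+body&bq=has_description:true^1.5

Documents without a description still match; they just score relatively lower. Storing a description_length int field and feeding it to a function query is the variant to reach for if the boost should grow with the length rather than be all-or-nothing.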
Re: [N to M] range search out of sum of field. howto search this?
: fq={!frange l=5 u=10}sum(user,num)

Hmm, one of us massively misunderstood the original question - and I'm pretty sure it's Yonik. I don't think he wants results where the user field plus the num field are in the range of 5-10 ... I think he wants the list of user ids (which are numbers in his examples, but could just as easily be strings) where the sum of the "num" fields across all documents that have the same value in the "user" field falls within the range.

I can't think of any easy way to do that ... it isn't the kind of thing an Inverted Index is particularly good at. but maybe there's something in the Field Collapsing patch (searching the archives/wiki will bring up pointers) that can filter on stats like this?

: On Mon, Nov 23, 2009 at 8:49 AM, Julian Davchev wrote:
: > Hi folks,
: > I got documents like
: > user:1 num:5
: > user:1 num:8
: > user:5 num:7
: > user:5 num:1
: >
: > I'd like to get each user that matches a sum of num in the range 5 to 10.
: > In this case it should return user 5, as 7+1=8 and that is within the range.
: > User 1 will be false cause the sum of num is 5+8=13, hence outside the range 5 to 10.

-Hoss
Re: Announcing the Apache Solr extension in PHP - 0.9.0
Thanks Israel

I plan to try it and compare with rsolr

On Nov 23, 2009, at 2:28 PM, Michael Lugassy wrote:

> Thanks Israel, exactly what I was looking for, but how would one get a
> pre-compiled dll for windows? using PHP 5.3 VS9 TS.
Re: access denied to solr home lib dir
: Check. I even verified that the tomcat user could create the
: directory (i.e. "sudo -u tomcat6 mkdir /opt/solr/steve/lib"). Still
: solr complains.

Note that you have an AccessControlException, not a simple FileNotFoundException ... the error here is coming from File.canRead (when Solr is asking if it has permission to read the file), but your ServletContainer evidently has a security policy in place that prevents solr from even checking (if the security policy allowed it to check, then it would return true/false based on the actual file permissions)...

http://java.sun.com/j2se/1.4.2/docs/api/java/io/File.html#canRead%28%29

  Tests whether the application can read the file denoted by this abstract pathname.
  Returns: true if and only if the file specified by this abstract pathname exists
  and can be read by the application; false otherwise
  Throws: SecurityException - If a security manager exists and its
  SecurityManager.checkRead(java.lang.String) method denies read access to the file

...note that Tomcat doesn't have any special SecurityManager settings that prevent this by default. something about your tomcat deployment must be specifying specific security permission rules.

: >> Caused by: java.security.AccessControlException: access denied
: >> (java.io.FilePermission /opt/solr/steve/./lib read)
: >> at java.security.AccessControlContext.checkPermission(AccessControlContext.java:323)
: >> at java.security.AccessController.checkPermission(AccessController.java:546)
: >> at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
: >> at java.lang.SecurityManager.checkRead(SecurityManager.java:871)
: >> at java.io.File.canRead(File.java:689)
: >> at org.apache.solr.core.SolrResourceLoader.replaceClassLoader(SolrResourceLoader.java:157)
: >> at org.apache.solr.core.SolrResourceLoader.addToClassLoader(SolrResourceLoader.java:128)
: >> at org.apache.solr.core.SolrResourceLoader.<init>(SolrResourceLoader.java:97)
: >> at org.apache.solr.core.SolrResourceLoader.<init>(SolrResourceLoader.java:195)
: >> at org.apache.solr.core.Config.<init>(Config.java:93)
: >> at org.apache.solr.servlet.SolrDispatchFilter.<init>(SolrDispatchFilter.java:65)
: >> ... 40 more

-Hoss
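If the deployment does run Tomcat with the security manager enabled (e.g. started with -security), the usual fix is a grant in catalina.policy along these lines (paths here follow this poster's setup and are otherwise illustrative):

grant codeBase "file:${catalina.base}/webapps/solr/-" {
    permission java.io.FilePermission "/opt/solr/steve/lib", "read";
    permission java.io.FilePermission "/opt/solr/steve/lib/-", "read";
};

The "/-" form covers the directory's contents recursively, while the bare path entry covers the directory itself, which is what File.canRead is being asked about in this stack trace.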
Re: Complex multi-value boosting
On Mon, Nov 23, 2009 at 3:39 PM, Michael Lugassy wrote:
> Guys --
>
> What schema would you use for 500K docs, each with 0-30 different category
> ids, each category carrying its own weight and completely overriding the
> default scoring?
> ...

It sounds to me like you want to use payloads (the same issue I had recently):

http://old.nabble.com/Customizing-Field-Score-%28Multivalued-Field%29-tp26182254p26182254.html

That thread has some details on the eventual implementation I chose. Let me know if you have any questions. Note that I did use the scoring as a boost, not "completely overriding the default scoring", but I think the impact is basically the same, and I was satisfied it was good enough.

--
Stephen Duncan Jr
www.stephenduncanjr.com
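For the record, the index-time half of a payload approach in Solr 1.4 can be as small as this field type (a sketch; the type name is illustrative):

<fieldType name="payloads" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/>
  </analyzer>
</fieldType>

so a field value like "1|0.21 2|0.41 3|0.15" indexes each category id as a term carrying its weight as a float payload. The query half is the part Solr 1.4 does not give you out of the box: you need a custom QParserPlugin that builds Lucene PayloadTermQuery objects, plus a Similarity whose scorePayload returns the decoded float, which is what the thread linked above walks through.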
ExternalRequestHandler and ContentStreamUpdateRequest usage
Following code is from my test case where it tries to index a file (of type .txt):

ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
up.addFile(fileToIndex);
up.setParam("literal.key", "8978"); //key is the uniqueId
up.setParam("ext.literal.docName", "doc123.txt");
up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
server.request(up);

The test case doesn't give me any error and "I think" it's indexing the file? but when I search for a text (which was part of the .txt file) the search doesn't return me anything.

Following is the config from solrconfig.xml where I have mapped content to the "description" field (the default search field) in the schema:

<requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="fmap.content">description</str>
    <str name="defaultField">description</str>
  </lst>
</requestHandler>

Clearly it seems I am missing something. Any idea?

Thanks,
Re: Embedded solr with third party libraries
To deploy the Lucid KStem stemmer, copy these two files:

lucid-kstem.jar
lucid-solr-kstem.jar

to the lib/ directory in your running solr instance.

In the declaration for a text field, you would change the stemmer filter line from the Porter stemmer to the Lucid KStem filter factory. (Remember that you have to make this change in both the index and query analyzer sections of the fieldType specification.)

Now, to verify the change, restart solr and go to the analysis.jsp test page:

http://localhost:8983/solr/admin/analysis.jsp

Let's say you changed the 'text' type and left 'textTight' using PorterStemmer. Change the Field name/type drop-down to 'type' and type 'text' in the top box. Now type 'changing' in the "Field Value" box and click 'Analyze'. The bottom of the page will now show that 'changing' was stemmed to 'change'. If you change the field type from 'text' to 'textTight' and try again, 'changing' will be stemmed to 'chang' by the original PorterStemmer.

On Mon, Nov 23, 2009 at 12:23 PM, Chris Hostetter wrote:
> assuming "C:\DOCUME~1\username\LOCALS~1\Temp\solr-all\0.8194571792905493\solr"
> is your solr home dir, then yes you can copy your jars into
> "C:\DOCUME~1\username\LOCALS~1\Temp\solr-all\0.8194571792905493\solr\lib"
> and that should work ... or starting in Solr 1.4 you can use the <lib>
> directive to specify a jar anywhere on disk. see the example
> solrconfig.xml for the syntax.

--
Lance Norskog
goks...@gmail.com
Re: ExternalRequestHandler and ContentStreamUpdateRequest usage
On Nov 23, 2009, at 5:04 PM, javaxmlsoapdev wrote:
>
> Following code is from my test case where it tries to index a file (of type .txt):
>
> ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
> up.addFile(fileToIndex);
> up.setParam("literal.key", "8978"); //key is the uniqueId
> up.setParam("ext.literal.docName", "doc123.txt");
> up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
> server.request(up);
>
> The test case doesn't give me any error and "I think" it's indexing the file?
> but when I search for a text (which was part of the .txt file) the search
> doesn't return me anything.

What do your logs show? Else, what does Luke show, or what does a *:* query return (assuming this is the only file you added)?

Also, I don't think you need ext.literal anymore, just literal.

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
Re: Oddness with Phrase Query
On Mon, Nov 23, 2009 at 12:10:42PM -0800, Chris Hostetter said:
> ...hmm, you shouldn't have to reindex everything. are you sure you
> restarted solr after making the enablePositionIncrements="true" change to
> the query analyzer?

Yup - definitely restarted

> what do the offsets look like when you go to analysis.jsp and paste in that
> sentence?

org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt, ignoreCase=true, enablePositionIncrements=true}

term position      1      4
term text          Here   Dragons
term type          word   word
source start,end   0,4    14,21
payload

> the other thing to consider: you can increase the slop value on that
> phrase query (to allow looser matching) using the "qs" param (query slop)
> ... that could help in this situation (stop words getting stripped out of
> the query) as well as other situations (ie: what if the user just types
> "here be dragons" -- with or without stop words)

After fiddling with the position increments stuff I upped the query slop to 2, which seems to now provide better results, but I'm worried about that affecting relevancy elsewhere (which I presume is the reason why it's not the default value). If that's the case - is it worth writing something for my app so that if it detects a phrase query with lots of stop words it ups the phrase slop?

Either way it seems to be working now - thanks for all the help,

Simon
Re: Announcing the Apache Solr extension in PHP - 0.9.0
Sweeet! you guys rock.

On Mon, Nov 23, 2009 at 11:12 PM, Thanh Doan wrote:
> Thanks Israel
>
> I plan to try it and compare with rsolr
Re: ExternalRequestHandler and ContentStreamUpdateRequest usage
*:* returns me 1 count, but when I search for a specific word (which was part of the .txt file I indexed before) it doesn't return me anything. I don't have Luke set up on my end; let me see if I can set that up quickly, but otherwise do you see anything I am missing in the solrconfig mapping or something that maps the document "content" to the wrong attribute?

thanks,

Grant Ingersoll-6 wrote:
>
> What do your logs show? Else, what does Luke show, or what does a *:* query
> return (assuming this is the only file you added)?
>
> Also, I don't think you need ext.literal anymore, just literal.
>
Re: ExternalRequestHandler and ContentStreamUpdateRequest usage
FYI: weirdly, it's returning me the following when I run rsp.getResults().get(0).getFieldValue("description")

[702, text/plain, doc123.txt, ]

so it seems like it's storing the up.setParam("ext.literal.docName", "doc123.txt") value into "description", rather than the file content. Any idea?

Thanks,
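One thing worth checking (a guess, assuming the Solr 1.4 Solr Cell parameter names): if the handler's defaults aren't being picked up, the extracted body and stream metadata may be landing in unexpected fields. You can force the mapping per request from SolrJ:

up.setParam("fmap.content", "description"); // map Tika's extracted body text
up.setParam("uprefix", "ignored_");         // shunt unmapped metadata fields aside (needs a matching ignored_* dynamicField)

Also check that "description" is declared multiValued or receives only one value, since a multi-valued list like [702, text/plain, doc123.txt, ...] is exactly what stream metadata piling into one field looks like.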
Webinar: An Introduction to Basics of Search and Relevancy with Apache Solr hosted by Lucid Imagination
In this introductory technical presentation, renowned search expert Mark Bennett, CTO of Search Consultancy New Idea Engineering, will present practical tips and examples to help you quickly get productive with Solr, including: * Working with the "web command line" and controlling your inputs and outputs * Understanding the DISMAX parser * Using the Explain output to tune your results relevance * Using the Schema browser Wednesday, December 2, 2009 11:00am PST / 2:00pm EST Click here to sign up: http://www.eventsvc.com/lucidimagination/120209?trk=WR-DEC2009-AP
Re: help with dataimport delta query
got to love it when yahoo thinks your own mail is spam. anyone have any ideas how to get logging to work with 1.4? I went to the admin panel and set all logging to finest. In my jetty stdout I see no SQL for any of the dataimport handler runs. I see

Nov 23, 2009 9:26:27 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Time taken for getConnection(): 6
Nov 23, 2009 9:26:32 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Creating a connection for entity category with URL: jdbc:mysql://localhost/feeddb
Nov 23, 2009 9:26:32 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Time taken for getConnection(): 5

but no sql. From looking at the source, it looks like it should be logging the sql if I'm in debug mode.

any ideas? I think I am losing my mind. my full import works, but the delta does nothing.

thanks
Joel

On Nov 23, 2009, at 2:49 PM, Joel Nylund wrote:

> Hi, I have solr all working nicely, except I'm trying to get deltas to work
> on my data import handler ...
Re: ExternalRequestHandler and ContentStreamUpdateRequest usage
On Nov 23, 2009, at 5:33 PM, javaxmlsoapdev wrote:
>
> *:* returns me 1 count, but when I search for a specific word (which was part
> of the .txt file I indexed before) it doesn't return me anything. I don't have
> Luke set up on my end.

http://localhost:8983/solr/admin/luke should give you some info.

> let me see if I can set that up quickly, but otherwise do you see anything I
> am missing in the solrconfig mapping or something?

What does your schema look like, and how are you querying?

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
Implementing phrase autopop up
hello all

Let me first explain the task I am trying to do. I have an article with a title, for example:

Car Insurance for Teenage Drivers - A Total Loss?

If a user begins to type "car insu" I want the autocomplete popup to show the entire phrase. There are two ways to implement this. The first is to use the TermsComponent, and the other is to use a field whose field type uses the solr.EdgeNGramFilterFactory filter.

I started with the TermsComponent: I declared a terms request handler and gave the following query

http://localhost:8080/solr/terms?terms.fl=title&terms.prefix=car

The issue is that it's not giving the entire phrase; it gives me back results like car, caravan, carbon. Now I know that using terms.prefix will only give me results where the term starts with car. On top of this, I also want that if a word like car appears somewhere in the middle of the title, that should also show up in the popup, very much like google, where a word need not be at the beginning but can be anywhere in the middle of the title.

The question is whether TermsComponent is a good candidate, or whether to use a custom field, say named autoPopupText, with a field type configured with the needed filters including EdgeNGramFilterFactory, copying the title to the autoPopupText field and using that to power the popup.

The other thing is that using EdgeNGramFilterFactory is more of an index-time decision: when you index a document you need to know which fields to copy to autoPopupText, whereas with the TermsComponent you can define at query time which fields to fetch completions from.

Any idea what's best, and why the TermsComponent is not giving me the entire phrase I mentioned earlier? FYI, my title field is of type text.

Thanks
darniz
auto-completion preview?
Hello Solr users, is there a live demo of the auto-completion feature somewhere? thanks in advance paul
Re: help with dataimport delta query
I guess the field names do not match: in the deltaQuery you are selecting the field id, but in the deltaImportQuery you use it as ${dataimporter.delta.job_jobs_id}. I guess it should be ${dataimporter.delta.id}.

On Tue, Nov 24, 2009 at 1:19 AM, Joel Nylund wrote:
> Hi, I have solr all working nicely, except I'm trying to get deltas to work
> on my data import handler.
>
> I added these lines for the delta stuff:
>
> deltaImportQuery="SELECT f.id,f.title FROM Book f
>     f.id='${dataimporter.delta.job_jobs_id}'"
> deltaQuery="SELECT id FROM `Book` WHERE fm.inMyList=1 AND
>     lastModifiedDate > '${dataimporter.last_index_time}'"
>
> basically I'm trying to select rows whose lastModifiedDate is newer than the
> last index (or delta index).
>
> But the browser says no documents added/modified (even though one record in
> the db is a match)
>
> Is there a way to turn on debugging so I can see the queries the DIH is
> sending to the db?
>
> Any other ideas of what I could be doing wrong?
>
> thanks
> Joel

--
- Noble Paul | Principal Engineer | AOL | http://aol.com
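For completeness, a sketch of what the corrected entity might look like. Assumptions: the primary key column is id (DIH's delta support also wants it named in a pk attribute), the WHERE keyword missing from the original deltaImportQuery is restored, and the unmatched fm. alias is dropped from deltaQuery:

<entity name="item" pk="id"
        query="SELECT f.id, f.title FROM Book f WHERE f.inMyList=1"
        deltaImportQuery="SELECT f.id, f.title FROM Book f
                          WHERE f.id='${dataimporter.delta.id}'"
        deltaQuery="SELECT id FROM `Book` WHERE inMyList=1 AND
                    lastModifiedDate > '${dataimporter.last_index_time}'">

With the column named id, DIH exposes each changed row to deltaImportQuery as ${dataimporter.delta.id}, which is why the two queries have to agree on the key name.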
Re: auto-completion preview?
On Tue, Nov 24, 2009 at 10:39 AM, Paul Libbrecht wrote: > > Hello Solr users, > > is there a live demo of the auto-completion feature somewhere? > > thanks in advance > Well, there is no preview but I can give you a couple of live instances: 1. http://autos.aol.com/ 2. http://travel.aol.com/ Try typing something into the top most search box. -- Regards, Shalin Shekhar Mangar.
Re: Implementing phrase autopop up
On Tue, Nov 24, 2009 at 10:12 AM, darniz wrote:
> If a user begins to type "car insu" I want the autocomplete popup to show
> the entire phrase.
> ...
> Any idea what's best, and why the TermsComponent is not giving me the
> entire phrase I mentioned earlier? FYI, my title field is of type text.

You are using a tokenized field type with TermsComponent, therefore each word in your phrase gets indexed as a separate token. You should use a non-tokenized type (such as a string type) with TermsComponent. However, this will only let you search by prefix and not by words in the middle of the phrase.

Your best bet here would be to use EdgeNGramFilterFactory. If your index is very large, you can consider doing a prefix search on shingles too.

--
Regards,
Shalin Shekhar Mangar.
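A field type along these lines is a common starting point (a sketch; the type name and gram sizes are illustrative):

<fieldType name="autocomplete" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

With KeywordTokenizerFactory the whole title is one token, so "car insu" matches titles that start that way, and the stored field gives the full phrase back for display. Swapping in WhitespaceTokenizerFactory (or adding a ShingleFilterFactory before the edge n-grams) is the usual route to also match words in the middle of the title.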
Re: schema-based Index-time field boosting
On 23.11.2009 19:33 Chris Hostetter wrote:
> ...if there was a way to boost fields at index time that was configured in
> the schema.xml, then every doc would get that boost on its instances of
> those fields but the only purpose of index time boosting is to indicate
> that one document is more significant than another doc -- if every doc
> gets the same boost, it becomes a No-OP.
>
> (think about the math -- field boosts become multipliers in the fieldNorm
> -- if every doc gets the same multiplier, then there is no net effect)

Coming in a bit late, but I would like a variant that is not a No-OP. Think of something like

title:searchstring^10 OR catch_all:searchstring

Of course I can always add the boosting at query time, but it would make life easier if I could define a default boost in the schema, so that my query could just be

title:searchstring OR catch_all:searchstring

but still get the boost for the title field.

Taking this further, it would be even better if it was possible to define one (or more) fallback field(s) with an associated boost factor in the schema. Then it would be enough to query for

title:searchstring

and it would be automatically expanded to e.g.

title:searchstring^10 OR title_other_language:searchstring^5 OR catchall:searchstring

or whatever you define in the schema.

-Michael
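Until something like that exists in the schema, the closest approximation is to bake the boosts into request handler defaults in solrconfig.xml, e.g. with dismax (field names and weights below are just placeholders echoing the example above):

<requestHandler name="search" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">title^10 title_other_language^5 catch_all</str>
  </lst>
</requestHandler>

Clients then send a bare q=searchstring and the configured per-field boosts apply to every query, giving the define-it-once-centrally effect without any schema support.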
Re: Spellcheck: java.lang.RuntimeException: java.io.IOException: read past EOF
On Tue, Nov 24, 2009 at 1:14 AM, ranjitr wrote:
>
> Hello,
>
> Solr 1.3 reported the following error when our app tried to query it:
>
> java.lang.RuntimeException: java.io.IOException: read past EOF
>   at org.apache.solr.spelling.IndexBasedSpellChecker.build(IndexBasedSpellChecker.java:91)
>   at org.apache.solr.handler.component.SpellCheckComponent.prepare(SpellCheckComponent.java:108)
> .

Can you post the complete stack trace (i.e. the underlying exception's stack trace as well)?

> When this error occurred, our solrconfig.xml had spellcheck.build set to
> true. This was a configuration error on our part. I was wondering if the
> spellcheck index being re-built for each query could have caused the above
> exception to occur.

I don't know. Rebuilding the index for each query is not a good idea anyway.

--
Regards,
Shalin Shekhar Mangar.
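If the goal of the per-query build was just to keep the spelling index fresh, a sketch of the usual alternative (assuming Solr 1.4's SpellCheckComponent options; the field name is illustrative) is to rebuild on commit instead:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

That way spellcheck.build only ever needs to be sent once by hand, if at all, and never rides along on ordinary queries.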