Re: Solr and OpenPipe
Hi Espen!

I tried to follow the getting-started guide on the OpenPipe site, but the
Maven build for the intranet example doesn't generate the jar with
dependencies. What are the current dependencies of OpenPipe?

2008/4/4, Espen Amble Kolstad <[EMAIL PROTECTED]>:
>
> Hi,
>
> I'm one of the developers of the initial version of OpenPipe.
>
> We are currently using OpenPipe with Solr to index the Norwegian and
> English Wikipedia.
>
> Anything in particular you want to know?
>
> - Espen
>
> > From: "Rogerio Pereira" <[EMAIL PROTECTED]>
> > Date: 2. april 2008 23.00.32 GMT+02:00
> > To: solr-user@lucene.apache.org
> > Subject: Solr and OpenPipe
> > Reply-To: solr-user@lucene.apache.org
> >
> > Hi!
> >
> > Has anybody been working with Solr and OpenPipe?

--
Yours truly (Atenciosamente),

Rogério (_rogerio_)
http://faces.eti.br

"Make a difference! Help your country grow: don't hold back knowledge,
share it and learn more." (http://faces.eti.br/?p=45)
why don't all stored fields show up?
I have about 20 stored fields of types string, text, and int, but only
about 10 fields show up when I query for them, whether I use fl=*,score or
list them out. What's my problem? How do I retrieve all of the fields?
Thanks.
Re: why don't all stored fields show up?
On Fri, Apr 4, 2008 at 9:25 AM, Hung Huynh <[EMAIL PROTECTED]> wrote:
> I have about 20 stored fields in string, text, and int, but only about 10
> fields show up when I query for them, whether I do fl=*,score or list them
> out. What's my problem? How do I retrieve all of fields? Thanks.

You should be getting back all stored fields for every document. Documents
will only show the fields they have (fields are sparse; it's not like a DB
table).

-Yonik
RE: why don't all stored fields show up?
Do you think it might be a problem with my schema and data loading? I
loaded a CSV with 39 fields and didn't get any error message. I have a
total of 39 stored fields, but not all of them are reported back when I
query for them. Should I reload the index? Is there a way for me to check
whether the index has all 39 fields?

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Friday, April 04, 2008 10:48 AM
To: solr-user@lucene.apache.org
Subject: Re: why don't all stored fields show up?

On Fri, Apr 4, 2008 at 9:25 AM, Hung Huynh <[EMAIL PROTECTED]> wrote:
> I have about 20 stored fields in string, text, and int, but only about 10
> fields show up when I query for them, whether I do fl=*,score or list them
> out. What's my problem? How do I retrieve all of fields? Thanks.

You should be getting back all stored fields for every document. Documents
will only show the fields they have (fields are sparse; it's not like a DB
table).

-Yonik
Re: why don't all stored fields show up?
On Fri, Apr 4, 2008 at 11:57 AM, Hung Huynh <[EMAIL PROTECTED]> wrote:
> Do you think it might be a problem with my schema and data loading?

Maybe.

> I loaded CSV with 39 fields and didn't get any error message. I have a
> total of 39 stored fields, but not all of them are reported back when I
> query for them.

Try to tackle it by getting more specific. Look at a single row in the CSV,
and query for the id of that document in the index and see what's missing.
Check the schema for those missing fields. Try to replicate the problem
with another CSV file with just that single record.

If you still can't figure it out, give us the following info:
- the URL used to load the CSV data
- the single-record CSV file
- the result of querying for that single record
- your schema

-Yonik
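As a tiny sketch of the "see what's missing" step above: once you have
parsed the fields of a single returned document, you can diff them against
the stored fields your schema declares. The helper and all field names
below are made up for illustration; this is not part of Solr itself.

```python
def missing_stored_fields(returned_doc, stored_fields):
    """Return, sorted, the stored fields that did not come back for one document."""
    return sorted(f for f in stored_fields if f not in returned_doc)


# Stored fields declared in schema.xml (example names)
stored_fields = {"guid", "sku", "price", "title"}

# Fields actually present on one document in the query response
returned_doc = {"guid": "1", "title": "widget"}

# These are the fields to go check in the schema and the source CSV row
print(missing_stored_fields(returned_doc, stored_fields))  # ['price', 'sku']
```

If the missing list is non-empty for a document whose CSV row clearly had
values in those columns, the schema definition of those fields (stored vs.
indexed, name mismatches) is the first place to look.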
Single Core Can't Find the solrconfig.xml file
Hi,

I tried setting up a single-core application and I get the following error,
which claims it can't find solrconfig.xml, yet the file is located under
solr/conf/solrconfig.xml in my application. Thanks!

type: Status report

message: Severe errors in solr configuration. Check your log files for more
detailed information on what may be wrong. If you want solr to continue
after configuration errors, change:
<abortOnConfigurationError>false</abortOnConfigurationError>
in solrconfig.xml

java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in classpath or 'solr/conf/', cwd=/home/kirber/Desktop/tomcat-solr
    at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:168)
    at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:136)
    at org.apache.solr.core.Config.<init>(Config.java:97)
    at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:108)
    at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:65)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:88)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
    at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
    at org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
    at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:825)
    at org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:714)
    at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490)
    at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1138)
    at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
    at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
    at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
    at org.apache.catalina.core.StandardHost.start(StandardHost.java:736)
    at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
    at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
    at org.apache.catalina.core.StandardService.start(StandardService.java:448)
    at org.apache.catalina.core.StandardServer.start(StandardServer.java:700)
    at org.apache.catalina.startup.Catalina.start(Catalina.java:552)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295)
    at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433)
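The error says Solr is resolving 'solr/conf/' relative to the working
directory (cwd=/home/kirber/Desktop/tomcat-solr), so one likely cause is
that the Solr home was never set explicitly. Under Tomcat, a common fix is
a context fragment that sets the solr/home JNDI variable; the paths below
are guesses based on the cwd in the stack trace, so adjust them to your
layout:

```xml
<!-- e.g. $CATALINA_HOME/conf/Catalina/localhost/solr.xml (paths are examples) -->
<Context docBase="/home/kirber/Desktop/tomcat-solr/solr.war"
         debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="/home/kirber/Desktop/tomcat-solr/solr"
               override="true"/>
</Context>
```

Alternatively, passing -Dsolr.solr.home=/path/to/solr to the JVM that
starts Tomcat should have the same effect.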
Re: numDocs and maxDoc
: Thanks hossman, this is exactly what I want to do.
: Final question: so I need to merge the field by myself first? (Actually my
: original plan is to do 2 consecutive postings so merging is possible)

You need to send Solr whole documents with all the fields in them. If you
send another doc with the same value for the uniqueKey field, it will
replace the previous doc.

-Hoss
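A minimal illustration of that replace-on-add behavior (the field names
are made up, and "id" is assumed to be the uniqueKey field):

```xml
<!-- First add -->
<add>
  <doc>
    <field name="id">42</field>
    <field name="title">first version</field>
    <field name="price">9.99</field>
  </doc>
</add>

<!-- A later add with the same uniqueKey replaces the whole document,
     so every field must be resent, not just the changed ones -->
<add>
  <doc>
    <field name="id">42</field>
    <field name="title">second version</field>
    <field name="price">7.99</field>
  </doc>
</add>
```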
Re: Multiple unique field?
: When I set 2 unique key fields, it looks like Solr only accepts the first
: definition in schema.xml... question: so once the uniqueKey is defined, it
: can't be overridden?

There is one and only one uniqueKey field ... trying to declare two should
probably be an error (anyone want to submit a patch?), but you definitely
can't "override" the uniqueKey field ... declare it once, and that's what
it is for your whole index.

-Hoss
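For reference, a minimal sketch of the single uniqueKey declaration in
schema.xml (the field name "id" is just an example):

```xml
<fields>
  <field name="id" type="string" indexed="true" stored="true"/>
</fields>

<!-- Exactly one uniqueKey is honored per schema -->
<uniqueKey>id</uniqueKey>
```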
Re: Date range performance
On 3-Apr-08, at 4:24 PM, Jonathan Ariel wrote:

> Is this dependent on the number of documents that match the query or the
> number of documents in the index?

This aspect is more dependent on the number of terms that the date query
translates into.

> If in a 3 million document index my query matches 4, could having dates
> with a precision of seconds slow down the query?

Yes. Solr does range queries by taking the disjunction of a bunch of term
queries, so it is the total number of terms checked that is the limiting
factor.

It would be better to implement this using an ordered index that could be
binary-searched, but Solr isn't currently designed for that (though I
think range optimization algorithms would be a cool addition).

-Mike

On Thu, Apr 3, 2008 at 7:45 PM, Mike Klaas <[EMAIL PROTECTED]> wrote:
> On 3-Apr-08, at 2:14 PM, Jonathan Ariel wrote:
>
> > Hi,
> > I'm experiencing really poor performance when using date ranges in a
> > Solr query. Is it a known issue? Is there any special consideration
> > when using date ranges? It seems weird because I always thought dates
> > are translated to strings, so internally Lucene resolves everything the
> > same way. So maybe the problem is with parsing the dates and
> > translating them to the internal value? Any suggestion?
>
> Range query is highly dependent on the total number of unique terms
> covered by the range. If you are indexing dates with very high precision
> (e.g., milliseconds), this can consist of ridiculous numbers of terms.
>
> Try rounding the dates to something more granular when indexing.
>
> -Mike
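As a minimal sketch of the "round to something more granular" advice: if
timestamps are truncated to day precision before indexing, a year of data
produces at most ~365 unique date terms instead of potentially one per
document. The helper name below is made up; it just reformats the
Solr-style UTC timestamp string.

```python
from datetime import datetime


def round_to_day(solr_date: str) -> str:
    """Truncate a Solr-style UTC timestamp to day precision.

    Fewer distinct values means far fewer unique terms in the index,
    which is what keeps range queries over the field cheap.
    """
    dt = datetime.strptime(solr_date, "%Y-%m-%dT%H:%M:%SZ")
    return dt.strftime("%Y-%m-%dT00:00:00Z")


# Second-precision input collapses to one term per day
print(round_to_day("2008-04-03T16:24:51Z"))  # -> 2008-04-03T00:00:00Z
```

The same idea applies at hour or minute granularity if day precision is
too coarse for the queries you need.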
Re: solr commit command questions
On 3-Apr-08, at 10:04 AM, oleg_gnatovskiy wrote:

> Hello. I was wondering what happens when an add command is done without a
> commit command. Is there any way to roll back?

No, there isn't (unless you've taken a snapshot of the index using
snapshooter). The main problem is that there is no way to "undelete" a
document in Lucene, so this might be impossible until Lucene has more
transaction support.

-Mike
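As a sketch of the snapshot approach mentioned above, solrconfig.xml can
run snapshooter automatically after every commit via a postCommit listener
(the dir path is an example and depends on where the scripts live in your
Solr home):

```xml
<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">snapshooter</str>
  <str name="dir">solr/bin</str>
  <bool name="wait">true</bool>
</listener>
```

With snapshots in place, "rolling back" amounts to restoring the last
snapshot, not undoing individual adds.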
Re: Date range performance
Thanks! I'll try reducing the precision and let you know the result.

Looking into the code, it seems like a Lucene problem more than a Solr one.
It is in the RangeQuery and RangeFilter classes. The problem with changing
this to use a sorted index and then binary-searching it is that you have
to sort it, which is slow. Unless we can store the ordered index somewhere
and reuse it, it will be even slower than now. And if we store it, we will
have to face the problem of updating the ordered index with new terms.

On Fri, Apr 4, 2008 at 3:30 PM, Mike Klaas <[EMAIL PROTECTED]> wrote:
> On 3-Apr-08, at 4:24 PM, Jonathan Ariel wrote:
>
> > Is this dependent on the number of documents that match the query or
> > the number of documents in the index?
>
> This aspect is more dependent on the number of terms that the date query
> translates into.
>
> > If in a 3 million document index my query matches 4, could having dates
> > with a precision of seconds slow down the query?
>
> Yes. Solr does range queries by taking the disjunction of a bunch of term
> queries, so it is the total number of terms checked that is the limiting
> factor.
>
> It would be better to implement this using an ordered index that could be
> binary-searched, but Solr isn't currently designed for that (though I
> think range optimization algorithms would be a cool addition).
>
> -Mike
>
> > On Thu, Apr 3, 2008 at 7:45 PM, Mike Klaas <[EMAIL PROTECTED]> wrote:
> > > On 3-Apr-08, at 2:14 PM, Jonathan Ariel wrote:
> > >
> > > > Hi,
> > > > I'm experiencing really poor performance when using date ranges in
> > > > a Solr query. Is it a known issue? Is there any special
> > > > consideration when using date ranges? It seems weird because I
> > > > always thought dates are translated to strings, so internally
> > > > Lucene resolves everything the same way. So maybe the problem is
> > > > with parsing the dates and translating them to the internal value?
> > > > Any suggestion?
> > >
> > > Range query is highly dependent on the total number of unique terms
> > > covered by the range. If you are indexing dates with very high
> > > precision (e.g., milliseconds), this can consist of ridiculous
> > > numbers of terms.
> > >
> > > Try rounding the dates to something more granular when indexing.
> > >
> > > -Mike
RE: why don't all stored fields show up?
Thanks for spending time on this issue. I removed most of the fields, and
it's still not working:

http://localhost:8983/solr/update/csv?commit=true&separator=|&escape=\&stream.file=exampledocs/test1.txt

test1.txt content:

guid|sku
1|ABC001

Query:

http://localhost:8983/solr/select/?q=guid%3A1&version=2.2&start=0&rows=10&indent=on&fl=*,score

Output:

0 0 *,score on 0 guid:1 2.2 10 0.71231794 1 2008-04-04T19:35:44.427Z

Schema: guid is the unique numeric field.

Thanks,
Hung

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Friday, April 04, 2008 12:02 PM
To: solr-user@lucene.apache.org
Subject: Re: why don't all stored fields show up?

On Fri, Apr 4, 2008 at 11:57 AM, Hung Huynh <[EMAIL PROTECTED]> wrote:
> Do you think it might be a problem with my schema and data loading?

Maybe.

> I loaded CSV with 39 fields and didn't get any error message. I have a
> total of 39 stored fields, but not all of them are reported back when I
> query for them.

Try to tackle it by getting more specific. Look at a single row in the CSV,
and query for the id of that document in the index and see what's missing.
Check the schema for those missing fields. Try to replicate the problem
with another CSV file with just that single record.

If you still can't figure it out, give us the following info:
- the URL used to load the CSV data
- the single-record CSV file
- the result of querying for that single record
- your schema

-Yonik
Re: Date range performance
: Looking into the code it seems like a Lucene problem, more than Solr. It is
: in the RangeQuery and RangeFilter classes. The problem with changing this to
: have a sorted index and than binary search is that you have to sort it,
: which is slow. Unless we can store the ordered index somewhere and reuse it,
: it will be even slower than now. And if we store it, we will have to face
: the problem with updating ordered index with new terms.

FWIW: Lucene term enumeration is already indexed, it's just not a binary
search tree (the details escape me at the moment, but there is an interval
value of N somewhere in the code, and every Nth term is loaded into memory
so a TermEnum.seek can skip ahead N terms at a time). But the number of
unique terms can be a bottleneck ... rounding to the level of precision
you absolutely need can save you in these cases by reducing the number of
unique terms.

-Hoss
Merging Solr index
Hi,

http://wiki.apache.org/solr/MergingSolrIndexes recommends using the Lucene
contrib app IndexMergeTool to merge two Solr indexes. What happens if both
indexes have records with the same unique key? Will they both go into the
new index?

Is the implementation of unique IDs in the Solr Java code or in Lucene? If
it is in Solr, how would I hack up a Solr IndexMergeTool?

Cheers,
Lance Norskog
Re: solr commit command questions
So, what is the point of the commit?

oleg_gnatovskiy wrote:
>
> Hello. I was wondering what happens when an add command is done without a
> commit command. Is there any way to roll back?
admin.jsp java.lang.NoSuchFieldError
I have been testing our Solr homes and applications with Solr 1.3, using
builds I do from the SVN trunk. All of our code runs fine with Solr 1.2. I
am running Solr under Tomcat 5.5.26 using JNDI. With Solr 1.3, Tomcat comes
up clean; however, if you hit the admin index page, you get the following
exception:

Apr 4, 2008 12:03:10 PM org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet jsp threw exception
java.lang.NoSuchFieldError: config
    at org.apache.jsp.admin.index_jsp._jspService(index_jsp.java:88)

I have the IBM developerWorks sample app running under the same Tomcat
instance with the war file I am building, and the admin page for that
instance does not throw an exception. I have been able to reproduce the
behavior with nightly builds. Any ideas on what I might check to resolve
this? Thanks.

-Rick
Re: solr commit command questions
On 04/04/2008, at 20:24, oleg_gnatovskiy wrote:

> So, what is the point of the commit?

I've always thought about it... this should have been named "flush", as it
is in Xapian. It has nothing to do with database commits, and the data
will end up in the index one way or the other.

--
Leonardo Santagada
Re: solr commit command questions
On 4-Apr-08, at 4:24 PM, oleg_gnatovskiy wrote:

> So, what is the point of the commit?

It makes the data you have updated since the last commit visible to the
searchers.

-Mike
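Relatedly, if issuing explicit commits is inconvenient, solrconfig.xml can
be configured to commit automatically once enough updates have accumulated.
A sketch (the thresholds are examples, not recommendations):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- commit after 10,000 buffered docs or 60 s, whichever comes first -->
    <maxDocs>10000</maxDocs>
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>
```

Each auto-commit makes the pending adds and deletes visible to searchers,
just like an explicit commit.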