Solr Authentication Problem
Hi All, I am getting an error when using authentication in Solr. I followed the wiki. The error does not appear when I am searching. Below is the code snippet and the error. Please note I am using a Solr 1.4 development build from SVN.

HttpClient client = new HttpClient();
AuthScope scope = new AuthScope(AuthScope.ANY_HOST, AuthScope.ANY_PORT, null, null);
client.getState().setCredentials(scope, new UsernamePasswordCredentials("guest", "guest"));
SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr", client);

SolrInputDocument doc1 = new SolrInputDocument();
// Add fields to the document
doc1.addField("employeeid", "1237");
doc1.addField("employeename", "Ann");
doc1.addField("employeeunit", "etc");
doc1.addField("employeedoj", "1995-11-31T23:59:59Z");
server.add(doc1);

Exception in thread "main" org.apache.solr.client.solrj.SolrServerException: org.apache.commons.httpclient.ProtocolException: Unbuffered entity enclosing request can not be repeated.
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:468)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:242)
    at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:259)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:63)
    at test.SolrAuthenticationTest.<init>(SolrAuthenticationTest.java:49)
    at test.SolrAuthenticationTest.main(SolrAuthenticationTest.java:113)
Caused by: org.apache.commons.httpclient.ProtocolException: Unbuffered entity enclosing request can not be repeated.
    at org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:487)
    at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2114)
    at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:415)
    ... 5 more

Thanks and regards,
Allahbaksh
Termscomponent and filter queries
Hi *, currently the terms component does not support filter queries. However, without them the returned counts for the terms might differ from the actual results the user gets when conducting a search with a suggested word and the (automatically) applied filter queries. So, are there any plans to add filter query support to the terms component?

best
Ingo

--
Ingo Renner
TYPO3 Core Developer, Release Manager TYPO3 4.2
howto understand solr stats
Hi,
Where can I read about understanding Solr stats? I got this in the cache section, but it isn't telling me much:

lookups : 149272
hits : 135267
hitratio : 0.90
inserts : 14018
evictions : 13506
size : 512
warmupTime : 0
cumulative_lookups : 7188459
cumulative_hits : 5429817
cumulative_hitratio : 0.75
cumulative_inserts : 1758642
cumulative_evictions : 812185
Re: Auto suggest.. how to do mixed case
On Fri, Jun 19, 2009 at 12:50 PM, Ian Holsman wrote:
> I've noticed that one of the new features in Solr 1.4 is the TermsComponent which enables the autosuggest.

TermsComponent *can* be used for autosuggest, though I don't think that was the original motivation. In the end it is just the same thing as a prefix query, but it returns the indexed tokens only rather than the stored field values. I think that by naming it as /autoSuggest, a lot of users have been misled, since there are other techniques available.

> but what puzzles me is how to actually use it in an application.
>
> most autosuggests are case insensitive, so there is no difference if I type in 'San Francisco' or 'san francisco'.
>
> now I've tried with a 'text' field, and a 'string' field with no joy. with String providing the best result, but still with case sensitivity.
>
> at the moment I'm using a custom field type
>
> <fieldType name="..." class="solr.TextField" sortMissingLast="true" omitNorms="true">
>   <analyzer>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> which converts the whole field to lower case, which allows me to submit the query as lower case and get good results.
>
> so the point of the email is to find out how do I get the autosuggest to return mixed case results, and not require me to lower case the query before I send it?

There is no way to do this right now using TermsComponent. You can index lower-cased terms and store the mixed-case terms. Then you can use a prefix query, which will return documents (and hence stored field values).

--
Regards,
Shalin Shekhar Mangar.
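A minimal schema sketch of the approach Shalin describes — index lower-cased tokens but store the original casing; the type and field names here are illustrative:

<fieldType name="suggest_lc" class="solr.TextField" omitNorms="true">
  <analyzer>
    <!-- keep the whole value as one token, then lower-case it for matching -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="name_suggest" type="suggest_lc" indexed="true" stored="true"/>

A prefix query such as q=name_suggest:san* then matches case-insensitively, and the returned documents carry the stored, mixed-case value "San Francisco".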
Re: Auto suggest.. how to do mixed case
On 22.06.2009, at 11:09, Shalin Shekhar Mangar wrote:

Hi Shalin,

> I think that by naming it as /autoSuggest, a lot of users have been misled since there are other techniques available.

what would you suggest?

Ingo

--
Ingo Renner
TYPO3 Core Developer, Release Manager TYPO3 4.2
Re: Auto suggest.. how to do mixed case
On Mon, Jun 22, 2009 at 2:55 PM, Ingo Renner wrote:
>
> Hi Shalin,
>
>> I think that by naming it as /autoSuggest, a lot of users have been misled since there are other techniques available.
>
> what would you suggest?

There are many techniques. Personally, I've used

1. Prefix search on shingles
2. Exact (phrase) search on n-grams

The regular prefix search also works. The good thing with these is that you can apply filter queries, and returning a different stored value is also possible.

--
Regards,
Shalin Shekhar Mangar.
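A sketch of the first technique, assuming Solr 1.4's ShingleFilterFactory (field and type names are illustrative):

<fieldType name="suggest_shingle" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- emit word n-grams ("shingles") such as "san francisco" alongside single words -->
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true"/>
  </analyzer>
</fieldType>

One way to use it is faceting with facet.prefix set to the user's input; because facet counts are computed over the documents matching q and fq, filter queries constrain the suggestions:

http://localhost:8983/solr/select?q=*:*&fq=type:city&rows=0&facet=true&facet.field=suggest_shingle&facet.prefix=san%20fr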
Re: Auto suggest...
I'm not sure I'm fully understanding this thread.

On the one hand it speaks about tuning the appropriate analyzer to get mixed-case matching... this part I am not addressing, and I zapped that part of the subject.

On the other hand it seems to speak about an auto-suggestion facility? Is this http://wiki.apache.org/solr/SolrJS ? That page doesn't describe much of the server interface (e.g. the field types, the type of queries, how to fuzzify them). Are there other such plans in Solr?

If that may be useful, we have such an auto-completion with GWT under APL at http://i2geo.net/ where we intend to move to Solr soon.

paul

On 22 Jun 2009, at 13:11, Shalin Shekhar Mangar wrote:

> On Mon, Jun 22, 2009 at 2:55 PM, Ingo Renner wrote:
>> Hi Shalin,
>>
>>> I think that by naming it as /autoSuggest, a lot of users have been misled since there are other techniques available.
>>
>> what would you suggest?
>
> There are many techniques. Personally, I've used
>
> 1. Prefix search on shingles
> 2. Exact (phrase) search on n-grams
>
> The regular prefix search also works. The good thing with these is that you can apply filter queries, and returning a different stored value is also possible.
>
> --
> Regards,
> Shalin Shekhar Mangar.
Re: Auto suggest...
On Mon, Jun 22, 2009 at 4:55 PM, Paul Libbrecht wrote:
> I'm not sure I'm fully understanding this thread.
>
> On the one hand it speaks about tuning the appropriate analyzer to get mixed-case matching... this part I am not addressing, and I zapped that part of the subject.
>
> On the other hand it seems to speak about an auto-suggestion facility? Is this http://wiki.apache.org/solr/SolrJS ?

No. In the past the TermsComponent was defined in the example solrconfig.xml as /autoSuggest, which seems to suggest that it is *the* way to get auto-suggest support in Solr. This is what I was referring to when I said that users may have been misled by this.

--
Regards,
Shalin Shekhar Mangar.
multi-word synonyms with multiple matches
We have a field with index-time synonyms called "title". Among the entries in the synonyms file are:

vp,vice president
svp,senior vice president

However, a search for "vp" does not return results where the title is "senior vice president". It appears that the term "vp" is not indexed when there is a longer string that matches a different synonym. Is this by design, and is there any way to make Solr index all synonyms that match a term, even if it is contained in a longer synonym?

Thanks!
-Ken
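The SynonymFilter matches the longest rule starting at a given position, so "senior vice president" is consumed whole before the "vice president" rule can fire. One possible workaround (a sketch, not the only option) is to make the longer rule emit the shorter synonyms explicitly:

# hypothetical synonyms.txt entries: the explicit mapping for the longer
# phrase also emits the shorter forms, so a query for "vp" matches
# documents whose title was "senior vice president"
vp,vice president
svp,senior vice president => svp,senior vice president,vp,vice president

Since these are index-time synonyms, a reindex is required after changing the file.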
Data Import Handler
After setting up a working Solr 1.3 example with a Tomcat 6 container, I have been trying to figure out the Data Import Handler so I can work with a MySQL database. However, after following the guidelines at http://wiki.apache.org/solr/DataImportHandler#head-b3518c890e46befa05c9242c8fc329517c1ea61b, I end up with the following message displayed in my browser when I go to http://localhost:8080/solr/dataimport:

HTTP Status 404 - /solr/dataimport
type: Status report
message: /solr/dataimport
description: The requested resource (/solr/dataimport) is not available.
Apache Tomcat/6.0.20

I have tried creating a dataimport directory in the hopes that /solr/dataimport would work like /solr/admin, and I put the dataimport.jsp file into this directory, but I still receive the same error message. When trying to go to http://localhost:8080/solr/admin/dataimport.jsp, I see two frames, the left frame having what I think I am supposed to see in order to deliver commands to the handler, and the right frame having the same error message as before.

Is there something I am doing wrong? Does anyone know of a clearer set of guidelines I might be able to use? [Google hasn't pointed me to any as of yet.]
Re: Data Import Handler
On Mon, Jun 22, 2009 at 7:52 PM, Mukerjee, Neiloy (Neil) < neil.muker...@alcatel-lucent.com> wrote: > After setting up a working Solr 1.3 example with a Tomcat 6 container, I > have been trying to figure out the Data Import Handler so I can work with a > MySQL database. However, after following the guidelines at > http://wiki.apache.org/solr/DataImportHandler#head-b3518c890e46befa05c9242c8fc329517c1ea61b, > I end up with the following message displayed in my browser when I go to > http://localhost:8080/solr/dataimport: > > HTTP Status 404 - /solr/dataimport > type Status report > message /solr/dataimport > description The requested resource (/solr/dataimport) is not available. > Apache Tomcat/6.0.20 > That usually means that DataImportHandler is not registered at /dataimport in solrconfig.xml. Another reason might be that you are using multiple solr cores? Did you restart solr after changing the solrconfig.xml ? -- Regards, Shalin Shekhar Mangar.
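For reference, a minimal registration sketch for solrconfig.xml (the config file name is whatever you call your data-config file):

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

With this in place and Solr restarted, http://localhost:8080/solr/dataimport?command=status should answer instead of returning a 404.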
spellcheck. limit the suggested words by some field
Hi, I have built a spellcheck dictionary based on the name field. It works like a charm, but I'd like to limit the returned suggestions. For example, we have the following structure:

id  name    type
1   Berlin  city
2   bergan  phony

So when I search for suggested words for "ber" I get both Berlin and bergan, but I somehow want to limit the suggestions to only those of type city. I tried with fq=type:city but this didn't help either. Any pointers are more than welcome.

The other approach would be making different spellcheck dictionaries based on type and just using the specific dictionary, but then again I didn't see an option for how to build a dictionary based on type.

Thanks.
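One sketch of the per-type dictionary idea, with hypothetical names: populate a separate field (say name_city) only for documents whose type is city — e.g. from your indexing code, since copyField cannot be conditional — and build a dedicated spellchecker on it:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">city</str>
    <!-- hypothetical field that only city documents carry -->
    <str name="field">name_city</str>
    <str name="spellcheckIndexDir">./spellchecker_city</str>
  </lst>
</searchComponent>

Requesting spellcheck.dictionary=city then draws suggestions only from city names. The spellcheck index itself has no notion of fq, which is why the filter query had no effect.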
Re: ExtractRequestHandler - not properly indexing office docs?
Yep, I've tried both of those and still no joy. Here's both my curl statement and the resulting Solr log output.

curl http://localhost:8983/solr/update/extract?ext.def.fl=text\&ext.literal.id=1\&ext.map.div=text\&ext.capture=div -F "myfi...@dj_character.doc"

Curl's output:

<int name="status">0</int><int name="QTime">317</int>

Solr log:

Jun 22, 2009 12:21:42 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/update/extract params={ext.map.div=text&ext.def.fl=text&ext.capture=div&ext.literal.id=1} status=0 QTime=544
Jun 22, 2009 12:22:26 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[1]} 0 317
Jun 22, 2009 12:22:26 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/update/extract params={ext.map.div=text&ext.def.fl=text&ext.capture=div&ext.literal.id=1} status=0 QTime=317
Jun 22, 2009 12:22:37 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/select params={wt=standard&rows=10&start=0&explainOther=&hl.fl=&indent=on&q=kondel&fl=*,score&qt=standard&version=2.2} hits=0 status=0 QTime=2

The submitted document has "kondel" in it numerous times, so Solr should have a hit. Yet it returns nothing. I also made sure I committed, but that didn't seem to help either.

Grant Ingersoll-6 wrote:
>
> Do you have a default field declared? &ext.default.fl=
> Either that, or you need to explicitly capture the fields you are
> interested in using &ext.capture=
>
> You could add this to your curl statement to try out.
>
> -Grant
>

--
View this message in context: http://www.nabble.com/ExtractRequestHandler---not-properly-indexing-office-docs--tp24120125p24150763.html
Sent from the Solr - User mailing list archive at Nabble.com.
RE: Slowness during submit the index
No VM.

-----Original Message-----
From: Bruno [mailto:brun...@gmail.com]
Sent: Saturday, June 20, 2009 10:10 PM
To: solr-user@lucene.apache.org
Subject: Re: Slowness during submit the index

We were having performance issues using servers running on VM. Are you running QA or Prod in a VM?

2009/6/21, Stephen Weiss :
> Isn't it possible that the production equipment is simply under much higher load (given that, since it's in production, your various users are all actually using it), vs the QA equipment, which is only in use by the people doing QA?
>
> We've found the same thing at one point - we had a very small index (< 4 rows), so small it didn't seem worth the effort to do delta updates. So we would just refresh the whole thing every time - or so we planned. In the test environment it updated within a minute. In production, it would take as long as 15 minutes. What we finally realized was, because the DB was under much higher load in production than in the test environment, especially considering the amount of joins that needed to take place to pull out the data properly, various writes from the users to the affected tables would slow down the data selection process dramatically as the indexer would have to wait for locks to clear. Now of course we do delta updates and everything's fine (and blazingly fast in both environments).
>
> Try simulating higher load (involving a "normal" amount of writes to the DB) against your QA equipment and then building the index. See if the QA equipment still runs so quickly.
>
> --
> Steve
>
> On Jun 20, 2009, at 11:29 PM, Otis Gospodnetic wrote:
>
>> Hi Francis,
>>
>> I can't tell what the problem is from the information you've provided so far. My gut instinct is that this is due to some difference in QA vs. PROD environments that isn't Solr-specific.
>>
>> Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>
>> - Original Message
>>> From: Francis Yakin
>>> To: "solr-user@lucene.apache.org"
>>> Sent: Saturday, June 20, 2009 2:18:07 AM
>>> Subject: RE: Slowness during submit the index
>>>
>>> The amount of data in Prod is about 20% more than QA. We tested the network speed and it is fine. The hardware in Prod is larger and more powerful than QA. But QA is faster during reload: it takes QA only one hour versus 6 hours in Prod.
>>>
>>> That's why we don't understand the reason; the data is only 20% more, and 20% more data should not make it 5 times slower.
>>>
>>> So we looked into the config file for Solr, but it's not much different, except Prod has a master/slave environment while QA has only a master.
>>>
>>> Thanks for the response.
>>>
>>> Francis
>>>
>>> -----Original Message-----
>>> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
>>> Sent: Friday, June 19, 2009 8:58 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Slowness during submit the index
>>>
>>> Francis,
>>>
>>> So it could easily be that your QA and PROD DBs are really just simply different (different amount of data, different network speed, different hardware...)
>>>
>>> Otis
>>> --
>>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>>
>>> - Original Message
>>>> From: Francis Yakin
>>>> To: "solr-user@lucene.apache.org"
>>>> Sent: Friday, June 19, 2009 10:39:48 PM
>>>> Subject: RE: Slowness during submit the index
>>>>
>>>> * is the java version the same on both machines (QA vs. PROD) - YES
>>>> * are the same java parameters being used on both machines - YES
>>>> * is the connection to the DB the same on both machines - Not sure, need to ask the network guy
>>>> * are both the PROD and QA DB servers the same and are both DB instances the same - they are not from the same DB
>>>>
>>>> -----Original Message-----
>>>> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
>>>> Sent: Friday, June 19, 2009 6:23 PM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Re: Slowness during submit the index
>>>>
>>>> Francis,
>>>>
>>>> I'm not sure if I understood your email correctly, but I think you are saying you are indexing your DB content into a Solr index. If this is correct, here are things to look at:
>>>> * is the java version the same on both machines (QA vs. PROD)
>>>> * are the same java parameters being used on both machines
>>>> * is the connection to the DB the same on both machines
>>>> * are both the PROD and QA DB servers the same and are both DB instances the same
>>>> ...
>>>>
>>>> Otis
>>>> --
>>>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>>>
>>>> - Original Message
>>>>> From: Francis Yakin
>>>>> To: "solr-user@lucene.apache.org"
>>>>> Sent: Friday, June 19, 2009 5:27:59 PM
>>>>> Subject: Slowness during submit the index
>>>>>
>>>>> We are experiencin
Re: howto understand solr stats
Julian,

Explanations below.

--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message
> From: Julian Davchev
> To: solr-user@lucene.apache.org
> Sent: Monday, June 22, 2009 5:01:12 AM
> Subject: howto understand solr stats
>
> Hi
> Where can I read about understanding solr stats.
> I got this in cache section but kinda not talking too much to me.
>
> lookups : 149272

lookups for a given cache key/value

> hits : 135267

cache hits - successful lookups - value found for the key

> hitratio : 0.90

hit/miss ratio - 0.90 is pretty good.

> inserts : 14018

number of items inserted

> evictions : 13506

number of items evicted - this looks high - your cache is likely too small

> size : 512

number of entries currently in the cache - 512 looks smallish

> warmupTime : 0

time taken to warm up the new cache when a new searcher is opened

> cumulative_lookups : 7188459
> cumulative_hits : 5429817
> cumulative_hitratio : 0.75
> cumulative_inserts : 1758642
> cumulative_evictions : 812185

cumulative/aggregate numbers over the whole/current lifespan of the Solr instance/JVM.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Re: Sorlj when to commit?
Hi,

If you don't need the searcher to see index changes (new docs) during your indexing, just wait until you are done and commit/optimize at the end.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message
> From: pof
> To: solr-user@lucene.apache.org
> Sent: Monday, June 22, 2009 2:31:53 AM
> Subject: Sorlj when to commit?
>
> Hi, I am doing a large batch (thousands) of insertions to my index using an
> EmbeddedSolrServer. I was wondering how often should I use server.commit()
> as I am trying to avoid unnecessary bottlenecks.
>
> Thanks, Brett.
> --
> View this message in context:
> http://www.nabble.com/Sorlj-when-to-commit--tp24142326p24142326.html
> Sent from the Solr - User mailing list archive at Nabble.com.
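A minimal SolrJ sketch of that pattern — batch the adds and commit once at the end (the batch size of 1000 is an arbitrary choice here):

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BulkIndexer {
    // Add documents in batches and defer the commit to the very end,
    // so no intermediate searchers are opened during the run.
    public static void indexAll(SolrServer server, List<SolrInputDocument> docs) throws Exception {
        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (SolrInputDocument doc : docs) {
            batch.add(doc);
            if (batch.size() >= 1000) {  // flush every 1000 docs; tune as needed
                server.add(batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            server.add(batch);
        }
        server.commit();  // single commit once everything is in
    }
}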
RE: Data Import Handler
I am not using multiple Solr cores, but I hadn't restarted after making changes to the solrconfig file or adding a data-config file, so I did that and got a "severe errors" warning in my browser, with the below text in my logs. When I delete the data-config file and remove the DataImportHandler section from the solrconfig file, I restart and see Solr running fine (although, of course, without the data import handler), and when I go in and repeat the process, I get the same errors.

I suspect that the fact that the data-config file is blank is causing these issues, but per the documentation on the website, there is no indication of what, if anything, should go there - is there an alternate resource that anyone knows of which I could use?

Jun 22, 2009 1:07:48 PM org.apache.solr.handler.dataimport.DataImportHandler inform
SEVERE: Exception while loading DataImporter
org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context
    at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:165)
    at org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:99)
    at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:97)
    at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:415)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:572)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:128)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
    at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:108)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3800)
    at org.apache.catalina.core.StandardContext.start(StandardContext.java:4450)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526)
    at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:630)
    at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:556)
    at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:491)
    at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1206)
    at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:314)
    at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
    at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053)
    at org.apache.catalina.core.StandardHost.start(StandardHost.java:722)
    at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045)
    at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
    at org.apache.catalina.core.StandardService.start(StandardService.java:516)
    at org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
    at org.apache.catalina.startup.Catalina.start(Catalina.java:583)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:288)
    at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:413)
Caused by: org.xml.sax.SAXParseException: Premature end of file.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:239)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:283)
    at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:153)
    ... 33 more
Jun 22, 2009 1:07:48 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to searc...@134ce4a main
Jun 22, 2009 1:07:48 PM org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start SOLR. Check solr/home property
org.apache.solr.common.SolrException: FATAL: Could not create importer. DataImporter config invalid
    at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:105)
    at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:415)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:572)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContain
Re: Data Import Handler
On Mon, Jun 22, 2009 at 10:51 PM, Mukerjee, Neiloy (Neil) <neil.muker...@alcatel-lucent.com> wrote:
>
> I suspect that the fact that the data-config file is blank is causing these issues, but per the documentation on the website, there is no indication of what, if anything, should go there - is there an alternate resource that anyone knows of which I could use?

The data-config.xml is the file which specifies how and from where Solr can pull data. For example, look at the full-import-from-a-database data-config.xml at http://wiki.apache.org/solr/DataImportHandler#head-c24dc86472fa50f3e87f744d3c80ebd9c31b791c

Or, look at the Slashdot feed example at http://wiki.apache.org/solr/DataImportHandler#head-e68aa93c9ca7b8d261cede2bf1d6110ab1725476

--
Regards,
Shalin Shekhar Mangar.
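For orientation, a minimal database data-config.xml along the lines of the wiki's full-import example (driver, URL, table, and column names below are placeholders for your MySQL setup):

<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb"
              user="user" password="pass"/>
  <document>
    <entity name="item" query="SELECT id, name FROM item">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
    </entity>
  </document>
</dataConfig>

An empty file also explains the "Premature end of file" SAXParseException in the log: the XML parser finds nothing to parse.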
Re: Keyword Density
: Date: Wed, 3 Jun 2009 10:19:06 -0700 (PDT)
: From: Otis Gospodnetic
: Subject: Re: Keyword Density
:
: > > But I don't need to sort using this value. I need to cut results, where
: > > this value (for particular term of query!) is not in some range.
:
: I don't think this is possible without changing Solr. Or maybe it's
: possible with a custom Search Component that looks at all hits and
: checks the "df" (document frequency) for a term in each document?
: Sounds like a very costly operation...

FWIW: The best place to try and tackle something like this would probably be to write a new subclass of FilteredTermDocs that only returns docs/frequencies where the freq is in the range you are interested in. Then use your new FilteredTermDocs class in a subclass of TermQuery when constructing a TermScorer. *Then* use your new TermQuery subclass in a custom Solr QParser.

It can be done efficiently, but it definitely requires making some low level changes to the code.

-Hoss
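A rough sketch of the first step, assuming the Lucene 2.x FilterIndexReader.FilterTermDocs API (the class and field names here are illustrative, not an existing Solr feature):

import java.io.IOException;
import org.apache.lucene.index.FilterIndexReader.FilterTermDocs;
import org.apache.lucene.index.TermDocs;

// Wraps a TermDocs and skips postings whose within-document
// frequency falls outside [minFreq, maxFreq].
class FreqRangeTermDocs extends FilterTermDocs {
    private final int minFreq;
    private final int maxFreq;

    FreqRangeTermDocs(TermDocs in, int minFreq, int maxFreq) {
        super(in);
        this.minFreq = minFreq;
        this.maxFreq = maxFreq;
    }

    public boolean next() throws IOException {
        while (in.next()) {
            int f = in.freq();
            if (f >= minFreq && f <= maxFreq) {
                return true;  // keep this document
            }
        }
        return false;  // postings exhausted
    }
}

The TermQuery/TermScorer subclass and the custom QParser that Hoss mentions would then sit on top of this.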
Re: Sending Mlt POST request
: I wish to send an Mlt request to Solr and filter the result by a list of
: values for a specific field. The problem is sometimes the list can include
: thousands of values and it's impossible to send such a GET request.
:
: Sending this request as POST didn't work well... Is POST supported by
: mlt? If not, is it supposed to be added in one of the next versions?
: Or is there a different solution maybe?

A POST to any RequestHandler should work fine ... provided the POST is structured correctly. What exactly is the behavior you are seeing (ie: an error message?)

-Hoss
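A quick sketch of a correctly structured POST, assuming the MoreLikeThisHandler is registered at /mlt (field names and ids are hypothetical):

curl http://localhost:8983/solr/mlt \
  --data-urlencode "q=id:1234" \
  --data-urlencode "mlt.fl=title" \
  --data-urlencode "fq=docid:(1 OR 2 OR 3)"

curl's --data-urlencode switches turn this into a POST with Content-Type application/x-www-form-urlencoded, which Solr parses exactly like query-string parameters, so arbitrarily long fq lists fit without hitting URL length limits.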
Re: searchcomponent howto ...
: and then ask,
: - how can i set the value of query so that it is reflected in the 'q'
: node of the search results e.g. solr.
:
: the example 'process' method above works, but the original query is still
: written to the search results page.

If you're talking about the param values that get written out in the header section, those always contain the "original" params (either from the URL, or from defaults in configs) ... I don't think you can modify those easily.

Your component can always add your new "q" value to the response as a new object (with whatever name you want), and your client code can get at it that way.

-Hoss
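A tiny sketch of that second option inside a custom SearchComponent (the class name and response key are arbitrary):

import java.io.IOException;
import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

public class QueryEchoComponent extends SearchComponent {
    public void prepare(ResponseBuilder rb) throws IOException {
        // nothing to do before the query runs
    }

    public void process(ResponseBuilder rb) throws IOException {
        // expose the effective query string under our own key;
        // the response header's "q" stays whatever the client sent
        rb.rsp.add("rewrittenQuery", rb.getQueryString());
    }

    public String getDescription() { return "echoes the effective query under a custom key"; }
    public String getVersion() { return "1.0"; }
    public String getSourceId() { return ""; }
    public String getSource() { return ""; }
}

The client then reads the "rewrittenQuery" entry from the response instead of the header's "q".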
Re: Schema vs Dynamic Fields
: Date: Mon, 08 Jun 2009 16:44:45 -0700
: From: Phil Hagelberg
: Subject: Schema vs Dynamic Fields
:
: Is the use of a predefined schema primarily a "type safety" feature?
: We're considering using Solr for a data set that is very free-form; will
: we get much slower results if the majority of our data is in a dynamic
: field such as:
:
: <dynamicField name="*" ... />
:
: I'm a little unclear on the trade-offs involved and would appreciate
: a hint.

There is some cost involved in every new "field" that exists in your index (regardless of whether it was explicitly declared, or sprang into existence because of a dynamicField declaration), but there are ways to mitigate some of those costs (omitNorms=true being a big one).

In general, the big advantage to explicitly declaring fields is that you can customize their analysis/datatypes ... you can do similar things by having "type specific" dynamic fields, but then your field names must follow set conventions based on data type.

-Hoss
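A sketch of the "type specific" dynamic field convention Hoss describes (the suffixes are the usual ones from the example schema, but any convention works):

<!-- field names ending in _s are strings, _i are ints, _t are analyzed text -->
<dynamicField name="*_s" type="string" indexed="true" stored="true" omitNorms="true"/>
<dynamicField name="*_i" type="sint"   indexed="true" stored="true" omitNorms="true"/>
<dynamicField name="*_t" type="text"   indexed="true" stored="true"/>

A document can then carry free-form fields like color_s or price_i and still get per-type analysis.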
Re: no .war with ubuntu release ?
: Date: Thu, 18 Jun 2009 19:00:18 -0400
: From: Jonathan Vanasco
: Subject: no .war with ubuntu release ?
:
: after countless searching, it seems that there is no .war file in the distro
:
: http://packages.ubuntu.com/hardy/all/solr-common/filelist
: http://packages.ubuntu.com/hardy/all/solr-jetty/filelist
:
: as you can see, there is no .war

i'm not familiar with the ubuntu packaging, but by the looks of those file lists, they have unzipped the solr.war into /usr/share/solr/ (note the WEB-INF directory).

the interesting thing about java "webapps" is that they can be distributed as a "war" file or as a directory ... most servlet containers actually unzip the war file into a directory on local disk anyway (so they don't have to keep the whole thing in memory), and it looks like the ubuntu packagers just decided to package the uncompressed webapp in the .deb instead of having a war in there that would get uncompressed on first usage.

that's just a theory however, and doesn't explain why it isn't working for you. presumably somewhere in one of the jetty config files there should be a reference to /usr/share/ as the place to find webapps, and a reference to /etc/solr as the SolrHomeDir.

-Hoss
THIS WEEK: PNW Hadoop / Apache Cloud Stack Users' Meeting, Wed Jun 24th, Seattle
Hey all, just a friendly reminder that this is Wednesday! I hope to see everyone there again. Please let me know if there's something interesting you'd like to talk about -- I'll help however I can. You don't even need a Powerpoint presentation -- there are many whiteboards. I'll try to have a video cam, but no promises. Feel free to call at 904-415-3009 if you need directions or have any questions :)

Greetings,

On the heels of our smashing success last month, we're going to be convening the Pacific Northwest (Oregon and Washington) Hadoop/HBase/Lucene/etc. meetup on the last Wednesday of June, the 24th. The meeting should start at 6:45, organized chats will end around 8:00, and then there shall be discussion and socializing :)

The meeting will be at the University of Washington in Seattle again. It's in the Computer Science building (not Electrical Engineering!), room 303, located here: http://www.washington.edu/home/maps/southcentral.html?80,70,792,660

If you've ever wanted to learn more about distributed computing, or just see how other people are innovating with Hadoop, you can't miss this opportunity. Our focus is on learning and education, so every presentation must end with a few questions for the group to research and discuss. (But if you're an introvert, we won't mind.)

The format is two or three 15-minute "deep dive" talks, followed by several 5-minute "lightning chats". We had a few interesting topics last month:

- Building a Social Media Analysis company on the Apache Cloud Stack
- Cancer detection in images using Hadoop
- Real-time OLAP on HBase -- is it possible?
- Video and Network Flow Analysis in Hadoop vs. Distributed RDBMS
- Custom Ranking in Lucene

We already have one "deep dive" scheduled this month, on truly scalable Lucene with Katta. If you've been looking for a way to handle those large Lucene indices, this is a must-attend!

Looking forward to seeing everyone there again.

Cheers,
Bradford

http://www.roadtofailure.com -- The Fringes of Distributed Computing, Computer Science, and Social Media.
Re: DataImportHandler configuration - externalizing environment-specific settings?
Ah, thanks Noble. I should have figured that one out myself - I think the built-in capability of setting a parameter from the handler mapping will do the trick nicely, indirecting it from a system property.

Erik

On Jun 21, 2009, at 11:49 PM, Noble Paul നോബിള്‍ नोब्ळ् wrote:

> There is no straight way but there is a way
> http://wiki.apache.org/solr/DataImportHandlerFaq#head-c4003ab5af86a200b35cf6846a58913839a5a096
>
> On Mon, Jun 22, 2009 at 6:23 AM, Erik Hatcher wrote:
>> In an environment where there are developer machines, test, staging, and production servers, there is a need to externalize DIH configuration options like JDBC connection strings (at least the database server name), username, password, and base paths for XML and plain text files.
>>
>> How are folks handling this currently? There didn't seem to be a way to use system properties like we can in solrconfig/schema.xml files using ${sys.property[:defaultValue]} syntax. Having system properties be available in the variable resolver would be quite useful. Is this already there and I missed it?
>>
>> Thanks,
>> Erik
>
> --
> - Noble Paul | Principal Engineer | AOL | http://aol.com
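A sketch of that indirection, following the FAQ's approach (the property and parameter names are illustrative): solrconfig.xml substitutes a system property into a handler default, and data-config.xml reads it back through the request variable resolver.

In solrconfig.xml:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
    <!-- ${db.url:...} resolves from a JVM system property, with a fallback -->
    <str name="jdbcurl">${db.url:jdbc:mysql://localhost/test}</str>
  </lst>
</requestHandler>

In data-config.xml:

<dataSource driver="com.mysql.jdbc.Driver" url="${dataimporter.request.jdbcurl}" user="user" password="pass"/>

Launching with -Ddb.url=jdbc:mysql://prod-db/live then retargets the import without touching either config file.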
Re: ExtractRequestHandler - not properly indexing office docs?
What's your default search field?

On Jun 22, 2009, at 12:29 PM, cloax wrote:

> Yep, I've tried both of those and still no joy. Here's both my curl statement and the resulting Solr log output.
>
> curl http://localhost:8983/solr/update/extract?ext.def.fl=text\&ext.literal.id=1\&ext.map.div=text\&ext.capture=div -F "myfi...@dj_character.doc"
>
> Curl's output:
>
> <int name="status">0</int><int name="QTime">317</int>
>
> Solr log:
> Jun 22, 2009 12:21:42 PM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=/solr path=/update/extract params={ext.map.div=text&ext.def.fl=text&ext.capture=div&ext.literal.id=1} status=0 QTime=544
> Jun 22, 2009 12:22:26 PM org.apache.solr.update.processor.LogUpdateProcessor finish
> INFO: {add=[1]} 0 317
> Jun 22, 2009 12:22:26 PM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=/solr path=/update/extract params={ext.map.div=text&ext.def.fl=text&ext.capture=div&ext.literal.id=1} status=0 QTime=317
> Jun 22, 2009 12:22:37 PM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=/solr path=/select params={wt=standard&rows=10&start=0&explainOther=&hl.fl=&indent=on&q=kondel&fl=*,score&qt=standard&version=2.2} hits=0 status=0 QTime=2
>
> The submitted document has "kondel" in it numerous times, so Solr should have a hit. Yet it returns nothing. I also made sure I committed, but that didn't seem to help either.
>
> Grant Ingersoll-6 wrote:
>> Do you have a default field declared? &ext.default.fl=
>> Either that, or you need to explicitly capture the fields you are interested in using &ext.capture=
>>
>> You could add this to your curl statement to try out.
>>
>> -Grant
>
> --
> View this message in context: http://www.nabble.com/ExtractRequestHandler---not-properly-indexing-office-docs--tp24120125p24150763.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
Re: Solr Authentication Problem
I have raised an issue, https://issues.apache.org/jira/browse/SOLR-1238 - there is a patch attached to the issue.

On Mon, Jun 22, 2009 at 1:40 PM, Allahbaksh Asadullah wrote:
>
> Hi All,
> I am getting an error when using authentication in Solr. I followed the wiki. The error does not appear when I am searching. Below is the code snippet and the error.
>
> Please note I am using a Solr 1.4 development build from SVN.
>
> HttpClient client = new HttpClient();
> AuthScope scope = new AuthScope(AuthScope.ANY_HOST, AuthScope.ANY_PORT, null, null);
> client.getState().setCredentials(scope, new UsernamePasswordCredentials("guest", "guest"));
> SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr", client);
>
> SolrInputDocument doc1 = new SolrInputDocument();
> // Add fields to the document
> doc1.addField("employeeid", "1237");
> doc1.addField("employeename", "Ann");
> doc1.addField("employeeunit", "etc");
> doc1.addField("employeedoj", "1995-11-31T23:59:59Z");
> server.add(doc1);
>
> Exception in thread "main" org.apache.solr.client.solrj.SolrServerException: org.apache.commons.httpclient.ProtocolException: Unbuffered entity enclosing request can not be repeated.
> at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:468)
> at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:242)
> at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:259)
> at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:63)
> at test.SolrAuthenticationTest.<init>(SolrAuthenticationTest.java:49)
> at test.SolrAuthenticationTest.main(SolrAuthenticationTest.java:113)
> Caused by: org.apache.commons.httpclient.ProtocolException: Unbuffered entity enclosing request can not be repeated.
> at org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:487)
> at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2114)
> at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
> at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
> at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
> at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
> at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
> at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:415)
> ... 5 more
>
> Thanks and regards,
> Allahbaksh

--
- Noble Paul | Principal Engineer | AOL | http://aol.com
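Until that patch is applied, one possible client-side workaround (assuming Commons HttpClient 3.x, which SolrJ uses here): authenticate preemptively, so the server never answers the POST with a 401 challenge - the challenge is what forces HttpClient to replay the unbuffered request body.

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.UsernamePasswordCredentials;
import org.apache.commons.httpclient.auth.AuthScope;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

HttpClient client = new HttpClient();
// send the Authorization header with the first request instead of
// waiting for a 401, so the request body never needs to be repeated
client.getParams().setAuthenticationPreemptive(true);
client.getState().setCredentials(
    new AuthScope(AuthScope.ANY_HOST, AuthScope.ANY_PORT),
    new UsernamePasswordCredentials("guest", "guest"));
SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr", client);

This is a sketch, not the SOLR-1238 fix itself; note that preemptive auth sends credentials on every request.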
Re: ExtractRequestHandler - not properly indexing office docs?
I've tried 'text' (taken from the example config) and then tried creating a new field called doc_content and using that. Neither has worked.

Grant Ingersoll-6 wrote:
>
> What's your default search field?
>
> On Jun 22, 2009, at 12:29 PM, cloax wrote:
>
>> Yep, I've tried both of those and still no joy. Here's both my curl statement and the resulting Solr log output.
>>
>> curl http://localhost:8983/solr/update/extract?ext.def.fl=text\&ext.literal.id=1\&ext.map.div=text\&ext.capture=div -F "myfi...@dj_character.doc"
>>
>> Curl's output:
>>
>> <int name="status">0</int><int name="QTime">317</int>
>>
>> Solr log:
>> Jun 22, 2009 12:21:42 PM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/update/extract params={ext.map.div=text&ext.def.fl=text&ext.capture=div&ext.literal.id=1} status=0 QTime=544
>> Jun 22, 2009 12:22:26 PM org.apache.solr.update.processor.LogUpdateProcessor finish
>> INFO: {add=[1]} 0 317
>> Jun 22, 2009 12:22:26 PM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/update/extract params={ext.map.div=text&ext.def.fl=text&ext.capture=div&ext.literal.id=1} status=0 QTime=317
>> Jun 22, 2009 12:22:37 PM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/select params={wt=standard&rows=10&start=0&explainOther=&hl.fl=&indent=on&q=kondel&fl=*,score&qt=standard&version=2.2} hits=0 status=0 QTime=2
>>
>> The submitted document has "kondel" in it numerous times, so Solr should have a hit. Yet it returns nothing. I also made sure I committed, but that didn't seem to help either.
>>
>> Grant Ingersoll-6 wrote:
>>>
>>> Do you have a default field declared? &ext.default.fl=
>>> Either that, or you need to explicitly capture the fields you are interested in using &ext.capture=
>>>
>>> You could add this to your curl statement to try out.
>>>
>>> -Grant
>>
>> --
>> View this message in context: http://www.nabble.com/ExtractRequestHandler---not-properly-indexing-office-docs--tp24120125p24150763.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>
> --
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
> http://www.lucidimagination.com/search

--
View this message in context: http://www.nabble.com/ExtractRequestHandler---not-properly-indexing-office-docs--tp24120125p24159267.html
Sent from the Solr - User mailing list archive at Nabble.com.
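One quick check, assuming the extracted content was mapped into the "text" field by ext.map.div=text: commit, then query that field explicitly rather than relying on the default search field -

curl "http://localhost:8983/solr/update" -H "Content-Type: text/xml" --data-binary "<commit/>"
curl "http://localhost:8983/solr/select?q=text:kondel"

If q=text:kondel hits but a bare q=kondel does not, the defaultSearchField in schema.xml points somewhere other than the field the extracted content landed in.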