Re: How to apply filters to stored data
Hi Erick,

The problem I am trying to solve is to filter out invalid entities. Users might misspell an entity name or enter a new one. These new/invalid entities need to pass through a KeepWordFilter so that they don't pollute our autocomplete results. I was looking into Luke, and it does seem to solve my use case, but is Luke something I can use in a production setup? Also, when does the copying happen? Is the data being copied the result of applying all the filters, or the unmodified input?
Re: How to apply filters to stored data
Erick Erickson wrote:
> See below:
>
> On Sun, Sep 25, 2011 at 9:53 AM, Jithin <jithin1...@gmail.com> wrote:
>> Hi Erick, The problem I am trying to solve is to filter out invalid entities.
>> Users might misspell an entity name or enter a new one. These new/invalid
>> entities need to pass through a KeepWordFilter so that they don't pollute our
>> autocomplete results.
>
> Right. But if you have a KeepWordFilter, that implies that you have a list
> of known good words. Couldn't you use that file as your base for the
> autosuggest component?

I think that is possible. But is there any other mechanism within Solr/Lucene to preprocess stored data?
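For context, the keep-word filtering discussed here only affects the analyzed token stream that gets indexed; the stored value of the field is written verbatim. Below is a minimal sketch of the equivalent analysis chain, written against the newer Lucene CustomAnalyzer API and assuming a keepwords.txt file on the classpath plus a hypothetical field name "entity":

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.core.WhitespaceTokenizerFactory;
import org.apache.lucene.analysis.custom.CustomAnalyzer;
import org.apache.lucene.analysis.miscellaneous.KeepWordFilterFactory;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class KeepWordDemo {
    public static void main(String[] args) throws Exception {
        // Analysis chain equivalent to a keep-word field type:
        // tokenize, lowercase, then drop every token not in keepwords.txt.
        CustomAnalyzer analyzer = CustomAnalyzer.builder()
            .withTokenizer(WhitespaceTokenizerFactory.class)
            .addTokenFilter(LowerCaseFilterFactory.class)
            .addTokenFilter(KeepWordFilterFactory.class,
                "words", "keepwords.txt", "ignoreCase", "true")
            .build();

        // A misspelled entity is silently dropped from the token stream if it
        // is not in the keep list; the stored value would still contain it.
        try (TokenStream ts = analyzer.tokenStream("entity", "Acme acmee NewCo")) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                System.out.println(term); // only the kept (indexed) terms
            }
            ts.end();
        }
    }
}

The unknown tokens simply disappear from the indexed terms, which is why the later replies point at an update processor when the stored values themselves need to be cleaned.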
Re: How to apply filters to stored data
Is the UpdateProcessor triggered only when updating an existing document, or also for new documents?

On Tue, Sep 27, 2011 at 6:00 AM, Chris Hostetter-3 [via Lucene] <ml-node+s472066n3371110...@n3.nabble.com> wrote:
> : Hi Erick, The problem I am trying to solve is to filter out invalid entities.
> : Users might misspell an entity name or enter a new one. These new/invalid
> : entities need to pass through a KeepWordFilter so that they don't pollute our
> : autocomplete result.
>
> how are you doing autocomplete?
>
> if you are using the Suggest feature of solr, then that's based on the
> indexed terms anyway (last time i checked) so you don't need to manipulate
> the stored field values.
>
> In general, the only way to manipulate the stored field values is to do it
> in an update processor -- which can mutate the documents long before the
> schema is ever even consulted.
>
> -Hoss

--
Thanks
Jithin Emmanuel
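To make the quoted suggestion concrete, here is a minimal sketch of an update processor that drops unknown values from a field before the document reaches the index. The class name, the "entity" field and the in-memory keep list are hypothetical; a real version would load the list from the same file the KeepWordFilter uses and be wired into an updateRequestProcessorChain in solrconfig.xml.

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;
import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

public class KeepEntityUpdateProcessorFactory extends UpdateRequestProcessorFactory {

    // Hypothetical hard-coded keep list; in practice load it from a file or service.
    private final Set<String> keepWords = new HashSet<>(Arrays.asList("acme", "newco"));

    @Override
    public UpdateRequestProcessor getInstance(SolrQueryRequest req,
                                              SolrQueryResponse rsp,
                                              UpdateRequestProcessor next) {
        return new UpdateRequestProcessor(next) {
            @Override
            public void processAdd(AddUpdateCommand cmd) throws IOException {
                SolrInputDocument doc = cmd.getSolrInputDocument();
                Collection<Object> values = doc.getFieldValues("entity");
                if (values != null) {
                    // Keep only known-good entities; this runs before the schema
                    // is consulted, so both the indexed and the stored
                    // representations see the cleaned values.
                    List<Object> kept = new ArrayList<>();
                    for (Object v : values) {
                        if (keepWords.contains(String.valueOf(v).toLowerCase())) {
                            kept.add(v);
                        }
                    }
                    doc.setField("entity", kept);
                }
                super.processAdd(cmd); // hand off to the rest of the chain
            }
        };
    }
}

Since Solr treats an update as a re-add of the whole document, the processor runs for new and updated documents alike.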
Re: Stopword filter - refreshing stop word list periodically
I am not running in a multi-core environment. My application requires only a single search schema. Does it make sense to go for a multi-core setup in this scenario? Given that we currently have a single core, is there any alternative to RELOAD that works in a single-core setup?

On Fri, Oct 14, 2011 at 6:48 PM, Michael Kuhlmann-4 [via Lucene] <ml-node+s472066n3421627...@n3.nabble.com> wrote:
> On 14.10.2011 15:10, Jithin wrote:
> > Hi,
> > Is it possible to refresh the stop word list periodically, say once every 6
> > hours? Is this already supported in Solr, or are there any workarounds?
> > Kindly help me in understanding this.
>
> Hi,
>
> you can trigger a reload command to the core admin, assuming you're
> running a multi core environment (which I'd recommend anyway).
>
> Simply add
> curl "http://host:port/solr/admin/cores?action=RELOAD&core=corename"
> to your /etc/crontab file, and set the leading time fields correspondingly.
>
> -Kuli

--
Thanks
Jithin Emmanuel
Re: Stopword filter - refreshing stop word list periodically
What will be the name of this hard-coded core? I was rearranging my directory structure, adding a separate directory for code. And it does work with a single core.

On Fri, Oct 14, 2011 at 11:47 PM, Chris Hostetter-3 [via Lucene] <ml-node+s472066n3422415...@n3.nabble.com> wrote:
> : I am not running in a multi-core environment. My application requires only a
> : single search schema. Does it make sense to go for a multi-core setup in
> : this scenario? Given that we currently have a single core, is there any
> : alternative to RELOAD that works in a single-core setup?
>
> In recent versions of Solr (I think since 3.1) every Solr installation is
> a multi-core environment; Solr just silently uses a hardcoded default
> solr.xml that uses the solr home dir as the instanceDir of the "default
> core" if your solr home dir doesn't already contain a solr.xml.
>
> So even in a single core setup, you should still be able to reload the
> core.
>
> -Hoss

--
Thanks
Jithin Emmanuel
Callback on starting solr?
Hi,
Is it possible to have a callback after Solr starts listening on the configured port? What I have found is that there is a certain delay before Solr starts listening on the port after a restart, so if I try to reindex during this period the request fails. What I want is a notification mechanism that fires once Solr is listening on the port. Is this doable?
Re: Callback on starting solr?
I am doing something similar to that: checking netstat for any connection on the port. I wanted to know if there is anything built into Solr for this.

Also, I notice that my reindex is failing when I have to reindex some 7k+ docs. Solr is giving this error in the logs:

Caused by: java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
    at org.mortbay.io.ByteArrayBuffer.writeTo(ByteArrayBuffer.java:368)
    at org.mortbay.io.bio.StreamEndPoint.flush(StreamEndPoint.java:129)
    at org.mortbay.io.bio.StreamEndPoint.flush(StreamEndPoint.java:161)
    at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:714)
    ... 25 more

2011-10-16 18:05:05.431:WARN::Committed before 500 null
org.mortbay.jetty.EofException
    at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:791)
    at org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:569)
    at org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:1012)
    at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:296)
    at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:140)
    at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
    at org.apache.solr.common.util.FastWriter.flush(FastWriter.java:115)
    at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:344)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:265)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
    at org.mortbay.io.ByteArrayBuffer.writeTo(ByteArrayBuffer.java:368)
    at org.mortbay.io.bio.StreamEndPoint.flush(StreamEndPoint.java:129)
    at org.mortbay.io.bio.StreamEndPoint.flush(StreamEndPoint.java:161)
    at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:714)
    ... 25 more

2011-10-16 18:05:05.432:WARN::/solr/core0/update/
java.lang.IllegalStateException: Committed

Is it a case where Solr is not able to handle the load?

Currently Solr is running with a max memory setting of 25MB. All the docs are very small; each one contains just a few words.

On Sun, Oct 16, 2011 at 11:52 PM, Jan Høydahl / Cominvent [via Lucene] <ml-node+s472066n3426389...@n3.nabble.com> wrote:
> Hi,
>
> This depends on your application server and config. A very simple option is
> to let your client poll with a ping request
> http://localhost:8983/solr/admin/ping/ until it succeeds.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> On 16. okt. 2011, at 19:47, Jithin wrote:
>
> > Hi,
> > Is it possible to have a callback after Solr starts listening on the
> > configured port? What I have found is that there is a certain delay before
> > Solr starts listening on the port after a restart, so if I try to reindex
> > during this period the request fails. What I want is a notification
> > mechanism that fires once Solr is listening on the port. Is this doable?
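A rough sketch of the polling approach suggested above, in plain Java: keep hitting the ping URL until it returns HTTP 200 or a timeout expires, and only then start reindexing. The URL, timeout and sleep interval are assumptions, and the /admin/ping handler must be defined in solrconfig.xml for this to work (the next message shows it returning 404 when it is not).

import java.net.HttpURLConnection;
import java.net.URL;

public class WaitForSolr {

    /** Polls the Solr ping handler until it answers 200 OK or the timeout expires. */
    static boolean waitForSolr(String pingUrl, long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            try {
                HttpURLConnection conn = (HttpURLConnection) new URL(pingUrl).openConnection();
                conn.setConnectTimeout(1000);
                conn.setReadTimeout(1000);
                if (conn.getResponseCode() == 200) {
                    return true; // Solr is up and answering pings
                }
            } catch (Exception e) {
                // connection refused etc. -- Solr is not listening yet, keep polling
            }
            Thread.sleep(500);
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        if (waitForSolr("http://localhost:8983/solr/admin/ping", 60_000)) {
            System.out.println("Solr is up, safe to start reindexing");
        } else {
            System.out.println("Gave up waiting for Solr");
        }
    }
}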
Re: Callback on starting solr?
How do I configure Solr with a ping request? http://localhost:8983/solr/admin/ping/ gives HTTP 404.

On Mon, Oct 17, 2011 at 1:06 AM, Jan Høydahl / Cominvent [via Lucene] <ml-node+s472066n3426539...@n3.nabble.com> wrote:
> Your app-server will start listening to the port some time before the Solr
> webapp is ready, so you should check directly with Solr. You could also use
> JMX to check Solr's status. If you want help with your reindex failing
> issue, please provide more context. 25Mb is very low, please try to give your
> VM more memory and see if indexing succeeds then.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> On 16. okt. 2011, at 20:38, Jithin wrote:
>
> > I am doing something similar to that: checking netstat for any connection
> > on the port. I wanted to know if there is anything built into Solr for this.
> >
> > Also, I notice that my reindex is failing when I have to reindex some 7k+
> > docs. Solr is giving this error in the logs:
> >
> > [...]
Re: Stopword filter - refreshing stop word list periodically
Thanks Sami. I ended up setting up a proper core as per the documentation, named core0.

On Thu, Nov 3, 2011 at 11:07 PM, Sami Siren-2 [via Lucene] <ml-node+s472066n3477844...@n3.nabble.com> wrote:
> On Fri, Oct 14, 2011 at 10:06 PM, Jithin <[hidden email]> wrote:
> > What will be the name of this hard-coded core? I was rearranging my
> > directory structure, adding a separate directory for code. And it does
> > work with a single core.
>
> In trunk the "single core setup" core is called "collection1". So to
> reload that you'd call the url:
> http://localhost:8983/solr/admin/cores?action=RELOAD&core=collection1
>
> --
> Sami Siren

--
Thanks
Jithin Emmanuel
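If the periodic reload should be triggered from application code rather than from cron and curl, SolrJ has a core-admin helper for the same call. A minimal sketch, assuming a reasonably recent SolrJ version and the core name "core0" used in this thread (the default single-core name would be "collection1" as noted above):

import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

public class ReloadCore {
    public static void main(String[] args) throws Exception {
        // Point the client at the Solr root, not at a specific core.
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
            // Equivalent to GET /solr/admin/cores?action=RELOAD&core=core0;
            // the stopword files are re-read when the core reloads.
            CoreAdminRequest.reloadCore("core0", client);
        }
    }
}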
Solr Suggester - building terms from both a field and dictionary
Hi,
I have a use case where I need to provide autocomplete from both the values in an index field and a dictionary file. I am planning to use the Solr Suggester. On reading the documentation I get the impression that terms can come either from a field or from a dictionary, but not both. Can this behavior be modified so that it fetches terms from both a dictionary file and an index field?
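Assuming the Suggester really is limited to a single source, one possible workaround is to configure two suggest request handlers, one built from the field and one from the dictionary file, and merge their results in the client. The handler names /suggest_field and /suggest_dict are hypothetical, and this SolrJ sketch only illustrates the merging idea for the older spellcheck-style suggest response:

import java.util.LinkedHashSet;
import java.util.Set;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.SpellCheckResponse;

public class MergedSuggest {

    /** Runs the prefix against one suggest handler and collects its suggestions. */
    static void collect(SolrClient client, String handler, String prefix,
                        Set<String> out) throws Exception {
        SolrQuery q = new SolrQuery(prefix);
        q.setRequestHandler(handler);
        QueryResponse rsp = client.query(q);
        SpellCheckResponse suggest = rsp.getSpellCheckResponse();
        if (suggest != null) {
            for (SpellCheckResponse.Suggestion s : suggest.getSuggestions()) {
                out.addAll(s.getAlternatives());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr/core0").build()) {
            Set<String> merged = new LinkedHashSet<>();      // de-duplicates, keeps order
            collect(client, "/suggest_field", "ac", merged); // suggester built from the field
            collect(client, "/suggest_dict", "ac", merged);  // suggester built from the file
            System.out.println(merged);
        }
    }
}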
Solr Suggester - perform case insensitive search
Hi,
Is it possible to do case-insensitive suggestions via the Solr Suggester? On reading the documentation it seems like there is no option for that. Can anyone please suggest how to deal with this?
Using more than one search component in a requestHandler
Hi,
Is it possible to have more than one search component invoked as part of a requestHandler? I find that I am restricted to using only one spellcheck.dictionary per query, and I want to use more than one dictionary for a single query. Kindly let me know how I can do this.
Interpreting solr response time from log
Hi,
For this request:

curl 'solrhost:8983/solr/core0/admin/ping'

the response header reports status=0 and QTime=2, and the ping status is OK, which indicates a response time of 2 ms. But in the Solr request log the same request appears as:

01/Jan/2012:06:25:16 +] "GET /solr/core0/admin/ping HTTP/1.1" 200 544

Can anyone please explain to me why the log is showing 544 ms as the response time? (My understanding is that the last parameter, 544, is the response time.)
Re: Interpreting solr response time from log
Thanks Gora for clarifying. So if my understanding is correct, the total response time is not logged in the Solr logs and I need to rely on the QTime in the response.
Re: Interpreting solr response time from log
Thanks Chris for clarifying. This helps a lot.

On Wed, Jan 4, 2012 at 2:07 AM, Chris Hostetter-3 [via Lucene] <ml-node+s472066n3630181...@n3.nabble.com> wrote:
> : If your log level is set at least to INFO, as it should be by default, Solr does
> : log response time to a different file. E.g., I have
> : INFO: [] webapp=/solr path=/select/
> : params={indent=on&start=0&q=*:*&version=2.2&rows=10} hits=22 status=0
> : QTime=40
> : where the QTime is 40ms, as also reflected in the HTTP response.
>
> It's also really important to understand exactly what QTime is measuring.
>
> I've added a FAQ to try and make this more obvious:
>
> https://wiki.apache.org/solr/FAQ#Why_is_the_QTime_Solr_returns_lower_then_the_amount_of_time_I.27m_measuring_in_my_client.3F
>
> -Hoss

--
Thanks
Jithin Emmanuel
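The point of that FAQ is easy to see from SolrJ, which exposes both the server-side QTime and the total wall-clock time of the request. A small sketch, assuming a recent SolrJ client and the core0 core from the earlier messages:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class QTimeVsElapsed {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr/core0").build()) {
            QueryResponse rsp = client.query(new SolrQuery("*:*"));
            // QTime: time Solr spent executing the query, before the
            // response was written back over the network.
            System.out.println("QTime (ms):   " + rsp.getQTime());
            // Elapsed time: what the client actually waited, including
            // network transfer and response parsing; usually larger.
            System.out.println("Elapsed (ms): " + rsp.getElapsedTime());
        }
    }
}

The gap between the two numbers is the network transfer, response writing and client-side parsing that QTime does not include.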
Implementing a custom ResourceLoader
Hi,
As part of writing a Solr plugin I need to override the ResourceLoader. My plugin is intended to be a stop word analyzer filter factory, and I need to change the way the stop words are fetched. My assumption is that overriding ResourceLoader.getLines() will let me fetch the stop word data from an external webservice. Is this feasible? Or should I instead override the factory's inform(ResourceLoader) method? Kindly let me know how to achieve this.

--
Thanks
Jithin
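A minimal sketch of the inform(ResourceLoader) route, which is the same hook the stock StopFilterFactory uses: instead of asking the loader for a file, the factory fetches the stop words from a webservice when the core is loaded or reloaded. The webservice URL parameter is hypothetical, and the exact package names of TokenFilterFactory and ResourceLoader vary between Lucene/Solr versions, so treat this as an outline rather than drop-in code.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.util.ResourceLoader;
import org.apache.lucene.analysis.util.ResourceLoaderAware;
import org.apache.lucene.analysis.util.TokenFilterFactory;

public class WebServiceStopFilterFactory extends TokenFilterFactory implements ResourceLoaderAware {

    private final String stopwordsUrl; // e.g. stopwordsUrl="http://internal-host/stopwords.txt" in the schema (hypothetical)
    private CharArraySet stopWords;

    public WebServiceStopFilterFactory(Map<String, String> args) {
        super(args);
        stopwordsUrl = require(args, "stopwordsUrl");
        if (!args.isEmpty()) {
            throw new IllegalArgumentException("Unknown parameters: " + args);
        }
    }

    @Override
    public void inform(ResourceLoader loader) throws IOException {
        // Called once when the core is loaded or reloaded; the ResourceLoader is
        // ignored here because the word list comes from a webservice instead.
        List<String> words = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new URL(stopwordsUrl).openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    words.add(line.trim());
                }
            }
        }
        stopWords = new CharArraySet(words, true /* ignoreCase */);
    }

    @Override
    public TokenStream create(TokenStream input) {
        return new StopFilter(input, stopWords);
    }
}

Because inform() runs again on every core reload, refreshing the word list then comes down to the core RELOAD discussed in the earlier stopword thread.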