URL Redirect
Hello, I have been assigned the task of migrating from Endeca to Solr. The former engine allowed me to set keyword triggers that, when matched exactly, caused the web client to redirect to a specified URL. Does that feature exist in Solr? If so, where can I get some info? Thank you
Re: URL Redirect
Ok, so I installed Tuckey's UrlRewriteFilter on Tomcat 6.0. I have the following configuration:

/solr-p <- the Solr configuration
/solr <- the UrlRewriteFilter configuration

I want to redirect requests such as http://localhost:8080/solr/select/?q=Somename&... to a different address, and this is promptly done with the rule:

RuleAlessi ^/select/\?q=[Aa]lessi&.*$ /design/searchresult.asp/ene_m/4294950939/dept/design/

I have many of these rules and they work. If none of these rules is matched, I want to *forward* the request http://localhost:8080/solr/select/?q=NonMatchedQuery&... to http://localhost:8080/solr-p/select/?q=NonMatchedQuery&...

It works with the following rule:

LastRule /(.*) /solr-p/$1

but it shows the 'solr-p' part in the browser's address bar, and this is not desired. If I change the rule to

LastRule /(.*) /$1

it seems that the forwarding is done correctly, but the answer to http://localhost:8080/solr/admin/ is

HTTP Status 400 - Missing solr core name in path

while if I access http://localhost:8080/solr-p/admin/ it works as expected. According to Tomcat's log files, it seems that the forwarding is done correctly. Please, could anybody explain to me what's going on there?

Thanks

From: Ranveer Kumar [ranveer.s...@gmail.com]
Sent: Thursday, October 6, 2011 10:21
To: solr-user@lucene.apache.org
Subject: Re: URL Redirect

Tuckey can also help you if you are using Java.

On Oct 6, 2011 1:24 PM, "Finotti Simone" wrote:
> Hello,
>
> I have been assigned the task of migrating from Endeca to Solr.
> [...]
Re: URL Redirect
Hi, for those who may be interested, I resolved it (with a little help from the urlrewrite user group :-) ) by using a type="proxy" rule.

S

From: Finotti Simone [tech...@yoox.com]
Sent: Friday, October 7, 2011 11:38
To: solr-user@lucene.apache.org
Subject: Re: URL Redirect

Ok, so I installed Tuckey's UrlRewriteFilter on Tomcat 6.0. I have the following configuration:
/solr-p <- the Solr configuration
/solr <- the UrlRewriteFilter configuration
[...]
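For anyone searching the archives later: the resolution above maps to an urlrewrite.xml roughly like the following. This is a sketch reconstructed from the rules quoted in the thread, not the poster's actual file; note that `use-query-string="true"` is needed for the query string to be visible to the `from` pattern, and `type="proxy"` requires the commons-httpclient jar on the classpath.

```xml
<urlrewrite use-query-string="true">
  <!-- keyword trigger: redirect matching searches to a landing page -->
  <rule>
    <from>^/select/\?q=[Aa]lessi&amp;.*$</from>
    <to type="redirect">/design/searchresult.asp/ene_m/4294950939/dept/design/</to>
  </rule>
  <!-- fallback: proxy everything else to the real Solr webapp, so the
       browser's address bar keeps showing /solr/... instead of /solr-p/... -->
  <rule>
    <from>^/(.*)$</from>
    <to type="proxy">http://localhost:8080/solr-p/$1</to>
  </rule>
</urlrewrite>
```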
Solr response writer
Hello, I need to change the HTTP result code of the query response if certain conditions are met. Analyzing the flow of execution of Solr's query process, it seems to me that the "place" that fits best is the QueryResponseWriter. Anyway, I didn't find a way to change the HTTP status code (I need to return 307 instead of 200), so I wonder if it's possible at all with the plugin mechanism currently provided by Solr (v3.4). Any insight would be greatly appreciated :) Thanks S
Re: Solr response writer
That's the scenario: I have an XML file that maps words W to URLs; when a search request is issued by my web client, a query is issued to my Solr application. If, after stemming, the query matches any word in W, the client must be redirected to the associated URL.

I agree that it should be handled outside, but we are currently in the process of migrating from Endeca, and it has a feature that allows this scenario. For this reason, my boss asked if it was somehow possible to leave that functionality in the search engine.

thanks again

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: Wednesday, December 7, 2011 14:12
To: solr-user@lucene.apache.org
Subject: Re: Solr response writer

First, could you tell us more about your use case? Why do you want to change the response code? HTTP 307 = Temporary Redirect - where are you going to redirect? Sounds like something best handled outside of Solr.

If you went down the route of creating your own custom response writer, you'd be locked into a single format (XML, or JSON, or whichever you subclassed).

On Dec 7, 2011, at 06:48, Finotti Simone wrote:
> Hello,
> I need to change the HTTP result code of the query response if certain conditions are met.
> [...]
Re: Solr response writer
I got your and Michael's point. Indeed, I'm not very skilled in web development, so there may be something that I'm missing. Anyway, Endeca does something like this:

1. accept a query;
2. do the stemming;
3. check whether the result of step 2 matches one of the redirectable words. If so, return a URL; otherwise return the regular matching documents (our products' descriptions).

Do you think that in Solr I will be able to replicate this behaviour without writing a custom plugin (request handler, response writer, etc.)? Maybe I'm a little dense, but I fail to see how it would be possible...

S

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: Wednesday, December 7, 2011 14:40
To: solr-user@lucene.apache.org
Subject: Re: Solr response writer

Either way (Endeca's 307, which seems crazy to me) or simply plucking off a "url" field from the first document returned in a search request... you're getting a URL back to your client and then using that URL to further send back to a user's browser, I presume. I personally wouldn't implement it with a custom response writer; just get the URL from the standard Solr response.

Erik

On Dec 7, 2011, at 08:26, Finotti Simone wrote:
> That's the scenario:
> I have an XML file that maps words W to URLs; when a search request is issued by my web client, a query is issued to my Solr application.
> [...]
Re: Solr response writer
No, actually it's a .NET web service that queries Endeca (call it the Wrapper). It returns to its clients a collection of unique product IDs; then the client asks other web services for more detailed information about the given products. As long as no URL redirection is involved, I think that solrnet ( http://code.google.com/p/solrnet/ ) is good enough to make our Wrapper connect to Solr, thus shielding the client from changes in the underlying search engine.

The Endeca C# API also returns a 'RedirectionUrl' property on one of its objects, which is set to a URL if the text search matches a redirection rule; in this case the Wrapper passes it down to its client (my fault here: I thought there was some sort of redirection through the HTTP result code, but that's not the case).

The point is: since Solr doesn't have this feature, my only chance is to implement it in the "wrapping" web service itself, but I need access to how the words are analyzed by the search engine to make it work correctly. AFAICS, Solr only returns documents matching the request, so I'm missing something :-(

S

From: Michael Kuhlmann [k...@solarier.de]
Sent: Wednesday, December 7, 2011 15:29
To: solr-user@lucene.apache.org
Subject: Re: R: Solr response writer

On 07.12.2011 15:09, Finotti Simone wrote:
> I got your and Michael's point.
> [...]

Endeca is not only a search engine, it's part of a web application. You can send a query to the Endeca engine and send the response directly to the user; it's already fully rendered. (At least when you configure it this way.)

Solr can't do this in any way. Solr responses are always pure technical data, not meant to be delivered to an end user. An exception to this is the VelocityResponseWriter, which can fill a web template.

Anything beyond the possibilities of the VelocityResponseWriter must be handled by some web application that analyzes Solr's responses.

How do you want to display your product descriptions in the default case? I don't think you want to show some XML data.

Solr is a great search engine, but not more. It's just a small subset of commercial search frameworks like Endeca. Therefore, you can't simply replace it; you'll need some web application.

However, you don't need a custom response writer in this case, nor do you have to extend Solr in any way. At least not for this requirement.

-Kuli
Re: Solr response writer
Thank you Erik, I will work on your suggestion! It seems it could work, provided I can boost matches on the "redirect" document type.

S

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: Wednesday, December 7, 2011 16:56
To: solr-user@lucene.apache.org
Subject: Re: Solr response writer

What you can do is index the "redirect" documents along with the associated words, and let Solr do the stemming. Maybe add a "document type" field, and if you get a match on a redirect document type, your web service can do what it needs to do from there.

Erik

On Dec 7, 2011, at 10:43, Finotti Simone wrote:
> No, actually it's a .NET web service that queries Endeca (call it the Wrapper).
> [...]
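Erik's suggestion can be made concrete with a sketch along these lines (field names and values are invented for illustration; the schema would give the keywords field a stemming analyzer so that the match happens on stemmed terms):

```xml
<!-- a "redirect" document, indexed alongside the product documents -->
<add>
  <doc>
    <field name="id">redirect-001</field>
    <field name="doctype">redirect</field>
    <field name="keywords">alessi</field>
    <field name="url">/design/searchresult.asp/ene_m/4294950939/dept/design/</field>
  </doc>
</add>
```

The Wrapper then queries as usual; if the top hit comes back with doctype:redirect (boosted if needed, e.g. with a bq parameter such as `bq=doctype:redirect^10`), it returns the url field as the RedirectionUrl instead of a list of product IDs.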
Large RDBMS dataset
Hello, I have a very large dataset (more than 1 million records) in an RDBMS, which I want my Solr application to pull data from. The problem is that the document fields I have to index aren't all in the same table: I have to join records with two other tables. Well, in fact they are views, but I don't think that makes any difference.

That's the data import handler that I've written:

It works, but it takes 1'38" to parse 100 records: that's about 1 record per second, which means digesting the whole dataset would take about a million seconds (roughly 12 days).

The problem is that for each record in "fd", Solr makes three distinct SELECTs on the other three tables. Of course, this is absolutely inefficient. Is there a way to have Solr load every record of the four tables and join them once they are already in memory?

TIA
Re: Large RDBMS dataset
Thank you (and all the others who spent time answering me) very much for your insights! I don't know how I've managed to miss CachedSqlEntityProcessor, but it seems that's just what I need.

bye

From: Gora Mohanty [g...@mimirtech.com]
Sent: Wednesday, December 14, 2011 16:39
To: solr-user@lucene.apache.org
Subject: Re: Large RDBMS dataset

On Wed, Dec 14, 2011 at 3:48 PM, Finotti Simone wrote:
> Hello,
> I have a very large dataset (more than 1 million records) in an RDBMS, which I want my Solr application to pull data from.
[...]
> It works, but it takes 1'38" to parse 100 records: that's about 1 record per second, which means digesting the whole dataset would take about a million seconds (roughly 12 days).

Depending on the size of the data that you are pulling from the database, 1M records is not really that large a number. We were doing ~75GB of stored data from ~7 million records in about 9h, including quite complicated transformers. I would imagine that there is much room for improvement in your case also. Some notes on this:

* If you have servers to throw at the problem, and a sensible way to shard your RDBMS data, use parallel indexing to multiple Solr cores, maybe on multiple servers, followed by a merge. In our experience, given enough RAM and adequate provisioning of database servers, indexing speed scales linearly with the total no. of cores.

* Replicate your database, manually if needed. Look at the load on a database server during the indexing process, and provision enough database servers to match the no. of Solr indexing servers.

* This point is leading into flamewar territory, but consider switching databases. From our (admittedly non-rigorous) measurements, MySQL was at least a factor of 2-3 faster than MS-SQL with the same dataset.

* Look at cloud computing. If finances permit, one should be able to shrink indexing times to almost any desired level. E.g., for the dataset that we used, I have little doubt that we could have shrunk the time down to less than 1h, at an affordable cost on Amazon EC2. Unfortunately, we have not yet had the opportunity to try this.

> The problem is that for each record in "fd", Solr makes three distinct SELECTs on the other three tables. Of course, this is absolutely inefficient.
>
> Is there a way to have Solr load every record of the four tables and join them once they are already in memory?

For various reasons, we did not investigate this in depth, but you could also look at Solr's CachedSqlEntityProcessor.

Regards,
Gora
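For reference, the CachedSqlEntityProcessor that Gora mentions is wired into the DIH configuration roughly like this (a sketch; table and column names are invented, and the exact attribute syntax should be checked against your Solr version - 3.x documented the `where` form shown here):

```xml
<document>
  <entity name="fd" query="SELECT id, name FROM fd">
    <!-- the whole child table is loaded once and cached in memory;
         each row of "fd" is then joined against the cache instead of
         triggering one SELECT per parent record -->
    <entity name="details" query="SELECT fd_id, description FROM details"
            processor="CachedSqlEntityProcessor"
            where="fd_id=fd.id"/>
  </entity>
</document>
```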
boosting
Hello ML, I wonder if it is possible to define a boost for certain fields in the schema.xml configuration. So far, I have found ways to define a boost while indexing and while querying, so I suspect the straight answer is no. Anyway, I'd like a confirmation, if possible. Thank you in advance S
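The query-time boosting mentioned above can at least live in the Solr configuration rather than in each request: the dismax qf parameter can be set as a request-handler default in solrconfig.xml. A sketch, with invented handler and field names:

```xml
<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- matches in "name" weigh four times as much as matches in "description" -->
    <str name="qf">name^2.0 description^0.5</str>
  </lst>
</requestHandler>
```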
Sorting on non-stored field
I was wondering: is it possible to sort a Solr result-set on a non-stored value? Thank you
Spellchecker problem
Hello, I have a configuration where a single master builds the Solr index and replicates it to two slave Solr instances. Regular queries are sent only to those two slaves. The configurations are the same for all of them (except for the replication section, of course).

My problem: for a particular query, I expected the spellchecker to give me a suggestion. Fact is, only one of the two instances answers as I expected! I checked the data directory and discovered that the failing instance had a data/spellchecker directory that was almost empty (12 KB against 7 MB on the working instance). I don't understand this behaviour.

I tried to issue a spellcheck.build=true command, and this is what I got:

Problem accessing /solr/yoox_slave/select. Reason:

org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@C:\Users\sqladmin\LucidImagination\LucidWorksEnterprise\data\solr\cores\yoox_slave_1\spellchecker\write.lock

java.lang.RuntimeException: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@C:\Users\sqladmin\LucidImagination\LucidWorksEnterprise\data\solr\cores\yoox_slave_1\spellchecker\write.lock
	at org.apache.solr.spelling.IndexBasedSpellChecker.build(IndexBasedSpellChecker.java:92)
	at org.apache.solr.handler.component.SpellCheckComponent.prepare(SpellCheckComponent.java:110)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1406)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:353)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:129)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:59)
	at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:122)
	at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:110)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
	at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
	at org.mortbay.jetty.Server.handle(Server.java:326)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
	at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
	at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
	at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@C:\Users\sqladmin\LucidImagination\LucidWorksEnterprise\data\solr\cores\yoox_slave_1\spellchecker\write.lock
	at org.apache.lucene.store.Lock.obtain(Lock.java:84)
	at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:840)
	at org.apache.lucene.search.spell.SpellChecker.clearIndex(SpellChecker.java:470)
	at org.apache.solr.spelling.IndexBasedSpellChecker.build(IndexBasedSpellChecker.java:88)
	... 27 more

Has anybody faced a similar problem? Can you point me to the solution? Thank you in advance
Skip first word
Hi, is there a tokenizer and/or a combination of filters to remove the first term from a field? For example:

The quick brown fox

should be tokenized as:

quick
brown
fox

thank you in advance
S
Re: Skip first word
Hi Ahmet, business asked me to apply EdgeNGram with minGramSize=1 on the first term and with minGramSize=3 on the later terms.

We are developing a search-suggestion mechanism; the idea is that if the user types "D", the engine should suggest "Dolce & Gabbana", but if we type "G", it should suggest other brands. Only if the user types "Gab" should it suggest "Dolce & Gabbana".

Thanks
S

From: Ahmet Arslan [iori...@yahoo.com]
Sent: Wednesday, July 25, 2012 18:10
To: solr-user@lucene.apache.org
Subject: Re: Skip first word

> is there a tokenizer and/or a combination of filters to remove the first term from a field?
> [...]

There is no such filter that I know of. Though, you can implement one by modifying the source code of LengthFilterFactory or StopFilterFactory. They both remove tokens. Out of curiosity, what is the use case for this?
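The requirement above can be made concrete with a toy model outside Solr (this is not Solr's EdgeNGram implementation, just a sketch of the intended matching behaviour; the function name is invented):

```python
def suggestion_grams(phrase):
    """Build the set of typed prefixes that should trigger this suggestion:
    edge n-grams with min size 1 for the first term, min size 3 for the rest."""
    grams = set()
    for i, term in enumerate(phrase.lower().split()):
        min_size = 1 if i == 0 else 3
        for n in range(min_size, len(term) + 1):
            grams.add(term[:n])
    return grams

grams = suggestion_grams("Dolce & Gabbana")
print("d" in grams)    # typing "D" should suggest the brand
print("g" in grams)    # a bare "G" should not
print("gab" in grams)  # but "Gab" should
```

Running this shows "d" and "gab" trigger the suggestion while "g" does not, which is exactly the asymmetry between the first and the later terms that the business requirement describes.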
Re: Skip first word
Hi Chantal, if I understand correctly, this implies that I have to populate different fields according to their length. Since I'm not aware of any logical condition you can apply to the copyField directive, it means that this logic has to be implemented by the process that populates the Solr core. Is this assumption correct?

That's kind of bad, because I'd like to have this kind of "rules" in the Solr configuration. Of course, if that's the only way... :)

Thank you

From: Chantal Ackermann [c.ackerm...@it-agenten.com]
Sent: Thursday, July 26, 2012 18:32
To: solr-user@lucene.apache.org
Subject: Re: Skip first word

Hi,

use two fields:
1. KeywordTokenizer (= single token) with ngram minsize=1 and maxsize=2 for inputs of length < 3,
2. the other one tokenized as appropriate, with minsize=3 and longer, for all longer inputs

Cheers,
Chantal

On 26.07.2012 at 09:05, Finotti Simone wrote:
> Hi Ahmet,
> business asked me to apply EdgeNGram with minGramSize=1 on the first term and with minGramSize=3 on the later terms.
> [...]
Re: Skip first word
Could you elaborate on it, please?

thanks
S

From: in.abdul [in.ab...@gmail.com]
Sent: Thursday, July 26, 2012 20:36
To: solr-user@lucene.apache.org
Subject: Re: Skip first word

That's the best option. I had also used ShingleFilterFactory.

On Jul 26, 2012 10:03 PM, "Chantal Ackermann-2 [via Lucene]" wrote:
> Hi,
>
> use two fields:
> 1. KeywordTokenizer (= single token) with ngram minsize=1 and maxsize=2 for inputs of length < 3,
> 2. the other one tokenized as appropriate, with minsize=3 and longer, for all longer inputs
> [...]

THANKS AND REGARDS,
SYED ABDUL KATHER
Re: Skip first word
Brilliant! Thank you very much :)

From: Chantal Ackermann [c.ackerm...@it-agenten.com]
Sent: Friday, July 27, 2012 11:20
To: solr-user@lucene.apache.org
Subject: Re: Skip first word

Hi Simone,

no, I meant that you populate the two fields with the same input - best done via the copyField directive. The first field will contain ngrams of size 1 and 2. The other field will contain ngrams of size 3 and longer (you might want to set a decent maxsize there).

The query for the autocomplete list uses the first field when the input (typed in by the user) is one or two characters long. Your example was: "D", "G", or then "Do" or "Ga". The search would use only the single-token field, which for the input "Dolce & Gabbana" contains only the ngrams "D" and "Do". So only the input "D" or "Do" would result in a hit on "Dolce & Gabbana".

Once the user has typed in the third letter, "Dol" or "Gab", you query the second, more tokenized field, which for "Dolce & Gabbana" would contain the ngrams "Dol" "Dolc" "Dolce" "Gab" "Gabb" "Gabba" etc. Both inputs "Gab" and "Dol" would then return "Dolce & Gabbana".

1. First field type:
2. Second field type:
3. field declarations:

Chantal

On 27.07.2012 at 11:05, Finotti Simone wrote:
> Hi Chantal,
>
> if I understand correctly, this implies that I have to populate different fields according to their length.
> [...]
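The three configuration snippets did not survive the archive. Under the approach Chantal describes, they may have looked roughly like this (a sketch using Solr 3.x factory classes; the field names are invented):

```xml
<!-- 1. First field type: single token, ngrams of size 1-2 -->
<fieldType name="suggest_short" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="2"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- 2. Second field type: tokenized, ngrams of size 3 and longer -->
<fieldType name="suggest_long" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="20"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- 3. field declarations: both populated from the same source via copyField -->
<field name="brand" type="string" indexed="true" stored="true"/>
<field name="brand_short" type="suggest_short" indexed="true" stored="false"/>
<field name="brand_long" type="suggest_long" indexed="true" stored="false"/>
<copyField source="brand" dest="brand_short"/>
<copyField source="brand" dest="brand_long"/>
```

The client then queries brand_short for one- or two-character inputs and brand_long otherwise, as described in the message above.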
Split XML configuration
Hi, is it possible to split the schema.xml and solrconfig.xml configurations? My configurations are getting quite large and I'd like to be able to partition them logically into multiple files. thank you in advance, S
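One option worth checking is standard XML XInclude, which Solr's configuration parser supports (how far it works in schema.xml versus solrconfig.xml depends on the Solr version, so verify against yours). A sketch, with an invented file name:

```xml
<config>
  <!-- pull the request handler definitions out of a separate file -->
  <xi:include href="request-handlers.xml"
              xmlns:xi="http://www.w3.org/2001/XInclude"/>
</config>
```

Note that the included file must itself be well-formed XML with a single root element.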
Query filtering
Hello, I'm doing this query to return top 10 facets within a given "context", specified via the fq parameter. http://solr/core/select?fq=(...)&q=*:*&rows=0&facet.field=interesting_facet&facet.limit=10 Now, I should search for a term inside the context AND the previously identified top 10 facet values. Is there a way to do this with a single query? thank you in advance, S
Re: Query filtering
Hi Amit, thank you for your answer, but I did know how to do it with two distinct queries: I was hoping for some way to do it with a single query :-) (maybe using some advanced functionality with nested queries...)

S

From: Amit Nithian [anith...@gmail.com]
Sent: Thursday, September 27, 2012 19:18
To: solr-user@lucene.apache.org
Subject: Re: Query filtering

I think one way to do this is to issue another query and set a bunch of filter queries to restrict "interesting_facet" to just those ten values returned in the first query.

fq=interesting_facet:1 OR interesting_facet:2 etc&q=context:

Does that help?
Amit

On Thu, Sep 27, 2012 at 6:33 AM, Finotti Simone wrote:
> Hello,
> I'm doing this query to return top 10 facets within a given "context", specified via the fq parameter.
> [...]
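For the record, the two-query approach Amit describes can be sketched like this (plain Python; the facet values are hard-coded stand-ins for what the first query would return, and the actual HTTP calls are elided since they depend on your client library):

```python
def build_facet_filter(field, values):
    """Build an fq clause restricting `field` to the top facet values
    returned by the first (facet-only) query."""
    clauses = " OR ".join('%s:"%s"' % (field, v) for v in values)
    return "(%s)" % clauses

# Step 1: first query with rows=0&facet.field=interesting_facet&facet.limit=10
top_values = ["bags", "shoes", "belts"]  # pretend these came back as top facets

# Step 2: second query adds the search term plus this filter
fq = build_facet_filter("interesting_facet", top_values)
print(fq)
```

The resulting string is passed as an additional fq parameter alongside the original context filter and the user's search term.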